How to add a new scheme to Inciter
Contents
- Rationale and plan
- 1. Add a new keyword
- 2. Add new option switch
- 3. Add new Charm++ chare proxy in Scheme
- 4. Add new Charm++ chare array
-
5. New C++ class
- ALECG::ALECG – Constructor
- Transporter::comfinal() – Complete communication maps
- ALECG::setup() – Start setup
- ALECG::start() – Start time step
- ALECG::rhs() & ALECG::comrhs() – Compute and communicate right hand side
- ALECG::solve() – Solve
- ALECG::refine() – Optionally refine mesh
- ALECG::resizePostAMR() – Resize data after mesh refinement
- ALECG::lhs() – Compute LHS
- New time step stage
- 6. Making it all work
- 7. Augment unit tests for Scheme
- 8. Add new regression tests
Inciter (Compressible flow solver) supports multiple discretization schemes. This page describes how to add a scheme of your choice by walking through an example of adding a new one. We also discuss the main steps of the execution logic, which, at a high level, is the same for all discretization schemes.
Rationale and plan
Similar to the existing discretization schemes, DiagCG
, or DG
, the new scheme, ALECG
(short for Arbitrary Lagrangian-Eulerian Continuous Galerkin), will interact with Discretization
in a child-base fashion, e.g., will directly access (and reuse) its member data and functions. It will also intereact with Refiner
, for mesh refinement (AMR), and will also be migratable to enable dynamic load balancing. In essence, it will have everything an existing scheme has. However, we will not implement the low-level details of the actual numerical method, only the glue-code necessary to interact with the rest of the code and we make it ready to start implementing the low-level details of a particular discretization, done by a PDE
class, held behind a derived class of, e.g., CGPDE
or DGPDE
. For more details on how these classes interact, see also the Inciter software design page.
1. Add a new keyword
A specific discretization scheme is selected by the user in the control (input) file via the scheme
keyword, e.g., scheme diagcg
. We add the new keyword, alecg
, which then can be recognized by the control file parser, in src/
by adding the following code block:
Control/Keywords.hpp
$ git diff src/Control/Keywords.hpp diff --git a/src/Control/Keywords.hpp b/src/Control/Keywords.hpp index 002869cb..c18f193a 100644 --- a/src/Control/Keywords.hpp +++ b/src/Control/Keywords.hpp @@ -4607,6 +4607,19 @@ struct diagcg_info { }; using diagcg = keyword< diagcg_info, TAOCPP_PEGTL_STRING("diagcg") >; +struct alecg_info { + static std::string name() { return "ALE-CG with RK"; } + static std::string shortDescription() { return "Select continuous Galerkin " + "with ALE + Runge-Kutta"; } + static std::string longDescription() { return + R"(This keyword is used to select the continuous Galerkin finite element + scheme in the arbitrary Lagrangian-Eulerian (ALE) reference frame combined + with Runge-Kutta (RK) time stepping. See Control/Inciter/Options/Scheme.hpp + for other valid options.)"; } +}; +using alecg = keyword< alecg_info, TAOCPP_PEGTL_STRING("alecg") >; + struct dg_info { static std::string name() { return "DG(P0) + RK"; } static std::string shortDescription() { return
We also add the new keyword to inciter's grammar's keywords pool:
Control/Inciter/InputDeck/InputDeck.hpp
$ git diff src/Control/Inciter/InputDeck/InputDeck.hpp diff --git a/src/Control/Inciter/InputDeck/InputDeck.hpp b/src/Control/Inciter/InputDeck/InputDeck.hpp index 83572480..20ce8975 100644 --- a/src/Control/Inciter/InputDeck/InputDeck.hpp +++ b/src/Control/Inciter/InputDeck/InputDeck.hpp @@ -144,6 +144,7 @@ class InputDeck : kw::scheme, kw::matcg, kw::diagcg, + kw::alecg, kw::dg, kw::dgp1, kw::flux,
This is required so that the compiler can generate a database containing the help for all the keywords in the grammar understood by inciter's control file parser. The above changes not only add the keyword but also some documentation that gets displayed when passing the -C
or -H
command line arguments to the inciter executable, so quick help is available at the user's fingertips:
$ inciter -C inciter Control File Keywords: advdiff string Specify the advection + diffusion physics configuration for a PDE advection string Specify the advection physics configuration for a PDE alecg Select continuous Galerkin with ALE + Runge-Kutta algorithm string Select mesh partitioning algorithm alpha real Set PDE parameter(s) alpha ... $ inciter -H alecg inciter control file keyword 'alecg' Select continuous Galerkin with ALE + Runge-Kutta (RK) This keyword is used to select the continuous Galerkin finite element scheme in the arbitrary Lagrangian-Eulerian (ALE) reference frame combined with Runge-Kutta (RK) time stepping. See Control/Inciter/Options/Scheme.hpp for other valid options.
2. Add new option switch
Next is to add a new state to the existing Scheme option switch. This "option
switch" is really only a fancy enum, used to store the user's choice of the discretization scheme after parsing the control file in a type-safe manner. This fancy enum is an option switch because it inherits from tk::src/
.
3. Add new Charm++ chare proxy in Scheme
Scheme
is a class that implements concept-based runtime polymorphism for migratable Charm++ chare arrays using value semantics. Client code, e.g., Transporter
, interacts with Discretization
and its children via a uniform interface provided by Scheme
, which dispatches entry method calls to the correct class instance, the base or the child, and is capable of performing broadcasts as well as addressing a particular chare array element. Read more details at src/ALECG
scheme, besides the existing ones, we make the following changes:
Inciter/Scheme.hpp
$ git diff src/Inciter/Scheme.hpp diff --git a/src/Inciter/Scheme.hpp b/src/Inciter/Scheme.hpp index 61510d01..dea3d78a 100644 --- a/src/Inciter/Scheme.hpp +++ b/src/Inciter/Scheme.hpp @@ -22,6 +22,7 @@ #include "NoWarning/matcg.decl.h" #include "NoWarning/diagcg.decl.h" +#include "NoWarning/alecg.decl.h" #include "NoWarning/distfct.decl.h" #include "NoWarning/dg.decl.h" #include "NoWarning/discretization.decl.h" @@ -51,6 +52,8 @@ class Scheme { } else if (scheme == ctr::SchemeType::DiagCG) { proxy = static_cast< CProxy_DiagCG >( CProxy_DiagCG::ckNew(m_bound) ); fctproxy= CProxy_DistFCT::ckNew(m_bound); + } else if (scheme == ctr::SchemeType::ALECG) { + proxy = static_cast< CProxy_ALECG >( CProxy_ALECG::ckNew(m_bound) ); } else if (scheme == ctr::SchemeType::DG || scheme == ctr::SchemeType::DGP1) { proxy = static_cast< CProxy_DG >( CProxy_DG::ckNew(m_bound) ); @@ -75,11 +78,12 @@ class Scheme { const CkArrayOptions& arrayoptions() { return m_bound; } //! Variant type listing all chare proxy types modeling the same concept - using Proxy = boost::variant< CProxy_DiagCG, CProxy_DG >; + using Proxy = + boost::variant< CProxy_DiagCG, CProxy_ALECG, CProxy_DG >; //! Variant type listing all chare element proxy types (behind operator[]) using ProxyElem = boost::variant< CProxy_DiagCG::element_t, - CProxy_DG::element_t >; + CProxy_ALECG::element_t, CProxy_DG::element_t >; protected: //! Variant storing one proxy to which this class is configured for
4. Add new Charm++ chare array
Next is to add a new class, ALECG
, which will serve as the glue between Transporter
, Refiner
, and CGPDE
. These classes, respectively, are the driver, the mesh refiner, and the polymorphic vector of PDE discretization class objects that hold the low-level details of the numerical implementation of spatial discretizations, dispatching to multiple specific systems of equations, e.g., cg::
or cg::
.
We create the following new files:
- Inciter/
alecg.ci, Charm++ interface file for ALECG, - NoWarning/
alecg.decl.h and NoWarning/ alecg.def.h, which help ignore compiler warnings in Charm++-generated code, and - Inciter/
ALECG.hpp and Inciter/ ALECG.cpp, header and implementation of ALECG.
Before we discuss the details of the above new files, let's get a couple of simple things out of the way. We also need to add the new include to Refiner.hpp
so, e.g., it can call back to ALECG::resize() after a mesh refinement step:
Inciter/Refiner.hpp
$ git diff src/Inciter/Refiner.hpp diff --git a/src/Inciter/Refiner.hpp b/src/Inciter/Refiner.hpp index dfcb1ffd..4fe743a4 100644 --- a/src/Inciter/Refiner.hpp +++ b/src/Inciter/Refiner.hpp @@ -29,6 +29,7 @@ #include "SchemeBase.hpp" #include "DiagCG.hpp" +#include "ALECG.hpp" #include "DG.hpp" #include "NoWarning/transporter.decl.h"
We also tell the build system about our new ALECG
class and its Charm++ module:
Inciter/CMakeLists.txt
$ gd src/Inciter/CMakeLists.txt diff --git a/src/Inciter/CMakeLists.txt b/src/Inciter/CMakeLists.txt index 141055ec..e339b65b 100644 --- a/src/Inciter/CMakeLists.txt +++ b/src/Inciter/CMakeLists.txt @@ -14,6 +14,7 @@ add_library(Inciter Sorter.cpp DiagCG.cpp + ALECG.cpp DG.cpp FluxCorrector.cpp DistFCT.cpp @@ -74,6 +75,7 @@ addCharmModule( "refiner" "Inciter" ) addCharmModule( "sorter" "Inciter" ) addCharmModule( "matcg" "Inciter" ) addCharmModule( "diagcg" "Inciter" ) +addCharmModule( "alecg" "Inciter" ) addCharmModule( "distfct" "Inciter" ) addCharmModule( "dg" "Inciter" )
The addCharmModule
cmake macro above, defined in cmake/charm.cmake
, ensures that build target Inciter
will properly depend on our new alecg
Charm++ module, defined in Inciter/
. The macro also tells cmake how the two files, alecg.decl.h
and alecg.def.h
, are generated from alecg.ci
: using charmc
, a compiler wrapper that generates Charm++-code to make the ALECG
from an ordinary C++ class into a Charm++ chare array, with entry methods callable across the network, make it migratable, enable its structured DAGger, etc. See also the Charm++ manual.
Now to the new files. First is the new Charm++ interface file, Inciter/
Inciter/alecg.ci
This is the file that is parsed by Charm++'s compiler which then generates additional code that makes ALECG a Charm++ chare array, makes it migratable, etc. The full listing is at Inciter/
Inciter/alecg.ci – External modules and header includes
extern module transporter; extern module discretization; extern module ghosts; include "UnsMesh.hpp"; include "PUPUtil.hpp";
First we declare some external Charm++ modules that ALECG needs to interact with and thus from where we need type information. The extern module
statements are followed by some usual C++ include
s (without the #
): these are in the Charm++ interface file because the Charm++ code below requires type information from them.
Inciter/alecg.ci – 1D Charm++ chare array
array [1D] ALECG {
Next comes the specification of the ALECG Charm++ chare array. This is a 1D array whose elements at runtime will be distributed across the available processing elements and compute nodes. If load balancing is enabled, the array elements (C++ objects) are migrated to homogenize load across a simulation. Because the array is 1D, we use a single integer index to address a particular array element. Charm++ also allows multi-dimensional arrays which can be useful if the problem naturally maps to a multi-dimensional notion, e.g., partitioning a 3D Cartesian mesh, so index calculations to address array elements (and thus work-units) become cleaner.
Inciter/alecg.ci – Entry methods
entry ALECG( const CProxy_Discretization& disc, const CProxy_Ghosts& ghostsproxy, const std::map< int, std::vector< std::size_t > >& bface, const std::map< int, std::vector< std::size_t > >& bnode, const std::vector< std::size_t >& triinpoel ); initnode void registerReducers(); entry void setup(); entry void box( tk::real v, const std::vector< tk::real >& blkvol ); entry void resizeComm(); entry void nodeNeighSetup(); entry void transferSol(); entry void start(); entry void refine( const std::vector< tk::real >& l2ref ); entry [reductiontarget] void advance( tk::real newdt, tk::real ); entry void comdfnorm( const std::unordered_map< tk::UnsMesh::Edge, std::array< tk::real, 3 >, tk::UnsMesh::Hash<2>, tk::UnsMesh::Eq<2> >& dfnorm ); entry void comnorm( const std::unordered_map< int, std::unordered_map< std::size_t, std::array< tk::real, 4 > > >& innorm ); entry void comblk( const std::vector< std::size_t >& gid, const std::vector< std::set< std::size_t > >& mb ); entry void comChBndGrad( const std::vector< std::size_t >& gid, const std::vector< std::vector< tk::real > >& G ); entry void comrhs( const std::vector< std::size_t >& gid, const std::vector< std::vector< tk::real > >& R ); entry void resized(); entry void meshveldone(); entry void lhs(); entry void step(); entry void next(); entry void stage(); entry void evalLB( int nrestart );
We simply list those member functions of ALECG as entry methods, e.g., ALECG::ALECG::
. Note also that there is an initnode
entry method, ALECG::
which is a special member function that is also declared as static in the C++ sense (see ALECG.hpp). This is static because the runtime system must be able to call this function without creating an object and a lot earlier than the actual ALECG chare array elements are created. This is how custom reducers can be associated in Charm++ to a chare array. Such custom reducers are an excellent way to rely on the asynchronous, tree-based implementation of parallel reductions in Charm++ yet still do it on custom, arbitrarily complex data types, e.g., a hash-map that holds vectors, as long as one defines how aggregation is to be performed when merging such data. Such an example is given in Inciter/
Inciter/alecg.ci – Structured DAG
entry void wait4meshblk() { when ownblk_complete(), comblk_complete() serial "meshblk" { continueSetup(); } } entry void wait4norm() { when ownnorm_complete(), comnorm_complete(), owndfnorm_complete(), comdfnorm_complete(), transfer_complete() serial "norm" { mergelhs(); } } entry void wait4grad() { when owngrad_complete(), comgrad_complete() serial "grad" { rhs(); } } entry void wait4rhs() { when ownrhs_complete(), comrhs_complete() serial "rhs" { solve(); } } entry void wait4mesh() { when norm_complete(), resize_complete() serial "trans" { if (m_newmesh == 0) meshvelstart(); else transfer(); } } entry void ownblk_complete(); entry void comblk_complete(); entry void ownnorm_complete(); entry void comnorm_complete(); entry void owndfnorm_complete(); entry void comdfnorm_complete(); entry void transfer_complete(); entry void ownrhs_complete(); entry void comrhs_complete(); entry void owngrad_complete(); entry void comgrad_complete(); entry void norm_complete(); entry void resize_complete();
The entry methods, defined in the .ci
file and with when
keywords, form a structured directed acyclic graph (DAG). These specify logical relations among tasks and execution logic within the class. For example, wait4grad
tells the runtime system that only when owngrad_complete()
and comgrad_complete()
are both done will rhs()
be called. In this case, this construct ensures that the runtime system will call a member function that requires the assembled right-hand side, when both the local and external contributions are complete. Note that this logic only relates to a given array element, say with index 2. Another one, say index 3, may perform this operation at a very different time and independently, thus computation and communication can overlap. The entry methods listed at the bottom, e.g., owngrad_complete()
can be thought of as "labels" to the runtime system that help define the task logic. These labels are functions that the runtime system defines and we call them when the given task is complete. Note that the construct we used here, when A and B are both complete then do C, is probably the simplest task-logic Charm++ allows prescribing. There are many more advanced ways of expressing such logic, e.g., using loops. For more details, see Section Structured Control Flow: Structured Dagger in the Charm++ manual.
NoWarning/alecg.decl.h and NoWarning/alecg.def.h
The newly added files to the NoWarning/
directory simply include the Charm++-generated alecg.decl.h
and alecg.def.h
files and locally, around the include, turn off specific compiler warnings for various compilers – we will not discuss them here further. Full listings are at NoWarning/alecg.decl.h and NoWarning/alecg.def.h.
5. New C++ class
Next are the newly added Inciter/
ALECG::ALECG – Constructor
{ usesAtSync = true; // enable migration at AtSync auto d = Disc(); // Perform optional operator-access-pattern mesh node reordering if (g_inputdeck.get< tag::operator_reorder >()) { // Create new local ids based on access pattern of PDE operators std::unordered_map< std::size_t, std::size_t > map; std::size_t n = 0; for (std::size_t p=0; p<m_u.nunk(); ++p) { // for each point p if (map.find(p) == end(map)) map[p] = n++; for (auto q : tk::Around(m_psup,p)) { // for each edge p-q if (map.find(q) == end(map)) map[q] = n++; } } Assert( map.size() == d->Gid().size(), "Map size mismatch" ); // Remap data in bound Discretization object d->remap( map ); // Recompute elements surrounding points m_esup = tk::genEsup( d->Inpoel(), 4 ); // Recompute points surrounding points m_psup = tk::genPsup( d->Inpoel(), 4, m_esup ); // Remap boundary triangle face connectivity tk::remap( m_triinpoel, map ); } // Query/update boundary-conditions-related data structures from user input queryBnd(); // Activate SDAG wait for initially computing normals, and mesh blocks thisProxy[ thisIndex ].wait4norm(); thisProxy[ thisIndex ].wait4meshblk(); d->comfinal(); }
As discussed in Section Creating workers on the Inciter software design page, the worker chare array elements, such as ALECG, are created using Charm++'s dynamic array insertion feature. This is an asynchronous call, issued from Sorter::
In the constructor's body, listed above, various initialization steps are executed, including enabling migration for the class. Mesh-to-mesh solution transfer is also configured, calling member functions of Discretization, which then eventually signals the runtime system that extra communication buffers (other than those alrady stored in Discretization), specific to this particular ALECG scheme, have been created. This is a reduction call, issued by all array elements, eventually calling the reduction target Transporter::
a single time.
Transporter::comfinal() – Complete communication maps
{ auto meshid = tk::cref_find( m_meshid, static_cast<std::size_t>(summeshid) ); if (initial > 0) { m_scheme[meshid].bcast< Scheme::setup >(); // Turn on automatic load balancing if (++m_ncom == m_nelem.size()) { // all worker arrays have finished m_ncom = 0; auto print = printer(); m_progWork.end( print ); tk::CProxy_LBSwitch::ckNew(); print.diag( "Load balancing on (if enabled in Charm++)" ); } } else { m_scheme[meshid].bcast< Scheme::lhs >(); } }
Though asynchronously executed, the reduction operation targeting Transporter::
is a global synchronization point: all chares arrive in that function body, synchronized, and all continue from there again by calling ALECG::
Transporter::setup()
member function can be invoked. This is because setup()
is allowed to start using those communication maps that have been constructed before calling setup().
ALECG::setup() – Start setup
void ALECG::setup() // ***************************************************************************** // Start setup for solution // ***************************************************************************** { auto d = Disc(); // Determine nodes inside user-defined IC box g_cgpde[d->MeshId()].IcBoxNodes( d->Coord(), d->Inpoel(), d->ElemBlockId(), m_boxnodes, m_nodeblockid, m_nusermeshblk ); // Communicate mesh block nodes to other chares on chare-boundary if (d->NodeCommMap().empty()) // in serial we are done comblk_complete(); else // send mesh block information to chare-boundary nodes to fellow chares for (const auto& [c,n] : d->NodeCommMap()) { // data structure assigning block ids (set of values) to nodes (index). // although nodeblockid is a map with key-blockid and value-nodeid, the // sending data structure has to be inverted, because of how communication // data is handled. std::vector< std::set< std::size_t > > mb( n.size() ); std::size_t j = 0; for (auto i : n) { for (const auto& [blid, ndset] : m_nodeblockid) { // if node was found in a block, add to send-data if (ndset.find(tk::cref_find(d->Lid(),i)) != ndset.end()) mb[j].insert(blid); } if (m_nusermeshblk > 0) Assert(mb[j].size() > 0, "Sending no block data for node"); ++j; } thisProxy[c].comblk( std::vector<std::size_t>(begin(n),end(n)), mb ); } ownblk_complete(); } void ALECG::comblk( const std::vector< std::size_t >& gid, const std::vector< std::set< std::size_t > >& mb ) // ***************************************************************************** // Receive mesh block information for nodes on chare-boundaries //! \param[in] gid Global mesh node IDs at which we receive RHS contributions //! \param[in] mb Block ids for each node on chare-boundaries //! \details This function receives mesh block information for nodes on chare //! boundaries. While m_nodeblockid stores block information for own nodes, //! m_nodeblockidc collects the neighbor chare information during //! communication. This way work on m_nodeblockid and m_nodeblockidc is //! overlapped. The two are combined in continueSetup(). // ***************************************************************************** { Assert( mb.size() == gid.size(), "Size mismatch" ); for (std::size_t i=0; i<gid.size(); ++i) { for (const auto& blid : mb[i]) { m_nodeblockidc[blid].insert(gid[i]); } } // When we have heard from all chares we communicate with, this chare is done if (++m_nmblk == Disc()->NodeCommMap().size()) { m_nmblk = 0; comblk_complete(); } } void ALECG::continueSetup() // ***************************************************************************** // Continue setup for solution, after communication for mesh blocks // ***************************************************************************** { auto d = Disc(); // Combine own and communicated mesh block information for (const auto& [blid, ndset] : m_nodeblockidc) { for (const auto& i : ndset) { auto lid = tk::cref_find(d->Lid(), i); m_nodeblockid[blid].insert(lid); } } // clear receive buffer tk::destroy(m_nodeblockidc); // Compute volume of user-defined box IC d->boxvol( m_boxnodes, m_nodeblockid, m_nusermeshblk ); // Query time history field output labels from all PDEs integrated const auto& hist_points = g_inputdeck.get< tag::history_output, tag::point >(); if (!hist_points.empty()) { std::vector< std::string > histnames; auto n = g_cgpde[d->MeshId()].histNames(); histnames.insert( end(histnames), begin(n), end(n) ); d->histheader( std::move(histnames) ); } }
In the ALECG::
The current DAG in Inciter/alecg.ci – Structured DAG should be consulted for the task logic. When the setup phase is done, ALECG::
ALECG::start() – Start time step
Eventually called by ALECG::dt
allowed in the give time step across the whole problem:
conserved( m_u, Disc()->Vol() ); if (g_inputdeck.get< tag::steady_state >()) { // compute new dt for each mesh point g_cgpde[d->MeshId()].dt( d->It(), d->Vol(), m_u, m_dtp ); // find the smallest dt of all nodes on this chare mindt = *std::min_element( begin(m_dtp), end(m_dtp) ); } else { // compute new dt for this chare // find the smallest dt of all equations on this chare auto eqdt = g_cgpde[d->MeshId()].dt( d->Coord(), d->Inpoel(), d->T(), d->Dtn(), m_u, d->Vol(), d->Voln() ); if (eqdt < mindt) mindt = eqdt; } volumetric( m_u, Disc()->Vol() );
The above code snippet shows a for
loop that calls the the dt()
member function of all types of PDEs configured by the user and finds the minimum size of the next time step.
// Actiavate SDAG waits for next time step stage thisProxy[ thisIndex ].wait4grad(); thisProxy[ thisIndex ].wait4rhs(); // Contribute to minimum dt across all chares and advance to next step contribute( sizeof(tk::real), &mindt, CkReduction::min_double, CkCallback(CkReductionTarget(ALECG,advance), thisProxy) );
Once we have the time step size, we enable a couple of SDAG waits to get ready for some communication steps and issue a reduction to ALECG::dt
across all chares. advance()
saves the new time step in Discretization::
, which is its master copy, then starts computing the right hand sides of all PDEs integrated, which requires the primitive variable gradients first (requiring its own communication step).
ALECG::rhs() & ALECG::comrhs() – Compute and communicate right hand side
When both the own and communicated contributions are complete on a chare, the runtime system continues as described in the DAG in alecg.ci. ALECG::
ALECG::solve() – Solve
// Recompute mesh volumes if ALE is enabled if (g_inputdeck.get< tag::ale, tag::ale >()) { transfer_complete(); // Save nodal volumes at previous time step stage d->Voln() = d->Vol(); // Prepare for recomputing the nodal volumes d->startvol(); auto meshid = d->MeshId(); contribute( sizeof(std::size_t), &meshid, CkReduction::nop, CkCallback(CkReductionTarget(Transporter,resized), d->Tr()) ); } else { norm_complete(); resized(); }
The above code snippet shows what happens immediately after advancing the solution on a chare. If ALE mesh movement is enabled, the mesh nodes have been moved in physical space, so the nodal volumes need to be recomputed so we get ready for this. This is followed by optionally computing diagnostics, which is a catch-all phrase for various norms and integral quantities, see Inciter/m_diag.compute()
returns true, diagnostics have been computed in this time step. If diagnostics have been computed, their correct values require global reduction operations, performing different aggregation operations depending on the value. As almost all reductions, diagnostics are also collected by Transporter, this time in target Transporter::
, which calls back, via a broadcast, to ALECG::
ALECG::refine() – Optionally refine mesh
void ALECG::refine( const std::vector< tk::real >& l2res ) // ***************************************************************************** // Optionally refine/derefine mesh //! \param[in] l2res L2-norms of the residual for each scalar component //! computed across the whole problem // ***************************************************************************** { auto d = Disc(); const auto steady = g_inputdeck.get< tag::steady_state >(); const auto residual = g_inputdeck.get< tag::residual >(); const auto rc = g_inputdeck.get< tag::rescomp >() - 1; if (steady) { // this is the last time step if max time of max number of time steps // reached or the residual has reached its convergence criterion if (d->finished() or l2res[rc] < residual) m_finished = 1; } else { // this is the last time step if max time or max iterations reached if (d->finished()) m_finished = 1; } auto dtref = g_inputdeck.get< tag::amr, tag::dtref >(); auto dtfreq = g_inputdeck.get< tag::amr, tag::dtfreq >(); // Activate SDAG waits for re-computing the normals m_newmesh = 1; // recompute normals after AMR (if enabled) thisProxy[ thisIndex ].wait4norm(); thisProxy[ thisIndex ].wait4mesh(); // if t>0 refinement enabled and we hit the frequency if (dtref && !(d->It() % dtfreq)) { // refine // Convert to conserved unknowns, since the next step changes volumes conserved(m_u, d->Vol()); m_refinedmesh = 1; d->startvol(); d->Ref()->dtref( m_bface, m_bnode, m_triinpoel ); d->refined() = 1; } else { // do not refine m_refinedmesh = 0; d->refined() = 0; norm_complete(); resized(); } }
The above snippet shows that mesh refinement happens only at every few time step with its frequency configured by the user. If the mesh is not refined, we simply enable the SDAG waits associated to the tasks of the mesh refinement step. If the mesh is refined, we call a member function of the mesh refiner object held by Discretization, Refiner::
ALECG::resizePostAMR() – Resize data after mesh refinement
void ALECG::resizePostAMR( const std::vector< std::size_t >& /*ginpoel*/, const tk::UnsMesh::Chunk& chunk, const tk::UnsMesh::Coords& coord, const std::unordered_map< std::size_t, tk::UnsMesh::Edge >& addedNodes, const std::unordered_map< std::size_t, std::size_t >& /*addedTets*/, const std::set< std::size_t >& removedNodes, const std::unordered_map< std::size_t, std::size_t >& amrNodeMap, const tk::NodeCommMap& nodeCommMap, const std::map< int, std::vector< std::size_t > >& bface, const std::map< int, std::vector< std::size_t > >& bnode, const std::vector< std::size_t >& triinpoel, const std::unordered_map< std::size_t, std::set< std::size_t > >& elemblockid ) // ***************************************************************************** // Receive new mesh from Refiner //! \param[in] ginpoel Mesh connectivity with global node ids //! \param[in] chunk New mesh chunk (connectivity and global<->local id maps) //! \param[in] coord New mesh node coordinates //! \param[in] addedNodes Newly added mesh nodes and their parents (local ids) //! \param[in] addedTets Newly added mesh cells and their parents (local ids) //! \param[in] removedNodes Newly removed mesh node local ids //! \param[in] amrNodeMap Node id map after amr (local ids) //! \param[in] nodeCommMap New node communication map //! \param[in] bface Boundary-faces mapped to side set ids //! \param[in] bnode Boundary-node lists mapped to side set ids //! \param[in] triinpoel Boundary-face connectivity //! \param[in] elemblockid Local tet ids associated with mesh block ids // ***************************************************************************** { auto d = Disc(); d->Itf() = 0; // Zero field output iteration count if AMR ++d->Itr(); // Increase number of iterations with a change in the mesh // Resize mesh data structures after mesh refinement d->resizePostAMR( chunk, coord, amrNodeMap, nodeCommMap, removedNodes, elemblockid ); // Remove newly removed nodes from solution vectors m_u.rm(removedNodes); m_un.rm(removedNodes); m_rhs.rm(removedNodes); // Resize auxiliary solution vectors auto npoin = coord[0].size(); auto nprop = m_u.nprop(); m_u.resize( npoin ); m_un.resize( npoin ); m_rhs.resize( npoin ); m_chBndGrad.resize( d->Bid().size() ); tk::destroy(m_esup); tk::destroy(m_psup); m_esup = tk::genEsup( d->Inpoel(), 4 ); m_psup = tk::genPsup( d->Inpoel(), 4, m_esup ); // Update solution on new mesh for (const auto& n : addedNodes) for (std::size_t c=0; c<nprop; ++c) { Assert(n.first < m_u.nunk(), "Added node index out of bounds post-AMR"); Assert(n.second[0] < m_u.nunk() && n.second[1] < m_u.nunk(), "Indices of parent-edge nodes out of bounds post-AMR"); m_u(n.first,c) = (m_u(n.second[0],c) + m_u(n.second[1],c))/2.0; } // Update physical-boundary node-, face-, and element lists m_bnode = bnode; m_bface = bface; m_triinpoel = tk::remap( triinpoel, d->Lid() ); auto meshid = d->MeshId(); contribute( sizeof(std::size_t), &meshid, CkReduction::nop, CkCallback(CkReductionTarget(Transporter,resized), d->Tr()) ); }
The above snippet shows the function that is called by Refiner when it finished mesh refinement. Besides resizing the mesh-related data held locally by ALECG, e.g., ALECG::
ALECG::lhs() – Compute LHS
void ALECG::lhs() // ***************************************************************************** // Compute the left-hand side of transport equations //! \details Also (re-)compute all data structures if the mesh changed. // ***************************************************************************** { // No need for LHS in ALECG // (Re-)compute boundary point-, and dual-face normals norm(); }
As the above ALECG::
void ALECG::mergelhs() // ***************************************************************************** // The own and communication portion of the left-hand side is complete // ***************************************************************************** { // Combine own and communicated contributions of normals normfinal(); if (Disc()->Initial()) { volumetric( m_u, Disc()->Vol() ); // Output initial conditions to file writeFields( CkCallback(CkIndex_ALECG::start(), thisProxy[thisIndex]) ); } else { norm_complete(); } }
The own and communication contributions to the normal vectors are combined in ALECG::wait4mesh
, which resizes the data structures that change after a mesh refinement step.
New time step stage
When all of all necessary data structures have been recomputed after the optional AMR step, execution continues in ALECG::
void ALECG::stage() // ***************************************************************************** // Evaluate whether to continue with next time step stage // ***************************************************************************** { transfer_complete(); // Increment Runge-Kutta stage counter ++m_stage; // if not all Runge-Kutta stages complete, continue to next time stage, // otherwise output field data to file(s) if (m_stage < 3) chBndGrad(); else out(); }
If it is the last Runge-Kutta stage, we optionally output results to file then continue to the next time step calling ALECG::
6. Making it all work
Only a couple of minor, but important, steps remain. First we add the new Charm++ module as an external module in inciter's Charm++ module. This is required so that all Charm++ code that references the new ALECG Charm++ chare array is visible and can correctly interact with Inciter's main charm chare.
Main/inciter.ci
$ git diff src/Main/inciter.ci diff --git a/src/Main/inciter.ci b/src/Main/inciter.ci index bf7eac98..e9b114b6 100644 --- a/src/Main/inciter.ci +++ b/src/Main/inciter.ci @@ -14,6 +14,7 @@ mainmodule inciter { extern module partitioner; extern module matcg; extern module diagcg; + extern module alecg; extern module dg; extern module charestatecollector;
The second, and final, step is to enable triggering the instantiation of specialized CGPDE class objects for our new ALECG scheme when the system of systems is instantiated. This associates the type of generic PDE systems that is used to instantiate the PDE classes, selected by user configuration. Since ALECG will be a node-centered scheme, we assign it to use the CGPDE polymorphic interface (instead of DGPDE, which is tailored for cell-centered discretizations).
PDE/PDEStack.cpp
$ git diff src/PDE/PDEStack.cpp diff --git a/src/PDE/PDEStack.cpp b/src/PDE/PDEStack.cpp index 438cb5e3..9b2e14e7 100644 --- a/src/PDE/PDEStack.cpp +++ b/src/PDE/PDEStack.cpp @@ -108,7 +108,9 @@ PDEStack::selectedCG() const std::vector< CGPDE > pdes; // will store instantiated PDEs const auto sch = g_inputdeck.get< tag::discr, tag::scheme >(); - if (sch == ctr::SchemeType::DiagCG) { + if (sch == ctr::SchemeType::DiagCG || sch == ctr::SchemeType::ALECG) { for (const auto& d : g_inputdeck.get< tag::selected, tag::pde >()) { if (d == ctr::PDEType::TRANSPORT) pdes.push_back( createCG< tag::transport >( d, cnt ) ); else if (d == ctr::PDEType::COMPFLOW) pdes.push_back( createCG< tag::compflow >( d, cnt ) ); else Throw( "Can't find selected CGPDE" ); } }
7. Augment unit tests for Scheme
Though this is not strictly necessary, we also augment the unit tests of Scheme exercising our new discretization scheme:
$ git diff develop src/UnitTest/TUTSuite.hpp tests/unit/Inciter/TestScheme.cpp diff --git a/src/UnitTest/TUTSuite.hpp b/src/UnitTest/TUTSuite.hpp index 191b3972..dd904b02 100644 --- a/src/UnitTest/TUTSuite.hpp +++ b/src/UnitTest/TUTSuite.hpp @@ -61,7 +61,7 @@ class TUTSuite : public CBase_TUTSuite { { "Base/Factory", 2 } , { "Base/PUPUtil", 14 } , { "Base/Timer", 1 } - , { "Inciter/Scheme", 3 } + , { "Inciter/Scheme", 4 } }; // Tests that must be run on PE 0 diff --git a/tests/unit/Inciter/TestScheme.cpp b/tests/unit/Inciter/TestScheme.cpp index 6dc48c75..e4acfce4 100644 --- a/src/UnitTest/tests/Inciter/TestScheme.cpp +++ b/src/UnitTest/tests/Inciter/TestScheme.cpp @@ -84,6 +84,8 @@ void Scheme_object::test< 1 >() { ensure_equals( "Underlying type", c.which(), 1 ); inciter::Scheme d( inciter::ctr::SchemeType::DG ); ensure_equals( "Underlying type", d.which(), 2 ); + inciter::Scheme a( inciter::ctr::SchemeType::ALECG ); + ensure_equals( "Underlying type", a.which(), 3 ); } //! Test if operator[] returns the correct underlying type @@ -97,6 +99,8 @@ void Scheme_object::test< 2 >() { ensure_equals( "Underlying element type", c.which_element(), 1 ); inciter::Scheme d( inciter::ctr::SchemeType::DG ); ensure_equals( "Underlying element type", d.which_element(), 2 ); + inciter::Scheme a( inciter::ctr::SchemeType::ALECG ); + ensure_equals( "Underlying element type", a.which_element(), 3 ); } @@ -162,6 +166,27 @@ void Scheme_object::test< 5 >() { inciter::Scheme( inciter::ctr::SchemeType::DG ), 2, "DG" ); } +//! Test Pack/Unpack of Scheme holding CProxy_AELCG +//! \details Every Charm++ migration test, such as this one, consists of two +//! unit tests: one for send and one for receive. Both trigger a TUT test, +//! but the receive side is created manually, i.e., without the awareness of +//! the TUT library. Unfortunately thus, there is no good way to count up +//! these additional tests, and thus if a test such as this is added to the +//! suite this number must be updated in UnitTest/TUTSuite.hpp in +//! unittest::TUTSuite::m_migrations. +template<> template<> +void Scheme_object::test< 6 >() { + // This test spawns a new Charm++ chare. The "1" at the end of the test name + // signals that this is only the first part of this test: the part up to + // firing up an asynchronous Charm++ chare. The second part creates a new test + // result, sending it back to the suite if successful. If that chare never + // executes, the suite will hang waiting for that chare to call back. + set_test_name( "Charm:migrate Scheme(ALECG) 1" ); + + CProxy_Receiver::ckNew( + inciter::Scheme( inciter::ctr::SchemeType::ALECG ), 3, "ALECG" ); +}
Now that we will test ALECG using the unit test harness, UnitTest, we also have to make the UnitTest build target depend on the new ALECG Charm++ module:
$ git diff src/UnitTest/CMakeLists.txt diff --git a/src/UnitTest/CMakeLists.txt b/src/UnitTest/CMakeLists.txt index bb740285..e0ea47fe 100644 --- a/src/UnitTest/CMakeLists.txt +++ b/src/UnitTest/CMakeLists.txt @@ -48,6 +48,7 @@ add_dependencies("UnitTest" "unittestCharmModule") if (ENABLE_INCITER) add_dependencies("UnitTest" "matcgCharmModule") add_dependencies("UnitTest" "diagcgCharmModule") + add_dependencies("UnitTest" "alecgCharmModule") add_dependencies("UnitTest" "distfctCharmModule") add_dependencies("UnitTest" "dgCharmModule") add_dependencies("UnitTest" "discretizationCharmModule")
8. Add new regression tests
Finally, we also add a bunch of new regression tests that stress-test the asynchronous logic in the discretization scheme classes:
$ git diff tests/regression/inciter/transport/SlotCyl/asynclogic/CMakeLists.txt index b54a207d..62732129 100644 --- a/tests/regression/inciter/transport/SlotCyl/asynclogic/CMakeLists.txt +++ b/tests/regression/inciter/transport/SlotCyl/asynclogic/CMakeLists.txt @@ -1,7 +1,7 @@ # See cmake/add_regression_test.cmake for documentation on the arguments to # add_regression_test(). -foreach(scheme matcg diagcg dg) +foreach(scheme matcg diagcg dg alecg) foreach(virt 0.0 0.5 0.9) foreach(npes RANGE 1 8) add_regression_test(asynclogic_${scheme}_${virt} ${INCITER_EXECUTABLE}