next up previous contents
Next: Getting started Up: Multi-level Schwarz preconditioners Previous: Background   Contents


Algebraic multi-level Schwarz preconditioners based on smoothed aggregation

We outline a general framework for building and applying algebraic multilevel preconditioners, which encompasses all the multi-level preconditioners available in MLD2P4.

Given the linear system , where $A=(a_{ij}) \in \Re^{n \times n}$ is a nonsingular sparse matrix with a symmetric nonzero pattern, we assume as finest index space the set of row (column) indices of $A$, $\Omega=\{1, 2, \ldots, n\}$. An algebraic multilevel method generates a hierarchy of index spaces and a corresponding hierarchy of matrices,

\begin{displaymath}
\Omega^1 \equiv \Omega \supset \Omega^2 \supset \ldots \supset \Omega^{nlev},
\quad A^1 \equiv A, A^2, \ldots, A^{nlev},
\end{displaymath} (2)

by using the information contained in $A$, without assuming any knowledge of the geometry of the problem from which $A$ originates. A vector space $\Re^{n_{k}}$ is associated with $\Omega^k$, where $n_k$ is the size of $\Omega^k$. In the following we use the term contiguous for two index spaces, vector spaces or associated matrices that correspond to subsequent levels $k$ and $k+1$ of the hierarchy.

For each $k < nlev$, the method builds two maps between two contiguous vector spaces, i.e. a prolongation and a restriction operator

\begin{displaymath}
P^k: \Re^{n_{k+1}} \longrightarrow \Re^{n_k}, \quad
R^k: \Re^{n_k} \longrightarrow \Re^{n_{k+1}};
\end{displaymath} (3)

it is common to choose $R^k=(P^k)^T$. The matrix $A^{k+1}$ is computed by exploiting the previous maps, usually according to the Galerkin approach, i.e.
\begin{displaymath}
A^{k+1}=R^kA^kP^k.
\end{displaymath} (4)

A smoother $S^k$ is set, which provides an approximation of the inverse of the matrix $A^k$. At the coarsest level, a direct solver is generally considered to obtain the inverse of $A^{nlev}$. The process just described corresponds to the so-called build phase of the preconditioner and is sketched in Figure 1.
Figure 1: Build phase of a multilevel preconditioner.

\begin{algorithmic}
\State $A^1\gets A$, $\Omega^1\gets\Omega$\
\For {$k=1,...
...pute $A^{k+1}=R^kA^kP^k$
\State set up $S^{k+1}$
\EndFor
\end{algorithmic}

The components produced in the build phase may be combined in several ways to obtain different multilevel preconditioners (see [24, Chapters 2 and 3]); this is done in the application phase, i.e. in the computation of a vector of type $w=M^{-1}v$, where $M$ denotes the preconditioner, usually within an iteration of a Krylov solver. Two basic types of preconditioners can be identified, i.e. the additive and the multiplicative one. In the additive case, at each level the smoother and the coarse-level correction are applied to the restriction of the vector $v$ to that level, and the resulting vectors are added; this leads to the algorithm shown in Figure 2. In the multiplicative case, at each level the smoother and the coarse-level correction are applied in turn, the former on a vector resulting from the application of the latter and/or vice versa. An example of such a combination, where the smoother is applied before and after the coarse-level correction, is given in Figure 3; this is known as symmetrized multiplicative multilevel preconditioner or, more generally, V-cycle. Multiplicative multi-level preconditioners with pre-smoothing only are obtained by neglecting the last three statements in the second loop, while multiplicative multi-level ones with post-smoothing only correspond to setting $y^k=0$ in the first loop, i.e. to performing only the restriction $v^{k+1} \gets R^k v^k$ in the first loop and to changing into $y^{k} \gets P^k y^{k+1}$ the first statement of the second loop.

Figure 2: Application phase of an additive multilevel preconditioner.

\begin{algorithmic}
\State $v^1 \gets v$
\State ! Apply restrictor at all le...
...k} \gets y^k + P^k y^{k+1}$
\EndFor
\State $w \gets y^1$
\end{algorithmic}

At each level $k < nlev$, MLD2P4 implements, as smoothers, domain decomposition AS preconditioners [6,9,24]. Therefore, the multilevel preconditioners based on the multiplicative framework are referred to as hybrid, i.e. multiplicative among the levels and additive inside any single level. The point point-Jacobi smoother is available too; furthermore, multiple sweeps of each smoother can be performed.

In the AS methods the index space $\Omega^k$ is divided into $m_k$ subsets $\Omega^k_i$ of size $n_{k,i}$. The subsets may have an overlap $\delta$, which is defined using the adjacency graph $(\Omega^k, E^k)$ of the matrix $A^k$, where $\Omega^k$ is the vertex set and $E^k=\{(i,j) : a^k_{ij} \neq 0\}$ is the edge set. Two vertices are called adjacent if there is an edge connecting them. For any integer $\delta > 0$, a $\delta$-overlap partition of $\Omega^k$ is defined recursively as follows. Given a 0-overlap partition of $\Omega^k$, i.e. a set of $m_k$ disjoint nonempty sets of $\Omega^k_i(0) \subset \Omega^k$ such that $\cup_{i=1}^{m_k} \Omega^k_i(0) = \Omega^k$, a $\delta$-overlap partition of $\Omega^k$ is obtained by considering the sets $\Omega^k_i(\delta)) \supset \Omega^k_i(\delta-1)$ obtained by including the vertices that are adjacent to any vertex in $\Omega^k_i(\delta-1)$. In MLD2P4 the number of sets $\Omega^k_i$ matches the number of available processors and each processor is assigned one of these set. The 0-overlap partition at the finest level is determined by the (general row-block) distribution of the matrix to be preconditioned (see [17]).

Figure 3: Application phase of a symmetrized multiplicative multilevel preconditioner.

\begin{algorithmic}
\State $v^1 \gets v$
\State ! Apply smoother and restric...
... \State $y^k \gets y^k+r^k$
\EndFor
\State $w \gets y^1$
\end{algorithmic}

For each $i$ we consider the restriction operator $R_i^k: \Re^{n_k}
\longrightarrow \Re^{n_{k,i}}$ that maps a vector $v^k$ to the vector $v_i^k$ made of the components of $v^k$ with indices in $\Omega^k_i$, and the prolongation operator $P^k_i = (R_i^k)^T$. These operators are then used to build $A_i^k=R_i^kA^kP_i^k$, which is a restriction of $A^k$ to the index space $\Omega^k_i$. The classical AS preconditioner is defined as

\begin{displaymath}
S^k_{AS} = \sum_{i=1}^{m_k} P_i^k (A_i^k)^{-1} R_i^{k},
\end{displaymath}

where $A_i^k$ is supposed to be nonsingular. We observe that an approximate inverse of $A_i^k$ is usually considered instead of $(A_i^k)^{-1}$. The application of $S^k_{AS}$ to a vector $w^k \in \Re^{n_k}$, i.e. the computation of $z^k=S^k_{AS}w^k$, requires Variants of the classical AS method, which use modifications of the restriction and prolongation operators, are also implemented in MLD2P4. Among them, the Restricted AS (RAS) preconditioner usually outperforms the classical AS preconditioner in terms of convergence rate and of computation and communication time on parallel distributed-memory computers, and is therefore the most widely used among the AS preconditioners [7,16]. Direct solvers based on the LU factorization as well as approximate solvers based on the ILU factorization or on the point- and block-Jacobi iterative methods are implemented as smoothers at the coarsest level.

In MLD2P4 the hierarchy of index spaces (see ) and the corresponding mapping operators (see ) are built by applying the smoothed aggregation technique[2,22,28]. The basic idea is to build the coarse set of indices $\Omega^{k+1}$ by suitably grouping the indices of $\Omega^k$ into disjoint subsets (aggregates) and to define a simple ``tentative'' prolongator $\widetilde{P}^k$ whose range should contain the near null space of $A^k$; the final interpolation operator $P^k$ is formed by applying a suitable smoother to $\widetilde{P}^k$, in order to obtain low energy coarse basis functions and hence good convergence rates.

To build the aggregates we have implemented the coarsening algorithm sketched in [5]. According to [28], a modification of this algorithm has been actually considered, in which each aggregate $N^k_r$ is made of vertices of $\Omega^k$ that are strongly coupled to a certain root vertex $r \in \Omega^k$, i.e.

\begin{displaymath}N^k_r = \left\{s \in \Omega^k: \vert a^k_{rs}\vert > \theta \...
...t a^k_{rr}a^k_{ss}\vert} \right\}
\cup \left\{ r \right\} ,
\end{displaymath}

for a given $\theta \in [0,1]$. Since this algorithm has a sequential nature, a decoupled version of it has been considered, where each processor $i$ independently applies the algorithm to the set of vertices $\Omega^k_i(0)$ assigned to it according to the initial data distribution [27]. This version is embarrassingly parallel, since it does not require any data communication. On the other hand, it may produce non-uniform aggregates near boundary vertices, i.e. near vertices adjacent to vertices in other processors, and is strongly dependent on the number of processors and on the initial partitioning of the matrix $A$. Nevertheless, this algorithm has been chosen for the implementation in MLD2P4, since it has been shown to produce good results in practice [1,4,5,27].

The tentative prolongator $\widetilde{P}^k$ is piecewise constant interpolation operator, defined as

\begin{displaymath}
\widetilde{P}^k=(\widetilde{p}^k_{ij}), \quad \widetilde{p}...
...^k_j \\
0 & \quad \mbox{otherwise}
\end{array} \right. .
\end{displaymath} (5)

Then $P^k$ is computed as
\begin{displaymath}
P^k = S \widetilde{P}^k,
\end{displaymath} (6)

where
\begin{displaymath}
S = I - \omega D^{-1} A
\end{displaymath} (7)

is the damped Jacobi smoother and the value of $\omega$ can be chosen using some estimate of the spectral radius of $D^{-1}A$ [2].


next up previous contents
Next: Getting started Up: Multi-level Schwarz preconditioners Previous: Background   Contents