The parareal algorithm for American options

This note provides a description of the parareal method for American contracts, a numerical section to assess its performance. The scalar case is investigated. Least-Square Monte Carlo (LSMC) and parareal time decomposition with two or more levels are used, leading to an eﬃcient parallel implementation. It contains also a convergence argument for the two-level parareal Monte Carlo method when the time step used for the Euler scheme at each level is appropriate. This argument provides also a tool for analyzing the multilevel case.


Introduction
In quantitative finance, risk assessment is computer intensive and expensive, and there is a market for cheaper and faster methods, as seen from the large literature on parallelism and GPU implementation of numerical methods for option pricing [1,6,7,11,12,[14][15][16]21].
American contracts are not easy to compute on a parallel computer; even if a large number of them have to be computed at once, an embarrassingly parallel problem, still the cost of the transfer of data makes parallelism at the level of one contract attractive.But the task is not easy, especially when the number of underlying assets is large [3,5,22], ruling out the PDE approach [2].Furthermore the most popular sequential algorithm is the Least-Square Monte Carlo (LSMC) method of Longstaff and Schwartz [19].Exploiting parallelism by allocating blocks of Monte Carlo paths to different processors is not convincingly efficient [7], because the backward regression is essentially sequential and needs all Monte Carlo paths in the same processor.
In this note, we investigate the parareal method, introduced in [18], for the task.An earlier study by Bal and Maday [4] has paved the way but it is restricted to Stochastic Differential Equations (SDE) without LSMC.Yet it contains a convergence proof for the two-level method in the restricted case where the solution is computed exactly at the lowest level [4].
This note provides a description of the parareal method for American contracts, a numerical section to assess its performance.The scalar case is investigated.Least-Square Monte Carlo (LSMC) and parareal time decomposition with two or more levels are used, leading to an efficient parallel implementation.It contains also a convergence argument for the two-level parareal Monte Carlo method when the time step used for the Euler scheme at each level is appropriate.
Convergence of LSMC for American contracts has been proved by Clément, Lamberton and Protter [9]; it is not unreasonable to expect an extension of their estimates for the parareal method, but this note does not contain such a result, only a numerical assessment.

The problem
With the usual notations [17], consider a probability space ( , A, P), and functions b, A (vanilla) European contract on X is defined by its maturity T and its payoff in the case of a put of strike price κ and interest rate r.An American style contract allows the owner to claim the payoff f (t, X t ) at any time ∈ [0, T ].So a rational strategy to maximize the average profit V at time t is to find the [t, T ]-valued F -stopping time solution to the Snell envelope problem: where F = (F t ) t∈(0,T ) is the (augmented) filtration of W and T F t denotes the set of [t, T ]-valued F -stopping times.Such an optimal stopping time exists (see [8,20]).We do not specify b, σ or f to stay in a general Optimal Stopping framework.
In practice, American style options are replaced by the so-called Bermuda options where the exercise instants are restricted to a time grid t k = kh, k = 0, . . ., K , where h = T K (K ∈ N * ).Owing to the Markov property of {X t k } K k=0 , the corresponding Snell envelope reads (V (t k , X t k )) k=0,...,K and satisfies a Backward Dynamic Programming recursion on k: The optimal stopping times τ k (from time t k ) are given by a similar backward recursion: When (X t k ) k=0,...,K cannot be simulated at a reasonable computational cost, it can be approximated by the Euler scheme with step h, denoted ( Xh t k ) k=0,...,K , which is a simulable Markov chain recursively defined by where From now on we switch to the Euler scheme, its Snell envelope, etc.
In LSMC, for each k, the conditional expectation t k , is approximated by its L 2 -projection on the linear space spanned by the monomials {x p } P p=0 from the values {e −rh V (t k+1 , Xh,(m) by M Monte Carlo paths using (4); then each path has its own optimal stopping time at each k ∈ {0, . . ., K − 1} given by (for the stopping problem starting at k) Xh,(m) .
Finally the price of the American contract is computed by in the least-square sense (e.g., Longstaff and Schwartz's paper [19]) in the vector subspace ( Xh t k ) p , p = 0 : P of L 2 (P).

The parareal method
Consider an ODE Assume that G δ (x k , t k ) is a high-precision integrator that computes x at t k+1 from x k at t k .Assume G is a similar integrator but of low precision.The parareal algorithm is an iterative process with n = 0, . . ., N − 1 above a forward loop in time, k = 0, . . ., K − 1 ( So the coarse-grid solution is corrected by the difference between the fine-grid prediction computed from the old value on that interval and the coarse-grid old solution.In this analysis, G δ and G are Euler explicit schemes with time steps δt < t respectively. The same method can be applied to an SDE in the context of the Monte Carlo method provided the random variables {Z m k, j } m=1,...,M j:=1,..., J −1,k:=1,...,K defining W k in (4) are sampled once and for all in the initial phase of the algorithm and reused for all n (see the initialization step in Algorithm 3.2 for the notations).
The iterative process ( 5) is applied on each sample path with G a single step of (4) with h = t and G δ the result of J steps of (4) with h = δt.An error analysis is available in [4] for the stochastic case in the limit case δt = 0, i.e. when the fine integrator is the exact solution.For further results of the parareal method applied to deterministic ODEs and PDEs, see [18] and [13].In this note, we also extend the result of [4] to the case 0 < δt < t.

Algorithm
We denote by V k a realization of V (t k , X t k ), k = 0, . . ., K = T t ; consider a refinement of each interval (t k , t t k+1 ) by a uniform sub-partition of time step δt = t J , for some integer J > 1.Then Denote by P f the L 2 -projection of f on the monomials 1, x, . . ., x P .Let n = 0, . . ., N − 1 be the iteration index of the parareal algorithm.
(ii) compute the coarse-grid solution at t k+1 : X t k+1 = Xn+1 Note that all fine-grid computations are local and can be allocated to a separate processor for each k, for parallelization.
The following partial results can be established for Algorithm 3.2bis obtained from 3.2 by changing the first step into Ṽ δ,n k, J = P E( V n k+1 | Xδ,n k, J ) and the last step into: Then there exist C , independent of k, t and n, such that for k = 0, . . ., K , n = 0, . . ., N: Furthermore, Xn t k = Xδ t k for all n ≥ k (e.g., definition (iii)).

Corollary 3.2. For a fixed δt and n parareal iterations, the final and uniform errors satisfy
respectively where C only depends on the Lipschitz constants and norms of b, b , b , σ , σ , σ and on T.
This estimate shows that when t << C , the method converges exponentially in n and geometrically in t.
Remark 2. The estimate (6) indicates that a recursive use of parareal with each sub-interval redivided into J = O ( t −1 ) smaller intervals, the so-called multilevels parareal, is better than many iterations at the second level only.Indeed, as the error decreases proportionally to ( t) n 2 at each level and as t becomes t 2 at the next grid level, the error after L levels is decreased by ( t) nL 2 .
where T F t k denotes the set of {t k , t k+1 , . . . ,t K }-valued F -stopping times.Then, for some constant C , max k=0,...,K Ṽ ,n (Note that ( V ,δ t k ) k=0,...,K is but the coarse Snell envelope of the refined Euler scheme.)At a fixed time t k , we have the better estimate Ṽ ,n ,K denote the "fine" Snell envelope of the refined Euler scheme at times t k defined by Then, for some constant C , Remark 3. A result similar to (a) can be obtained for ( V n t k ) k=0,...,K , i.e. when F t k is replaced by Xn t k in the expectation defining V n t k at the cost of losing a √ t in the error estimate.Both quantities Ṽ ,n t k and V n t k do not coincide as Xn is not Markovian (it also depends on Xn−1 ).

Table 1
Absolute error from the American payoff computed on the fine grid by a sequential LSMC standard algorithm and the same computed using the parareal iterative Algorithms 3.2 and 3.2bis.The coarse grid has K intervals; the coarse time step is t/K ; the fine grid has a fixed number of points, hence each interval (t k , t t k+1 ) has J sub-intervals.The top 4 lines of numbers correspond to Algorithm 3.2, while the last 4 lines correspond to Algorithm 3.2bis, for which a partial convergence estimate can be obtained, but which does not work as well numerically.

The Black-Scholes case
Here the underlying asset is given by the Black-Scholes SDE, σ (x, t) = σ 0 x, b(x, t) = rx.In the test, σ 0 = 0.2.We have chosen a fine grid with δt = T /32.The free parameters are t, which governs the number of points on the coarse grid and n the number of parareal algorithms.The error between the American payoff computed on the fine grid by LSMC and the same computed by the parareal algorithm is displayed on Table 1 for both Algorithms 3.2 and 3.2bis.
The same information about convergence is now displayed in the two graphs in Fig. 1 for the errors versus t and the errors versus n.We were not able to decrease t to smaller values because the computing time becomes too large.

The constant elasticity case
The volatility is now a function of price [10]: σ (x, t) = σ 0 x 0.7 .All parameters have the same values as above.The results are shown in Fig. 2.

Multilevel parareal algorithm
The previous construction being recursive, one can again apply the two-level parareal algorithm to LSMC on each interval [t k , t t k+1 ].The problem of finding the optimal strategy for parallelism and computing time is complex, because there are so many parameters; in what follows, the number of levels is L = 4; furthermore, when an interval with J + 1 points is divided into sub-intervals, each one is endowed with a partition using J + 1 points as well.So, if the coarse grid has K intervals,

Table 2
Absolute error between the computed payoff with the multilevels parareal method and the reference value of Longstaff-Schwartz.The number of levels is L = 4, each level is subdivided into K intervals; K 4 is the number of intervals at the deepest level.

Table 3
Absolute error between the computed payoff with the multilevel parareal method and the reference value of Longstaff-Schwartz.There are L = 4 levels; at level l − 1, to obtain level l each interval is divided into J l intervals.The errors are given versus the number of parareal iterations n = 1, 2, 3, 4. Note that all subdivisions give more or less the same precision; computationally and for parallelism the last one is the best.

Time-step Total
Absolute-error   the 4th grid has K 4 intervals.The results are compared with the reference value of Longstaff-Schwartz, 4.478, shown in Table 2 and in Fig. 3.The number of parareal iterations is 4 and the error is displayed at each n.We have also carried out some tests with sub-partitions using J = K .Thus each level has its own number of points per interval, J l .Errors shown on Table 3 are also shown in Fig. 3 for n = 2.It seems to be O (K 4 ) for K small and O (K 2 ) for K bigger.The method was implemented in parallel; each interval is allocated to a processor, at each level in a round-robin order.Almost perfect parallelism is obtained in our tests on a machine with 32 processors, as shown in Fig. 4 and Table 3.

Fig. 1 .
Fig. 1.Black-Scholes case: errors on the payoff versus t on the left for several values of n and versus n on the right for several values of t.Both graphs are for Algorithm 3.2 in log-log scales and indicate a general behavior of the error not incompatible with (3.1).

Fig. 3 .
Fig. 3.Comparison between a standard LSMC solution and the parareal solution for the same number of time intervals at the finest level.The 4 points have respectively 1,2,3,4 levels; the first data point has one level and 4 intervals, the second has 2 levels and 16 intervals, the third one 3 levels and 64 intervals, the fourth one 4 levels and 256 intervals.The total number of time steps is on the horizontal axis, in log scale, and the error at n = 2 is on the vertical axis, in log scale as well.

Fig. 4 .
Fig.4.Speed-up versus the number of processors, i.e. the parareal CPU time on a parallel machine divided by the parareal CPU time on the same machine but running on one processor.There are two levels only; the parameters are K = 1, 2.., 32, n = 2 and J = 100 so as to keep each processor fully busy.