# 5 Stochastic process

This section details how to create, manipulate and propagate some stochastic processes.

In this document, we note:

• $X:\Omega ×𝒟\to {ℝ}^{d}$ a multivariate stochastic process of dimension $d$, where $\omega \in \Omega$ is an event, $𝒟$ is a domain of ${ℝ}^{n}$, $\underline{t}\in 𝒟$ is a multivariate index and $X\left(\omega ,\underline{t}\right)\in {ℝ}^{d}$;

• ${X}_{\underline{t}}:\Omega \to {ℝ}^{d}$ the random variable at index $\underline{t}\in 𝒟$ defined by ${X}_{\underline{t}}\left(\omega \right)=X\left(\omega ,\underline{t}\right)$;

• $X\left(\omega \right):𝒟\to {ℝ}^{d}$ a realization of the process $X$, for a given $\omega \in \Omega$ defined by $X\left(\omega \right)\left(\underline{t}\right)=X\left(\omega ,\underline{t}\right)$.

If $n=1$, $t$ may be interpreted as a time stamp to recover the classical notation of a stochastic process.

If the process is a second order process, we note:

• $m:𝒟\to {ℝ}^{d}$ its mean function, defined by $m\left(\underline{t}\right)=𝔼\left[{X}_{\underline{t}}\right]$,

• $C:𝒟×𝒟\to {ℳ}_{d×d}\left(ℝ\right)$ its covariance function, defined by $C\left(\underline{s},\underline{t}\right)=𝔼\left[\left({X}_{\underline{s}}-m\left(\underline{s}\right)\right){\left({X}_{\underline{t}}-m\left(\underline{t}\right)\right)}^{t}\right]$,

• $R:𝒟×𝒟\to {ℳ}_{d×d}\left(ℝ\right)$ its correlation function, defined for all $\left(\underline{s},\underline{t}\right)$, by $R\left(\underline{s},\underline{t}\right)$ such that for all $\left(i,j\right)$, ${R}_{ij}\left(\underline{s},\underline{t}\right)={C}_{ij}\left(\underline{s},\underline{t}\right)/\sqrt{{C}_{ii}\left(\underline{s},\underline{t}\right){C}_{jj}\left(\underline{s},\underline{t}\right)}$.
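The correlation function is obtained from the covariance function by normalizing with its diagonal terms. A minimal NumPy sketch for a fixed pair of indices $\left(\underline{s},\underline{t}\right)$ (the covariance values are illustrative, and this is not an OpenTURNS API):

```python
import numpy as np

# Hypothetical covariance matrix C(s, t) of a bivariate process
# at a fixed pair of indices (s, t); values chosen for illustration
C = np.array([[4.0, 1.2],
              [1.2, 9.0]])

# Correlation per the definition: R_ij = C_ij / sqrt(C_ii * C_jj)
d = np.sqrt(np.diag(C))
R = C / np.outer(d, d)

print(R)  # unit diagonal, off-diagonal 1.2 / (2 * 3) = 0.2
```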

We recall here some useful definitions.

Spatial (temporal) and Stochastic Mean: The spatial mean of the process $X$ is the function $m:\Omega \to {ℝ}^{d}$ defined by:

 $\begin{array}{c}\hfill m\left(\omega \right)=\frac{1}{|𝒟|}{\int }_{𝒟}X\left(\omega \right)\left(\underline{t}\right)\phantom{\rule{0.166667em}{0ex}}d\underline{t}\end{array}$ (28)

If $n=1$ and if the mesh is a regular grid $\left({t}_{0},\cdots ,{t}_{N-1}\right)$, then the spatial mean corresponds to the temporal mean defined by:

 $\begin{array}{c}\hfill m\left(\omega \right)=\frac{1}{{t}_{N-1}-{t}_{0}}{\int }_{{t}_{0}}^{{t}_{N-1}}X\left(\omega \right)\left(t\right)\phantom{\rule{0.166667em}{0ex}}dt\end{array}$ (29)

The spatial mean is estimated from one realization of the process (see the use case on Field or Time series).
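For $n=1$ on a regular grid, the temporal mean (29) can be approximated from a single realization with a quadrature rule. A minimal NumPy sketch, using the deterministic path $x\left(t\right)={t}^{2}$ as a stand-in for a realization:

```python
import numpy as np

# Regular time grid on [0, 1] and one "realization" of a process,
# here the deterministic path x(t) = t^2 for illustration
t = np.linspace(0.0, 1.0, 101)
x = t ** 2

# Temporal mean (29) approximated by the trapezoidal rule
dt = t[1] - t[0]
temporal_mean = (0.5 * x[0] + x[1:-1].sum() + 0.5 * x[-1]) * dt / (t[-1] - t[0])

print(temporal_mean)  # close to the exact value 1/3
```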

The stochastic mean of the process $X$ is the function $g:𝒟\to {ℝ}^{d}$ defined by:

 $\begin{array}{c}\hfill g\left(\underline{t}\right)=𝔼\left[{X}_{\underline{t}}\right]\end{array}$ (30)

The stochastic mean is estimated from a sample of realizations of the process (see the use case on the Process sample).

For an ergodic process, the stochastic mean and the spatial mean are equal and constant (equal to the constant vector noted $\underline{c}$):

 $\begin{array}{c}\hfill \forall \omega \in \Omega ,\phantom{\rule{0.166667em}{0ex}}\forall \underline{t}\in 𝒟,\phantom{\rule{0.166667em}{0ex}}m\left(\omega \right)=g\left(\underline{t}\right)=\underline{c}\end{array}$ (31)
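The ergodic property can be illustrated numerically: for a process that is iid across the index set, the temporal mean of one long realization and the stochastic mean at a fixed instant, estimated over many realizations, agree. A NumPy sketch (white noise chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A trivially ergodic process: X_t iid N(0, 1) on a grid of N instants.
# The temporal mean of one long realization and the stochastic mean at
# a fixed instant (estimated over K realizations) both converge to the
# same constant c = 0.
N, K = 100_000, 100_000
one_path = rng.standard_normal(N)   # one realization over N instants
fixed_t = rng.standard_normal(K)    # K realizations at one fixed instant

print(one_path.mean(), fixed_t.mean())  # both close to 0
```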

Normal process: A stochastic process is normal if all its finite dimensional joint distributions are normal, which means that for all $k\in ℕ$ and ${I}_{k}\subset {ℕ}^{*}$, with $\mathrm{card}\phantom{\rule{0.166667em}{0ex}}{I}_{k}=k$, there exist ${\underline{m}}_{1},\cdots ,{\underline{m}}_{k}\in {ℝ}^{d}$ and ${\underline{\underline{C}}}_{1,\cdots ,k}\in {ℳ}_{kd,kd}\left(ℝ\right)$ such that:

 $\begin{array}{c}\hfill 𝔼\left[exp\left\{i{\underline{X}}_{{I}_{k}}^{t}{\underline{U}}_{k}\right\}\right]=exp\left\{i{\underline{U}}_{k}^{t}{\underline{M}}_{k}-\frac{1}{2}{\underline{U}}_{k}^{t}{\underline{\underline{C}}}_{1,\cdots ,k}{\underline{U}}_{k}\right\}\end{array}$ (32)

where ${\underline{X}}_{{I}_{k}}^{t}=\left({X}_{{\underline{t}}_{1}}^{t},\cdots ,{X}_{{\underline{t}}_{k}}^{t}\right)$, ${\underline{U}}_{k}^{t}=\left({\underline{u}}_{1}^{t},\cdots ,{\underline{u}}_{k}^{t}\right)$ and ${\underline{M}}_{k}^{t}=\left({\underline{m}}_{1}^{t},\cdots ,{\underline{m}}_{k}^{t}\right)$ and ${\underline{\underline{C}}}_{1,\cdots ,k}$ is the symmetric matrix :

 $\begin{array}{c}\hfill {\underline{\underline{C}}}_{1,\cdots ,k}=\left(\begin{array}{cccc}C\left({\underline{t}}_{1},{\underline{t}}_{1}\right)& C\left({\underline{t}}_{1},{\underline{t}}_{2}\right)& \cdots & C\left({\underline{t}}_{1},{\underline{t}}_{k}\right)\\ \cdots & C\left({\underline{t}}_{2},{\underline{t}}_{2}\right)& \cdots & C\left({\underline{t}}_{2},{\underline{t}}_{k}\right)\\ \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & C\left({\underline{t}}_{k},{\underline{t}}_{k}\right)\end{array}\right)\end{array}$ (33)

A normal process is entirely defined by its mean function $m$ and its covariance function $C$ (or correlation function $R$).
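Since a normal process is entirely defined by $m$ and $C$, a realization on a finite set of instants can be drawn by factorizing the matrix (33). A NumPy sketch for $d=1$ with the exponential covariance $C\left(s,t\right)={e}^{-\lambda |s-t|}$ (the covariance model used later in this section); this illustrates the principle, not the OpenTURNS implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

# Instants t_1, ..., t_k and the exponential covariance
# C(s, t) = exp(-lambda * |s - t|)   (d = 1, so each block is scalar)
t = np.linspace(0.0, 1.0, 50)
lam = 4.0
C = np.exp(-lam * np.abs(t[:, None] - t[None, :]))

# A normal process is entirely defined by m and C: one zero-mean
# realization on the grid is L @ xi, with C = L L^T (Cholesky)
L = np.linalg.cholesky(C)
x = L @ rng.standard_normal(len(t))
print(x.shape)  # one realization, one value per instant
```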

Weak stationarity (second order stationarity): A process $X$ is weakly stationary, or stationary of second order, if its mean function is constant and its covariance function is invariant by translation:

 $\begin{array}{cc}\hfill \forall \left(\underline{s},\underline{t}\right)\in 𝒟,& \phantom{\rule{0.166667em}{0ex}}m\left(\underline{t}\right)=m\left(\underline{s}\right)\hfill \\ \hfill \forall \left(\underline{s},\underline{t},\underline{h}\right)\in 𝒟,& \phantom{\rule{0.166667em}{0ex}}C\left(\underline{s},\underline{s}+\underline{h}\right)=C\left(\underline{t},\underline{t}+\underline{h}\right)\hfill \end{array}$ (34)

We note ${C}^{stat}\left(\underline{\tau }\right)$ for $C\left(\underline{s},\underline{s}+\underline{\tau }\right)$ as this quantity does not depend on $\underline{s}$.

In the continuous case, $𝒟$ must be equal to ${ℝ}^{n}$ as it is invariant by any translation. In the discrete case, $𝒟$ is a lattice $ℒ=\left({\delta }_{1}ℤ×\cdots ×{\delta }_{n}ℤ\right)$ where $\forall i,{\delta }_{i}>0$.
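For a weakly stationary process on a one-dimensional lattice, ${C}^{stat}$ can be estimated by averaging over both translations and realizations. A NumPy sketch on white noise, where the exact answer is known (${C}^{stat}\left(0\right)=1$ and ${C}^{stat}\left(\tau \right)=0$ for $\tau \ne 0$):

```python
import numpy as np

rng = np.random.default_rng(1)

# Estimate C_stat(tau) by averaging x_s * x_{s+tau} over s (translation
# invariance) and over realizations. Illustration: zero-mean unit white
# noise, so C_stat(0) = 1 and C_stat(tau) = 0 for tau != 0.
K, N = 2000, 200
X = rng.standard_normal((K, N))

def c_stat(X, tau):
    """Empirical stationary covariance at integer lag tau."""
    prod = X[:, : N - tau] * X[:, tau:]
    return prod.mean()

print(c_stat(X, 0), c_stat(X, 5))  # approximately 1.0 and 0.0
```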

Stationarity: A process $X$ is stationary if its finite dimensional distributions are invariant by translation:

 $\forall k\in ℕ,\phantom{\rule{0.166667em}{0ex}}\forall \left({\underline{t}}_{1},\cdots ,{\underline{t}}_{k}\right)\in 𝒟,\phantom{\rule{0.166667em}{0ex}}\forall \underline{h}\in {ℝ}^{n},\phantom{\rule{0.166667em}{0ex}}\left({X}_{{\underline{t}}_{1}},\cdots ,{X}_{{\underline{t}}_{k}}\right)\stackrel{𝒟}{=}\left({X}_{{\underline{t}}_{1}+\underline{h}},\cdots ,{X}_{{\underline{t}}_{k}+\underline{h}}\right)$ (35)

Spectral density function: If $X$ is a zero-mean weakly stationary continuous process and if for all $\left(i,j\right)$, ${C}_{i,j}^{stat}:{ℝ}^{n}\to ℝ$ is ${ℒ}^{1}\left({ℝ}^{n}\right)$ (i.e. ${\int }_{{ℝ}^{n}}|{C}_{i,j}^{stat}\left(\underline{\tau }\right)|\phantom{\rule{0.166667em}{0ex}}d\underline{\tau }<+\infty$), we define the bilateral spectral density function $S:{ℝ}^{n}\to {ℋ}^{+}\left(d\right)$, where ${ℋ}^{+}\left(d\right)\subset {ℳ}_{d}\left(ℂ\right)$ is the set of $d$-dimensional hermitian positive definite matrices, as the Fourier transform of the covariance function ${C}^{stat}$:

 $\forall \underline{f}\in {ℝ}^{n},\phantom{\rule{0.166667em}{0ex}}S\left(\underline{f}\right)={\int }_{{ℝ}^{n}}exp\left\{-2i\pi <\underline{f},\underline{\tau }>\right\}{C}^{stat}\left(\underline{\tau }\right)\phantom{\rule{0.166667em}{0ex}}d\underline{\tau }$ (36)

Furthermore, if for all $\left(i,j\right)$, ${S}_{i,j}:{ℝ}^{n}\to ℂ$ is ${ℒ}^{1}\left({ℝ}^{n}\right)$ (i.e. ${\int }_{{ℝ}^{n}}|{S}_{i,j}\left(\underline{f}\right)|\phantom{\rule{0.166667em}{0ex}}d\underline{f}<+\infty$), ${C}^{stat}$ may be recovered from $S$ as follows:

 ${C}^{stat}\left(\underline{\tau }\right)={\int }_{{ℝ}^{n}}exp\left\{2i\pi <\underline{f},\underline{\tau }>\right\}S\left(\underline{f}\right)\phantom{\rule{0.166667em}{0ex}}d\underline{f}$ (37)

In the discrete case, the spectral density is defined for a zero-mean weakly stationary process, where $𝒟=\left({\delta }_{1}ℤ×\cdots ×{\delta }_{n}ℤ\right)$ with $\forall i,{\delta }_{i}>0$ and where the previous integrals are replaced by sums.
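The Fourier pair (36)-(37) can be checked numerically in the scalar case $d=n=1$: for the exponential covariance ${C}^{stat}\left(\tau \right)={e}^{-\lambda |\tau |}$, the spectral density is known in closed form, $S\left(f\right)=2\lambda /\left({\lambda }^{2}+4{\pi }^{2}{f}^{2}\right)$. A NumPy sketch using plain Riemann summation, for illustration only:

```python
import numpy as np

# Check (36) numerically for the exponential covariance
# C_stat(tau) = exp(-lambda * |tau|), whose spectral density is
# S(f) = 2 * lambda / (lambda^2 + 4 * pi^2 * f^2)
lam = 4.0
tau = np.linspace(-50.0, 50.0, 200001)
dtau = tau[1] - tau[0]
c = np.exp(-lam * np.abs(tau))

def S_numeric(f):
    # C_stat is even, so the imaginary part of (36) vanishes
    return np.sum(np.cos(2.0 * np.pi * f * tau) * c) * dtau

f = 1.0
S_exact = 2.0 * lam / (lam ** 2 + (2.0 * np.pi * f) ** 2)
print(S_numeric(f), S_exact)  # the two values agree closely
```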

## 5.1 UC : Creation of a mesh

This section details how to create a mesh $ℳ$ associated to a domain $𝒟\in {ℝ}^{n}$.

A mesh is defined from vertices in ${ℝ}^{n}$ and a topology that connects the vertices: the simplices. The simplex $Indices\left(\left[{i}_{1},\cdots ,{i}_{n+1}\right]\right)$ connects the vertices of indices $\left({i}_{1},\cdots ,{i}_{n+1}\right)$. In dimension 1, a simplex is an interval $Indices\left(\left[{i}_{1},{i}_{2}\right]\right)$; in dimension 2, it is a triangle $Indices\left(\left[{i}_{1},{i}_{2},{i}_{3}\right]\right)$.

The mesh can be imported from a MSH file.

OpenTURNS makes it easy to create a mesh which is a box of dimension $d=1$ or $d=2$, regularly meshed in all its directions, thanks to the object IntervalMesher.

Consider $X:\Omega ×𝒟\to {ℝ}^{d}$ a multivariate stochastic process of dimension $d$, where $𝒟\in {ℝ}^{n}$. The mesh $ℳ$ is a discretization of the domain $𝒟$.

 Requirements $•$ vertices : myVertices type: NumericalSample $•$ simplices : mySimplices type: IndicesCollection $•$ a MSH file : myMSHFile.msh type: a MSH file Results $•$ meshes of dimension 1 and 2 : myMesh1D, myMesh2D, myMSHmesh, myMeshBox type: Mesh

Python script for this UseCase :

script_docUC_StocProc_Mesh.py

```python
from openturns import *

########################
# Case 1: Define a mesh from its vertices and simplices

# Define a one-dimensional mesh
vertices = [[0.5], [1.5], [2.1], [2.7]]
simplices = IndicesCollection([[0, 1], [1, 2], [2, 3]])
myMesh1D = Mesh(vertices, simplices)

# Define a bidimensional mesh
vertices = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0],
            [1.5, 1.0], [2.0, 1.5], [0.5, 1.5]]
simplices = IndicesCollection(
    [[0, 1, 2], [1, 2, 3], [2, 3, 4], [2, 4, 5], [0, 2, 5]])
myMesh2D = Mesh(vertices, simplices)

# Import a mesh from a MSH file
# myMSHmesh = Mesh.ImportFromMSHFile('myMSHFile.msh')

# Draw the meshes
mygraph1 = myMesh1D.draw()
# Show(mygraph1)
mygraph2 = myMesh2D.draw()
# Show(mygraph2)

########################
# Case 2: Define a mesh which is a regularly meshed box
# in dimension 1 or 2

# Define the number of intervals in each direction of the box
myIndices = Indices([5, 10])
myMesher = IntervalMesher(myIndices)

# Create the mesh of the box [0., 2.] * [0., 4.]
lowerBound = [0., 0.]
upperBound = [2., 4.]
myInterval = Interval(lowerBound, upperBound)
myMeshBox = myMesher.build(myInterval)
mygraph3 = myMeshBox.draw()
# Show(mygraph3)
```

The first example, illustrated in Figure 53, is the 1D case of the script above, where the mesh is defined in $ℝ$ by 4 nodes and 3 intervals.

The second example, also illustrated in Figure 53, is the 2D case, where the mesh is defined in ${ℝ}^{2}$ by 6 nodes and 5 triangles.

The last example, illustrated in Figure 54, is the 2D case where the mesh is the box $\left[0,2\right]×\left[0,4\right]$, regularly meshed with 5 intervals along the first direction and 10 intervals along the second one.

The same kind of mesh, defined from the box $\left[1,2\right]×\left[0,\pi \right]$, regularly meshed with 50 intervals along each direction and mapped through the function $f\left(r,\theta \right)=\left(r\phantom{\rule{0.166667em}{0ex}}cos\theta \phantom{\rule{0.166667em}{0ex}}cos5\theta ,\phantom{\rule{0.166667em}{0ex}}r\phantom{\rule{0.166667em}{0ex}}sin\theta \phantom{\rule{0.166667em}{0ex}}sin5\theta \right)$, is drawn in Figure 54.
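The mapped mesh of Figure 54 can be sketched by transforming the vertices of the box directly; with OpenTURNS one would build the box with IntervalMesher and transform its vertex sample, but plain NumPy suffices to illustrate the mapping (the grid sizes below are illustrative):

```python
import numpy as np

# Vertices of the box [1, 2] x [0, pi], regularly discretized, pushed
# through f(r, theta) = (r cos(theta) cos(5 theta), r sin(theta) sin(5 theta))
r = np.linspace(1.0, 2.0, 51)
theta = np.linspace(0.0, np.pi, 51)
R, T = np.meshgrid(r, theta, indexing="ij")

x = R * np.cos(T) * np.cos(5.0 * T)
y = R * np.sin(T) * np.sin(5.0 * T)
mapped_vertices = np.column_stack([x.ravel(), y.ravel()])
print(mapped_vertices.shape)  # (2601, 2)
```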

Figure 55 shows a bidimensional mesh made of 19750 triangles and 10001 nodes.

 A mesh in dimension 1. A mesh in dimension 2.
Figure 53
 A mesh defined from a Box. A box mesh mapped through $f\left(r,\theta \right)=\left(rcos\theta cos5\theta ,rsin\theta sin5\theta \right)$.
Figure 54
 A mesh in dimension 2.
Figure 55

## 5.2 UC : Creation of a time grid

This section details how to create a regular time grid. Note that a time grid is a particular mesh of $𝒟=\left[0,T\right]\subset ℝ$.

A regular time grid is a regular discretization of the interval $\left[0,T\right]\in ℝ$ into $N$ points, noted $\left({t}_{0},\cdots ,{t}_{N-1}\right)$.

The time grid can be defined from $\left({t}_{Min},\Delta t,N\right)$, where $N$ is the number of points in the time grid, $\Delta t$ is the time step between two consecutive instants and ${t}_{0}={t}_{Min}$. Then, ${t}_{k}={t}_{Min}+k\Delta t$ and ${t}_{Max}={t}_{Min}+\left(N-1\right)\Delta t$.

Consider $X:\Omega ×𝒟\to {ℝ}^{d}$ a multivariate stochastic process of dimension $d$, where $n=1$, $𝒟=\left[0,T\right]$ and $t\in 𝒟$ is interpreted as a time stamp. Then the mesh associated to the process $X$ is a (regular) time grid.

 Requirements none Results $•$ a time grid : myRegularGrid type: RegularGrid

Python script for this UseCase :

script_docUC_StocProc_TimeGrid.py

```python
from openturns import *

# Define the lower bound of the time grid, the number of steps
# and the time step between two instants
tMin = 0.
timeStep = 0.1
n = 10

# Create the RegularGrid
myRegularGrid = RegularGrid(tMin, timeStep, n)

# Get the first and the last instants,
# the step and the number of points of a RegularGrid
myMin = myRegularGrid.getStart()
myMax = myRegularGrid.getEnd()
myStep = myRegularGrid.getStep()
myRegularGridSize = myRegularGrid.getN()
```

## 5.3 UC : Manipulation of a process

The objective here is to manipulate a multivariate stochastic process $X:\Omega ×𝒟\to {ℝ}^{d}$, where $𝒟\in {ℝ}^{n}$ is discretized on the mesh $ℳ$.

A continuous realization from the process $X$ is defined as the numerical math function defined for a given $\omega \in \Omega$:

 $\begin{array}{c}\hfill X\left(\omega \right):𝒟\to {ℝ}^{d}\end{array}$ (38)

According to the process, the continuous realizations are built:

• either using a dedicated functional model if it exists: e.g. a functional basis process (see Figures 56 to 57);

• or using an interpolation from a discrete realization of the process on $ℳ$: in dimension $d=1$, a linear interpolation and in dimension $d\ge 2$, a piecewise constant function (the value at a given position is equal to the value at the nearest vertex of the mesh of the process).
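The two interpolation rules can be sketched in NumPy for a one-dimensional mesh (vertices and values are illustrative; OpenTURNS performs this internally):

```python
import numpy as np

# Two ways of building a continuous realization from a field:
# linear interpolation (dimension d = 1 on a 1-D mesh) versus the
# value at the nearest vertex (the piecewise-constant rule)
vertices = np.array([0.0, 0.5, 1.0, 1.5])
values = np.array([1.0, 3.0, 2.0, 4.0])

def linear(t):
    """Linear interpolation between the vertices."""
    return np.interp(t, vertices, values)

def nearest(t):
    """Value at the nearest vertex of the mesh."""
    return values[np.argmin(np.abs(vertices - t))]

print(linear(0.2), nearest(0.2))  # 1.8 and 1.0
```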

We get a continuous realization thanks to the method getContinuousRealization. It is also possible:

• to extract its marginal process for $j\in \left[0,d-1\right]$ : ${X}_{j}:\Omega ×𝒟\to ℝ$ thanks to the method getMarginalProcess;

• to get one or several realization(s) of the process, thanks to the methods getRealization, getSample;

• to get its mesh, thanks to the method getMesh and its time grid, thanks to the method getTimeGrid when the mesh can be interpreted as a regular time grid;

• to check whether the process is normal or stationary, thanks to the methods isNormal and isStationary.

 Requirements $•$ a stochastic process myProcess type: Process Results $•$ a time grid or a mesh: myTimeGrid, myMesh type: RegularGrid, Mesh $•$ a field or a sample of fields :: myField, myFieldSample type: Field, ProcessSample $•$ a stochastic process myMarginalProcess type: Process $•$ a continuous realization: myContReal type: NumericalMathFunction

Python script for this Use Case : script_docUC_StocProc_Process.py

```python
from openturns import *

# Get the dimension d of the process
dimension = myProcess.getDimension()

# Get the mesh of the process
myMesh = myProcess.getMesh()

# Get the time grid of the process
# only when the mesh can be interpreted as a regular time grid
myTimeGrid = myProcess.getTimeGrid()

# Get a realization of the process
myField = myProcess.getRealization()

# Get several realizations of the process
number = 10
myFieldSample = myProcess.getSample(number)

# Get a continuous realization of the process
myContReal = myProcess.getContinuousRealization()

# Get its first marginal
myContReal_marg1 = myContReal.getMarginal(0)

# Get the corners of the mesh
minMesh = myMesh.getVertices().getMin()
maxMesh = myMesh.getVertices().getMax()
myGraph = myContReal_marg1.draw(minMesh[0], maxMesh[0])
# Show(myGraph)

# Get the marginal of the process at index 1
# Careful! Numbering starts at 0
# Not yet implemented for some processes
# myMarginalProcess = myProcess.getMarginalProcess(1)

# Get the marginal of the process at the indices in indices
# Not yet implemented for some processes
# indices = Indices([0, 1])
# myMarginalProcess_2D = myProcess.getMarginalProcess(indices)

# Check whether the process is normal
print(myProcess.isNormal())

# Check whether the process is stationary
print(myProcess.isStationary())
```

Figure 56 shows two continuous realizations of the functional basis process $X:\Omega ×{\left[-4,4\right]}^{2}\to ℝ$ defined by:

 $\begin{array}{c}\hfill X\left(\omega ,\left(x,y\right)\right)=\sum _{i=1}^{3}\sum _{j=1}^{3}{\alpha }_{i,j}\left(\omega \right)sin\left(ix\right)sin\left(jy\right)\end{array}$ (39)

where the random variables ${\alpha }_{i,j}$ are iid according to the standard normal distribution.
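Drawing a realization of (39) amounts to drawing the nine coefficients once and then evaluating a deterministic function. A NumPy sketch (the grid resolution is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(7)

# One realization of the functional basis process (39):
# X(omega, (x, y)) = sum_{i,j=1..3} alpha_ij(omega) sin(i x) sin(j y),
# with alpha_ij iid standard normal
alpha = rng.standard_normal((3, 3))

def realization(x, y):
    """Evaluate the (now deterministic) realization at (x, y)."""
    i = np.arange(1, 4)
    return np.sum(alpha * np.outer(np.sin(i * x), np.sin(i * y)))

# Evaluate the realization on a grid over [-4, 4]^2
g = np.linspace(-4.0, 4.0, 81)
field = np.array([[realization(x, y) for y in g] for x in g])
print(field.shape)  # (81, 81)
```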

 One realization of the functional basis process. One realization of the functional basis process.
Figure 56

In Figure 57, we draw one realization versus one continuous realization of the functional basis process $X:\Omega ×\left[0,1\right]\to ℝ$ defined by:

 $\begin{array}{c}\hfill X\left(\omega ,x\right)=\sum _{i=1}^{10}{\Xi }_{i}\left(\omega \right)sin\left(2i\pi x\right)\end{array}$ (40)

on the regular grid of $\left[0,1\right]$ discretized with 21 points, where the coefficients ${\Xi }_{i}$ are independent and respectively distributed according to $Normal\left(0,\sigma =1.0/i\right)$.

When we ask for a realization, we get a field (each vertex ${x}_{k}$ of the mesh is associated to a value of the process $X$). The method draw draws a linear interpolation of the values: see the blue line.

When we ask for a continuous realization, we get a function $X\left(\omega \right):\left[0,1\right]\to ℝ$ defined by:

 $\begin{array}{c}\hfill X\left(\omega \right)\left(x\right)=\sum _{i=1}^{10}{\xi }_{i}\left(\omega \right)sin\left(2i\pi x\right)\end{array}$ (41)

where ${\xi }_{i}={\Xi }_{i}\left(\omega \right)$. The method draw draws the graph of the function, which is continuous and can be evaluated on other points than those of the mesh: see the red line.

 One realization versus one continuous realization of the function basis process X.
Figure 57

In Figure 58, we define a normal process of dimension 1 whose covariance model is $Exponential\left(a,\lambda \right)$ with $a=1.0$ and $\lambda =4.0$. The associated mesh is the regular grid of $\left[0,1\right]$ discretized with 21 points.

When we ask for a realization, we get a field and the method draw draws a linear interpolation of the values: see the blue line.

When we ask for a continuous realization, we get a function $X\left(\omega \right):\left[0,1\right]\to ℝ$ which is built using a linear interpolation of the values of the field: see the red line.

In that case, both methods getRealization and getContinuousRealization lead to the same graph.

 One realization versus one continuous realization of a temporal normal process of dimension 1.
Figure 58

## 5.4 UC : Manipulation of a field

The objective here is to create and manipulate a field. A field is the aggregation of a mesh $ℳ$ of a domain $𝒟\in {ℝ}^{n}$ and a sample of values in ${ℝ}^{d}$ associated to each vertex of the mesh.

We note $\left({\underline{t}}_{0},\cdots ,{\underline{t}}_{N-1}\right)$ the vertices of $ℳ$ and $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ the associated values in ${ℝ}^{d}$.

The spatial mean of a field is defined by:

 $\begin{array}{c}\hfill \frac{1}{N}\sum _{i=0}^{N-1}{\underline{x}}_{i}\end{array}$ (42)

It is possible to export the field into the VTK format thanks to the method exportToVTKFile, which makes it possible to visualize it with e.g. ParaView: see Figure 59.

A field is stored in the Field object that stores the mesh and the values at each vertex of the mesh.

The spatial mean is evaluated thanks to the method getSpatialMean. When the mesh $ℳ$ is a regular time grid, the spatial mean corresponds to a temporal mean which is also evaluated with the method getTemporalMean.

Note that if the mesh is of dimension 1 or 2, it is possible to draw one marginal of the field as follows, where $j\in \left[0,d-1\right]$ denotes the index of the marginal:

• drawMarginal(j,False) draws the marginal $j$ of the field with no interpolation between the points. Then, in dimension 1, we get the cloud ${\left({t}_{i},{v}_{i}^{j}\right)}_{0\le i\le N-1}$. In dimension 2, we draw a bullet on each vertex ${\underline{t}}_{i}\in ℳ$ whose color depends on the value of the process ${v}_{i}^{j}$: see Figure 59.

• drawMarginal(j) draws the marginal $j$ of the field with linear interpolation between the points: see Figure 59.

A field can be obtained as a realization of a multivariate stochastic process $X:\Omega ×𝒟\to {ℝ}^{d}$ of dimension $d$ where $𝒟\in {ℝ}^{n}$, when the realization is discretized on the mesh $ℳ$ of $𝒟$. The values $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ of the field are defined by:

 $\begin{array}{c}\hfill \forall i\in \left[0,N-1\right],\phantom{\rule{1.em}{0ex}}{\underline{x}}_{i}=X\left(\omega \right)\left({\underline{t}}_{i}\right)\end{array}$ (43)

If each simplex of the mesh has the same volume, then the spatial mean (42) is an estimation of the spatial mean of the process $X$ defined in (28).

If the mesh $ℳ$ can be interpreted as a regular time grid, then the spatial mean (42) is an estimation of the temporal mean of the process $X$ defined in (29).

 Requirements $•$ a mesh : myMesh type: Mesh $•$ some values : myValues type: NumericalSample $•$ a process: myProcess type: Process $•$ the mesh of the field: myMesh type: Mesh $•$ the values of the field: myValues type: NumericalSample Results $•$ two fields: myField, myField2 type: Field $•$ the spatial mean: mySpatMean type: NumericalPoint $•$ a value of the field: myValue_i type: NumericalPoint $•$ the values of the field: myGeneratedValues type: NumericalSample

Python script for this UseCase :

script_docUC_StocProc_Field.py

```python
from openturns import *

###########################
# Case 1: Create a field from a mesh and some values
myField = Field(myMesh, myValues)

###########################
# Case 2: Get a field from the process myProcess
myField2 = myProcess.getRealization()

# Get all the values of the field
myGeneratedValues = myField.getValues()

# Get the value of the field at the vertex i
i = 1
myValue_i = myGeneratedValues[i]

# Compute the spatial mean of the field
mySpatMean = myField.getSpatialMean()

# Draw the field without interpolation
myGraph1 = myField.drawMarginal(0, False)
# Show(myGraph1)

# Draw the field with interpolation
myGraph2 = myField.drawMarginal(0)
# Show(myGraph2)

# Export to the VTK format
myField.exportToVTKFile('myFile.vtk')
```

Consider the temporal normal process of dimension 1 whose covariance model is $Exponential\left(a,\lambda \right)$ with $\left(a,\lambda \right)=\left(1.0,0.2\right)$, and the bidimensional mesh which is the box $\left[0,2\right]×\left[0,1\right]$ regularly discretized with 80 points in the first dimension and 40 points in the second one.

We draw a realization of this process on the mesh, first without interpolation, then with linear interpolation: see Figure 59.

Figure 59 also shows the field as visualized with ParaView.

 One field with no interpolation. Field of Figure 59 using linear interpolation. Field of Figure 59 visualized using ParaView.
Figure 59

## 5.5 UC : Manipulation of a time series

The objective here is to create and manipulate a time series. A time series is a particular field where the mesh $ℳ$ can be considered as a regular time grid $\left({t}_{0},\cdots ,{t}_{N-1}\right)$.

The spatial mean of a time series corresponds to a temporal mean and is evaluated as defined in (42).

It is possible to draw a time series, using interpolation between the values: see the use case on the Field.

A time series can be obtained as a realization of a multivariate stochastic process $X:\Omega ×\left[0,T\right]\to {ℝ}^{d}$ of dimension $d$ where $\left[0,T\right]$ is discretized according to the regular grid $\left({t}_{0},\cdots ,{t}_{N-1}\right)$. The values $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ of the time series are defined by:

 $\begin{array}{c}\hfill \forall i\in \left[0,N-1\right],\phantom{\rule{1.em}{0ex}}{\underline{x}}_{i}=X\left(\omega \right)\left({t}_{i}\right)\end{array}$ (44)

A time series is stored in the TimeSeries object that stores the regular time grid and the associated values.

 Requirements $•$ a time grid : myTimeGrid type: RegularGrid $•$ some values : myValues type: NumericalSample Results $•$ a time series : myTimeSeries type: TimeSeries

Python script for this UseCase :

script_docUC_StocProc_TimeSeries.py

```python
from openturns import *

###########################
# Case 1: Create a time series from a time grid and values
# Careful! The number of steps of the time grid must correspond
# to the size of the values
myTimeSeries = TimeSeries(myTimeGrid, myValues)

###########################
# Case 2: Get a time series from the process myProcess
myTimeSeries2 = myProcess.getRealization()

# Get the number of values of the time series
print('Size = ', myTimeSeries.getSize())

# Get the dimension of the values observed at each time
print('Dimension = ', myTimeSeries.getDimension())

# Get the value Xi at index i
# Careful! Numbering starts at i=0
i = 37
print('Xi = ', myTimeSeries.getValueAtIndex(i))

# Get the value Xi of the observed time series
# at the time nearest to myTime
myTime = 3.4
print('Xi at the time nearest to myTime',
      myTimeSeries.getValueAtNearestTime(myTime))

# Get the time series at index i: (ti, Xi)
i = 37
print('(ti, Xi) = ', myTimeSeries[i])

# Get a marginal value at index i of the time series
i = 37
# get the time stamp:
print('ti = ', myTimeSeries[i, 0])
# get the first component of the corresponding value:
print('Xi1 = ', myTimeSeries[i, 1])

# Get all the values (X1, .., Xn) of the time series
allValues = myTimeSeries.getSample()

# Compute the temporal mean
# It corresponds to the mean of the values of the time series
myTemporalMean = myTimeSeries.getTemporalMean()

# Draw the marginal i of the time series
# Careful! Numbering starts at i=0
# using linear interpolation
myMarginalTimeSerie = myTimeSeries.drawMarginal(0)
# Show(myMarginalTimeSerie)

# with no interpolation
myMarginalTimeSerie2 = myTimeSeries.drawMarginal(0, False)
# Show(myMarginalTimeSerie2)
```

Consider the temporal normal process of dimension 1 whose covariance model is $Exponential\left(a,\lambda \right)$ with $\left(a,\lambda \right)=\left(1.0,0.2\right)$ and the time grid on $\left[0,1\right]$ regularly discretized with 101 points.

We draw a realization of this process on the time grid, first without interpolation, then with linear interpolation: see Figure 60.

 One time series with no interpolation. Time series of Figure 60 using linear interpolation.
Figure 60

## 5.6 UC : Manipulation of a process sample

The objective here is to create and manipulate a process sample. A process sample is a collection of fields which share the same mesh $ℳ\in {ℝ}^{n}$.

The method computeMean evaluates the mean of the values associated to the same vertex ${\underline{t}}_{i}$ of the common mesh $ℳ$. If $K$ is the number of fields of the process sample, the method evaluates:

 $\begin{array}{c}\hfill \frac{1}{K}\sum _{k=1}^{K}{\underline{x}}_{i}^{k}\end{array}$ (45)

where $\left({\underline{x}}_{0}^{k},\cdots ,{\underline{x}}_{N-1}^{k}\right)$ are the values of the field $k$ associated to the vertices $\left({\underline{t}}_{0},\cdots ,{\underline{t}}_{N-1}\right)$ of $ℳ$. It creates a numerical sample of size $N$ and dimension $d$.

The method computeSpatialMean evaluates the spatial mean defined in (42) for each field contained in the process sample. It creates a numerical sample of size $K$ and dimension $d$.
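Both operations reduce to averaging a $\left(K,N,d\right)$ array along different axes. A NumPy analogue (shapes are illustrative; this is not the OpenTURNS storage layout):

```python
import numpy as np

rng = np.random.default_rng(3)

# NumPy analogue of computeMean / computeSpatialMean for a process
# sample stored as an array of shape (K, N, d): K fields, N vertices,
# values of dimension d
K, N, d = 8, 100, 2
sample = rng.standard_normal((K, N, d))

vertex_mean = sample.mean(axis=0)   # (45): mean over fields -> (N, d), a field
spatial_mean = sample.mean(axis=1)  # (42) per field         -> (K, d), a sample

print(vertex_mean.shape, spatial_mean.shape)
```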

A process sample can be obtained as $K$ realizations of a multivariate stochastic process $X:\Omega ×𝒟\to {ℝ}^{d}$ of dimension $d$ where $𝒟\in {ℝ}^{n}$, when the realizations are discretized on the same mesh $ℳ$ of $𝒟$. The values $\left({\underline{x}}_{0}^{k},\cdots ,{\underline{x}}_{N-1}^{k}\right)$ of the field $k$ are defined by:

 $\begin{array}{c}\hfill \forall i\in \left[0,N-1\right],\phantom{\rule{1.em}{0ex}}{\underline{x}}_{i}^{k}=X\left({\omega }_{k}\right)\left({\underline{t}}_{i}\right)\end{array}$ (46)

The mean defined in (45) is an estimation of the stochastic mean of the process $X$ defined in (30).

The quantile-per-component vector of level $q$ of the random variable ${X}_{{\underline{t}}_{i}}$ is the vector of the marginal quantiles of level $q$ of ${X}_{{\underline{t}}_{i}}$. The method computeQuantilePerComponent(q) of the process sample creates a field that associates to each vertex ${\underline{t}}_{i}\in ℳ$ the quantile-per-component vector of the random variable ${X}_{{\underline{t}}_{i}}$. The marginal quantiles are evaluated from the empirical distribution defined by the process sample.
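The quantile per component likewise reduces to an empirical marginal quantile taken at each vertex over the $K$ fields. A NumPy analogue of computeQuantilePerComponent (shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# At each vertex, take the empirical marginal quantile of level q
# over the K fields of the process sample, stored as a (K, N, d) array
K, N, d = 500, 20, 2
sample = rng.standard_normal((K, N, d))

q = 0.5
quantile_field = np.quantile(sample, q, axis=0)  # shape (N, d)
print(quantile_field.shape)
```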

 Requirements $•$ a field : myField type: Field $•$ a process : myProcess type: Process Results $•$ a sample of processes : myProcessSample, myProcessSample_1 type: ProcessSample $•$ the stochastic mean process : myMeanField type: Field $•$ the spatial mean : myMeanNS type: NumericalSample $•$ the field of the quantiles per component : myQuantileField type: Field

Python script for this Use Case :

script_docUC_StocProc_ProcessSample.py

```python
from openturns import *

############################
# Case 1: Create a process sample
# by duplicating the same field
number = 10
myProcessSample_1 = ProcessSample(number, myField)

############################
# Case 2: Create a process sample of size 10
# from a process
myProcessSample = myProcess.getSample(10)

# Add a field to the process sample
myProcessSample.add(myField)

# Get the field of index i=2
myFieldIndexI = myProcessSample[2]

# Compute the mean of the process sample
# The result is a field
myMeanField = myProcessSample.computeMean()

# Compute the spatial mean of the process sample
# The result is a numerical sample
myMeanNS = myProcessSample.computeSpatialMean()

# Compute the quantiles per component associated to the level q
# The result is a field
q = 0.50
myQuantileField = myProcessSample.computeQuantilePerComponent(q)
```

## 5.7 Transformation of fields

### 5.7.1 UC : Trend computation

A multivariate stochastic process $X:\Omega ×𝒟\to {ℝ}^{d}$ of dimension $d$, where $𝒟\in {ℝ}^{n}$, may be written as the sum of a trend function ${f}_{trend}:{ℝ}^{n}\to {ℝ}^{d}$ and a stationary multivariate stochastic process ${X}_{stat}:\Omega ×𝒟\to {ℝ}^{d}$ of dimension $d$ as follows:

 $\begin{array}{c}\hfill \forall \omega \in \Omega ,\phantom{\rule{0.166667em}{0ex}}\forall \underline{t}\in 𝒟,\phantom{\rule{0.166667em}{0ex}}X\left(\omega ,\underline{t}\right)={X}_{stat}\left(\omega ,\underline{t}\right)+{f}_{trend}\left(\underline{t}\right)\end{array}$ (47)

The objective here is to identify the trend function ${f}_{trend}$ from a given field of the process $X$ and then to remove it from the initial field. The resulting field is a realization of the process ${X}_{stat}$.

OpenTURNS also allows the User to define the function ${f}_{trend}$ directly and to remove it from the initial field to get the resulting stationary field.

Among various methods, one consists in fixing a functional basis $ℬ$ and estimating ${f}_{trend}$ with a least squares method. We consider the functional basis $ℬ$:

 $\begin{array}{c}\hfill ℬ=\left({f}_{1},{f}_{2},...,{f}_{K}\right)\phantom{\rule{4pt}{0ex}}with\phantom{\rule{4pt}{0ex}}{f}_{j}:{ℝ}^{n}⟶{ℝ}^{d}\phantom{\rule{4pt}{0ex}}\forall j\phantom{\rule{4pt}{0ex}}\in \left\{1,2,...,K\right\}\end{array}$

on which the trend function ${f}_{trend}$ is decomposed:

 $\begin{array}{c}\hfill {f}_{trend}\left(\underline{t}\right)=\sum _{j=1}^{K}{\alpha }_{j}{f}_{j}\left(\underline{t}\right)\end{array}$ (48)

The coefficients ${\alpha }_{j}\in ℝ$ have to be computed, for example thanks to the least squares method. However, when the number of available data is of the same order as $K$, the least squares system is ill-posed and a more elaborate algorithm must be used. Some algorithms combine cross-validation techniques and advanced regression strategies in order to estimate the coefficients associated to the best model among the basis functions (sparse model). For example, the least angle regression (LAR) method can be used to select sparse models. Then, fitting algorithms such as leave-one-out, coupled with the regression strategy, assess the error on the prediction and enable the selection of the best sparse model.
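The plain least squares variant of (48)-(49) can be sketched in NumPy for $n=d=1$ with a polynomial basis (the basis, the trend coefficients and the noise level are illustrative assumptions, and no sparsity selection is performed):

```python
import numpy as np

rng = np.random.default_rng(11)

# Least squares estimation of the trend (48) for n = d = 1:
# basis B = (1, t, t^2), synthetic field = quadratic trend + noise,
# the noise standing in for a realization of the stationary part X_stat
t = np.linspace(0.0, 1.0, 200)
x = 1.0 + 2.0 * t - 3.0 * t ** 2 + 0.05 * rng.standard_normal(t.size)

B = np.column_stack([np.ones_like(t), t, t ** 2])  # design matrix
alpha, *_ = np.linalg.lstsq(B, x, rcond=None)      # coefficients alpha_j

trend = B @ alpha      # f_trend evaluated on the mesh
x_stat = x - trend     # detrended field, cf. (49)
print(alpha)           # close to (1, 2, -3)
```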

We note $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ the values of the initial field associated to the mesh $ℳ$ of $𝒟\in {ℝ}^{n}$, where ${\underline{x}}_{i}\in {ℝ}^{d}$ and $\left({\underline{x}}_{0}^{stat},\cdots ,{\underline{x}}_{N-1}^{stat}\right)$ the values of the resulting stationary field.

OpenTURNS creates a factory of trend functions thanks to the object TrendFactory, built from :

• a regression strategy that generates a basis using the Least Angle Regression (LAR) method thanks to the object LAR,

• and a fitting algorithm that estimates the empirical error on each sub-basis, using the leave-one-out strategy thanks to the object CorrectedLeaveOneOut, or the K-Fold algorithm thanks to the object KFold.

Then, OpenTURNS estimates the trend transformation from the initial field $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ and a function basis $ℬ$ thanks to the method build of the object TrendFactory, which produces an object of type TrendTransform. This object makes it possible to :

• add the trend to a given field ${\underline{w}}_{0},\cdots ,{\underline{w}}_{N-1}$ defined on the same mesh $ℳ$: the resulting field shares the same mesh as the initial field. The addition is done with the operator ().

For example, it may be useful to add the trend to a realization of the stationary process ${X}_{stat}$ in order to get a realization of the process $X$;

• get the function ${f}_{trend}$ defined in (48), which evaluates the trend, thanks to the method getEvaluation();

• create the inverse trend transformation in order to remove the trend from the initial field $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ and to create the resulting stationary field $\left({\underline{x}}_{0}^{stat},\cdots ,{\underline{x}}_{N-1}^{stat}\right)$ such that:

 $\begin{array}{c}\hfill {\underline{x}}_{i}^{stat}={\underline{x}}_{i}-{f}_{trend}\left({\underline{t}}_{i}\right)\end{array}$ (49)

where ${\underline{t}}_{i}$ is the vertex associated to the value ${\underline{x}}_{i}$.

The creation of the inverse trend function $-{f}_{trend}$ is done thanks to the method getInverse(), which produces an object of type InverseTrendTransform that can be evaluated on a field thanks to the operator ().

For example, it may be useful in order to get the stationary field $\left({\underline{x}}_{0}^{stat},\cdots ,{\underline{x}}_{N-1}^{stat}\right)$ and then analyse it with methods adapted to stationary processes (ARMA decomposition for example).

Requirements

• some functions $g,h:ℝ\to ℝ$ (type: NumericalMathFunction)

• a basis sequence myBasisSequenceFactory (type: BasisSequenceFactory)

• a fitting algorithm myFittingAlgorithm (type: FittingAlgorithm)

• a basis of functions myFunctionBasis (type: NumericalMathFunctionCollection)

• a time series myYt (type: TimeSeries)

Results

• a trend factory myTrendFactory (type: TrendFactory)

• a trend transformation and its inverse (to apply with the operator ()) myTrendTransform, myInverseTrendTransform (type: TrendTransform, InverseTrendTransform)

• the trend function evaluation $f$ myEvaluation_f (type: EvaluationImplementation)

Python script for this Use Case :

script_docUC_StocProc_TrendComputation.py

################################
# CASE 1 : we estimate the trend from the field
################################

# Define the regression strategy using the LAR method
myBasisSequenceFactory = LAR()

# Define the fitting algorithm using the
# Corrected Leave One Out or KFold algorithms
myFittingAlgorithm = CorrectedLeaveOneOut()
myFittingAlgorithm_2 = KFold()

# Define the basis function
# For example composed of 5 functions
func1 = NumericalMathFunction(["t", "s"], ["1"])
func2 = NumericalMathFunction(["t", "s"], ["t"])
func3 = NumericalMathFunction(["t", "s"], ["s"])
func4 = NumericalMathFunction(["t", "s"], ["t^2"])
func5 = NumericalMathFunction(["t", "s"], ["s^2"])
myFunctionBasis = NumericalMathFunctionCollection(0)
myFunctionBasis.add(func1)
myFunctionBasis.add(func2)
myFunctionBasis.add(func3)
myFunctionBasis.add(func4)
myFunctionBasis.add(func5)

# Define the trend function factory algorithm
myTrendFactory = TrendFactory(myBasisSequenceFactory, myFittingAlgorithm)

# Check the definition of the created factory
print('regression strategy : ', myTrendFactory.getBasisSequenceFactory())
print('fitting strategy : ', myTrendFactory.getFittingAlgorithm())

# Create the trend transformation of type TrendTransform
myTrendTransform = myTrendFactory.build(myYField, Basis(myFunctionBasis))

# Check the estimated trend function
print("Trend function = ", myTrendTransform)

################################
# CASE 2 : we impose the trend (or its inverse)
################################

# The function g computes the trend : R^2 -> R
# g : (t,s) --> 1+2t+2s
g = NumericalMathFunction(["t", "s"], ["1+2*t+2*s"])
gTemp = TrendTransform(g)

# Get the inverse trend transformation
# from the trend transform already defined
myInverseTrendTransform = myTrendTransform.getInverse()
print('Inverse trend function = ', myInverseTrendTransform)

# Sometimes it is more useful to define
# the opposite trend h : R^2 -> R
# in fact h = -g
h = NumericalMathFunction(["t", "s"], ["-(1+2*t+2*s)"])
myInverseTrendTransform_2 = InverseTrendTransform(h)

################################
# Remove the trend from the field myYField
# myXField = myYField - f(t,s)
myXField2 = myTrendTransform.getInverse()(myYField)
# or from the class InverseTrendTransform
myXField3 = myInverseTrendTransform(myYField)

# Add the trend to the field myXField2
# myYField = f(t,s) + myXField2
myInitialYField = myTrendTransform(myXField2)

# Get the trend function f(t,s)
myEvaluation_f = myTrendTransform.getEvaluation()

# Evaluate the trend function f at a particular vertex
# which is the lower corner of the mesh
myMesh = myYField.getMesh()
vertices = myMesh.getVertices()
vertex = vertices.getMin()
trend_t = myEvaluation_f(vertex)

### 5.7.2 UC : Box Cox transformation

The objective of this Use Case is to estimate a Box Cox transformation from a field whose values are all positive (possibly after a shift to satisfy the positiveness constraint) and to apply it to the field.

We consider $X:\Omega ×𝒟\to {ℝ}^{d}$ a multivariate stochastic process of dimension $d$ where $𝒟\in {ℝ}^{n}$ and $\omega \in \Omega$ is an event. We suppose that the process is ${ℒ}^{2}\left(\Omega \right)$.

We note ${X}_{\underline{t}}:\Omega \to {ℝ}^{d}$ the random variable at the vertex $\underline{t}\in 𝒟$ defined by ${X}_{\underline{t}}\left(\omega \right)=X\left(\omega ,\underline{t}\right)$.

If the variance of ${X}_{\underline{t}}$ depends on the vertex $\underline{t}$, the Box Cox transformation maps the process $X$ into the process $Y$ such that the variance of ${Y}_{\underline{t}}$ is constant (at the first order at least) with respect to $\underline{t}$.

We present here :

• the estimation of the Box Cox transformation from a given field of the process $X$,

• the action of the Box Cox transformation on a field generated from $X$.

We note $h:{ℝ}^{d}\to {ℝ}^{d}$ the Box Cox transformation which maps the process $X$ into the process $Y:\Omega ×𝒟\to {ℝ}^{d}$, where $Y=h\left(X\right)$, such that $\mathrm{Var}\left[{Y}_{\underline{t}}\right]$ is independent of $\underline{t}$ at the first order.

We suppose that ${X}_{\underline{t}}$ is a positive random variable for any $\underline{t}$. To verify that constraint, it may be needed to consider the shifted process $X+\underline{\alpha }$.

We illustrate some usual Box Cox transformations $h$ in the scalar case ($d$=1), using the Taylor expansion of $h:ℝ\to ℝ$ at the mean point of ${X}_{\underline{t}}$.

In the multivariate case, we estimate the Box Cox transformation component by component and we define the multivariate Box Cox transformation as the aggregation of the marginal Box Cox transformations.

Marginal Box Cox transformation:

The first order Taylor expansion of $h$ around $𝔼\left[{X}_{\underline{t}}\right]$ writes:

 $\begin{array}{ccc}\hfill \forall \underline{t}\in 𝒟,h\left({X}_{\underline{t}}\right)& =& h\left(𝔼\left[{X}_{\underline{t}}\right]\right)+\left({X}_{\underline{t}}-𝔼\left[{X}_{\underline{t}}\right]\right){h}^{\text{'}}\left(𝔼\left[{X}_{\underline{t}}\right]\right)\hfill \end{array}$

 $\begin{array}{ccc}\hfill 𝔼\left[h\left({X}_{\underline{t}}\right)\right]& =& h\left(𝔼\left[{X}_{\underline{t}}\right]\right)\hfill \end{array}$

and then:

 $\begin{array}{ccc}\hfill \mathrm{Var}\left[h\left({X}_{\underline{t}}\right)\right]& =& {h}^{\text{'}}{\left(𝔼\left[{X}_{\underline{t}}\right]\right)}^{2}\mathrm{Var}\left[{X}_{\underline{t}}\right]\hfill \end{array}$

To have $\mathrm{Var}\left[h\left({X}_{\underline{t}}\right)\right]$ constant with respect to $\underline{t}$ at the first order, we need :

 $\begin{array}{c}\hfill {h}^{\text{'}}\left(𝔼\left[{X}_{\underline{t}}\right]\right)=k{\left(\mathrm{Var}\left[{X}_{\underline{t}}\right]\right)}^{-1/2}\end{array}$ (50)

Now, we make some additional hypotheses on the relation between $𝔼\left[{X}_{\underline{t}}\right]$ and $\mathrm{Var}\left[{X}_{\underline{t}}\right]$ :

• If we suppose that $\mathrm{Var}\left[{X}_{\underline{t}}\right]\propto 𝔼\left[{X}_{\underline{t}}\right]$, then (50) leads to the function $h\left(y\right)\propto \sqrt{y}$ and we take $h\left(y\right)=\sqrt{y},y\phantom{\rule{3.33333pt}{0ex}}>\phantom{\rule{3.33333pt}{0ex}}0$;

• If we suppose that $\mathrm{Var}\left[{X}_{\underline{t}}\right]\propto {\left(𝔼\left[{X}_{\underline{t}}\right]\right)}^{2}$ , then (50) leads to the function $h\left(y\right)\propto logy$ and we take $h\left(y\right)=logy,y>0$;

• More generally, if we suppose that $\mathrm{Var}\left[{X}_{\underline{t}}\right]\propto {\left(𝔼\left[{X}_{\underline{t}}\right]\right)}^{\beta }$, then (50) leads to the function ${h}_{\lambda }$ parametered by the scalar $\lambda$ :

 $\begin{array}{ccc}\hfill {h}_{\lambda }\left(y\right)& =& \left\{\begin{array}{cc}\frac{{y}^{\lambda }-1}{\lambda }\hfill & \lambda \ne 0\hfill \\ log\left(y\right)\hfill & \lambda =0\hfill \end{array}\right\\hfill \end{array}$ (51)

where $\lambda =1-\frac{\beta }{2}$.
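The variance-stabilizing effect can be checked numerically. The following seeded sketch (plain Python with hypothetical data, not an OpenTURNS feature) builds samples whose standard deviation is proportional to their mean, and verifies that the variance of the logarithm is roughly constant, as predicted by (50) in the case $\mathrm{Var}\left[{X}_{\underline{t}}\right]\propto {\left(𝔼\left[{X}_{\underline{t}}\right]\right)}^{2}$:

```python
# Seeded Monte Carlo illustration: when sd(X_t) is proportional to E[X_t],
# Var[log X_t] is (roughly) constant in t.  Here X_t = m(t) * exp(sigma*Z),
# so Var[log X_t] = sigma^2 exactly, whatever the mean m(t).
import math
import random

random.seed(0)

def variance(sample):
    mu = sum(sample) / len(sample)
    return sum((v - mu) ** 2 for v in sample) / len(sample)

sigma = 0.1
log_vars = []
for mean_t in (1.0, 5.0, 25.0):          # three hypothetical "vertices" t
    x = [mean_t * math.exp(sigma * random.gauss(0.0, 1.0)) for _ in range(20000)]
    log_vars.append(variance([math.log(v) for v in x]))
print([round(v, 4) for v in log_vars])   # each close to sigma**2 = 0.01
```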

The inverse Box Cox transformation is defined by:

 $\begin{array}{ccc}\hfill {h}_{\lambda }^{-1}\left(y\right)& =& \left\{\begin{array}{cc}{\left(\lambda y+1\right)}^{\frac{1}{\lambda }}\hfill & \lambda \ne 0\hfill \\ exp\left(y\right)\hfill & \lambda =0\hfill \end{array}\right\\hfill \end{array}$ (52)
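The transformation ${h}_{\lambda }$ and its inverse defined above can be written directly. This plain Python sketch (not the OpenTURNS BoxCoxTransform objects) checks the round trip ${h}_{\lambda }^{-1}\left({h}_{\lambda }\left(y\right)\right)=y$:

```python
# The Box Cox transformation h_lambda and its inverse, in plain Python.
import math

def h(y, lam):
    # h_lambda(y) = (y^lambda - 1)/lambda, or log(y) when lambda = 0
    return math.log(y) if lam == 0.0 else (y ** lam - 1.0) / lam

def h_inv(z, lam):
    # h_lambda^{-1}(z) = (lambda*z + 1)^(1/lambda), or exp(z) when lambda = 0
    return math.exp(z) if lam == 0.0 else (lam * z + 1.0) ** (1.0 / lam)

for lam in (0.0, 0.5, 2.0):
    for y in (0.5, 1.0, 3.0):
        assert abs(h_inv(h(y, lam), lam) - y) < 1e-12   # round trip
print("round trips OK")
```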

Estimation of the Box Cox transformation:

The parameter $\lambda$ is estimated from a given field of the process $X$ as follows.

The estimation of $\lambda$ given below is optimal when ${h}_{\lambda }\left({X}_{\underline{t}}\right)\sim 𝒩\left(\beta ,{\sigma }^{2}\right)$ at each vertex $\underline{t}$. If this is not the case, the estimate can be considered as a proposal, with no guarantee.

The parameters $\left(\beta ,\sigma ,\lambda \right)$ are then estimated by maximum likelihood. We note ${\Phi }_{\beta ,\sigma }$ and ${\phi }_{\beta ,\sigma }$ respectively the cumulative distribution function and the probability density function of the $𝒩\left(\beta ,{\sigma }^{2}\right)$ distribution.

For all vertices $\underline{t}$, we have :

 $\begin{array}{ccc}\hfill \forall v\ge 0,\phantom{\rule{0.166667em}{0ex}}ℙ\left({X}_{\underline{t}}\le v\right)& =& ℙ\left({h}_{\lambda }\left({X}_{\underline{t}}\right)\le {h}_{\lambda }\left(v\right)\right)\hfill \\ & =& {\Phi }_{\beta ,\sigma }\left({h}_{\lambda }\left(v\right)\right)\hfill \end{array}$ (53)

from which we derive the probability density function $p$ of ${X}_{\underline{t}}$ for all vertices $\underline{t}$ :

 $\begin{array}{ccc}\hfill p\left(v\right)& =& {h}_{\lambda }^{\text{'}}\left(v\right){\phi }_{\beta ,\sigma }\left({h}_{\lambda }\left(v\right)\right)={v}^{\lambda -1}{\phi }_{\beta ,\sigma }\left({h}_{\lambda }\left(v\right)\right)\hfill \end{array}$ (54)

Using this probability density function, the likelihood of the values $\left({x}_{0},\cdots ,{x}_{N-1}\right)$ writes:

 $\begin{array}{ccc}\hfill L\left(\beta ,\sigma ,\lambda \right)& =& \underset{\Psi \left(\beta ,\sigma \right)}{\underbrace{\frac{1}{{\left(2\pi \right)}^{N/2}}×\frac{1}{{\left({\sigma }^{2}\right)}^{N/2}}×exp\left[-\frac{1}{2{\sigma }^{2}}{\sum }_{k=0}^{N-1}{\left({h}_{\lambda }\left({x}_{k}\right)-\beta \right)}^{2}\right]}}×\prod _{k=0}^{N-1}{x}_{k}^{\lambda -1}\hfill \end{array}$ (55)

We notice that, for each fixed $\lambda$, the likelihood is proportional to the likelihood that estimates $\left(\beta ,{\sigma }^{2}\right)$. Thus, the maximum likelihood estimators of $\left(\beta \left(\lambda \right),{\sigma }^{2}\left(\lambda \right)\right)$ for a given $\lambda$ are :

 $\begin{array}{ccc}\hfill \stackrel{^}{\beta }\left(\lambda \right)& =& \frac{1}{N}\sum _{k=0}^{N-1}{h}_{\lambda }\left({x}_{k}\right)\hfill \\ \hfill {\stackrel{^}{\sigma }}^{2}\left(\lambda \right)& =& \frac{1}{N}\sum _{k=0}^{N-1}{\left({h}_{\lambda }\left({x}_{k}\right)-\stackrel{^}{\beta }\left(\lambda \right)\right)}^{2}\hfill \end{array}$ (56)

Substituting these estimators into the likelihood and taking the $log-$likelihood, we obtain :

 $\begin{array}{ccc}\hfill \ell \left(\lambda \right)=logL\left(\stackrel{^}{\beta }\left(\lambda \right),\stackrel{^}{\sigma }\left(\lambda \right),\lambda \right)& =& C-\frac{N}{2}log\left[{\stackrel{^}{\sigma }}^{2}\left(\lambda \right)\right]\phantom{\rule{0.277778em}{0ex}}+\phantom{\rule{0.277778em}{0ex}}\left(\lambda -1\right)\sum _{k=0}^{N-1}log\left({x}_{k}\right)\phantom{\rule{0.166667em}{0ex}},\hfill \end{array}$ (57)

The parameter $\stackrel{^}{\lambda }$ is the one maximizing the log-likelihood $\ell \left(\lambda \right)$ defined above.
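The estimation procedure above can be sketched as a profile log-likelihood grid search (plain Python with synthetic data; the OpenTURNS BoxCoxFactory uses its own optimizer). For each $\lambda$, the closed-form estimates of $\beta$ and ${\sigma }^{2}$ are plugged into the log-likelihood, and the best $\lambda$ on the grid is kept:

```python
# Profile log-likelihood grid search for the Box Cox parameter lambda.
import math
import random

random.seed(1)
# synthetic positive data whose log is roughly Gaussian -> lambda near 0
data = [math.exp(random.gauss(0.0, 0.3)) for _ in range(500)]

def h(y, lam):
    return math.log(y) if lam == 0.0 else (y ** lam - 1.0) / lam

def profile_loglik(lam, xs):
    # plug the closed-form beta(lam), sigma^2(lam) into the log-likelihood
    n = len(xs)
    z = [h(x, lam) for x in xs]
    beta = sum(z) / n
    sigma2 = sum((v - beta) ** 2 for v in z) / n
    return -0.5 * n * math.log(sigma2) + (lam - 1.0) * sum(math.log(x) for x in xs)

grid = [i / 20.0 for i in range(-40, 41)]      # lambda in [-2, 2], step 0.05
lam_hat = max(grid, key=lambda lam: profile_loglik(lam, data))
print(lam_hat)  # expected near 0 for log-normal data
```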

OpenTURNS objects:

The OpenTURNS object BoxCoxFactory makes it possible to create a factory of Box Cox transformations.

Then, OpenTURNS estimates the Box Cox transformation ${h}_{\underline{\lambda }}$ from the initial field values $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ thanks to the method build of the object BoxCoxFactory, which produces an object of type BoxCoxTransform.

If the field values $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ have some negative values, it is possible to translate them by a given shift $\underline{\alpha }$, which has to be mentioned either at the creation of the object BoxCoxFactory or when using the method build.

Then the Box Cox transformation is the composition of ${h}_{\underline{\lambda }}$ and this translation.

The object BoxCoxTransform makes it possible to :

• transform the field values $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ of dimension $d$ into the values $\left({\underline{y}}_{0},\cdots ,{\underline{y}}_{N-1}\right)$ with stabilized variance, such that for each vertex ${\underline{t}}_{i}$ we have:

 ${\underline{y}}_{i}={h}_{\underline{\lambda }}\left({\underline{x}}_{i}\right)$ (58)

or

 ${\underline{y}}_{i}={h}_{\underline{\lambda }}\left({\underline{x}}_{i}+\underline{\alpha }\right)$ (59)

thanks to the operator (). The field based on the values ${\underline{y}}_{i}$ shares the same mesh as the initial field.

• create the inverse Box Cox transformation such that :

 ${\underline{x}}_{i}={h}_{\underline{\lambda }}^{-1}\left({\underline{y}}_{i}\right)$ (60)

or

 ${\underline{x}}_{i}={h}_{\underline{\lambda }}^{-1}\left({\underline{y}}_{i}\right)-\underline{\alpha }$ (61)

thanks to the method getInverse(), which produces an object of type InverseBoxCoxTransform that can be evaluated on a field thanks to the operator (). The new field shares the same mesh as the initial field.

Requirements

• a field myField (type: Field)

• $\underline{\lambda }$: myLambda (type: NumericalPoint)

Results

• a Box Cox factory myBoxCoxFactory (type: BoxCoxFactory)

• a Box Cox transformation and its inverse myModelTransform, myInverseModelTransform (type: BoxCoxTransform, InverseBoxCoxTransform)

• $\underline{\lambda }$: estimatedLambda (type: NumericalPoint)

• a mapped field myStabilizedField (type: Field)

Python script for this Use Case :

script_docUC_StocProc_BoxCox.py

################################
# CASE 1 : we estimate the Box Cox transformation from the data
################################

# Instantiate a BoxCoxFactory
myBoxCoxFactory = BoxCoxFactory()

# We estimate the lambda parameter from the field myField
# In dimension greater than one, the estimation is done marginal by marginal
# We suppose here that all values of the field are positive
myModelTransform = myBoxCoxFactory.build(myField)
print(myModelTransform)

# Get the estimated lambda
estimatedLambda = myModelTransform.getLambda()

################################
# CASE 2 : we impose the Box Cox transformation
################################

# It is possible to impose the lambda factor myLambda
# for example in dimension 1
myLambda = 0.01
myModelTransform_lambda = BoxCoxTransform(myLambda)

################################
# Get the stabilized field

# Apply the transform method to myField
# myStabilizedField = h(myField) or h(myField + alpha)
myStabilizedField = myModelTransform(myField)
myStabilizedField.setName('Stabilized TS')

# Get the inverse of the Box Cox transformation
myInverseModelTransform = myModelTransform.getInverse()

# Apply it to the stabilized field
# myInitialField = h^-1(myStabilizedField) or h^-1(myStabilizedField) - alpha
myInitialField = myInverseModelTransform(myStabilizedField)

Consider the temporal normal process of dimension 1 whose covariance model is $Exponential\left(a,\lambda \right)$ with $\left(a,\lambda \right)=\left(1.0,0.2\right)$ and whose bidimensional mesh is the box $\left[0,2\right]×\left[0,1\right]$, regularly discretized with 40 points in the first dimension and 20 points in the second one.

Then we map this process $X$ into the process $Y=exp\left(X\right)$ in order to get a non stationary positive process.

Then we get a field generated by the process $Y$ and apply the Box Cox transformation. Figure 61 draws the initial field and the stabilized one. We note that the variations of the values have decreased (colors are more uniform after the Box Cox transformation). The associated $\lambda \simeq 0$.

Figure 61: One field from the $Y$ process, and the same field after the Box Cox transformation.

## 5.8 ARMA stochastic process

Consider a stationary multivariate stochastic process $X:\Omega ×\left[0,T\right]\to {ℝ}^{d}$ of dimension $d$, where ${X}_{t}:\Omega \to {ℝ}^{d}$ is the random variable at time $t$. Under some general conditions, $X$ can be modeled by an $ARMA\left(p,q\right)$ model defined at the time stamp $t$ by :

 ${X}_{t}+{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}1}\phantom{\rule{0.166667em}{0ex}}{X}_{t-1}+\cdots +{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}p}\phantom{\rule{0.166667em}{0ex}}{X}_{t-p}={\epsilon }_{t}+{\underline{\underline{B}}}_{\phantom{\rule{0.166667em}{0ex}}1}\phantom{\rule{0.166667em}{0ex}}{\epsilon }_{t-1}+\cdots +{\underline{\underline{B}}}_{\phantom{\rule{0.166667em}{0ex}}q}\phantom{\rule{0.166667em}{0ex}}{\epsilon }_{t-q}$ (62)

where the coefficients of the recurrence are square matrices of dimension $d$ and ${\left({\epsilon }_{t}\right)}_{t}$ is a white noise discretized on the same time grid as the process $X$.

The coefficients $\left({\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}1},\cdots ,{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}p}\right)$ form the Auto Regressive (AR) part of the model, while the coefficients $\left({\underline{\underline{B}}}_{\phantom{\rule{0.166667em}{0ex}}1},\cdots ,{\underline{\underline{B}}}_{\phantom{\rule{0.166667em}{0ex}}q}\right)$ form the Moving Average (MA) part.

We introduce the homogeneous system associated to (62) :

 ${X}_{t}+{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}1}\phantom{\rule{0.166667em}{0ex}}{X}_{t-1}+\cdots +{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}p}\phantom{\rule{0.166667em}{0ex}}{X}_{t-p}=0$ (63)

To get stationary solutions of (62), it is necessary to study its characteristic polynomial, defined in (64) :

 $\Phi \left(\underline{r}\right)={\underline{r}}^{p}+\sum _{i=1}^{p}{a}_{i}{\underline{r}}^{p-i}$ (64)

Thus the solutions of (63) are of the form $P\left(t\right){\underline{r}}_{i}^{t}$, where the ${\underline{r}}_{i}$ are the roots of the polynomial $\Phi \left(\underline{r}\right)$ defined in (64) and $P$ is a polynomial whose degree is the multiplicity of the root ${\underline{r}}_{i}$.

The processes $P\left(t\right){\underline{r}}_{i}^{t}$ decrease with time if and only if the moduli of all the components of the roots ${\underline{r}}_{i}$ are less than 1 :

 $\forall i,j\in \left[1,d\right],\phantom{\rule{0.166667em}{0ex}}|{r}_{ij}|<1$ (65)

Once the coefficients of the model $ARMA\left(p,q\right)$ are given, OpenTURNS evaluates the roots of the polynomial $\Phi \left(\underline{r}\right)$ and checks the condition (65). The roots ${\underline{r}}_{i}$ are the eigenvalues of the matrix $\underline{\underline{M}}$, which writes in dimension $d$ as :

 $\underline{\underline{M}}=\left(\begin{array}{cccccc}{\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{1}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& \cdots & {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}\\ {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{1}}}_{\phantom{\rule{0.166667em}{0ex}}d}& \cdots & {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}\\ \cdots \\ {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& \cdots & {\underline{\underline{0}}}_{\phantom{\rule{0.166667em}{0ex}}d}& {\underline{\underline{1}}}_{\phantom{\rule{0.166667em}{0ex}}d}\\ -{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}1}& -{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}2}& -{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}3}& \cdots & -{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}p-1}& -{\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}p}\end{array}\right)$ (66)

and in dimension 1 :

 $\underline{\underline{M}}=\left(\begin{array}{cccccc}0& 1& 0& \cdots & 0& 0\\ 0& 0& 1& \cdots & 0& 0\\ \cdots \\ 0& 0& 0& \cdots & 0& 1\\ -{\alpha }_{1}& -{\alpha }_{2}& -{\alpha }_{3}& \cdots & -{\alpha }_{p-1}& -{\alpha }_{p}\end{array}\right)$ (67)

The matrix $\underline{\underline{M}}$ is known as the companion matrix.
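For a scalar AR(2) model the stationarity condition can be checked directly on the roots of $\Phi \left(r\right)$, without forming the companion matrix. The following plain Python sketch (hypothetical coefficients, a stand-in for the check OpenTURNS performs via the companion matrix eigenvalues) uses the closed-form quadratic roots:

```python
# Stationarity check for a scalar AR(2): the roots of
# Phi(r) = r^2 + a1*r + a2 must all have modulus < 1.
import cmath

def ar2_is_stationary(a1, a2):
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    roots = ((-a1 + disc) / 2.0, (-a1 - disc) / 2.0)
    return all(abs(r) < 1.0 for r in roots)

# X_t + 0.4 X_{t-1} + 0.3 X_{t-2} = eps_t : complex roots of modulus sqrt(0.3)
print(ar2_is_stationary(0.4, 0.3))    # True
# X_t - 1.5 X_{t-2} = eps_t : a real root at sqrt(1.5) > 1
print(ar2_is_stationary(0.0, -1.5))   # False
```

For higher orders $p$, the same test requires a numerical root finder (or the companion matrix eigenvalues, as OpenTURNS does).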

### 5.8.1 UC : Creation of an ARMA process

The creation of an ARMA model requires the AR and MA coefficients, which are :

• a list of scalars in the unidimensional case : $\left({a}_{1},\cdots ,{a}_{p}\right)$ for the AR-coefficients and $\left({b}_{1},\cdots ,{b}_{q}\right)$ for the MA-coefficients, defined thanks to a NumericalPoint

• a list of square matrices $\left({\underline{\underline{A}}}_{\phantom{\rule{0.166667em}{0ex}}1},\cdots ,\underline{\underline{A}}{\phantom{\rule{0.166667em}{0ex}}}_{p}\right)$ for the AR-coefficients and $\left({\underline{\underline{B}}}_{\phantom{\rule{0.166667em}{0ex}}1},\cdots ,{\underline{\underline{B}}}_{\phantom{\rule{0.166667em}{0ex}}q}\right)$ for the MA-coefficients, defined thanks to a SquareMatrixCollection

It also requires the definition of a white noise $\underline{\epsilon }$ defined on the same time grid as the process.

The current state of an ARMA model is characterized by its last $p$ values and the last $q$ values of its white noise. It is possible to get that state thanks to the method getState.

It is possible to create an ARMA with a specific current state. That specific current state is taken into account to generate possible futures, but not to generate realizations (in order to respect the stationarity property of the model).

At the creation step, OpenTURNS checks whether the process $ARMA\left(p,q\right)$ is stationary, by evaluating the roots of the polynomial (64) associated to the homogeneous system (63). When the process is not stationary, OpenTURNS warns the User with a message.

Requirements

• the coefficients myARList, myMAList (type: NumericalPoint or SquareMatrixCollection)

• a white noise myWhiteNoise (type: WhiteNoise)

• the last realizations myLastValues, myLastNoiseValues (type: NumericalSample)

Results

• the AR and MA coefficients myARCoef, myMACoef (type: ARMACoefficients)

• an ARMA process myARMA (type: ARMA)

• an ARMA state myARMAState (type: ARMAState)

Python script for this UseCase :

script_docUC_StocProc_ARMA_Creation.py

#####################################
# CASE 1 : Without specifying the current state
#####################################

# Create the AR and MA coefficients

# From the lists of the coefficients
# which are vectors in dimension 1 and
# square matrices in dimension d > 1
myARCoef = ARMACoefficients(myARList)
myMACoef = ARMACoefficients(myMAList)

# Create the ARMA model

# From the ARMA coefficients and the white noise,
# without specifying the current state
myARMA = ARMA(myARCoef, myMACoef, myWN_1d)

#####################################
# CASE 2 : Specifying the current state
# Useful to get possible futures from the current state
#####################################

# Define the current state of the ARMA

# Set the last p values of the process
# and the last q values of the noise
myARMAState = ARMAState(myLastValues, myLastNoiseValues)

# Create the ARMA model

# From the AR-MA coefficients, the white noise and a specific state
myARMA = ARMA(myARCoef, myMACoef, myWN_1d, myARMAState)

### 5.8.2 UC : Manipulation of an ARMA process

Once an $ARMA\left(p,q\right)$ model has been created, it is possible to get :

• its linear recurrence thanks to the method print,

• its AR and MA coefficients thanks to the methods getARCoefficients, getMACoefficients,

• its white noise thanks to the method getWhiteNoise, that contains the time grid of the process,

• its current state, that is its last $p$ values and the last $q$ values of its white noise, thanks to the method getState,

• a realization thanks to the method getRealization,

• a sample of realizations thanks to the method getSample,

• a possible future of the model, which is a possible prolongation of the current state on the next ${n}_{prol}$ instants, thanks to the method getFuture.

• $n$ possible futures of the model, which correspond to $n$ possible prolongations of the current state on the next ${n}_{prol}$ instants, thanks to the method getFuture(${n}_{prol}$, $n$).

It is important to note that :

• when asking for a realization of the stationary process modeled by $ARMA\left(p,q\right)$, one must obtain a realization that does not depend on the current state of the process;

• whereas, when one asks for a possible future extending a particular current state of the process, the realization of the model must depend on that current state.

How to proceed to respect these constraints?

If we note ${\underline{X}}_{1}\left(\omega ,t\right)$ and ${\underline{X}}_{2}\left(\omega ,t\right)$ two distinct solutions of (62) associated to two distinct initial states, then the process $\underline{D}\left(\omega ,t\right)={\underline{X}}_{2}\left(\omega ,t\right)-{\underline{X}}_{1}\left(\omega ,t\right)$ is a solution of the homogeneous equation associated to (62) and thus decreases with time under the condition (65). Let us note ${N}_{ther}$ the number such that :

 ${\left(\underset{i,j}{max}|{r}_{ij}|\right)}^{{N}_{ther}}<\epsilon ,⟺{N}_{ther}>\frac{ln\epsilon }{ln{max}_{i,j}|{r}_{ij}|}$ (68)

where the ${r}_{ij}$ are the components of the roots ${\underline{r}}_{i}$ of the polynomial (64) and $\epsilon$ is the precision of the computer ($\epsilon ={2}^{-53}\simeq {10}^{-16}$). Then, after ${N}_{ther}$ instants, the process $\underline{D}\left(\omega ,t\right)$ has vanished, which means that the processes ${\underline{X}}_{1}\left(\omega ,t\right)$ and ${\underline{X}}_{2}\left(\omega ,t\right)$ no longer differ. As a conclusion, after ${N}_{ther}$ instants, the realization of the ARMA process no longer depends on the initial state.

That is why, when generating a realization of the ARMA model, OpenTURNS performs a thermalization step that simply consists in simulating the model over ${N}_{ther}$ additional instants, erasing the ${N}_{ther}$ first values and retaining only the remaining ones. That step ensures that the realization of the process does not depend on the initial state.

By default, the number ${N}_{ther}$ is evaluated according to (68) by the method computeNThermalization. The User can access it with the method getNThermalization and change it with the method setNThermalization. (In order to give ${N}_{ther}$ back its default value, it is necessary to re-use the method computeNThermalization.)
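The bound (68) is easy to evaluate. The following plain Python sketch (a stand-in for computeNThermalization, with a hypothetical maximal root modulus) computes the number of thermalization instants:

```python
# Thermalization size from (68): N_ther > ln(eps) / ln(max |r_ij|),
# with eps the machine precision 2^-53 by default.
import math

def n_thermalization(max_root_modulus, eps=2.0 ** -53):
    # both logs are negative, so the ratio is a positive count of instants
    return math.ceil(math.log(eps) / math.log(max_root_modulus))

print(n_thermalization(0.61))  # hypothetical |r|max = 0.61 -> 75 instants
```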

On the contrary, in the context of getting a possible future from a specified current state, the User should make sure that the number of additional instants ${N}_{it}$ over which he wants to extend the process is such that ${N}_{it}<{N}_{ther}$, because beyond ${N}_{ther}$ the future no longer has any link with the present.

More precisely, after ${N}_{it}^{*}$ instants, such that :

 ${\left(\underset{i,j}{max}|{r}_{ij}|\right)}^{{N}_{it}^{*}}<\underset{i}{max}{\sigma }_{i},⟺{N}_{it}^{*}>\frac{ln\left({max}_{i}{\sigma }_{i}\right)}{ln{max}_{i,j}|{r}_{ij}|}$ (69)

where the ${\sigma }_{i}$ are the components of the covariance matrix of the white noise $\underline{\epsilon }$, the influence of the initial state is of the same order as the influence of the white noise.

Let us note that when the ARMA model is created without specifying the current state, OpenTURNS automatically performs a thermalization step at the creation of the ARMA object.

Before asking for the generation of a possible future, the User has to specify the current state of the ARMA model, thanks to the creation method that takes the current state into account. In that case, OpenTURNS does not perform the thermalization step.

As an ARMA model is a stochastic process, the object ARMA inherits the methods of the Process object. Thus, it is possible to get its marginal processes, its time grid, its dimension and to get several realizations at a time of the process.

Requirements

• an ARMA process myARMA (type: ARMA)

Results

• the AR and MA coefficients myARCoef, myMACoef (type: ARMACoefficients)

• an ARMA state myARMAState (type: ARMAState)

• the last realizations myLastValues, myLastNoiseValues (type: NumericalSample)

• a white noise myWhiteNoise (type: WhiteNoise)

• a time series ts (type: TimeSeries)

Python script for this Use Case :

script_docUC_StocProc_ARMA_Manipulation.py

# Check the linear recurrence
print("myARMA = ", myARMA)

# Get the coefficients of the recurrence
print('AR coeff = ', myARMA.getARCoefficients())
print('MA coeff = ', myARMA.getMACoefficients())

# Get the white noise
myWhiteNoise = myARMA.getWhiteNoise()

# Generate one time series
ts = myARMA.getRealization()
ts.setName('my time series')

# Draw the time series : marginal index 0
myTSGraph = ts.drawMarginal(0)
# View(myTSGraph).show()

# Generate k time series
k = 5
myProcessSample = myARMA.getSample(k)

# Then get the current state of the ARMA
myARMAState = myARMA.getState()
# From the myARMAState, get the last values
myLastValues = myARMAState.getX()
# From the ARMAState, get the last noise values
myLastEpsilonValues = myARMAState.getEpsilon()

# Get the number of iterations before getting a stationary state
nThermal = myARMA.getNThermalization()
print('Thermalization number = ', nThermal)

# It may be important to evaluate it with another precision epsilon
epsilon = 1e-8
newThermalValue = myARMA.computeNThermalization(epsilon)
myARMA.setNThermalization(newThermalValue)

# Make a prediction from the current state of the ARMA
# on the next Nit instants
Nit = 100
# at first, specify a current state myARMAState
myARMA = ARMA(myARCoef, myMACoef, myWhiteNoise, myARMAState)

# then, generate a possible future
possibleFuture = myARMA.getFuture(Nit)

# Generate N possible futures on the Nit next points
N = 5
possibleFuture_N = myARMA.getFuture(Nit, N)
possibleFuture_N.setName('Possible futures')

# Draw the future : marginal index 0
myFutureGraph = possibleFuture_N.drawMarginal(0)
# View(myFutureGraph).show()

The example illustrated below is a 1D ARMA process with $p=4$, $q=0$, defined by the following recurrence:

 $\begin{array}{c}\hfill {X}_{t}+0.4{X}_{t-1}+0.3{X}_{t-2}+0.2{X}_{t-3}+0.1{X}_{t-4}={\epsilon }_{t}\end{array}$

The white noise $\epsilon$ is a normal white noise with $\sigma =0.01$. This process is stationary.

Figures (62) to (63) respectively draw the graphs of one realization of the stationary process, a sample of 5 realizations of the process, and one, then 5, possible futures from the first realization on the next 50 instants (note that the thermalization number given by (68) is 75, which is greater than 50).

 One realization of ARMA(4,0). 5 realizations of ARMA(4,0).
Figure 62
 One possible future of ARMA(4,0) on the next 50 instants. 5 possible futures of ARMA(4,0) on the next 50 instants.
Figure 63

### 5.8.3 UC : Estimation of a scalar ARMA process

The objective of the Use Case is to estimate an ARMA model from a scalar stationary time series using the Whittle estimator and a centered normal white noise.

The data can be a unique time series or several time series collected in a process sample.

If the User specifies the order $\left(p,q\right)$, OpenTURNS fits a model $ARMA\left(p,q\right)$ to the data by estimating the coefficients $\left({a}_{1},\cdots ,{a}_{p}\right),\left({b}_{1},\cdots ,{b}_{q}\right)$ and the variance $\sigma$ of the white noise.

If the User specifies a range of orders $\left(p,q\right)\in In{d}_{p}×In{d}_{q}$, where $In{d}_{p}=\left[{p}_{1},{p}_{2}\right]$ and $In{d}_{q}=\left[{q}_{1},{q}_{2}\right]$, OpenTURNS selects the best model $ARMA\left(p,q\right)$ that fits the data and estimates the corresponding coefficients. The best model is selected with respect to the $AI{C}_{c}$ criterion (corrected Akaike Information Criterion), defined by:

 $\begin{array}{c}\hfill AICc=-2log{L}_{w}+2\left(p+q+1\right)\frac{m}{m-p-q-2}\end{array}$

where $m$ is half the number of points of the time grid of the process sample (if the data are a process sample) or in a block of the time series (if the data are a time series).

Two other criteria are computed for each order $\left(p,q\right)$ :

• the AIC criterion :

 $\begin{array}{c}\hfill AIC=-2log{L}_{w}+2\left(p+q+1\right)\end{array}$
• and the BIC criterion:

 $\begin{array}{c}\hfill BIC=-2log{L}_{w}+2\left(p+q+1\right)log\left(p+q+1\right)\end{array}$

The BIC criterion favors models with better predictive ability; the AIC criterion selects the model that best fits the given data; the $AI{C}_{c}$ criterion improves on AIC by penalizing too high an order, which would artificially overfit the data.
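As a concrete reading of the formulas above, the three criteria can be evaluated with a few lines of plain Python. This is a sketch of ours (not OpenTURNS code); it simply transcribes the definitions of $AIC$, $AI{C}_{c}$ and $BIC$ given in the text:

```python
import math

def information_criteria(log_likelihood, p, q, m):
    """AIC, AICc and BIC of an ARMA(p, q) fit, as defined in the text.

    log_likelihood -- value of the Whittle log-likelihood log L_w
    m              -- half the number of points of the time grid
    """
    k = p + q + 1  # number of estimated scalar parameters
    aic = -2.0 * log_likelihood + 2.0 * k
    aicc = -2.0 * log_likelihood + 2.0 * k * m / (m - p - q - 2.0)
    # BIC as written in the text: penalty 2(p+q+1) log(p+q+1)
    bic = -2.0 * log_likelihood + 2.0 * k * math.log(k)
    return aic, aicc, bic
```

Since the $-2logL_w$ term is common to the three criteria, they only differ by the penalty attached to the number of estimated parameters.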

For each order $\left(p,q\right)$, the estimation of the coefficients ${\left({a}_{k}\right)}_{1\le k\le p}$, ${\left({b}_{k}\right)}_{1\le k\le q}$ and the variance ${\sigma }^{2}$ is done using the Whittle estimator which is based on the maximization of the likelihood function in the frequency domain.

The principle is detailed hereafter for the case of a time series : in the case of a process sample, the estimator is similar except for the periodogram which is computed differently.

We consider a time series associated to the time grid $\left({t}_{0},\cdots ,{t}_{n-1}\right)$ and a particular order $\left(p,q\right)$. Using the notation (), the spectral density function of the $ARMA\left(p,q\right)$ process writes :

 $f\left(\lambda ,\underline{\theta },{\sigma }^{2}\right)=\frac{{\sigma }^{2}}{2\pi }\frac{{\left|1+{b}_{1}exp\left(-i\lambda \right)+\cdots +{b}_{q}exp\left(-iq\lambda \right)\right|}^{2}}{{\left|1+{a}_{1}exp\left(-i\lambda \right)+\cdots +{a}_{p}exp\left(-ip\lambda \right)\right|}^{2}}=\frac{{\sigma }^{2}}{2\pi }g\left(\lambda ,\underline{\theta }\right)$ (70)

where $\underline{\theta }=\left({a}_{1},{a}_{2},...,{a}_{p},{b}_{1},...,{b}_{q}\right)$ and $\lambda$ is the frequency value.

The Whittle log-likelihood writes :

 $log{L}_{w}\left(\underline{\theta },{\sigma }^{2}\right)=-\sum _{j=1}^{m}logf\left({\lambda }_{j},\underline{\theta },{\sigma }^{2}\right)-\frac{1}{2\pi }\sum _{j=1}^{m}\frac{I\left({\lambda }_{j}\right)}{f\left({\lambda }_{j},\underline{\theta },{\sigma }^{2}\right)}$ (71)

where :

• $I$ is the nonparametric estimate of the spectral density, expressed in the Fourier space (frequencies in $\left[0,2\pi \right]$ instead of $\left[-{f}_{max},{f}_{max}\right]$). By default, OpenTURNS uses the Welch estimator.

• ${\lambda }_{j}$ is the $j$-th Fourier frequency, ${\lambda }_{j}=\frac{2\pi j}{n}$, $j=1...m$, with $m$ the largest integer $\le \frac{n-1}{2}$.
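For illustration, the Fourier frequencies and a raw periodogram can be computed in plain Python as follows. This is a sketch of ours, not the Welch estimator used by OpenTURNS, and the $1/\left(2\pi n\right)$ normalization is one common convention among several:

```python
import cmath
import math

def fourier_frequencies_and_periodogram(x):
    """Raw periodogram of the series x at the Fourier frequencies
    lambda_j = 2*pi*j/n, j = 1..m, with m the largest integer <= (n-1)/2."""
    n = len(x)
    m = (n - 1) // 2
    freqs, pgram = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        # discrete Fourier transform of the series at frequency lambda_j
        dft = sum(x[t] * cmath.exp(-1j * lam * t) for t in range(n))
        freqs.append(lam)
        pgram.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, pgram
```

A constant series has a vanishing periodogram at every nonzero Fourier frequency, which is a quick sanity check of the implementation.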

We estimate the $\left(p+q+1\right)$ scalar coefficients by maximizing the log-likelihood function. The corresponding equations lead to the following relation :

 ${\sigma }^{2*}=\frac{1}{m}\sum _{j=1}^{m}\frac{I\left({\lambda }_{j}\right)}{g\left({\lambda }_{j},{\underline{\theta }}^{*}\right)}$ (72)

where ${\underline{\theta }}^{*}$ maximizes :

 $log{L}_{w}\left(\underline{\theta }\right)=m\left(log\left(2\pi \right)-1\right)-mlog\left(\frac{1}{m}\sum _{j=1}^{m}\frac{I\left({\lambda }_{j}\right)}{g\left({\lambda }_{j},\underline{\theta }\right)}\right)-\sum _{j=1}^{m}log\phantom{\rule{0.166667em}{0ex}}g\left({\lambda }_{j},\underline{\theta }\right)$ (73)
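The closed form (72) can be checked numerically: plugging ${\sigma }^{2*}$ into (71) should give a larger log-likelihood than any nearby value of ${\sigma }^{2}$. The pure-Python sketch below (function names are ours, and the periodogram values are arbitrary toy data) transcribes (70) and (71) for a scalar ARMA:

```python
import cmath
import math

def g(lam, a, b):
    """g(lambda, theta) of (70): |MA polynomial|^2 / |AR polynomial|^2
    evaluated at exp(-i*lambda)."""
    num = abs(1 + sum(bk * cmath.exp(-1j * lam * (k + 1))
                      for k, bk in enumerate(b))) ** 2
    den = abs(1 + sum(ak * cmath.exp(-1j * lam * (k + 1))
                      for k, ak in enumerate(a))) ** 2
    return num / den

def whittle_loglik(I, lams, a, b, sigma2):
    """Whittle log-likelihood (71), with f = sigma^2 / (2 pi) * g."""
    total = 0.0
    for Ij, lam in zip(I, lams):
        f = sigma2 / (2.0 * math.pi) * g(lam, a, b)
        total += -math.log(f) - Ij / (2.0 * math.pi * f)
    return total
```

For fixed $\underline{\theta }$, the likelihood (71) is maximized over ${\sigma }^{2}$ exactly at ${\sigma }^{2*}=\frac{1}{m}\sum _{j}I\left({\lambda }_{j}\right)/g\left({\lambda }_{j},\underline{\theta }\right)$, which is relation (72).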

The Whittle estimation requires that:

• the eigenvalues of the companion matrix associated to the polynomial $1+{a}_{1}X+\cdots +{a}_{p}{X}^{p}$ (that is, the roots of this polynomial) lie outside the unit disc, which guarantees the stationarity of the process;

• the eigenvalues of the companion matrix associated to the polynomial $1+{b}_{1}X+\cdots +{b}_{q}{X}^{q}$ lie outside the unit disc, which guarantees the invertibility of the process.
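The stationarity condition above can be checked numerically without a root finder: the roots of $1+{a}_{1}X+\cdots +{a}_{p}{X}^{p}$ lie outside the unit disc exactly when the reversed monic polynomial ${z}^{p}+{a}_{1}{z}^{p-1}+\cdots +{a}_{p}$ has all its roots inside it, i.e. when the powers of its companion matrix vanish. The following pure-Python sketch is ours and is not the OpenTURNS implementation:

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def ar_polynomial_is_stable(a, power=200, tol=1e-6):
    """True if the roots of 1 + a_1 X + ... + a_p X^p lie outside the
    unit disc, by checking that C**power ~ 0 for the companion matrix C
    of z^p + a_1 z^{p-1} + ... + a_p (spectral radius < 1)."""
    p = len(a)
    C = [[0.0] * p for _ in range(p)]
    for j in range(p):
        C[0][j] = -a[j]
    for i in range(1, p):
        C[i][i - 1] = 1.0
    M = C
    for _ in range(power - 1):
        M = matmul(M, C)
        # bail out early for clearly unstable polynomials
        if max(abs(x) for row in M for x in row) > 1e12:
            return False
    return max(abs(x) for row in M for x in row) < tol
```

Applied to the ARMA(4,0) example of the previous use case, the coefficients $\left(0.4,0.3,0.2,0.1\right)$ pass the check, which is consistent with the process being stationary.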

OpenTURNS proceeds as follows :

• the object WhittleFactory is created with either a specified order $\left(p,q\right)$ or a range $In{d}_{p}×In{d}_{q}$. By default, the Welch estimator (object Welch) is used with its default parameters.

• for each order $\left(p,q\right)$, the estimation of the parameters is done by maximizing the reduced equation of the Whittle likelihood function (73), thanks to the method build of the object WhittleFactory. This method applies to a time series or a process sample.

If the User wants to get the quantified criteria $AI{C}_{c},AIC$ and BIC of the model $ARMA\left(p,q\right)$, he has to specify it by giving a NumericalPoint of size 0 (NumericalPoint()) as input parameter of the method build.

• the output of the estimation is, in all cases, a unique ARMA: the ARMA with the specified order, or the optimal one with respect to the $AI{C}_{c}$ criterion.

• in the case of a range $In{d}_{p}×In{d}_{q}$, the User can get all the estimated models thanks to the method getHistory of the object WhittleFactory. If the build has been parameterized by a NumericalPoint of size 0, the User also has access to all the quantified criteria.

Requirements
• a time series or a collection of time series: myTimeSeries, myProcessSample (type: TimeSeries or ProcessSample)

Results
• the Whittle factory: myFactory (type: WhittleFactory)
• the white noise: myWhiteNoise (type: WhiteNoise)
• an ARMA process: myARMA, arma (type: ARMA)
• a set of information criteria: myCriterion (type: NumericalPoint)
• a collection of Whittle estimates: myWhittleHistory (type: WhittleFactoryStateCollection)

Python script for this UseCase :

script_docUC_StocProc_ARMA_Estimation_Whittle.py

###################################
# CASE 1 : we specify a (p,q) order
###################################

# Specify the order (p,q)
p = 4
q = 2

# Build a Whittle factory
# with default SpectralModelFactory (WelchFactory)
myFactory_42 = WhittleFactory(p, q)

# Check the default SpectralModelFactory
print("Default Spectral Model Factory = ",
      myFactory_42.getSpectralModelFactory())

# To set the spectral model factory:
# for example, set WelchFactory as SpectralModelFactory
# with the Hanning filtering window.
# The Welch estimator splits the time series into four blocks without overlap
myFilteringWindow = Hanning()
mySpectralFactory = WelchFactory(myFilteringWindow, 4, 0)
myFactory_42.setSpectralModelFactory(mySpectralFactory)
print("The new Spectral Model Factory = ",
      myFactory_42.getSpectralModelFactory())

###################################
# CASE 2 : we specify a range of (p,q) orders
###################################

# Range for p = [1, 2, 4]
pIndices = Indices([1, 2, 4])
# Range for q = [4, 5, 6]
qIndices = Indices(3)
# fill from 4 by step = 1
qIndices.fill(4, 1)

# Build a Whittle factory with default SpectralModelFactory (WelchFactory),
# this time using ranges of orders p and q
myFactory_Range = WhittleFactory(pIndices, qIndices)

###################################
# Coefficients estimation
###################################

# Estimate the ARMA model from a time series.
# To get the quantified AICc, AIC and BIC criteria
myCriterion = NumericalPoint()

myARMA_42 = myFactory_42.build(TimeSeries(myTimeSeries), myCriterion)

# Estimate the ARMA model from a process sample
myARMA_Range = myFactory_Range.build(myProcessSample, myCriterion)

###################################
# Results exploitation
###################################

# Get the white noise of the (best) estimated ARMA
myWhiteNoise = myARMA_Range.getWhiteNoise()

# When specified, get the quantified criteria
# for the best model
print("Criteria AICc = ", myCriterion[0])
print("Criteria AIC = ", myCriterion[1])
print("Criteria BIC = ", myCriterion[2])

# Get all the estimated models
myWhittleHistory = myFactory_Range.getHistory()

# Print all the models and the criteria in the history
for i in range(myWhittleHistory.getSize()):
    model_i = myWhittleHistory[i]
    arma = model_i.getARMA()
    print("Order(p,q) = ", model_i.getP(), ", ", model_i.getQ())
    print("AR coeff = ", model_i.getARCoefficients())
    print("MA coeff = ", model_i.getMACoefficients())
    print("White Noise - Sigma = ", model_i.getSigma2())
    print("Criteria AICc, AIC, BIC = ", model_i.getInformationCriteria())
    print('')

### 5.8.4 UC : Estimation of a multivariate ARMA process

The objective of the Use Case is to estimate a multivariate ARMA model from a stationary time series using the maximum likelihood estimator and a centered normal white noise.

The data can be a unique time series or several time series collected in a process sample.

Let ${\left({t}_{i},{\underline{X}}_{i}\right)}_{0\le i\le n-1}$ be a multivariate time series of dimension $d$ generated by an ARMA process represented by equation (62), where $\left(p,q\right)$ are supposed to be known. We assume that the white noise $\epsilon$ is distributed according to the normal distribution with zero mean and with covariance matrix ${\underline{\underline{\Sigma }}}_{\epsilon }={\sigma }^{2}\underline{\underline{Q}}$ where $|\underline{\underline{Q}}|=1$ .

The normality of the white noise implies the normality of the process. If we note $\underline{W}=\left({\underline{X}}_{0},\cdots ,{\underline{X}}_{n-1}\right)$, then $\underline{W}$ is normal with zero mean. Its covariance matrix writes $𝔼\left(\underline{W}{\underline{W}}^{t}\right)={\sigma }^{2}{\Sigma }_{\underline{W}}$ which depends on the coefficients $\left({\underline{\underline{A}}}_{k},{\underline{\underline{B}}}_{l}\right)$ for $k=1,...,p$ and $l=1,...,q$ and on the matrix $\underline{\underline{Q}}$.

The likelihood of $\underline{W}$ writes :

 $L\left(\underline{\beta },{\sigma }^{2}|\underline{W}\right)={\left(2\pi {\sigma }^{2}\right)}^{-\frac{dn}{2}}{|{\Sigma }_{\underline{W}}|}^{-\frac{1}{2}}exp\left(-{\left(2{\sigma }^{2}\right)}^{-1}{\underline{W}}^{t}{\Sigma }_{\underline{W}}^{-1}\underline{W}\right)$ (74)

where $\underline{\beta }=\left({\underline{\underline{A}}}_{k},{\underline{\underline{B}}}_{l},\underline{\underline{Q}}\right),\phantom{\rule{4pt}{0ex}}k=1,...,p$, $l=1,...,q$ and where $|.|$ denotes the determinant.

The difficulty arises from the large size ($dn×dn$) of ${\Sigma }_{\underline{W}}$, which is a dense matrix in the general case. Mauricio J. A. (The exact likelihood function of a vector autoregressive moving average model, 1996) proposes an efficient algorithm to evaluate the likelihood function. The main point is to use a change of variables that leads to a block-diagonal sparse covariance matrix.

The estimation requires that:

• the eigenvalues of the companion matrix associated to the matrix polynomial $\underline{\underline{I}}+{\underline{\underline{A}}}_{1}X+\cdots +{\underline{\underline{A}}}_{p}{X}^{p}$ lie outside the unit disc, which guarantees the stationarity of the process;

• the eigenvalues of the companion matrix associated to the matrix polynomial $\underline{\underline{I}}+{\underline{\underline{B}}}_{1}X+\cdots +{\underline{\underline{B}}}_{q}{X}^{q}$ lie outside the unit disc, which guarantees the invertibility of the process.

OpenTURNS estimates $\left(\underline{\beta },{\sigma }^{2}\right)$ thanks to the ARMALikelihoodFactory object and its method build, acting on a time series or on a sample of time series. It produces a result of type ARMA.

Note that no evaluation of selection criteria such as AIC and BIC is done.

Requirements
• the orders of the model and the dimension of the underlying process: p, q, d (type: integer)
• a time series or a collection of time series: myTimeSeries, mySample (type: TimeSeries, ProcessSample)

Results
• the ARMA likelihood factory: myFactory (type: ARMALikelihoodFactory)
• the white noise: myWhiteNoise (type: WhiteNoise)
• an ARMA process: myARMA (type: ARMA)

Python script for this Use Case :

# Build a factory
myFactory = ARMALikelihoodFactory(p, q, d)

# Estimate the ARMA model coefficients from a time series
myARMA = myFactory.build(myTimeSeries)

# Estimate the ARMA model coefficients from a process sample
myARMA = myFactory.build(mySample)

# Get the white noise of the estimated ARMA
myWhiteNoise = myARMA.getWhiteNoise()

## 5.9 Normal processes

### 5.9.1 UC : Creation of a parametric stationary covariance function

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a multivariate stationary normal process where $𝒟\in {ℝ}^{n}$. The process is supposed to be zero mean. It is entirely defined by its covariance function ${C}^{stat}:𝒟\to {ℳ}_{d×d}\left(ℝ\right)$, defined by ${C}^{stat}\left(\underline{\tau }\right)=𝔼\left[{X}_{\underline{s}}{X}_{\underline{s}+\underline{\tau }}^{t}\right]$ for all $\underline{s}\in {ℝ}^{n}$.

If the process is continuous, then $𝒟={ℝ}^{n}$. In the discrete case, $𝒟$ is a lattice.

This use case illustrates how the User can create a covariance function from parametric models. OpenTURNS implements the multivariate Exponential model as a parametric model for the covariance function ${C}^{stat}$.

The multivariate exponential model: This model defines the covariance function ${C}^{stat}$ by:

 $\forall \underline{\tau }\in 𝒟,\phantom{\rule{1.em}{0ex}}{C}^{stat}\left(\underline{\tau }\right)=\left[\underline{\underline{A}}\underline{\underline{\Delta }}\left(\underline{\tau }\right)\right]\phantom{\rule{0.166667em}{0ex}}\underline{\underline{R}}\phantom{\rule{0.166667em}{0ex}}\left[\underline{\underline{\Delta }}\left(\underline{\tau }\right)\underline{\underline{A}}\right]$ (75)

where $\underline{\underline{R}}\in {ℳ}_{d×d}\left(\left[-1,1\right]\right)$ is a correlation matrix, $\underline{\underline{\Delta }}\left(\underline{\tau }\right)\in {ℳ}_{d×d}\left(ℝ\right)$ is defined by:

 $\underline{\underline{\Delta }}\left(\underline{\tau }\right)=\text{Diag}\left({e}^{-{\lambda }_{1}|\tau |/2},\cdots ,{e}^{-{\lambda }_{d}|\tau |/2}\right)$ (76)

and $\underline{\underline{A}}\in {ℳ}_{d×d}\left(ℝ\right)$ is defined by:

 $\underline{\underline{A}}=\text{Diag}\left({a}_{1},\cdots ,{a}_{d}\right)$ (77)

with ${\lambda }_{i}>0$ and ${a}_{i}>0$ for any $i$.

We call $\underline{a}$ the amplitude vector and $\underline{\lambda }$ the scale vector.

The expression of ${C}^{stat}$ is the combination of:

• the matrix $\underline{\underline{R}}$ that models the spatial correlation between the components of the process $X$ at any vertex $\underline{t}$ (since the process is stationary):

 $\forall \underline{t}\in 𝒟,\phantom{\rule{1.em}{0ex}}\underline{\underline{R}}=\mathrm{Cor}\left[{X}_{\underline{t}},{X}_{\underline{t}}\right]$ (78)
• the matrix $\underline{\underline{\Delta }}\left(\underline{\tau }\right)$ that models the correlation between the marginal random variables ${X}_{\underline{t}}^{i}$ and ${X}_{\underline{t}+\underline{\tau }}^{i}$:

 $\begin{array}{c}\hfill \mathrm{Cor}\left[{X}_{\underline{t}}^{i},{X}_{\underline{t}+\underline{\tau }}^{i}\right]={e}^{-{\lambda }_{i}|\tau |}\end{array}$ (79)
• the matrix $\underline{\underline{A}}$ that models the variance of each marginal random variable:

 $\begin{array}{c}\hfill \mathrm{Var}\left[{X}_{\underline{t}}^{i}\right]={a}_{i}^{2}\end{array}$ (80)

This model is such that:

 $\begin{array}{cc}\hfill {C}_{ij}^{stat}\left(\underline{\tau }\right)& ={a}_{i}{e}^{-{\lambda }_{i}|\tau |/2}{R}_{i,j}{a}_{j}{e}^{-{\lambda }_{j}|\tau |/2},\phantom{\rule{1.em}{0ex}}i\ne j\hfill \\ \hfill {C}_{ii}^{stat}\left(\underline{\tau }\right)& ={a}_{i}{e}^{-{\lambda }_{i}|\tau |/2}{R}_{i,i}{a}_{i}{e}^{-{\lambda }_{i}|\tau |/2}={a}_{i}^{2}{e}^{-{\lambda }_{i}|\tau |}\hfill \end{array}$ (81)
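The entrywise expression above can be transcribed directly into a few lines of plain Python. This sketch is ours (not the OpenTURNS ExponentialModel object) and only illustrates the formula, for a scalar index $\tau$ ($n=1$):

```python
import math

def exponential_covariance(tau, amplitude, scale, R):
    """Entries of C^stat(tau) for the exponential model:
    C_ij(tau) = a_i exp(-lambda_i |tau|/2) R_ij a_j exp(-lambda_j |tau|/2)."""
    d = len(amplitude)
    t = abs(tau)
    return [[amplitude[i] * math.exp(-scale[i] * t / 2.0) * R[i][j]
             * amplitude[j] * math.exp(-scale[j] * t / 2.0)
             for j in range(d)] for i in range(d)]
```

At $\tau =0$ the diagonal entries reduce to ${a}_{i}^{2}$ and the off-diagonal ones to ${a}_{i}{R}_{ij}{a}_{j}$, which matches relation (82) between the spatial covariance and the correlation matrix.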

It is possible to define the exponential model from the spatial covariance matrix ${\underline{\underline{C}}}^{spat}$ rather than the correlation matrix $\underline{\underline{R}}$ :

 $\forall \underline{t}\in 𝒟,\phantom{\rule{1.em}{0ex}}{\underline{\underline{C}}}^{spat}=𝔼\left[{X}_{\underline{t}}{X}_{\underline{t}}^{t}\right]=\underline{\underline{A}}\phantom{\rule{0.166667em}{0ex}}\underline{\underline{R}}\phantom{\rule{0.166667em}{0ex}}\underline{\underline{A}}$ (82)

OpenTURNS implements the multivariate exponential model thanks to the object ExponentialModel which is created from :

• the scale and amplitude vectors $\left(\underline{\lambda },\underline{a}\right)$: in that case, by default $\underline{\underline{R}}=\underline{\underline{I}}$;

• the scale and amplitude vectors and the spatial correlation matrix $\left(\underline{\lambda },\underline{a},\underline{\underline{R}}\right)$;

• the scale and amplitude vectors and the spatial covariance matrix $\left(\underline{\lambda },\underline{a},\underline{\underline{C}}\right)$; Then $\underline{\underline{C}}$ is mapped into the associated correlation matrix $\underline{\underline{R}}$ according to (82) and the previous constructor is used.

Requirements
• $\underline{a}$, $\underline{\lambda }$ : amplitude, scale (type: NumericalPoint)
• $\underline{\underline{R}}$ : spatialCorrelation (type: CorrelationMatrix)
• $\underline{\underline{C}}$ : spatialCovariance (type: CovarianceMatrix)

Results
• a covariance model: myCovarianceModel (type: StationaryCovarianceModel)

Python script for this UseCase :

script_docUC_StocProc_StationaryCovarianceFunction_Param.py

# Create the covariance model,
# for example the Exponential model with spatial dimension = 1
spatialDimension = 1

# from the amplitude and scale, no spatial correlation
myCovarianceModel = ExponentialModel(spatialDimension, amplitude, scale)

# from the amplitude, scale and spatial correlation
myCovarianceModel = ExponentialModel(
    spatialDimension, amplitude, scale, spatialCorrelation)

# or from the scale and spatial covariance
myCovarianceModel = ExponentialModel(
    spatialDimension, scale, spatialCovariance)

### 5.9.2 UC : Creation of a User defined stationary covariance function

This use case illustrates how the User can define his own stationary covariance model.

A stationary covariance model ${C}^{stat}$ is defined by : ${C}^{stat}:𝒟\to {ℳ}_{d×d}\left(ℝ\right)$ where ${C}^{stat}\left(\underline{\tau }\right)$ is a covariance matrix of dimension $d$.

Note that $𝒟={ℝ}^{n}$ in the continuous case and $𝒟$ is a lattice in the discrete case.

OpenTURNS allows the User to define his own stationary covariance model thanks to the object UserDefinedStationaryCovarianceModel defined from :

• a mesh $ℳ$ of dimension $n$ defined by the vertices $\left({\underline{\tau }}_{0},\cdots ,{\underline{\tau }}_{N-1}\right)$ and the associated simplices,

• a collection of covariance matrices stored in the object CovarianceMatrixCollection noted $\left({\underline{\underline{C}}}_{0},\cdots ,{\underline{\underline{C}}}_{N-1}\right)$ where ${\underline{\underline{C}}}_{k}\in {ℳ}_{d×d}\left(ℝ\right)$ for $0\le k\le N-1$.

Then OpenTURNS builds a stationary covariance function which is a piecewise constant function on $𝒟$ defined by:

 $\begin{array}{c}\hfill \forall \underline{\tau }\in 𝒟,\phantom{\rule{0.166667em}{0ex}}{C}^{stat}\left(\underline{\tau }\right)={\underline{\underline{C}}}_{k}\phantom{\rule{4.pt}{0ex}}\text{where}\phantom{\rule{4.pt}{0ex}}k\phantom{\rule{4.pt}{0ex}}\text{is}\phantom{\rule{4.pt}{0ex}}\text{such}\phantom{\rule{4.pt}{0ex}}\text{that}\phantom{\rule{4.pt}{0ex}}{\underline{\tau }}_{k}\phantom{\rule{4.pt}{0ex}}\text{is}\phantom{\rule{4.pt}{0ex}}\text{the}\phantom{\rule{4.pt}{0ex}}\text{vertex}\phantom{\rule{4.pt}{0ex}}\text{of}\phantom{\rule{4.pt}{0ex}}ℳ\phantom{\rule{4.pt}{0ex}}\text{nearest}\phantom{\rule{4.pt}{0ex}}\text{to}\phantom{\rule{4.pt}{0ex}}\underline{\tau }.\end{array}$

Care: in its version 1.3, OpenTURNS only implements the case $n=1$ where the mesh $ℳ$ is a regular time grid $\left({t}_{0},\cdots ,{t}_{N-1}\right)$ discretizing $𝒟=\left[0,T\right]$.
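The nearest-vertex rule is easy to illustrate in the $n=1$ case. The sketch below is ours (a plain-Python stand-in for UserDefinedStationaryCovarianceModel, with matrices stored as lists of rows):

```python
def piecewise_stationary_covariance(tau, vertices, matrices):
    """Piecewise constant stationary covariance (n = 1): return the
    covariance matrix attached to the vertex nearest to tau."""
    k = min(range(len(vertices)), key=lambda i: abs(vertices[i] - tau))
    return matrices[k]
```

Any $\tau$ between two vertices is mapped to the closer one, so the resulting function is constant on the cells of the grid.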

Requirements
• a mesh: myShiftMesh (type: Mesh)
• a collection of covariance matrices: myCovarianceCollection (type: CovarianceMatrixCollection)
• one vertex: tau (type: NumericalPoint)

Results
• a stationary covariance model: myCovarianceModel (type: StationaryCovarianceModel)

Python script for this UseCase :

script_docUC_StocProc_StationaryCovarianceFunction_UserDefined.py

# Create the covariance model
myCovarianceModel = UserDefinedStationaryCovarianceModel(
    myShiftMesh, myCovarianceCollection)

# Get the covariance function computed at the vertex tau
myCovarianceMatrix = myCovarianceModel(tau)

In the following example, we illustrate how to create a stationary covariance model of dimension $d=1$ on the domain $𝒟=\left[0,T\right]$.

We model the covariance function defined by ${C}^{stat}:ℝ\to ℝ$ with ${C}^{stat}\left(\tau \right)=\frac{1}{1+{\tau }^{2}}$. In this example, the domain $𝒟=\left[0,20\right]$ is discretized with the time step $\Delta t=0.5$.
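Building the collection of covariance values for this example amounts to sampling ${C}^{stat}$ on the regular grid. The following plain-Python sketch (ours; OpenTURNS would expect CovarianceMatrix objects rather than floats) shows the discretization:

```python
def covariance_collection(T=20.0, dt=0.5):
    """Sample C(tau) = 1 / (1 + tau^2) on the regular grid of [0, T]
    with step dt (d = 1, so each 'matrix' is a scalar)."""
    n = int(T / dt) + 1
    taus = [i * dt for i in range(n)]
    return taus, [1.0 / (1.0 + tau ** 2) for tau in taus]
```

The grid has 41 points here, and the resulting piecewise constant model takes the value of the nearest grid point, as described above.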

The Figure 64 draws the covariance function.

 User defined stationary covariance model in dimension 1 when 𝒟=[0,T].
Figure 64

### 5.9.3 UC : Creation of a User defined covariance function

This use case illustrates how the User can define his own covariance model.

A covariance function $C$ is defined by : $C:𝒟×𝒟\to {𝕄}_{d×d}\left(ℝ\right)$ where $C\left(\underline{s},\underline{t}\right)$ is a covariance matrix of dimension $d$.

The domain $𝒟$ is discretized on the mesh $ℳ$.

OpenTURNS allows the User to define his own covariance model thanks to the object UserDefinedCovarianceModel defined from :

• a mesh $ℳ\in {ℝ}^{n}$ defined by the vertices $\left({\underline{t}}_{0},\cdots ,{\underline{t}}_{N-1}\right)$ and the associated simplices,

• a collection of $N\left(N+1\right)/2$ covariance matrices stored in the object CovarianceMatrixCollection noted ${\left({\underline{\underline{C}}}_{k,\ell }\right)}_{0\le \ell \le k\le N-1}$ where ${\underline{\underline{C}}}_{k,\ell }\in {ℳ}_{d×d}\left(ℝ\right)$.

Care: The covariance matrices ${\left({\underline{\underline{C}}}_{i,j}\right)}_{0\le j\le i\le N-1}$ must be given in the following order:

 $\begin{array}{c}\hfill {\underline{\underline{C}}}_{0,0},\phantom{\rule{0.166667em}{0ex}}{\underline{\underline{C}}}_{1,0},\phantom{\rule{0.166667em}{0ex}}{\underline{\underline{C}}}_{1,1},\phantom{\rule{0.166667em}{0ex}}{\underline{\underline{C}}}_{2,0},\phantom{\rule{0.166667em}{0ex}}{\underline{\underline{C}}}_{2,1},\phantom{\rule{0.166667em}{0ex}}{\underline{\underline{C}}}_{2,2},\phantom{\rule{0.166667em}{0ex}}\cdots \end{array}$

which corresponds to the global covariance matrix, whose lower part is:

 $\begin{array}{c}\hfill \left(\begin{array}{cccc}{\underline{\underline{C}}}_{0,0}& & & \\ {\underline{\underline{C}}}_{1,0}& {\underline{\underline{C}}}_{1,1}& & \\ {\underline{\underline{C}}}_{2,0}& {\underline{\underline{C}}}_{2,1}& {\underline{\underline{C}}}_{2,2}& \\ \cdots & \cdots & \cdots & \cdots \end{array}\right)\end{array}$

Using that collection of covariance matrices, OpenTURNS builds a covariance function which is a piecewise constant function defined on $𝒟×𝒟$ by:

 $\begin{array}{c}\hfill \forall \left(\underline{s},\underline{t}\right)\in 𝒟×𝒟,\phantom{\rule{0.166667em}{0ex}}\phantom{\rule{1.em}{0ex}}C\left(\underline{s},\underline{t}\right)={\underline{\underline{C}}}_{k\left(\underline{s}\right),k\left(\underline{t}\right)}\end{array}$

where $k\left(\underline{s}\right)$ is such that ${\underline{t}}_{k\left(\underline{s}\right)}$ is the vertex of $ℳ$ the nearest to $\underline{s}$.

It follows that:

 $\begin{array}{c}\hfill C\left({\underline{t}}_{k},{\underline{t}}_{\ell }\right)={\underline{\underline{C}}}_{k,\ell }\end{array}$

Concerning the collection of covariance matrices that is used to build the discretized covariance model, we have that:

• the matrix ${\underline{\underline{C}}}_{k,\ell }$ has the index $n=\ell +\frac{k\left(k+1\right)}{2}$.

• inversely, the matrix stored at index $n$ in the collection of covariance matrices, is the matrix ${\underline{\underline{C}}}_{k,\ell }$ where:

 $\begin{array}{c}\hfill k=⌊\frac{1}{2}\left(\sqrt{8n+1}-1\right)⌋\end{array}$

and

 $\begin{array}{c}\hfill \ell =n-\frac{k\left(k+1\right)}{2}\end{array}$
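These two index formulas form a bijection between the pairs $\left(k,\ell \right)$ with $0\le \ell \le k$ and the flat storage indices $n$. A small plain-Python sketch (function names are ours) makes the round trip explicit:

```python
import math

def flat_index(k, l):
    """Index of C_{k,l} (l <= k) in the lower-triangular collection."""
    return l + k * (k + 1) // 2

def matrix_indices(n):
    """Inverse mapping: the (k, l) of the matrix stored at flat index n."""
    k = (math.isqrt(8 * n + 1) - 1) // 2
    l = n - k * (k + 1) // 2
    return k, l
```

For instance, ${\underline{\underline{C}}}_{2,1}$ sits at index $4$ in the ordering ${\underline{\underline{C}}}_{0,0},{\underline{\underline{C}}}_{1,0},{\underline{\underline{C}}}_{1,1},{\underline{\underline{C}}}_{2,0},{\underline{\underline{C}}}_{2,1},\cdots$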
Requirements
• a mesh: myMesh (type: Mesh)
• a collection of covariance matrices: myCovarianceCollection (type: CovarianceMatrixCollection)
• two vertices: s, t (type: NumericalPoint)

Results
• a covariance model: myCovarianceModel (type: UserDefinedCovarianceModel)

Python script for this UseCase :

script_docUC_StocProc_NonStationaryCovarianceFunction_UserDefined.py

# Create the covariance model
myCovarianceModel = UserDefinedCovarianceModel(myMesh, myCovarianceCollection)

# Get the covariance function computed at (s,t),
# for example (1.5, 2.5)
s = 1.5
t = 2.5
myCovarianceMatrix = myCovarianceModel(s, t)

In the following example, we illustrate the piecewise constant covariance that OpenTURNS builds from a collection of covariance matrices that comes from the continuous covariance function $C:𝒟×𝒟\to ℝ$ defined by:

 $\begin{array}{c}\hfill C\left(s,t\right)=exp\left(-\frac{4|s-t|}{1+{s}^{2}+{t}^{2}}\right)\end{array}$ (83)

where the domain $𝒟=\left[-4,4\right]$ is discretized on the mesh $ℳ$, which is the regular time grid with $N=64$ points: $\left({t}_{0},\cdots ,{t}_{63}\right)$.

As we have $n=1$ and $d=1$, the covariance matrices are scalars and the collection corresponds to ${\left(C\left({t}_{i},{t}_{j}\right)\right)}_{0\le j\le i\le 63}$.

Figure 65 draws the iso contours of the continuous model $C$ and of the piecewise constant model built by OpenTURNS: the discretized model approximates the exact one with good precision.

 User defined non stationary covariance model in dimension 1 on 𝒟=[-4,4].
Figure 65

### 5.9.4 UC : Estimation of a stationary covariance function

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a multivariate stationary normal process of dimension $d$. We only treat here the case where the domain is of dimension 1: $𝒟\in ℝ$ ($n=1$).

If the process is continuous, then $𝒟=ℝ$. In the discrete case, $𝒟$ is a lattice.

$X$ is supposed to be a second order process with zero mean. It is entirely defined by its covariance function ${C}^{stat}:𝒟\to {ℳ}_{d×d}\left(ℝ\right)$, defined by ${C}^{stat}\left(\tau \right)=𝔼\left[{X}_{s}{X}_{s+\tau }^{t}\right]$ for all $s\in 𝒟$.

In addition, we suppose that its spectral density function $S:ℝ\to {ℋ}^{+}\left(d\right)$ is defined, where ${ℋ}^{+}\left(d\right)\in {ℳ}_{d}\left(ℂ\right)$ is the set of $d$-dimensional positive definite hermitian matrices.

The objective of this use case is to estimate ${C}^{stat}$ from a field or a sample of fields of the process $X$, by first estimating the spectral density function and then mapping $S$ into ${C}^{stat}$ using the inversion relation (37), when possible.

As the mesh is a time grid ($n=1$), the fields can be interpreted as time series.

The estimation algorithm is outlined hereafter.

Let $\left({t}_{0},{t}_{1},\cdots {t}_{N-1}\right)$ be the time grid on which the process is observed and let $\left({\underline{X}}^{0},\cdots ,{\underline{X}}^{M-1}\right)$ be $M$ independent realizations of $X$ or $M$ segments of one realization of the process.

Using (37), the covariance function writes:

 ${C}_{i,j}^{stat}\left(\tau \right)={\int }_{ℝ}exp\left\{2i\pi f\tau \right\}{S}_{i,j}\left(f\right)\phantom{\rule{0.166667em}{0ex}}df$ (84)

where ${C}_{i,j}^{stat}$ is the element $\left(i,j\right)$ of the matrix ${C}^{stat}\left(\tau \right)$ and ${S}_{i,j}\left(f\right)$ that of $S\left(f\right)$. The integral (84) is approximated by its evaluation on the finite domain $\Omega \subset ℝ$:

 ${C}_{i,j}^{stat}\left(\tau \right)={\int }_{\Omega }exp\left\{2i\pi f\tau \right\}{S}_{i,j}\left(f\right)\phantom{\rule{0.166667em}{0ex}}df$ (85)

Let us consider the partition of the domain as follows:

• $\Omega =\left[-{\Omega }_{c},{\Omega }_{c}\right]$ is subdivided into $M$ segments: $\Omega ={\cup }_{k=1}^{M}{ℳ}_{k}$ with ${ℳ}_{k}=\left[{f}_{k}-\frac{\Delta f}{2},{f}_{k}+\frac{\Delta f}{2}\right]$;

• $\Delta f$ is the frequency step: $\Delta f=\frac{2{\Omega }_{c}}{M}$;

• ${f}_{k}$ are the frequencies at which the spectral density is computed: ${f}_{k}=-{\Omega }_{c}+\left(k-\frac{1}{2}\right)\Delta f=\left(2k-1-M\right)\frac{\Delta f}{2}$ with $k=1,2,\cdots ,M$.
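The frequency grid above is simply the set of cell centers of the partition. A one-line plain-Python sketch (the function name is ours):

```python
def frequency_grid(omega_c, M):
    """Centers f_k = (2k - 1 - M) * df / 2 of the M frequency cells
    partitioning [-omega_c, omega_c], with df = 2 * omega_c / M."""
    df = 2.0 * omega_c / M
    return [(2 * k - 1 - M) * df / 2.0 for k in range(1, M + 1)]
```

The first and last centers sit half a step inside the boundaries, at $-{\Omega }_{c}+\frac{\Delta f}{2}$ and ${\Omega }_{c}-\frac{\Delta f}{2}$.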

The equation (85) can be rewritten as:

 ${C}_{i,j}^{stat}\left(\tau \right)=\sum _{k=1}^{M}{\int }_{{ℳ}_{k}}exp\left\{2i\pi f\tau \right\}{S}_{i,j}\left(f\right)\phantom{\rule{0.166667em}{0ex}}df$

We focus on the integral on each subdomain ${ℳ}_{k}$. Using numerical approximation, we have:

 ${\int }_{{ℳ}_{k}}exp\left\{2i\pi f\tau \right\}{S}_{i,j}\left(f\right)\phantom{\rule{0.166667em}{0ex}}df\approx \Delta f{S}_{i,j}\left({f}_{k}\right)exp\left\{2i\pi {f}_{k}\tau \right\}$

The values of $\tau$ must be in correspondence with the frequency values, in accordance with the Shannon criterion. Thus the temporal domain of estimation is the following:

• $\Delta t$ is the time step, $\Delta t=\frac{1}{2{\Omega }_{c}}$, such that $\Delta f\Delta t=\frac{1}{M}$;

• $\stackrel{˜}{𝒯}=\left[-T,T\right]$ is subdivided into $M$ segments: $\stackrel{˜}{𝒯}={\cup }_{m=1}^{M}{𝒯}_{m}$ with ${𝒯}_{m}=\left[{t}_{m}-\frac{\Delta t}{2},{t}_{m}+\frac{\Delta t}{2}\right]$;

• ${t}_{m}$ are the time values at which the covariance is estimated: ${t}_{m}=-\frac{M}{2{\Omega }_{c}}+\left(m-\frac{1}{2}\right)\Delta t=\left(2m-1-M\right)\frac{\Delta t}{2}$.

The estimate of the covariance value at time ${\tau }_{m}$ depends on quantities of the form:

 ${\int }_{{ℳ}_{k}}exp\left\{2i\pi f{\tau }_{m}\right\}{S}_{i,j}\left(f\right)\phantom{\rule{0.166667em}{0ex}}df\approx \Delta f{S}_{i,j}\left({f}_{k}\right)exp\left\{2i\pi {f}_{k}{\tau }_{m}\right\}$ (86)

We develop the expression of ${f}_{k}$ and ${\tau }_{m}$ and we get:

 $\left\{\begin{array}{c}2m-1-M=2\left(m-1\right)-\left(M-1\right)\hfill \\ 2k-1-M=2\left(k-1\right)-\left(M-1\right)\hfill \end{array}\right\$

Thus,

 $\left(2m-1-M\right)\left(2k-1-M\right)=4\left(m-1\right)\left(k-1\right)-\left(M-1\right)\left(2m-1-M\right)-2\left(k-1\right)\left(M-1\right)$

and :

 $\left(2m-1-M\right)\left(2k-1-M\right)\frac{\Delta t}{2}\frac{\Delta f}{2}=\frac{\left(m-1\right)\left(k-1\right)}{M}-\frac{\left(M-1\right)\left(2m-1-M\right)}{4M}-\frac{\left(k-1\right)\left(M-1\right)}{2M}$

We denote :

 $\left\{\begin{array}{c}\delta \left(m\right)=exp\left\{-i\frac{\pi }{2M}\left(M-1\right)\left(2m-1-M\right)\right\}\hfill \\ {\phi }_{k}=exp\left\{-i\frac{\pi }{M}\left(k-1\right)\left(M-1\right)\right\}{S}_{i,j}\left({f}_{k}\right)\hfill \end{array}\right\$

Finally, we get the followig expression for integral in (86) :

 ${\int }_{{ℳ}_{k}}exp\left\{2i\pi f{\tau }_{m}\right\}{S}_{i,j}\left(f\right)\phantom{\rule{0.166667em}{0ex}}df\approx \Delta fexp\left\{2i\frac{\pi }{M}\left(m-1\right)\left(k-1\right)\right\}\delta \left(m\right){\phi }_{k}$

It follows that :

 ${C}_{i,j}^{stat}\left({\tau }_{m}\right)=\Delta f\phantom{\rule{0.166667em}{0ex}}\delta \left(m\right)\sum _{k=1}^{M}{\phi }_{k}\phantom{\rule{0.166667em}{0ex}}exp\left\{2i\frac{\pi }{M}\left(m-1\right)\left(k-1\right)\right\}$ (87)

In the equation (87), we recognize a discrete inverse Fourier transform.
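The inverse-FFT structure of equation (87) can be sketched in numpy; the spectral density used below is a hypothetical scalar Cauchy-type example (the spectral density of $exp\left(-|\tau |\right)$), not an OpenTURNS object:

```python
import numpy as np

M = 64
Omega_c = 5.0
dt = 1.0 / (2.0 * Omega_c)                     # Delta t
df = 2.0 * Omega_c / M                         # Delta f, so that df * dt = 1 / M

k = np.arange(1, M + 1)
f = (2 * k - 1 - M) * df / 2.0                 # frequency values f_k
S = 2.0 / (1.0 + (2.0 * np.pi * f) ** 2)       # hypothetical scalar spectral density

m = np.arange(1, M + 1)
phi = np.exp(-1j * np.pi / M * (k - 1) * (M - 1)) * S
delta = np.exp(-1j * np.pi / (2.0 * M) * (M - 1) * (2 * m - 1 - M))

# The sum over k in (87) is M * ifft(phi): numpy's ifft carries the 1/M factor
C = (df * delta * M * np.fft.ifft(phi)).real   # covariance at tau_m, m = 1..M
```

The result agrees with the direct Riemann sum $\Delta f\sum _{k}S\left({f}_{k}\right)exp\left\{2i\pi {f}_{k}{\tau }_{m}\right\}$, which is the whole point of the rewriting.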

OpenTURNS builds an estimation of the stationary covariance function from a ProcessSample or a TimeSeries using the previous algorithm implemented in the StationaryCovarianceModelFactory class. The result consists of a UserDefinedStationaryCovarianceModel which is easy to manipulate.

Such an object is composed of a time grid and a collection of $K$ square matrices of dimension $d$. $K$ corresponds to the number of time steps of the final time grid on which the covariance is estimated. When estimated from a time series, the UserDefinedStationaryCovarianceModel may have a time grid different from the initial time grid of the time series.

Requirements
• a time series myTimeSeries, type: TimeSeries
• a sample of time series mySample, type: ProcessSample
• a spectral model factory mySpectralFactory, type: SpectralModelFactory

Results
• a factory myFactory, type: StationaryCovarianceModelFactory
• stationary covariance models myEstimatedModel_TS, myEstimatedModel_PS, type: UserDefinedStationaryCovarianceModel

Python script for this Use Case:

script_docUC_StocProc_StationaryCovarianceFunction_Estimation.py

# Build a factory of stationary covariance function
myCovarianceFactory = StationaryCovarianceModelFactory()

# Set the spectral factory algorithm
myCovarianceFactory.setSpectralModelFactory(mySpectralFactory)

# Check the current spectral factory
print(myCovarianceFactory.getSpectralModelFactory())

#########################################
# Case 1 : Estimation on a ProcessSample

# The spectral model factory computes the spectral density function
# without using the block and overlap arguments of the Welch factories
myEstimatedModel_PS = myCovarianceFactory.build(mySample)

#########################################
# Case 2 : Estimation on a TimeSeries

# The spectral model factory computes the spectral density function using
# the block and overlap arguments of spectral model factories
myEstimatedModel_TS = myCovarianceFactory.build(myTimeSeries)

#########################################
# Manipulation of the estimated model

# Evaluate the covariance function at each time step
# Care: if estimated from a time series, the time grid has changed
for i in range(N):
    tau = myTimeGrid.getValue(i)
    cov = myEstimatedModel_PS(tau)

The following example illustrates the case where the available data is a sample of ${10}^{4}$ realizations of the process, defined on the time grid $\left[0,10\right]$, discretized every $\Delta t$ = $0.1$. The covariance model is the Exponential model parameterized by $\lambda =1$ and $a=1$, i.e. ${C}^{stat}\left(\tau \right)=exp\left(-|\tau |\right)$.

Figure 66 draws the graph of the exact covariance function and its estimation.

 Covariance function C(0,t) for t in the time grid : estimation and exact model.
Figure 66
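Independently of OpenTURNS, the example above can be reproduced with a plain numpy sketch: simulate the Gaussian process with covariance $exp\left(-|\tau |\right)$ by a Cholesky factorization of the discretized covariance matrix, then estimate $C\left(0,t\right)$ empirically (the seed and jitter value are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 10.0 + 1e-9, 0.1)                # time grid [0, 10], step 0.1
C = np.exp(-np.abs(t[:, None] - t[None, :]))        # exact covariance exp(-|s - t|)
L = np.linalg.cholesky(C + 1e-10 * np.eye(len(t)))  # small jitter for stability

K = 10000                                           # 10^4 realizations
X = rng.standard_normal((K, len(t))) @ L.T          # each row is one realization

C_hat = X.T @ X / K               # empirical covariance (the mean is known to be 0)
err = np.max(np.abs(C_hat[0] - np.exp(-t)))         # error on C(0, t) = exp(-t)
```

With ${10}^{4}$ realizations, the empirical estimate of $C\left(0,t\right)$ matches $exp\left(-t\right)$ to a few percent, as in Figure 66.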

### 5.9.5 UC : Estimation of a non stationary covariance function

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a multivariate normal process of dimension $d$ where $𝒟\in {ℝ}^{n}$. $X$ is supposed to be a second order process and we note $C:𝒟×𝒟\to {ℳ}_{d×d}\left(ℝ\right)$ its covariance function.

The objective of this use case is to estimate $C$ from several fields generated by the process $X$. We suppose that the process is not stationary.

We denote $\left({\underline{t}}_{0},\cdots ,{\underline{t}}_{N-1}\right)$ the vertices of the common mesh $ℳ$ and $\left({\underline{x}}_{0}^{k},\cdots ,{\underline{x}}_{N-1}^{k}\right)$ the associated values of the field $k$. We suppose that we have $K$ fields.

We recall that the covariance function $C$ writes:

 $\forall \left(\underline{s},\underline{t}\right)\in 𝒟×𝒟,\phantom{\rule{1.em}{0ex}}C\left(\underline{s},\underline{t}\right)=𝔼\left[\left({X}_{\underline{s}}-m\left(\underline{s}\right)\right){\left({X}_{\underline{t}}-m\left(\underline{t}\right)\right)}^{t}\right]$ (88)

where the mean function $m:𝒟\to {ℝ}^{d}$ is defined by :

 $\forall \underline{t}\in 𝒟,\phantom{\rule{1.em}{0ex}}m\left(\underline{t}\right)=𝔼\left[{X}_{\underline{t}}\right]$ (89)

First, OpenTURNS estimates the covariance function $C$ on the vertices of the mesh $ℳ$. At each vertex ${\underline{t}}_{i}\in ℳ$, we use the empirical mean estimator applied to the $K$ fields to estimate:

1. $m\left({\underline{t}}_{i}\right)$ at the vertex ${\underline{t}}_{i}$:

 $\forall {\underline{t}}_{i}\in ℳ,\phantom{\rule{1.em}{0ex}}m\left({\underline{t}}_{i}\right)\simeq \frac{1}{K}\sum _{k=1}^{K}{\underline{x}}_{i}^{k}$ (90)
2. $C\left({\underline{t}}_{i},{\underline{t}}_{j}\right)$ at the vertices $\left({\underline{t}}_{i},{\underline{t}}_{j}\right)$:

 $\forall \left({\underline{t}}_{i},{\underline{t}}_{j}\right)\in 𝒟×𝒟,\phantom{\rule{1.em}{0ex}}C\left({\underline{t}}_{i},{\underline{t}}_{j}\right)\simeq \frac{1}{K}\sum _{k=1}^{K}\left({\underline{x}}_{i}^{k}-m\left({\underline{t}}_{i}\right)\right){\left({\underline{x}}_{j}^{k}-m\left({\underline{t}}_{j}\right)\right)}^{t}$ (91)
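Estimators (90) and (91) amount to an empirical mean and covariance over the $K$ fields; a minimal numpy sketch for scalar fields ($d=1$; the sizes and the synthetic data are illustrative assumptions):

```python
import numpy as np

# fields[k, i] is the value of field k at vertex t_i (synthetic stand-in data)
rng = np.random.default_rng(1)
K, N = 500, 20
fields = rng.standard_normal((K, N))

m_hat = fields.mean(axis=0)            # equation (90): empirical mean per vertex
centered = fields - m_hat
C_hat = centered.T @ centered / K      # equation (91): N x N empirical covariance
```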

Then, OpenTURNS builds a piecewise constant covariance function on $𝒟×𝒟$ defined by:

 $\begin{array}{c}\hfill \forall \left(\underline{s},\underline{t}\right)\in 𝒟×𝒟,\phantom{\rule{0.166667em}{0ex}}C\left(\underline{s},\underline{t}\right)=C\left({\underline{t}}_{k},{\underline{t}}_{l}\right)\end{array}$

where $k$ is such that ${\underline{t}}_{k}$ is the vertex of $ℳ$ the nearest to $\underline{s}$ and ${\underline{t}}_{l}$ the nearest to $\underline{t}$.
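The nearest-vertex rule can be sketched as follows for a regular one-dimensional mesh (the stored values C_hat are a synthetic stand-in, not an estimation produced by OpenTURNS):

```python
import numpy as np

vertices = np.linspace(-4.0, 4.0, 64)          # regular 1-d mesh on [-4, 4]
# Stand-in values C(t_k, t_l) at the vertex pairs (hypothetical data)
C_hat = np.exp(-np.abs(vertices[:, None] - vertices[None, :]))

def covariance(s, t):
    """Piecewise constant extension: value at the nearest pair of vertices."""
    k = np.argmin(np.abs(vertices - s))        # vertex nearest to s
    l = np.argmin(np.abs(vertices - t))        # vertex nearest to t
    return C_hat[k, l]
```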

OpenTURNS uses the object NonStationaryCovarianceModelFactory which creates a UserDefinedCovarianceModel.

Requirements
• a set of fields myFieldSample, type: ProcessSample

Results
• a factory myFactory, type: NonStationaryCovarianceModelFactory
• a covariance model myEstimatedModel, type: UserDefinedCovarianceModel

Python script for this Use Case:

script_docUC_StocProc_NonStationaryCovarianceFunction_Estimation.py

# Build a covariance model factory
myFactory = NonStationaryCovarianceModelFactory()

# Estimation on the ProcessSample
myEstimatedModel = myFactory.build(myFieldSample)

In the following example, we illustrate the estimation of the non stationary covariance model $C:𝒟×𝒟\to ℝ$, with $𝒟=\left[-4,4\right]$, defined by:

 $\begin{array}{c}\hfill C\left(s,t\right)=exp\left(-\frac{4|s-t|}{1+{s}^{2}+{t}^{2}}\right)\end{array}$ (92)

The domain $𝒟$ is discretized on a mesh $ℳ$ which is a time grid with 64 points.

We build a normal process $X:\Omega ×\left[-4,4\right]\to ℝ$ with zero mean and $C$ as covariance function. OpenTURNS discretizes the covariance model $C$ using $C\left({t}_{k},{t}_{\ell }\right)$ for each $\left({t}_{k},{t}_{\ell }\right)\in ℳ×ℳ$.

We generate $N={10}^{3}$ fields of the process $X$, from which we estimate the covariance model $C$. Figure 67 draws the iso contours of the estimated model compared to the theoretical one.

 Estimation of a non stationary covariance model of a scalar process on [-4,4].
Figure 67

### 5.9.6 UC : Creation of a parametric spectral density function

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a multivariate stationary normal process of dimension $d$. We only treat here the case where the domain is of dimension 1: $𝒟\in ℝ$ ($n=1$).

If the process is continuous, then $𝒟=ℝ$. In the discrete case, $𝒟$ is a lattice.

$X$ is supposed to be a second order process with zero mean and we suppose that its spectral density function $S:ℝ\to {ℋ}^{+}\left(d\right)$ defined in (36) exists. ${ℋ}^{+}\left(d\right)\in {ℳ}_{d}\left(ℂ\right)$ is the set of $d$-dimensional positive definite hermitian matrices.

This use case illustrates how the User can create a spectral density function from parametric models. OpenTURNS implements the Cauchy spectral model as a parametric model for the spectral density function $S$.

The Cauchy spectral model: It is associated with the Exponential covariance model and is defined by:

 ${S}_{ij}\left(f\right)=\frac{4{R}_{ij}{a}_{i}{a}_{j}\left({\lambda }_{i}+{\lambda }_{j}\right)}{{\left({\lambda }_{i}+{\lambda }_{j}\right)}^{2}+{\left(4\pi f\right)}^{2}},\phantom{\rule{1.em}{0ex}}\forall \left(i,j\right)\le d$ (93)

where $\underline{\underline{R}}$, $\underline{a}$ and $\underline{\lambda }$ are the parameters of the Exponential covariance model defined in section 5.9.1. Relation (93) can be written explicitly in terms of the spatial covariance function ${\underline{\underline{C}}}^{spat}\left(\tau \right)$ defined in (82).
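Formula (93) translates directly into numpy; the parameter values below are illustrative assumptions for a bivariate case:

```python
import numpy as np

# Hypothetical Exponential-model parameters for d = 2
R = np.array([[1.0, 0.3], [0.3, 1.0]])   # spatial correlation matrix
a = np.array([1.0, 2.0])                 # amplitudes a_i
lam = np.array([1.0, 0.5])               # scales lambda_i

def cauchy_spectral(f):
    """Cauchy spectral model (93): S_ij(f) as a d x d matrix at frequency f."""
    lam_sum = lam[:, None] + lam[None, :]
    num = 4.0 * R * np.outer(a, a) * lam_sum
    return num / (lam_sum ** 2 + (4.0 * np.pi * f) ** 2)
```

At $f=0$ the diagonal terms reduce to $2{a}_{i}^{2}/{\lambda }_{i}$, which gives a quick consistency check of the formula.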

OpenTURNS defines this model thanks to the object CauchyModel.

Requirements
• $\underline{a}$, $\underline{\lambda }$ : amplitude, scale, type: NumericalPoint
• $\underline{\underline{R}}$ : spatialCorrelation, type: CorrelationMatrix
• ${\underline{\underline{C}}}^{s}$ : spatialCovariance, type: CovarianceMatrix
• a time grid myTimeGrid, type: RegularGrid

Results
• spectral models mySpectralModel_Corr, mySpectralModel_Cov, type: SpectralModel

Python script for this UseCase :

script_docUC_StocProc_DensitySpectralFunction_Param.py

# Create the spectral model
# for example : the Cauchy model

# from the amplitude, scale and spatialCorrelation
mySpectralModel_Corr = CauchyModel(amplitude, scale, spatialCorrelation)

# or from the scale and spatialCovariance
mySpectralModel_Cov = CauchyModel(scale, spatialCovariance)

### 5.9.7 UC : Creation of a User defined spectral density function

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a multivariate stationary normal process of dimension $d$. We only treat here the case where the domain is of dimension 1: $𝒟\in ℝ$ ($n=1$).

If the process is continuous, then $𝒟=ℝ$. In the discrete case, $𝒟$ is a lattice.

$X$ is supposed to be a second order process with zero mean and we suppose that its spectral density function $S:ℝ\to {ℋ}^{+}\left(d\right)$ defined in (36) exists. ${ℋ}^{+}\left(d\right)\in {ℳ}_{d}\left(ℂ\right)$ is the set of $d$-dimensional positive definite hermitian matrices.

This use case illustrates how the User can define his own spectral density function. OpenTURNS provides the object UserDefinedSpectralModel, defined from:

• a frequency grid $\left(-{f}_{c},\cdots ,{f}_{c}\right)$ with step $\delta f$, stored in the object RegularGrid,

• a collection of hermitian matrices $\in {𝕄}_{d}\left(ℂ\right)$ stored in the object HermitianMatrixCollection, which are the images of each point of the frequency grid through the density spectral function.

OpenTURNS builds a piecewise constant function on $\left[-{f}_{c},{f}_{c}\right]$, where the intervals on which the spectral density function is constant are centered on the points of the frequency grid and have length $\delta f$. Then, it is possible to evaluate the spectral density function at a given frequency thanks to the method computeSpectralDensity: if the frequency is not inside the interval $\left[-{f}_{c},{f}_{c}\right]$, OpenTURNS throws an exception. Otherwise, it returns the hermitian matrix of the subinterval of $\left[-{f}_{c},{f}_{c}\right]$ that contains the given frequency.

Requirements
• a frequency grid myFrequencyGrid, type: RegularGrid
• a collection of hermitian matrices myHermitianCollection, type: HermitianMatrixCollection

Results
• a spectral model mySpectralModel, type: SpectralModel

Python script for this UseCase :

script_docUC_StocProc_DensitySpectralFunction_UserDefined.py

# Create the spectral model
mySpectralModel = UserDefinedSpectralModel(myFrequencyGrid, myCollection)

# Get the spectral density function computed at the first frequency value
firstFrequency = myFrequencyGrid.getStart()
frequencyStep = myFrequencyGrid.getStep()
myFirstHermitian = mySpectralModel(firstFrequency)

# Get the spectral function at firstFrequency + 0.3 * frequencyStep
mySpectralModel(firstFrequency + 0.3 * frequencyStep)

In the following example, we illustrate how to create a modified low pass model of dimension $d=1$ with exponential decrease defined by : $S:ℝ\to ℝ$ where

• the frequency value $f$ must be positive,

• for $f<5$ Hz, the spectral density function is constant: $S\left(f\right)=1.0$,

• for $f>5$ Hz, the spectral density function is equal to $S\left(f\right)=exp\left[-2.0{\left(f-5.0\right)}^{2}\right]$.

The frequency grid is $\left]0,{f}_{c}\right]=\left]0,10\right]$ with $\delta f=0.2$ Hz. Figure 68 draws the spectral density.
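The model above can be evaluated on this frequency grid with a short numpy sketch (a plain Python stand-in, not the UserDefinedSpectralModel object):

```python
import numpy as np

df = 0.2
f = np.arange(df, 10.0 + df / 2.0, df)   # frequency grid ]0, 10], 50 points

def S(freq):
    """Modified low pass model with exponential decrease (d = 1)."""
    return 1.0 if freq < 5.0 else float(np.exp(-2.0 * (freq - 5.0) ** 2))

values = np.array([S(freq) for freq in f])
```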

 User defined spectral model
Figure 68

### 5.9.8 UC : Estimation of a spectral density function

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a multivariate stationary normal process of dimension $d$. We only treat here the case where the domain is of dimension 1: $𝒟\in ℝ$ ($n=1$).

If the process is continuous, then $𝒟=ℝ$. In the discrete case, $𝒟$ is a lattice.

$X$ is supposed to be a second order process with zero mean and we suppose that its spectral density function $S:ℝ\to {ℋ}^{+}\left(d\right)$ defined in (36) exists. ${ℋ}^{+}\left(d\right)\in {ℳ}_{d}\left(ℂ\right)$ is the set of $d$-dimensional positive definite hermitian matrices.

The objective of this use case is to estimate the spectral density function $S$ from data, which can be a sample of time series or one time series.

Depending on the available data, we proceed differently :

• if the data correspond to several independent realizations of the process (stored in a ProcessSample object), a statistical estimate is performed using statistical average of a realization-based estimator;

• if the data correspond to one realization of the process at different time stamps (stored in a TimeSeries object), the process being observed during a long period of time, an ergodic estimate is performed using a time average of an ergodic-based estimator.

The estimation of the spectral density function from data may use some parametric or non parametric methods.

The Welch method is a non parametric estimation technique, known to perform well. We detail it in the case where the available data is a time series whose values $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ are associated with the time grid $\left({t}_{0},\cdots ,{t}_{N-1}\right)$, a discretization of the domain $\left[0,T\right]$.

We assume that the process has a spectral density $S$ defined on $|f|\le \frac{T}{2}$.

The method is based on the segmentation of the time series into $K$ segments of length $L$, possibly overlapping (size of overlap $R$).

Let ${\underline{X}}_{1}\left(j\right),\phantom{\rule{4pt}{0ex}}j=0,1,...,L-1$ be the first such segment. Then :

 ${\underline{X}}_{1}\left(j\right)=\underline{X}\left(j\right),\phantom{\rule{4pt}{0ex}}j=0,1,...,L-1$

Applying the same decomposition,

 ${\underline{X}}_{2}\left(j\right)=\underline{X}\left(j+\left(L-R\right)\right),\phantom{\rule{4pt}{0ex}}j=0,1,...,L-1$

and finally :

 ${\underline{X}}_{K}\left(j\right)=\underline{X}\left(j+\left(K-1\right)\left(L-R\right)\right),\phantom{\rule{4pt}{0ex}}j=0,1,...,L-1$
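The segmentation above can be sketched in numpy (the series length, segment length $L$ and overlap $R$ below are illustrative assumptions):

```python
import numpy as np

x = np.arange(100.0)        # stand-in time series values, N = 100
L, R = 40, 20               # segment length and overlap size (assumed)
step = L - R
K = (len(x) - L) // step + 1            # number of segments that fit

# segments[k, j] = X_{k+1}(j) = X(j + k * (L - R))
segments = np.stack([x[k * step : k * step + L] for k in range(K)])
```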

The objective is to get a statistical estimator from these $K$ segments. We define the periodogram associated with the segment ${\underline{X}}_{k}$ by:

 $\begin{array}{ccc}\hfill {\underline{X}}_{k}\left({f}_{p},T\right)& =& \Delta t\sum _{n=0}^{L-1}\underline{x}\left(n\Delta t\right)exp\left[\frac{-j2\pi pn}{N}\right],\phantom{\rule{1.em}{0ex}}p=0,\cdots ,L-1\hfill \\ \hfill {\stackrel{^}{G}}_{\underline{x}}\left({f}_{p},T\right)& =& \frac{2}{T}{\underline{X}}_{k}\left({f}_{p},T\right)\phantom{\rule{0.166667em}{0ex}}{{\underline{X}}_{k}{\left({f}_{p},T\right)}^{*}}^{t},\phantom{\rule{1.em}{0ex}}p=0,\cdots ,L/2-1\hfill \end{array}$

with $\Delta t=\frac{T}{N}$ and ${f}_{p}=\frac{p}{T}=\frac{p}{N}\frac{1}{\Delta t}$.
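A scalar periodogram following these formulas can be sketched with numpy's FFT (the segment values and sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
L = 256                                     # segment length (assumed)
dt = 0.1                                    # time step (assumed)
T = L * dt
x = rng.standard_normal(L)                  # stand-in segment values

# X(f_p) = dt * sum_n x(n dt) exp(-2 i pi p n / L): a DFT scaled by dt
X = dt * np.fft.fft(x)
# Periodogram at the positive frequencies f_p = p / T, p = 0 .. L/2 - 1
G = (2.0 / T) * np.abs(X[: L // 2]) ** 2
```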

It has been proven that the periodogram has bad statistical properties. Indeed, two quantities summarize the properties of an estimator: its bias and its variance. The bias is the expected error one makes on average using only a finite number of time series of finite length, whereas the variance measures the expected fluctuations of the estimator around its mean value. For the periodogram, we have:

• Bias$=𝔼\left[{\stackrel{^}{G}}_{\underline{x}}\left({f}_{p},T\right)-{G}_{\underline{X}}\left({f}_{p}\right)\right]=\left(\frac{1}{T}{W}_{B}\left({f}_{p},T\right)-{\delta }_{0}\right)*{G}_{\underline{X}}\left({f}_{p}\right)$ where ${W}_{B}\left({f}_{p},T\right)={\left(\frac{sin\pi fT}{\pi fT}\right)}^{2}$ is the squared modulus of the Fourier transform of the function ${w}_{B}\left(t,T\right)$ (Bartlett window) defined by:

 ${w}_{B}\left(t,T\right)={\mathbf{1}}_{\left[0,T\right]}\left(t\right)$ (94)

This estimator is biased but this bias vanishes when $T\to \infty$ as ${lim}_{T\to \infty }\frac{1}{T}{W}_{B}\left({f}_{p},T\right)={\delta }_{0}$.

• Variance$=\frac{1}{T}{W}_{B}\left({f}_{p},T\right)*{G}_{\underline{X}}\left({f}_{p}\right)\to {G}_{\underline{X}}\left({f}_{p}\right)$ as $T\to \infty$, which means that the fluctuations of the periodogram are of the same order of magnitude as the quantity to be estimated; the estimator is thus not convergent.

The periodogram's lack of convergence may be easily fixed if we consider the averaged periodogram over $K$ independent time series or segments:

 ${\stackrel{^}{G}}_{\underline{x}}\left({f}_{p},T\right)=\frac{2}{KT}\sum _{k=0}^{K-1}{\underline{X}}^{\left(k\right)}\left({f}_{p},T\right){\underline{X}}^{\left(k\right)}{\left({f}_{p},T\right)}^{t}$ (95)

The averaging process has no effect on the significant bias of the periodogram.

The use of a tapering window $w\left(t,T\right)$ may significantly reduce it. The time series $\underline{x}\left(t\right)$ is replaced by a tapered time series $w\left(t,T\right)\underline{x}\left(t\right)$ in the computation of $\underline{X}\left({f}_{p},T\right)$. One gets :

 $𝔼\left[{\stackrel{^}{G}}_{\underline{x}}\left({f}_{p},T\right)-{G}_{\underline{X}}\left({f}_{p}\right)\right]=\left(\frac{1}{T}W\left({f}_{p},T\right)-{\delta }_{0}\right)*{G}_{\underline{X}}\left({f}_{p}\right)$ (96)

where $W\left({f}_{p},T\right)$ is the squared modulus of the Fourier transform of $w\left(t,T\right)$ at the frequency ${f}_{p}$. A judicious choice of tapering function such as the Hanning window ${w}_{H}$ can dramatically reduce the bias of the estimate:

 ${w}_{H}\left(t,T\right)=\sqrt{\frac{8}{3}}\left(1-{cos}^{2}\left(\frac{\pi t}{T}\right)\right){\mathbf{1}}_{\left[0,T\right]}\left(t\right)$ (97)
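Equation (97) transcribes directly into numpy, and the mean-square normalization of the window (which motivates the $\sqrt{8/3}$ factor) can be checked numerically:

```python
import numpy as np

def hanning_taper(t, T):
    """Hanning window (97): sqrt(8/3) * sin^2(pi t / T) on [0, T], 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    inside = (t >= 0.0) & (t <= T)
    # 1 - cos^2(pi t / T) = sin^2(pi t / T)
    return np.sqrt(8.0 / 3.0) * (1.0 - np.cos(np.pi * t / T) ** 2) * inside
```

The factor $\sqrt{8/3}$ makes the mean of ${w}_{H}^{2}$ over $\left[0,T\right]$ equal to 1, since the mean of ${sin}^{4}$ over a period is $3/8$.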

OpenTURNS builds an estimation of the spectrum on a TimeSeries by fixing the number of segments, the overlap size parameter and a FilteringWindows object. The available windows are:

• The Hamming window

 $w\left(t,T\right)=\sqrt{\frac{1}{K}}\left(0.54-0.46cos\left(\frac{2\pi t}{T}\right)\right){\mathbf{1}}_{\left[0,T\right]}\left(t\right)$ (98)

with $K=\sqrt{{0.54}^{2}+\frac{1}{2}{0.46}^{2}}$

• The Hanning window described in (97), which is generally the most useful.

The result consists of a UserDefinedSpectralModel which is simple to manipulate.

Furthermore, OpenTURNS builds an estimation of the spectral function on a process sample by considering that the $k$-th segment is the $k$-th time series of the process sample. Note that the data must be centered; otherwise, the User should center them beforehand.

Requirements
• number of segments mySegmentNumber, type: integer
• size of overlap myOverlapSize, type: integer
• a time series myTimeSeries, type: TimeSeries
• a set of time series mySample, type: ProcessSample

Results
• spectral models myEstimatedModel_TS, myEstimatedModel_PS, type: UserDefinedSpectralModel

Python script for this Use Case :

script_docUC_StocProc_DensitySpectralFunction_Estimation.py

# Build a spectral model factory
myFactory = WelchFactory(Hanning(), mySegmentNumber, myOverlapSize)

# Estimation on a TimeSeries or on a ProcessSample
myEstimatedModel_TS = myFactory.build(myTimeSeries)
myEstimatedModel_PS = myFactory.build(mySample)

# Change the filtering window
myFactory.setFilteringWindows(Hamming())

# Get the FFT algorithm
myFFT = myFactory.getFFTAlgorithm()

# Get the frequency grid
frequencyGrid = myEstimatedModel_PS.getFrequencyGrid()

The following example illustrates the case where the available data is a sample of ${10}^{3}$ realizations of the process, defined on the time grid $\left[0,102.3\right]$, discretized every $\Delta t=0.1$. The spectral model of the process is the Cauchy model parameterized by $\underline{\lambda }=\left(5\right)$ and $\underline{a}=\left(3\right)$.

Figure 69 draws the graph of the exact spectral model and its estimation from the sample of time series.

 Comparison of spectral density : estimation with the Welch method vs Cauchy Model
Figure 69

### 5.9.9 UC : Creation of stationary parametric second order model

This use case details how to create a stationary second order model that ensures the coherence between the covariance function ${C}^{stat}:𝒟×𝒟\to {ℳ}_{d×d}\left(ℝ\right)$ and the spectral density function $S:ℝ\to {ℋ}^{+}\left(d\right)$. We only treat here the case where the domain is of dimension 1: $𝒟\in ℝ$ ($n=1$).

If the process is continuous, then $𝒟=ℝ$. In the discrete case, $𝒟$ is a lattice.

Coherence is ensured through relation (36); in some cases this is not possible: for example, the spectral model may be defined while the associated covariance model has no analytical expression.

OpenTURNS saves the complete information of a second order model in the object SecondOrderModel.

A second order model can be used to create zero-mean stationary normal processes, stored either in a TemporalNormalProcess object or in a SpectralNormalProcess one (see the use case of section 5.9.10).

OpenTURNS implements the parametric second order model ExponentialCauchy where the covariance function is the Exponential model (see the use case 5.9.1) and the associated spectral density function is the Cauchy model (see the use case 5.9.6) .

Requirements
• $\underline{a}$, $\underline{\lambda }$ : amplitude, scale, type: NumericalPoint
• $\underline{\underline{R}}$ : spatialCorrelation, type: CorrelationMatrix
• $\underline{\underline{C}}$ : spatialCovariance, type: CovarianceMatrix

Results
• a second order model mySecondOrderModel, type: SecondOrderModel

Python script for this UseCase :

# Create the second order model
# for example : the Exponential Cauchy

# from the amplitude, scale and spatialCorrelation
mySecondOrderModel = ExponentialCauchy(amplitude, scale, spatialCorrelation)

# or from the scale and spatialCovariance
mySecondOrderModel = ExponentialCauchy(scale, spatialCovariance)

### 5.9.10 UC : Creation of a normal process

This Use Case details how to create a normal process $X:\Omega ×𝒟\to {ℝ}^{d}$ from its temporal covariance function, its spectral density function $S$ (when it exists), or both.

A normal process defined by its temporal covariance function may have a trend: in that case, the normal process is the sum of the trend function and a zero-mean normal process.

A zero-mean normal process is completely defined either by :

• its (temporal) covariance function $C:𝒟×𝒟\to {ℳ}_{d×d}\left(ℝ\right)$, which writes, in the stationary case: ${C}^{stat}:𝒟\to {ℳ}_{d×d}\left(ℝ\right)$. In that case, the normal process is used through its temporal view only, thanks to the object TemporalNormalProcess,

• or its bilateral spectral density function (when it exists), in the stationary case, $S:ℝ\to {ℋ}^{+}\left(d\right)$. In that case, the normal process is used through its spectral view only, thanks to the object SpectralNormalProcess. OpenTURNS restricts that possibility to processes for which the domain is of dimension 1: $𝒟\in ℝ$ ($n=1$).

When the zero-mean process is stationary, in order to manipulate the same normal process through both the temporal and spectral views, it is necessary to create a second order model that ensures the coherence between the covariance function ${C}^{stat}$ and the spectral density function $S$ through the relation (36).

For that purpose, the object SecondOrderModel is built (see the use case 5.9.9) and then used to create a TemporalNormalProcess and the associated SpectralNormalProcess.

The choice between a TemporalNormalProcess and a SpectralNormalProcess based on the same second order model is motivated by the performance of the specific algorithms, in terms of both memory and CPU requirements. For example, the performance of the algorithms related to the spectral density function (FFT) varies greatly depending on whether the number of frequencies is a power of 2 or not.

In the case of a TemporalNormalProcess, we can ask for the trend function of the process thanks to the method getTrend, or for the covariance model; we can evaluate the covariance function thanks to the method computeCovariance, or discretize it on a specific mesh thanks to the method discretizeCovariance, which creates the matrix defined in (33).

In the case of a SpectralNormalProcess, we can ask for the spectral model and evaluate the bilateral spectral density function $S$ defined in (36) thanks to the method computeSpectralDensity.

We note $N$ the number of vertices of the mesh $ℳ$ on which $𝒟$ is discretized.

The first call to the method getRealization implies different actions according to the type of the normal process :

• in case of a TemporalNormalProcess, OpenTURNS builds the covariance matrix ${\underline{\underline{C}}}_{1,\cdots ,N}\in {ℳ}_{Nd,Nd}\left(ℝ\right)$ defined in (33), using the method discretizeCovariance. Then ${\underline{\underline{C}}}_{1,\cdots ,N}$ is factorized using the Cholesky algorithm: ${\underline{\underline{C}}}_{1,\cdots ,N}=\underline{\underline{G}}\phantom{\rule{0.166667em}{0ex}}{\underline{\underline{G}}}^{t}$.

• in case of a SpectralNormalProcess, OpenTURNS builds the $N$ bilateral spectral density matrices ${\left(\underline{\underline{S}}\left({f}_{i}\right)\right)}_{1\le i\le N}$ defined in (36), where ${\left({f}_{i}\right)}_{1\le i\le N}$ are the frequencies associated to the time grid. Then each matrix $\underline{\underline{S}}\left({f}_{i}\right)$ is factorized using the Cholesky algorithm: $\underline{\underline{S}}\left({f}_{i}\right)={\underline{\underline{H}}}_{i}\phantom{\rule{0.166667em}{0ex}}{\overline{\underline{\underline{H}}}}_{i}^{t}$.

These matrices $\underline{\underline{G}}$ and ${\underline{\underline{H}}}_{i}$ are used to get some realizations of the process from realizations of a standard normal process (with zero mean and unit covariance matrix).

In order to get the Cholesky factor of ${\underline{\underline{C}}}_{1,\cdots ,N}$ or ${\left(\underline{\underline{S}}\left({f}_{i}\right)\right)}_{1\le i\le N}$, we might need to scale the matrices because of numerical precision issues. This scaling consists of replacing the matrix ${\underline{\underline{C}}}_{1,\cdots ,N}$ (resp. $\underline{\underline{S}}\left({f}_{i}\right)$) by ${\underline{\underline{C}}}_{1,\cdots ,N}+ϵ\phantom{\rule{0.166667em}{0ex}}\underline{\underline{I}}$ (resp. $\underline{\underline{S}}\left({f}_{i}\right)+ϵ\phantom{\rule{0.166667em}{0ex}}\underline{\underline{I}}$) with $\underline{\underline{I}}$ the identity matrix and $ϵ$ a negligible positive scalar.
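The regularization described above can be sketched as a retry loop around numpy's Cholesky factorization (the starting jitter ${10}^{-12}$ and the growth factor are illustrative assumptions, not the values used by OpenTURNS):

```python
import numpy as np

def regularized_cholesky(C, max_tries=10):
    """Cholesky factor of C, adding eps * I only if the plain factorization fails."""
    eps = 0.0
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(C + eps * np.eye(len(C)))
        except np.linalg.LinAlgError:
            # grow the jitter and retry (assumed schedule, for illustration)
            eps = 1e-12 if eps == 0.0 else eps * 10.0
    raise np.linalg.LinAlgError("matrix could not be regularized")

# A singular (positive semi-definite) covariance matrix needs the jitter
G = regularized_cholesky(np.ones((3, 3)))
```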

In this case, the User gets a warning message informing him of the $ϵ$ value used to obtain the Cholesky factor.

Requirements
• a mesh myMesh, type: Mesh
• a time grid myTimeGrid, type: RegularGrid
• a trend function myTrend, type: TrendTransform
• covariance models myCovarianceModel, type: StationaryCovarianceModel
• a spectral model mySpectralModel, type: SpectralModel
• a second order model mySecondOrderModel, type: SecondOrderModel

Results
• normal processes myTempNormProc1, myTempNormProc2, type: TemporalNormalProcess
• normal processes mySpectNormProc1, mySpectNormProc2, type: SpectralNormalProcess

Python script for this UseCase :

script_docUC_StocProc_NormalProcess_Creation.py

####################################
# CASE 1 : the normal process is defined by its temporal covariance
# function ONLY

# Create a normal process with the temporal view ONLY
myTempNormProc1 = TemporalNormalProcess(myTrend, myCovarianceModel, myMesh)

####################################
# CASE 2 : the normal process is defined by its spectral density function ONLY
# Stationary process ONLY
# Care! The mesh must be a time grid (n=1)

# Create a normal process with the spectral view ONLY
mySpectNormProc1 = SpectralNormalProcess(mySpectralModel, myTimeGrid)

##########################################################################
# CASE 3 : the normal process is defined by a second order model that
# contains both the temporal covariance function and the associated
# spectral density function
# Stationary process ONLY
# Care! The mesh must be a time grid (n=1)

# Create the normal process to use its temporal properties
myTempNormProc2 = TemporalNormalProcess(mySecondOrderModel, myTimeGrid)

# Create the normal process to use its spectral properties
mySpectNormProc2 = SpectralNormalProcess(mySecondOrderModel, myTimeGrid)

The example illustrated below is a bivariate normal process with the Exponential covariance model, parameterized by $\underline{\lambda }=\left(1,1\right)$, $\underline{a}=\left(1,1\right)$ and $\underline{\underline{R}}$ the identity matrix. We build a TemporalNormalProcess and a SpectralNormalProcess using the same SecondOrderModel and the same RegularGrid. Figures 70 and 71 respectively draw the graphs of:

• one realization of the temporal process (both marginals are illustrated),

• a sample of 5 realizations of the process (the first marginal is presented here) based on the covariance of the second order model,

• one realization of the spectral process (both marginals are illustrated),

• a sample of 5 realizations of the process (the first marginal is presented here) based on the spectral density of the second order model.

 Realization of TemporalNormalProcess 5 realizations of the TemporalNormalProcess
Figure 70
 Realization of SpectralNormalProcess 5 realizations of the SpectralNormalProcess
Figure 71

## 5.10 Other processes

### 5.10.1 UC : Creation of a White Noise

This section details how to create and manipulate a white noise.

A second order white noise $\epsilon :\Omega ×𝒟\to {ℝ}^{d}$ is a stochastic process of dimension $d$ such that its covariance function writes $C\left(\underline{s},\underline{t}\right)=\delta \left(\underline{t}-\underline{s}\right)C\left(\underline{s},\underline{s}\right)$, where $C\left(\underline{s},\underline{s}\right)$ is the covariance matrix of the process at vertex $\underline{s}$ and $\delta$ the Kronecker delta.

A process $\epsilon$ is a white noise if, for any finite family of locations ${\left({\underline{t}}_{i}\right)}_{i=1,\cdots ,n}\in 𝒟$, the random variables ${\left({\epsilon }_{{\underline{t}}_{i}}\right)}_{i=1,\cdots ,n}$ are independent and identically distributed.

OpenTURNS proposes to model it through the object WhiteNoise defined on a mesh and a distribution with zero mean and finite standard deviation.

If the distribution has a mean different from zero, OpenTURNS issues a message to warn the User and does not allow the creation of such a white noise.

Requirements
• a distribution myDist, type: Distribution
• a mesh myMesh, type: Mesh

Results
• a white noise myWN, type: WhiteNoise

Python script for this UseCase :

script_docUC_StocProc_WhiteNoise.py

# Create a white noise
myWN = WhiteNoise(myDist, myMesh)

The first example, illustrated in Figure 72, is a univariate white noise distributed according to the standard normal distribution. We draw some realizations on the time grid $\left[0,99\right]$ with $\Delta t=1.0$.

The second example, illustrated in Figure 73, is a univariate white noise distributed according to the standard normal distribution. We draw some realizations on a mesh of dimension 2.

Figure 73 draws one realization of the process when the values are interpolated and when they are not.

 Realization of a white noise with distribution $𝒩\left(0,1\right)$ 5 realizations of a white noise with distribution $𝒩\left(0,1\right)$.
Figure 72
 One realization of the white noise defined on the mesh with distribution $𝒩\left(0,1\right)$ when the values are interpolated. Previous realization of the white noise defined on the mesh with distribution $𝒩\left(0,1\right)$ when the values are not interpolated.
Figure 73

### 5.10.2 UC : Creation of a Random Walk

This section details first how to create and manipulate a random walk.

A random walk $X:\Omega ×𝒟\to {ℝ}^{d}$ is a process where $𝒟=ℝ$, discretized on the time grid ${\left({t}_{i}\right)}_{i\ge 0}$, such that:

 $\begin{array}{ccc}\hfill {X}_{{t}_{0}}& =& {\underline{x}}_{{t}_{0}}\hfill \\ \hfill \forall n>0,\phantom{\rule{0.222222em}{0ex}}{X}_{{t}_{n}}& =& {X}_{{t}_{n-1}}+{\epsilon }_{{t}_{n}}\hfill \end{array}$ (5.10)

where ${\underline{x}}_{{t}_{0}}\in {ℝ}^{d}$ and $\epsilon$ is a white noise of dimension $d$.
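The recursion (5.10) can be mimicked with a short pure-Python sketch (not the OpenTURNS RandomWalk object); the Gaussian white noise and the function name are assumptions of the sketch:

```python
import random

def random_walk(origin, n_steps, sigma=1.0, seed=42):
    """Discretized random walk: X_{t_0} = origin and
    X_{t_n} = X_{t_{n-1}} + eps_{t_n}, with Gaussian white noise eps."""
    rng = random.Random(seed)
    path = [origin]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path

path = random_walk(0.0, 99)  # 100 values, as on a grid of 100 time stamps
```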

OpenTURNS proposes to model it through the object RandomWalk defined thanks to the origin, the distribution of the white noise and the time grid.

 Requirements $•$ a numerical point : myOrigin type: NumericalPoint $•$ a distribution : myDist type: Distribution $•$ a time grid : myTimeGrid type: RegularGrid Results $•$ a random walk : myRandomWalk type: RandomWalk

Python script for this UseCase :

script_docUC_StocProc_RandomWalk.py

# Creation of a random walk
myRandomWalk = RandomWalk(myOrigin, myDist, myTimeGrid)

Figure 74 illustrates realizations of a 1D random walk where the distribution of the white noise is respectively:

• discrete : the support is the two points $\left\{-1,10\right\}$ with respective weights $0.9$ and $0.1$,

• continuous : the Normal distribution $𝒩\left(0,1\right)$.

 Realizations of a random walk with the discrete distribution : $P\left(-1\right)=0.9,P\left(10\right)=0.1$. Realizations of a random walk with distribution $𝒩\left(0,1\right)$.
Figure 74

Figure 75 illustrates realizations of a 2D random walk where the distribution of the white noise is respectively:

• discrete : the uniform distribution over the two points $\left(-1,-2\right)$ and $\left(1,3\right)$,

• continuous : the 2D Normal distribution $𝒩\left(\underline{0},\underline{\underline{1}}\right)$.

The realizations are presented in the phase space ${X}_{1}$ versus ${X}_{2}$.

 Realizations of a random walk with the uniform discrete distribution over the points $\left(-1,-2\right)$ and $\left(1,3\right)$. 5 realizations of a random walk with distribution $𝒩\left(\underline{0},\underline{\underline{I}}\right)$
Figure 75

### 5.10.3 UC : Creation of a Functional Basis Process

The objective of this Use Case is to define $X:\Omega ×𝒟\to {ℝ}^{d}$ a multivariate stochastic process of dimension $d$ where $𝒟\in {ℝ}^{n}$, as a linear combination of $K$ deterministic functions ${\left({\phi }_{i}\right)}_{i=1,\cdots ,K}:{ℝ}^{n}\to {ℝ}^{d}$:

 $\begin{array}{c}\hfill X\left(\omega ,\underline{t}\right)=\sum _{i=1}^{K}{A}_{i}\left(\omega \right){\phi }_{i}\left(\underline{t}\right)\end{array}$

where $\underline{A}=\left({A}_{1},\cdots ,{A}_{K}\right)$ is a random vector of dimension $K$.

We suppose that $𝒟$ is discretized on the mesh $ℳ$ which has $N$ vertices.

A realization of $X$ on $ℳ$ consists in generating a realization $\underline{\alpha }$ of the random vector $\underline{A}$ and in evaluating the functions ${\left({\phi }_{i}\right)}_{i=1,\cdots ,K}$ on the mesh $ℳ$. If we note $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ the realization of $X$, where $X\left(\omega ,{\underline{t}}_{k}\right)={\underline{x}}_{k}$, we have:

 $\begin{array}{c}\hfill \forall k\in \left[0,N-1\right],\phantom{\rule{1.em}{0ex}}{\underline{x}}_{k}=\sum _{i=1}^{K}{\alpha }_{i}{\phi }_{i}\left({\underline{t}}_{k}\right)\end{array}$
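The linear combination above is easy to sketch in plain Python (this is an illustration of the formula, not the FunctionalBasisProcess object); the basis of $K=3$ functions and the standard normal coefficients are hypothetical:

```python
import math
import random

def functional_basis_realization(basis, alpha, grid):
    """x_k = sum_i alpha_i * phi_i(t_k): one realization of the process,
    obtained from one realization alpha of the coefficient vector A."""
    return [sum(a * phi(t) for a, phi in zip(alpha, basis)) for t in grid]

# Hypothetical basis of K = 3 scalar functions
basis = [lambda t: 1.0, lambda t: t, lambda t: math.sin(t)]
rng = random.Random(1)
alpha = [rng.gauss(0.0, 1.0) for _ in basis]   # one realization of A
grid = [0.1 * k for k in range(10)]            # 10 mesh vertices
x = functional_basis_realization(basis, alpha, grid)
```

At $t=0$ the second and third basis functions vanish, so the realization starts at the value of the first coefficient.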
 Requirements $•$ the distribution of $\underline{A}$: coefDist type: Distribution $•$ the mesh $ℳ$ : myMesh type: Mesh $•$ the functional basis ${\left({\phi }_{i}\right)}_{i=1,\cdots ,K}$: myBasis type: Basis Results $•$ the functional process myOutputProcess type: FunctionalBasisProcess

Python script for this UseCase :

script_docUC_StocProc_FunctionalBasisProcess.py

# Create the output process of interest
myOutputProcess = FunctionalBasisProcess(coefDist, Basis(myBasis), myMesh)

## 5.11 Process transformation

### 5.11.1 UC : Creation of a Dynamical Function

The objective here is to create dynamical functions, that can act on fields.

OpenTURNS defines a new function called dynamical function and two particular dynamical functions: the spatial function and the temporal function.

Dynamical function:

A dynamical function ${f}_{dyn}:𝒟×{ℝ}^{d}\to {𝒟}^{\text{'}}×{ℝ}^{q}$ where $𝒟\in {ℝ}^{n}$ and ${𝒟}^{\text{'}}\in {ℝ}^{p}$ is defined by:

 $\begin{array}{c}\hfill {f}_{dyn}\left(\underline{t},\underline{x}\right)=\left({t}^{\text{'}}\left(\underline{t}\right),{v}^{\text{'}}\left(\underline{t},\underline{x}\right)\right)\end{array}$ (100)

with ${t}^{\text{'}}:𝒟\to {𝒟}^{\text{'}}$ and ${v}^{\text{'}}:𝒟×{ℝ}^{d}\to {ℝ}^{q}$.

A dynamical function ${f}_{dyn}$ transforms a multivariate stochastic process:

 $\begin{array}{c}\hfill X:\Omega ×𝒟\to {ℝ}^{d}\end{array}$ (101)

where $𝒟\in {ℝ}^{n}$ is discretized according to the mesh $ℳ$ into the multivariate stochastic process:

 $\begin{array}{c}\hfill Y={f}_{dyn}\left(X\right)\end{array}$ (102)

such that:

 $\begin{array}{c}\hfill Y:\Omega ×{𝒟}^{\text{'}}\to {ℝ}^{q}\end{array}$

where the domain ${𝒟}^{\text{'}}\in {ℝ}^{p}$ is discretized according to the mesh ${ℳ}^{\text{'}}$.

A dynamical function ${f}_{dyn}$ also acts on fields and produces fields of possibly different dimension ($q\ne d$) and mesh ($𝒟\ne {𝒟}^{\text{'}}$ or $ℳ\ne {ℳ}^{\text{'}}$).

OpenTURNS v1.3 only proposes dynamical functions where ${𝒟}^{\text{'}}=𝒟$ and ${ℳ}^{\text{'}}=ℳ$, which means that ${t}^{\text{'}}=Id$: the spatial function and the temporal function. It follows that the process $Y$ shares the same mesh with $X$; only its values change.

Spatial function:

A spatial function ${f}_{spat}:𝒟×{ℝ}^{d}\to 𝒟×{ℝ}^{q}$ is a particular dynamical function that leaves the mesh of a field invariant and is defined by a function $g:{ℝ}^{d}\to {ℝ}^{q}$ such that:

 $\begin{array}{c}\hfill {f}_{spat}\left(\underline{t},\underline{x}\right)=\left(\underline{t},g\left(\underline{x}\right)\right)\end{array}$ (103)

Let's note that the input dimension of ${f}_{spat}$ is still the dimension of $\underline{x}$: $d$. Its output dimension is equal to $q$.

The creation of the SpatialFunction object of OpenTURNS requires the numerical math function $g$ and the integer $n$: the dimension of the vertices of the mesh $ℳ$. This data is required for dimension compatibility tests when a composite process is created using the spatial function.
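The action of a spatial function on a field can be sketched in a few lines of plain Python (an illustration of equation (103), not the SpatialFunction object); storing a field as a list of (vertex, value) pairs is an assumption of the sketch:

```python
def spatial_apply(g, field):
    """Apply f_spat(t, x) = (t, g(x)) to a field stored as a list of
    (vertex, value) pairs: the mesh is left unchanged, only the values
    are transformed."""
    return [(t, g(x)) for t, x in field]

# The function g of equation (105): g(x1, x2) = (x1^2, x1 + x2)
g = lambda x: (x[0] ** 2, x[0] + x[1])
field = [((0.0,), (1.0, 2.0)), ((1.0,), (3.0, 4.0))]
out = spatial_apply(g, field)
```

The vertices of the output field are exactly those of the input field, as required by the definition.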

Temporal function:

A temporal function ${f}_{temp}:𝒟×{ℝ}^{d}\to 𝒟×{ℝ}^{q}$ is a particular dynamical function that leaves the mesh of a field invariant and is defined by a function $h:{ℝ}^{n}×{ℝ}^{d}\to {ℝ}^{q}$ such that:

 $\begin{array}{c}\hfill {f}_{temp}\left(\underline{t},\underline{x}\right)=\left(\underline{t},h\left(\underline{t},\underline{x}\right)\right)\end{array}$ (104)

Let's note that the input dimension of ${f}_{temp}$ is still the dimension of $\underline{x}$: $d$. Its output dimension is equal to $q$.

The creation of the TemporalFunction object of OpenTURNS requires the numerical math function $h$ and the integer $n$: the dimension of the vertices of the mesh $ℳ$.

The use case illustrates the creation of a spatial (dynamical) function from the function $g:{ℝ}^{2}\to {ℝ}^{2}$ defined by:

 $\begin{array}{c}\hfill g\left(\underline{x}\right)=\left({x}_{1}^{2},{x}_{1}+{x}_{2}\right)\end{array}$ (105)

and the creation of a temporal (dynamical) function from the function $h:{ℝ}^{2}×{ℝ}^{2}\to ℝ$ defined by:

 $\begin{array}{c}\hfill h\left(\underline{t},\underline{x}\right)=\left({t}_{1}+{t}_{2}+{x}_{1}^{2}+{x}_{2}^{2}\right)\end{array}$ (106)
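The temporal case differs from the spatial one only in that the transformed value may depend on the vertex location; a plain-Python sketch of equation (104) on the function $h$ of equation (106) (again an illustration, not the TemporalFunction object):

```python
def temporal_apply(h, field):
    """Apply f_temp(t, x) = (t, h(t, x)): the transformed value at a
    vertex may depend on the vertex location t as well as on the value x."""
    return [(t, h(t, x)) for t, x in field]

# The function h of equation (106): h(t1, t2, x1, x2) = t1 + t2 + x1^2 + x2^2
h = lambda t, x: (t[0] + t[1] + x[0] ** 2 + x[1] ** 2,)
field = [((1.0, 2.0), (1.0, 1.0)), ((0.0, 0.0), (2.0, 3.0))]
out = temporal_apply(h, field)
```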
 Requirements $•$ some functions g,h type: NumericalMathFunction Results $•$ a spatial function mySpatialFunction type: SpatialFunction $•$ a temporal function myTemporalFunction type: TemporalFunction $•$ two dynamical functions myDynamicalFunctionFromSpatial and myDynamicalFunctionFromTemporal type: DynamicalFunction

Python script for this use case :

script_docUC_LSF_DynamicalFunction.py

# Create the function g : R^d --> R^q
# for example : R^2 --> R^2
#               (x1,x2) --> (x1^2, x1+x2)
g = NumericalMathFunction(['x1', 'x2'], ['x1^2', 'x1+x2'])

# Create the function h : R^n*R^d --> R^q
# for example : R^2*R^2 --> R
#               (t1,t2,x1,x2) --> (t1+t2+x1^2+x2^2)
h = NumericalMathFunction(['t1', 't2', 'x1', 'x2'], ['t1+t2+x1^2+x2^2'])

###########################################
# Creation of a spatial dynamical function
###########################################

# Convert g : R^d --> R^q into a spatial function
# n is the dimension of the mesh
# of the field on which g will be applied
n = 2
mySpatialFunction = SpatialFunction(g, n)
print("spatial function=", mySpatialFunction)

###########################################
# Creation of a temporal dynamical function
###########################################

# Convert h : R^n*R^d --> R^q into a temporal function
# n is the dimension of the mesh
# here n = dimension of (t1,t2)
n = 2
myTemporalFunction = TemporalFunction(h, n)
print("temporal function=", myTemporalFunction)

### 5.11.2 UC : Trend addition, Box Cox transformation, Composite process

The objective here is to create a process $Y$ as the image through a dynamical function ${f}_{dyn}$ of another process $X$:

 $\begin{array}{c}\hfill Y={f}_{dyn}\left(X\right)\end{array}$

General case:

In the general case, $X:\Omega ×𝒟\to {ℝ}^{d}$ is a multivariate stochastic process of dimension $d$ where $𝒟\in {ℝ}^{n}$, $Y:\Omega ×{𝒟}^{\text{'}}\to {ℝ}^{q}$ is a multivariate stochastic process of dimension $q$ where ${𝒟}^{\text{'}}\in {ℝ}^{p}$, and ${f}_{dyn}:𝒟×{ℝ}^{d}\to {𝒟}^{\text{'}}×{ℝ}^{q}$ is defined in (100).

OpenTURNS builds the transformed process $Y$ thanks to the object CompositeProcess from the data: ${f}_{dyn}$ and the process $X$.

OpenTURNS proposes two kinds of dynamical function: the spatial functions defined in (103) and the temporal functions defined in (104).

Trend modifications:

Very often, we have to remove a trend from a process or to add one. If we note ${f}_{trend}:{ℝ}^{n}\to {ℝ}^{d}$ the function modelling the trend, then the dynamical function which consists in adding the trend to a process is the temporal function ${f}_{temp}:𝒟×{ℝ}^{d}\to 𝒟×{ℝ}^{d}$ defined by:

 $\begin{array}{c}\hfill {f}_{temp}\left(\underline{t},\underline{x}\right)=\left(\underline{t},\underline{x}+{f}_{trend}\left(\underline{t}\right)\right)\end{array}$ (107)

OpenTURNS allows one to convert the numerical math function ${f}_{trend}$ directly into the temporal function ${f}_{temp}$ thanks to the TrendTransform object.

Then, the process $Y$ is built with the object CompositeProcess from the data: ${f}_{temp}$ and the process $X$ such that:

 $\begin{array}{c}\hfill \forall \omega \in \Omega ,\forall \underline{t}\in 𝒟,\phantom{\rule{1.em}{0ex}}Y\left(\omega ,\underline{t}\right)=X\left(\omega ,\underline{t}\right)+{f}_{trend}\left(\underline{t}\right)\end{array}$
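The trend addition above can be sketched in plain Python (an illustration of equation (107), not the TrendTransform object); the linear trend and the (vertex, value) field layout are hypothetical:

```python
def add_trend(f_trend, field):
    """Trend addition on a field: Y(t) = X(t) + f_trend(t),
    the mesh being left unchanged."""
    return [(t, x + f_trend(t)) for t, x in field]

f_trend = lambda t: 0.5 * t          # hypothetical linear trend
field = [(0.0, 1.0), (1.0, -1.0), (2.0, 0.0)]
out = add_trend(f_trend, field)
```

Removing the trend is the same operation with $-{f}_{trend}$.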

Box Cox transformation:

If the transformation of the process $X$ into $Y$ corresponds to the Box Cox transformation ${f}_{BoxCox}:{ℝ}^{d}\to {ℝ}^{d}$ which transforms $X$ into a process $Y$ with stabilized variance, then the corresponding dynamical function is the spatial function ${f}_{spat}:𝒟×{ℝ}^{d}\to 𝒟×{ℝ}^{d}$ defined by:

 $\begin{array}{c}\hfill {f}_{spat}\left(\underline{t},\underline{x}\right)=\left(\underline{t},{f}_{BoxCox}\left(\underline{x}\right)\right)\end{array}$ (108)

OpenTURNS allows one to convert the numerical math function ${f}_{BoxCox}$ directly into the spatial function ${f}_{spat}$ thanks to the SpatialFunction object.

Then, the process $Y$ is built with the object CompositeProcess from the data: ${f}_{spat}$ and the process $X$ such that:

 $\begin{array}{c}\hfill \forall \omega \in \Omega ,\forall \underline{t}\in 𝒟,\phantom{\rule{1.em}{0ex}}Y\left(\omega ,\underline{t}\right)={f}_{BoxCox}\left(X\left(\omega ,\underline{t}\right)\right)\end{array}$
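As a hedged illustration of the value-by-value transformation (not the BoxCoxTransform object, whose exact convention may differ), the usual one-parameter Box Cox formula reads $\left({x}^{\lambda }-1\right)/\lambda$ for $\lambda \ne 0$ and $log\left(x\right)$ for $\lambda =0$:

```python
import math

def box_cox(x, lmbda):
    """Usual one-parameter Box Cox transform, applied value by value:
    (x^lambda - 1) / lambda for lambda != 0, log(x) for lambda = 0."""
    if lmbda == 0.0:
        return math.log(x)
    return (x ** lmbda - 1.0) / lmbda

# Applied marginal by marginal on a bivariate value, with lambda = (0, 0)
f_box_cox = lambda x: (box_cox(x[0], 0.0), box_cox(x[1], 0.0))
y = f_box_cox((math.e, 1.0))
```

With $\lambda =0$ on both marginals this is the componentwise logarithm used in the script below.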
 Requirements $•$ a stochastic process : myXtProcess type: Process $•$ a dynamical function : myDynFct type: DynamicalFunction $•$ the trend function : fTrend type: NumericalMathFunction Results $•$ the trend transformation : fTemp type: TrendTransform $•$ the Box Cox transformation : fSpat type: BoxCoxTransform $•$ some transformed processes : myYtProcess, myYtProcess_trend, myYtProcess_boxcox type: CompositeProcess

Python script for this Use Case :

script_docUC_StocProc_CompositeProcess.py

#######################
# Case 1 : General case
#######################

# Create the image Y
myYtProcess = CompositeProcess(myDynFunc, myXtProcess)

# Get the antecedent : myXtProcess
print('My antecedent process = ', myYtProcess.getAntecedent())

# Get the dynamical function
# which performs the transformation
print('My dynamical function = ', myYtProcess.getFunction())

##############################
# Case 2 : Addition of a trend
##############################

# Create the temporal function fTemp
# from the function fTrend
fTemp = TrendTransform(fTrend)

# Create the composite process
# myYtProcess_trend = myXtProcess + fTrend(t)
myYtProcess_trend = CompositeProcess(fTemp, myXtProcess)

#################################
# Case 3 : Box Cox transformation
#################################

# Create a Box Cox transformation
# for example for a process of dimension 2
# fBoxCox : R^2 --> R^2
#           (x1,x2) --> (log(x1),log(x2))
# fSpat will be applied on a process with a bidimensional mesh
n = 2
fSpat = SpatialFunction(BoxCoxTransform([0.0, 0.0]), n)

# Create the process
# myYtProcess_boxcox = BoxCoxTransform(myXtProcess)
myYtProcess_boxcox = CompositeProcess(fSpat, myXtProcess)

## 5.12 Stationarity test

### 5.12.1 UC : Dickey Fuller stationarity tests

This use case details some Dickey Fuller stationarity tests and the strategy proposed by OpenTURNS. The tests are applied on a scalar time series or a sample of scalar time series.

Two forms of non stationarity can be distinguished :

• deterministic non stationarity when the marginal distribution of the process is time dependent. For example, the mean is time dependent (temporal trend) or the variance is time dependent;

• stochastic non stationarity when the random perturbation at time $t$ does not vanish with time (for example a random walk).

Specific tests exist to detect the first case. The Dickey-Fuller tests only focus on the stochastic non stationarity. They assume that the underlying process, discretized on the time grid $\left({t}_{0},\cdots ,{t}_{n-1}\right)$, writes:

 ${X}_{t}=a+bt+\rho {X}_{t-1}+{\epsilon }_{t}$ (109)

where $\rho >0$ and where $a$ or $b$ or both $\left(a,b\right)$ can be assumed to be equal to 0.

When $\left(a\ne 0$ and $b=0\right)$, the model (109) is said to have a drift. When $\left(a=0$ and $b\ne 0\right)$, the model (109) is said to have a linear trend.

In the model (109), the only way to have stochastic non stationarity is to have $\rho =1$ (if $\rho >1$, then the time series diverges with time which is readily seen in data). The Dickey-Fuller test in the general case is a unit root test to detect whether $\rho =1$ against $\rho <1$ :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill \rho =1\\ {ℋ}_{1}:\hfill & \hfill \rho <1\end{array}\right\$ (110)

The test statistics and its limit distribution depend on the a priori knowledge we have on $a$ and $b$. In the absence of a priori knowledge on the structure of the model, several authors have proposed a global strategy to cover all the subcases of the model (109), depending on the possible values of $a$ and $b$. Figure 76 details the strategy implemented in OpenTURNS, recommended by Enders (Applied Econometric Times Series, Enders, W., second edition, John Wiley & Sons editions, 2004).

 Dickey Fuller strategy
Figure 76

We denote by $\left({X}_{1},\cdots ,{X}_{n}\right)$ the data, by $W\left(r\right)$ the Wiener process, and define ${W}^{a}\left(r\right)=W\left(r\right)-{\int }_{0}^{1}W\left(r\right)dr$ and ${W}^{b}\left(r\right)={W}^{a}\left(r\right)-12\left(r-\frac{1}{2}\right){\int }_{0}^{1}\left(s-\frac{1}{2}\right)W\left(s\right)ds$.

We assume the model 1:

 ${X}_{t}=a+bt+\rho {X}_{t-1}+{\epsilon }_{t}$ (111)

The coefficients $\left(a,b,\rho \right)$ are estimated by $\left({\stackrel{^}{a}}_{n},{\stackrel{^}{b}}_{n},{\stackrel{^}{\rho }}_{n}\right)$ using ordinary least-squares fitting, which leads to:

 $\underset{\underline{\underline{M}}}{\underbrace{\left(\begin{array}{ccc}n-1\hfill & {\sum }_{i=1}^{n}{t}_{i}\hfill & {\sum }_{i=2}^{n}{y}_{i-1}\hfill \\ \sum _{i=1}^{n}{t}_{i}\hfill & {\sum }_{i=1}^{n}{t}_{i}^{2}\hfill & {\sum }_{i=2}^{n}{t}_{i}{y}_{i-1}\hfill \\ \sum _{i=2}^{n}{y}_{i-1}\hfill & {\sum }_{i=2}^{n}{t}_{i}{y}_{i-1}\hfill & {\sum }_{i=2}^{n}{y}_{i-1}^{2}\hfill \end{array}\right)}}\left(\begin{array}{c}{\stackrel{^}{a}}_{n}\\ {\stackrel{^}{b}}_{n}\\ {\stackrel{^}{\rho }}_{n}\end{array}\right)=\left(\begin{array}{c}\sum _{i=1}^{n}{y}_{i}\hfill \\ \sum _{i=1}^{n}{t}_{i}{y}_{i}\hfill \\ \sum _{i=2}^{n}{y}_{i-1}{y}_{i}\hfill \end{array}\right)$ (112)
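The normal equations (112) can be solved numerically; as a plain-Python sketch (not the OpenTURNS estimateDriftAndLinearTrendModel implementation), the regression of ${y}_{i}$ on $\left(1,{t}_{i},{y}_{i-1}\right)$ is fitted below, the noise-free test data being hypothetical so that the fit is exact:

```python
def estimate_drift_and_trend_model(y, t):
    """OLS fit of y_i = a + b t_i + rho y_{i-1} + eps_i (model 1): build
    the 3x3 normal equations and solve them by Gaussian elimination
    with partial pivoting."""
    n = len(y)
    rows = [(1.0, t[i], y[i - 1]) for i in range(1, n)]
    targets = [y[i] for i in range(1, n)]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] * yi for r, yi in zip(rows, targets)) for i in range(3)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        v[c], v[p] = v[p], v[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for k in range(c, 3):
                M[r][k] -= f * M[c][k]
            v[r] -= f * v[c]
    beta = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        beta[r] = (v[r] - sum(M[r][k] * beta[k] for k in range(r + 1, 3))) / M[r][r]
    return beta  # [a_hat, b_hat, rho_hat]

# Noise-free data generated with (a, b, rho) = (1.0, 0.5, 0.3): exact recovery
y = [0.0]
for i in range(1, 30):
    y.append(1.0 + 0.5 * i + 0.3 * y[-1])
a_hat, b_hat, rho_hat = estimate_drift_and_trend_model(y, [float(i) for i in range(30)])
```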

We first test :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill \rho =1\\ {ℋ}_{1}:\hfill & \hfill \rho <1\end{array}\right\$ (113)

thanks to the Student statistics :

 ${t}_{\rho =1}=\frac{{\rho }_{n}-1}{{\stackrel{^}{\sigma }}_{{\rho }_{n}}}$ (114)

where ${\sigma }_{{\rho }_{n}}$ is the least square estimate of the standard deviation of ${\stackrel{^}{\rho }}_{n}$, given by:

 ${\sigma }_{{\rho }_{n}}={\underline{\underline{M}}}_{33}^{-1}\sqrt{\frac{1}{n-1}{\sum }_{i=2}^{n}{\left({y}_{i}-\left({\stackrel{^}{a}}_{n}+{\stackrel{^}{b}}_{n}{t}_{i}+{\stackrel{^}{\rho }}_{n}{y}_{i-1}\right)\right)}^{2}}$ (115)

which converges in distribution to the Dickey-Fuller distribution associated to the model with drift and linear trend:

 ${t}_{\rho =1}\stackrel{ℒ}{⟶}\frac{{\int }_{0}^{1}{W}^{b}\left(r\right)dW\left(r\right)}{{\int }_{0}^{1}{W}^{b}{\left(r\right)}^{2}dr}$ (116)

The null hypothesis ${ℋ}_{0}$ from (113) is accepted when ${t}_{\rho =1}>{C}_{\alpha }$ where ${C}_{\alpha }$ is the test threshold of level $\alpha$, which is tabulated in table 4.

| $\alpha$ | 0.01 | 0.05 | 0.1 |
| --- | --- | --- | --- |
| ${C}_{\alpha }$ | -3.96 | -3.41 | -3.13 |
Quantiles of the Dickey-Fuller statistics for the model with drift and linear trend
Table 4
• If the null hypothesis ${ℋ}_{0}$ from (113) is rejected, we test whether $b=0$ :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill b=0\\ {ℋ}_{1}:\hfill & \hfill b\ne 0\end{array}\right\$ (117)

where the statistics ${t}_{n}=\frac{|{\stackrel{^}{b}}_{n}|}{{\sigma }_{{b}_{n}}}$ converges in distribution to the Student distribution $Student\left(\nu =n-4\right)$, where ${\sigma }_{{b}_{n}}$ is the least square estimate of the standard deviation of ${\stackrel{^}{b}}_{n}$, given by:

 ${\sigma }_{{b}_{n}}={\underline{\underline{M}}}_{22}^{-1}\sqrt{\frac{1}{n-1}{\sum }_{i=2}^{n}{\left({y}_{i}-\left({\stackrel{^}{a}}_{n}+{\stackrel{^}{b}}_{n}{t}_{i}+{\stackrel{^}{\rho }}_{n}{y}_{i-1}\right)\right)}^{2}}$ (118)
• If ${ℋ}_{0}$ from (117) is rejected, then the model 1 (111) is confirmed, and the test (113) rejected the unit root: $\rho <1$. We then conclude that the final model is ${X}_{t}=a+bt+\rho {X}_{t-1}+{\epsilon }_{t}$ with $\rho <1$, which is a trend stationary model.

• If ${ℋ}_{0}$ from (117) is accepted, then the model 1 (111) is not confirmed, since the trend presence is rejected and the test (113) is not conclusive (since based on a wrong model). We then have to test the second model (121).

• If the null hypothesis ${ℋ}_{0}$ from (113) is accepted, we test whether $\left(\rho ,b\right)=\left(1,0\right)$ :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill \left(\rho ,b\right)=\left(1,0\right)\\ {ℋ}_{1}:\hfill & \hfill \left(\rho ,b\right)\ne \left(1,0\right)\end{array}\right\$ (119)

with the Fisher statistics :

 ${\stackrel{^}{F}}_{1}=\frac{\left({S}_{1,0}-{S}_{1,b}\right)/2}{{S}_{1,b}/\left(n-3\right)}$ (120)

where ${S}_{1,0}={\sum }_{i=2}^{n}{\left({y}_{i}-\left({\stackrel{^}{a}}_{n}+{y}_{i-1}\right)\right)}^{2}$ is the sum of the square errors of the model 1 (111) assuming ${ℋ}_{0}$ from (119) and ${S}_{1,b}={\sum }_{i=2}^{n}{\left({y}_{i}-\left({\stackrel{^}{a}}_{n}+{\stackrel{^}{b}}_{n}{t}_{i}+{\stackrel{^}{\rho }}_{n}{y}_{i-1}\right)\right)}^{2}$ is the same sum when we make no assumption on $\rho$ and $b$.

The statistics ${\stackrel{^}{F}}_{1}$ converges in distribution to the Fisher-Snedecor distribution $F\left(2,n-3\right)$. The null hypothesis ${ℋ}_{0}$ from (119) is accepted when ${\stackrel{^}{F}}_{1}<{\Phi }_{\alpha }$ where ${\Phi }_{\alpha }$ is the test threshold of level $\alpha$.

• If ${ℋ}_{0}$ from (119) is rejected, then the model 1 (111) is confirmed since the presence of linear trend is confirmed. And the test (113) proved that the unit root is accepted : $\rho =1$. We then conclude that the model is : ${X}_{t}=a+bt+{X}_{t-1}+{\epsilon }_{t}$ which is a non stationary model.

• If ${ℋ}_{0}$ from (119) is accepted, then the model 1 (111) is not confirmed, since the presence of the linear trend is rejected and the test (113) is not conclusive (since based on a wrong model). We then have to test the second model (121).

We assume the model 2:

 ${X}_{t}=a+\rho {X}_{t-1}+{\epsilon }_{t}$ (121)

The coefficients $\left(a,\rho \right)$ are estimated as follows :

 $\underset{\underline{\underline{N}}}{\underbrace{\left(\begin{array}{cc}n-1\hfill & {\sum }_{i=2}^{n}{y}_{i-1}\hfill \\ \sum _{i=2}^{n}{y}_{i-1}\hfill & {\sum }_{i=2}^{n}{y}_{i-1}^{2}\hfill \end{array}\right)}}\left(\begin{array}{c}{\stackrel{^}{a}}_{n}\\ {\stackrel{^}{\rho }}_{n}\end{array}\right)=\left(\begin{array}{c}\sum _{i=1}^{n}{y}_{i}\hfill \\ \sum _{i=2}^{n}{y}_{i-1}{y}_{i}\hfill \end{array}\right)$ (122)

We first test :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill \rho =1\\ {ℋ}_{1}:\hfill & \hfill \rho <1\end{array}\right\$ (123)

thanks to the Student statistics :

 ${t}_{\rho =1}=\frac{{\rho }_{n}-1}{{\sigma }_{{\rho }_{n}}}$ (124)

where ${\sigma }_{{\rho }_{n}}$ is the least square estimate of the standard deviation of ${\stackrel{^}{\rho }}_{n}$, given by:

 ${\sigma }_{{\rho }_{n}}={\underline{\underline{N}}}_{22}^{-1}\sqrt{\frac{1}{n-1}{\sum }_{i=2}^{n}{\left({y}_{i}-\left({\stackrel{^}{a}}_{n}+{\stackrel{^}{\rho }}_{n}{y}_{i-1}\right)\right)}^{2}}$ (125)

which converges in distribution to the Dickey-Fuller distribution associated to the model with drift and no linear trend:

 ${t}_{\rho =1}\stackrel{ℒ}{⟶}\frac{{\int }_{0}^{1}{W}^{a}\left(r\right)dW\left(r\right)}{{\int }_{0}^{1}{W}^{a}{\left(r\right)}^{2}dr}$ (126)

The null hypothesis ${ℋ}_{0}$ from (123) is accepted when ${t}_{\rho =1}>{C}_{\alpha }$ where ${C}_{\alpha }$ is the test threshold of level $\alpha$, which is tabulated in table 5.

| $\alpha$ | 0.01 | 0.05 | 0.1 |
| --- | --- | --- | --- |
| ${C}_{\alpha }$ | -3.43 | -2.86 | -2.57 |
Quantiles of the Dickey-Fuller statistics for the model with drift
Table 5
• If the null hypothesis ${ℋ}_{0}$ from (123) is rejected, we test whether $a=0$ :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill a=0\\ {ℋ}_{1}:\hfill & \hfill a\ne 0\end{array}\right\$ (127)

where the statistics ${t}_{n}=\frac{|{\stackrel{^}{a}}_{n}|}{{\sigma }_{{a}_{n}}}$ converges in distribution to the Student distribution $Student\left(\nu =n-3\right)$, where ${\sigma }_{{a}_{n}}$ is the least square estimate of the standard deviation of ${\stackrel{^}{a}}_{n}$, given by:

 ${\sigma }_{{a}_{n}}={\underline{\underline{N}}}_{11}^{-1}\sqrt{\frac{1}{n-1}{\sum }_{i=2}^{n}{\left({y}_{i}-\left({\stackrel{^}{a}}_{n}+{\stackrel{^}{\rho }}_{n}{y}_{i-1}\right)\right)}^{2}}$ (128)
• If ${ℋ}_{0}$ from (127) is rejected, then the model 2 (121) is confirmed, and the test (123) rejected the unit root: $\rho <1$. We then conclude that the final model is ${X}_{t}=a+\rho {X}_{t-1}+{\epsilon }_{t}$ with $\rho <1$, which is a stationary model.

• If ${ℋ}_{0}$ from (127) is accepted, then the model 2 (121) is not confirmed, since the drift presence is rejected and the test (123) is not conclusive (since based on a wrong model). We then have to test the third model (131).

• If the null hypothesis ${ℋ}_{0}$ from (123) is accepted, we test whether $\left(\rho ,a\right)=\left(1,0\right)$ :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill \left(\rho ,a\right)=\left(1,0\right)\\ {ℋ}_{1}:\hfill & \hfill \left(\rho ,a\right)\ne \left(1,0\right)\end{array}\right\$ (129)

with a Fisher test. The statistics is :

 ${\stackrel{^}{F}}_{2}=\frac{\left(SC{R}_{2,c}-SC{R}_{2}\right)/2}{SC{R}_{2}/\left(n-2\right)}$ (130)

where $SC{R}_{2,c}$ is the sum of the square errors of the model 2 (121) assuming ${ℋ}_{0}$ from (129) and $SC{R}_{2}$ is the same sum when we make no assumption on $\rho$ and $a$.

The statistics ${\stackrel{^}{F}}_{2}$ converges in distribution to the Fisher distribution $F\left(2,n-2\right)$. The null hypothesis ${ℋ}_{0}$ from (129) is accepted when ${\stackrel{^}{F}}_{2}<{\Phi }_{\alpha }$ where ${\Phi }_{\alpha }$ is the test threshold of level $\alpha$.

• If ${ℋ}_{0}$ from (129) is rejected, then the model 2 (121) is confirmed since the presence of the drift is confirmed. And the test (123) proved that the unit root is accepted : $\rho =1$. We then conclude that the model is : ${X}_{t}=a+{X}_{t-1}+{\epsilon }_{t}$ which is a non stationary model.

• If ${ℋ}_{0}$ from (129) is accepted, then the model 2 (121) is not confirmed, since the drift presence is rejected and the test (123) is not conclusive (since based on a wrong model). We then have to test the third model (131).

We assume the model 3:

 ${X}_{t}=\rho {X}_{t-1}+{\epsilon }_{t}$ (131)

The coefficient $\rho$ is estimated as follows :

 ${\stackrel{^}{\rho }}_{n}=\frac{{\sum }_{i=2}^{n}{y}_{i-1}{y}_{i}}{{\sum }_{i=2}^{n}{y}_{i-1}^{2}}$ (132)

We first test :

 $\left\{\begin{array}{cc}{ℋ}_{0}:\hfill & \hfill \rho =1\\ {ℋ}_{1}:\hfill & \hfill \rho <1\end{array}\right\$ (133)

thanks to the Student statistics :

 ${t}_{\rho =1}=\frac{{\stackrel{^}{\rho }}_{n}-1}{{\sigma }_{{\rho }_{n}}}$ (134)

where ${\sigma }_{{\rho }_{n}}$ is the least square estimate of the standard deviation of ${\stackrel{^}{\rho }}_{n}$, given by:

 ${\sigma }_{{\rho }_{n}}=\sqrt{\frac{1}{n-1}{\sum }_{i=2}^{n}{\left({y}_{i}-{\stackrel{^}{\rho }}_{n}{y}_{i-1}\right)}^{2}}/\sqrt{{\sum }_{i=2}^{n}{y}_{i-1}^{2}}$ (135)

which converges in distribution to the Dickey-Fuller distribution associated to the random walk model:

 ${t}_{\rho =1}\stackrel{ℒ}{⟶}\frac{{\int }_{0}^{1}W\left(r\right)dW\left(r\right)}{{\int }_{0}^{1}W{\left(r\right)}^{2}dr}$ (136)

The null hypothesis ${ℋ}_{0}$ from (133) is accepted when ${t}_{\rho =1}>{C}_{\alpha }$ where ${C}_{\alpha }$ is the test threshold of level $\alpha$, which is tabulated in table 6.

| $\alpha$ | 0.01 | 0.05 | 0.1 |
| --- | --- | --- | --- |
| ${C}_{\alpha }$ | -2.57 | -1.94 | -1.62 |
Quantiles of the Dickey-Fuller statistics for the random walk model
Table 6
• If ${ℋ}_{0}$ from (133) is rejected, we then conclude that the model is : ${X}_{t}=\rho {X}_{t-1}+{\epsilon }_{t}$ where $\rho <1$ which is a stationary model.

• If ${ℋ}_{0}$ from (133) is accepted, we then conclude that the model is : ${X}_{t}={X}_{t-1}+{\epsilon }_{t}$ which is a non stationary model.
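The model-3 branch above, equations (132) to (136), can be sketched end to end in plain Python (an illustration, not the OpenTURNS DickeyFuller object); the AR(1) test sample with $\rho =0.5$ is hypothetical:

```python
import math
import random

def unit_root_test_ar1(y, c_alpha=-1.94):
    """Estimate rho (equation (132)), its standard deviation (135) and
    the Student statistic t_{rho=1} (134); H0 (rho = 1) is accepted
    when t > C_alpha (here the 5% quantile of table 6)."""
    n = len(y)
    den = sum(y[i - 1] ** 2 for i in range(1, n))
    rho = sum(y[i - 1] * y[i] for i in range(1, n)) / den
    s2 = sum((y[i] - rho * y[i - 1]) ** 2 for i in range(1, n)) / (n - 1)
    t_stat = (rho - 1.0) / (math.sqrt(s2) / math.sqrt(den))
    return rho, t_stat, t_stat > c_alpha   # True <=> unit root accepted

# A clearly stationary AR(1) sample with rho = 0.5
rng = random.Random(7)
y = [0.0]
for _ in range(400):
    y.append(0.5 * y[-1] + rng.gauss(0.0, 1.0))
rho_hat, t_stat, unit_root = unit_root_test_ar1(y)
```

On such a stationary sample the statistic is strongly negative and the unit root is rejected.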

OpenTURNS implements the estimation of the coefficients of the different models by the following methods of the object DickeyFuller:

• coefficients of (112): method estimateDriftAndLinearTrendModel

• coefficients of (122): method estimateDriftModel

• coefficients of (132): method estimateAR1Model

The global DickeyFuller strategy is implemented in the method testUnitRoot of the object DickeyFuller. For more expert Users, the different tests of the Dickey-Fuller strategy are implemented by the following methods of the object DickeyFuller:

• test (113) : method testUnitRootInDriftAndLinearTrendModel

• test (117) : method testNoUnitRootAndNoLinearTrendInDriftAndLinearTrendModel

• test (119) : method testUnitRootAndNoLinearTrendInDriftAndLinearTrendModel

• test (123) : method testUnitRootInDriftModel

• test (127) : method testNoUnitRootAndNoDriftInDriftModel

• test (129) : method testUnitRootAndNoDriftInDriftModel

• test (133) : method testUnitRootInAR1Model

 Requirements $•$ a time series : myTimeSeries type: TimeSeries $•$ the level of the test : level type: NumericalScalar Results $•$ a test class : myDickeyFullerTest type: DickeyFullerTest $•$ a test result : myTestResult type: TestResult

Python script for this Use Case :

# Initiate a DickeyFullerTest class
myDickeyFullerTest = DickeyFullerTest(myTimeSeries)

# H0 : rho = 1
# Test = True <=> time series non stationary
# p-value threshold : probability of the H0 reject zone : 1 - 0.95
# p-value : probability (test variable decision > test variable decision evaluated on the time series)
# Test = True <=> p-value > p-value threshold

# Test of the model with linear trend
myTestResultTrendModel = myDickeyFullerTest.testLinearTrendModel(level)

# Test of the model with drift
myTestResultDriftModel = myDickeyFullerTest.testDriftModel(level)

# Test of the model with drift and linear trend
myTestResultDriftAndTrendModel = myDickeyFullerTest.testDriftAndLinearTrendModel(level)

# Run the global strategy
myTestResultStrategy = myDickeyFullerTest.runStrategy(level)

## 5.13 Event based on process

### 5.13.1 UC : Creation of an event based on a process

This section gives elements to create an event based on a multivariate stochastic process.

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a stochastic process of dimension $d$, where $𝒟\in {ℝ}^{n}$ is discretized on the mesh $ℳ$. We suppose that $ℳ$ contains $N$ vertices.

We define the event $ℰ$ as:

 $\begin{array}{c}\hfill ℰ\left(X\right)=\bigcup _{\underline{t}\in ℳ}\left\{{X}_{\underline{t}}\in 𝒜\right\}\end{array}$ (137)

where $𝒜$ is a domain of ${ℝ}^{d}$.

A particular domain $𝒜$ is the cartesian product of type :

 $\begin{array}{c}\hfill 𝒜=\prod _{i=1}^{d}\left[{a}_{i},{b}_{i}\right]\end{array}$

In that case, OpenTURNS defines $𝒜$ by its two extreme points : $\underline{a}$ and $\underline{b}$.

In the general case, $𝒜$ is a Domain object that is able to check if it contains a given point in ${ℝ}^{d}$.

OpenTURNS creates an Event object from the process $X$ and the domain $𝒜$. Then, it is possible to get a realization of the event $ℰ$, which is the scalar ${1}_{ℰ\left(X\right)}\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ where $\left({\underline{x}}_{0},\cdots ,{\underline{x}}_{N-1}\right)$ is a realization of $X$ on $ℳ$.
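The indicator of the event (137) can be sketched in plain Python (an illustration, not the Event object); the domain, modelled here by a membership predicate, is hypothetical:

```python
def event_indicator(realization, in_domain):
    """1_{E(X)}(x_0, ..., x_{N-1}): equal to 1 as soon as the value at
    some vertex of the mesh lies in the domain A, 0 otherwise."""
    return 1 if any(in_domain(x) for x in realization) else 0

# Hypothetical domain A = [2, +inf) x R for a bivariate process
in_A = lambda x: x[0] >= 2.0
hit = event_indicator([(0.0, 1.0), (2.5, -1.0)], in_A)
miss = event_indicator([(0.0, 1.0), (1.0, 0.0)], in_A)
```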

 Requirements $•$ the domain $𝒜$: myDomainA type: Domain $•$ the process : myProcess type: Process Results $•$ the Event $ℰ$: myEvent type: Event

Python script for this UseCase :

script_docUC_StocProc_Event.py

myEvent = Event(myProcess, myDomainA)

### 5.13.2 UC : Monte Carlo Probability of an event based on a process

The objective of this Use Case is to evaluate the probability of an event based on a stochastic process, using the Monte Carlo estimator.

Let $X:\Omega ×𝒟\to {ℝ}^{d}$ be a stochastic process of dimension $d$, where the domain $𝒟\subset {ℝ}^{n}$ is discretized on the mesh $ℳ$.

We define the event $ℰ$ as:

 $\begin{array}{c}\hfill ℰ\left(X\right)=\bigcup _{\underline{t}\in ℳ}\left\{{X}_{\underline{t}}\in 𝒜\right\}\end{array}$ (138)

where $𝒜$ is a domain of ${ℝ}^{d}$.

We estimate the probability $p=ℙ\left(ℰ\left(X\right)\right)$ with the Monte Carlo estimator.

The Monte Carlo algorithm is manipulated in the same way as in the case where the event is based on a random variable independent of time. Details on the manipulation of the Monte Carlo algorithm and its results are presented in Use Case 3.5.9.

Requirements

$•$ the domain $𝒜$: myDomainA, type: Domain

$•$ the process $X$: myProcess, type: Process

Results

$•$ the event $ℰ$: myEvent, type: Event

$•$ the Monte Carlo algorithm: myMonteCarloAlgo, type: MonteCarlo

Python script for this Use Case :

script_docUC_StocProc_MonteCarlo.py

```python
# Create an event from a Process and a Domain
# myEvent = EventProcess(myProcess, myDomain)
myEvent = Event(myProcess, myDomainA)

# Create a Monte-Carlo algorithm based on myEvent
myMonteCarloAlgo = MonteCarlo(myEvent)

# Define the maximum number of simulations
myMonteCarloAlgo.setMaximumOuterSampling(1000)

# Define the block size
myMonteCarloAlgo.setBlockSize(100)

# Define the maximum coefficient of variation
myMonteCarloAlgo.setMaximumCoefficientOfVariation(0.0025)

# Run the algorithm
myMonteCarloAlgo.run()

# Get the result
result = myMonteCarloAlgo.getResult()
print(result)

# Draw the convergence graph
convGraph = myMonteCarloAlgo.drawProbabilityConvergence(0.95)
```

We illustrate the algorithm on the example of the bidimensional white noise process $\epsilon :\Omega ×𝒟\to {ℝ}^{2}$ where $𝒟\subset ℝ$, distributed according to the bidimensional standard normal distribution (with zero mean, unit variance and independent marginals).

We consider the domain $𝒜=\left[1,2\right]×\left[1,2\right]$. Then the event $ℰ$ reads:

 $\begin{array}{c}\hfill ℰ\left(\epsilon \right)=\bigcup _{\underline{t}\in ℳ}\left\{{\epsilon }_{t}\in 𝒜\right\}\end{array}$

For each time stamp $t\in ℳ$, the probability ${p}_{1}$ that the process enters the domain $𝒜$ at time $t$ is, using the independence of the marginals:

 $\begin{array}{c}\hfill {p}_{1}=ℙ\left({\epsilon }_{t}\in 𝒜\right)={\left(\Phi \left(2\right)-\Phi \left(1\right)\right)}^{2}\end{array}$

with $\Phi$ the cumulative distribution function of the scalar standard Normal distribution.

As the process is discretized on a time grid of size $N$, and using both the independence of the white noise between two different time stamps and the fact that the white noise follows the same distribution at each time $t$, the final probability $p$ reads:

 $p=ℙ\left(ℰ\left(\epsilon \right)\right)=1-{\left(1-{p}_{1}\right)}^{N}$ (139)
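Formula (139) can be checked numerically without OpenTURNS, writing the standard normal CDF $\Phi$ with the stdlib `math.erf` (a sketch of the closed-form computation, not part of the original script):

```python
import math

def phi(x):
    """CDF of the scalar standard normal distribution: Phi(x) = (1 + erf(x/sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Probability of entering A = [1,2]x[1,2] at a single time stamp (independent marginals)
p1 = (phi(2.0) - phi(1.0)) ** 2

# Probability of the event over a time grid of size N = 10
N = 10
p = 1.0 - (1.0 - p1) ** N
print(round(p, 5))  # 0.17008
```

This recovers the exact value $p=0.17008$ quoted below for $N=10$.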

With $K={10}^{4}$ realizations, using the Monte Carlo estimator, we obtain ${p}_{K}=0.1627$, to be compared to the exact value $p=0.17008$ for a time grid of size $N=10$.
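The Monte Carlo estimate can be reproduced, up to sampling noise, with a plain-Python sketch of the estimator; the seed and the sample count $K={10}^{4}$ are illustrative choices, not the settings used to produce the figure quoted above:

```python
import random

random.seed(0)  # illustrative seed, for reproducibility

N = 10      # size of the time grid
K = 10_000  # number of process realizations

def enters_box(n_steps):
    """One realization of the 2-d Gaussian white noise: does it enter [1,2]x[1,2]?"""
    for _ in range(n_steps):
        x, y = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
        if 1.0 <= x <= 2.0 and 1.0 <= y <= 2.0:
            return True
    return False

# Monte Carlo estimator: fraction of realizations for which the event occurs
p_hat = sum(enters_box(N) for _ in range(K)) / K
print(p_hat)  # close to the exact value p = 0.17008
```

The standard error of the estimator is $\sqrt{p\left(1-p\right)/K}\approx 0.0038$, which is consistent with the deviation between ${p}_{K}=0.1627$ and the exact value.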

Figure 77 shows the convergence graph of the estimator, with a confidence level of $95%$; the resulting confidence interval is $C{I}_{0.95}=\left[0.168,0.173\right]$.

Figure 77: Convergence of the Monte Carlo estimator based on a process event.