This section is organized in three parts. The first one gives the list of probabilistic uncertainty models proposed by OpenTURNS. The second part gives an overview of the content of the statistical toolbox that may be used to build these uncertainty models if data are available. The last part is dedicated to the mathematical description of each method.
OpenTURNS proposes two different types of probabilistic models: nonparametric and parametric ones.
[Empirical Cumulative Distribution Function] – see page 3.3
[Kernel smoothing] – see page 3.3.1
[Usual uni- and multi-dimensional probability distribution functions] – see page 3.3.2
[Copulas: a mathematical tool for multidimensional distributions] – see page 3.3.3
[Random Mixture: affine combination of independent univariate distributions] – see page 3.3.4
Building a dataset may require aggregating several data sources; OpenTURNS offers some techniques to check beforehand whether these data sources are indeed drawn from the same probability distribution.
Moreover, when a parametric model is used, OpenTURNS provides statistical tools to estimate the parameters, validate the resulting model and address the important issue of dependencies among uncertainty sources.
Qualitative analysis
[Graphical analysis using QQ-plot] – see page 3.3.5
Quantitative analysis
[Smirnov test] – see page 3.3.6
[Maximum Likelihood Principle] – see page 3.3.7
[Bayesian Calibration] – see page 3.3.8
[Parametric estimation] – see page 3.3.10
Qualitative goodness-of-fit analysis
[Graphical analysis: QQ-plot and Henry line] – see page 3.3.11
Quantitative goodness-of-fit analysis
[Chi-square test] – see page 3.3.12
[Kolmogorov-Smirnov test] – see page 3.3.13
[Cramer-von Mises test] – see page 3.3.14
[Anderson-Darling test] – see page 3.3.15
[Bayesian Information Criterion (BIC)] – see page 3.3.16
Linear correlations
[Pearson correlation coefficient] – see page 3.3.17
[Pearson independence test] – see page 3.3.18
Monotonic correlations
[Spearman correlation coefficient] – see page 3.3.19
[Spearman independence test] – see page 3.3.20
Model-free dependency analysis
[Chisquare independence test] – see page 3.3.21
Regression methods
[Linear regression] – see page 3.3.22
Mathematical description
The empirical cumulative distribution function provides a graphical representation of the probability distribution of a random vector without implying any prior assumption concerning the form of this distribution. It is a nonparametric approach, able to describe complex behaviour that parametric approaches would not necessarily detect.
Therefore, using general notation, this means that we are looking for an estimator ${\widehat{F}}_{N}$ for the cumulative distribution function ${F}_{X}$ of the random variable $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$:
$$\begin{array}{c}\hfill {\widehat{F}}_{N}\leftrightarrow {F}_{X}\end{array}$$ 
Principle of the method for ${n}_{X}=1$
Let us first consider the unidimensional case, and let us denote $\underline{X}={X}^{1}=X$. The empirical probability distribution is the distribution created from a sample of observed values $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$. It corresponds to a discrete uniform distribution on $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$: where ${X}^{\text{'}}$ follows this distribution,
$$\begin{array}{c}\hfill \forall \phantom{\rule{0.277778em}{0ex}}i\in \left\{1,...,N\right\},\phantom{\rule{4pt}{0ex}}\mathrm{Pr}\left({X}^{\text{'}}={x}_{i}\right)=\frac{1}{N}\end{array}$$ 
The empirical cumulative distribution function ${\widehat{F}}_{N}$ associated with this distribution is constructed as follows:
$$\begin{array}{c}\hfill {\widehat{F}}_{N}\left(x\right)=\frac{1}{N}\sum _{i=1}^{N}{\mathbf{1}}_{\left\{{x}_{i}\le x\right\}}\end{array}$$ 
The empirical cumulative distribution function ${\widehat{F}}_{N}\left(x\right)$ is defined as the proportion of observations that are less than (or equal to) $x$, and is thus an approximation of the cumulative distribution function ${F}_{X}\left(x\right)$, which is the probability that an observation is less than (or equal to) $x$.
$$\begin{array}{c}\hfill {F}_{X}\left(x\right)=\mathrm{Pr}\left(X\le x\right)\end{array}$$ 
The diagram below provides an illustration of an ordered sample $\left\{5,6,10,22,27\right\}$.
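The empirical CDF of this ordered sample can be computed in a few lines. This is a minimal sketch using only the standard library; the `ecdf` helper is illustrative, not part of the OpenTURNS API.

```python
# Empirical CDF of a 1-D sample: F_N(x) = (1/N) * #{x_i <= x}.
# Illustrative helper, not the OpenTURNS API.
def ecdf(sample, x):
    return sum(1 for xi in sample if xi <= x) / len(sample)

sample = [5, 6, 10, 22, 27]  # the ordered sample used in the illustration
# F_N jumps by 1/5 at each observation:
# ecdf(sample, 4) -> 0.0, ecdf(sample, 10) -> 0.6, ecdf(sample, 27) -> 1.0
```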
Principle of the method for ${n}_{X}>1$
The method is similar for the case ${n}_{X}>1$. The empirical probability distribution is a distribution created from a sample $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$. It corresponds to a discrete uniform distribution on $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$: where ${\underline{X}}^{\text{'}}$ follows this distribution,
$$\begin{array}{c}\hfill \forall \phantom{\rule{0.277778em}{0ex}}i\in \left\{1,...,N\right\},\phantom{\rule{4pt}{0ex}}\mathrm{Pr}\left({\underline{X}}^{\text{'}}={\underline{x}}_{i}\right)=\frac{1}{N}\end{array}$$ 
Thus we have:
$$\begin{array}{c}\hfill {\widehat{F}}_{N}\left(\underline{x}\right)=\frac{1}{N}\sum _{i=1}^{N}{\mathbf{1}}_{\left\{{x}_{i}^{1}\le {x}^{1},...,{x}_{i}^{{n}_{X}}\le {x}^{{n}_{X}}\right\}}\end{array}$$ 
in comparison with the theoretical cumulative distribution function ${F}_{X}$:
$$\begin{array}{c}\hfill {F}_{X}\left(x\right)=\mathbb{P}\left({X}^{1}\le {x}^{1},...,{X}^{{n}_{X}}\le {x}^{{n}_{X}}\right)\end{array}$$ 
Link with OpenTURNS methodology
The following bibliographical references provide main starting points for further study of this method:
Saporta G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon W.J. & Massey F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Mathematical description
In dimension 1, the kernel smoothed probability density function $\widehat{p}$ has the following expression, where $K$ is the univariate kernel, $n$ the numerical sample size and $({X}_{1},\cdots ,{X}_{n})\in {\mathbb{R}}^{n}$ the univariate random sample with $\forall i,\phantom{\rule{0.166667em}{0ex}}\phantom{\rule{0.166667em}{0ex}}{X}_{i}\in \mathbb{R}$ :
$$\widehat{p}\left(x\right)=\frac{1}{nh}\sum _{i=1}^{n}K\left(\frac{x-{X}_{i}}{h}\right)$$  (1) 
The kernel $K$ is a function satisfying $\int K\left(x\right)\phantom{\rule{0.166667em}{0ex}}dx=1$. Usually, $K$ is chosen to be a unimodal probability density function that is symmetric about 0.
The parameter $h$ is called the bandwidth.
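Relation (1) can be sketched directly. The helper below uses a Gaussian kernel and a user-supplied bandwidth $h$; it is illustrative only, not the OpenTURNS API.

```python
import math

# 1-D kernel density estimate of relation (1) with a Gaussian kernel K.
# Illustrative sketch, not the OpenTURNS API.
def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(sample, x, h):
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)
```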
In dimension $d>1$, the kernel may be defined as a product kernel ${K}_{d}$, as follows where $\underline{x}=({x}_{1},\cdots ,{x}_{d})\in {\mathbb{R}}^{d}$ :
$$\begin{array}{c}\hfill {\displaystyle {K}_{d}\left(\underline{x}\right)=\prod _{j=1}^{d}K\left({x}_{j}\right)}\end{array}$$ 
which leads to the kernel smoothed probability density function in dimension $d$, where $({\underline{X}}_{1},\cdots ,{\underline{X}}_{n})$ is the $d$-variate random sample whose components are denoted ${\underline{X}}_{i}=({X}_{i1},\cdots ,{X}_{id})$ :
$$\begin{array}{c}\hfill {\displaystyle \widehat{p}\left(\underline{x}\right)=\frac{1}{n{\prod}_{j=1}^{d}{h}_{j}}\sum _{i=1}^{n}{K}_{d}\left(\frac{{x}_{1}-{X}_{i1}}{{h}_{1}},\cdots ,\frac{{x}_{d}-{X}_{id}}{{h}_{d}}\right)}\end{array}$$ 
Note that the bandwidth is now the vector $\underline{h}=({h}_{1},\cdots ,{h}_{d})$.
The quality of the approximation may be controlled by the AMISE (Asymptotic Mean Integrated Squared Error) criterion, defined as :
$$\begin{array}{c}\hfill \left\{\begin{array}{ccc}AMISE\left(\widehat{p}\right)\hfill & =& \text{the}\phantom{\rule{4.pt}{0ex}}\text{two}\phantom{\rule{4.pt}{0ex}}\text{first}\phantom{\rule{4.pt}{0ex}}\text{terms}\phantom{\rule{4.pt}{0ex}}\text{in}\phantom{\rule{4.pt}{0ex}}\text{the}\phantom{\rule{4.pt}{0ex}}\text{series}\phantom{\rule{4.pt}{0ex}}\text{expansion}\phantom{\rule{4.pt}{0ex}}\text{with}\phantom{\rule{4.pt}{0ex}}\text{respect}\phantom{\rule{4.pt}{0ex}}\text{to}\phantom{\rule{4.pt}{0ex}}n\phantom{\rule{4.pt}{0ex}}\text{in}\phantom{\rule{4.pt}{0ex}}MISE\left(\widehat{p}\right)\hfill \\ MISE\left(\widehat{p}\right)\hfill & =& {\mathbb{E}}_{\underline{X}}\left[{\left\|\widehat{p}-p\right\|}_{{L}_{2}}^{2}\right]=\int \phantom{\rule{0.166667em}{0ex}}MSE(\widehat{p},\underline{x})\phantom{\rule{0.166667em}{0ex}}d\underline{x}\hfill \\ MSE(\widehat{p},\underline{x})\hfill & =& {\left[{\mathbb{E}}_{\underline{X}}\left[\widehat{p}\left(\underline{x}\right)\right]-p\left(\underline{x}\right)\right]}^{2}+{\mathrm{Var}}_{\underline{X}}\left[\widehat{p}\left(\underline{x}\right)\right]\hfill \end{array}\right.\end{array}$$ 
The quality of the estimation essentially depends on the value of the bandwidth $h$. The bandwidth that minimizes the AMISE criterion has the following expression (given in dimension 1) :
$${h}_{AMISE}\left(K\right)={\left[\frac{R\left(K\right)}{{\mu}_{2}{\left(K\right)}^{2}R\left({p}^{\left(2\right)}\right)}\right]}^{\frac{1}{5}}{n}^{-\frac{1}{5}}$$  (2) 
where $R\left(K\right)=\int K{\left(x\right)}^{2}\phantom{\rule{0.166667em}{0ex}}dx$ and ${\mu}_{2}\left(K\right)=\int {x}^{2}K\left(x\right)\phantom{\rule{0.166667em}{0ex}}dx={\sigma}_{K}^{2}$.
If we note that $R\left({p}^{\left(r\right)}\right)={\left(-1\right)}^{r}{\Phi}_{2r}$ with ${\Phi}_{r}=\int {p}^{\left(r\right)}\left(x\right)\,p\left(x\right)\phantom{\rule{0.166667em}{0ex}}dx={\mathbb{E}}_{\underline{X}}\left[{p}^{\left(r\right)}\right]$, then relation (2) writes :
$${h}_{AMISE}\left(K\right)={\left[\frac{R\left(K\right)}{{\mu}_{2}{\left(K\right)}^{2}{\Phi}_{4}}\right]}^{\frac{1}{5}}{n}^{-\frac{1}{5}}$$  (3) 
Several rules exist to evaluate the optimal bandwidth ${h}_{AMISE}\left(K\right)$ : all efforts are concentrated on the evaluation of the term ${\Phi}_{4}$. We give here the most usual rules :
the Silverman rule in dimension 1,
the plug-in bandwidth selection (solve-the-equation method) in dimension 1,
the Scott rule in dimension $d$.
In the case where the density $p$ is normal with standard deviation $\sigma $, the term ${\Phi}_{4}$ can be evaluated exactly. In that particular case, the optimal bandwidth of relation (3) with respect to the AMISE criterion writes as follows :
$${h}_{AMISE}^{p=normal}\left(K\right)={\left[\frac{8\sqrt{\pi}R\left(K\right)}{3{\mu}_{2}{\left(K\right)}^{2}}\right]}^{\frac{1}{5}}\sigma \,{n}^{-\frac{1}{5}}$$  (4) 
An estimator of ${h}_{AMISE}^{p=normal}\left(K\right)$ is obtained by replacing $\sigma $ by its estimator ${\widehat{\sigma}}^{n}$, evaluated from the numerical sample $({X}_{1},\cdots ,{X}_{n})$ :
$${\widehat{h}}_{AMISE}^{p=normal}\left(K\right)={\left[\frac{8\sqrt{\pi}R\left(K\right)}{3{\mu}_{2}{\left(K\right)}^{2}}\right]}^{\frac{1}{5}}{\widehat{\sigma}}^{n}\,{n}^{-\frac{1}{5}}$$  (5) 
The Silverman rule consists in considering ${\widehat{h}}_{AMISE}^{p=normal}\left(K\right)$ of relation (5) even if the density $p$ is not normal :
$${h}^{Silver}\left(K\right)={\left[\frac{8\sqrt{\pi}R\left(K\right)}{3{\mu}_{2}{\left(K\right)}^{2}}\right]}^{\frac{1}{5}}{\widehat{\sigma}}^{n}\,{n}^{-\frac{1}{5}}$$  (6) 
Relation (6) is empirical and gives good results when the density is not far from a normal one.
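For the Gaussian kernel, $R(N)=1/(2\sqrt{\pi})$ and ${\mu}_{2}(N)=1$, so the bracket in relation (6) reduces to $(4/3)^{1/5}\approx 1.06$: the classical Silverman rule of thumb. A minimal sketch under this Gaussian-kernel assumption (function name illustrative, not the OpenTURNS API):

```python
# Silverman bandwidth of relation (6) for a Gaussian kernel:
# R(N) = 1/(2*sqrt(pi)) and mu_2(N) = 1, so the bracket in (6) is (4/3)^(1/5),
# i.e. the classical "1.06 * sigma * n^(-1/5)" rule of thumb.
def silverman_bandwidth(sigma_hat, n):
    return (4.0 / 3.0) ** 0.2 * sigma_hat * n ** (-0.2)
```

The bandwidth shrinks slowly (as $n^{-1/5}$) when the sample size grows.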
Plug-in bandwidth selection – solve-the-equation method (dimension 1)
Relation (3) requires the evaluation of the quantity ${\Phi}_{4}$. As a general rule, we use the estimator ${\widehat{\Phi}}_{r}$ of ${\Phi}_{r}$ defined by :
$${\widehat{\Phi}}_{r}=\frac{1}{n}\sum _{i=1}^{n}{\widehat{p}}^{\left(r\right)}\left({X}_{i}\right)$$  (7) 
Differentiating relation (1) leads to :
$${\widehat{p}}^{\left(r\right)}\left(x\right)=\frac{1}{n{h}^{r+1}}\sum _{i=1}^{n}{K}^{\left(r\right)}\left(\frac{x-{X}_{i}}{h}\right)$$  (8) 
and then the estimator ${\widehat{\Phi}}_{r}\left(h\right)$ is defined as :
$${\widehat{\Phi}}_{r}\left(h\right)=\frac{1}{{n}^{2}{h}^{r+1}}\sum _{i=1}^{n}\sum _{j=1}^{n}{K}^{\left(r\right)}\left(\frac{{X}_{i}-{X}_{j}}{h}\right)$$  (9) 
We note that ${\widehat{\Phi}}_{r}\left(h\right)$ depends on the parameter $h$, which can be chosen to minimize the AMSE (Asymptotic Mean Squared Error) criterion between ${\Phi}_{r}$ and ${\widehat{\Phi}}_{r}\left(h\right)$. The optimal parameter $h$ is :
$${h}_{AMSE}^{\left(r\right)}={\left(\frac{-2{K}^{\left(r\right)}\left(0\right)}{{\mu}_{2}\left(K\right){\Phi}_{r+2}}\right)}^{\frac{1}{r+3}}{n}^{-\frac{1}{r+3}}$$  (10) 
Given these preliminary results, the solve-the-equation plug-in method proceeds as follows :
Relation (3) defines ${h}_{AMISE}\left(K\right)$ as a function of ${\Phi}_{4}$, which we denote here as :
$${h}_{AMISE}\left(K\right)=t\left({\Phi}_{4}\right)$$  (11) 
The term ${\Phi}_{4}$ is approximated by its estimator defined in (9) evaluated with its optimal parameter ${h}_{AMSE}^{\left(4\right)}$ defined in (10) :
$${h}_{AMSE}^{\left(4\right)}={\left(\frac{-2{K}^{\left(4\right)}\left(0\right)}{{\mu}_{2}\left(K\right){\Phi}_{6}}\right)}^{\frac{1}{7}}{n}^{-\frac{1}{7}}$$  (12) 
which leads to a relation of type :
$${\Phi}_{4}\simeq {\widehat{\Phi}}_{4}\left({h}_{AMSE}^{\left(4\right)}\right)$$  (13) 
Combining relations (3) and (12) leads to :
$${h}_{AMSE}^{\left(4\right)}={\left(\frac{-2{K}^{\left(4\right)}\left(0\right){\mu}_{2}\left(K\right){\Phi}_{4}}{R\left(K\right){\Phi}_{6}}\right)}^{\frac{1}{7}}{h}_{AMISE}{\left(K\right)}^{\frac{5}{7}}$$  (14) 
which rewrites :
$${h}_{AMSE}^{\left(4\right)}=l\left({h}_{AMISE}\left(K\right)\right)$$  (15) 
Relation (14) depends on both terms ${\Phi}_{4}$ and ${\Phi}_{6}$, which are evaluated with their estimators defined in (9), respectively with their AMSE optimal parameters ${g}_{1}$ and ${g}_{2}$ (see relation (10)). This leads to the expressions:
$$\left\{\begin{array}{ccc}{g}_{1}\hfill & =& {\displaystyle {\left(\frac{-2{K}^{\left(4\right)}\left(0\right)}{{\mu}_{2}\left(K\right){\Phi}_{6}}\right)}^{\frac{1}{7}}{n}^{-\frac{1}{7}}}\hfill \\ {g}_{2}\hfill & =& {\displaystyle {\left(\frac{-2{K}^{\left(6\right)}\left(0\right)}{{\mu}_{2}\left(K\right){\Phi}_{8}}\right)}^{\frac{1}{9}}{n}^{-\frac{1}{9}}}\hfill \end{array}\right.$$  (16) 
In order to evaluate ${\Phi}_{6}$ and ${\Phi}_{8}$, we suppose that the density $p$ is normal with a variance ${\sigma}^{2}$, approximated by the empirical variance of the numerical sample, which leads to :
$$\left\{\begin{array}{ccc}{\widehat{\Phi}}_{6}\hfill & =& {\displaystyle \frac{-15}{16\sqrt{\pi}}{\widehat{\sigma}}^{-7}}\hfill \\ {\widehat{\Phi}}_{8}\hfill & =& {\displaystyle \frac{105}{32\sqrt{\pi}}{\widehat{\sigma}}^{-9}}\hfill \end{array}\right.$$  (17) 
To summarize: thanks to relations (11), (13), (15), (16) and (17), the optimal bandwidth is the solution of the equation :
$${h}_{AMISE}\left(K\right)=t\circ {\widehat{\Phi}}_{4}\circ l\left({h}_{AMISE}\left(K\right)\right)$$  (18) 
Scott rule (dimension d)
The Scott rule is a simplification of the Silverman rule generalized to dimension $d$, which is optimal when the density $p$ is normal with independent components. In all other cases, it gives an empirical rule that yields good results when the density $p$ is not far from the normal one. For example, the Scott bandwidth may appear too large when $p$ presents several maxima.
The Silverman rule given in dimension 1 in relation (6) can be generalized in dimension $d$ as follows : if we suppose that the density $p$ is normal with independent components, in dimension $d$ and that we use the normal kernel $N(0,1)$ to estimate it, then the optimal bandwidth vector $\underline{h}$ with respect to the AMISE criteria writes as follows :
$${\underline{h}}^{Silver}\left(N\right)={\left({\left(\frac{4}{d+2}\right)}^{1/(d+4)}{\widehat{\sigma}}_{i}^{n}\,{n}^{-1/(d+4)}\right)}_{i}$$  (19) 
where ${\widehat{\sigma}}_{i}^{n}$ is the standard deviation of the $i$-th component of the sample $({\underline{X}}_{1},\cdots ,{\underline{X}}_{n})$, and ${\sigma}_{K}$ is the standard deviation of the 1D kernel $K$.
The Scott proposition is a simplification of the Silverman rule, based on the fact that the coefficient ${\left(\frac{4}{d+2}\right)}^{1/(d+4)}$ remains in $[0.924,1.059]$ when the dimension $d$ varies. Thus, Scott fixed it to 1 :
$${\left(\frac{4}{d+2}\right)}^{1/(d+4)}\simeq 1$$  (20) 
which leads to the simplified expression :
$${\underline{h}}^{Silver}\left(N\right)\simeq {\left({\widehat{\sigma}}_{i}^{n}\,{n}^{-1/(d+4)}\right)}_{i}$$  (21) 
Furthermore, in the general case, we have from relation (2) :
$$\frac{{h}_{AMISE}\left({K}_{1}\right)}{{h}_{AMISE}\left({K}_{2}\right)}=\frac{{\sigma}_{{K}_{2}}}{{\sigma}_{{K}_{1}}}{\left[\frac{{\sigma}_{{K}_{1}}R\left({K}_{1}\right)}{{\sigma}_{{K}_{2}}R\left({K}_{2}\right)}\right]}^{1/5}$$  (22) 
Considering that ${\sigma}_{K}R\left(K\right)\simeq 1$ whatever the kernel $K$, relation (22) simplifies to :
$${h}_{AMISE}\left({K}_{1}\right)\simeq {h}_{AMISE}\left({K}_{2}\right)\frac{{\sigma}_{{K}_{2}}}{{\sigma}_{{K}_{1}}}$$  (23) 
If we consider the normal kernel $N(0,1)$ for ${K}_{2}$, then relation (23) writes in a more general notation :
$${h}_{AMISE}\left(K\right)\simeq {h}_{AMISE}\left(N\right)\frac{1}{{\sigma}_{K}}$$  (24) 
If ${h}_{AMISE}\left(N\right)$ is evaluated with the Silverman rule, (24) rewrites :
$${h}^{Silver}\left(K\right)\simeq {h}^{Silver}\left(N\right)\frac{1}{{\sigma}_{K}}$$  (25) 
Finally, from relations (21) and (25) applied in each direction $i$, we deduce the Scott rule :
$${\underline{h}}^{Scott}={\left(\frac{{\widehat{\sigma}}_{i}^{n}}{{\sigma}_{K}}{n}^{-1/(d+4)}\right)}_{i}$$  (26) 
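Relation (26) can be sketched directly, assuming the per-component standard deviations have already been estimated (function name illustrative, not the OpenTURNS API):

```python
# Scott bandwidth vector of relation (26): one bandwidth per component,
# h_i = (sigma_hat_i / sigma_K) * n^(-1/(d+4)).
# Illustrative sketch; OpenTURNS applies this rule automatically in dimension d.
def scott_bandwidth(sigma_hats, n, sigma_k=1.0):
    d = len(sigma_hats)
    factor = n ** (-1.0 / (d + 4))
    return [s / sigma_k * factor for s in sigma_hats]
```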
Boundary treatment
In dimension 1, boundary effects may be taken into account in OpenTURNS : the boundaries are automatically detected from the numerical sample (with the min and max functions) and the kernel smoothed PDF is corrected in the boundary areas so that it remains within the boundaries, according to the mirroring technique :
the Scott bandwidth $h$ is evaluated from the numerical sample,
two sub-samples are extracted from the initial numerical sample, containing all the points within the ranges $[min,min+h[$ and $]max-h,max]$,
both sub-samples are transformed into their symmetric samples with respect to their respective boundary : this results in two samples within the ranges $]min-h,min]$ and $[max,max+h[$,
a kernel smoothed PDF is built from the new numerical sample composed of the initial one and the two new ones, with the previous bandwidth $h$,
this last kernel smoothed PDF is truncated to the initial range $[min,max]$ (conditional PDF).
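The mirroring steps above can be sketched as follows; `mirrored_sample` is an illustrative helper for the reflection step only, not the OpenTURNS API:

```python
# Mirroring boundary treatment: reflect the points lying within one bandwidth
# of each boundary, so a KDE fitted on the augmented sample (then truncated to
# [min, max]) no longer leaks mass outside the support. Illustrative sketch.
def mirrored_sample(sample, h):
    lo, hi = min(sample), max(sample)
    left = [2 * lo - x for x in sample if lo <= x < lo + h]    # range [min, min+h[
    right = [2 * hi - x for x in sample if hi - h < x <= hi]   # range ]max-h, max]
    return sample + left + right
```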
Implementation in OpenTURNS
The choice of the kernel is free in OpenTURNS : it is possible to select any 1D distribution and to define it as a kernel. However, in order to optimize the efficiency of the kernel smoothing fitting (i.e. to minimize the AMISE error), it is recommended to select a symmetric distribution for the kernel.
All the distribution default constructors of OpenTURNS create a symmetric default distribution when possible. It is also possible to work with the Epanechnikov kernel, which is a $Beta(r=2,t=4,a=-1,b=1)$.
The default kernel is a product of standard Normal distribution. The dimension of the product is automatically evaluated from the random sample.
The bandwidth $\underline{h}$ may be fixed by the User. However, it is recommended to let OpenTURNS evaluate it automatically from the numerical sample according to the following rules :
In dimension $d$, OpenTURNS automatically applies the Scott rule.
In dimension 1, the automatic bandwidth selection method depends on the size $n$ of the numerical sample. As a matter of fact, the computational bottleneck is the evaluation of the estimators ${\widehat{\Phi}}_{r}$, as it requires a double summation over the numerical sample, whose cost is $\mathcal{O}\left({n}^{2}\right)$.
if $n\le 250$, the solve-the-equation plug-in method is used on the entire numerical sample. The optimal bandwidth ${h}_{AMISE}\left(N\right)$ is first evaluated by solving equation (18) for the Normal kernel. Then relation (24) is applied in order to evaluate ${h}_{AMISE}\left(K\right)$.
if $n>250$, the solve-the-equation plug-in method is too computationally expensive. In that case, OpenTURNS proceeds as follows :
OpenTURNS evaluates the bandwidth ${h}_{AMISE}^{{n}_{1},PI}\left(N\right)$ with the plug-in method applied on the first ${n}_{1}=250$ points of the numerical sample, using the Normal kernel $N$ (by solving equation (18) with $K=N$);
OpenTURNS evaluates the bandwidth ${h}^{{n}_{1},Silver}\left(N\right)$ with the Silverman rule applied on the first ${n}_{1}=250$ points of the numerical sample, using the Normal kernel $N$ (relation (6) with $K=N$);
OpenTURNS evaluates the bandwidth ${h}^{n,Silver}\left(N\right)$ with the Silverman rule applied on the entire numerical sample, using the Normal kernel $N$ (relation (6) with $K=N$);
Considering from relation (3) that :
$$\frac{{h}^{Silver}\left(K\right)}{{h}_{AMISE}\left(K\right)}={\left[\frac{{\Phi}_{4}(p=normal)}{{\Phi}_{4}\left(p\right)}\right]}^{\frac{1}{5}}$$  (27) 
which is independent of the size $n$, we have the final relation :
$${h}_{AMISE}^{n,PI}\left(N\right)=\frac{{h}_{AMISE}^{{n}_{1},PI}\left(N\right)}{{h}^{{n}_{1},Silver}\left(N\right)}{h}^{n,Silver}\left(N\right)$$  (28) 
Then, if the User has chosen the kernel $K$ rather than the normal kernel $N$, relation (24) is used, which leads to :
$${h}_{AMISE}^{n,PI}\left(K\right)=\frac{1}{{\sigma}_{K}}\frac{{h}_{AMISE}^{{n}_{1},PI}\left(N\right)}{{h}^{{n}_{1},Silver}\left(N\right)}{h}^{n,Silver}\left(N\right)$$  (29) 
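The rescaling of relation (28) is a simple ratio and can be sketched directly (function name illustrative, not the OpenTURNS API):

```python
# Large-sample bandwidth of relation (28): rescale the plug-in bandwidth
# computed on a subsample of size n1 by the ratio of Silverman bandwidths
# (subsample vs. full sample). Illustrative sketch.
def scaled_plugin_bandwidth(h_pi_n1, h_silver_n1, h_silver_n):
    return h_pi_n1 / h_silver_n1 * h_silver_n
```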
Link with OpenTURNS methodology
of the distribution of the input random variable (Step B of the global methodology),
of the distribution of the output variable of interest (Step C of the global methodology).
References and theoretical basics
Kernel smoothing, M.P. Wand and M.C. Jones, Chapman & Hall/CRC edition, ISBN 0-412-55270-1.
Multivariate Density Estimation: Theory, Practice, and Visualization, David W. Scott, Wiley edition.
Examples
This example illustrates the effect of the choice of the bandwidth $h$ on the estimation of the PDF compared to the optimal one. Depending on the choice of $h$, one may observe, for the same number $N$ of input values, oversmoothing or undersmoothing effects.
Oversmoothing effect
In this case, $h$ is bigger than the optimal choice ${h}_{opt}$. The effect of each value is more widely spread than in the optimal case.
Undersmoothing effect
In this case, $h$ is smaller than the optimal choice ${h}_{opt}$. The effect of each value is more locally focused on the values of the data set than in the optimal case.
Optimal smoothing
Here, $h$ follows the Silverman rule above, for a Gaussian distribution.
Mathematical description
Parametric models aim to describe probability distributions of a random variable with the aid of a limited number of parameters $\underline{\theta}$. Therefore, in the case of continuous variables (i.e. where all possible values are continuous), this means that the probability density of $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$ can be expressed as ${f}_{X}(\underline{x};\underline{\theta})$. In the case of discrete variables (i.e. those which take only discrete values), their probabilities can be described in the form $\mathbb{P}\left(\underline{X}=\underline{x};\underline{\theta}\right)$.
The available distributions of OpenTURNS are listed in this section. We start with continuous distributions.
Arcsine distribution: $\underline{\theta}=\left(a,b\right)$, with the constraint $a<b$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{1}{\pi \frac{b-a}{2}\sqrt{1-{\left(\frac{x-\frac{a+b}{2}}{\frac{b-a}{2}}\right)}^{2}}}$$  (30) 
The support is $[a,b]$.
Beta distribution: Univariate distribution. $\underline{\theta}=\left(r,t,a,b\right)$, with the constraints $r>0$, $t>r$, $b>a$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{{(x-a)}^{r-1}{(b-x)}^{t-r-1}}{{(b-a)}^{t-1}B(r,t-r)}{\mathbf{1}}_{a\le x\le b}$$  (31) 
where $B$ denotes the Beta function. The support is $[a,b]$.
Note that the Epanechnikov distribution is a particular Beta distribution : $Beta(a=-1,b=1,r=2,t=4)$. It is useful within the kernel smoothing theory (see [kernel smoothing] ).
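Relation (31) can be checked numerically: with $(r=2,t=4,a=-1,b=1)$ the density below reduces to the Epanechnikov kernel $\frac{3}{4}(1-{x}^{2})$ on $[-1,1]$. A sketch using only the standard library (function name illustrative, not the OpenTURNS API):

```python
import math

# Beta density of relation (31), in the (r, t, a, b) parameterization used
# in the text; B(r, t-r) is computed from the gamma function.
def beta_pdf(x, r, t, a, b):
    if not (a <= x <= b):
        return 0.0
    beta_rt = math.gamma(r) * math.gamma(t - r) / math.gamma(t)
    return (x - a) ** (r - 1) * (b - x) ** (t - r - 1) / ((b - a) ** (t - 1) * beta_rt)
```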
PDF of Beta distributions for various parameter values.

PDF of the Epanechnikov distribution.
Burr distribution: Univariate distribution. $\underline{\theta}=\left(c,k\right)$, with the constraints $c>0$, $k>0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=ck\frac{{x}^{c-1}}{{\left(1+{x}^{c}\right)}^{k+1}}{\mathbf{1}}_{x>0}$$  (32) 
PDF of Burr distributions for various parameter values.
Chi: Univariate distribution. $\underline{\theta}=\nu $ with the constraint $\nu >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})={x}^{\nu -1}{e}^{-{x}^{2}/2}\frac{{2}^{1-\nu /2}}{\Gamma \left(\nu /2\right)}{\mathbf{1}}_{[0,+\infty [}\left(x\right)$$  (33) 
PDF of a Chi distribution. 
ChiSquare: Univariate distribution. $\underline{\theta}=\nu $ with the constraint $\nu >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{{2}^{-\nu /2}}{\Gamma \left(\nu /2\right)}{x}^{\nu /2-1}{e}^{-x/2}{\mathbf{1}}_{[0,+\infty [}\left(x\right)$$  (34) 
The support is $[0,+\infty [$.
PDF of Chi-Square distributions for various parameter values.
Dirichlet distribution: Multivariate $d$-dimensional distribution. $\underline{\theta}=({\theta}_{1},\cdots ,{\theta}_{d+1})$, with the constraints $d\ge 1$ and ${\theta}_{i}>0$. The probability density function writes:
$${f}_{X}(\underline{x};\underline{\theta})=\frac{\Gamma \left({\sum}_{j=1}^{d+1}{\theta}_{j}\right)}{{\prod}_{j=1}^{d+1}\Gamma \left({\theta}_{j}\right)}{\left[1-\sum _{j=1}^{d}{x}_{j}\right]}^{{\theta}_{d+1}-1}\prod _{j=1}^{d}{x}_{j}^{{\theta}_{j}-1}\,{\mathbf{1}}_{\Delta}\left(\underline{x}\right)$$  (35) 
with $\Delta =\{\underline{x}\in {\mathbb{R}}^{d}/\forall i,{x}_{i}\ge 0,{\sum}_{i=1}^{d}{x}_{i}\le 1\}$.
Epanechnikov distribution: Univariate distribution. The Epanechnikov distribution is a particular Beta distribution : $Beta(a=-1,b=1,r=2,t=4)$. It is useful within the kernel smoothing theory (see [kernel smoothing] ). See Figure 4 for the graph of its pdf.
Exponential distribution: $\underline{\theta}=\left(\lambda ,\gamma \right)$, with the constraint $\lambda >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\lambda \,exp\left(-\lambda (x-\gamma )\right){\mathbf{1}}_{\gamma \le x}$$  (36) 
The support is $[\gamma ,+\infty [$, and the distribution is right skewed. The expected value of the distribution is $\gamma +1/\lambda $. The coefficient of variation (standard deviation / mean) is equal to $\frac{1}{1+\gamma \lambda}$; when $\gamma =0$ it equals 1, whatever the value of $\lambda $.
PDF of an Exponential distribution. 
FisherSnedecor distribution: $\underline{\theta}=\left({d}_{1},{d}_{2}\right)$, with the constraint ${d}_{i}>0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{1}{x\,B({d}_{1}/2,{d}_{2}/2)}\left[{\left(\frac{{d}_{1}x}{{d}_{1}x+{d}_{2}}\right)}^{{d}_{1}/2}{\left(1-\frac{{d}_{1}x}{{d}_{1}x+{d}_{2}}\right)}^{{d}_{2}/2}\right]{\mathbf{1}}_{x\ge 0}$$  (37) 
The support is $[0,+\infty [$, and the distribution is right skewed.
PDF of Fisher-Snedecor distributions for various parameter values.
Gamma distribution: Univariate distribution. $\underline{\theta}=\left(\lambda ,k,\gamma \right)$, with the constraints $\lambda >0$, $k>0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{\lambda}{\Gamma \left(k\right)}{\left(\lambda (x-\gamma )\right)}^{k-1}exp\left(-\lambda (x-\gamma )\right){\mathbf{1}}_{\gamma \le x}$$  (38) 
where $\Gamma $ is the gamma function. The support is $[\gamma ,+\infty [$, and the distribution is right skewed.
PDF of Gamma distributions for various parameter values.
Generalized Pareto distribution: Univariate distribution. $\underline{\theta}=\left(\xi ,\sigma \right)$, with the constraint $\sigma >0$. The cumulative distribution function writes:
$${F}_{X}(x;\underline{\theta})=\left\{\begin{array}{cc}{\displaystyle 1-{\left(1+\frac{\xi x}{\sigma}\right)}^{-1/\xi}}\hfill & \text{if}\phantom{\rule{4.pt}{0ex}}\xi \ne 0\hfill \\ {\displaystyle 1-exp\left(-\frac{x}{\sigma}\right)}\hfill & \text{if}\phantom{\rule{4.pt}{0ex}}\xi =0\hfill \end{array}\right.$$  (39) 
The support is ${\mathbb{R}}^{+}$ if $\xi \ge 0$ and $[0,-\frac{\sigma}{\xi}]$ if $\xi <0$.
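The CDF of relation (39), including the exponential limit at $\xi =0$ and the finite upper endpoint when $\xi <0$, can be sketched as follows (function name illustrative, not the OpenTURNS API):

```python
import math

# Generalized Pareto CDF of relation (39); the xi = 0 branch is the
# exponential limit, and for xi < 0 the support ends at -sigma/xi.
def gpd_cdf(x, xi, sigma):
    if x < 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-x / sigma)
    if xi < 0.0 and x >= -sigma / xi:
        return 1.0  # beyond the finite upper endpoint of the support
    return 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)
```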
PDF of Generalized Pareto distributions for various parameter values.
Gumbel distribution: Univariate distribution. $\underline{\theta}=\left(\alpha ,\beta \right)$, with the constraint $\alpha >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\alpha \,exp\left(-\alpha (x-\beta )-{e}^{-\alpha (x-\beta )}\right)$$  (40) 
The support is $\mathbb{R}$. $\beta $ describes the most likely value, but this is less than the expected value of the distribution because the distribution is asymmetric (right skewed): the probability values in the distribution's right tail (i.e. values greater than $\beta $) decrease more gradually than those in the left tail (i.e. values less than $\beta $). $\alpha $ provides a measure of dispersion: the probability density function flattens as $\alpha $ decreases.
PDF of a Gumbel distribution. 
Histogram distribution: Univariate distribution. $\underline{\theta}=\left({({l}_{i},{h}_{i})}_{i}\right)$, with the constraint ${h}_{i}>0$ and ${l}_{i}>0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\sum _{i=1}^{n}{H}_{i}\phantom{\rule{0.277778em}{0ex}}{1}_{[{x}_{i},{x}_{i+1}]}\left(x\right)$$  (41) 
where
${H}_{i}={h}_{i}/S$ is the normalized height, with $S={\sum}_{i=1}^{n}{h}_{i}\phantom{\rule{0.166667em}{0ex}}{l}_{i}$ being the initial surface of the histogram,
${l}_{i}={x}_{i+1}-{x}_{i}$, $1\le i\le n$,
$n$ is the size of the HistogramPairCollection
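The normalization step above can be sketched directly: after dividing the raw heights by the surface $S$, the histogram density of relation (41) integrates to one (helper name illustrative, not the OpenTURNS API):

```python
# Normalization of histogram heights per relation (41):
# H_i = h_i / S with S = sum(h_i * l_i), so sum(H_i * l_i) == 1.
def normalized_heights(heights, widths):
    surface = sum(h * l for h, l in zip(heights, widths))
    return [h / surface for h in heights]
```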
PDF of Histogram distributions for various parameter values.
Inverse Chi-Square distribution: Univariate distribution. $\underline{\theta}=\left(\nu \right)$, with the constraint $\nu >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})={\displaystyle \frac{exp\left(-\frac{1}{2x}\right)}{\Gamma \left(\frac{\nu}{2}\right){2}^{\frac{\nu}{2}}{x}^{\frac{\nu}{2}+1}}}{\mathbf{1}}_{x>0}$$  (42) 
An Inverse Chi-Square distribution parameterized by $\nu $ is exactly the Inverse Gamma distribution parameterized by $({\displaystyle \frac{\nu}{2}},2)$.
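This identity can be checked numerically from relations (42) and (43); the functions below are illustrative stdlib-only sketches, not the OpenTURNS API.

```python
import math

# Inverse Gamma density of relation (43).
def inverse_gamma_pdf(x, k, lam):
    if x <= 0.0:
        return 0.0
    return math.exp(-1.0 / (lam * x)) / (math.gamma(k) * lam ** k * x ** (k + 1))

# Inverse Chi-Square density of relation (42), written directly;
# it should coincide with inverse_gamma_pdf(x, nu/2, 2).
def inverse_chi_square_pdf(x, nu):
    if x <= 0.0:
        return 0.0
    return math.exp(-1.0 / (2.0 * x)) / (
        math.gamma(nu / 2.0) * 2.0 ** (nu / 2.0) * x ** (nu / 2.0 + 1.0)
    )
```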
PDF of an Inverse ChiSquare distribution. 
Inverse Gamma distribution: Univariate distribution. $\underline{\theta}=\left(k,\lambda \right)$, with the constraints $k>0$ and $\lambda >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{exp\left(-{\displaystyle \frac{1}{\lambda x}}\right)}{\Gamma \left(k\right){\lambda}^{k}{x}^{k+1}}{\mathbf{1}}_{x>0}$$  (43) 
PDF of an Inverse Gamma distribution. 
Inverse Normal distribution: Univariate distribution. $\underline{\theta}=\left(\lambda ,\mu \right)$, with the constraints $\lambda >0$ and $\mu >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})={\left(\frac{\lambda}{2\pi {x}^{3}}\right)}^{1/2}{e}^{-\lambda {(x-\mu )}^{2}/\left(2{\mu}^{2}x\right)}{\mathbf{1}}_{x>0}$$  (44) 
PDF of Inverse Normal distributions for various parameter values.
Inverse Wishart distribution: Multivariate distribution. $\underline{\theta}=\left(\underline{\underline{V}},\nu \right)$ where $\underline{\underline{V}}$ is a symmetric positive definite matrix of dimension $p$ and $\nu >p1$. The probability density function writes:
$${f}_{\underline{X}}(\underline{x};\underline{\theta})=\frac{{\left|\underline{\underline{V}}\right|}^{\frac{\nu}{2}}{e}^{-\frac{1}{2}\mathrm{tr}\left(\underline{\underline{V}}\,m{\left(\underline{x}\right)}^{-1}\right)}}{{2}^{\frac{\nu p}{2}}{\left|m\left(\underline{x}\right)\right|}^{\frac{\nu +p+1}{2}}{\Gamma}_{p}\left(\frac{\nu}{2}\right)}{\mathbf{1}}_{{\mathcal{M}}_{p}^{+}\left(\mathbb{R}\right)}\left(m\left(\underline{x}\right)\right)$$  (45) 
where $\underline{x}\in {\mathbb{R}}^{\frac{p(p+1)}{2}}$, ${\mathcal{M}}_{p}^{+}\left(\mathbb{R}\right)$ is the set of symmetric positive matrices of dimension $p$ and $m:{\mathbb{R}}^{\frac{p(p+1)}{2}}\to {\mathcal{M}}_{p}^{+}\left(\mathbb{R}\right)$ is given by:
$$\begin{array}{c}\hfill m\left(\underline{x}\right)=\left(\begin{array}{cccc}{x}_{1}& {x}_{2}& \cdots & {x}_{1+p(p-1)/2}\\ {x}_{2}& {x}_{3}& & \vdots \\ \vdots & & \ddots & \vdots \\ {x}_{1+p(p-1)/2}& \cdots & \cdots & {x}_{p(p+1)/2}\end{array}\right)\end{array}$$  (46) 
Laplace distribution: Univariate distribution. $\underline{\theta}=\left(\lambda ,\mu \right)$, with the constraint $\lambda >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{\lambda}{2}{e}^{-\lambda \left|x-\mu \right|}$$  (47) 
The Laplace distribution is the generalisation of the Exponential distribution to the range $\mathbb{R}$.
PDF of a Laplace distribution. 
Logistic distribution: Univariate distribution. $\underline{\theta}=\left(\alpha ,\beta \right)$, with the constraint $\beta >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{\exp\left(-\frac{x-\alpha}{\beta}\right)}{\beta {\left[1+\exp\left(-\frac{x-\alpha}{\beta}\right)\right]}^{2}}$$  (48) 
The support is $\mathbb{R}$. $\alpha $ describes the most likely value. $\beta $ provides a measure of dispersion: the probability density function flattens as $\beta $ increases.
PDF of a logistic distribution. 
LogNormal distribution: Univariate distribution. $\underline{\theta}=\left({\mu}_{\ell},{\sigma}_{\ell},\gamma \right)$, with the constraint ${\sigma}_{\ell}>0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{1}{{\sigma}_{\ell}(x-\gamma )\sqrt{2\pi}}\exp\left(-\frac{1}{2}{\left(\frac{\mathrm{ln}(x-\gamma )-{\mu}_{\ell}}{{\sigma}_{\ell}}\right)}^{2}\right){\mathbf{1}}_{\gamma \le x}$$  (49) 
The support is $[\gamma ,+\infty [$ and the distribution is right skewed.
PDF of a LogNormal distribution. 
LogUniform distribution: Univariate distribution. $\underline{\theta}=\left({a}_{\ell},{b}_{\ell}\right)$, with the constraint ${b}_{\ell}>{a}_{\ell}$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{1}{x({b}_{\ell}-{a}_{\ell})}{\mathbf{1}}_{{a}_{\ell}\le \mathrm{log}\left(x\right)\le {b}_{\ell}}$$  (50) 
The support is $[\exp\left({a}_{\ell}\right),\exp\left({b}_{\ell}\right)]$ and the distribution is right skewed.
PDF of a LogUniform distribution. 
Maximum entropy statistics distribution: Multivariate distribution, parametrized by $d$ marginals, with bounds $a,b$ verifying ${a}_{i}\le {a}_{i+1}$ and ${b}_{i}\le {b}_{i+1}$. The probability density function writes:
$${f}_{X}\left(x\right)={f}_{1}\left({x}_{1}\right)\prod _{k=2}^{d}{\phi}_{k}\left({x}_{k}\right)\exp\left(-{\int}_{{x}_{k-1}}^{{x}_{k}}{\phi}_{k}\left(s\right)ds\right){\mathbf{1}}_{{x}_{1}\le \cdots \le {x}_{d}}$$  (51) 
with
$${\phi}_{k}\left({x}_{k}\right)=\frac{{f}_{k}\left({x}_{k}\right)}{{F}_{k-1}\left({x}_{k}\right)-{F}_{k}\left({x}_{k}\right)}$$  (52) 
MeixnerDistribution distribution: Univariate distribution. $\underline{\theta}=\left(\alpha ,\beta ,\delta ,\mu \right)$, with the constraints $\alpha >0$, $\beta \in ]-\pi ,\pi [$ and $\delta >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{{\left[2\cos(\beta /2)\right]}^{2\delta}}{2\alpha \pi \Gamma \left(2\delta \right)}{e}^{\frac{\beta (x-\mu )}{\alpha}}{\left|\Gamma \left(\delta +i\frac{x-\mu}{\alpha}\right)\right|}^{2}$$  (53) 
where ${i}^{2}=-1$. The support is $\mathbb{R}$.
PDF of a MeixnerDistribution distribution. 
Non Central Chi Square: Univariate distribution. $\underline{\theta}=\left(\nu ,\lambda \right)$, with the constraint $\nu >0$ and $\lambda \ge 0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\sum _{j=0}^{\infty}{e}^{-\lambda}\frac{{\lambda}^{j}}{j!}{p}_{{\chi}^{2}(\nu +2j)}\left(x\right)$$  (54) 
where ${p}_{{\chi}^{2}\left(q\right)}$ is the probability density function of a ${\chi}^{2}\left(q\right)$ random variate.
PDF of a Non Central Chi Square distribution.
Non Central Student: Univariate distribution. $\underline{\theta}=\left(\nu ,\delta ,\gamma \right)$. Let's note that a random variable $X$ is said to have a standard noncentral student distribution $\mathcal{T}(\nu ,\delta )$ if it can be written as:
$$X=\frac{N}{\sqrt{C/\nu}}$$  (55) 
where $N$ has the normal distribution $\mathcal{N}(\delta ,1)$ and $C$ has the ${\chi}^{2}\left(\nu \right)$ distribution, $N$ and $C$ being independent.
The noncentral Student distribution in OpenTURNS has an additional parameter $\gamma $ such that the random variable $X$ is said to have a noncentral Student distribution $\mathcal{T}(\nu ,\delta ,\gamma )$ if $X\gamma $ has a standard $\mathcal{T}(\nu ,\delta )$ distribution.
The probability density function of the Non Central Student distribution writes:
$${p}_{NCS}\left(x\right)=\frac{\exp\left(-{\delta}^{2}/2\right)}{\sqrt{\nu \pi}\,\Gamma (\nu /2)}{\left(\frac{\nu}{\nu +{(x-\gamma )}^{2}}\right)}^{(\nu +1)/2}\sum _{j=0}^{\infty}\frac{\Gamma \left(\frac{\nu +j+1}{2}\right)}{\Gamma (j+1)}{\left(\delta (x-\gamma )\sqrt{\frac{2}{\nu +{(x-\gamma )}^{2}}}\right)}^{j}$$  (56) 
PDF of a Non Central Student distribution. 
Normal Gamma: Bivariate distribution. $\underline{\theta}=\left(\mu ,\kappa ,\alpha ,\beta \right)$.
The Normal Gamma distribution is the distribution of the random vector $(X,Y)$ where $Y$ follows the distribution $\Gamma (\alpha ,\beta )$ with $\alpha >0$ and $\beta >0$, and $X|Y$ follows the distribution $\mathcal{N}(\mu ,\frac{1}{\sqrt{\kappa Y}})$.
The probability density function of the Normal Gamma distribution writes:
$${p}_{NG}(x,y)={\displaystyle \frac{{\beta}^{\alpha}}{\Gamma \left(\alpha \right)}}\sqrt{{\displaystyle \frac{\kappa}{2\pi}}}\,{y}^{\alpha -1/2}\exp\left(-{\displaystyle \frac{y}{2}}\left[\kappa {(x-\mu )}^{2}+2\beta \right]\right)$$  (57) 
Normal distribution (or Gaussian distribution): Multivariate $n$dimensional distribution. In the case $n=1$, $\underline{\theta}=\left(\mu ,\sigma \right)$, with the constraint $\sigma >0$. The probability density is given as:
$${f}_{X}(x;\underline{\theta})=\frac{1}{\sigma \sqrt{2\pi}}\exp\left(-\frac{1}{2}{\left(\frac{x-\mu}{\sigma}\right)}^{2}\right)$$  (58) 
The support is $\mathbb{R}$. $\mu $ provides the most likely value (for which the probability density function is at its highest), and the density function is symmetric around this value (the values $\mu -a$ and $\mu +a$ are equally likely); $\mu $ is also the expected value (mean) of this distribution. Whilst $\sigma $ provides a measure of dispersion: the larger it is, the flatter the probability density function is (i.e. values far away from $\mu $ are still likely, or in other words possible values are more spread out).
PDF of a Normal distribution. 
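Equation (58) translates directly into code. A minimal plain-Python sketch (illustrative, not the OpenTURNS API):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # Eq. (58): f(x) = exp(-((x - mu)/sigma)^2 / 2) / (sigma * sqrt(2*pi))
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The density peaks at mu and is symmetric around it
peak = normal_pdf(0.0)                          # 1/sqrt(2*pi) ~ 0.3989
symmetric = normal_pdf(1.0) == normal_pdf(-1.0)
```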
In dimension $n>1$, the MultiNormal Distribution (or Multivariate Normal Distribution) writes :
$${f}_{X}(\underline{x};\underline{\theta})=\frac{1}{{\displaystyle {\left(2\pi \right)}^{\frac{n}{2}}{\left(\mathrm{det}\underline{\underline{\Sigma}}\right)}^{\frac{1}{2}}}}{e}^{-\frac{1}{2}{(\underline{x}-\underline{\mu})}^{t}{\underline{\underline{\Sigma}}}^{-1}(\underline{x}-\underline{\mu})}$$  (59) 
where $\underline{\underline{\Sigma}}={\underline{\underline{\Lambda}}}_{\underline{\sigma}}\underline{\underline{R}}{\underline{\underline{\Lambda}}}_{\underline{\sigma}}$, ${\underline{\underline{\Lambda}}}_{\underline{\sigma}}=\mathrm{diag}\left(\underline{\sigma}\right)$, $\underline{\underline{R}}$ SPD, ${\sigma}_{i}>0$. The distribution is parameterized by $(\underline{\mu},\underline{\sigma},\underline{\underline{R}})$ or $(\underline{\mu},\underline{\underline{\Sigma}})$.
Random Mixture distribution: Univariate distribution. A Random Mixture $Y$ is defined as an affine combination of random variables ${X}_{i}$ as follows:
$$Y={a}_{0}+\sum _{i=1}^{n}{a}_{i}{X}_{i}$$  (60) 
where ${\left({a}_{i}\right)}_{0\le i\le n}\in {\mathbb{R}}^{n+1}$ and ${\left({X}_{i}\right)}_{1\le i\le n}$ are some independent univariate distributions.
For example,
$$Y=2+5{X}_{1}+{X}_{2}$$  (61) 
where :
${X}_{1}$ follows an $\mathcal{E}(\lambda =1.5)$,
${X}_{2}$ follows a $\mathcal{N}(\mu =4,\mathrm{Variance}=1)$.
The PDF and CDF of this distribution are drawn in Fig. 27 and Fig. 28.
Probability density function of a Random Mixture.

Cumulative density function of a Random Mixture.

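The example $Y=2+5{X}_{1}+{X}_{2}$ can also be simulated directly, which gives a quick sanity check of its mean $\mathbb{E}[Y]=2+5/1.5+4\approx 9.33$. A Monte Carlo sketch in plain Python (illustrative only; OpenTURNS instead evaluates the exact distribution, as described in the Random Mixture section below):

```python
import random

random.seed(42)

def sample_mixture():
    # Y = 2 + 5*X1 + X2 with X1 ~ Exponential(lambda=1.5), X2 ~ Normal(mu=4, var=1)
    x1 = random.expovariate(1.5)
    x2 = random.gauss(4.0, 1.0)
    return 2.0 + 5.0 * x1 + x2

n = 100000
mean = sum(sample_mixture() for _ in range(n)) / n
# E[Y] = 2 + 5*(1/1.5) + 4 = 9.333...
```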
Rice distribution: Univariate distribution. $\underline{\theta}=\left(\sigma ,\nu \right)$, with the constraint $\nu \ge 0$ and $\sigma >0$. The probability density is given as:
$${f}_{X}(x;\underline{\theta})=2\frac{x}{{\sigma}^{2}}{p}_{{\chi}^{2}(2,\frac{{\nu}^{2}}{{\sigma}^{2}})}\left(\frac{{x}^{2}}{{\sigma}^{2}}\right)$$  (62) 
where ${p}_{{\chi}^{2}(\nu ,\lambda )}$ is the probability density function of a Non Central Chi Square distribution.
Probability density function of a Rice distribution.

Cumulative density function of a Rice distribution.

Rayleigh distribution: Univariate distribution. $\underline{\theta}=\left(\sigma ,\gamma \right)$, with the constraint $\sigma >0$. The probability density is given as:
$${f}_{X}(x;\underline{\theta})=\frac{(x-\gamma )}{{\sigma}^{2}}{e}^{-\frac{{(x-\gamma )}^{2}}{2{\sigma}^{2}}}{\mathbf{1}}_{[\gamma ,+\infty [}\left(x\right)$$  (63) 
PDF of a Rayleigh distribution. 
Student distribution: Multivariate $d$-dimensional distribution. $\underline{\theta}=\left(\nu ,\underline{\mu},\underline{\sigma},\underline{\underline{R}}\right)$, with the constraint $\nu >2$. The Student distribution has the following probability density function, written in dimension $d$:
$${p}_{T}\left(\underline{x}\right)=\frac{\Gamma \left(\frac{\nu +d}{2}\right)}{{\left(\pi \nu \right)}^{\frac{d}{2}}\Gamma \left(\frac{\nu}{2}\right)}\frac{{\left|\mathrm{det}\left(\underline{\underline{R}}\right)\right|}^{-1/2}}{{\prod}_{k=1}^{d}{\sigma}_{k}}{\left(1+\frac{{\underline{z}}^{t}{\underline{\underline{R}}}^{-1}\underline{z}}{\nu}\right)}^{-\frac{\nu +d}{2}}$$  (64) 
where $\underline{z}={\underline{\underline{\Delta}}}^{-1}\left(\underline{x}-\underline{\mu}\right)$ with $\underline{\underline{\Delta}}=\underline{\underline{\mathrm{diag}}}\left(\underline{\sigma}\right)$.
In dimension $d=1$, $\underline{\theta}=(\nu ,\mu ,\sigma )$ and the distribution writes :
$${p}_{T}\left(x\right)=\frac{\Gamma \left(\frac{\nu +1}{2}\right)}{\sqrt{\nu \pi}\,\Gamma \left(\frac{\nu}{2}\right)}\frac{1}{\sigma}{\left(1+\frac{{(x-\mu )}^{2}}{\nu {\sigma}^{2}}\right)}^{-\frac{\nu +1}{2}}$$  (65) 
The parameter $\mu $ describes the most likely value. $\nu $ is a measure of dispersion: the probability density function flattens as $\nu $ decreases.
PDF of a Student distribution. 
Trapezoidal distribution: $\underline{\theta}=\left(a,b,c,d\right)$, with the constraint $a\le b<c\le d$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\left\{\begin{array}{cc}{\displaystyle h\frac{xa}{ba}}\hfill & \mathrm{if}\phantom{\rule{4pt}{0ex}}a\le x\le b\hfill \\ {\displaystyle h}\hfill & \mathrm{if}\phantom{\rule{4pt}{0ex}}b\le x\le c\hfill \\ {\displaystyle h\frac{dx}{dc}}\hfill & \mathrm{if}\phantom{\rule{4pt}{0ex}}c\le x\le d\hfill \\ 0\hfill & \mathrm{otherwise}\hfill \end{array}\right.$$  (66) 
with $h=\frac{2}{d+c-a-b}$.
The support is $[a,d]$.
Triangular distribution: Univariate distribution. $\underline{\theta}=\left(a,b,m\right)$, with the constraints $a\le m$, $m\le b$, $b>a$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\left\{\begin{array}{cc}{\displaystyle 2\frac{xa}{(ma)(ba)}}\hfill & \mathrm{if}\phantom{\rule{4pt}{0ex}}a\le x\le m\hfill \\ {\displaystyle 2\frac{bx}{(bm)(ba)}}\hfill & \mathrm{if}\phantom{\rule{4pt}{0ex}}m\le x\le b\hfill \\ 0\hfill & \mathrm{otherwise}\hfill \end{array}\right.$$  (67) 
The support is $[a,b]$. $m$ describes the most likely value.
PDF of a Triangular distribution. 
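Equation (67) can be sketched in plain Python (illustrative, not the OpenTURNS API); at the mode $m$ the density reaches its maximum $2/(b-a)$:

```python
def triangular_pdf(x, a, m, b):
    # Eq. (67): linear rise on [a, m], linear fall on [m, b], zero elsewhere
    if a <= x <= m and m > a:
        return 2.0 * (x - a) / ((m - a) * (b - a))
    if m < x <= b:
        return 2.0 * (b - x) / ((b - m) * (b - a))
    return 0.0

peak = triangular_pdf(0.5, 0.0, 0.5, 1.0)   # 2/(b - a) = 2.0
```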
Truncated Normal Distribution: Univariate distribution. $\underline{\theta}=\left({\mu}_{n},{\sigma}_{n},a,b\right)$, with the constraints ${\sigma}_{n}>0$, $b>a$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{\varphi \left(\frac{x{\mu}_{n}}{{\sigma}_{n}}\right)/{\sigma}_{n}}{\Phi \left(\frac{b{\mu}_{n}}{{\sigma}_{n}}\right)\Phi \left(\frac{a{\mu}_{n}}{{\sigma}_{n}}\right)}{\mathbf{1}}_{a\le x\le b}$$  (68) 
where $\varphi $ and $\Phi $ represent the probability density function and the cumulative distribution function respectively of the standard Normal distribution (i.e. with mean $\mu $ zero and standard deviation $\sigma $ equal to 1). The support is $[a,b]$. ${\mu}_{n}$ describes the most likely value, whilst ${\sigma}_{n}$ provides a measure of dispersion: the probability density function flattens as ${\sigma}_{n}$ increases (and the probability density is zero for values outside the interval $[a,b]$).
PDF of a TruncatedNormal distribution.
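Equation (68) only requires the standard normal PDF and CDF, the latter being available through `math.erf`. A plain-Python sketch (illustrative, not the OpenTURNS API), with a crude numerical check that the density integrates to 1 over $[a,b]$:

```python
import math

def std_normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def truncated_normal_pdf(x, mu, sigma, a, b):
    # Eq. (68): normal density renormalized over [a, b]
    if not a <= x <= b:
        return 0.0
    num = std_normal_pdf((x - mu) / sigma) / sigma
    den = std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)
    return num / den

# Midpoint-rule integration over [a, b] should give ~1
a, b, n = -1.0, 2.0, 10000
h = (b - a) / n
total = sum(truncated_normal_pdf(a + (i + 0.5) * h, 0.0, 1.0, a, b) * h
            for i in range(n))
```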
Uniform distribution: Univariate distribution. $\underline{\theta}=\left(a,b\right)$, with the constraint $a<b$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{1}{ba}{\mathbf{1}}_{a\le x\le b}$$  (69) 
The support is $[a,b]$. All values in this interval are equally likely.
PDF of a Uniform distribution. 
Weibull distribution: Univariate distribution. $\underline{\theta}=\left(\alpha ,\beta ,\gamma \right)$, with the constraints $\alpha >0$, $\beta >0$. The probability density function writes:
$${f}_{X}(x;\underline{\theta})=\frac{\beta}{\alpha}{\left(\frac{x-\gamma}{\alpha}\right)}^{\beta -1}\exp\left(-{\left(\frac{x-\gamma}{\alpha}\right)}^{\beta}\right){\mathbf{1}}_{\gamma \le x}$$  (70) 
The support is $[\gamma ,+\infty [$ and the distribution is right skewed. Both $\alpha $ and $\beta $ influence the dispersion. Note that the distribution becomes more skewed as $\beta $ decreases. In the case $\beta =1$, this corresponds to the Exponential distribution.
PDF of a Weibull distribution.
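Equation (70) in plain Python (illustrative, not the OpenTURNS API); setting $\beta =1$ and $\gamma =0$ recovers the Exponential density with rate $1/\alpha$:

```python
import math

def weibull_pdf(x, alpha, beta, gamma=0.0):
    # Eq. (70): (beta/alpha) * z^(beta-1) * exp(-z^beta) with z = (x - gamma)/alpha
    if x < gamma:
        return 0.0
    z = (x - gamma) / alpha
    return (beta / alpha) * z**(beta - 1) * math.exp(-(z**beta))

# beta = 1 reduces to the Exponential density: (1/alpha) * exp(-x/alpha)
val = weibull_pdf(2.0, 2.0, 1.0)   # 0.5 * exp(-1)
```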
Wishart distribution: Multivariate distribution. $\underline{\theta}=\left(\underline{\underline{V}},\nu \right)$ where $\underline{\underline{V}}$ is a symmetric positive definite matrix of dimension $p$ and $\nu >p-1$. The probability density function writes:
$${f}_{\underline{X}}(\underline{x};\underline{\theta})=\frac{{\left|m\left(\underline{x}\right)\right|}^{\frac{\nu -p-1}{2}}\,{e}^{-\frac{1}{2}\mathrm{tr}\left({\underline{\underline{V}}}^{-1}m\left(\underline{x}\right)\right)}}{{2}^{\frac{\nu p}{2}}{\left|\underline{\underline{V}}\right|}^{\frac{\nu}{2}}{\Gamma}_{p}\left(\frac{\nu}{2}\right)}{\mathbf{1}}_{{\mathcal{M}}_{p}^{+}\left(\mathbb{R}\right)}\left(m\left(\underline{x}\right)\right)$$  (71) 
where $\underline{x}\in {\mathbb{R}}^{\frac{p(p+1)}{2}}$, ${\mathcal{M}}_{p}^{+}\left(\mathbb{R}\right)$ is the set of symmetric positive matrices of dimension $p$ and $m:{\mathbb{R}}^{\frac{p(p+1)}{2}}\to {\mathcal{M}}_{p}^{+}\left(\mathbb{R}\right)$ is given by:
$$\begin{array}{c}\hfill m\left(\underline{x}\right)=\left(\begin{array}{cccc}{x}_{1}& {x}_{2}& \cdots & {x}_{1+p(p-1)/2}\\ {x}_{2}& {x}_{3}& & \vdots \\ \vdots & & \ddots & \vdots \\ {x}_{1+p(p-1)/2}& \cdots & \cdots & {x}_{p(p+1)/2}\end{array}\right)\end{array}$$  (72) 
OpenTURNS also proposes some Discrete Distributions.
Bernoulli distribution: Univariate distribution. $\underline{\theta}=p$, with the constraint $0<p<1$. The Bernoulli distribution takes only 2 values : 0 and 1.
$$\mathbb{P}\left(X=1\right)=p,\phantom{\rule{0.166667em}{0ex}}\mathbb{P}\left(X=0\right)=1-p$$  (73) 
Distribution of a Bernoulli distribution.

CDF of a Bernoulli distribution.

Binomial distribution: Univariate distribution. $\underline{\theta}=(n,p)$, with the constraints $0<p<1$ and $n\in \mathbb{N}$. The Binomial distribution values are the integers between 0 and $n$.
$$P(X=k)={C}_{n}^{k}{p}^{k}{(1-p)}^{n-k}$$  (74) 
where
$$\begin{array}{c}{k}^{\phantom{(}}\in \{0,\cdots ,n\}\hfill \\ n\in \mathbb{N}\hfill \\ p\in [0,1]\hfill \end{array}$$  (75) 
Distribution of a Binomial distribution.

CDF of a Binomial distribution.

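Equation (74) uses the binomial coefficient ${C}_{n}^{k}$, available in Python as `math.comb`. A plain-Python sketch (illustrative, not the OpenTURNS API):

```python
import math

def binomial_pmf(k, n, p):
    # Eq. (74): C(n, k) * p^k * (1-p)^(n-k)
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

# The probabilities over k = 0..n sum to 1
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
```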
Dirac distribution: Multivariate distribution. $\underline{\theta}=point$. The Dirac distribution takes only one value : $point\in {\mathbb{R}}^{n}$.
$$\mathbb{P}\left(\underline{X}=point\right)=1$$  (76) 
Distribution of a Dirac distribution.

CDF of a Dirac distribution.

Geometric distribution: Univariate distribution. $\underline{\theta}=p$, with the constraint $0<p<1$. For all natural numbers $k\in {\mathbb{N}}^{*}$,
$$\mathbb{P}\left(X=k;\underline{\theta}\right)=p{\left(1-p\right)}^{k-1}$$  (77) 
The support is ${\mathbb{N}}^{*}$.
Distribution of a Geometric distribution.

CDF of a Geometric distribution.

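Equation (77) in plain Python (illustrative, not the OpenTURNS API); the probabilities over $k\ge 1$ sum to 1 and the mean equals $1/p$:

```python
def geometric_pmf(k, p):
    # Eq. (77): P(X = k) = p * (1-p)^(k-1) for k = 1, 2, ...
    return p * (1.0 - p)**(k - 1)

p = 0.25
# Truncating the series at k = 499 leaves a negligible tail
total = sum(geometric_pmf(k, p) for k in range(1, 500))
mean = sum(k * geometric_pmf(k, p) for k in range(1, 500))   # ~1/p = 4
```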
KPermutationsDistribution distribution: Multivariate $d$-dimensional distribution. $\underline{\theta}=(k,n)$, with the constraints $n\ge 1$ and $k\ge 1$. The KPermutationsDistribution is the discrete uniform distribution on the set of injective functions $({i}_{0},\cdots ,{i}_{k-1})$ from $\{0,\cdots ,k-1\}$ into $\{0,\cdots ,n-1\}$:
$$P(\underline{X}=({i}_{0},\cdots ,{i}_{k1}))=\frac{1}{d}$$  (78) 
where $d={A}_{n}^{k}=\frac{n!}{(n-k)!}$.
Multinomial distribution: Multivariate $n$-dimensional distribution. $\underline{\theta}=({\left({p}_{k}\right)}_{1\le k\le n},N)$, with the constraints $0<{p}_{k}<1$. For all $\underline{x}=({x}_{1},\cdots ,{x}_{n})\in {\mathbb{N}}^{n}$,
$$P(\underline{X}=\underline{x})=\frac{N!}{{x}_{1}!\cdots {x}_{n}!(N-s)!}{p}_{1}^{{x}_{1}}\cdots {p}_{n}^{{x}_{n}}{(1-q)}^{N-s}$$  (79) 
where
$$\begin{array}{c}0\le {p}_{i}\le 1\hfill \\ {x}_{i}\in \mathbb{N}\hfill \\ {\displaystyle q=\sum _{k=1}^{n}{p}_{k}\le 1}\hfill \\ {\displaystyle s=\sum _{k=1}^{n}{x}_{k}\le N}\hfill \end{array}$$  (80) 
In dimension $n=1$, this definition corresponds to the Binomial distribution.
Negative Binomial distribution: Univariate distribution. $\underline{\theta}=(r,p)$, with the constraints $0<p<1$ and $r>0$. The Negative Binomial distribution values are the non-negative integers $0,1,\cdots $
$$P(X=k)=\frac{\Gamma (k+r)}{\Gamma \left(r\right)\Gamma (k+1)}{p}^{k}{(1-p)}^{r}$$  (81) 
where $k\in \mathbb{N}$.
Distribution of a Negative Binomial distribution.

CDF of a Negative Binomial distribution.

Poisson distribution: Univariate distribution. $\underline{\theta}=\lambda $, with the constraint $\lambda >0$. For all $k\in \mathbb{N}$,
$$\mathbb{P}\left(X=k;\underline{\theta}\right)=\frac{{\lambda}^{k}}{k!}\exp\left(-\lambda \right)$$  (82) 
The support is $\mathbb{N}$.
Distribution of a Poisson distribution.

CDF of a Poisson distribution.

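Equation (82) in plain Python (illustrative, not the OpenTURNS API):

```python
import math

def poisson_pmf(k, lam):
    # Eq. (82): P(X = k) = lam^k / k! * exp(-lam)
    return lam**k / math.factorial(k) * math.exp(-lam)

# The probabilities sum to 1 and the mean equals lambda
lam = 3.0
total = sum(poisson_pmf(k, lam) for k in range(100))
mean = sum(k * poisson_pmf(k, lam) for k in range(100))
```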
Skellam distribution: Univariate distribution. $\underline{\theta}=({\lambda}_{1},{\lambda}_{2})$, with the constraints ${\lambda}_{i}>0$. The Skellam distribution takes its values in $\mathbb{Z}$. It is the distribution of ${X}_{1}-{X}_{2}$ for $({X}_{1},{X}_{2})$ independent and respectively distributed according to $Poisson\left({\lambda}_{1}\right)$ and $Poisson\left({\lambda}_{2}\right)$. The probability distribution function is:
$$\forall k\in \mathbb{Z},\phantom{\rule{1.em}{0ex}}\mathbb{P}\left(X=k\right)=2\mathbb{P}\left(Y=2{\lambda}_{1}\right)$$  (83) 
where $Y$ is distributed according to the non central chi-square distribution ${\chi}_{\nu ,\delta}^{2}$, with $\nu =2(k+1)$ and $\delta =2{\lambda}_{2}$.
Distribution of a Skellam distribution.

CDF of a Skellam distribution.

UserDefined: Multivariate $n$-dimensional distribution. $\underline{\theta}={({\underline{x}}_{k},{p}_{k})}_{1\le k\le N}$, where $0\le {p}_{k}\le 1$ and $\sum _{k=1}^{N}{p}_{k}=1$.
$$P(\underline{X}={\underline{x}}_{k})={p}_{k},\phantom{\rule{1.em}{0ex}}1\le k\le N$$  (84) 
The support is $\{{\underline{x}}_{1},\cdots ,{\underline{x}}_{N}\}$.
Distribution of a UserDefined distribution.

CDF of a UserDefined distribution.

ZipfMandelbrot distribution: Univariate distribution. $\underline{\theta}=\left(N,q,s\right)$, with the constraints $N\ge 1$, $q\ge 0$ and $s>0$. For all integers $k\in [1,N]$,
$$\forall k\in [1,N],P(X=k)=\frac{1}{{(k+q)}^{s}}\frac{1}{H(N,q,s)}$$  (85) 
where $H(N,q,s)$ is the Generalized Harmonic Number : $H(N,q,s)=\sum _{i=1}^{N}\frac{1}{{(i+q)}^{s}}$.
Distribution of a ZipfMandelbrot distribution.

CDF of a ZipfMandelbrot distribution.

Standard representative of distributions
OpenTURNS associates to each distribution a standard representative, corresponding to a specific set of its parameters. The following tables detail this specific set of parameters and give the expression of the non-centered moments of order $n$.
Other notations
Link with OpenTURNS methodology
Moreover, a parametric approach is often preferable when the uncertainty study criterion defined in step A deals with a rare event: obtaining a precise evaluation of the necessary criteria generally necessitates the extrapolation of $X$ values beyond the observed data. Beware, however: an unwise modelling assumption (a bad choice of distribution) can lead to an erroneous extrapolation, and thus the results of the study may be false!
The correct choice of probability distribution is thus crucial. Statistical tools are available to validate or invalidate the choice of distribution given a set of data (see for example [Graphical analysis] [KolmogorovSmirnov test] ). But consideration of the underlying context is also recommended. For example:
the Normal distribution is relevant in metrology to represent certain measurement uncertainties,
the Exponential distribution is useful for modelling uncertainty in the life duration of material that is not subject to ageing,
the Gumbel distribution is designed to describe extreme phenomena (e.g. the maximal annual flow of a river or extreme wind speeds).
Some distributions are often used to express expert judgement in simple terms:
the Uniform distribution expresses knowledge concerning the absolute limits of variables (i.e. the probability to exceed these limits is strictly zero) without any other prior assumption about the distribution (such as, for example the mean value or the most likely value),
the Triangular distribution expresses knowledge concerning the absolute limits of variables and the most likely value.
Finally, an important point concerns the multidimensional case where ${n}_{X}>1$. Choosing the type of distribution implies an assumption about the uncertainty of each of the variables ${X}^{i}$, but also about the potential interdependencies between variables. These interdependencies between unknown variables can consequently have an impact on the results of the uncertainty study.
Readers wishing to consider the dependencies in their study more deeply are referred to, for example, [copula method] , [linear correlation] , [rank correlation] .
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983). "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Mathematical description
To define the joint probability density function of the random input vector $\underline{X}$ by composition, one needs:
the specification of the copula of interest $C$ with its parameters,
the specification of the ${n}_{X}$ marginal laws of interest ${F}_{{X}_{i}}$ of the ${n}_{X}$ input variables ${X}_{i}$.
The joint cumulative distribution function is therefore defined by:
$$\begin{array}{c}\hfill \mathbb{P}\left({X}^{1}\le {x}^{1},{X}^{2}\le {x}^{2},\cdots ,{X}^{{n}_{X}}\le {x}^{{n}_{X}}\right)=C\left({F}_{{X}^{1}}\left({x}^{1}\right),{F}_{{X}^{2}}\left({x}^{2}\right),\cdots ,{F}_{{X}^{{n}_{X}}}\left({x}^{{n}_{X}}\right)\right)\end{array}$$ 
Within this part, we define the concept of copula and its use within OpenTURNS.
Principles
Copulas represent the part of the joint cumulative distribution function that is not described by the marginal distributions: they encode the dependence structure of the input variables. A copula is a special cumulative distribution function defined on ${[0,1]}^{{n}_{X}}$ whose marginal distributions are uniform on $[0,1]$. The choice of the dependence structure is thus disconnected from the choice of the marginal distributions.
Basic properties of copulas
A copula is an ${n}_{U}$-dimensional cumulative distribution function on ${[0,1]}^{{n}_{U}}$ with uniform marginals:
$C\left(\underline{u}\right)\ge 0$, $\forall \underline{u}\in {[0,1]}^{{n}_{U}}$
$C\left(\underline{u}\right)={u}_{i}$, $\forall \underline{u}=(1,...,1,{u}_{i},1,...,1)$
For all $N$-boxes $\mathcal{B}=[{a}_{1},{b}_{1}]\times \cdots \times [{a}_{{n}_{U}},{b}_{{n}_{U}}]\subseteq {[0,1]}^{{n}_{U}}$, we have ${\mathcal{V}}_{C}\left(\mathcal{B}\right)\ge 0$, where:
${\mathcal{V}}_{C}\left(\mathcal{B}\right)={\sum}_{i=1,\cdots ,{2}^{{n}_{U}}}sign\left({\underline{v}}_{i}\right)\times C\left({\underline{v}}_{i}\right)$, the summation being made over the ${2}^{{n}_{U}}$ vertices $\underline{{v}_{i}}$ of $\mathcal{B}$.
$sign\left({\underline{v}}_{i}\right)=+1$ if ${v}_{i}^{k}={a}_{k}$ for an even number of ${k}^{\text{'}}s$, $sign\left({\underline{v}}_{i}\right)=-1$ otherwise.
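In dimension 2 the $N$-box condition reduces to four terms, and can be checked numerically. A plain-Python sketch using the independent copula (illustrative, not the OpenTURNS API):

```python
def independent_copula(u1, u2):
    # C(u1, u2) = u1 * u2
    return u1 * u2

def box_volume(C, a1, b1, a2, b2):
    # V_C(B) = C(b1,b2) - C(a1,b2) - C(b1,a2) + C(a1,a2) for B = [a1,b1] x [a2,b2]
    return C(b1, b2) - C(a1, b2) - C(b1, a2) + C(a1, a2)

# For the independent copula this is the area of the box: (b1-a1)*(b2-a2)
v = box_volume(independent_copula, 0.2, 0.5, 0.3, 0.9)
```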
Copulas available within OpenTURNS
Different copulas are available within OpenTURNS:
Ali-Mikhail-Haq Copula: The Ali-Mikhail-Haq copula is archimedean, parameterized by a scalar $\theta \ge 0$. The Ali-Mikhail-Haq copula is thus defined by:
$$\begin{array}{c}\hfill {\displaystyle C({u}_{1},{u}_{2})={\displaystyle \frac{{u}_{1}{u}_{2}}{1-\theta (1-{u}_{1})(1-{u}_{2})}}}\end{array}$$ 
IsoPDF of a AliMikhailHaq copula. 
Clayton Copula: The Clayton copula is parameterized by a scalar $\theta \ge 0$ . The Clayton copula is thus defined by:
$$\begin{array}{c}\hfill {\displaystyle C({u}_{1},{u}_{2})={\left({u}_{1}^{-\theta}+{u}_{2}^{-\theta}-1\right)}^{-1/\theta}}\end{array}$$ 
IsoPDF of a Clayton copula. 
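The Clayton copula formula and its boundary behaviour, $C(1,u)=u$ (uniform marginals), can be sketched in plain Python (illustrative, not the OpenTURNS API):

```python
def clayton_copula(u1, u2, theta):
    # C(u1, u2) = (u1^-theta + u2^-theta - 1)^(-1/theta), theta > 0
    return (u1**(-theta) + u2**(-theta) - 1.0)**(-1.0 / theta)

# Uniform marginals: C(1, u) = u
val = clayton_copula(1.0, 0.7, 2.0)
```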
Composed Copula: A copula may be defined as the product of other copulas: if ${C}_{1}$ and ${C}_{2}$ are the copulas of random vectors in ${\mathbb{R}}^{{n}_{1}}$ and ${\mathbb{R}}^{{n}_{2}}$ respectively, the copula of a random vector of ${\mathbb{R}}^{{n}_{1}+{n}_{2}}$, noted $C$, can be built as follows:
$$\begin{array}{c}\hfill C({u}_{1},\cdots ,{u}_{n})={C}_{1}({u}_{1},\cdots ,{u}_{{n}_{1}}){C}_{2}({u}_{{n}_{1}+1},\cdots ,{u}_{{n}_{1}+{n}_{2}})\end{array}$$ 
It means that both subvectors $({u}_{1},\cdots ,{u}_{{n}_{1}})$ and $({u}_{{n}_{1}+1},\cdots ,{u}_{{n}_{1}+{n}_{2}})$ of ${\mathbb{R}}^{{n}_{1}}$ and ${\mathbb{R}}^{{n}_{2}}$ are independent.
FarlieGumbelMorgenstern Copula: The FarlieGumbelMorgenstern copula is parameterized by a scalar $\theta \in [1,1]$ . The FarlieGumbelMorgenstern copula is thus defined by:
$$\begin{array}{c}\hfill {\displaystyle C({u}_{1},{u}_{2})={u}_{1}{u}_{2}(1+\theta (1-{u}_{1})(1-{u}_{2}))}\end{array}$$ 
IsoPDF of a FarlieGumbelMorgenstern copula. 
Frank Copula: The Frank copula is parameterized by a scalar $\theta \ne 0$ . The Frank copula is thus defined by:
$$\begin{array}{c}\hfill {\displaystyle C({u}_{1},{u}_{2})=-\frac{1}{\theta}\mathrm{log}\left(1+\frac{({e}^{-\theta {u}_{1}}-1)({e}^{-\theta {u}_{2}}-1)}{{e}^{-\theta}-1}\right)}\end{array}$$ 
IsoPDF of a Frank copula. 
Gumbel Copula: The Gumbel copula is parameterized by a scalar $\theta \ge 1$. The Gumbel copula is thus defined by:
$$\begin{array}{c}\hfill {\displaystyle C({u}_{1},{u}_{2})=\exp\left(-{\left({(-\mathrm{log}\left({u}_{1}\right))}^{\theta}+{(-\mathrm{log}\left({u}_{2}\right))}^{\theta}\right)}^{1/\theta}\right)}\end{array}$$ 
IsoPDF of a Gumbel copula. 
Independent Copula: It means that all the input variables are independent of one another. The independent copula is defined by:
$$\begin{array}{c}\hfill C({u}_{1},{u}_{2},\cdots ,{u}_{{n}_{\phantom{\rule{0.222222em}{0ex}}U}})=\prod _{i=1}^{{n}_{\phantom{\rule{0.222222em}{0ex}}U}}{u}_{i}\end{array}$$ 
IsoPDF of an Independent copula. 
Maximumentropy statistics copula: The density function is defined by:
$$\begin{array}{c}\hfill {f}_{U}\left(u\right)=\prod _{k=2}^{d}\frac{\exp\left(-{\int}_{{\partial}_{k-1}^{-1}\left({u}_{k-1}\right)}^{{\partial}_{k}^{-1}\left({u}_{k}\right)}{\phi}_{k}\left(s\right)ds\right)}{{\partial}_{k-1}\left({\partial}_{k}^{-1}\left({u}_{k}\right)\right)-{u}_{k}}{\mathbf{1}}_{{F}_{1}^{-1}\left({u}_{1}\right)\le \cdots \le {F}_{d}^{-1}\left({u}_{d}\right)}\end{array}$$ 
$$\begin{array}{c}\hfill \text{with}\phantom{\rule{4.pt}{0ex}}{\partial}_{k}\left(t\right)={F}_{k}\left({G}^{1}\left(t\right)\right)\phantom{\rule{4.pt}{0ex}}\text{and}\phantom{\rule{4.pt}{0ex}}G\left(t\right)=\frac{1}{t}\sum _{k=1}^{d}{F}_{k}\left(t\right)\end{array}$$ 
Min Copula: The Min copula is the upper FréchetHoeffding bound defined by:
$$\begin{array}{c}\hfill C({u}_{1},\cdots ,{u}_{n})=min({u}_{1},\cdots ,{u}_{n})\end{array}$$ 
Normal Copula: The Normal copula is parameterized by a correlation matrix $\mathbf{R}$. The Normal copula is thus defined by:
$$\begin{array}{c}\hfill C({u}_{1},\cdots ,{u}_{n})={\Phi}_{\mathbf{R}}^{{n}_{\phantom{\rule{0.222222em}{0ex}}U}}\left({\Phi}^{1}\left({u}_{1}\right),{\Phi}^{1}\left({u}_{2}\right),\cdots ,{\Phi}^{1}\left({u}_{{n}_{\phantom{\rule{0.222222em}{0ex}}U}}\right)\right)\end{array}$$ 
where:
${\Phi}_{\mathbf{R}}^{{n}_{X}}$ is the multinormal cumulative density function in dimension ${n}_{X}$:
$$\begin{array}{c}\hfill {\Phi}_{\mathbf{R}}^{{n}_{X}}\left(\underline{x}\right)={\int}_{-\infty}^{{x}_{1}}...{\int}_{-\infty}^{{x}_{{n}_{X}}}\frac{1}{{\left(2\pi \right)}^{\frac{{n}_{X}}{2}}{\left(\mathrm{det}\,\mathbf{R}\right)}^{\frac{1}{2}}}\phantom{\rule{0.222222em}{0ex}}{e}^{-\frac{{}^{t}\underline{u}.{\mathbf{R}}^{-1}.\underline{u}}{2}}\phantom{\rule{0.222222em}{0ex}}d{u}_{1}...d{u}_{{n}_{X}}\end{array}$$ 
$\Phi $ is the cumulative distribution function of the normal law in dimension 1:
$$\begin{array}{c}\hfill \Phi \left(x\right)={\int}_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}\phantom{\rule{0.222222em}{0ex}}{e}^{-\frac{{t}^{2}}{2}}\phantom{\rule{0.222222em}{0ex}}dt\end{array}$$ 
$\mathbf{R}$ is the correlation matrix. This matrix is defined by its algebraic properties: it is symmetric positive definite.
The correlation matrix $\mathbf{R}$ can be obtained by different means:
If one knows the Spearman correlation matrix, that is to say,
$$\begin{array}{c}\hfill {\rho}_{ij}^{S}={\rho}^{S}({X}_{i},{X}_{j})={\rho}^{P}({F}_{{X}_{i}}\left({X}_{i}\right),{F}_{{X}_{j}}\left({X}_{j}\right))\end{array}$$ 
the correlation matrix $\mathbf{R}$ is deduced by the following formula:
$$\begin{array}{c}\hfill {\mathbf{R}}_{ij}=2sin\left(\frac{\pi}{6}{\rho}_{ij}^{S}\right)\end{array}$$ 
If one knows the Kendall measure of correlation, that is to say,
$$\begin{array}{c}\hfill {\tau}_{ij}=\tau ({X}_{i},{X}_{j})=\mathbb{P}\left(({X}_{{i}_{1}}-{X}_{{i}_{2}}).({X}_{{j}_{1}}-{X}_{{j}_{2}})>0\right)-\mathbb{P}\left(({X}_{{i}_{1}}-{X}_{{i}_{2}}).({X}_{{j}_{1}}-{X}_{{j}_{2}})<0\right)\end{array}$$ 
where $({X}_{{i}_{1}},{X}_{{j}_{1}})$ and $({X}_{{i}_{2}},{X}_{{j}_{2}})$ follow the law of $({X}_{i},{X}_{j})$, the correlation matrix $\mathbf{R}$ is deduced by the following formula:
$$\begin{array}{c}\hfill {\mathbf{R}}_{ij}=sin(\frac{\pi}{2}.{\tau}_{ij})\end{array}$$ 
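The two conversion formulas above are one-liners. A plain-Python sketch (illustrative, not the OpenTURNS API):

```python
import math

def pearson_from_spearman(rho_s):
    # R_ij = 2 * sin(pi/6 * rho_s)
    return 2.0 * math.sin(math.pi / 6.0 * rho_s)

def pearson_from_kendall(tau):
    # R_ij = sin(pi/2 * tau)
    return math.sin(math.pi / 2.0 * tau)

# Both maps fix the extreme values -1, 0, +1
r1 = pearson_from_spearman(1.0)    # 2*sin(pi/6) = 1
r2 = pearson_from_kendall(-1.0)    # sin(-pi/2) = -1
```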
If one knows the Pearson correlation Matrix ${\mathbf{R}}^{P}$, there are two possibilities:
If and only if all the marginal laws are Normal,
$$\begin{array}{c}\hfill \mathbf{R}\equiv {\mathbf{R}}^{P}\end{array}$$ 
In the other cases, one has to build the correlation matrix $\mathbf{R}$ by inversion of the following formula from the Pearson Correlation Matrix ${\mathbf{R}}^{P}$:
$$\begin{array}{c}\hfill {\mathbf{R}}_{ij}^{P}={\int \int}_{{\mathbb{R}}^{2}}({x}^{i}-\mathbb{E}\left[{X}^{i}\right])({x}^{j}-\mathbb{E}\left[{X}^{j}\right]){\Phi}_{ij}({x}^{i},{x}^{j},{\mathbf{R}}_{ij})\,d{x}^{i}d{x}^{j}\end{array}$$ 
IsoPDF of a Normal copula. 
Sklar Copula: The Sklar copula is obtained directly from the expression of the $n$-dimensional distribution whose cumulative distribution function is $F$, with ${F}_{i}$ its marginals:
$$\begin{array}{c}\hfill C({u}_{1},\cdots ,{u}_{n})=F({F}_{1}^{1}\left({u}_{1}\right),\cdots ,{F}_{n}^{1}\left({u}_{n}\right))\end{array}$$ 
Figure 51 shows the isoPDF of a Sklar copula extracted from a bidimensional Student distribution.
IsoPDF of a Sklar copula. 
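As an illustration of the Sklar construction, the hedged sketch below evaluates $C(u_1,u_2)=F(F_1^{-1}(u_1),F_2^{-1}(u_2))$ for a bivariate Normal $F$ with standard Normal marginals (a simpler choice than the Student distribution of the figure).

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

# Hedged sketch of the Sklar copula for a bivariate Normal distribution.
rho = 0.5
F = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])

def sklar_copula_cdf(u1, u2):
    # C(u1, u2) = F(F1^{-1}(u1), F2^{-1}(u2)) with standard Normal marginals
    return F.cdf([norm.ppf(u1), norm.ppf(u2)])

c = sklar_copula_cdf(0.5, 0.5)
```

For this Gaussian case the value at $(0.5, 0.5)$ has a known closed form, $\tfrac14+\arcsin(\rho)/(2\pi)$, which gives $1/3$ for $\rho=0.5$ and can serve as a check.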
Other notations
Link with OpenTURNS methodology
References and theoretical basics
The following references provide a first entry point to copulas:
Nelsen, R.B., 'An Introduction to Copulas', Springer.
Embrechts P., Lindskog F., McNeil A., 'Modelling dependence with copulas and applications to Risk Management', ETHZ, 2001.
Examples
Mathematical description
A multivariate random variable $\underline{Y}$ may be defined as an affine transform of $n$ independent univariate random variables, as follows:
$$\underline{Y}={\underline{y}}_{0}+\underline{\underline{M}}\phantom{\rule{0.166667em}{0ex}}\underline{X}$$  (86) 
where ${\underline{y}}_{0}\in {\mathbb{R}}^{d}$ is a deterministic vector with $d\in \{1,2,3\}$, $\underline{\underline{M}}\in {\mathcal{M}}_{d,n}\left(\mathbb{R}\right)$ is a deterministic matrix and ${\left({X}_{k}\right)}_{1\le k\le n}$ are independent univariate random variables.
In such a case, it is possible to evaluate directly the distribution of $\underline{Y}$ and then to answer any request compatible with a distribution: moments, probability density and cumulative distribution functions, quantiles (in dimension 1 only), etc.
Principle
Evaluation of the probability density function of the Random Mixture
As the univariate random variables ${X}_{k}$ are independent, the characteristic function of $\underline{Y}$, denoted ${\phi}_{Y}$, is easily deduced from the characteristic functions of the ${X}_{k}$, denoted ${\phi}_{{X}_{k}}$, as follows:
$${\phi}_{Y}({u}_{1},\cdots ,{u}_{d})=\prod _{j=1}^{d}{e}^{i{u}_{j}{({y}_{0})}_{j}}\prod _{k=1}^{n}{\phi}_{{X}_{k}}\left({\left({M}^{t}u\right)}_{k}\right),\phantom{\rule{4.pt}{0ex}}\text{for}\phantom{\rule{4.pt}{0ex}}\underline{u}\in {\mathbb{R}}^{d}$$  (87) 
Once ${\phi}_{Y}$ is known, it is possible to evaluate the probability density function of $Y$, denoted ${p}_{Y}$. Several techniques are possible, such as the inversion of the Fourier transform, but this technique is not easy to implement.
OpenTURNS uses another technique, based on the Poisson sum formulation, defined as follows :
$$\sum _{{j}_{1}\in \mathbb{Z}}\cdots \sum _{{j}_{d}\in \mathbb{Z}}{p}_{Y}\left({y}_{1}+\frac{2\pi {j}_{1}}{{h}_{1}},\cdots ,{y}_{d}+\frac{2\pi {j}_{d}}{{h}_{d}}\right)=\prod _{j=1}^{d}\frac{{h}_{j}}{2\pi}\sum _{{k}_{1}\in \mathbb{Z}}\cdots \sum _{{k}_{d}\in \mathbb{Z}}\phi \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){e}^{-i\left({\sum}_{m=1}^{d}{k}_{m}{h}_{m}{y}_{m}\right)}$$  (88) 
If ${h}_{1},\cdots ,{h}_{d}$ are fixed small enough, then for $j\ne 0$ the shifted points $y+\frac{2\pi j}{h}$ lie far in the tails, where ${p}_{Y}$ is negligible because of its decreasing properties. Thus the nested sums on the left-hand side of (88) reduce to the central term ${j}_{1}=\cdots ={j}_{d}=0$: the left-hand side is approximately equal to ${p}_{Y}\left(y\right)$.
Furthermore, the right-hand side of (88) is a series which converges very fast: only a few terms of the series are enough to reach machine-precision accuracy. Let us note that the factors ${\phi}_{Y}({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d})$, which are expensive to evaluate, do not depend on $y$ and are evaluated only once.
$${p}_{Y}\left(y\right)=\sum _{j\in {\mathbb{Z}}^{d}}{q}_{Y}\left({y}_{1}+\frac{2\pi {j}_{1}}{{h}_{1}},\cdots ,{y}_{d}+\frac{2\pi {j}_{d}}{{h}_{d}}\right)+\frac{H}{{2}^{d}{\pi}^{d}}\sum _{|{k}_{1}|\le N}\cdots \sum _{|{k}_{d}|\le N}{\delta}_{Y}\left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){e}^{-i\left({\sum}_{m=1}^{d}{k}_{m}{h}_{m}{y}_{m}\right)}$$  (89) 
where $H={h}_{1}\times \cdots \times {h}_{d}$, $j=({j}_{1},\cdots ,{j}_{d})$ and ${\delta}_{Y}:={\phi}_{Y}-{\psi}_{Y}$, with ${\psi}_{Y}$ the characteristic function of the Normal approximation whose density is ${q}_{Y}$.
In the case where $n\gg 1$, by the central limit theorem, the distribution of $\underline{Y}$ tends to a Normal distribution of density $q$, which drastically reduces $N$. The sum on $q$ then becomes the most CPU-intensive part, because in the general case more terms than the central one must be kept in this sum, since the parameters ${h}_{1},\cdots ,{h}_{d}$ were calibrated with respect to $p$ and not $q$.
The parameters ${h}_{1},\cdots ,{h}_{d}$ are calibrated using the following formula:
$$\begin{array}{c}\hfill {h}_{\ell}=\frac{2\pi}{(\beta +4\alpha ){\sigma}_{\ell}}\end{array}$$  (90) 
where ${\sigma}_{\ell}=\sqrt{\mathrm{Cov}{\left[\underline{Y}\right]}_{\ell ,\ell}}$, $\alpha $ is the number of standard deviations covered by the marginal distribution ($\alpha =5$ by default) and $\beta $ is the number of standard deviations beyond which the density is negligible ($\beta =8.5$ by default).
The $N$ parameter is dynamically calibrated: we start with $N=8$ then we double $N$ value until the total contribution of the additional terms is negligible.
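In dimension 1 the approach above can be sketched in a few lines: recover the density of a standard Normal variable from its characteristic function $\phi(u)=e^{-u^2/2}$ via the truncated Poisson sum, with $h$ calibrated as in (90). This is an illustrative sketch with a fixed truncation index, not the OpenTURNS implementation (which calibrates $N$ adaptively and subtracts the Normal approximation).

```python
import numpy as np
from scipy.stats import norm

# Hedged 1-D sketch of the Poisson-sum inversion (88) for Y ~ N(0, 1).
phi = lambda u: np.exp(-0.5 * u**2)          # characteristic function of N(0,1)

alpha, beta, sigma = 5.0, 8.5, 1.0
h = 2.0 * np.pi / ((beta + 4.0 * alpha) * sigma)   # calibration (90)
K = 64                                             # fixed truncation index
k = np.arange(-K, K + 1)

def p_approx(y):
    # p_Y(y) ~ h/(2 pi) * sum_k phi(k h) exp(-i k h y)
    return (h / (2.0 * np.pi)) * np.real(np.sum(phi(k * h) * np.exp(-1j * k * h * y)))

err = max(abs(p_approx(y) - norm.pdf(y)) for y in np.linspace(-3.0, 3.0, 13))
```

With this $h$, the aliased copies sit at $y \pm 2\pi/h \approx y \pm 28.5$ where the Normal density is vanishingly small, so the reconstruction reaches near machine precision.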
Evaluation of the moments of the Random Mixture
The relation (86) enables the evaluation of all the moments of the random mixture, when they are mathematically defined. For example:
$$\begin{array}{c}\hfill \left\{\begin{array}{ccc}\mathbb{E}\left[\underline{Y}\right]\hfill & =& \underline{{y}_{0}}+\underline{\underline{M}}\mathbb{E}\left[\underline{X}\right]\hfill \\ \mathrm{Cov}\left[\underline{Y}\right]\hfill & =& \underline{\underline{M}}\phantom{\rule{0.166667em}{0ex}}\mathrm{Cov}\left[\underline{X}\right]{\underline{\underline{M}}}^{t}\hfill \end{array}\right.\end{array}$$ 
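These affine moment formulas can be checked with a few lines of numpy; the values below are arbitrary illustrative choices.

```python
import numpy as np

# Sketch: moments of Y = y0 + M X for independent X_k, directly from (86).
y0 = np.array([1.0, 2.0])
M = np.array([[1.0, 2.0],
              [0.0, 3.0]])
EX = np.array([0.5, -1.0])       # E[X]
CovX = np.diag([2.0, 3.0])       # independence -> diagonal covariance

EY = y0 + M @ EX                 # E[Y] = y0 + M E[X]
CovY = M @ CovX @ M.T            # Cov[Y] = M Cov[X] M^t
```

Note that independence of the $X_k$ makes $\mathrm{Cov}[\underline{X}]$ diagonal, but the formula itself holds for any covariance matrix.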
Computation on a regular grid
The interest here is to compute the density function on a regular grid, the purpose being to get an approximation quickly. The regular grid is of the form:
$$\begin{array}{c}\hfill \forall r\in \{1,\cdots ,d\},\forall m\in \{0,\cdots ,M-1\},\phantom{\rule{0.222222em}{0ex}}{y}_{r,m}={\mu}_{r}+b\left(\frac{2m+1}{M}-1\right){\sigma}_{r}\end{array}$$  (91) 
By denoting ${p}_{{m}_{1},\cdots ,{m}_{d}}={p}_{\underline{Y}}({y}_{1,{m}_{1}},\cdots ,{y}_{d,{m}_{d}})$:
$$\begin{array}{c}\hfill {p}_{{m}_{1},\cdots ,{m}_{d}}={Q}_{{m}_{1},\cdots ,{m}_{d}}+{S}_{{m}_{1},\cdots ,{m}_{d}}\end{array}$$  (92) 
for which the term ${S}_{{m}_{1},\cdots ,{m}_{d}}$ is the most CPU consuming. This term rewrites:
$$\begin{array}{cc}\hfill {S}_{{m}_{1},\cdots ,{m}_{d}}=& \frac{H}{{2}^{d}{\pi}^{d}}\sum _{{k}_{1}=-N}^{N}\cdots \sum _{{k}_{d}=-N}^{N}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})\hfill \end{array}$$  (3.3) 
with:
$$\begin{array}{cc}\hfill \delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right)& =(\phi -\psi )\left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right)\hfill \\ \hfill {E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})& ={e}^{-i{\sum}_{j=1}^{d}{k}_{j}{h}_{j}\left({\mu}_{j}+b\left(\frac{2{m}_{j}+1}{M}-1\right){\sigma}_{j}\right)}\hfill \end{array}$$  (3.3) 
The aim is to rewrite the previous expression as a $d$ discrete Fourier transform, in order to apply Fast Fourier Transform (FFT) for its evaluation.
We set $M=N$ and $\forall j\in \{1,\cdots ,d\},\phantom{\rule{0.222222em}{0ex}}{h}_{j}=\frac{\pi}{b{\sigma}_{j}}$ and ${\tau}_{j}=\frac{{\mu}_{j}}{b{\sigma}_{j}}$. For convenience, we introduce the functions:
$${f}_{j}\left(k\right)={e}^{-i\pi (k+1)\left({\tau}_{j}-1+\frac{1}{N}\right)}$$ 
We use $k+1$ instead of $k$ in this function to simplify expressions below.
We obtain:
$$\begin{array}{cc}\hfill {E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})& ={e}^{-i{\sum}_{j=1}^{d}{k}_{j}{h}_{j}b{\sigma}_{j}\left(\frac{{\mu}_{j}}{b{\sigma}_{j}}+\frac{2{m}_{j}}{N}+\frac{1}{N}-1\right)}\hfill \\ & ={e}^{-2i\pi \left(\frac{{\sum}_{j=1}^{d}{k}_{j}{m}_{j}}{N}\right)}{e}^{-i\pi {\sum}_{j=1}^{d}{k}_{j}\left({\tau}_{j}-1+\frac{1}{N}\right)}\hfill \\ & ={e}^{-2i\pi \left(\frac{{\sum}_{j=1}^{d}{k}_{j}{m}_{j}}{N}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \end{array}$$  (3.3) 
For performance reasons, we want to use the discrete Fourier transform with the following convention in dimension 1 :
$${A}_{m}=\sum _{k=0}^{N-1}{a}_{k}{e}^{-2i\pi \frac{km}{N}}$$ 
which extension to dimensions 2 and 3 are respectively :
$${A}_{m,n}=\sum _{k=0}^{N-1}\sum _{l=0}^{N-1}{a}_{k,l}{e}^{-2i\pi \frac{km}{N}}{e}^{-2i\pi \frac{ln}{N}}$$ 
$${A}_{m,n,p}=\sum _{k=0}^{N-1}\sum _{l=0}^{N-1}\sum _{s=0}^{N-1}{a}_{k,l,s}{e}^{-2i\pi \frac{km}{N}}{e}^{-2i\pi \frac{ln}{N}}{e}^{-2i\pi \frac{sp}{N}}$$ 
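As a sanity check, numpy's forward FFT follows this same convention $A_m=\sum_k a_k e^{-2i\pi km/N}$, which is why it can be used directly; the short sketch below verifies this against an explicit sum.

```python
import numpy as np

# Verify that np.fft.fft matches A_m = sum_k a_k exp(-2 i pi k m / N).
rng = np.random.default_rng(0)
N = 8
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)

k = np.arange(N)
A_direct = np.array([np.sum(a * np.exp(-2j * np.pi * k * m / N)) for m in range(N)])
ok = np.allclose(A_direct, np.fft.fft(a))
```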
We decompose the sums of (3.3) on the interval $[-N,N]$ into three parts:
$$\begin{array}{cc}\hfill \sum _{{k}_{j}=-N}^{N}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})=& \sum _{{k}_{j}=-N}^{-1}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})\hfill \\ & +\delta \left({k}_{1}{h}_{1},\cdots ,0,\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,0,\cdots ,{k}_{d})\hfill \\ & +\sum _{{k}_{j}=1}^{N}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})\hfill \end{array}$$  (3.3) 
If we have already computed $E$ for dimension $d-1$, then the middle term in this sum is trivial.
To compute the last sum of (3.3), we apply the change of variable ${k}_{j}^{\text{'}}={k}_{j}-1$:
$$\begin{array}{cc}\hfill \sum _{{k}_{j}=1}^{N}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})=& \sum _{{k}_{j}=0}^{N-1}\delta \left({k}_{1}{h}_{1},\cdots ,({k}_{j}+1){h}_{j},\cdots ,{k}_{d}{h}_{d}\right)\times \hfill \\ & \phantom{\rule{85.35826pt}{0ex}}{E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{j}+1,\cdots ,{k}_{d})\hfill \end{array}$$  (3.3) 
Equation (3.3) gives:
$$\begin{array}{cc}\hfill {E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{j}+1,\cdots ,{k}_{d})& ={e}^{-2i\pi \left(\frac{{\sum}_{l=1}^{d}{k}_{l}{m}_{l}}{N}+\frac{{m}_{j}}{N}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {f}_{j}\left({k}_{j}\right)\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \\ & ={e}^{-2i\pi \left(\frac{{m}_{j}}{N}\right)}{e}^{-2i\pi \left(\frac{{\sum}_{l=1}^{d}{k}_{l}{m}_{l}}{N}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {f}_{j}\left({k}_{j}\right)\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \end{array}$$  (3.3) 
Thus
$$\begin{array}{cc}\hfill \sum _{{k}_{j}=1}^{N}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}& ({k}_{1},\cdots ,{k}_{d})={e}^{-2i\pi \left(\frac{{m}_{j}}{N}\right)}\sum _{{k}_{j}=0}^{N-1}\delta \left({k}_{1}{h}_{1},\cdots ,({k}_{j}+1){h}_{j},\cdots ,{k}_{d}{h}_{d}\right)\times \hfill \\ & {e}^{-2i\pi \left(\frac{{\sum}_{l=1}^{d}{k}_{l}{m}_{l}}{N}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {f}_{j}\left({k}_{j}\right)\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \end{array}$$  (3.3) 
To compute the first sum of equation (3.3), we apply a change of variable ${k}_{j}^{\text{'}}=N+{k}_{j}$:
$$\begin{array}{cc}\hfill \sum _{{k}_{j}=-N}^{-1}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{d})=& \sum _{{k}_{j}=0}^{N-1}\delta \left({k}_{1}{h}_{1},\cdots ,({k}_{j}-N){h}_{j},\cdots ,{k}_{d}{h}_{d}\right)\times \hfill \\ & \phantom{\rule{85.35826pt}{0ex}}{E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{j}-N,\cdots ,{k}_{d})\hfill \end{array}$$  (3.3) 
Equation (3.3) gives:
$$\begin{array}{cc}\hfill {E}_{{m}_{1},\cdots ,{m}_{d}}({k}_{1},\cdots ,{k}_{j}-N,\cdots ,{k}_{d})& ={e}^{-2i\pi \left(\frac{{\sum}_{l=1}^{d}{k}_{l}{m}_{l}}{N}-{m}_{j}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {f}_{j}({k}_{j}-1-N)\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \\ & ={e}^{-2i\pi \left(\frac{{\sum}_{l=1}^{d}{k}_{l}{m}_{l}}{N}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {\overline{f}}_{j}(N-1-{k}_{j})\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \end{array}$$  (3.3) 
Thus:
$$\begin{array}{cc}\hfill \sum _{{k}_{j}=-N}^{-1}\delta \left({k}_{1}{h}_{1},\cdots ,{k}_{d}{h}_{d}\right){E}_{{m}_{1},\cdots ,{m}_{d}}& ({k}_{1},\cdots ,{k}_{d})=\sum _{{k}_{j}=0}^{N-1}\delta \left({k}_{1}{h}_{1},\cdots ,({k}_{j}-N){h}_{j},\cdots ,{k}_{d}{h}_{d}\right)\times \hfill \\ & {e}^{-2i\pi \left(\frac{{\sum}_{l=1}^{d}{k}_{l}{m}_{l}}{N}\right)}{f}_{1}({k}_{1}-1)\times \cdots \times {\overline{f}}_{j}(N-1-{k}_{j})\times \cdots \times {f}_{d}({k}_{d}-1)\hfill \end{array}$$  (3.3) 
To summarize:
In order to compute the sum from ${k}_{1}=1$ to $N$, we multiply by ${e}^{-2i\pi \left(\frac{{m}_{1}}{N}\right)}$ and consider $\delta (({k}_{1}+1)h,\cdots ){f}_{1}\left({k}_{1}\right)$.
In order to compute the sum from ${k}_{1}=-N$ to $-1$, we consider $\delta (({k}_{1}-N)h,\cdots ){\overline{f}}_{1}(N-1-{k}_{1})$.
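In dimension 1 this decomposition can be checked numerically. The sketch below (with an arbitrary smooth stand-in for $\delta$) evaluates $S_m$ both directly over $k\in[-N,N]$ and via two FFTs plus the central $k=0$ term, using exactly the index shifts summarized above; it is a verification sketch, not the OpenTURNS implementation.

```python
import numpy as np

# 1-D check of the FFT decomposition of S_m = sum_{k=-N}^{N} delta(k h) E_m(k).
N = 8
b, mu, sigma = 8.0, 0.3, 1.2                    # arbitrary grid parameters
h = np.pi / (b * sigma)                          # h = pi / (b sigma)
tau = mu / (b * sigma)                           # tau = mu / (b sigma)
y = mu + b * ((2 * np.arange(N) + 1) / N - 1) * sigma   # regular grid (91)

delta = lambda u: np.exp(-0.5 * u**2)            # smooth stand-in for phi - psi
f = lambda k: np.exp(-1j * np.pi * (k + 1) * (tau - 1 + 1.0 / N))

# direct evaluation: E_m(k) = exp(-i k h y_m)
k_all = np.arange(-N, N + 1)
S_direct = (delta(k_all * h)[None, :]
            * np.exp(-1j * np.outer(y, k_all) * h)).sum(axis=1)

# FFT-based evaluation: positive part, negative part, central term
k = np.arange(N)
m = np.arange(N)
pos = np.exp(-2j * np.pi * m / N) * np.fft.fft(delta((k + 1) * h) * f(k))
neg = np.fft.fft(delta((k - N) * h) * np.conj(f(N - 1 - k)))
S_fft = delta(0.0) + pos + neg
```

Both evaluations agree to machine precision, confirming the change-of-variable bookkeeping.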
OpenTURNS
In the 0.13 version of OpenTURNS, the distributions able to evaluate their characteristic function are the following: ${\chi}^{2}$, Exponential, Gamma, Laplace, Logistic, Mixture, univariate Normal, Rayleigh, Triangular, univariate TruncatedNormal, Uniform, KernelMixture (the distribution resulting from a kernel smoothing method without treatment of the bounds), and RandomMixture.
Thus, all the requests to $Y$ that require the evaluation of the probability density function may be satisfied only if the univariate random variables ${X}_{i}$ follow distributions whose characteristic function has been implemented.
Until the 1.5 version of OpenTURNS, only univariate random mixtures were available. For all the other requests, no restriction applies.
Link with OpenTURNS methodology
Examples
$$\begin{array}{c}\hfill Y=2+5{X}_{1}+{X}_{2}\end{array}$$ 
where ${X}_{1}$ and ${X}_{2}$ are independent and :
${X}_{1}$ follows a $\mathcal{E}(1.5)$,
${X}_{2}$ follows a $\mathcal{N}(4,1)$.
The pdf and cdf graphs are the following ones.
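The moments of this example can be checked by brute force. The hedged Monte Carlo sketch below samples $Y = 2 + 5X_1 + X_2$ directly (plain numpy sampling, not the characteristic-function machinery described above).

```python
import numpy as np

# Monte Carlo check of Y = 2 + 5 X1 + X2 with X1 ~ Exp(1.5), X2 ~ N(4, 1).
rng = np.random.default_rng(42)
n = 200_000
x1 = rng.exponential(scale=1.0 / 1.5, size=n)   # rate 1.5 -> scale 1/1.5
x2 = rng.normal(loc=4.0, scale=1.0, size=n)
y = 2.0 + 5.0 * x1 + x2

# exact values from (86)
mean_exact = 2.0 + 5.0 / 1.5 + 4.0              # E[Y]   = 9.333...
var_exact = 25.0 / 1.5**2 + 1.0                 # Var[Y] = 12.111...
```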
Mathematical description
Let $X$ be a scalar uncertain variable modelled as a random variable. This method deals with the construction of a dataset prior to the choice of a probability distribution for $X$. A QQ-plot (where "QQ" stands for "quantile-quantile") is a tool that may be used to compare two samples $\left\{{x}_{1},...,{x}_{N}\right\}$ and $\left\{{x}_{1}^{\text{'}},...,{x}_{M}^{\text{'}}\right\}$; the goal is to determine graphically whether these two samples come from the same probability distribution or not. If this is the case, the two samples should be aggregated in order to increase the robustness of further statistical analyses.
Principle of the method
A QQ-plot is based on the notion of quantile. The $\alpha $-quantile ${q}_{X}\left(\alpha \right)$ of $X$, where $\alpha \in (0,1)$, is defined as follows:
$$\begin{array}{c}\hfill \mathbb{P}\left(X\le {q}_{X}\left(\alpha \right)\right)=\alpha \end{array}$$ 
If a sample $\left\{{x}_{1},...,{x}_{N}\right\}$ of $X$ is available, the quantile can be estimated empirically:
the sample $\left\{{x}_{1},...,{x}_{N}\right\}$ is first placed in ascending order, which gives the sample $\left\{{x}_{\left(1\right)},...,{x}_{\left(N\right)}\right\}$;
then, an estimate of the $\alpha $-quantile is:
$$\begin{array}{c}\hfill {\widehat{q}}_{X}\left(\alpha \right)={x}_{\left(\left[N\alpha \right]+1\right)}\end{array}$$ 
where $\left[N\alpha \right]$ denotes the integer part of $N\alpha $.
Thus, the ${j}^{\mathrm{th}}$ smallest value of the sample, ${x}_{\left(j\right)}$, is an estimate ${\widehat{q}}_{X}\left(\alpha \right)$ of the $\alpha $-quantile where $\alpha =(j-1)/N$ ($1<j\le N$). Let us then consider our second sample $\left\{{x}_{1}^{\text{'}},...,{x}_{M}^{\text{'}}\right\}$; it also provides an estimate ${\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)$ of this same quantile:
$$\begin{array}{c}\hfill {\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)={x}_{\left(\left[M\times (j-1)/N\right]+1\right)}^{\text{'}}\end{array}$$ 
If the two samples correspond to the same probability distribution, then ${\widehat{q}}_{X}\left(\alpha \right)$ and ${\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)$ should be close. Thus, graphically, the points $\left\{\left({\widehat{q}}_{X}\left(\alpha \right),{\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)\right),\phantom{\rule{4pt}{0ex}}\alpha =(j-1)/N,\phantom{\rule{4pt}{0ex}}1<j\le N\right\}$ should be close to the diagonal.
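The construction above can be sketched in a few lines of numpy (the function name is ours; this builds the point cloud, not the plot itself).

```python
import numpy as np

# Sketch of the empirical QQ-plot construction: for each alpha = (j-1)/N,
# pair the quantile estimates of the two samples.
def qq_points(x, xp):
    x, xp = np.sort(np.asarray(x)), np.sort(np.asarray(xp))
    N, M = len(x), len(xp)
    pts = []
    for j in range(2, N + 1):          # 1 < j <= N, alpha = (j-1)/N
        alpha = (j - 1) / N
        q = x[int(N * alpha)]          # x_([N alpha]+1) with 0-based indexing
        qp = xp[int(M * alpha)]        # x'_([M alpha]+1) with 0-based indexing
        pts.append((q, qp))
    return pts

pts = qq_points([3.0, 1.0, 2.0, 4.0], [10.0, 20.0, 30.0, 40.0])
```

Plotting these pairs against the diagonal then gives the QQ-plot described in the figures.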
The following figure illustrates the principle of a QQ-plot with two samples of size $M=50$ and $N=50$. Note that the unit of the two axes is that of the variable $X$ studied. In this example, the points remain close to the diagonal and the hypothesis "the two samples come from the same distribution" does not seem irrelevant, even if a more quantitative analysis (see [Smirnov test] ) should be carried out to confirm this.
In this second example, the two samples clearly arise from two different distributions.
Link with OpenTURNS methodology
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983). "Introduction to Statistical Analysis (4th ed.)", McGraw-Hill
D'Agostino, R.B. and Stephens, M.A. (1986). "Goodness-of-Fit Techniques", Marcel Dekker, Inc., New York.
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
Let $X$ be a scalar uncertain variable modelled as a random variable. This method deals with the construction of a dataset prior to the choice of a probability distribution for $X$. Smirnov's test is a tool that may be used to compare two samples $\left\{{x}_{1},...,{x}_{N}\right\}$ and $\left\{{x}_{1}^{\text{'}},...,{x}_{M}^{\text{'}}\right\}$; the goal is to determine whether these two samples come from the same probability distribution or not. If this is the case, the two samples should be aggregated in order to increase the robustness of further statistical analyses.
Principle of the method
Smirnov's test is a statistical test based on the maximum distance between the empirical cumulative distribution functions ${\widehat{F}}_{N}$ and ${\widehat{F}}_{M}^{\text{'}}$ of the samples $\left\{{x}_{1},...,{x}_{N}\right\}$ and $\left\{{x}_{1}^{\text{'}},...,{x}_{M}^{\text{'}}\right\}$ (see [empirical cumulative distribution function] ). This distance is expressed as follows:
$$\begin{array}{c}\hfill {\widehat{D}}_{M,N}=\underset{x}{sup}\left|{\widehat{F}}_{N}\left(x\right)-{\widehat{F}}_{M}^{\text{'}}\left(x\right)\right|\end{array}$$ 
The probability distribution of the distance ${\widehat{D}}_{M,N}$ is asymptotically known (i.e. as the size of the samples tends to infinity). If $M$ and $N$ are sufficiently large, this means that for a probability $\alpha $, one can calculate the threshold / critical value ${d}_{\alpha}$ such that:
if ${\widehat{D}}_{M,N}>{d}_{\alpha}$, we conclude that the two samples are not identically distributed, with a risk of error $\alpha $,
if ${\widehat{D}}_{M,N}\le {d}_{\alpha}$, it is reasonable to say that both samples arise from the same distribution.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the "identically-distributed" hypothesis is rejected. Thus, the two samples will be considered identically distributed if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the greater ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
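In practice the two-sample Smirnov (Kolmogorov-Smirnov) test is available in standard libraries; the hedged sketch below uses scipy rather than the OpenTURNS interface, with two deliberately different samples so that the test rejects.

```python
import numpy as np
from scipy.stats import ks_2samp

# Two-sample Smirnov test: D statistic and p-value (the alpha_lim of the text).
rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 100)
b = rng.normal(5.0, 1.0, 100)        # clearly different distribution

stat, p_value = ks_2samp(a, b)
# reject "identically distributed" at level alpha only when p_value < alpha
```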
Link with OpenTURNS methodology
We remind the reader that the theoretical results underlying the test are asymptotic. There is no rule to determine the minimum number of data values needed to use this test, but the asymptotic regime is often considered a reasonable approximation when $N$ is of the order of a few dozen.
The following bibliographical references provide main starting points for further study of this method:
Saporta G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon W.J. & Massey F.J. (1983). "Introduction to Statistical Analysis (4th ed.)", McGraw-Hill
Mathematical description
This method deals with the parametric modelling of a probability distribution for a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. The appropriate probability distribution is found by using a sample of data $\left\{{\underline{x}}_{1},...,{\underline{x}}_{N}\right\}$. Such an approach can be described in two steps as follows:
Choose a probability distribution (e.g. the Normal distribution, or any other distribution available in OpenTURNS, see [standard parametric models] ),
Find the parameter values $\underline{\theta}$ that characterize the probability distribution (e.g. the mean and standard deviation for the Normal distribution) which best describes the sample $\left\{{\underline{x}}_{1},...,{\underline{x}}_{N}\right\}$.
The maximum likelihood method is used for the second step.
Principle
In the current version of OpenTURNS this method is restricted to the case where ${n}_{X}=1$ and to continuous probability distributions. Please note therefore that $\underline{X}={X}^{1}=X$ in the following text. The maximum likelihood estimate (MLE) of $\underline{\theta}$ is defined as the value of $\underline{\theta}$ which maximizes the likelihood function $L\left(X,\underline{\theta}\right)$:
$$\begin{array}{c}\hfill \widehat{\underline{\theta}}=\mathrm{argmax}\phantom{\rule{4pt}{0ex}}L\left(X,\underline{\theta}\right)\end{array}$$ 
Given that $\left\{{x}_{1},...,{x}_{N}\right\}$ is a sample of independent identically distributed (i.i.d) observations, $L\left({x}_{1},...,{x}_{N},\underline{\theta}\right)$ represents the probability of observing such a sample assuming that they are taken from a probability distribution with parameters $\underline{\theta}$. In concrete terms, the likelihood $L\left({x}_{1},...,{x}_{N},\underline{\theta}\right)$ is calculated as follows:
$$L\left({x}_{1},...,{x}_{N},\underline{\theta}\right)=\prod _{j=1}^{N}{f}_{X}\left({x}_{j};\underline{\theta}\right)$$ 
if the distribution is continuous, with density ${f}_{X}\left(x;\underline{\theta}\right)$.
For example, if we suppose that $X$ follows a Gaussian distribution with parameters $\underline{\theta}=\{\mu ,\sigma \}$ (i.e. the mean and standard deviation),
$$\begin{array}{ccc}\hfill L\left({x}_{1},...,{x}_{N},\underline{\theta}\right)& =& \prod _{j=1}^{N}\frac{1}{\sigma \sqrt{2\pi}}exp\left[-\frac{1}{2}{\left(\frac{{x}_{j}-\mu}{\sigma}\right)}^{2}\right]\hfill \\ & =& \frac{1}{{\sigma}^{N}{\left(2\pi \right)}^{N/2}}exp\left[-\frac{1}{2{\sigma}^{2}}\sum _{j=1}^{N}{\left({x}_{j}-\mu \right)}^{2}\right]\hfill \end{array}$$ 
The following figure graphically illustrates the maximum likelihood method, in the particular case of a Gaussian probability distribution.
In general, in order to maximize the likelihood function classical optimization algorithms (e.g. gradient type) can be used. The Gaussian distribution case is an exception to this, as the maximum likelihood estimators are obtained analytically:
$$\begin{array}{c}\hfill \widehat{\mu}=\frac{1}{N}\sum _{i=1}^{N}{x}_{i},\phantom{\rule{4pt}{0ex}}\widehat{{\sigma}^{2}}=\frac{1}{N}\sum _{i=1}^{N}{\left({x}_{i}-\widehat{\mu}\right)}^{2}\end{array}$$ 
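These closed-form Gaussian estimators are trivial to compute; a minimal sketch (the function name is ours):

```python
import numpy as np

# Closed-form Gaussian MLE: sample mean and biased (1/N) variance estimator.
def gaussian_mle(x):
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    sigma2_hat = np.mean((x - mu_hat) ** 2)   # 1/N normalization, as in the text
    return mu_hat, sigma2_hat

mu_hat, sigma2_hat = gaussian_mle([1.0, 2.0, 3.0, 4.0])
# mu_hat = 2.5, sigma2_hat = 1.25
```

Note the $1/N$ (not $1/(N-1)$) normalization: the MLE of the variance is biased, which is the small-sample caveat discussed below.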
Other notations
Link with OpenTURNS methodology
Input:
$\left\{{x}_{1},...,{x}_{N}\right\}$: sample data
Distribution: Distribution type chosen from the proposed continuous 1dimensional distributions in [standard parametric models]
Output :
$\widehat{\underline{\theta}}$: maximum likelihood estimate of $\underline{\theta}$
as $N$ tends to infinity, asymptotic theory guarantees, under certain regularity assumptions on the model, that the MLE is the best possible estimator: its bias tends towards 0 (no tendency towards under- or over-estimation), and the uncertainty of $\widehat{\underline{\theta}}$ is smaller than with any other unbiased estimation method. In practice, the asymptotic behaviour is often considered to be reached when $N$ exceeds a few dozen, even if no theoretical rule can guarantee this with certitude.
if $N$ is smaller, the MLE is still useful, but $\widehat{\underline{\theta}}$ is less robust (greater uncertainty and possible bias).
A more advanced study of the goodness-of-fit of the selected probability distribution to the given sample data is described in [Graphical analysis] , [Kolmogorov-Smirnov test] , [Cramer-Von Mises test] , [Anderson-Darling test] and [BIC criterion] .
The following bibliographical references provide main starting points for further study of this method:
Saporta G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon W.J. & Massey F.J. (1983). "Introduction to Statistical Analysis (4th ed.)", McGraw-Hill
Mathematical description
We consider a computer model $h$ (i.e. a deterministic function) to calibrate:
$$\begin{array}{c}\hfill \underline{z}=h(\underline{x},{\underline{\theta}}_{h}),\end{array}$$ 
where
$\underline{x}\in {\mathbb{R}}^{{d}_{x}}$ is the input vector;
$\underline{z}\in {\mathbb{R}}^{{d}_{z}}$ is the output vector;
${\underline{\theta}}_{h}\in {\mathbb{R}}^{{d}_{h}}$ are the unknown parameters of $h$ to calibrate.
Our goal here is to estimate ${\underline{\theta}}_{h}$, based on a certain set of $n$ inputs $({\underline{x}}^{1},...,{\underline{x}}^{n})$ (an experimental design) and some associated observations $({\underline{y}}^{1},...,{\underline{y}}^{n})$ which are regarded as the realizations of some random vectors $({\underline{Y}}^{1},...,{\underline{Y}}^{n})$, such that, for all $i$, the distribution of ${\underline{Y}}^{i}$ depends on ${\underline{z}}^{i}=h({\underline{x}}^{i},{\underline{\theta}}_{h})$. Typically, ${\underline{Y}}^{i}={\underline{z}}^{i}+{\underline{\epsilon}}^{i}$ where ${\underline{\epsilon}}^{i}$ is a random measurement error.
For the sake of clarity, lower case letters are used for both random variables and realizations in what follows (the notation no longer distinguishes the two), as is usual in the Bayesian literature.
In fact, the Bayesian procedure implemented here allows one to infer some unknown parameters $\underline{\theta}\in {\mathbb{R}}^{{d}_{\theta}}$ from data $\underline{\underline{y}}=({\underline{y}}^{1},...,{\underline{y}}^{n})$ as soon as the conditional distribution of each ${\underline{y}}^{i}$ given $\underline{\theta}$ is specified. Therefore $\underline{\theta}$ can consist of some computer model parameters ${\underline{\theta}}_{h}$ together with some others ${\underline{\theta}}_{\epsilon}$: $\underline{\theta}={({{\underline{\theta}}_{h}}^{t},{{\underline{\theta}}_{\epsilon}}^{t})}^{t}$. For example, ${\underline{\theta}}_{\epsilon}$ may represent the unknown standard deviation $\sigma $ of an additive centered Gaussian measurement error affecting the data (see the example hereafter). Besides, the procedure can be used to estimate the parameters of a distribution from direct observations (no computer model to calibrate: $\underline{\theta}={\underline{\theta}}_{\epsilon}$).
More formally, the likelihood $L(\underline{\underline{y}}\,|\,\underline{\theta})$ is defined by, firstly, a family $\{{\mathcal{P}}_{\underline{w}},\underline{w}\in {\mathbb{R}}^{{d}_{w}}\}$ of probability distributions parametrized by $\underline{w}$, which is specified in practice by a conditional distribution $f(\cdot\,|\,\underline{w})$ given $\underline{w}$ ($f$ is a PDF or a probability mass function), and, secondly, a function $g:{\mathbb{R}}^{{d}_{\theta}}\u27f6{\mathbb{R}}^{n\phantom{\rule{0.166667em}{0ex}}{d}_{w}}$ such that $g(\underline{\theta})={({{g}^{1}\left(\underline{\theta}\right)}^{t},...,{{g}^{n}\left(\underline{\theta}\right)}^{t})}^{t}$, which expresses the parameter ${\underline{w}}^{i}$ of the $i$-th observation ${\underline{y}}^{i}\sim f(\cdot\,|\,{\underline{w}}^{i})$ as a function of $\underline{\theta}$: ${g}^{i}\left(\underline{\theta}\right)={\underline{w}}^{i}$, thus ${\underline{y}}^{i}\sim f(\cdot\,|\,{g}^{i}\left(\underline{\theta}\right))$ and
$$\begin{array}{c}\hfill L(\underline{\underline{y}}\,|\,\underline{\theta})=\prod _{i=1}^{n}f({\underline{y}}^{i}\,|\,{g}^{i}\left(\underline{\theta}\right)).\end{array}$$ 
Considering the issue of the calibration of some computer model parameters ${\underline{\theta}}_{h}$, the full statistical model can be seen as a twolevel hierarchical model, with a single level of latent variables $\underline{z}$. A classical example is given by the nonlinear Gaussian regression model:
$$\begin{array}{ccc}\hfill {y}_{i}& =\hfill & \hfill h({\underline{x}}_{i},{\underline{\theta}}_{h})+{\epsilon}_{i},\phantom{\rule{4.pt}{0ex}}\text{where}\phantom{\rule{4.pt}{0ex}}{\epsilon}_{i}\stackrel{i.i.d.}{\sim}\mathcal{N}(0,{\sigma}^{2}),\phantom{\rule{1.em}{0ex}}i=1,...,n.\end{array}$$ 
It can be implemented with $f(\cdot\,|\,{(\mu ,\sigma )}^{t})$ the PDF of the Gaussian distribution $\mathcal{N}(\mu ,{\sigma}^{2})$, with ${g}^{i}\left(\underline{\theta}\right)={(h({\underline{x}}^{i},{\underline{\theta}}_{h}),\phantom{\rule{0.222222em}{0ex}}\sigma )}^{t}$, and with $\underline{\theta}={\underline{\theta}}_{h}$, respectively $\underline{\theta}={({{\underline{\theta}}_{h}}^{t},\sigma )}^{t}$, if $\sigma $ is considered known, respectively unknown.
Given a distribution modelling the uncertainty on $\underline{\theta}$ prior to the data, Bayesian inference is used to perform the inference of $\underline{\theta}$, hence the name Bayesian calibration.
Principle
Contrary to the maximum likelihood approach described in [Maximum Likelihood Principle] , which provides a single `best estimate' value $\widehat{\underline{\theta}}$, together with confidence bounds accounting for the uncertainty remaining on the true value $\underline{\theta}$, the Bayesian approach derives a full distribution of possible values for $\underline{\theta}$ given the available data $\underline{\underline{y}}$. Known as the posterior distribution of $\underline{\theta}$ given the data $\underline{\underline{y}}$, its density can be expressed according to Bayes' theorem:
$$\begin{array}{ccc}\hfill \pi (\underline{\theta}\,|\,\underline{\underline{y}})& =\hfill & \hfill \frac{L(\underline{\underline{y}}\,|\,\underline{\theta})\times \pi \left(\underline{\theta}\right)}{m\left(\underline{\underline{y}}\right)},\end{array}$$  (3.3) 
where
$L(\underline{\underline{y}}\,|\,\underline{\theta})$ is the (data) likelihood;
$\pi \left(\underline{\theta}\right)$ is the so-called prior distribution of $\underline{\theta}$ (with support $\Theta $), which encodes all possible $\underline{\theta}$ values weighted by their prior probabilities, before consideration of any experimental data (this allows one, for instance, to incorporate expert information or known physical constraints on the calibration parameters);
$m\left(\underline{\underline{y}}\right)$ is the marginal likelihood:
$$\begin{array}{ccc}\hfill m\left(\underline{\underline{y}}\right)& =\hfill & \hfill {\displaystyle {\int}_{\underline{\theta}\in \Theta}L(\underline{\underline{y}}\,|\,\underline{\theta})\,\pi \left(\underline{\theta}\right)\,d\underline{\theta},}\end{array}$$ 
which is the necessary normalizing constant ensuring that the posterior density integrates to 1.
Except in very simple cases, (3.3) has no closed form in general. Thus, it must be approximated, either using numerical integration when the parameter space dimension ${d}_{\theta}$ is low, or more generally through stochastic sampling techniques known as Monte-Carlo Markov-Chain (MCMC) methods. See [The Metropolis-Hastings Algorithm] .
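When $d_\theta=1$, the numerical-integration route is easy to illustrate. The hedged sketch below applies Bayes' rule on a grid for the mean $\theta$ of Normal data with known $\sigma=1$ and a Normal(0, 10) prior; data, prior, and grid are arbitrary choices made for this sketch.

```python
import numpy as np
from scipy.stats import norm

# 1-D grid approximation of the posterior pi(theta | y).
y = np.array([1.2, 0.8, 1.1])                 # observations, sigma known = 1
theta = np.linspace(-5.0, 5.0, 2001)
dtheta = theta[1] - theta[0]

log_lik = norm.logpdf(y[:, None], loc=theta[None, :], scale=1.0).sum(axis=0)
log_prior = norm.logpdf(theta, loc=0.0, scale=10.0)
unnorm = np.exp(log_lik + log_prior)          # L(y | theta) * pi(theta)

m_y = unnorm.sum() * dtheta                   # marginal likelihood m(y)
posterior = unnorm / m_y                      # normalized posterior density
post_mean = (theta * posterior).sum() * dtheta
```

Because the prior here is conjugate, the grid posterior mean can be checked against the closed-form value $n\bar{y}/(n+\sigma^2/10^2)$.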
The following bibliographical references provide main starting points for further study of this method:
Berger, J.O. (1985). "Statistical Decision Theory and Bayesian Analysis", Springer.
Marin J.M. & Robert C.P. (2007) "Bayesian Core: A Practical Approach to Computational Bayesian Statistics", Springer.
Other notations
A rigorous and complete documentation about Markov Chain Monte-Carlo sampling is beyond the purpose of this section, which provides a short introduction to the Metropolis-Hastings algorithm. In particular, the Metropolis-Hastings algorithm is only introduced hereafter in the context of the simulation of a homogeneous Markov chain (no dynamical adaptation). The reader is invited to refer to the monographs suggested below for further explanations or details.
Mathematical description
Markov chain. Considering a $\sigma$-algebra $\mathcal{A}$ on $\Omega$, a Markov chain is a process ${\left({X}_{k}\right)}_{k\in \mathbb{N}}$ such that
$$\begin{array}{c}\hfill \forall (A,{x}_{0},...,{x}_{k-1})\in \mathcal{A}\times {\Omega}^{k}\phantom{\rule{1.em}{0ex}}\mathbb{P}\left({X}_{k}\in A \mid {X}_{0}={x}_{0},...,{X}_{k-1}={x}_{k-1}\right)=\mathbb{P}\left({X}_{k}\in A \mid {X}_{k-1}={x}_{k-1}\right).\end{array}$$ 
An example is the random walk, for which ${X}_{k}={X}_{k-1}+{\epsilon}_{k}$ where the steps ${\epsilon}_{k}$ are independent and identically distributed.
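As an illustration, the random walk above can be simulated in a few lines of Python (an illustrative sketch, not OpenTURNS code; the Gaussian step distribution and the seed are arbitrary choices):

```python
import random

def random_walk(n_steps, x0=0.0, seed=0):
    """Simulate X_k = X_{k-1} + eps_k with i.i.d. standard Gaussian steps."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        # The next state depends on the past only through the current state:
        # this is exactly the Markov property.
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path

path = random_walk(1000)
```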
Transition kernel. A transition kernel on $(\Omega ,\mathcal{A})$ is a mapping $K:(\Omega ,\mathcal{A})\to [0,1]$ such that
$\forall A\in \mathcal{A}\phantom{\rule{1.em}{0ex}}K(.,A)$ is measurable;
$\forall x\in \Omega \phantom{\rule{1.em}{0ex}}K(x,.)$ is a probability distribution on $(\Omega ,\mathcal{A})$.
The kernel $K$ has density $k$ if $\forall (x,A)\in \Omega \times \mathcal{A}\phantom{\rule{1.em}{0ex}}K(x,A)={\int}_{A}\phantom{\rule{0.222222em}{0ex}}k(x,y)\text{d}y$.
${\left({X}_{k}\right)}_{k\in \mathbb{N}}$ is a homogeneous Markov chain of transition $K$ if $\forall (A,x)\in \mathcal{A}\times \Omega \phantom{\rule{1.em}{0ex}}\mathbb{P}\left({X}_{k}\in A \mid {X}_{k-1}=x\right)=K(x,A)$.
Some Notations. Let ${\left({X}_{k}\right)}_{k\in \mathbb{N}}$ be a homogeneous Markov Chain of transition $K$ on $(\Omega ,\mathcal{A})$ with initial distribution $\nu $ (that is ${X}_{0}\sim \nu $):
${K}_{\nu}$ denotes the probability distribution of the Markov Chain ${\left({X}_{k}\right)}_{k\in \mathbb{N}}$;
$\nu {K}^{k}$ denotes the probability distribution of ${X}_{k}$ (${X}_{k}\sim \nu {K}^{k}$);
${K}^{k}$ denotes the mapping defined by ${K}^{k}(x,A)=\mathbb{P}\left({X}_{k}\in A \mid {X}_{0}=x\right)$ for all $(x,A)\in \Omega \times \mathcal{A}$.
Total variation convergence. A Markov Chain of distribution ${K}_{\nu}$ is said to converge in total variation distance towards the distribution $t$ if
$$\begin{array}{c}\hfill \underset{k\to +\infty}{lim}\underset{A\in \mathcal{A}}{sup}\left|\nu {K}^{k}\left(A\right)-t\left(A\right)\right|=0.\end{array}$$ 
Then the notation used here is $\nu {K}^{k}{\to}_{TV}t$.
Some interesting properties. Let $t$ be a (target) distribution on $(\Omega ,\mathcal{A})$, then a transition kernel $K$ is said to be:
$t$-invariant if $tK=t$;
$t$-irreducible if, $\forall (A,x)\in \mathcal{A}\times \Omega$ such that $t\left(A\right)>0$, there exists $k\in {\mathbb{N}}^{*}$ such that ${K}^{k}(x,A)>0$.
Goal
Markov Chain Monte-Carlo techniques allow one to sample and integrate according to a distribution $t$ which is only known up to a multiplicative constant. This situation is common in Bayesian statistics where the "target" distribution, the posterior $t\left(\underline{\theta}\right)=\pi \left(\underline{\theta} \mid \underline{\underline{y}}\right)$, is proportional to the product of prior and likelihood: see equation (3.3).
In particular, given a "target" distribution $t$ and a $t$-irreducible transition kernel $Q$, the Metropolis-Hastings algorithm produces a Markov chain ${\left({X}_{k}\right)}_{k\in \mathbb{N}}$ of distribution ${K}_{\nu}$ with the following properties:
the transition kernel of the Markov chain is $t$-invariant;
$\nu {K}^{k}{\to}_{TV}t$;
the Markov chain satisfies the ergodic theorem: let $\phi$ be a real-valued function such that ${\mathbb{E}}_{X\sim t}\left[\left|\phi \left(X\right)\right|\right]<+\infty$; then, whatever the initial distribution $\nu$ is:
$$\begin{array}{c}\hfill {\displaystyle \frac{1}{n}\sum _{k=1}^{n}\phantom{\rule{0.222222em}{0ex}}\phi \left({X}_{k}\right)\underset{n\to +\infty}{\to}{\mathbb{E}}_{X\sim t}\left[\phi \left(X\right)\right]\phantom{\rule{4.pt}{0ex}}\text{almost}\phantom{\rule{4.pt}{0ex}}\text{surely}.}\end{array}$$ 
In that sense, simulating ${\left({X}_{k}\right)}_{k\in \mathbb{N}}$ amounts to sampling according to $t$ and can be used to integrate with respect to the probability measure $t$. Let us remark that the ergodic theorem implies that $\frac{1}{n}\sum _{k=1}^{n}\phantom{\rule{0.222222em}{0ex}}{\mathbf{1}}_{A}\left({X}_{k}\right)\underset{n\to +\infty}{\to}{\mathbb{P}}_{X\sim t}\left(X\in A\right)$ almost surely.
Principle
By an abuse of notation, $t\left(x\right)$ represents, in the remainder of this section, a function of $x$ which is proportional to the PDF of the target distribution $t$. Given a transition kernel $Q$ of density $q$, the scheme of the Metropolis-Hastings algorithm is the following (lower case letters are used hereafter for both random variables and realizations, as is usual in the Bayesian literature):
Draw ${x}_{0}\sim \nu $ and set $k=1$.
Draw a candidate for ${x}_{k}$ according to the given transition kernel $Q$: $\tilde{x}\sim Q({x}_{k-1},.)$.
Compute the ratio $\rho =\frac{t\left(\tilde{x}\right)/q({x}_{k-1},\tilde{x})}{t\left({x}_{k-1}\right)/q(\tilde{x},{x}_{k-1})}$.
Draw $u\sim \mathcal{U}\left(\left[0,1\right]\right)$; if $u\le \rho$ then set ${x}_{k}=\tilde{x}$, otherwise set ${x}_{k}={x}_{k-1}$.
Set $k=k+1$ and go back to 1).
Of course, if $t$ is replaced by a different function of $x$ proportional to it, the algorithm is unchanged, since $t$ only appears through the ratio $\frac{t\left(\tilde{x}\right)}{t\left({x}_{k-1}\right)}$. Moreover, if $Q$ proposes candidates uniformly (constant density $q$), the candidate $\tilde{x}$ is accepted according to a ratio $\rho$ which reduces to the "natural" ratio of PDFs $\frac{t\left(\tilde{x}\right)}{t\left({x}_{k-1}\right)}$. The introduction of $q$ in the ratio $\rho$ corrects for the bias of a non-uniform proposal of candidates, which would otherwise favor some areas of $\Omega$.
The $t$-invariance is ensured by the symmetry of the expression of $\rho$ ($t$-reversibility).
In practice, $Q$ is specified either as a random walk ($\exists {q}_{RW}$ such that $q(x,y)={q}_{RW}(x-y)$), as independent sampling ($\exists {q}_{IS}$ such that $q(x,y)={q}_{IS}\left(y\right)$), or as a mixture of random walk and independent sampling.
The important property the practitioner has to keep in mind when choosing the transition kernel $Q$ is the $t$-irreducibility. Moreover, for efficient convergence, $Q$ has to be chosen so as to explore the whole support of $t$ quickly, without leading to too small an acceptance rate (the proportion of accepted candidates $\tilde{x}$). It is usually recommended that this rate be around $0.2$, but such a value is neither a guarantee of efficiency nor a substitute for a convergence diagnosis.
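The scheme above can be sketched in plain Python for a symmetric random-walk proposal, for which the ratio $\rho$ reduces to $t(\tilde{x})/t(x_{k-1})$ (an illustrative sketch, not the OpenTURNS implementation; the target, step size and seed are arbitrary choices):

```python
import math
import random

def metropolis_hastings(log_t, n_samples, x0=0.0, step=1.0, seed=42):
    """Random-walk Metropolis-Hastings; log_t is log of the unnormalized target."""
    rng = random.Random(seed)
    x = x0
    chain = []
    accepted = 0
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        # For a symmetric q, rho reduces to t(candidate)/t(x); working with
        # log t avoids numerical over/underflow.
        log_rho = log_t(candidate) - log_t(x)
        if rng.random() < math.exp(min(0.0, log_rho)):
            x = candidate
            accepted += 1
        chain.append(x)
    return chain, accepted / n_samples

# Target known only up to a constant: t(x) proportional to exp(-x^2 / 2).
chain, rate = metropolis_hastings(lambda x: -0.5 * x * x, 20000)
mean = sum(chain) / len(chain)
```

By the ergodic theorem quoted above, the empirical mean of the chain approximates the mean of the target, here $0$.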
The following bibliographical references provide main starting points for further study of this method:
Robert, C.P. and Casella, G. (2004). "Monte Carlo Statistical Methods" (Second Edition), Springer.
Meyn, S. and Tweedie R.L. (2009). "Markov Chains and Stochastic Stability" (Second Edition), Cambridge University Press.
Other notations
Mathematical description
The objective is to estimate the value of the parameters based on a sample of an unknown distribution, supposed to be a member of a parametric family of distributions. We describe here the estimators implemented in OpenTURNS for the estimation of the various parametric models. They are all derived either from the Maximum Likelihood method or from the method of moments, except for the bound parameters, which are systematically modified so as to strictly include the extreme realizations of the underlying sample $({x}_{1},\cdots ,{x}_{n})$.
We suppose that we have a realization $({\underline{x}}_{1},\cdots ,{\underline{x}}_{n})$ of a sample $({\underline{X}}_{1},\cdots ,{\underline{X}}_{n})$ of size $n$, the ${\underline{X}}_{i}$ being i.i.d. with common distribution $\mathcal{D}\left(\underline{\theta}\right)$. The objective is to build an estimator ${\widehat{\theta}}_{n}$ of $\underline{\theta}$ based on the realization $({\underline{x}}_{1},\cdots ,{\underline{x}}_{n})$. We adopt the following notations:
${\overline{\underline{x}}}_{n}=\frac{1}{n}{\sum}_{i=1}^{n}{\underline{x}}_{i}$ the sample mean (${\overline{x}}_{n}$ in the 1D case);
${\sigma}_{n}=\sqrt{\frac{1}{n-1}{\sum}_{i=1}^{n}{({x}_{i}-{\overline{x}}_{n})}^{2}}$ the sample standard deviation in the 1D case;
${x}_{(1,n)}={min}_{i=1,\cdots ,n}{x}_{i}$ the minimum of the realization in the 1D case;
${x}_{(n,n)}={max}_{i=1,\cdots ,n}{x}_{i}$ the maximum of the realization in the 1D case;
${x}_{1/2}$ the median of the sample in the 1D case;
Continuous univariate distributions:
Arcsine  $\begin{array}{c}{\displaystyle \widehat{\mu}={\widehat{\mu}}_{x}}\hfill \\ {\displaystyle \widehat{\sigma}={\widehat{\sigma}}_{x}}\hfill \end{array}$ 
Beta  $\begin{array}{c}{\displaystyle {\widehat{a}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{b}}_{n}=(1+\mathrm{sign}\left({x}_{(n,n)}\right)/(2+n)){x}_{(n,n)}}\hfill \\ {\displaystyle {\widehat{t}}_{n}=\frac{({\widehat{b}}_{n}-{\overline{x}}_{n})({\overline{x}}_{n}-{\widehat{a}}_{n})}{{\left({\sigma}_{n}^{X}\right)}^{2}}-1}\hfill \\ {\displaystyle {\widehat{r}}_{n}=\frac{{\widehat{t}}_{n}({\overline{x}}_{n}-{\widehat{a}}_{n})}{{\widehat{b}}_{n}-{\widehat{a}}_{n}}}\hfill \end{array}$ 
Burr  ${\widehat{c}}_{n}$ is the solution of the nonlinear equation $1+\frac{c}{n}\left[SR-\frac{n}{{\sum}_{i=1}^{n}log(1+{x}_{i}^{c})}SSR\right]=0$ where $SR=\sum _{i=1}^{n}\frac{log\left({x}_{i}\right)}{1+{x}_{i}^{c}}$ and $SSR=\sum _{i=1}^{n}\frac{{x}_{i}^{c}log\left({x}_{i}\right)}{1+{x}_{i}^{c}}$. Then ${\widehat{k}}_{n}=\frac{n}{{\sum}_{i=1}^{n}log(1+{x}_{i}^{c})}.$ 
Chi  ${\widehat{\nu}}_{n}={\overline{{x}^{2}}}_{n}$ 
ChiSquare  ${\widehat{\nu}}_{n}={\overline{x}}_{n}$ 
Dirichlet  Maximum likelihood estimators, according to the reference J. Huang 
Epanechnikov  no parameter to estimate 
Exponential  $\begin{array}{c}{\displaystyle {\widehat{\gamma}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{\lambda}}_{n}=1/({\overline{x}}_{n}-{\widehat{\gamma}}_{n})}\hfill \end{array}$ 
FisherSnedecor  No factory method implemented so far 
Gamma  $\begin{array}{c}{\displaystyle {\widehat{\gamma}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{\lambda}}_{n}=\frac{{\overline{x}}_{n}-{\widehat{\gamma}}_{n}}{{\left({\sigma}_{n}^{X}\right)}^{2}}}\hfill \\ {\displaystyle {\widehat{k}}_{n}=\frac{{({\overline{x}}_{n}-{\widehat{\gamma}}_{n})}^{2}}{{\left({\sigma}_{n}^{X}\right)}^{2}}}\hfill \end{array}$ 
Generalized Pareto  see text below 
Gumbel  $\begin{array}{c}{\displaystyle {\widehat{\alpha}}_{n}=\frac{\pi}{{\sigma}_{n}^{X}\sqrt{6}}}\hfill \\ {\displaystyle {\widehat{\beta}}_{n}={\overline{x}}_{n}-\frac{\gamma \sqrt{6}}{\pi}{\sigma}_{n}^{X}}\hfill \\ \gamma \simeq 0.57721\phantom{\rule{4.pt}{0ex}}\text{is}\phantom{\rule{4.pt}{0ex}}\text{Euler's}\phantom{\rule{4.pt}{0ex}}\text{constant.}\hfill \end{array}$ 
Histogram  The bandwidth is the AMISE-optimal one: $h=\frac{{\left(24\sqrt{\pi}\right)}^{1/3}{\sigma}_{n}}{{n}^{1/3}}$ where ${\sigma}_{n}^{2}$ is the unbiased variance of the data. The range is $\left[min(data),max(data)\right]$. 
Inverse ChiSquare  No factory method implemented so far 
Inverse Gamma  No factory method implemented so far 
Inverse Normal  $\begin{array}{c}{\displaystyle {\widehat{\mu}}_{n}={\overline{x}}_{n}}\hfill \\ {\displaystyle {\widehat{\lambda}}_{n}={\left(\frac{1}{n}\sum _{i=1}^{n}\frac{1}{{x}_{i}}-\frac{1}{{\overline{x}}_{n}}\right)}^{-1}}\hfill \end{array}$ 
Laplace  $\begin{array}{c}{\displaystyle {\widehat{\mu}}_{n}={x}_{1/2}}\hfill \\ {\displaystyle {\widehat{\lambda}}_{n}=\frac{1}{n}\sum _{i=1}^{n}\left|{x}_{i}-{\widehat{\mu}}_{n}\right|}\hfill \end{array}$ 
Logistic  $\begin{array}{c}{\displaystyle \widehat{\alpha}={\overline{x}}_{n}}\hfill \\ {\displaystyle {\widehat{\beta}}_{n}={\sigma}_{n}^{X}}\hfill \end{array}$ 
LogNormal  see text below 
LogUniform  $\begin{array}{c}{\displaystyle {\widehat{a}}_{n}=(1-1/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{b}}_{n}=(1+1/(2+n)){x}_{(n,n)}}\hfill \end{array}$ 
Meixner  Moments method. See details below. 
Non Central Chi Square  No factory method implemented so far 
Non Central Student  No factory method implemented so far 
Normal  Maximum likelihood estimators 
Normal Gamma  No factory method implemented so far 
Rayleigh  $\begin{array}{c}{\displaystyle {\widehat{\gamma}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{\sigma}}_{n}=\sqrt{\frac{1}{2n}{\sum}_{i=1}^{n}{({x}_{i}-{\widehat{\gamma}}_{n})}^{2}}}\hfill \end{array}$ 
Rice  Moments estimators, according to the reference C.G. Koay 
Student (1d)  Moments estimators 
Trapezoidal  Numerical resolution of maximum likelihood estimators 
Triangular  $\begin{array}{c}{\displaystyle {\widehat{a}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{b}}_{n}=(1+\mathrm{sign}\left({x}_{(n,n)}\right)/(2+n)){x}_{(n,n)}}\hfill \\ {\displaystyle {\widehat{m}}_{n}=3{\overline{x}}_{n}-{\widehat{a}}_{n}-{\widehat{b}}_{n}}\hfill \end{array}$ 
TruncatedNormal  Numerical maximum likelihood estimation. 
Uniform  $\begin{array}{c}{\displaystyle {\widehat{a}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ {\displaystyle {\widehat{b}}_{n}=(1+\mathrm{sign}\left({x}_{(n,n)}\right)/(2+n)){x}_{(n,n)}}\hfill \end{array}$ 
Weibull  $\begin{array}{c}{\displaystyle {\widehat{\gamma}}_{n}=(1-\mathrm{sign}\left({x}_{(1,n)}\right)/(2+n)){x}_{(1,n)}}\hfill \\ ({\widehat{\alpha}}_{n},{\widehat{\beta}}_{n})\phantom{\rule{4.pt}{0ex}}\text{solution}\phantom{\rule{4.pt}{0ex}}\text{of}\phantom{\rule{4.pt}{0ex}}\left\{\begin{array}{c}{\overline{x}}_{n}={\widehat{\gamma}}_{n}+{\widehat{\alpha}}_{n}\phantom{\rule{0.166667em}{0ex}}\Gamma (1+1/{\widehat{\beta}}_{n})\hfill \\ {\left({\sigma}_{n}^{X}\right)}^{2}={\widehat{\alpha}}_{n}^{2}\left(\Gamma (1+2/{\widehat{\beta}}_{n})-\Gamma {(1+1/{\widehat{\beta}}_{n})}^{2}\right)\hfill \end{array}\right.\hfill \end{array}$ 
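As an illustration of the bound-shifting estimators that recur throughout the table, the Exponential row can be transcribed directly (an illustrative sketch; the function names are ours, not the OpenTURNS API):

```python
import random

def sign(v):
    """Sign function: -1, 0 or +1."""
    return (v > 0) - (v < 0)

def fit_exponential(sample):
    """Estimators from the Exponential row of the table above."""
    n = len(sample)
    x_min = min(sample)
    mean = sum(sample) / n
    # Shift the location bound slightly so that the smallest observation
    # lies strictly inside the support.
    gamma = (1.0 - sign(x_min) / (2.0 + n)) * x_min
    lam = 1.0 / (mean - gamma)
    return gamma, lam

# Check on a synthetic Exponential(lambda=2, gamma=0) sample.
rng = random.Random(3)
sample = [rng.expovariate(2.0) for _ in range(5000)]
gamma_hat, lam_hat = fit_exponential(sample)
```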
Details for the Generalized Pareto distribution :
OpenTURNS implements three parametric estimation methods: the classical method of moments, the exponential regression method and the probability weighted moments method, according to the reference G. Matthys & J. Beirlant. The default strategy is to use the probability weighted moments method when the sample size is smaller than the threshold defined in the ResourceMap object ($GeneralizedParetoFactorySmallSize$); in case of failure, it falls back on the exponential regression method. If the sample size is too high, it uses the exponential regression method directly. The classical method of moments is provided but not used by default.
Details for the LogNormal distribution :
We note :
$$\begin{array}{ccc}\hfill {\displaystyle {S}_{0}}& =& \sum _{i=1}^{n}\frac{1}{{x}_{i}-\gamma}\hfill \\ \hfill {\displaystyle {S}_{1}}& =& \sum _{i=1}^{n}log({x}_{i}-\gamma )\hfill \\ \hfill {\displaystyle {S}_{2}}& =& \sum _{i=1}^{n}lo{g}^{2}({x}_{i}-\gamma )\hfill \\ \hfill {\displaystyle {S}_{3}}& =& \sum _{i=1}^{n}\frac{log({x}_{i}-\gamma )}{{x}_{i}-\gamma}\hfill \end{array}$$  (3.3) 
OpenTURNS tries to evaluate the parameters first using the Local Maximum Likelihood based estimators of $({\mu}_{\ell},{\sigma}_{\ell},\gamma )$ defined by :
$$\begin{array}{ccc}\hfill {\displaystyle {\widehat{\mu}}_{\ell ,n}}& =& \frac{{S}_{1}\left(\widehat{\gamma}\right)}{n}\hfill \\ \hfill {\displaystyle {\widehat{\sigma}}_{\ell ,n}^{2}}& =& \frac{{S}_{2}\left(\widehat{\gamma}\right)}{n}-{\widehat{\mu}}_{\ell ,n}^{2}\hfill \end{array}$$  (3.3) 
Thus, ${\widehat{\gamma}}_{n}$ verifies the relation :
$${S}_{0}\left(\gamma \right)\left({S}_{2}\left(\gamma \right)-{S}_{1}\left(\gamma \right)\left(1+\frac{{S}_{1}\left(\gamma \right)}{n}\right)\right)+n{S}_{4}\left(\gamma \right)=0$$  (96) 
under the constraint $\gamma \le min{x}_{i}$.
OpenTURNS tries to solve (96) by the step-doubling bracketing method followed by the bisection method. Once ${\widehat{\gamma}}_{n}$ is evaluated, $({\widehat{\mu}}_{\ell ,n},{\widehat{\sigma}}_{\ell ,n})$ are evaluated as defined in (3.3).
If the resolution of (96) is not possible, OpenTURNS sends a message to the User and evaluates the parameters from the Modified Moments based estimators, using ${\overline{x}}_{n}$, ${\sigma}_{n}^{2}$ and the additional modified moment equation:
$$\mathbb{E}[log({X}_{\left(1\right)}-\gamma )]=log({x}_{\left(1\right)}-\gamma )$$  (97) 
The quantity $E{Z}_{1}\left(n\right)=\frac{\mathbb{E}[log({X}_{\left(1\right)}-\gamma )]-{\mu}_{\ell}}{{\sigma}_{\ell}}$ is the mean of the first order statistic of a standard normal sample of size $n$. We have the following relation:
$$E{Z}_{1}\left(n\right)={\int}_{\mathbb{R}}nz\varphi \left(z\right){(1-\Phi \left(z\right))}^{n-1}\phantom{\rule{0.166667em}{0ex}}\mathrm{d}z$$  (98) 
where $\varphi$ and $\Phi$ are the PDF and CDF of the standard normal distribution.
The estimator ${\widehat{\omega}}_{n}$ of $\omega ={e}^{{\sigma}_{\ell}^{2}}$ is obtained as the solution of:
$$\omega (\omega -1)-{\kappa}_{n}{\left[\sqrt{\omega}-{e}^{E{Z}_{1}\left(n\right)\sqrt{log\omega}}\right]}^{2}=0$$  (99) 
where ${\kappa}_{n}=\frac{{s}_{n}^{2}}{{({\overline{x}}_{n}-{x}_{\left(1\right)})}^{2}}$.
Then $({\widehat{{\mu}_{\ell}}}_{n},{\widehat{{\sigma}_{\ell}}}_{n},{\widehat{\gamma}}_{n})$ are evaluated from :
$$\begin{array}{cc}\hfill {\widehat{{\mu}_{\ell}}}_{n}=& log{\widehat{\beta}}_{n}\hfill \\ \hfill {\widehat{{\sigma}_{\ell}}}_{n}=& \sqrt{log{\widehat{\omega}}_{n}}\hfill \\ \hfill {\widehat{\gamma}}_{n}=& {\overline{x}}_{n}-{\widehat{\beta}}_{n}\sqrt{{\widehat{\omega}}_{n}}\hfill \end{array}$$  (3.3) 
where ${\widehat{\beta}}_{n}=\frac{{s}_{n}}{\sqrt{{\widehat{\omega}}_{n}({\widehat{\omega}}_{n}-1)}}$.
If the resolution of (99) is not possible, OpenTURNS sends a message to the User and evaluates the parameters from the Moments based estimators which are always defined.
The estimator ${\widehat{\omega}}_{n}$ of $\omega ={e}^{{\sigma}_{\ell}^{2}}$ is the positive root of :
$${\omega}^{3}+3{\omega}^{2}-(4+{a}_{3,n}^{2})=0$$  (101) 
which is always defined. Then we have $({\widehat{{\mu}_{\ell}}}_{n},{\widehat{{\sigma}_{\ell}}}_{n},{\widehat{\gamma}}_{n})$ using the relations (3.3).
Details for the Meixner distribution :
We use the following estimators:
$$\begin{array}{cc}\hfill {\widehat{{\gamma}_{1}}}_{n}=& \frac{\frac{1}{n}{\sum}_{i=1}^{n}{({x}_{i}-{\widehat{x}}_{n})}^{3}}{{\widehat{\sigma}}_{n}^{3}}\hfill \\ \hfill {\widehat{{\gamma}_{2}}}_{n}=& \frac{\frac{1}{n}{\sum}_{i=1}^{n}{({x}_{i}-{\widehat{x}}_{n})}^{4}}{{\widehat{\sigma}}_{n}^{4}}\hfill \\ \hfill {\widehat{\delta}}_{n}=& \frac{1}{{\widehat{{\gamma}_{2}}}_{n}-{\widehat{{\gamma}_{1}}}_{n}^{2}-3}\hfill \\ \hfill {\widehat{\beta}}_{n}=& sign\left({\widehat{{\gamma}_{1}}}_{n}\right)arccos(2-{\widehat{\delta}}_{n}({\widehat{{\gamma}_{2}}}_{n}-3))\hfill \\ \hfill {\widehat{\alpha}}_{n}=& {\left({\widehat{\sigma}}_{n}^{2}(cos{\widehat{\beta}}_{n}+1)\right)}^{1/3}\hfill \end{array}$$  (3.3) 
where (3.3) is defined if ${\widehat{{\gamma}_{2}}}_{n}\ge 2{\widehat{{\gamma}_{1}}}_{n}^{2}+3$.
Continuous multivariate distributions:
Dirichlet  Maximum likelihood estimators 
Normal  $\begin{array}{c}{\displaystyle {\widehat{\underline{\mu}}}_{n}^{\phantom{(}}={\overline{\underline{x}}}_{n}}\hfill \\ {\displaystyle {\widehat{\mathrm{Cov}}}_{n}=\frac{1}{n-1}\sum _{i=1}^{n}\left({\underline{X}}_{i}-{\widehat{\underline{\mu}}}_{n}\right){\left({\underline{X}}_{i}-{\widehat{\underline{\mu}}}_{n}\right)}^{t}}\hfill \end{array}$ 
Student  not yet implemented 
Discrete univariate distributions :
Bernoulli  ${\widehat{p}}_{n}={\overline{x}}_{n}$ 
Binomial  See details below. 
Dirac  ${\widehat{point}}_{n}={x}_{1}$ 
Geometric  ${\widehat{p}}_{n}=\frac{1}{{\overline{x}}_{n}}$ 
Multinomial  $\begin{array}{c}data:({\underline{x}}^{1},\cdots ,{\underline{x}}^{n})\hfill \\ {\displaystyle N=\underset{i,k}{max}\phantom{\rule{0.166667em}{0ex}}{x}_{i}^{k}}\hfill \\ {\displaystyle {p}_{i}=\frac{1}{nN}\sum _{k=1}^{n}{x}_{i}^{k}}\hfill \end{array}$ 
Negative Binomial  $\begin{array}{c}data:({\underline{x}}^{1},\cdots ,{\underline{x}}^{n})\hfill \\ {\displaystyle {\widehat{p}}_{n}=\frac{{\overline{x}}_{n}}{{\widehat{r}}_{n}+{\overline{x}}_{n}}}\hfill \\ {\displaystyle {\widehat{r}}_{n}\phantom{\rule{4.pt}{0ex}}\text{solution}\phantom{\rule{4.pt}{0ex}}\text{of}\phantom{\rule{4.pt}{0ex}}n\left(log\left(\frac{{\widehat{r}}_{n}}{{\widehat{r}}_{n}+{\overline{x}}_{n}}\right)-\psi \left({\widehat{r}}_{n}\right)\right)+\sum _{i=1}^{n}\psi ({x}^{i}+{\widehat{r}}_{n})=0}\hfill \\ \text{The}\phantom{\rule{4.pt}{0ex}}\text{resolution}\phantom{\rule{4.pt}{0ex}}\text{is}\phantom{\rule{4.pt}{0ex}}\text{done}\phantom{\rule{4.pt}{0ex}}\text{using}\phantom{\rule{4.pt}{0ex}}\text{Brent's}\phantom{\rule{4.pt}{0ex}}\text{method.}\hfill \end{array}$ 
Poisson  $\widehat{\lambda}}_{n}={\overline{x}}_{n$ 
Skellam  Moments estimators: see details below. 
UserDefined  Uniform distribution over the sample. 
Details for the Binomial distribution :
We initialize the value of $(n,{p}_{n})$ to $\left(\lceil \frac{{\widehat{x}}_{n}^{2}}{{\widehat{x}}_{n}-{\widehat{\sigma}}_{n}^{2}}\rceil ,\frac{{\widehat{x}}_{n}}{n}\right)$ where ${\widehat{x}}_{n}$ is the empirical mean of the sample $({x}_{1},\cdots ,{x}_{n})$, and ${\widehat{\sigma}}_{n}^{2}$ its unbiased empirical variance.
Then, we evaluate the likelihood of the sample with respect to the Binomial distribution parameterized with $\left(\lceil \frac{{\widehat{x}}_{n}^{2}}{{\widehat{x}}_{n}-{\widehat{\sigma}}_{n}^{2}}\rceil ,\frac{{\widehat{x}}_{n}}{n}\right)$. By testing successively $n+1$ and $n-1$ instead of $n$, we determine the variation of the likelihood of the sample with respect to the Binomial distributions parameterized with $(n+1,{p}_{n+1})$ and $(n-1,{p}_{n-1})$. We then iterate in the direction that makes the likelihood increase, until the likelihood stops increasing. The last couple is the one selected.
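The discrete search described above can be sketched as follows (illustrative code, not the OpenTURNS implementation; the helper names and the safety cap on the number of steps are ours):

```python
import math
import random

def binomial_loglik(sample, n, p):
    """Log-likelihood of an integer sample under Binomial(n, p)."""
    if p <= 0.0 or p >= 1.0 or n < max(sample):
        return -math.inf
    return sum(
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
        for k in sample
    )

def fit_binomial(sample):
    m = len(sample)
    mean = sum(sample) / m
    var = sum((x - mean) ** 2 for x in sample) / (m - 1)
    # Moment-based starting point, guarded so that n covers the sample range.
    n = max(sample)
    if mean > var:
        n = max(n, math.ceil(mean ** 2 / (mean - var)))
    best = binomial_loglik(sample, n, mean / n)
    for _ in range(1000):  # safety cap on the discrete search
        moved = False
        for cand in (n + 1, n - 1):
            if cand < 1:
                continue
            ll = binomial_loglik(sample, cand, mean / cand)
            if ll > best:
                n, best, moved = cand, ll, True
                break
        if not moved:
            break
    return n, mean / n

# Check on a synthetic Binomial(10, 0.3) sample.
rng = random.Random(7)
sample = [sum(rng.random() < 0.3 for _ in range(10)) for _ in range(2000)]
n_hat, p_hat = fit_binomial(sample)
```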
Details for the Skellam distribution :
The estimators of $({\lambda}_{1},{\lambda}_{2})$ write:
$$\begin{array}{c}\hfill \begin{array}{ccc}{\widehat{{\lambda}_{1}}}_{n}\hfill & =& \frac{1}{2}({\widehat{\sigma}}_{n}^{2}+{\widehat{x}}_{n})\hfill \\ {\widehat{{\lambda}_{2}}}_{n}\hfill & =& \frac{1}{2}({\widehat{\sigma}}_{n}^{2}-{\widehat{x}}_{n})\hfill \end{array}\end{array}$$ 
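These estimators follow from the Skellam moment relations mean $={\lambda}_{1}-{\lambda}_{2}$ and variance $={\lambda}_{1}+{\lambda}_{2}$, and can be checked in a few lines (an illustrative sketch, not the OpenTURNS API):

```python
def fit_skellam(sample):
    """Moment estimators (lambda1, lambda2) for the Skellam distribution."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    # mean = lambda1 - lambda2 and variance = lambda1 + lambda2.
    return 0.5 * (var + mean), 0.5 * (var - mean)

# Tiny hand-checkable sample: mean = 1, unbiased variance = 10/3.
lambda1, lambda2 = fit_skellam([2, -1, 3, 0])
```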
Discrete multivariate distributions:
Dirac  ${\widehat{point}}_{n}={\underline{x}}_{1}$ 
Multinomial  Maximum likelihood estimators 
UserDefined  Uniform distribution over the sample. 
Copula distributions :
We denote by ${\widehat{\tau}}_{n}$ the Kendall's $\tau$ of the sample and by ${\widehat{\rho}}_{n}$ its Spearman correlation coefficient. AMH is the Ali-Mikhail-Haq copula and FGM the Farlie-Gumbel-Morgenstern one.
AMH  ${\widehat{\theta}}_{n}$ solution of ${\widehat{\tau}}_{n}=\frac{3\theta -2}{3\theta}-\frac{2{(1-\theta )}^{2}ln(1-\theta )}{3{\theta}^{2}}$. 
Clayton  ${\widehat{\theta}}_{n}=\frac{2{\widehat{\tau}}_{n}}{1-{\widehat{\tau}}_{n}}$. 
FGM  ${\widehat{\theta}}_{n}=\frac{9}{2}{\widehat{\tau}}_{n}$ if $\left|{\widehat{\theta}}_{n}\right|<1$. Otherwise, ${\widehat{\theta}}_{n}=3{\widehat{\rho}}_{n}$ if $\left|{\widehat{\theta}}_{n}\right|<1$. Otherwise, the estimation is not possible. 
Frank  ${\widehat{\theta}}_{n}$ solution of ${\widehat{\tau}}_{n}=1-4\frac{1-D({\widehat{\theta}}_{n},1)}{{\widehat{\theta}}_{n}}$ where $D$ is the Debye function defined as $D(x,n)=\frac{n}{{x}^{n}}{\int}_{0}^{x}\frac{{t}^{n}}{{e}^{t}-1}dt$. 
Gumbel  ${\widehat{\theta}}_{n}=\frac{1}{1-{\widehat{\tau}}_{n}}$. 
Normal  The correlation matrix $\underline{\underline{R}}$ is such that ${R}_{ij}=sin\left(\frac{\pi}{2}{\widehat{\tau}}_{n,ij}\right)$. 
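As an illustration, the Clayton row of the table reduces to computing the sample Kendall's $\tau$ and applying ${\widehat{\theta}}_{n}=2{\widehat{\tau}}_{n}/(1-{\widehat{\tau}}_{n})$ (an $O(N^2)$ sketch; the function names are ours, not the OpenTURNS API):

```python
def kendall_tau(xs, ys):
    """Sample Kendall's tau: (concordant - discordant) / total pairs."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def fit_clayton(xs, ys):
    """Clayton theta estimator from the table above."""
    tau = kendall_tau(xs, ys)
    return 2.0 * tau / (1.0 - tau)

# Tiny hand-checkable sample: 2 concordant pairs, 1 discordant -> tau = 1/3.
theta_hat = fit_clayton([1, 2, 3], [1, 3, 2])
```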
Other notations
Link with OpenTURNS methodology
Huang J., "Maximum Likelihood Estimation of Dirichlet Distribution Parameters".
Koay C.G., Basser P.J., "Analytically exact correction scheme for signal extraction from noisy magnitude MR signals", Journal of Magnetic Resonance 179, 317-322, 2006.
G. Matthys & J. Beirlant, "Estimating the extreme value index and high quantiles with exponential regression models", Statistica Sinica, 13, 850-880, 2003.
Saporta G. (1990). "Probabilités, Analyse de données et Statistique", Technip.
Dixon W.J. & Massey F.J. (1983) "Introduction to Statistical Analysis (4th ed.)", McGraw-Hill.
Examples
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to verify the compatibility between a sample of data $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$ and a previously chosen candidate probability distribution. OpenTURNS enables the use of graphical tools to answer this question in the one-dimensional case ${n}_{X}=1$, for a continuous distribution.
Principle
The QQ-plot and Henry line tests are defined for the case ${n}_{X}=1$. Thus we denote $\underline{X}={X}^{1}=X$. The first graphical tool provided by OpenTURNS is a QQ-plot (where "QQ" stands for "quantile-quantile"). In the specific case of a Normal distribution (see [standard parametric models]), Henry's line may also be used.
QQ-plot
A QQ-plot is based on the notion of quantile. The $\alpha$-quantile ${q}_{X}\left(\alpha \right)$ of $X$, where $\alpha \in (0,1)$, is defined as follows:
$$\begin{array}{c}\hfill \mathbb{P}\left(X\le {q}_{X}\left(\alpha \right)\right)=\alpha \end{array}$$ 
If a sample $\left\{{x}_{1},...,{x}_{N}\right\}$ of $X$ is available, the quantile can be estimated empirically:
the sample $\left\{{x}_{1},...,{x}_{N}\right\}$ is first placed in ascending order, which gives the sample $\left\{{x}_{\left(1\right)},...,{x}_{\left(N\right)}\right\}$;
then, an estimate of the $\alpha $quantile is:
$$\begin{array}{c}\hfill {\widehat{q}}_{X}\left(\alpha \right)={x}_{\left(\left[N\alpha \right]+1\right)}\end{array}$$ 
where $\left[N\alpha \right]$ denotes the integral part of $N\alpha $.
Thus, the ${j}^{\mathrm{th}}$ smallest value of the sample ${x}_{\left(j\right)}$ is an estimate ${\widehat{q}}_{X}\left(\alpha \right)$ of the $\alpha$-quantile where $\alpha =(j-1)/N$ ($1<j\le N$).
Let us then consider the candidate probability distribution being tested, and let us denote by $F$ its cumulative distribution function. An estimate of the $\alpha $quantile can be also computed from $F$:
$$\begin{array}{c}\hfill {\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)={F}^{-1}\left((j-1)/N\right)\end{array}$$ 
If $F$ is really the cumulative distribution function of $X$, then ${\widehat{q}}_{X}\left(\alpha \right)$ and ${\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)$ should be close. Thus, graphically, the points $\left\{\left({\widehat{q}}_{X}\left(\alpha \right),{\widehat{q}}_{X}^{\text{'}}\left(\alpha \right)\right),\phantom{\rule{4pt}{0ex}}\alpha =(j-1)/N,\phantom{\rule{4pt}{0ex}}1<j\le N\right\}$ should be close to the diagonal.
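The construction of the QQ-plot point cloud can be sketched as follows, here against an Exponential(1) candidate whose inverse CDF is $-log(1-\alpha )$ (illustrative code; OpenTURNS provides dedicated graphical services for this):

```python
import math
import random

def qq_points(sample, inv_cdf):
    """Pairs (empirical quantile, candidate quantile) for a QQ-plot."""
    xs = sorted(sample)
    n = len(xs)
    # x_(j) estimates the alpha-quantile with alpha = (j - 1)/N, 1 < j <= N.
    return [(xs[j - 1], inv_cdf((j - 1) / n)) for j in range(2, n + 1)]

rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(200)]
# Candidate distribution Exponential(1), whose inverse CDF is -log(1 - alpha).
points = qq_points(sample, lambda a: -math.log(1.0 - a))
```

Plotting these pairs and checking that they hug the diagonal reproduces the graphical test described above.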
The following figure illustrates the principle of a QQ-plot with a sample of size $N=50$. Note that the unit of the two axes is that of the variable $X$ studied; the quantiles determined via $F$ are called here "value of $T$". In this example, the points remain close to the diagonal and the hypothesis "$F$ is the cumulative distribution function of $X$" does not seem irrelevant, even if a more quantitative analysis (see for instance [Kolmogorov-Smirnov goodness-of-fit test]) should be carried out to confirm this.
In this second example, the candidate distribution function is clearly irrelevant.
Henry's line
This second graphical tool is only relevant if the candidate distribution being tested is Gaussian. It also uses the ordered sample $\left\{{x}_{\left(1\right)},...,{x}_{\left(N\right)}\right\}$ introduced for the QQ-plot, and the empirical cumulative distribution function ${\widehat{F}}_{N}$ presented in [empirical cumulative distribution function].
By definition,
$$\begin{array}{c}\hfill {x}_{\left(j\right)}={\widehat{F}}_{N}^{1}\left(\frac{j}{N}\right)\end{array}$$ 
Then, let us denote by $\Phi $ the cumulative distribution function of a Normal distribution with mean 0 and standard deviation 1. The quantity ${t}_{\left(j\right)}$ is defined as follows:
$$\begin{array}{c}\hfill {t}_{\left(j\right)}={\Phi}^{1}\left(\frac{j}{N}\right)\end{array}$$ 
If $X$ is distributed according to a normal probability distribution with mean $\mu$ and standard deviation $\sigma$, then the points $\left\{\left({x}_{\left(j\right)},{t}_{\left(j\right)}\right),\phantom{\rule{4pt}{0ex}}1\le j\le N\right\}$ should be close to the line defined by $t=(x-\mu )/\sigma$. This comes from a property of the normal distribution: if the distribution of $X$ is really $\mathcal{N}(\mu ,\sigma )$, then the distribution of $(X-\mu )/\sigma$ is $\mathcal{N}(0,1)$.
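Henry's line can be reproduced with the Python standard library, whose `statistics.NormalDist` provides ${\Phi}^{-1}$ (an illustrative sketch; the sample parameters and seed are arbitrary choices):

```python
import random
from statistics import NormalDist

rng = random.Random(1)
mu, sigma = 5.0, 2.0  # "true" parameters chosen for the demonstration
xs = sorted(rng.gauss(mu, sigma) for _ in range(500))
N = len(xs)
# t_(j) = Phi^{-1}(j / N); j = N is left out since Phi^{-1}(1) = +infinity.
points = [(xs[j - 1], NormalDist().inv_cdf(j / N)) for j in range(1, N)]
# For normal data the points should hug the line t = (x - mu) / sigma.
residuals = [t - (x - mu) / sigma for x, t in points]
```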
The following figure illustrates the principle of Henry's graphical test with a sample of size $N=50$. Note that only the unit of the horizontal axis is that of the variable $X$ studied. In this example, the points remain close to a line and the hypothesis "the distribution of $X$ is Gaussian" does not seem irrelevant, even if a more quantitative analysis (see for instance [Kolmogorov-Smirnov goodness-of-fit test]) should be carried out to confirm this.
In this second example, the hypothesis of a gaussian distribution seems far less relevant because of the behaviour for small values of $X$.
Kendall plot
In the bivariate case, the Kendall plot test makes it possible to validate the choice of a specific copula model, or to verify that two samples share the same copula model.
Let $\underline{X}$ be a bivariate random vector whose copula is denoted $C$.
Let ${\left({\underline{X}}^{i}\right)}_{1\le i\le N}$ be a sample of $\underline{X}$.
We note :
$$\begin{array}{c}\hfill {\displaystyle \forall i\ge 1,\phantom{\rule{0.277778em}{0ex}}{H}_{i}=\frac{1}{N-1}Card\left\{j\in [1,N],j\ne i,\phantom{\rule{0.166667em}{0ex}}\phantom{\rule{0.166667em}{0ex}}{x}_{1}^{j}\le {x}_{1}^{i}\phantom{\rule{4.pt}{0ex}}\text{and}\phantom{\rule{4.pt}{0ex}}{x}_{2}^{j}\le {x}_{2}^{i}\right\}}\end{array}$$ 
and $({H}_{\left(1\right)},\cdots ,{H}_{\left(N\right)})$ the ordered statistics of $({H}_{1},\cdots ,{H}_{N})$.
The statistic ${W}_{i}$ is defined by :
$${W}_{i}=N{C}_{N-1}^{i-1}{\int}_{0}^{1}t{K}_{0}{\left(t\right)}^{i-1}{(1-{K}_{0}\left(t\right))}^{N-i}\phantom{\rule{0.166667em}{0ex}}d{K}_{0}\left(t\right)$$  (103) 
where ${K}_{0}\left(t\right)$ is the cumulative distribution function of ${H}_{i}$. One can show that this is the cumulative distribution function of the random variable $C(U,V)$ when $U$ and $V$ are independent and follow $Uniform(0,1)$ distributions.
In OpenTURNS 0.15.0, Eq. (103) is evaluated with the Monte Carlo sampling method: OpenTURNS generates $n$ samples of size $N$ from the bivariate copula $C$, in order to have $n$ realizations of the statistics ${H}_{\left(i\right)},\forall 1\le i\le N$ and obtain an estimate of ${W}_{i}=E\left[{H}_{\left(i\right)}\right],\forall i\le N$.
When testing a specific copula with respect to a sample, the Kendall plot test draws the points ${({W}_{i},{H}_{\left(i\right)})}_{1\le i\le N}$. If the points lie on the first diagonal, the copula model is validated.
When testing whether two samples have the same copula, the Kendall plot test draws the points ${({H}_{\left(i\right)}^{1},{H}_{\left(i\right)}^{2})}_{1\le i\le N}$ respectively associated with the first and second sample. Note that the two samples must have the same size.
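The statistics ${H}_{i}$ can be computed directly from their definition (an $O(N^2)$ illustrative sketch, not the OpenTURNS implementation):

```python
def kendall_plot_statistics(sample):
    """Ordered H_(i) statistics for a list of bivariate (x1, x2) pairs."""
    n = len(sample)
    hs = []
    for i, (x1, x2) in enumerate(sample):
        # Proportion of the other points dominated componentwise by point i.
        count = sum(
            1
            for j, (y1, y2) in enumerate(sample)
            if j != i and y1 <= x1 and y2 <= x2
        )
        hs.append(count / (n - 1))
    return sorted(hs)

# On a perfectly concordant 3-point sample the H_(i) are 0, 1/2 and 1.
hs = kendall_plot_statistics([(1, 1), (2, 2), (3, 3)])
```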
In Figures 52 to 53, the data 1 and data 2 have been generated from a $Frank(1.5)$ copula, and data 3 from a $Gumbel(4.5)$ copula.
Figures 52 and 52 respectively validate and invalidate the Frank copula model for data 1 and data 2.
Figures 53 and 53 respectively validate that data 1 and data 2 share the same copula, and show that data 1 and data 3 do not share the same copula.
The Kendall Plot test validates the use of the Frank copula model for the data 1. 
The Kendall Plot test invalidates the use of the Frank copula model for the data 1. 
The Kendall Plot test validates that data 1 and data 2 have the same copula model. 
The Kendall Plot test invalidates that data 1 and data 3 have the same copula model. 
Remark: in the case where you want to test a sample with respect to a specific copula, if the size of the sample is greater than 500, we recommend using the second form of the Kendall plot test: generate a sample of the same size from the candidate copula and then test the two samples against each other. This approach is more efficient.
Other notations
Link with OpenTURNS methodology
The following bibliographical references provide main starting points for further study of this method:
Saporta G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to verify the compatibility between a sample of data $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$ and a candidate probability distribution previously chosen. OpenTURNS enables the use of the ${\chi}^{2}$ Goodness-of-Fit test to answer this question in the one-dimensional case ${n}_{X}=1$, for a discrete distribution.
Principle
Let us limit the case to ${n}_{X}=1$. Thus we denote $\underline{X}={X}^{1}=X$. We also note that, as we are considering discrete distributions, i.e. those for which the possible values of $X$ belong to a discrete set $\mathcal{E}$, the candidate distribution is characterised by the probabilities ${\left\{p(x;\underline{\theta})\right\}}_{x\in \mathcal{E}}$.
The chi-squared test is based on the fact that, if the candidate distribution is appropriate, the number of values in the sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$ that are equal to $x$ should be on average equal to $Np(x;\underline{\theta})$. The idea is therefore to compare the "theoretical values" with the actually observed values. This comparison is performed with the aid of the following "distance":
$$\begin{array}{c}\hfill {\widehat{D}}_{N}^{2}=\sum _{x\in {\mathcal{E}}_{N}}\frac{{\left(Np\left(x\right)-n\left(x\right)\right)}^{2}}{n\left(x\right)}\end{array}$$ 
where ${\mathcal{E}}_{N}$ denotes the elements of $\mathcal{E}$ which have been observed at least once in the data sample and where $n\left(x\right)$ denotes the number of data values in the sample that are equal to $x$.
The probability distribution of the distance ${\widehat{D}}_{N}^{2}$ is asymptotically known (i.e. as the size of the sample tends to infinity), and this asymptotic distribution does not depend on the candidate distribution being tested. If $N$ is sufficiently large, this means that for a probability $\alpha $, one can calculate the threshold (critical value) ${d}_{\alpha}$ such that:
if ${\widehat{D}}_{N}>{d}_{\alpha}$, we reject the candidate distribution with a risk of error $\alpha $,
if ${\widehat{D}}_{N}\le {d}_{\alpha}$, the candidate distribution is considered acceptable.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the candidate distribution is rejected. Thus, the candidate distribution will be accepted if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the greater ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
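As a minimal illustration of the distance defined above, the following numpy sketch computes ${\widehat{D}}_{N}^{2}$ for hypothetical die-roll data (an illustration of the formula only, not the OpenTURNS API; the candidate probabilities are ours):

```python
import numpy as np

def chi_square_distance(sample, pmf):
    """D_hat_N^2 = sum over observed values x of (N*p(x) - n(x))^2 / n(x)."""
    values, counts = np.unique(np.asarray(sample), return_counts=True)
    N = len(sample)
    p = np.array([pmf(v) for v in values])
    return np.sum((N * p - counts) ** 2 / counts)

rolls = [1, 2, 3, 4, 5, 6] * 10                    # perfectly balanced die sample
fair = lambda v: 1.0 / 6.0                         # candidate: fair die
loaded = lambda v: 0.5 if v == 6 else 0.1          # candidate: loaded die
```

The fair candidate yields a zero distance on this balanced sample while the loaded candidate yields a large one; comparing the distance to the chi-squared critical value ${d}_{\alpha}$ then gives the decision.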
Link with OpenTURNS methodology
Input data:
$\left\{{x}_{1},...,{x}_{N}\right\}$ : data sample
Distribution: probability distribution that we are testing for goodnessoffit
Parameters:
$\alpha $ : Level of significance for the test
Outputs:
Result: Binary variable specifying whether the candidate distribution is rejected (0) or not (1)
${\alpha}_{\mathrm{lim}}$ : $p$-value of the test
References and theoretical basics
Even for discrete distributions, certain precautions must be taken when using this test. Firstly, the critical value ${d}_{\alpha}$ is only valid for a sufficiently large sample size. No rule exists to determine the minimum number of data values necessary in order to use this test; it is often thought, however, that the approximation is reasonable when $N$ is of the order of a few dozen. But whatever the value of $N$, the distance – and similarly the $p$-value – remains a useful tool for comparing different probability distributions to a sample. The distribution which minimizes ${\widehat{D}}_{N}$ – or maximizes the $p$-value – will be of interest to the analyst.
On the other hand, the calculation of ${d}_{\alpha}$ and of the $p$-value should in theory be modified if we are testing the goodness-of-fit of a parametric model whose parameters have been estimated from the same sample. The current version of OpenTURNS, however, does not permit such a modification, and so the results must be used with care when the $p$-value ${\alpha}_{\mathrm{lim}}$ and the desired error risk $\alpha $ are very close.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
D'Agostino, R.B. and Stephens, M.A. (1986). "Goodness-of-Fit Techniques", Marcel Dekker, Inc., New York.
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to verify the compatibility between a sample of data $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$ and a candidate probability distribution previously chosen. OpenTURNS enables the use of the Kolmogorov-Smirnov Goodness-of-Fit test to answer this question in the one-dimensional case ${n}_{X}=1$, for a continuous distribution.
Principle
Let us limit the case to ${n}_{X}=1$. Thus we denote $\underline{X}={X}^{1}=X$. This goodness-of-fit test is based on the maximum distance between the cumulative distribution function ${\widehat{F}}_{N}$ of the sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$ (see [empirical cumulative distribution function] ) and that of the candidate distribution, denoted $F$. This distance may be expressed as follows:
$$\begin{array}{c}\hfill D=\underset{x}{sup}\left|{\widehat{F}}_{N}\left(x\right)-F\left(x\right)\right|\end{array}$$ 
With a sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$ sorted in ascending order, the distance is estimated by:
$$\begin{array}{c}\hfill {\widehat{D}}_{N}=\underset{i=1...N}{max}\,max\left(\left|F\left({x}_{i}\right)-\frac{i-1}{N}\right|,\;\left|\frac{i}{N}-F\left({x}_{i}\right)\right|\right)\end{array}$$ 
The probability distribution of the distance ${\widehat{D}}_{N}$ is asymptotically known (i.e. as the size of the sample tends to infinity). If $N$ is sufficiently large, this means that for a probability $\alpha $ and a candidate distribution type, one can calculate the threshold / critical value ${d}_{\alpha}$ such that:
if ${\widehat{D}}_{N}>{d}_{\alpha}$, we reject the candidate distribution with a risk of error $\alpha $,
if ${\widehat{D}}_{N}\le {d}_{\alpha}$, the candidate distribution is considered acceptable.
Note that ${d}_{\alpha}$ does not depend on the candidate distribution $F$ being tested, and the test is therefore relevant for any continuous distribution.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the candidate distribution is rejected. Thus, the candidate distribution will be accepted if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the greater ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
The diagram below illustrates the principle of comparison with the empirical cumulative distribution function for an ordered sample $\left\{5,6,10,22,27\right\}$; the candidate distribution considered here is the Exponential distribution with parameters $\lambda =0.07$, $\gamma =0$ (see [standard parametric models] ).
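The estimator ${\widehat{D}}_{N}$ can be reproduced directly on this example. The following numpy sketch (an illustration of the formula, not the OpenTURNS implementation) uses the ordered sample $\left\{5,6,10,22,27\right\}$ and the Exponential candidate with $\lambda =0.07$, $\gamma =0$, whose cumulative distribution function is $F\left(x\right)=1-{e}^{-0.07x}$:

```python
import numpy as np

sample = np.array([5.0, 6.0, 10.0, 22.0, 27.0])   # already sorted in ascending order
F = lambda x: 1.0 - np.exp(-0.07 * x)             # Exponential(lambda=0.07, gamma=0) CDF

N = len(sample)
i = np.arange(1, N + 1)
Fx = F(sample)
# sup over i of the deviations just before and just after each jump of F_hat_N
d_hat = np.max(np.maximum(np.abs(Fx - (i - 1) / N), np.abs(i / N - Fx)))
```

Here the maximum deviation is attained at the first sample point, where the empirical function is still zero, so ${\widehat{D}}_{N}=F\left(5\right)\approx 0.295$.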
Link with OpenTURNS methodology
Input data:
$\left\{{x}_{1},...,{x}_{N}\right\}$ : data sample
Distribution: probability distribution that we are testing for goodnessoffit
Parameters:
$\alpha $ : Level of significance for the test
Outputs:
Result: Binary variable specifying whether the candidate distribution is rejected (0) or not (1)
${\alpha}_{\mathrm{lim}}$ : $p$-value of the test
We remind the reader that the underlying theoretical results of the test are asymptotic. There is no rule to determine the minimum number of data values one needs to use this test; but it is often considered a reasonable approximation when $N$ is of the order of a few dozen. But whatever the value of $N$, the distance – and similarly the $p$-value – remains a useful tool for comparing different probability distributions to a sample. The distribution which minimizes ${\widehat{D}}_{N}$ – or maximizes the $p$-value – will be of interest to the analyst.
We also point out that the calculation of ${d}_{\alpha}$ should in theory be modified if one is testing the goodness-of-fit to a parametric model whose parameters have been estimated from the same sample. The current version of OpenTURNS does not allow this modification, and the results should therefore be used with caution when the $p$-value ${\alpha}_{\mathrm{lim}}$ and the desired error risk $\alpha $ are very close.
Readers interested in Goodness-of-Fit tests for continuous distributions are referred to [Cramer-von Mises test] and [Anderson-Darling test] in the reference documentation.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
NIST/SEMATECH eHandbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
D'Agostino, R.B. and Stephens, M.A. (1986). "Goodness-of-Fit Techniques", Marcel Dekker, Inc., New York.
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to verify the compatibility between a sample of data $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$ and a candidate probability distribution previously chosen. OpenTURNS enables the use of the Cramer-von Mises Goodness-of-Fit test to answer this question in the one-dimensional case ${n}_{X}=1$, for a continuous distribution. The current version is limited to the case of the Normal distribution.
Principle
Let us limit the case to ${n}_{X}=1$. Thus we denote $\underline{X}={X}^{1}=X$. This goodness-of-fit test is based on the distance between the cumulative distribution function ${\widehat{F}}_{N}$ of the sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$ (see [empirical cumulative distribution function] ) and that of the candidate distribution, denoted $F$. This distance is no longer the maximum deviation as in the [Kolmogorov-Smirnov test] but the squared deviation integrated over the entire variation domain of the distribution:
$$\begin{array}{c}\hfill D={\int}_{-\infty}^{+\infty}{\left[F\left(x\right)-{\widehat{F}}_{N}\left(x\right)\right]}^{2}\,dF\left(x\right)\end{array}$$ 
With a sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$ sorted in ascending order, the distance is estimated by:
$$\begin{array}{c}\hfill {\widehat{D}}_{N}=\frac{1}{12N}+\sum _{i=1}^{N}{\left[\frac{2i-1}{2N}-F\left({x}_{i}\right)\right]}^{2}\end{array}$$ 
The probability distribution of the distance ${\widehat{D}}_{N}$ is asymptotically known (i.e. as the size of the sample tends to infinity). If $N$ is sufficiently large, this means that for a probability $\alpha $ and a candidate distribution type, one can calculate the threshold / critical value ${d}_{\alpha}$ such that:
if ${\widehat{D}}_{N}>{d}_{\alpha}$, we reject the candidate distribution with a risk of error $\alpha $,
if ${\widehat{D}}_{N}\le {d}_{\alpha}$, the candidate distribution is considered acceptable.
Note that ${d}_{\alpha}$ depends on the candidate distribution $F$ being tested; the current version of OpenTURNS is limited to the case of the Normal distribution.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the candidate distribution is rejected. Thus, the candidate distribution will be accepted if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the greater ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
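The estimator of ${\widehat{D}}_{N}$ can be sketched in a few lines of numpy (an illustration of the formula only; the OpenTURNS implementation of the test itself is restricted to the Normal case, whereas the formula is written for a generic continuous candidate $F$). For a $Uniform(0,1)$ candidate, $F\left(x\right)=x$, a sample placed exactly at the points $\frac{2i-1}{2N}$ attains the minimal possible distance $\frac{1}{12N}$:

```python
import numpy as np

def cramer_von_mises(sample, F):
    """D_hat_N = 1/(12N) + sum_i [ (2i-1)/(2N) - F(x_(i)) ]^2 (sample is sorted first)."""
    x = np.sort(np.asarray(sample, dtype=float))
    N = len(x)
    i = np.arange(1, N + 1)
    return 1.0 / (12 * N) + np.sum(((2 * i - 1) / (2.0 * N) - F(x)) ** 2)

N = 10
ideal = (2 * np.arange(1, N + 1) - 1) / (2.0 * N)   # F(x_(i)) = (2i-1)/(2N) exactly
```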
Other notations
Link with OpenTURNS methodology
Input data:
$\left\{{x}_{1},...,{x}_{N}\right\}$ : data sample
Distribution: normal probability distribution that we are testing for goodnessoffit
Parameters:
$\alpha $ : Level of significance for the test
Outputs:
Result: Binary variable specifying whether the candidate distribution is rejected (0) or not (1)
${\alpha}_{\mathrm{lim}}$ : $p$-value of the test
We remind the reader that the underlying theoretical results of the test are asymptotic. There is no rule to determine the minimum number of data values one needs to use this test; but it is often considered a reasonable approximation when $N$ is of the order of a few dozen. But whatever the value of $N$, the distance – and similarly the $p$-value – remains a useful tool for comparing different probability distributions to a sample. The distribution which minimizes ${\widehat{D}}_{N}$ – or maximizes the $p$-value – will be of interest to the analyst.
We also point out that the calculation of ${d}_{\alpha}$ should in theory be modified if one is testing the goodness-of-fit to a parametric model whose parameters have been estimated from the same sample. The current version of OpenTURNS does not allow this modification, and the results should therefore be used with caution when the $p$-value ${\alpha}_{\mathrm{lim}}$ and the desired error risk $\alpha $ are very close.
Readers interested in Goodness-of-Fit tests for continuous distributions are referred to [Kolmogorov-Smirnov test] and [Anderson-Darling test] in the reference documentation.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
D'Agostino, R.B. and Stephens, M.A. (1986). "Goodness-of-Fit Techniques", Marcel Dekker, Inc., New York.
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to verify the compatibility between a sample of data $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$ and a candidate probability distribution previously chosen. OpenTURNS enables the use of the Anderson-Darling Goodness-of-Fit test to answer this question in the one-dimensional case ${n}_{X}=1$, for a continuous distribution. The current version is limited to the case of the Normal distribution.
Principle
Let us limit the case to ${n}_{X}=1$. Thus we denote $\underline{X}={X}^{1}=X$. This goodness-of-fit test is based on the distance between the cumulative distribution function ${\widehat{F}}_{N}$ of the sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$ (see [empirical cumulative distribution function] ) and that of the candidate distribution, denoted $F$. This distance is of quadratic type, as in the [Cramer-von Mises test] , but gives more weight to deviations of extreme values:
$$\begin{array}{c}\hfill D={\int}_{-\infty}^{+\infty}\frac{{\displaystyle {\left[F\left(x\right)-{\widehat{F}}_{N}\left(x\right)\right]}^{2}}}{{\displaystyle F\left(x\right)\left(1-F\left(x\right)\right)}}\,dF\left(x\right)\end{array}$$ 
With a sample $\left\{{x}_{1},{x}_{2},...,{x}_{N}\right\}$, the distance is estimated by:
$$\begin{array}{c}\hfill {\widehat{D}}_{N}=-N-\sum _{i=1}^{N}\frac{2i-1}{N}\left[ln\,F\left({x}_{\left(i\right)}\right)+ln\left(1-F\left({x}_{(N+1-i)}\right)\right)\right]\end{array}$$ 
where $\left\{{x}_{\left(1\right)},...,{x}_{\left(N\right)}\right\}$ denotes the sample sorted in ascending order.
The probability distribution of the distance ${\widehat{D}}_{N}$ is asymptotically known (i.e. as the size of the sample tends to infinity). If $N$ is sufficiently large, this means that for a probability $\alpha $ and a candidate distribution type, one can calculate the threshold / critical value ${d}_{\alpha}$ such that:
if ${\widehat{D}}_{N}>{d}_{\alpha}$, we reject the candidate distribution with a risk of error $\alpha $,
if ${\widehat{D}}_{N}\le {d}_{\alpha}$, the candidate distribution is considered acceptable.
Note that ${d}_{\alpha}$ depends on the candidate distribution $F$ being tested; the current version of OpenTURNS is limited to the case of the Normal distribution.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the candidate distribution is rejected. Thus, the candidate distribution will be accepted if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the greater ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
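The estimator above can likewise be sketched in numpy (an illustration of the formula only; the OpenTURNS implementation of the test is restricted to the Normal case, whereas the formula holds for a generic continuous candidate $F$). With a $Uniform(0,1)$ candidate, $F\left(x\right)=x$, the two-point sample $\left\{0.25,0.75\right\}$ gives a value that can be checked by hand:

```python
import numpy as np

def anderson_darling(sample, F):
    """D_hat_N = -N - sum_i (2i-1)/N * [ ln F(x_(i)) + ln(1 - F(x_(N+1-i))) ]."""
    x = np.sort(np.asarray(sample, dtype=float))
    N = len(x)
    i = np.arange(1, N + 1)
    return -N - np.sum((2 * i - 1) / N * (np.log(F(x)) + np.log(1.0 - F(x[::-1]))))

d = anderson_darling([0.25, 0.75], lambda x: x)
# by hand: -2 - [0.5*(ln 0.25 + ln 0.25) + 1.5*(ln 0.75 + ln 0.75)] ~ 0.2493
```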
Link with OpenTURNS methodology
Input data:
$\left\{{x}_{1},...,{x}_{N}\right\}$ : data sample
Distribution: normal probability distribution that we are testing for goodnessoffit
Parameters:
$\alpha $ : Level of significance for the test
Outputs:
${\widehat{D}}_{N}$ : Distance between theoretical and empirical values
${d}_{\alpha}$ : Threshold / critical value; if it is exceeded, the candidate distribution is rejected
Result: Binary variable specifying whether the candidate distribution is rejected or not
We remind the reader that the underlying theoretical results of the test are asymptotic. There is no rule to determine the minimum number of data values one needs to use this test; but it is often considered a reasonable approximation when $N$ is of the order of a few dozen. But whatever the value of $N$, the distance – and similarly the $p$-value – remains a useful tool for comparing different probability distributions to a sample. The distribution which minimizes ${\widehat{D}}_{N}$ – or maximizes the $p$-value – will be of interest to the analyst.
We also point out that the calculation of ${d}_{\alpha}$ should in theory be modified if one is testing the goodness-of-fit to a parametric model whose parameters have been estimated from the same sample. The current version of OpenTURNS does not allow this modification, and the results should therefore be used with caution when the $p$-value ${\alpha}_{\mathrm{lim}}$ and the desired error risk $\alpha $ are very close.
Readers interested in Goodness-of-Fit tests for continuous distributions are referred to [Kolmogorov-Smirnov test] and [Cramer-von Mises test] in the reference documentation.
The following bibliographical references provide main starting points for further study of this method:
NIST/SEMATECH eHandbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
D'Agostino, R.B. and Stephens, M.A. (1986). "Goodness-of-Fit Techniques", Marcel Dekker, Inc., New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to rank several candidate distributions with respect to a sample of data $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{N}\right\}$. OpenTURNS enables the use of the Bayesian Information Criterion (BIC) to answer this question in the one-dimensional case ${n}_{X}=1$.
Principle
Let us limit the case to ${n}_{X}=1$. Thus we denote $\underline{X}={X}^{1}=X$. Moreover, let us denote by ${\mathcal{M}}_{1}$,..., ${\mathcal{M}}_{K}$ the parametric models envisaged by the user among the [standard parametric models] . We suppose here that the parameters of these models have been estimated previously by the [maximum likelihood method] on the basis of the sample $\left\{{\underline{x}}_{1},{\underline{x}}_{2},...,{\underline{x}}_{n}\right\}$. We denote by ${L}_{i}$ the maximized likelihood for the model ${\mathcal{M}}_{i}$.
By definition of the likelihood, the higher ${L}_{i}$, the better the model describes the sample. However, using the likelihood as a criterion to rank the candidate probability distributions would involve a risk: one would almost always favour complex models involving many parameters. While such models indeed provide a large number of degrees of freedom that can be used to fit the sample, one has to keep in mind that complex models may be less robust than simpler models with fewer parameters. Indeed, the limited available information ($n$ data points) does not allow the robust estimation of too many parameters.
The BIC criterion can be used to avoid this problem. The principle is to rank ${\mathcal{M}}_{1}$,..., ${\mathcal{M}}_{K}$ according to the following quantity:
$$\begin{array}{c}\hfill {\mathrm{BIC}}_{i}=log\left({L}_{i}\right)-\frac{{p}_{i}}{2}log\left(n\right)\end{array}$$ 
where ${p}_{i}$ denotes the number of parameters being adjusted for the model ${\mathcal{M}}_{i}$. The larger ${\mathrm{BIC}}_{i}$, the better the model. Note that the idea is to introduce a penalization term that increases with the number of parameters to be estimated. A complex model will then have a good score only if the gain in terms of likelihood is high enough to justify the number of parameters used.
The term "Bayesian Information Criterion" comes from the interpretation of the quantity ${\mathrm{BIC}}_{i}$. In a Bayesian context, the unknown "true" model may be seen as a random variable. Suppose now that the user does not have any informative prior information on which model is more relevant among ${\mathcal{M}}_{1}$,..., ${\mathcal{M}}_{K}$; all the models are thus equally likely from the point of view of the user. Then, one can show that ${\mathrm{BIC}}_{i}$ is an approximation of the logarithm of the posterior probability of the model ${\mathcal{M}}_{i}$.
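The ranking can be sketched with two candidate models whose maximum likelihood estimators have closed forms (a numpy illustration under our own choice of sample and candidates, not the OpenTURNS API): an Exponential model (${p}_{i}=1$ parameter) and a Normal model (${p}_{i}=2$ parameters), fitted to data drawn from an Exponential distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200)
n = len(x)

# Maximized log-likelihoods from the closed-form MLEs of each model
sigma2 = x.var()                                      # Normal: mu_hat = mean, sigma2_hat = variance
log_l_norm = -n / 2.0 * np.log(2 * np.pi * sigma2) - n / 2.0
log_l_exp = -n * (np.log(x.mean()) + 1.0)             # Exponential: lambda_hat = 1 / mean

# BIC_i = log(L_i) - p_i/2 * log(n): the larger, the better
bic_norm = log_l_norm - 2.0 / 2.0 * np.log(n)
bic_exp = log_l_exp - 1.0 / 2.0 * np.log(n)
```

On such data the Exponential model obtains both a higher maximized likelihood and a smaller penalty, hence the larger BIC.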
Readers interested in other ways to rank candidate models are referred to [Kolmogorov-Smirnov test] , [Cramer-von Mises test] and [Anderson-Darling test] in the reference documentation.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
D'Agostino, R.B. and Stephens, M.A. (1986). "Goodness-of-Fit Techniques", Marcel Dekker, Inc., New York.
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Burnham, K.P., and Anderson, D.R. (2002). "Model Selection and Multimodel Inference: A Practical Information Theoretic Approach", Springer
Mathematical description
This method deals with the parametric modelling of a probability distribution for a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It aims to measure a type of dependence (here a linear correlation) which may exist between two components ${X}^{i}$ and ${X}^{j}$.
Principle
The Pearson's correlation coefficient ${\rho}_{U,V}$ aims to measure the strength of a linear relationship between two random variables $U$ and $V$. It is defined as follows:
$$\begin{array}{c}\hfill {\rho}_{U,V}=\frac{{\displaystyle \mathrm{Cov}\left[U,V\right]}}{{\sigma}_{U}{\sigma}_{V}}\end{array}$$ 
where $\mathrm{Cov}\left[U,V\right]=\mathbb{E}\left[\left(U{m}_{U}\right)\left(V{m}_{V}\right)\right]$, ${m}_{U}=\mathbb{E}\left[U\right]$, ${m}_{V}=\mathbb{E}\left[V\right]$, ${\sigma}_{U}=\sqrt{\mathrm{Var}\left[U\right]}$ and ${\sigma}_{V}=\sqrt{\mathrm{Var}\left[V\right]}$. If we have a sample made up of a set of $N$ pairs $\left\{({u}_{1},{v}_{1}),({u}_{2},{v}_{2}),...,({u}_{N},{v}_{N})\right\}$, Pearson's correlation coefficient can be estimated using the formula:
$$\begin{array}{c}\hfill {\widehat{\rho}}_{U,V}=\frac{{\displaystyle \sum _{i=1}^{N}\left({u}_{i}-\overline{u}\right)\left({v}_{i}-\overline{v}\right)}}{\sqrt{{\displaystyle \sum _{i=1}^{N}{\left({u}_{i}-\overline{u}\right)}^{2}\sum _{i=1}^{N}{\left({v}_{i}-\overline{v}\right)}^{2}}}}\end{array}$$ 
where $\overline{u}$ and $\overline{v}$ represent the empirical means of the samples $({u}_{1},...,{u}_{N})$ and $({v}_{1},...,{v}_{N})$.
Pearson's correlation coefficient takes values between -1 and 1. The closer its absolute value is to 1, the stronger the indication is that a linear relationship exists between variables $U$ and $V$. The sign of Pearson's coefficient indicates if the two variables increase or decrease in the same direction (positive coefficient) or in opposite directions (negative coefficient). We note that a correlation coefficient equal to 0 does not necessarily imply the independence of variables $U$ and $V$: this property is in fact theoretically guaranteed only if $U$ and $V$ both follow a Normal distribution. In all other cases, there are two possible situations in the event of a zero Pearson's correlation coefficient:
the variables $U$ and $V$ are in fact independent,
or a nonlinear relationship exists between $U$ and $V$.
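The estimator ${\widehat{\rho}}_{U,V}$ above, and the warning that a zero coefficient does not imply independence, can be illustrated with a short numpy sketch (the function name and the toy samples are ours):

```python
import numpy as np

def pearson_hat(u, v):
    """Empirical Pearson coefficient, following the estimator above."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    du, dv = u - u.mean(), v - v.mean()
    return np.sum(du * dv) / np.sqrt(np.sum(du**2) * np.sum(dv**2))

u = np.array([0.0, 1.0, 2.0, 3.0])
# exact linear links give +1 / -1; a symmetric parabola gives 0 despite full dependence
```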
Other notations
Link with OpenTURNS methodology
Pearson's correlation coefficient is also used in step C' "Sensitivity Analysis and Ranking of Sources of Uncertainty". If a propagation of uncertainty with Monte Carlo simulation (step C, [Mean and Variance Estimation using Standard Monte Carlo] ) has been carried out, [Pearson's Ranking] shows the user how to rank the components of the input vector $\underline{X}$ according to their impact on the uncertainty of a final variable / output variable defined in step A.
References and theoretical basics
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to find a type of dependency (here a linear correlation) which may exist between two components ${X}^{i}$ and ${X}^{j}$.
Principle
The Pearson's correlation coefficient ${\rho}_{U,V}$, defined in [Pearson's Coefficient] , measures the strength of a linear relationship between two random variables $U$ and $V$. If we have a sample made up of $N$ pairs $\left\{({u}_{1},{v}_{1}),({u}_{2},{v}_{2}),...,({u}_{N},{v}_{N})\right\}$, we denote ${\widehat{\rho}}_{U,V}$ to be the estimated coefficient.
Even in the case where two variables $U$ and $V$ have a Pearson's coefficient ${\rho}_{U,V}$ equal to zero, the estimate ${\widehat{\rho}}_{U,V}$ obtained from the sample may be nonzero: the limited sample size does not provide a perfect image of the real correlation. Pearson's test nevertheless enables one to determine whether the value obtained for ${\widehat{\rho}}_{U,V}$ is significantly different from zero. More precisely, the user first chooses a probability $\alpha $. From this value the critical value ${d}_{\alpha}$ is calculated such that:
if $\left|{\widehat{\rho}}_{U,V}\right|>{d}_{\alpha}$, one can conclude that the real Pearson's correlation coefficient ${\rho}_{U,V}$ is not zero; the risk of error in making this assertion is controlled and equal to $\alpha $;
if $\left|{\widehat{\rho}}_{U,V}\right|\le {d}_{\alpha}$, there is insufficient evidence to reject the null hypothesis ${\rho}_{U,V}=0$.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the null correlation hypothesis is rejected. Thus, Pearson's coefficient is considered nonzero if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the greater ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
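The decision rule can be sketched as follows (a numpy illustration with synthetic data; the critical value $0.197$ for $N=100$ and $\alpha =0.05$ is an approximate figure from standard tables, quoted here as an assumption rather than computed):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = x + 0.1 * rng.normal(size=100)                   # strong linear link with x

w = rng.normal(size=100)
xc = x - x.mean()
z = w - (xc @ (w - w.mean())) / (xc @ xc) * xc       # built to be empirically uncorrelated with x

def rho_hat(a, b):
    a, b = a - a.mean(), b - b.mean()
    return (a @ b) / np.sqrt((a @ a) * (b @ b))

d_alpha = 0.197    # approximate two-sided critical value, N = 100, alpha = 0.05
```

Comparing $\left|{\widehat{\rho}}_{U,V}\right|$ to ${d}_{\alpha}$ then rejects the null hypothesis for $(x,y)$ but not for $(x,z)$.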
Other notations
Link with OpenTURNS methodology
Input data:
Two samples $\left\{{x}_{1}^{i},...,{x}_{N}^{i}\right\}$ and $\left\{{x}_{1}^{j},...,{x}_{N}^{j}\right\}$ of variables ${X}^{i}$ and ${X}^{j}$, each pair $\left({x}_{k}^{i},{x}_{k}^{j}\right)$ corresponding to a simultaneous sampling of the two variables
Parameters:
a probability $\alpha $ taking values strictly between 0 and 1, defining the risk of permissible decision error (significance level)
Outputs:
Result: Binary variable specifying whether the hypothesis of a correlation coefficient equal to 0 is rejected (0) or not (1)
${\alpha}_{\mathrm{lim}}$ : $p$-value of the test
The underlying theory of the Pearson test assumes in fact that the variables ${X}^{i}$ and ${X}^{j}$ are both normally distributed. In all other cases, the decision produced by the test is only valid if the sample size $N$ is sufficiently large (in practice $N\ge $ a few dozen, even if there is no theoretical result that enables us to prove that asymptotic behaviour has been attained).
Still considering the case of distributions other than the Normal distribution, whatever the value of $N$, we recall that ${\rho}_{{X}^{i},{X}^{j}}=0$ does not enable us to conclude that ${X}^{i}$ and ${X}^{j}$ are independent (see [Pearson's Correlation Coefficient] ).
More generally, the numerical value of Pearson's correlation coefficient can only be interpreted when the two variables studied ${X}^{i}$ and ${X}^{j}$ are related in a linear way; the scatter plot of points $\left\{({x}_{1}^{i},{x}_{1}^{j}),...,({x}_{N}^{i},{x}_{N}^{j})\right\}$ provides some indication concerning the validity of this hypothesis.
The following pages describe methods which enable us to test the hypothesis of the Normal distribution using the available samples $\left\{{x}_{1}^{i},...,{x}_{N}^{i}\right\}$ and $\left\{{x}_{1}^{j},...,{x}_{N}^{j}\right\}$: [Kolmogorov-Smirnov Goodness-of-Fit Test] , [Cramer-von Mises Goodness-of-Fit Test] , [Anderson-Darling Goodness-of-Fit Test] .
Outside Pearson's test validity domain (i.e. linear relationship, Normal distributions), [Spearman's test] provides some answers.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Mathematical description
This method deals with the parametric modelling of a probability distribution for a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It aims to measure a type of dependence (here a monotonous correlation) which may exist between two components ${X}^{i}$ and ${X}^{j}$.
Principle
Spearman's correlation coefficient ${\rho}_{U,V}^{S}$ aims to measure the strength of a monotonic relationship between two random variables $U$ and $V$. It is in fact equivalent to Pearson's correlation coefficient after $U$ and $V$ have been transformed so as to linearize any monotonic relationship (remember that Pearson's correlation coefficient may only be used to measure the strength of linear relationships, see [Pearson's Correlation Coefficient] ):
$$\begin{array}{c}\hfill {\rho}_{U,V}^{S}={\rho}_{{F}_{U}\left(U\right),{F}_{V}\left(V\right)}\end{array}$$ 
where ${F}_{U}$ and ${F}_{V}$ denote the cumulative distribution functions of $U$ and $V$.
If we have a sample made up of $N$ pairs $\left\{({u}_{1},{v}_{1}),({u}_{2},{v}_{2}),...,({u}_{N},{v}_{N})\right\}$, the estimation of Spearman's correlation coefficient first requires a rank transformation of the two samples $({u}_{1},...,{u}_{N})$ and $({v}_{1},...,{v}_{N})$. The rank ${u}_{\left[i\right]}$ of the observation ${u}_{i}$ is defined as the position of ${u}_{i}$ in the sample reordered in ascending order: if ${u}_{i}$ is the smallest value in the sample $({u}_{1},...,{u}_{N})$, its rank equals 1; if ${u}_{i}$ is the second smallest value, its rank equals 2, and so forth. The rank transformation takes the sample $({u}_{1},...,{u}_{N})$ as input and produces the sample of ranks $({u}_{\left[1\right]},...,{u}_{\left[N\right]})$ as output.
For example, let us consider the sample $({u}_{1},{u}_{2},{u}_{3},{u}_{4})=(1.5,0.7,5.1,4.3)$. We therefore have $({u}_{\left[1\right]},{u}_{\left[2\right]},{u}_{\left[3\right]},{u}_{\left[4\right]})=(2,1,4,3)$: ${u}_{1}=1.5$ is in fact the second smallest value in the original sample, ${u}_{2}=0.7$ the smallest, and so on.
The estimation of Spearman's correlation coefficient is therefore equal to Pearson's coefficient estimated with the aid of the $N$ pairs $({u}_{\left[1\right]},{v}_{\left[1\right]})$, $({u}_{\left[2\right]},{v}_{\left[2\right]})$, ..., $({u}_{\left[N\right]},{v}_{\left[N\right]})$:
$$\begin{array}{c}\hfill {\widehat{\rho}}_{U,V}^{S}=\frac{{\displaystyle \sum _{i=1}^{N}\left({u}_{\left[i\right]}-{\overline{u}}_{\left[\right]}\right)\left({v}_{\left[i\right]}-{\overline{v}}_{\left[\right]}\right)}}{\sqrt{{\displaystyle \sum _{i=1}^{N}{\left({u}_{\left[i\right]}-{\overline{u}}_{\left[\right]}\right)}^{2}\sum _{i=1}^{N}{\left({v}_{\left[i\right]}-{\overline{v}}_{\left[\right]}\right)}^{2}}}}\end{array}$$ 
where ${\overline{u}}_{\left[\right]}$ and ${\overline{v}}_{\left[\right]}$ represent the empirical means of the samples $({u}_{\left[1\right]},...,{u}_{\left[N\right]})$ and $({v}_{\left[1\right]},...,{v}_{\left[N\right]})$.
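The estimator above can be sketched numerically (using NumPy/SciPy rather than the OpenTURNS classes; the data are simulated for illustration): Pearson's coefficient computed on the ranks reproduces `scipy.stats.spearmanr`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
u = rng.normal(size=50)
v = np.exp(u) + rng.normal(scale=0.1, size=50)  # monotonic, noisy link

# Rank transformation: position of each observation in the ascending sort
u_ranks = stats.rankdata(u)
v_ranks = stats.rankdata(v)

# Spearman's coefficient = Pearson's coefficient of the ranks
rho_hat = np.corrcoef(u_ranks, v_ranks)[0, 1]
rho_ref, _ = stats.spearmanr(u, v)  # reference implementation
```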
Spearman's correlation coefficient takes values between $-1$ and 1. The closer its absolute value is to 1, the stronger the indication is that a monotonic relationship exists between the variables $U$ and $V$. The sign of Spearman's coefficient indicates whether the two variables increase or decrease in the same direction (positive coefficient) or in opposite directions (negative coefficient). Note that a correlation coefficient equal to 0 does not necessarily imply the independence of the variables $U$ and $V$. There are two possible situations in the event of a zero Spearman's correlation coefficient:
the variables $U$ and $V$ are in fact independent,
or a non-monotonic relationship exists between $U$ and $V$.
Other notations
Link with OpenTURNS methodology
Spearman's correlation coefficient is also used in step C' "Sensitivity Analysis and Ranking of Sources of Uncertainty". If a propagation of uncertainty with Monte Carlo simulation (step C, [Mean and Variance Estimation using Standard Monte Carlo] ) has been carried out, [Spearman's Ranking] shows the user how to rank the components of the input vector $\underline{X}$ according to their impact on the uncertainty of a final variable / output variable defined in step A.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the modelling of a probability distribution of a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It seeks to detect a type of dependency (here a monotonic correlation) which may exist between two components ${X}^{i}$ and ${X}^{j}$.
Principle
Spearman's correlation coefficient ${\rho}_{U,V}^{S}$, defined in [Spearman's Coefficient] , measures the strength of a monotonic relationship between two random variables $U$ and $V$. If we have a sample made up of $N$ pairs $\left\{({u}_{1},{v}_{1}),({u}_{2},{v}_{2}),...,({u}_{N},{v}_{N})\right\}$, we denote by ${\widehat{\rho}}_{U,V}^{S}$ the estimated coefficient.
Even in the case where two variables $U$ and $V$ have a Spearman's coefficient ${\rho}_{U,V}^{S}$ equal to zero, the estimate ${\widehat{\rho}}_{U,V}^{S}$ obtained from the sample may be non-zero: the limited sample size does not provide a perfect image of the real correlation. Spearman's test nevertheless enables one to determine whether the value obtained for ${\widehat{\rho}}_{U,V}^{S}$ is significantly different from zero. More precisely, the user first chooses a probability $\alpha $. From this value the critical value ${d}_{\alpha}$ is calculated automatically such that:
if $\left|{\widehat{\rho}}_{U,V}^{S}\right|>{d}_{\alpha}$, one can conclude that the real Spearman's correlation coefficient ${\rho}_{U,V}^{S}$ is not zero; the risk of error in making this assertion is controlled and equal to $\alpha $;
if $\left|{\widehat{\rho}}_{U,V}^{S}\right|\le {d}_{\alpha}$, there is insufficient evidence to reject the null hypothesis ${\rho}_{U,V}^{S}=0$.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ beyond which the null correlation hypothesis is rejected. Thus, Spearman's coefficient is deemed non-zero if and only if ${\alpha}_{\mathrm{lim}}$ is smaller than the value $\alpha $ desired by the user. Note that the smaller ${\alpha}_{\mathrm{lim}}$ is with respect to $\alpha $, the more robust the decision.
Other notations
Link with OpenTURNS methodology
Input data:
Two samples $\left\{{x}_{1}^{i},...,{x}_{N}^{i}\right\}$ and $\left\{{x}_{1}^{j},...,{x}_{N}^{j}\right\}$ of variables ${X}^{i}$ and ${X}^{j}$, each pair $\left({x}_{k}^{i},{x}_{k}^{j}\right)$ corresponding to a simultaneous sampling of the two variables
Parameters:
a probability $\alpha $ taking values strictly between 0 and 1, defining the risk of permissible decision error (significance level)
Outputs:
Result: Binary variable specifying whether the hypothesis of a correlation coefficient equal to 0 is rejected (0) or not (1)
${\alpha}_{\mathrm{lim}}$ : $p$value of the test
Remember that ${\rho}_{{X}^{i},{X}^{j}}=0$ does not enable us to conclude that ${X}^{i}$ and ${X}^{j}$ are independent (see [Spearman's correlation coefficient] ).
More generally, the numerical value of Spearman's correlation coefficient can only be interpreted when the two variables studied ${X}^{i}$ and ${X}^{j}$ are related in a monotonic way; the scatter plot of points $\left\{({x}_{1}^{i},{x}_{1}^{j}),...,({x}_{N}^{i},{x}_{N}^{j})\right\}$ provides some indication concerning the validity of this hypothesis.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the parametric modelling of a probability distribution for a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. We seek here to detect possible dependencies that may exist between two components ${X}^{i}$ and ${X}^{j}$. In response to this, OpenTURNS offers the use of the ${\chi}^{2}$ test for Independence for discrete probability distributions.
Principle
As we are considering discrete distributions, the possible values for ${X}^{i}$ and ${X}^{j}$ respectively belong to the discrete sets ${\mathcal{E}}_{i}$ and ${\mathcal{E}}_{j}$. The ${\chi}^{2}$ test of independence can be applied when we have a sample consisting of $N$ pairs $\left\{({x}_{1}^{i},{x}_{1}^{j}),({x}_{2}^{i},{x}_{2}^{j}),...,({x}_{N}^{i},{x}_{N}^{j})\right\}$. We denote:
${n}_{u,v}$ the number of pairs in the sample such that ${x}_{k}^{i}=u$ and ${x}_{k}^{j}=v$,
${n}_{u}^{i}$ the number of pairs in the sample such that ${x}_{k}^{i}=u$,
${n}_{v}^{j}$ the number of pairs in the sample such that ${x}_{k}^{j}=v$.
The test thus uses the quantity denoted ${\widehat{D}}_{N}^{2}$:
$$\begin{array}{c}\hfill {\widehat{D}}_{N}^{2}=\sum _{u\in {\mathcal{E}}_{i}}\sum _{v\in {\mathcal{E}}_{j}}\frac{{\left({p}_{u,v}-{p}_{u}^{i}{p}_{v}^{j}\right)}^{2}}{{p}_{u}^{i}{p}_{v}^{j}}\end{array}$$ 
where:
$$\begin{array}{c}\hfill {p}_{u,v}=\frac{{n}_{u,v}}{N},\phantom{\rule{4pt}{0ex}}{p}_{u}^{i}=\frac{{n}_{u}^{i}}{N},\phantom{\rule{4pt}{0ex}}{p}_{v}^{j}=\frac{{n}_{v}^{j}}{N}\end{array}$$ 
The probability distribution of the distance ${\widehat{D}}_{N}^{2}$ is asymptotically known (i.e. as the size of the sample tends to infinity). If $N$ is sufficiently large, this means that for a probability $\alpha $, one can calculate the threshold (critical value) ${d}_{\alpha}$ such that:
if ${\widehat{D}}_{N}^{2}>{d}_{\alpha}$, we conclude, with a risk of error $\alpha $, that a dependency exists between ${X}^{i}$ and ${X}^{j}$,
if ${\widehat{D}}_{N}^{2}\le {d}_{\alpha}$, the independence hypothesis is considered acceptable.
An important notion is the so-called "$p$-value" of the test. This quantity is equal to the limit error probability ${\alpha}_{\mathrm{lim}}$ under which the independence hypothesis is rejected. Thus, independence is assumed if and only if ${\alpha}_{\mathrm{lim}}$ is greater than the value $\alpha $ desired by the user. Note that the higher ${\alpha}_{\mathrm{lim}}-\alpha $, the more robust the decision.
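The statistic above can be cross-checked against SciPy's `chi2_contingency` (a sketch on simulated discrete data, not the OpenTURNS call): up to the sample-size factor $N$, the $\chi^{2}$ statistic it returns coincides with ${\widehat{D}}_{N}^{2}$ computed from the empirical frequencies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
N = 500
xi = rng.integers(0, 3, size=N)  # values of X^i in E_i = {0, 1, 2}
xj = rng.integers(0, 4, size=N)  # values of X^j in E_j = {0, 1, 2, 3}

# Contingency table n_{u,v}
table = np.zeros((3, 4))
np.add.at(table, (xi, xj), 1)

p_uv = table / N
p_u = p_uv.sum(axis=1)  # marginal frequencies of X^i
p_v = p_uv.sum(axis=0)  # marginal frequencies of X^j
d2 = np.sum((p_uv - np.outer(p_u, p_v))**2 / np.outer(p_u, p_v))

chi2, alpha_lim, dof, _ = stats.chi2_contingency(table)
# chi2 == N * d2; independence is assumed when alpha_lim > alpha
```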
Other notations
Link with OpenTURNS methodology
Input data:
Two samples $\left\{{x}_{1}^{i},...,{x}_{N}^{i}\right\}$ and $\left\{{x}_{1}^{j},...,{x}_{N}^{j}\right\}$ of variables ${X}^{i}$ and ${X}^{j}$, each pair $\left({x}_{k}^{i},{x}_{k}^{j}\right)$ corresponding to a simultaneous sampling of the two variables
Parameters:
a probability $\alpha $ taking values strictly between 0 and 1, defining the risk of permissible decision error (significance level)
Outputs:
Result: Binary variable specifying whether the hypothesis of independence is rejected (0) or not (1)
${\alpha}_{\mathrm{lim}}$ : $p$value of the test
On the other hand, no hypothesis is made on the form of the relationship between the two tested variables. Readers interested in the detection of dependencies between two continuous variables are referred to [Pearson's Test] and [Spearman's Test] in the reference documentation.
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.
Sprent, P., and Smeeton, N.C. (2001). "Applied Nonparametric Statistical Methods – Third edition", Chapman & Hall
Mathematical description
This method deals with the parametric modelling of a probability distribution for a random vector $\underline{X}=\left({X}^{1},...,{X}^{{n}_{X}}\right)$. It aims to measure a type of dependence (here a linear relation) which may exist between a component ${X}^{i}$ and other uncertain variables ${X}^{j}$.
Principle of the method
The principle of the multiple linear regression model is to find the function that links the variable ${X}^{i}$ to other variables ${X}^{{j}_{1}}$,...,${X}^{{j}_{K}}$ by means of a linear model:
$$\begin{array}{c}\hfill {X}^{i}={a}_{0}+\sum _{j\in \{{j}_{1},...,{j}_{K}\}}{a}_{j}{X}^{j}+\epsilon \end{array}$$ 
where $\epsilon $ is a random variable with zero mean and standard deviation $\sigma $, independent of the explanatory variables ${X}^{j}$. For given values of ${X}^{{j}_{1}}$,...,${X}^{{j}_{K}}$, the average forecast of ${X}^{i}$ is denoted by ${\widehat{X}}^{i}$ and is defined as:
$$\begin{array}{c}\hfill {\widehat{X}}^{i}={a}_{0}+\sum _{j\in \{{j}_{1},...,{j}_{K}\}}{a}_{j}{X}^{j}\end{array}$$ 
The estimators for the regression coefficients ${\widehat{a}}_{0},{\widehat{a}}_{1},...,{\widehat{a}}_{K}$ and for the standard deviation $\sigma $ are obtained from a sample of $({X}^{i},{X}^{{j}_{1}},...,{X}^{{j}_{K}})$, that is a set of $n$ values $({x}_{1}^{i},{x}_{1}^{{j}_{1}},...,{x}_{1}^{{j}_{K}})$,...,$({x}_{n}^{i},{x}_{n}^{{j}_{1}},...,{x}_{n}^{{j}_{K}})$. They are determined via the least-squares method:
$$\begin{array}{c}\hfill \left\{{\widehat{a}}_{0},{\widehat{a}}_{1},...,{\widehat{a}}_{K}\right\}=\underset{{a}_{0},...,{a}_{K}}{\mathrm{argmin}}\sum _{k=1}^{n}{\left[{x}_{k}^{i}-{a}_{0}-\sum _{j\in \{{j}_{1},...,{j}_{K}\}}{a}_{j}{x}_{k}^{j}\right]}^{2}\end{array}$$ 
In other words, the principle is to minimize the total quadratic distance between the observations ${x}_{k}^{i}$ and the linear forecast ${\widehat{x}}_{k}^{i}$.
Some estimated coefficients ${\widehat{a}}_{\ell}$ may be close to zero, which may indicate that the variable ${X}^{{j}_{\ell}}$ does not bring valuable information to forecast ${X}^{i}$. OpenTURNS includes a classical statistical test to identify such situations: Fisher's test. For each estimated coefficient ${\widehat{a}}_{\ell}$, an important characteristic is the so-called "$p$-value" ${\alpha}_{\ell,\mathrm{lim}}$ of Fisher's test. The coefficient is said to be "significant" if and only if ${\alpha}_{\ell,\mathrm{lim}}$ is greater than a value $\alpha $ chosen by the user (typically 5% or 10%). The higher the $p$-value, the more significant the coefficient.
Another important characteristic of the adjusted linear model is the coefficient of determination ${R}^{2}$. This quantity indicates the part of the variance of ${X}^{i}$ that is explained by the linear model:
$$\begin{array}{c}\hfill {R}^{2}=\frac{{\displaystyle \sum _{k=1}^{n}{\left({x}_{k}^{i}-{\overline{x}}^{i}\right)}^{2}-\sum _{k=1}^{n}{\left({x}_{k}^{i}-{\widehat{x}}_{k}^{i}\right)}^{2}}}{{\displaystyle \sum _{k=1}^{n}{\left({x}_{k}^{i}-{\overline{x}}^{i}\right)}^{2}}}\end{array}$$ 
where ${\overline{x}}^{i}$ denotes the empirical mean of the sample $\left\{{x}_{1}^{i},...,{x}_{n}^{i}\right\}$.
Thus, $0\le {R}^{2}\le 1$. A value close to 1 indicates a good fit of the linear model, whereas a value close to 0 indicates that the linear model does not provide a relevant forecast. A statistical test allows one to detect significant values of ${R}^{2}$. Again, a $p$-value is provided: the higher the $p$-value, the more significant the coefficient of determination.
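The fit and the computation of ${R}^{2}$ can be sketched with a plain least-squares solve (simulated data, NumPy instead of the OpenTURNS linear-model classes; the true coefficients and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.uniform(0.0, 1.0, n)
x2 = rng.uniform(0.0, 1.0, n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.1, size=n)

# Least-squares estimation of (a0, a1, a2)
A = np.column_stack([np.ones(n), x1, x2])
a_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

y_hat = A @ a_hat                   # average forecast
ss_tot = np.sum((y - y.mean())**2)  # total sum of squares
ss_res = np.sum((y - y_hat)**2)     # residual sum of squares
r2 = (ss_tot - ss_res) / ss_tot     # coefficient of determination
```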
By definition, the multiple regression model is only relevant for linear relationships, as in the simple example ${X}^{2}={a}_{0}+{a}_{1}{X}^{1}$.
In a second example (still in dimension 1), the linear model is not relevant because of the exponential shape of the relation. A linear approach would however be useful on the transformed problem ${X}^{2}={a}_{0}+{a}_{1}\mathrm{exp}\left({X}^{1}\right)$. In other words, what matters is that the relationship between ${X}^{i}$ and the variables ${X}^{{j}_{1}}$,...,${X}^{{j}_{K}}$ is linear with respect to the regression coefficients ${a}_{j}$.
The value of ${R}^{2}$ is a good indicator of the goodness-of-fit of the linear model. However, several other verifications have to be carried out before concluding that the linear model is satisfactory. For instance, one has to pay attention to the "residuals" $\{{u}_{1},...,{u}_{n}\}$ of the regression:
$$\begin{array}{c}\hfill {u}_{k}={x}_{k}^{i}-{\widehat{x}}_{k}^{i}\end{array}$$ 
A residual is thus equal to the difference between the observed value of ${X}^{i}$ and the average forecast provided by the linear model. A key assumption for the robustness of the model is that the characteristics of the residuals do not depend on the values of ${X}^{i},{X}^{{j}_{1}}$,...,${X}^{{j}_{K}}$: the mean value should be close to 0 and the standard deviation should be constant. Plotting the residuals versus these variables can therefore be fruitful.
In a favourable situation, the behaviour of the residuals is satisfactory: no particular trend can be detected in either the mean or the standard deviation.
A less favourable situation arises when the mean value of the residuals stays close to 0 but their standard deviation tends to increase with $X$. In such a case, the linear model should be abandoned, or at least used very cautiously.
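This heteroscedastic situation can be reproduced numerically (a sketch on simulated data; the factor 0.05 scaling the noise with $x$ is arbitrary): the residual spread on the upper half of the $x$ range is markedly larger than on the lower half, which flags the model as suspect.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 300
x = rng.uniform(0.0, 10.0, n)
# Noise standard deviation grows with x: the linear model is questionable
y = 1.0 + 0.5 * x + rng.normal(scale=0.05 * x, size=n)

A = np.column_stack([np.ones(n), x])
a_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
res = y - A @ a_hat

# Residual spread per half of the x range: it should be roughly constant
# for a sound model, but here it clearly increases with x
spread_low = res[x < 5.0].std()
spread_high = res[x >= 5.0].std()
```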
Link with OpenTURNS methodology
The following bibliographical references provide main starting points for further study of this method:
Saporta, G. (1990). "Probabilités, Analyse de données et Statistique", Technip
Dixon, W.J. & Massey, F.J. (1983) "Introduction to statistical analysis (4th ed.)", McGraw-Hill
NIST/SEMATECH eHandbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
Bhattacharyya, G.K., and R.A. Johnson, (1997). "Statistical Concepts and Methods", John Wiley and Sons, New York.