Latin hypercube

Latin hypercube sampling may be considered a particular case of stratified sampling; see [4], [6], [10], [11] and [23].

The purpose of stratified sampling is to achieve a better coverage of the sample space of the input factors. Let the sample space S of the input vector X be partitioned into L disjoint strata S1, …, SL. Denote the size of each stratum Si by pi = P(X ∈ Si). Obtain a random sample xh, h = 1, …, ni, from each Si, where n1 + … + nL = N. In particular, when L = 1, the result is a random sample over the entire sample space.
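The stratified scheme described above can be illustrated with a minimal sketch. It assumes a single input factor that is uniform on [0, 1), so that each stratum is an interval and its size pi is simply its width. The function name stratified_sample and its parameters edges and counts are hypothetical and chosen only for this illustration.

```python
import numpy as np

def stratified_sample(edges, counts, rng=None):
    """Draw counts[i] uniform points from the stratum [edges[i], edges[i+1]).

    Assumption for illustration: one input factor, uniform on [0, 1),
    so each stratum Si is an interval and pi is its width.
    """
    rng = np.random.default_rng(rng)
    edges = np.asarray(edges, dtype=float)
    if len(counts) != len(edges) - 1:
        raise ValueError("need exactly one count per stratum")
    # Sample each stratum independently, then pool the N = sum(counts) points.
    parts = [
        rng.uniform(edges[i], edges[i + 1], size=n)
        for i, n in enumerate(counts)
    ]
    return np.concatenate(parts)

# Three strata of sizes 0.2, 0.3 and 0.5 with n1 = 2, n2 = 3, n3 = 5 (N = 10).
x = stratified_sample([0.0, 0.2, 0.5, 1.0], [2, 3, 5], rng=1)
```

With a single stratum, for example stratified_sample([0.0, 1.0], [10]), the sketch reduces to plain random sampling over the whole range, which matches the L = 1 case mentioned above.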

In Latin hypercube sampling (LHS), the range of each input factor Xj, j = 1, 2, …, k, is divided into N intervals of equal marginal probability 1/N, and one observation of each input factor is drawn at random within each interval. Thus, there are N non-overlapping realizations for each of the k input factors. One of the realizations of X1 is selected at random (each is equally likely to be selected) and matched with a randomly selected realization of X2, and so on up to Xk. Together these constitute the first sample, x1. One of the remaining realizations of X1 is then matched at random with one of the remaining realizations of X2, and so on, to obtain x2. The same procedure is followed for x3, …, xN, which exhausts the realizations and yields a Latin hypercube sample. The method has the advantage of ensuring that each input factor has all portions of its distribution represented by input values.
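The construction described above can be sketched in a few lines. The sketch assumes independent input factors that are each uniform on [0, 1); other marginal distributions could be obtained by passing each column through the corresponding inverse distribution function. The function name latin_hypercube and its parameters are hypothetical. Shuffling each column independently is used here as an equivalent way of matching realizations at random without replacement.

```python
import numpy as np

def latin_hypercube(n_samples, n_factors, rng=None):
    """Return an (n_samples, n_factors) Latin hypercube sample on [0, 1).

    Assumption for illustration: independent factors with uniform
    marginals, so the N equal-probability intervals are [i/N, (i+1)/N).
    """
    rng = np.random.default_rng(rng)
    # One random draw inside each of the N intervals, for every factor.
    offsets = rng.random((n_samples, n_factors))
    points = (np.arange(n_samples)[:, None] + offsets) / n_samples
    # Random matching across factors: shuffle each column independently,
    # so every realization is used exactly once.
    for j in range(n_factors):
        points[:, j] = rng.permutation(points[:, j])
    return points

# N = 10 sample points for k = 3 input factors.
sample = latin_hypercube(10, 3, rng=0)
```

Each row of sample is one sample point x1, …, xN, and each column contains exactly one value in each of the N intervals.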

LHS performs better than random sampling when the output is dominated by a few components of the input factors. The method ensures that each of these components is represented in a fully stratified manner, no matter which components might turn out to be important.

LHS is better than random sampling for estimating the mean and the population distribution function. Asymptotically, LHS provides an estimator of the expectation of the output function with lower variance than random sampling does. In particular, the closer the output function is to being additive in its input variables, the greater the reduction in variance.
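The variance reduction for an additive output can be checked empirically with a small experiment. The sketch below is illustrative only. It assumes independent uniform inputs on [0, 1) and a purely additive output function (the sum of the inputs), and the names lhs_unit, additive_output and estimator_variances are hypothetical. It repeats the estimation of the mean many times under both schemes and compares the spread of the resulting estimates.

```python
import numpy as np

def lhs_unit(n, k, rng):
    # Compact Latin hypercube sample on [0, 1)^k (uniform marginals assumed).
    pts = (np.arange(n)[:, None] + rng.random((n, k))) / n
    for j in range(k):
        pts[:, j] = rng.permutation(pts[:, j])
    return pts

def additive_output(x):
    # Purely additive output function: the sum of the input factors.
    return x.sum(axis=1)

def estimator_variances(n=20, k=3, repeats=500, seed=0):
    """Empirical variance of the sample-mean estimator under both schemes."""
    rng = np.random.default_rng(seed)
    random_means = [
        additive_output(rng.random((n, k))).mean() for _ in range(repeats)
    ]
    lhs_means = [
        additive_output(lhs_unit(n, k, rng)).mean() for _ in range(repeats)
    ]
    return np.var(random_means), np.var(lhs_means)

var_random, var_lhs = estimator_variances()
print(f"random sampling: {var_random:.2e}, LHS: {var_lhs:.2e}")
```

For an output with strong interactions between inputs, the gap between the two variances would be expected to shrink, in line with the remark on additivity above.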

IMPORTANT: Latin hypercube can be used when correlation among inputs is present.