Centile estimation includes methods for estimating the age-related distribution of human growth.
The standard estimation of centile curves involves two continuous variables:
- the response variable, that is, the variable of interest for which we are trying to find the centile curves, e.g. weight, BMI or head circumference, and
- the explanatory variable age.
The 100p centile of a random variable Y is the value y such that P(Y ≤ y) = p, i.e. y = inv.cdf(p), where inv.cdf() is the inverse cumulative distribution function of Y evaluated at p. Here we consider the conditional centile of Y given an explanatory variable x (usually age). By varying x, a 100p centile curve of y(x) against x is obtained. Centile curves can be obtained for different values of p. The World Health Organisation uses 100p=(3, 15, 50, 85, 97) in its charts and 100p=(1, 3, 5, 15, 25, 50, 75, 85, 95, 97, 99) in its tables.
This can be extended to more than one explanatory variable.
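The centile definition above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the response at one fixed age follows a normal distribution; the mean 10.0 and standard deviation 1.2 are hypothetical values, not taken from any growth reference, and the helper name centile() is our own.

```python
from statistics import NormalDist

# Hypothetical conditional distribution of Y at one fixed age
# (illustrative values only, not from any growth reference).
dist = NormalDist(mu=10.0, sigma=1.2)

# Centiles (100p) used in the WHO charts, as listed in the text.
who_chart_centiles = [3, 15, 50, 85, 97]


def centile(distribution, percent):
    """Return y such that P(Y <= y) = percent / 100 (the inverse cdf)."""
    return distribution.inv_cdf(percent / 100)


chart_values = {c: centile(dist, c) for c in who_chart_centiles}

for c, y in chart_values.items():
    print(f"{c:>2}th centile: {y:.2f}")
```

Repeating this calculation over a grid of ages, with the distribution parameters changing smoothly with age, would trace out one centile curve per value of p.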
The methodology for creating growth centile references for individuals from a population comprises two different methods:
- the nonparametric method of quantile regression (Koenker, 2005; Koenker and Bassett, 1978; Koenker and Ng, 2005; He and Ng, 1999; Ng and Maechler, 2007), and
- the parametric LMS (i.e. Lambda, Mu and Sigma) method of Cole (1988) and Cole and Green (1992) and its extensions; see, for example, Wright and Royston (1997), van Buuren and Fredriks (2001), and Rigby and Stasinopoulos (2004, 2006).
Here we deal with the LMS method and its extensions. The LMS method, within GAMLSS, is equivalent to assuming the Box-Cox Cole and Green (BCCG) distribution for the response variable and fitting smooth curves for μ, σ and ν (corresponding to Mu, Sigma and Lambda, respectively). The BCCG distribution is derived by assuming that Y, the response variable, is a specific function of a random variable Z which has a (truncated) normal distribution. The BCCG distribution is suitable for positively or negatively skewed data, depending on the value of the parameter ν.
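As a rough sketch of how centiles follow from the three fitted parameters at a given age, the code below uses the standard Cole and Green relation Z = ((Y/μ)^ν − 1)/(νσ) for ν ≠ 0 and Z = log(Y/μ)/σ for ν = 0, and inverts it at the standard normal quantile. It ignores the truncation of Z, so it is an approximation to the BCCG centile rather than the gamlss implementation. The function names and the parameter values (μ = 16, σ = 0.1, ν = −1.5) are hypothetical.

```python
import math
from statistics import NormalDist


def lms_centile(p, mu, sigma, nu):
    """Approximate 100p centile of Y under the LMS model (truncation ignored)."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile z_p
    if abs(nu) < 1e-12:
        # Limiting case nu = 0: log(Y / mu) / sigma is standard normal.
        return mu * math.exp(sigma * z)
    base = 1.0 + nu * sigma * z
    if base <= 0:
        # Outside the range where the Box-Cox transformation can be inverted.
        raise ValueError("1 + nu * sigma * z must be positive")
    return mu * base ** (1.0 / nu)


def lms_z_score(y, mu, sigma, nu):
    """Transform an observation y > 0 to its z-score under the same model."""
    if abs(nu) < 1e-12:
        return math.log(y / mu) / sigma
    return ((y / mu) ** nu - 1.0) / (nu * sigma)


# Hypothetical parameter values at one age (illustrative only).
mu, sigma, nu = 16.0, 0.1, -1.5

for percent in (3, 15, 50, 85, 97):
    y = lms_centile(percent / 100, mu, sigma, nu)
    print(f"{percent:>2}th centile: {y:.2f}")
```

With ν = −1.5 the upper centiles lie further from the median μ than the lower ones, which is how the model accommodates positive skewness; ν = 1 would give symmetric centiles.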
Rigby and Stasinopoulos (2004, 2006) extended the LMS method (which allows for skewness but not for kurtosis in the data) by introducing the Box-Cox power exponential (BCPE) and the Box-Cox t (BCT) distributions, respectively, and called the resulting methods LMSP and LMST. The BCPE assumes that the transformed random variable Z has a (truncated) power exponential distribution, while the BCT assumes that Z has a (truncated) t distribution.
More recently, the function lms() has been introduced for fitting centile curves in the gamlss package.