here is the hidden area text

The Blog

Also known as the place I write words.

A new package gamlss.inf is introduced recently to help with the fitting of inflated distributions on the interval [0,1] and zero adjusted distributions on the interval (o, infinity) for a response variable.
The inflated and zero adjusted distributions are mixed  (i.e  continuous and discrete) distributions with a continuous part on a real line interval and an additional part with point probabilities at either 0 or 1 or both 0 and 1.

The gamlss package   gamlss.dist already provides the inflated beta distributions, BEINF, BEINF0, BEINF1 BEZI and BEOI  which allow the user to fit a beta distribution on (0,1), with extra point probabilities at 0 or 1 or both 0 and 1. The probabilities at the points  0 and 1 may depend on explanatory variables. Since the beta distribution has 2 parameters, the inflated beta, BEINF,  (with the addition of the two probabilities at 0 and 1) has a total of 4 parameters.  In practice, and for complicated data sets, the part  of  the response which lies on (0,1) may need more than 2 distribution parameters to be captured correctly.  The vignette Inflated distributions on the interval [0, 1] provides information how this can be done.

The gamlss package   gamlss.dist also has two zero adjusted distributions ZAGA and ZAIG (gamma and inverse Gaussian; respectively). The package gamlss.inf allows all possible distributions defined on (o, infinity) to be zero adjusted. For more information see the  vignette Zero adjusted distributions on the positive real line.

The book `Flexible Regression and Smoothing: Using GAMLSS in R’ is now published by CRC Press.

  • It provides a broad overview of regression and smoothing techniques by focusing on the practical application.
  • It provides an introduction to the  GAMLSS software in R.
  • It includes a comprehensive collection of real data examples, which reflect the range of problems addressed by GAMLSS models and provide a practical illustration of the process of using flexible GAMLSS models for statistical learning.
  • The R code is integrated into the text for ease of understanding and replication.

`Flexible Regression and Smoothing: Using GAMLSS in R’ is the first of a series of books on GAMLSS model. The second book is ‘Distributions for Modelling Location, Scale and Shape: Using GAMLSS in R’ and it is under preparation. The second book will cover the over 100 distributions (continuous, discrete and mixed) existing in the  gamlss.dist. It will  also cover distributions which can be created form existing distributions by: transformation, truncation,  censoring (i.e.  Tobit models) or inflation (i.e. zero inflated beta).

The R code of the examples in the book is given here


Version 4.4-0

1. gamlss

  • i) the term.plot() did not plot (by mistake) a gam object. This is now corrected
  • ii) the vis.lo() is introduced for plotting fitted loess curve
  • iii) VC.test() for Vuong and Clarke test is introduced (use with caution)
    iv) the function stepVGDAll.A() is introduced.
    v) Q.stats() function is amended so the number of observations in each intervals add up to the total number of observations (that was a bug)
    vi) the pcat() function for reducing the levels of a categorical
    variable is introduced.
    vii) the pbc() a more robust version of cy() is introduced.


  • i) the function plotMP() is added


  • i) the Tokyo rainfall data is included in gamlss now as an example of cycle P-spline smoothing
  • ii) the data set grip (hand grip strength) is added (for centile estimation)
  • iii) the data set leukemia is added (for random effect analysis)
  • v) the data oil is included
 1. package: gamlss
  • i) The function within gamlss() has a line added to prevent the iterative weighs wt to go to Inf.
  • ii) The tp() function within lms() and quantSheets() has changed name and modified slightly
  • iii) The vcoc.gamlss() has the warnings changed and allows if theinverse of the Hessian (R) fails to recalucated using different method.
  • iv) The output of the term.plot() function has changed.
  • v) IMPORTANT: pb() has a new version which is faster than the old one. The old pb() function is renamed and called pbo()
  • vi) The summary.gamlss() functions instead of giving an R warning for additive terms it gives a “Note” in the output.
  • vii) Q.stats() has been modified to .n.iter=5 will work with small data
2. package: gamlss.demo
  • i) The Locmean(), Locpoly(), WLocmean() and  WLocpoly()  functions were moved from the package gamlss.util to here since they are used only for the demos.
  •  ii) The function demo.LocalRegression() is added
3. package: gamlss.dist
  • i) The discrete distribution double Poisson DPO() is added.
  • ii) a modififiacion on plotZAGA() is added
4. package: gamlss.util
  • i) garmaFit() is modified to allow binomial (and binary) fits (Thanks to Matthias Schmid for point out to us).
  • ii) scattersmooth() the argument “cols” was added to allow different schemes of colours. This also allow the package not to depend to the package colorspace even though the plot is not looking as good. If you have the colorspace in you version of R you can still use the old scheme, see the example in scattersmooth().
  • iii) plotSimpleGamlss() : had a minor ammentment thanks to Michael Guan.
5. gamlss.add
  • i) the function plotNN() is added for plotting fitted neural netwarks

Version 4.3-1
1. package: gamlss
  • i)  the functions centiles(), and had some of the plotting options changed
  • ii) the lms() the function has changed with some extra arguments added, and a with an extra predict.lms() method. Now the lms object (created by lms()) can be used with centiles.pred().
  • iii) z.scores() function for lms object is added to simplify getting the z.scores for new observations
  • iv) the function quantSheets() is added. It uses the method of Schnabel and Eilers (2013) to create quantiles curves for centile estimation. The created object has print(), fitted(), predict() and  resid() methods.
  • v) the functions z.scoresQS() and findPower() are also added to assist centile estimation modelling using quantSheets().
2. package: gamlss.dist
  • i) the function flexDist() is moved from gamlss.util to gamlss.dist
  • ii) a bug is fixed in the qZAGA() and qZAIG() functions
Version 4.3-0
1. packages gamlss
  • i)   kri is added as another smoother
  • ii)  rvcov() the argument is added
  • iii) fp() now saves the lm fitted models and it can be access using getSmo()
  • vi)  ra() and rc() are gone to be replaced by re() and interface for calling mle() within gamlss
2. package: gamlss.dist
  • i) Family() is amended to save a function rather just a list
  • ii) all the FORTRAN source files have now translated to C thanks to Marco Enea
  • iii)  ST3C a version of ST3 written in C rather in R is added thanks to Alexios Ghalanos
3, package gamlss.add
  • i) fk() now does not print the results in each iteration (it was left by mistake in the last version)
  • ii) ga() ,tr() and nn() now can take bigger formula.

Version 4.2-7

i) gamlss

  1. gamlssML(): now allows the fitting binomial data (sorry it never checked before) and the use of formula in the specification of the model (e.g, y~1) to be consistent with gamlss(). Note that explanatory variables will be ignored if used with gamlssML().
  2.  .gamlss.multin.list is now on NAMESPACE
  3.  the functions vcov.gamlss() and summary.gamlss() have now an extra argument with two alternatives “R” and “PB” in order to calculate the Hessian. The “R” option uses the R function optimHess() while the “PB” uses a local function based a function of Pinheiro and Bates in nlme package
  4.  the function plot2way() is introduced as a way of plotting a two way interaction between two
    categorical variables (factors).
  5. the functions vcov.gamlss() and gen.likelihood() now works even if NA are in the coefficients
    of the beta the paraneters
  6.  The functions loglogSurv1(), loglogSurv2(), loglogSurv3(), loglogSurv() and logSurv() are
    introduced as a way of exploring the tail behaviour of a distribution.
  7. AIC(), GAIC() and extractAIC() have a new argument “c” which if k=2 and c=TRUE gives the corrected AIC that is, AICc (suggested by Mario Alvarado).

ii) gamlss.dist

  1. The use of the distributions BCCGo(), BCPEo() and BCTo() (which are identical to BCCG, BCPE
    and BCT respectively apart from the fact that they using log link for mu) had some side effects
    because the equivalent d, p, q and r functions did not exist. In this version to avoid the
    problem we have added the functions dBCCGo(), pBCCGo(), qBCCGo(), rBCCGo(), dBCPEo(),
    pBCPEo(), qBCPEo() and rBCPEo(), dBCTo(), pBCTo(), qBCTo() and rBCTo(). Those functions
    are identical to d, p, q, and r functions of BCCG(), BCPE() and BCT().


  1. functions fitTail() and fitTailAll() are introduced for fitting truncated distributions to
    the tails and creating Hill type of plots respectively.


iv)  gamlss.demo

  1.  The distributions SN1, SN2, LOGITNO, LOGNO2, TF2 and SST are added to the demos


This version is released on the 22–6-2013 and it is the first time that robust (sandwich) standard errors are introduce  in gamlss models.  Of course those standard errors apply to parametric GAMLSS models only. When non-parametric smoothing terms are used then  the (sandwich) standard errors can still be used with caution since they are not yet take into account the variability in those terms.

There are two new functions for sandwich calculations:

  • i) get.K() and
  • ii) rvcov().

The first is getting the meat (in the sandwich calculations, that is the matrix K) while the second is the robust version of vcov().   Also a new argument  “robust” is introduced for the gamlss() methods a) vcov.gamlss() b) summary.gamlss() and c) confint.gamlss().

Some bugs are fixed on the following function

  • LR.test() a small amendment for checking whether it is  gamlss model or not
  • gen.likelihood() a small amendment to make sure that it gets the correct link functions
  • getSmo() is introduced for get information on fitted smoothing terms
  • lms() a bug on the previous version is corrected and a additional feature is added with the argument”trans.x”.  When “trans.x=TRUE” the function looks for a suitable transformation in the x-variable.




Version 4.2-5

The most important change in this version of gamlss is the way that the standard errors are calculated. In  previous version the vcov() function was calculated using a final iteration to a non-linear maximisation procedure. This procedure failed in a lot of occasions and the result was that the reported standard errors were the ones given from the individuals distribution parameter fits of the algorithm which did not take account of the intercorrelation of the distribution parameters. In this version a new function   gen.likelihood() is indroduced. This   function   calculates the likelihood function of the model, so that the numerical calculation of the Hessian matrix of the full model can be achieved easily.  For fully parametric models this should produce accurate standard errors based on the observed information matrix. For models with non-parametric terms does non at the moment take account of the variability of the smoothing non-parametric terms, so a warning is produced.

The following are the changes in version 4.2.4:

1. gamlss:

  • i)  the function lms() has an extra argument method.pb added on. This allows the use of the GAIC method of estimating the smoothing in pb() parameters. It was found that for large data sets the local maximum likelihood (which is the default method of estimation) produces too wiggly centiles. The penalty k is taken from the argument k and it is the same as the one selecting the distribution.
  • ii) the function rqres.plot() is fixed and improved to allow worm plot as well as QQ-plots
  • iii) The Rsq() function is introduced. It uses the generalised R-squared of Nagelkerke, (1991).
  • iv) The term.plot() function has now se=TRUE and ylim=”common” as defaults.
  • v) the function gen.likelihood() is introduced. It creates the log-likelihood of a fitted model. It is used by vcov() and it should be more reliable than previous versions of vcov().
  • vi) The gamlss() function now saves the offset’s for all parameters. This is needed for vcov() and gen.likelihood() and allows the vcov() to work with offsets.
  • vii) The wp() function allows now two explanatory variables including factors (see the examples in help). Also it was found that if the model is very bad some of the values of the worm plot could be Inf so if line=TRUE the function was failing when the cubic fit in lm() was used. Now if there are Inf values, it does not fit the lm() model.
  • viii) summary() now allow the coefficients and standard errors to saved in a table.
  • ix) acfResid() is introduced to plot the act function for power function of the residuals, r, r^2, r^3 and r^4.
  • x) the method confint() is introduced for gamlss objects so confidence intervals for the beta parameters can be produced
  • xi) attach() was taken out from the functions findhyper(), fitDist(), gamlssML(), histDis() , lms(), par.plot(), prof.term() and wp() to comply with the new regulation of R


  • i) A new argument “varying” is introduce in all the truncated functions. It allows the distribution of the response variable to have different truncated points at the observational level. (Note that before the truncated parameters had to be the same for all observations)


The new version of gamlss is 4.2-0. The following are the changes made:


package gamlss:

  • The functions and prof.term() are improved. The argument step is not anymore compulsory and if not set the argument length is used instead. For most cases there is no need to have a fine grid since the function is approximated using splinefun(). The output is saved as an “profDeviance.gamlss” object. Note that more testing is needed for the reliability of the function.
  • The function fitDist() allows extra arguments “…” to be passed to gamlssML() and gamlss().
  • The can be used now in conjunction with gamlssML()
  • The function Q.stats() now allows plotting the resulting matrix for easy identification of the parts of the model which do not fit well.
  • The cs() and scs() are calling now the R function smooth.spline() rather than the FORTRAN code to comply with R regulations
  • The function vc() is disfuction the user is advised to use the equivalent function pvc()
  • The function lo() is rewritten to comply with R regulation. Now it takes a formula as its first argument rather than a list of explanatory variables. Also no standard errors for the smooth function are provided since the R function loess()  do not provide this information at the moment.
  • The logic link function in the package gamlss.dist is amended so it does not call the R .C function.
  • summary.gamlss() now have an argument “save” for saving the output, thanks to  Wilmar Igl
  • gamlssML(): a bug with vcov.gamlssML() function is fixed also “nlminb” is now the default maximisation procedure rather than “optim”
  • The default values of the argument cent on lms() and calibrartion() is change to 100*pnorm((-4:4)*2/3) as suggested by Tim Cole. Also a bug which did not allow term.plot() to work with lms() is fixed.
package gamlss.dist
  • All the FORTRAN routines have their REAL change to “double precession”
  • The distribution function SICHEL has been amended to so in the limited case where sigma is bigger than 10000 and nu > zero it switches to the negative binomial.  (the old version has sigma>1000 which creates problem with the function when was used for nu in the lice data see the Stasinopoulos and Rigby (2007) paper in JSS. Thanks to Ivailo Stoyanov who point out the problem.)
  • The skew Normal 1, SN1, skew Normal 2, SN2,  distributions are introduced
  • The link function “logshiftto2” is added in in package gamlss.dist. The reason for this is to prevent  the degrees of freedom parameter nu in TF2 (see below) to be less than 2.
  • The distributions SST and TF2 are introduced in package gamlss.dist. The distributions are reparametrisations of the ST3 and TF respectively. The sigma parameter in SST and TF2 is the standard deviation of the distribution. Not that the standard deviation is not defined for degrees of freedom less than 2. The “logshiftto2” link function (see above) prevents this.
  • The logit normal, LOGITNO and the log normal 2 (with mu as the median), LOGNO2, are introduced in gamlss.dist.
  • The functions gen.Family(), Family(), Family.d(), Family.p(), Family.q() and  Family.p() are introduced in gamlss.dist for generating “log” and “logic” versions of continuous distributions in the real line.
  • The generalised Pareto (GP) distribution (a re-parameterisation of PARETO2 and PARETO2o) is introduced in gamlss.dist
  • A bug in the GB2 and  EGB2 q functions is fixed.
package gamlss.add
  • The function ga() have changed to accept all of Simon Wood’s gam() arguments. This allows to fit random Markov fields using gamlss.
  • The function penLags() for fitting penalised lag terms and its interface with gamlss la() are added here.
  • The functions fitFixBP() and fitFreeKnots() have been moved from the gamlss.util packages to the gamlss.add package to be closer with their interface function fk() which allows fitting within gamlss
  • the film30 and film90 data sets are introduced
package gamlss.util
  • The functions fitFixBP() and fitFreeKnots() are moved to package gamlss.add
  • garmaFit() for fitting generalised ARMA models is introduced
  • centile.ts() for giving centiles for time series data is introduced
  • lagPlot() for scatter plotting of lags is introduced.

The new features in version 2.1-2 are as follows:

package gamlss:

  • The function histSmo() is added for density estimation.
  • The function histDist() now has the function gamlssML() as its main fitting function. The fitting function  gamlss() is only used if gamlssML() fails.
  • The function gamlssML() has now an argument start.from.
  • In the function fitDist(), the normal distribution NO() is added to the list of “.realline” so it also appears in the   AIC list
  • The  function gamlssML() has now method summary()
  • A bugs is corrected in the function package vcov(), thanks to Tom Jagger
  • The function random() is modified to allow Local maximum likelihood estimation of the smoothing parameter lambda
  • The function pvc() has been modified to allow fixing the dfs when the “by” argument is a factor (Tim Cole suggest it).
  • The predict.gamlss() function works now with offsets
  • The worm plot function wp() works now with any fitted object which has the method resid()
  • The function dop() is renamed as dtop() and now works with any fitted object which has the the method resid()
  • The function fitted.plot() is renamed fittedPlot() to avoid S3 problems
  • pvc(): now predict is working when the argument  “by” is a continuous variable, thanks to Torsten Hothorn for point out to us. The fitted function fitDist() is introduced for fitting different distributions to a single set of data. This function fits several parametric distributions to a vector of data and chooses the one with minimum GAIC.

package gamlss.dist:

  • The distributions SHASHo and SHASHo2 and PARETO2o are added to the package gamlss.dist
  • The random generating function rDEL() is now corrected thanks to Dr. Conrad Burden
  • The following distributions are added to package gamlss.dist: YULE, WARING, GEOM, IGAMMA, PARETO2
  • The distributions BCTo, BCPEo, and BBCGo are added so BCT, BCPE and BCCG can have”log” as a default

package gamlss.util:

  • The functions fitFixBP() and fitFreeKnots() for fitting fixed and free break points respectively have been improved