Selected Publications

The relative frailty variance among survivors provides a readily interpretable measure of how the heterogeneity of a population, as represented by a frailty model, evolves over time. We discuss the properties of the relative frailty variance, show that it characterizes frailty distributions and that, suitably rescaled, it may be used to compare patterns of dependence across models and data sets. In shared frailty models, the relative frailty variance is closely related to the cross-ratio function, which is estimable from bivariate survival data. We investigate the possible shapes of the relative frailty variance function for the purpose of model selection, and we review available frailty distribution families in this context. We introduce several new families with contrasting properties, including simple but flexible time varying frailty models. The benefits of the approach that we propose are illustrated with two applications to bivariate current status data obtained from serological surveys.
Journal of the Royal Statistical Society: Series B - Statistical Methodology, 2012

The self-controlled case series method may be used to study the association between a time-varying exposure and a health event. It is based only on cases, and it controls for fixed confounders. Exposure and event histories are collected for each case over a predefined observation period. The method requires that observation periods should be independent of event times. This requirement is violated when events increase the mortality rate, since censoring of the observation periods is then event dependent. In this article, the case series method for rare nonrecurrent events is extended to remove this independence assumption, thus introducing an additional term in the likelihood that depends on the censoring process. In order to remain within the case series framework in which only cases are sampled, the model is reparameterized so that this additional term becomes estimable from the distribution of intervals from event to end of observation. The exposure effect of primary interest may be estimated unbiasedly. The age effect, however, takes on a new interpretation, incorporating the effect of censoring. The model may be fitted in standard loglinear modeling software; this yields conservative standard errors. We describe a detailed application to the study of antipsychotics and stroke. The estimates obtained from the standard case series model are shown to be biased when event-dependent observation periods are ignored. When they are allowed for, antipsychotic use remains strongly positively associated with stroke in patients with dementia, but not in patients without dementia. Two detailed simulation studies are included as Supplemental Material.
Journal of the American Statistical Association, 2011



  • Let the data speak


  • Do first! then think

  • Do simple! then build up

  • Think! but only think a bit! then write it down

  • Read! but don’t read too much


  • Do it today! don’t procrastinate!

Recent Publications

On the Geometric Interplay Between Goodness-of-Fit and Estimation: Illustrative Examples. In Computational Information Geometry: For Image and Signal Processing, 2017.


Towards the Geometry of Model Sensitivity: An Illustration. In Computational Information Geometry: For Image and Signal Processing, 2017.


The relative frailty variance and shared frailty models. Journal of the Royal Statistical Society: Series B - Statistical Methodology, 2012.


Self-Controlled Case Series Analysis With Event-Dependent Observation Periods. Journal of the American Statistical Association, 2011.


Recent & Upcoming Talks

Sinape 2018 (upcoming)
Sep 23, 2018 5:41 PM
MaxEnt 2017
Jul 9, 2017 5:41 PM

Recent Posts

So, I am learning how to use ggplot. Here is one of my first attempts, and at the same time I discovered that you need the gridExtra package to lay out multiple ggplots. I am using one of my favourite datasets: the Anscombe quartet.

    library(ggplot2)
    library(grid)
    library(gridExtra)
    data(anscombe)

The quartet comprises four datasets that have nearly identical simple descriptive statistics, yet appear very different when graphed.

    p1<-ggplot(anscombe,aes(x1,y1))+geom_point()+geom_smooth(method="lm",se=FALSE,color="red")+xlim(3,16)
    p2<-ggplot(anscombe,aes(x2,y2))+geom_point()+geom_smooth(method="lm",se=FALSE,color="red")+xlim(3,16)
    p3<-ggplot(anscombe,aes(x3,y3))+geom_point()+geom_smooth(method="lm",se=FALSE,color="red")+xlim(3,16)
    p4<-ggplot(anscombe,aes(x4,y4))+geom_point()+geom_smooth(method="lm",se=FALSE,color="red")+xlim(7,20)
    grid.
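The claim that the four datasets share nearly identical descriptive statistics can be checked numerically before plotting anything. The following is a minimal base-R sketch, not part of the original post; the helper `summarise_pair` and the object `quartet_stats` are names introduced here purely for illustration.

```r
# Base R only; `anscombe` ships with the built-in datasets package.
data(anscombe)

# Summarise one x/y pair of the quartet.
# `summarise_pair` is a hypothetical helper name used for this sketch.
summarise_pair <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- coef(lm(y ~ x))  # least-squares intercept and slope
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    var_x     = var(x),
    var_y     = var(y),
    cor_xy    = cor(x, y),
    intercept = unname(fit[1]),
    slope     = unname(fit[2]))
}

# One row per dataset, one column per statistic.
quartet_stats <- t(sapply(1:4, summarise_pair))
rownames(quartet_stats) <- paste("set", 1:4)
print(round(quartet_stats, 2))
```

Each row of the printed table corresponds to one dataset. The rows should agree closely on every statistic even though the scatterplots look nothing alike, which is why the fitted red lines in the four panels are essentially the same.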


Here we use some \[\LaTeX\]



  • In Semester 1 of the 2018-19 term I will be teaching Applied Statistical Inference (Moodle page)