WELCOME TO THE
SELF CONTROLLED CASE SERIES METHOD
The version of GLIM required is GLIM4. To run a case series analysis in GLIM, the data are entered and the model is set up in an analysis file, which then calls a suite of macros.
The macro suite
This is contained in the file sccs.mac. These macros split up the data as described in the tutorial, then fit a multinomial model to the expanded data using the ‘Poisson trick’, also described in the tutorial. GLIM4 has an absorption facility, invoked with the command $eliminate, which is used to avoid estimating the individual effects; these are included only to make the Poisson trick work.
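Why the individual effects can be absorbed rather than estimated can be checked numerically. The Python sketch below is not part of the GLIM macro suite; the function name and all numbers are illustrative assumptions. For one individual with a single event, it computes the multinomial probability that the event falls in each interval, taking each probability proportional to the interval length times the rate in that interval. Adding an individual effect to every log rate leaves these probabilities unchanged.

```python
import math

def event_probabilities(lengths, log_rates, individual_effect=0.0):
    """Multinomial probabilities that a single event falls in each interval.

    lengths: interval lengths in days (hypothetical values below)
    log_rates: log relative rate in each interval (age plus exposure effects)
    individual_effect: a constant added to every log rate for this individual
    """
    weights = [n * math.exp(individual_effect + r)
               for n, r in zip(lengths, log_rates)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical individual: four intervals, the second being a risk period.
lengths = [35, 42, 105, 183]
log_rates = [0.0, 1.1, 0.0, -0.4]

p_without = event_probabilities(lengths, log_rates)
p_with = event_probabilities(lengths, log_rates, individual_effect=2.5)
# The two lists agree up to rounding: the individual effect cancels.
```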
The macros are launched by a call to macro clus. The following vectors need to be defined; the length of each of these vectors is the number of events (not cases: some individuals can have more than one event):
IND_: this contains a sequential list of individual identifiers. Multiple events in the same individual are listed as separate lines in the data file with the same value of IND_. The maximum value of IND_ is thus the number of distinct cases in the analysis.
AGE_: this contains the age at event.
STA_: age prior to start of observation period (so if observation starts on day k of age for some individual, the corresponding value of STA_ is k-1).
END_: age at end of observation period.
A further vector, VC_, needs to be defined. It contains the codes for the exposure categories, that is, the levels of the exposure factor. For example, if there is a single risk period then VC_ = 1,2 where 1 codes the control period and 2 the risk period; the time after the risk period is automatically assigned the control period code.
In addition to these vectors, two lists (three, if you want to model fixed covariates) must be defined. Each list contains several vectors, each of length equal to the number of events. The lists are:
LA_: the list of age cutpoints
LV_: the list of exposure cutpoints
LC_: the list of fixed covariates (optional)
Finally, a model macro needs to be defined of the form
$macro model T_+V_$endmac
where here T_ stands for the age effect and V_ the exposure effect. You can also include interaction terms, for example
$macro model T_+V_*.C_1$endmac
where C_1 is the first element of list LC_ of fixed covariates.
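The data-splitting step that these inputs drive can be illustrated outside GLIM. The Python sketch below is an assumption about how the pieces fit together, not a transcription of sccs.mac: it assumes that every cutpoint follows the same convention as STA_ (a cutpoint c means one interval ends on day c and the next starts on day c+1), that the k-th element of the exposure codes applies once k-1 exposure cutpoints have been passed, and that time after the last exposure cutpoint reverts to the control code. The function name and the example values are hypothetical.

```python
def split_observation(sta, end, age_cuts, exp_cuts, vc):
    """Split days sta+1..end into intervals homogeneous in age group and exposure.

    Returns tuples (lo, hi, length, age_group, exposure_code), where each
    interval covers days lo+1..hi.
    """
    # Keep only cutpoints that fall strictly inside the observation period.
    inner = {c for c in list(age_cuts) + list(exp_cuts) if sta < c < end}
    bounds = sorted(inner | {sta, end})
    rows = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Age group: 1 plus the number of age cutpoints already passed.
        age_group = 1 + sum(1 for c in age_cuts if c <= lo)
        # Exposure: count exposure cutpoints already passed.
        k = sum(1 for c in exp_cuts if c <= lo)
        # After the last exposure cutpoint, revert to the control code vc[0].
        exposure = vc[k] if k < len(vc) else vc[0]
        rows.append((lo, hi, hi - lo, age_group, exposure))
    return rows

# Hypothetical individual observed on days 366 to 730, one age cutpoint at
# day 547, and a single risk period covering days 401 to 442.
rows = split_observation(365, 730, [547], [400, 442], [1, 2])
```

With these assumed values the observation period splits into four intervals of 35, 42, 105 and 183 days, and only the second carries the risk period code 2.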
The analysis file
This is used to input the data and to define the vectors, lists and model required as input to the macros. Some post-analysis processing is also carried out to produce relative incidence estimates (the estimates from the model fit are on the log scale) and, if required, deviances from fits of nested models.
The output includes the minimum and maximum values of the start and end of the observation periods, frequency tables of events by age group and risk period, deviance and model parameters (the parameters relating to exposure are listed last).
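The conversion from the log scale to a relative incidence is simple exponentiation. The Python sketch below shows the arithmetic only; it is not the post-processing code in the analysis files. The Wald-type 95% interval is an addition made here for illustration, and the estimate and standard error are hypothetical numbers.

```python
import math

def relative_incidence(log_estimate, std_error, z=1.96):
    """Return (RI, lower, upper) from a log-scale estimate and its standard error."""
    ri = math.exp(log_estimate)
    lower = math.exp(log_estimate - z * std_error)
    upper = math.exp(log_estimate + z * std_error)
    return ri, lower, upper

# Hypothetical exposure parameter: log(3) with standard error 0.5.
ri, lower, upper = relative_incidence(math.log(3.0), 0.5)
```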
Suitable analysis files for some of the analyses described in the paper are given below.
MMR and meningitis in Oxford example
The analysis file is oxford.fil. The data are included directly in the file as the dataset is so small. There are just two age groups, as described in the tutorial paper.
To run this analysis, make sure that the macro suite sccs.mac is in the same directory. Then start at the GLIM prompt and type
The output you should obtain is shown in the file oxford.out.
MMR and ITP examples
The data for these examples are in the file itp.dat. Unvaccinated individuals are assigned an MMR age of -1.
The variable new takes the value 1 if the corresponding event is not a repeat event, and 0 if it is. The variable IND_ is thus obtained as the cumulative sum of new.
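The cumulation step can be checked with a few lines of Python. This is an illustration with made-up values of new, not the GLIM code used in itp.fil.

```python
from itertools import accumulate

# Hypothetical indicator: 1 for a first event, 0 for a repeat event
# in the same individual as the line above.
new = [1, 0, 1, 1, 0, 0, 1]

# The running total gives a sequential individual identifier per event.
ind = list(accumulate(new))

# The largest identifier is the number of distinct cases.
n_cases = max(ind)
```

Here the seven events belong to four individuals, and repeat events share the identifier of the first event for that individual.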
The analysis file is itp.fil. This includes a likelihood ratio test: the deviance difference between the model with the vaccine effect and the model without is obtained by
The output for this analysis is in itp.out.
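Once the two deviances are available, the likelihood ratio test reduces to a short calculation. The Python sketch below assumes that the vaccine effect is a single parameter, so the deviance difference is referred to a chi-squared distribution on 1 degree of freedom; for that case the tail probability equals erfc(sqrt(x/2)). The function name and the deviances are hypothetical and are not taken from itp.out.

```python
import math

def lrt_p_value_1df(deviance_without, deviance_with):
    """Deviance difference and its chi-squared (1 df) tail probability."""
    diff = deviance_without - deviance_with
    if diff < 0:
        raise ValueError("the larger model cannot have the larger deviance")
    # For 1 df, P(chi-squared > x) = erfc(sqrt(x / 2)).
    return diff, math.erfc(math.sqrt(diff / 2.0))

# Hypothetical deviances for the models without and with the vaccine effect.
diff, p_value = lrt_p_value_1df(103.841, 100.0)
```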
OPV and intussusception examples
The data for these examples are in intus.dat. There are some multiple events. The variable indiv records individual codes; however, some recoding is needed to obtain sequential individual codes for IND_.
Note that in analysis 2, unwanted records must be deleted before IND_ is defined.
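The order of operations matters because IND_ must run from 1 to the number of cases without gaps. The Python sketch below illustrates this with invented individual codes and an invented selection flag; it does not reproduce the recoding used in the intussusception analysis files.

```python
def sequential_ids(codes):
    """Map arbitrary individual codes to 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    out = []
    for code in codes:
        if code not in mapping:
            mapping[code] = len(mapping) + 1
        out.append(mapping[code])
    return out

# Hypothetical records: (individual code, wanted in this analysis?).
records = [(1012, True), (1012, False), (1047, True),
           (1060, False), (1103, True), (1103, True)]

# Correct order: delete unwanted records first, then recode.
kept = [code for code, wanted in records if wanted]
ind = sequential_ids(kept)

# Wrong order: recode first, then delete. The identifiers now have a gap.
all_ids = sequential_ids([code for code, _ in records])
ind_wrong = [i for i, (_, wanted) in zip(all_ids, records) if wanted]
```

Deleting first gives identifiers 1, 2, 3, 3, whereas recoding first leaves 1, 2, 4, 4, whose maximum no longer equals the number of cases.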
Covariates and interactions
For this analysis the exposure cut points need to be recoded to allow for possible overlapping. How much recoding is needed depends on how much overlap there is: in these data the minimum separation between doses is 21 days so relatively little recoding is required. The analysis files for analyses 4 and 5 are intus4.fil and intus5.fil, with output in intus4.out and intus5.out.
GLIM can be coaxed into fitting models for multiple exposures, but this requires that some of the macros in the sccs.mac file be rewritten. Accordingly we shall not present such analyses here.
Specifying semi-parametric models in GLIM is actually easier than specifying parametric models, as the age groups need not be defined. A different suite of macros, called spcs.mac, is used. The analysis files are much the same as before, the main difference being that the list LA_ need no longer be defined.
In the semi-parametric model, a separate age effect parameter is fitted at each distinct age at which an event occurs, so the model takes longer to fit and the output file is longer as well. Modelling issues (covariates, multiple risk periods, etc.) are dealt with exactly as with the parametric model.
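To see why the fit grows with the data, the Python sketch below derives the age factor levels from the ages at event: one level per distinct age. It illustrates the idea only and is not taken from spcs.mac; the ages and the function name are hypothetical.

```python
def age_levels(event_ages):
    """Return the distinct event ages and the age-factor level of each event."""
    distinct = sorted(set(event_ages))
    level_of = {age: i + 1 for i, age in enumerate(distinct)}
    return distinct, [level_of[a] for a in event_ages]

# Hypothetical ages at event (in days) for six events.
distinct, levels = age_levels([412, 398, 412, 530, 398, 601])
```

Six events at four distinct ages give an age factor with four levels, so the number of age levels is bounded by the number of events rather than by a chosen set of age groups.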
Examples of semi-parametric modelling