Observational study designs

“There are only a handful of ways to do a study properly but a thousand ways to do it wrong.” – David Sackett

Types of epidemiological studies

Epidemiological studies are traditionally classified as either observational or experimental. The ultimate paradigm in epidemiologic research is an experiment in which the investigator manipulates the intervention or exposure. In practice, the ethical problems of human experimentation and the cost of such studies almost invariably preclude extensive use of the experimental design. Most studies, therefore, are observational in nature. In an observational study, the investigator measures but does not intervene. For example, the rate of occurrence of acute myocardial infarction among smokers may be compared to the rate among nonsmokers; in this case, the investigator does not decide who smokes. Observational designs range from relatively weak designs, such as descriptive and ecological studies, to strong designs, such as case control and cohort studies. This chapter provides a general overview of the various observational designs.

Descriptive studies

A descriptive study is the weakest epidemiological design. The investigators merely describe the health status of a population or the characteristics of a number of patients. Description is usually done with respect to time, place and person. A case series is an example of a descriptive study: it offers limited information about a group of patients, their clinical characteristics and their outcomes. Descriptive studies are weak because they make no attempt to link cause and effect, and therefore no causal association can be determined. Descriptive studies, however, are often the first step towards a well designed epidemiological study. They allow the investigator to define a good hypothesis, which can then be tested using a better design. For example, Gottlieb’s description of a rare form of pneumonia (Pneumocystis carinii pneumonia) among young adult male homosexuals in 1981 helped in identifying and characterizing HIV disease (Gottlieb MS 1981).

Ecological studies

Ecological studies are also weak designs. Here the units of study are populations rather than individuals. For example, when coronary artery disease (CAD) prevalence rates were compared between countries, it was found that CAD rates were highest in those countries where mean serum cholesterol values were highest. CAD rates were very low in countries like Japan (low mean serum cholesterol) and very high in countries like Finland (high mean serum cholesterol). This ecological link paved the way for intensive investigation into the association between serum cholesterol and CAD. Another example is the ecological link between malaria incidence and the prevalence of sickle cell disease: malaria is rare in areas where sickle cell disease is prevalent. The association between smoking and lung cancer was supported by the ecological link between smoking and gender (males had higher lung cancer rates). Ecological studies can be useful in generating hypotheses, but no causal inference can be drawn from them: an apparent ecological link may not be a true link, since it could be confounded by several other factors.

Cross sectional studies

In this design, measurements are made on a population at one point in time. An example is a survey done in a village to identify the number of individuals with hypertension. Here the villagers are screened with a blood pressure measurement at one point in time. The frequency of hypertension is then examined in relation to age, sex, socioeconomic status, and other risk factors for hypertension. Cross sectional studies measure the prevalence of disease and are also called prevalence studies. Since there is no longitudinal component, cross sectional surveys cannot measure the incidence of any disease.

Cross sectional studies are easy to do and tend to be economical, since repeated data collection is not done. They yield useful data on the prevalence of diseases, and this is often good enough to assess the health situation of a population.

The main problem with a cross sectional study stems from the fact that the exposure and the outcome are measured simultaneously. So, even if a strong association is found between an exposure and the outcome, it is not easy to determine which occurred first, the exposure or the outcome. In other words, causal inferences cannot be made on the basis of cross sectional data.

Cohort Studies

Cohort studies are considered the strongest of all observational designs. A cohort study is conceptually very straightforward. The idea is to measure and compare the incidence of disease in two or more study cohorts. The word cohort derives from the Latin word for one of the ten divisions of a Roman legion (army). In epidemiology, a cohort is a group of people who share a common experience or condition. For example, a birth cohort shares the same year of birth; a cohort of smokers has smoking as the common experience; a cohort of oral contraceptive users shares oral contraceptive pill (OCP) use as the common experience.

Usually, there is one cohort which is thought of as the exposed cohort (individuals in this cohort have been exposed to some event or condition) and another cohort which is thought of as the unexposed cohort. For example, in the classic cohort study on smoking and lung cancer [Doll & Hill 1964], the exposure factor was smoking. A cohort of smokers and a cohort of nonsmokers were followed up, and the incidence of lung cancer was measured and compared. Normally, an effort is made to match both cohorts with respect to age, sex and other important variables, so that the only key difference between the two cohorts is the exposure status. If exposure is also matched, then the cohort study is doomed!

Cohort studies are usually prospective, or forward looking. They are also called longitudinal studies. Disease free cohorts are defined on the basis of exposure status and are then followed up for long periods (the length of follow up depends on the natural history of the outcome disease and on how rare the outcome is). New cases of the disease are picked up during follow up, and the incidence of the disease is computed for each exposure group. The incidence in the exposed cohort is then compared with the incidence in the unexposed cohort. The ratio of the two is called the Relative Risk (RR) or Risk Ratio.

Relative Risk = Incidence in the exposed cohort / Incidence in the unexposed cohort

The relative risk is a measure of association between the exposure and the outcome. The further the RR is above 1, the stronger the association; an RR of 1 indicates no association. As can be seen, the cohort study is the only study design in which the true incidence of a disease can be estimated. The RR is therefore considered the best measure of association.

Cohort studies are very strong designs, but they are very time consuming and expensive. Since most diseases are rare, large cohorts have to be followed up for many years to get good estimates of incidence and RR. This often makes them difficult to carry out. The Framingham study, for example, cost the US government millions of dollars.

Example: Doll R, Hill AB. Mortality in relation to smoking: ten years’ observations of British doctors. BMJ 1964;1:1399-1410, 1460-1467.

This classic study is the most cited example of a cohort study. The cohort was a group of British male doctors listed in the British Medical Register. Data on smoking status (the exposure) were obtained on 34,445 male physicians. The occurrence of lung cancer in this cohort was documented over a period of 10 years from death certificates and from lists of physician deaths provided by the British Medical Association. Diagnoses of lung cancer were based upon the best evidence available. The results revealed that the incidence of lung cancer among nonsmokers was 0.07 per 1000 per year, while the incidence among smokers was 1.30 per 1000 per year. The Relative Risk was 18.6. Thus, smokers appeared to have about 18 times the risk of lung cancer of nonsmokers.
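The arithmetic behind this figure can be checked with a few lines of Python. This is only a sketch: the function name relative_risk is illustrative and is not taken from the study, and the two incidence rates are the ones quoted above.

```python
def relative_risk(incidence_exposed, incidence_unexposed):
    """Ratio of the incidence in the exposed cohort to that in the unexposed cohort."""
    if incidence_unexposed <= 0:
        raise ValueError("incidence in the unexposed cohort must be positive")
    return incidence_exposed / incidence_unexposed


# Lung cancer incidence per 1000 per year, as quoted in the text.
incidence_smokers = 1.30
incidence_nonsmokers = 0.07

rr = relative_risk(incidence_smokers, incidence_nonsmokers)
print(f"Relative risk = {rr:.1f}")
```

Both rates must be expressed in the same units (here, per 1000 per year) for the ratio to be meaningful.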

As can be seen in this study, cohort studies have the major advantage of greater assurance that the exposure preceded the outcome (smoking preceded lung cancer). This clear temporal (time) sequence is extremely important when making causal inferences. Cohort studies are advantageous for another reason: the effect of a given exposure can be studied for multiple outcomes at the same time. For example, in a cohort study on smoking, the association of smoking with several outcomes (lung cancer, coronary heart disease, stroke, etc.) can be studied.

Case Control Studies

Conceptually, case control studies are more difficult to comprehend than cohort studies. In a cohort study, disease free exposed and non-exposed cohorts are followed up, and outcome events are picked up as and when they occur. In a case control design, sampling starts with diseased and non-diseased individuals, who are called cases and controls respectively. The exposure status is then determined by looking backward in time (using documentation of exposures or recall of historical events). For this reason, case control studies are also called retrospective studies. The measure of association in a case control study is called the Odds Ratio (OR). The OR is the ratio of the odds of exposure among cases to the odds of exposure among controls. If the disease is rare, the OR tends to be a good approximation of the Relative Risk (RR). However, true incidence estimates cannot be generated from a case control study.
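The rare disease approximation can be illustrated with a small Python sketch. All counts below are hypothetical and invented purely for illustration; the function names and the cell labels a, b, c, d are likewise assumptions, not notation from the studies cited in this chapter.

```python
def risk_ratio(a, b, c, d):
    """Relative risk from a fully enumerated population.

    a: exposed with disease,   b: exposed without disease,
    c: unexposed with disease, d: unexposed without disease.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio (cross-product ratio) from the same four cells."""
    return (a * d) / (b * c)


# Hypothetical counts, invented for illustration only.
rare = (30, 9970, 10, 9990)      # disease affects well under 1% of each group
common = (600, 400, 200, 800)    # disease affects 20% to 60% of each group

for label, cells in (("rare disease", rare), ("common disease", common)):
    print(label, "RR =", round(risk_ratio(*cells), 2), "OR =", round(odds_ratio(*cells), 2))
```

In both hypothetical populations the RR is 3.0, but the OR is about 3.01 when the disease is rare and 6.0 when it is common, which is why the OR is read as an estimate of the RR only for rare diseases.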

Case control studies are much simpler and easier to do than cohort studies, and they are very cost-efficient. Unfortunately, lack of a clear understanding of case control methodology has led people to believe that it is a second-rate substitute for the cohort study. In reality, case control designs have a sound theoretical basis, and well designed case control studies can provide information as good as that from cohort studies [Rothman & Greenland 1998].

Example: Herbst AL, et al. Adenocarcinoma of the vagina: association of maternal stilbestrol therapy with tumor appearance in young women. NEJM 1971;284(15):878-881.

In this classic study, investigators were trying to determine the factors responsible for the unusual occurrence of a rare tumor (vaginal adenocarcinoma) among 8 young women born between 1946 and 1951. For each of these 8 cases, 4 matched controls (women who did not have vaginal adenocarcinoma) were selected by examining the birth records of the hospital in which each patient was born. Females born within five days of each case, and on the same type of service (ward or private), were identified. The mothers of all these women were interviewed. The results revealed that the mothers of 7 of the 8 cases had been given diethylstilbestrol (DES, an estrogen) during pregnancy, while none of the mothers of the controls (0 of 32) had taken stilbestrol during pregnancy (P < 0.00001). This was one of the earliest landmark case control studies. Certain important advantages of the case control design are apparent in this study:

  • Case control studies are the best design for investigating the etiology of rare diseases; if this hypothesis were to be tested using a cohort design, several thousand mothers who had received DES would have had to be followed up until their daughters developed vaginal tumors.
  • A case control study allows the investigator to explore multiple possible associations with a disease simultaneously. In this study, mothers were asked to recall several exposure events, such as smoking during pregnancy, bleeding during pregnancy, intra-uterine X-ray exposure, etc.
  • The sample size required for case control studies is often considerably smaller; with just 8 cases and 32 controls, a powerful association was demonstrated in this study.
  • Case control studies are remarkably cost-efficient. This study was done at almost no cost and in a very short time.
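To see how 8 cases and 32 controls can yield such a small P value, the sketch below applies a two-sided Fisher’s exact test to the unmatched 2×2 table (7 of 8 case mothers exposed, 0 of 32 control mothers exposed). This is an assumption made for illustration: it ignores the matching, and it is not necessarily the test the original authors used. The function names are hypothetical.

```python
from math import comb


def hypergeom_prob(k, n_cases, n_controls, n_exposed):
    """Probability that exactly k of the n_exposed subjects are cases, with all margins fixed."""
    return comb(n_cases, k) * comb(n_controls, n_exposed - k) / comb(n_cases + n_controls, n_exposed)


def fisher_exact_two_sided(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Two-sided Fisher's exact P value: sum of all table probabilities <= the observed one."""
    n_cases = exposed_cases + unexposed_cases
    n_controls = exposed_controls + unexposed_controls
    n_exposed = exposed_cases + exposed_controls
    p_observed = hypergeom_prob(exposed_cases, n_cases, n_controls, n_exposed)
    k_min = max(0, n_exposed - n_controls)
    k_max = min(n_cases, n_exposed)
    total = 0.0
    for k in range(k_min, k_max + 1):
        p = hypergeom_prob(k, n_cases, n_controls, n_exposed)
        if p <= p_observed * (1 + 1e-9):  # small tolerance for floating point ties
            total += p
    return total


# Counts quoted in the text: 7 of 8 cases exposed, 0 of 32 controls exposed.
p_value = fisher_exact_two_sided(7, 1, 0, 32)
print(f"P = {p_value:.2e}")
```

Under these assumptions the P value is well below 0.00001, in line with the figure quoted above.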

Case control studies are often criticized because of the possibility of various types of bias. For example, if the control group selected for comparison has unrepresentatively low odds of exposure, then the resulting OR will be biased. Other problems, such as information bias (recall bias) and confounding, can also make case control studies difficult to handle. For example, mothers whose daughters had developed adenocarcinoma may have been more likely to recall historical events (like consumption of DES) than mothers who had healthy daughters. This is called recall bias. It underlines the crucial importance of unbiased exposure ascertainment for both cases and controls (preferably by a person who is blinded to their case or control status). Because they rely on the history of past exposure, case control studies also suffer from the problem of unreliable data. Memory for many events fades, and if no documentation of past exposure exists, the results of the study may be invalid.

Choosing the right study design

Rarely is only one type of study design appropriate to a study question. As can be seen in Table 1, each observational design has its own strengths and weaknesses. While cohort studies tend to be the strongest, they also tend to be very expensive, time-consuming and difficult. At the other extreme, cross sectional and ecological studies may be easy to do but do not allow any causal inference.

                              Ecological  Cross-sectional  Case control  Cohort
Probability of
  selection bias              NA          medium           high          low
  recall bias                 NA          high             high          low
  loss to follow-up           NA          NA               low           high
  confounding                 high        medium           medium        low
Time required                 low         medium           medium        high
Cost                          low         medium           medium        high
Strength of causal inference  low         low              medium        high


Table 1: Comparison of various study designs. NA: not applicable. Source: modified from Beaglehole et al, 1993.

The decision to choose a particular study design depends on the research question and the resources available for the study. For example, as seen in Table 2, if a rare condition is being investigated, it may be almost impossible to do a cohort study. If incidence needs to be measured, then cohort studies are the only design that allows it. If causal etiology is being investigated, ecological and cross sectional designs may be totally inappropriate.


                                              Ecological  Cross-sectional  Case control  Cohort
Investigation of rare disease                 ++++        -                +++++         -
Investigation of rare exposure                ++          -                -             +++++
Testing multiple outcomes of an exposure      +           ++               -             +++++
Study of multiple exposures and determinants  ++          ++               ++++          +++
Measurement of temporal sequence              ++          -                +             +++++
Measurement of incidence rates                -           -                -             +++++

Table 2: Applications of the various study designs (+ to +++++: increasing degree of suitability; -: not suitable).


References & further reading

1. Hulley SB et al. Designing Clinical Research. Baltimore: Williams & Wilkins, 1988.

2. Last JM. A Dictionary of Epidemiology, 3rd Edition. Oxford University Press, 1995.

3. Rothman KJ, Greenland S. Modern Epidemiology, 2nd Edition. Philadelphia: Lippincott-Raven, 1998.

4. Beaglehole R, Bonita R, Kjellstrom T. Basic Epidemiology. World Health Organization, 1993.

5. MacMahon B, Trichopoulos D. Epidemiology. Principles & Methods, 2nd Edition. Little Brown and Co, 1996.

6. Gottlieb MS et al. Pneumocystis carinii pneumonia and mucosal candidiasis in previously healthy homosexual men: evidence of a new acquired cellular immunodeficiency. NEJM 1981;305(24):1425-1431.

7. Riegelman RK, Hirsch RP. Studying a Study and Testing a Test. How to Read the Health Science Literature, 3rd Edition. Little Brown and Company, 1996.

8. Sackett DL, Wennberg JE. Choosing the best research design for each question. BMJ 1997;315:1636.

Exercise: Observational Study Designs

Exercise 1. In an attempt to measure the effect of birth weight on the subsequent growth of children, a concurrent cohort study was carried out. 300 children with birth weights of 2 kg to 2.5 kg were followed till age one, when anthropometric measurements were made to assess their nutritional status. A similar number of children born during the same period with birth weights greater than 2.5 kg were followed up in the same way. Information on the socioeconomic status of the families was also obtained.

The following were the results:

                                     Low birth weight    Normal weight

No. of children studied
No. found malnourished at age one

1. What is the exposure factor that is being studied in this case?

2. What is the incidence of malnutrition among the exposed (a) =

3. What is the incidence of malnutrition among the unexposed (b) =

4. Calculate the Relative Risk (RR) =

5. Describe in one sentence what this relative risk means to you:

6. Do a chi-square test to see if the difference is statistically significant:

Observed values
                      Malnourished    Well nourished
Low birth weight
Normal weight

Expected values
                      Malnourished    Well nourished
Low birth weight
Normal weight

Null hypothesis:
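As a reminder of the mechanics, the following sketch computes expected values and the chi-square statistic for a 2×2 table. The counts are hypothetical and invented for illustration (they are not the data of this exercise), and the function names are assumptions. Each expected value is (row total × column total) / grand total.

```python
def expected_counts(observed):
    """Expected cell counts under the null hypothesis of no association."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def chi_square(observed):
    """Pearson chi-square statistic (no continuity correction)."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical table: rows = exposed, unexposed; columns = malnourished, well nourished.
observed = [[40, 60],
            [20, 80]]

expected = expected_counts(observed)
statistic = chi_square(observed)
print("Expected:", expected)
print(f"Chi-square = {statistic:.2f} (1 degree of freedom; critical value 3.84 at P = 0.05)")
```

For a 2×2 table there is one degree of freedom, so a statistic above 3.84 would lead to rejecting the null hypothesis at the 5% level.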


Calculate the attributable risk among the exposed:

((a - b) / a) x 100 = _____% or
((RR - 1) / RR) x 100 = _____%

Describe in one sentence what this means to you:

Assuming that 10% of babies are born with low birth weight (below 2.5 kg), so that Pe = 0.1, what is the population attributable risk?

PAR = [Pe (RR - 1)] / [1 + Pe (RR - 1)] x 100 = _____%

Describe in one sentence what this means to you:
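The two attributable risk measures can be sketched in Python as follows. The incidence values are hypothetical and invented for illustration; only Pe = 0.1 is taken from the question above. The population attributable risk is computed here with Levin’s formula, Pe (RR - 1) / [1 + Pe (RR - 1)], and the function names are assumptions.

```python
def attributable_risk_percent(incidence_exposed, incidence_unexposed):
    """Percentage of disease among the exposed that is attributable to the exposure."""
    return (incidence_exposed - incidence_unexposed) / incidence_exposed * 100


def attributable_risk_percent_from_rr(rr):
    """Same quantity expressed through the relative risk: (RR - 1) / RR x 100."""
    return (rr - 1) / rr * 100


def population_attributable_risk_percent(pe, rr):
    """Levin's formula; pe is the proportion of the population that is exposed."""
    excess = pe * (rr - 1)
    return excess / (1 + excess) * 100


# Hypothetical incidences (not the exercise data): 30% in exposed, 10% in unexposed.
a, b = 0.30, 0.10
rr = a / b

ar_percent = attributable_risk_percent(a, b)
par_percent = population_attributable_risk_percent(pe=0.1, rr=rr)
print(f"RR = {rr:.1f}, AR among exposed = {ar_percent:.1f}%, PAR = {par_percent:.1f}%")
```

With these hypothetical figures, about two thirds of the disease among the exposed, but only about one sixth of the disease in the whole population, would be attributable to the exposure, because only 10% of the population is exposed.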

The following table shows the distribution of the two groups of children according to their socioeconomic class.

Socioeconomic status        Low birth weight        Normal weight



7. Are the two groups similar with respect to socioeconomic status?

(later you could do a chi-square test to see if the two groups are significantly different.)

It appears possible that at least some of the difference in the nutritional status between the two groups could have been due to the difference in socioeconomic status. If we had thought of this earlier, we could have chosen the comparison group after matching for socioeconomic status. Socioeconomic status may be acting as a ‘confounding factor’ in this instance.

Exercise 2. In a study to measure the protective effect of BCG against tuberculous meningitis, all cases of TB meningitis diagnosed in 5 hospitals in a large city during the year 1998 were selected. An equal number of controls matched for age, sex and neighbourhood (geographical location) were also selected. The BCG vaccination status of the cases and controls was assessed by trained workers, who looked for the typical vaccine scar over the deltoid region.

There were 60 cases and 60 controls. 25% of the cases and 50% of the controls had BCG scars.

1. What sort of an epidemiological study design is this? Why?

2. What is the exposure factor?

3. What is the outcome?

4. Set up the 2×2 table for unmatched analysis:


                     Cases    Controls

BCG scar positive
BCG scar negative

5. Calculate the odds ratio.

Since TB meningitis is a rare disease, the calculated OR must be a good estimate of the relative risk.

6. Calculate the protective effect of BCG against TB meningitis.

7. Why were the children matched for

a) age b) sex c) neighbourhood?

8. Since the controls were chosen after matching for important confounders, the correct way to analyse the data is by using matched pair analysis.

  • Number of pairs where both the case and the control were vaccinated = 10 pairs
  • Number of pairs where the case was vaccinated but the control was not = 5 pairs
  • Number of pairs where the control was vaccinated but the case was not = 20 pairs
  • Number of pairs where neither the case nor the control was vaccinated = 25 pairs

Set up the 2×2 table:

              Control +    Control -
Case +
Case -


Calculate the odds ratio (b/c) =

Calculate the protective effect =
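The matched pair calculation uses only the discordant pairs. The sketch below shows the mechanics with hypothetical pair counts, invented for illustration rather than taken from this exercise. Here b is the number of pairs in which only the case was exposed and c the number in which only the control was exposed; the protective effect is taken as (1 - OR) x 100, the usual definition for a protective exposure such as a vaccine. The function names are assumptions.

```python
def matched_pair_odds_ratio(b, c):
    """Matched pair odds ratio: ratio of the two kinds of discordant pairs.

    b: pairs where the case was exposed and the control was not,
    c: pairs where the control was exposed and the case was not.
    """
    if c == 0:
        raise ValueError("odds ratio is undefined when c is zero")
    return b / c


def protective_effect_percent(odds_ratio):
    """Protective effect of an exposure whose odds ratio is below 1."""
    return (1 - odds_ratio) * 100


# Hypothetical discordant pair counts (not the exercise data).
or_matched = matched_pair_odds_ratio(b=8, c=24)
effect = protective_effect_percent(or_matched)
print(f"Matched OR = {or_matched:.2f}, protective effect = {effect:.1f}%")
```

Concordant pairs (both exposed or both unexposed) carry no information about the association and therefore do not enter the calculation.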

9. Could there have been bias in ascertaining the exposure factor?

10. Is it easy to establish that a case of meningitis is due to tuberculosis?

11. What would happen if there had been errors in

a. reading the BCG scar

b. diagnosis of TB meningitis

Exercise 3. A large study was performed to ascertain the relationship between alcohol consumption and CAD. 2000 people aged 30 to 60 years in a small town were listed as eligible for the study. Of those, 1500 agreed to participate.

They underwent a complete physical examination and investigations to rule out CAD (100 people were found to have CAD and were excluded). Information about alcohol consumption and other risk factors was also collected at this stage.

Over the next 10 years, this population was followed up biannually and screened for CAD. The results at the end of follow up are given below:

Alcohol consumption    CAD    No CAD
Ever                   475    425
Never                  240    260

a) What type of study design is this?

b) Is there an association between alcohol consumption and CAD?

c) Was the comparison ever vs never valid?

To ascertain dose response, an n × 2 table was set up to measure the association between the amount of alcohol consumed and the risk of CAD.

Amount of alcohol consumption
(drinks per day)                 CAD    No CAD    Relative risk
Nil                              240    260
<=2                              50     200
3-5                              125    125
6+                               300    100

d) What is the association between amount of alcohol consumption and risk of CAD?

Exercise 4. A recent study by Pais et al (Lancet 1996;348:358-63) looked at the risk factors for acute myocardial infarction (AMI) in India. 200 cases with a first AMI and 200 age and sex matched controls without AMI were evaluated for several risk factors, including overt diabetes mellitus (DM). The results were as follows:

DM + among AMI cases: 36

DM + among controls : 18

The remaining in each group were negative for DM.

a) What kind of a study design is this?

b) What is the exposure factor?

c) What is the outcome?

d) Set up a 2×2 table and compute the Odds Ratio (OR).

            Cases    Controls
DM +
DM -

e) From this OR, do you think DM is a risk factor for AMI?

f) The 95% CI for the OR was calculated to be 1.31-5.30. Does this signify an association between DM and AMI?

g) If you had wanted to test the same hypothesis using a cohort study design, how would you do it?

Exercise 5. A team of researchers from a medical college decided to estimate the prevalence of neurocysticercosis in the population of Madras city. They chose 1000 individuals attending various OPDs in a Government Hospital and performed CT scans on all of them. They did not find a single case among those studied. The group concluded that neurocysticercosis was not an important problem in the general population.

a) What kind of a study design is this?

b) What was the stated objective of the study?

c) What was the population chosen for the study?

d) In earlier studies, the prevalence of neurocysticercosis in the general population in India has been estimated to be 1 per 10,000. Given this expectation, was a sample size of 1000 adequate?

e) What are the biases in this study?

f) Is the conclusion of the study valid?

Exercise 6. A team of researchers from a tertiary referral eye centre decided to estimate the prevalence of diabetic retinopathy among south Indian diabetics. They chose 1000 consecutive diabetic individuals attending their OPD and performed retinal photography on all of them. The photographs were divided equally among a panel of 6 ophthalmologists, who read them. Based on their reports, 554 of those studied were diagnosed as having diabetic retinopathy. The group concluded that diabetic retinopathy was a major problem among diabetics in south India.

a) What kind of a study design is this?

b) What was the stated objective of the study?

c) What was the population chosen for the study?

d) What are the biases in this study?

e) Is the conclusion of the study valid?

Dr. Madhukar Pai MD, DNB
Consultant, Community Medicine & Epidemiology