The study of the distribution, determinants and deterrents of morbidity and mortality in human populations. Works on groups, not on individuals. Assumes causal and preventive factors can be identified through systematic investigation of different populations or subgroups of individuals in a population in different places or times.
Risk factors (Antecedents)
A behavior, environmental exposure or inherent human characteristic that is associated with an important health related condition. Associated with an increased probability of disease, but does not always cause the disease.
Assumptions of Epidemiology
1) Human Disease does not occur at random, 2) Causal and preventative factors can be identified through systematic investigations of different populations/subgroups of individuals in a population in different places or times
Measures frequency/prevalence of disease and describes the existing distribution of variables without regard to causal associations. Examines distribution of disease (person, place, time). Cannot test a hypothesis, but descriptive studies can generate one.
Identifying causes of disease with a goal of prevention (deterrents). Tests a hypothesis. We do them to look for a relationship/association between an exposure and disease. We need a measure of this association
Characteristic positively associated with an important health related condition
Proposed that the external and personal environment be considered to explain development of disease. Defined Epidemic and Endemic.
Published: Natural and Political Observations Made Upon the Bills of Mortality. Collected info on births & deaths, credited as founder of medical (vital) stats
First clinical trial on sailors with scurvy.
Noted milkmaids got cowpox, not smallpox and used cowpox pustule to create smallpox vaccine
Compiled/reported numbers of deaths in Office of Registrar General for England and Wales. Developed forerunner of ICD codes. Called Father of Modern Epi.
First to propose/test hypothesis: that Cholera is transmitted by water. Compiled data from water supplier and neighborhood. Broad Street Pump.
Reasons for IRB
Created because of Nuremberg, PHS Syphilis Trial, Willowbrook Hepatitis Trial, Jewish Chronic Disease Hospital Cancer Trial and Milgram Deception Study.
Public report after Tuskegee Syphilis Study from National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. Defines 3 Ethical Principles: 1) Respect for Persons, 2) Beneficence, 3) Justice.
Health v. Disease
Health is defined by physical, mental and social well-being. In Epi, we must quantify this, so we measure disease in terms of morbidity and mortality.
Group of people with common characteristic(s). We need to know the population to determine who is at risk
a/b Numerator and denominator are mutually exclusive, no implied relationship. Used to compare the magnitude of two or more measures
a/(a+b) Numerator is included in the denominator. Tells us what fraction or percent is affected. No time period is in the calculation
a/(a+b) per specific period of time (usually person-years). Numerator is included in the denominator.
Case Fatality Rate
Measure of prognosis or the rate at which people die of a disease (#of deaths from a disease/# cases of the disease)
A measure of prognosis: the probability of surviving a specified time period ((# new cases with disease - # deaths among these cases)/# new cases)
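The two prognosis measures above can be sketched in a few lines. This is a minimal illustration, assuming hypothetical counts (200 new cases, 30 deaths among them); the function names are made up for the example.

```python
def case_fatality_rate(deaths_from_disease, cases_of_disease):
    """# deaths from a disease / # cases of the disease."""
    return deaths_from_disease / cases_of_disease


def survival_rate(new_cases, deaths_among_cases):
    """(# new cases - # deaths among these cases) / # new cases."""
    return (new_cases - deaths_among_cases) / new_cases


# Hypothetical counts, for illustration only.
cfr = case_fatality_rate(deaths_from_disease=30, cases_of_disease=200)
surv = survival_rate(new_cases=200, deaths_among_cases=30)
print(f"case fatality: {cfr:.2%}, survival: {surv:.2%}")
```

When both are computed on the same cases over the same period, the two values are complements of each other.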
Years of Potential Life Lost (YPLL)
For every death that occurs before the selected endpoint (an arbitrary age at or beyond which death is NOT considered premature), the age at death is subtracted from the endpoint, and the resulting years are summed. Measures relative impact of premature death on society
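A short sketch of the YPLL calculation, assuming an endpoint of 75 years and a made-up list of ages at death (both are illustrative choices, not values from these notes):

```python
def ypll(ages_at_death, endpoint=75):
    """Sum of (endpoint - age) over deaths that occur before the endpoint."""
    return sum(endpoint - age for age in ages_at_death if age < endpoint)


# Hypothetical ages at death; deaths at or after the endpoint add nothing.
ages = [34, 60, 75, 82, 5]
total_ypll = ypll(ages, endpoint=75)
print(total_ypll)  # (75-34) + (75-60) + (75-5) = 126
```

Deaths at young ages dominate the total, which is why YPLL weights premature mortality more heavily than a simple death count.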
# cases of a given disease that exist in a defined population at a specified time. Measures actual cases (or burden of disease) so we use this data to plan delivery of health services. Most commonly measured in cross-sectional studies and may identify prognostic factors rather than etiologic factor (who survives rather than who gets a disease). Prevalent cases determined by both disease occurrence and duration/survival.
Proportion of a defined population that has a specific disease or attribute. It is not a rate, even though "rate" is in the name. May be point prevalence rate, period prevalence rate or lifetime prevalence rate. Unlikely, but IF ID and duration of disease are constant, and the prevalence is
# existing cases at a point of time/total population, sick and healthy (a ‘snapshot’ at a date or event)
Measured over a period of time, but is still a proportion. Numerators include all cases at the beginning of the time and new cases occurring during the time period. Denominator includes everyone in a population.
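As a rough illustration of the difference between point and period prevalence, assuming hypothetical counts for a population of 1,000 (function names are illustrative):

```python
def point_prevalence(existing_cases, total_population):
    """# existing cases at a point in time / total population."""
    return existing_cases / total_population


def period_prevalence(cases_at_start, new_cases_in_period, total_population):
    """(cases at start of period + new cases during period) / total population."""
    return (cases_at_start + new_cases_in_period) / total_population


# Hypothetical: 50 existing cases on the survey date, 25 more during the year.
point = point_prevalence(50, 1000)
period = period_prevalence(50, 25, 1000)
print(f"point: {point:.3f}, period: {period:.3f}")
```

Both results are proportions; the period figure is larger only because its numerator also counts cases that arise during the interval.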
Lifetime Prevalence Rate
Proportion of individuals in a population who have had a given disease at any time in their life.
# of new cases or events occurring in a defined population in a specified period of time. Provides us with risk of getting disease in individuals who are not yet infected, not in actual cases
Incidence Rate (Incidence Density)
The rate at which new cases occur in a defined population. # of new cases/total person-time at risk (the disease-free time contributed by the population at risk)
Cumulative Incidence Rate (CI)
# of new cases/Total Population at risk. This is a proportion, and measures probability (risk) that an individual will develop a disease during a specified period of time. Exclude prevalent cases. Must follow people over time who were previously not diseased until an outcome occurs or the observation period ends.
accumulates when we observe a group over time to ascertain development of an event. Allows for movement of people in and out of study
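The contrast between cumulative incidence and incidence density can be sketched with a tiny, made-up follow-up table. The records and names below are assumptions for illustration; each person contributes time at risk only until disease onset or the end of observation.

```python
# Hypothetical follow-up: (years observed while disease-free, developed disease?)
follow_up = [(5.0, False), (2.5, True), (4.0, False), (1.0, True), (5.0, False)]


def cumulative_incidence(new_cases, population_at_risk):
    """# new cases / total population at risk (a proportion)."""
    return new_cases / population_at_risk


def incidence_density(new_cases, person_time_at_risk):
    """# new cases / total person-time at risk (a true rate)."""
    return new_cases / person_time_at_risk


new_cases = sum(1 for _, diseased in follow_up if diseased)
person_years = sum(years for years, _ in follow_up)

ci = cumulative_incidence(new_cases, len(follow_up))
idr = incidence_density(new_cases, person_years)
print(f"CI = {ci:.2f}; ID = {idr:.3f} cases per person-year")
```

Because the denominator is person-time, people who enter late or leave early still contribute exactly the time they were observed.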
# of cases/entire population. Difficult to compare because of variances in populations (FLA to Alaska, 1970 population to 2010 population)
# cases in a category/total population of that category. Refer to categories of the population defined by specific characteristics (age, sex, etc)
Direct Adjusted Rates
1) select standard population. 2) Multiply specific mortality (death) rate in a category by number of persons in that category in the standard population. 3) Sum the expected number for each column to get a total number expected, and divide by the total standard population. Watch your decimals and zeros in denominators
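The three steps above can be sketched as follows. The age categories, rates and standard population are invented for illustration, and the function name is hypothetical.

```python
def direct_adjusted_rate(specific_rates, standard_population):
    """Apply category-specific rates to a standard population.

    Step 2: expected deaths = specific rate * standard population, per category.
    Step 3: sum the expected deaths and divide by the total standard population.
    """
    expected = {
        category: specific_rates[category] * standard_population[category]
        for category in standard_population
    }
    return sum(expected.values()) / sum(standard_population.values())


# Step 1: choose a standard population (hypothetical numbers).
standard = {"0-39": 50_000, "40-64": 30_000, "65+": 20_000}
# Age-specific death rates (deaths per person per year) in the study population.
rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.020}

adjusted = direct_adjusted_rate(rates, standard)
print(f"{adjusted * 100_000:.0f} deaths per 100,000")
```

Running the same function with another population's specific rates against the same standard gives two adjusted rates that can be compared without the age structure getting in the way.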
Race/Ethnicity in Epi
Often used variables in research. Biologically race is ill defined, poorly understood and questionable. Fullilove described it as an arbitrary system of visual classification. DNA evidence indicates genetic diversity is a continuum with no clear breaks that delineate racial groups, and many people identify with more than 1 racial group. Ethnicity can be used instead, but is complex. Using race/ethnicity may help identify subgroups to which additional resources need to be directed. Some believe this further stigmatizes certain subgroups. If race is included in a study, there must be a strong rationale, and other variables may be better surrogates.
Short term fluctuations
Such as foodborne outbreaks
Periodic fluctuations on a seasonal/annual basis. Mostly valuable in investigations of acute diseases or those with a short latent period (ex: annual increases in influenza in cold months because of increased human contact)
Long term changes over decades or more. Can be caused by change in the actual incidence of disease due to alterations in environment, genetic or lifestyle factors. Also diagnosis, treatment or accuracy of records.
The occurrence in a community or region of cases of an illness, specific health related behavior or other health related events clearly in excess of normal expectancy. Always consider person, place and time.
Generally based on place. Aggregation of relatively uncommon events or diseases in space and/or time in amounts that are believed or perceived to be greater than could be expected by chance.
Case Reports/case series
Describe the experience of a patient or group of patients-may lead to a hypothesis (Descriptive Study)
Measure characteristics in entire populations, not individuals. May also be analytic and test a hypothesis (Descriptive Study)
Exposure and disease measured at the same time in a group of individuals. May also be analytic and test a hypothesis (Descriptive Study)
Detailed report on the clinical profile of a single patient. Strengths: 1) Document unusual medical history/clinical features of disease, 2) Can provide clues in the identification of a new disease or adverse effects of exposure. Limitations: 1) No comparison, 2) Cannot be used to test for statistical association, 3) Risk factors may be coincidental (Descriptive Study)
Collections of individual case reports in short time span. Description of clinical/epidemiologic characteristics of a number of patients with a given disease. Strengths: 1) not single case, more probability of pattern, 2) Can examine the dose-response relationship by examining levels of exposure with levels of disease severity. Limitations: No comparison group, so cannot test for presence of valid statistical association. (Descriptive Study)
Uses data from the entire population to compare disease frequencies between different groups during the same time period or the same population at different points in time (ex: per capita consumption of meat and colon cancer rates). Strengths: Cheap, quick; may stimulate additional epi research; may be only design for uncovering association at group level; becoming more popular w/GIS data. Limitations: Cannot link exposure w/disease in individuals, therefore possibility of making ecological fallacy; sources may not be accurate; cannot control for confounding factors; cannot establish temporal sequence. (Descriptive Study)
Ecological Fallacy (Aggregation Bias)
Patterns observed at the aggregate level may not hold at the individual level, so they cannot be assumed to represent the association in individuals. Cannot control for outside factors which may explain the association.
‘Snapshot’ of a cohort at one point in time: exposure and disease outcome are measured simultaneously. Can compare point prevalence ratios and prevalence odds. May be descriptive or analytic. Includes prevalent cases of disease (may have had it for 1 day or 10 years). No information on the temporal relationship between exposure and disease. Good for unchanging variables or between current and past practice. Both disease and exposure may have been the result of a third factor. Calculates prevalence ratio.
Repeated Measures Studies
Successive cross-sectional studies, repeated surveys of same population, not same individuals. Detect overall time trend in a population. (Descriptive Study)
Healthy subjects are defined by their exposure status and followed over time to determine incidence of disease. Can be prospective or retrospective. Calculates Relative Risk.
Combination of cross-sectional and cohort study. Same individuals surveyed at several points in time. Can measure changes in individuals
Subjects defined by their disease status. A group of cases is compared with a group of controls drawn from the same source population. The study looks backward at exposure of subjects. Incidence cannot be calculated in case-control studies, so relative risk cannot be calculated directly; the odds ratio is calculated to estimate relative risk.
Nested case-control studies
Type of case-control study conducted within a cohort study.
Type of Case control study in which controls are sampled from the entire population—those at risk at the start of the study
Type of Case control study in which cases serve as their own controls; good for events that have acute onset times
Intervention (Experimental) Studies
Randomized controlled clinical tests. Community or field trials
Randomized Control Trials
Random allocation of volunteers to experimental or control procedure to determine impact of experimental exposure on outcome. Sub-divided into Preventive, Intervention and Therapeutic.
Random allocation is at a community level or other group. Example: Fluoride in water
Measure of Association
A single measure that estimates the association between an exposure and the risk of developing disease. Calculated either as the ratio of two measures of disease frequency (usually incidence), called the Relative Risk, OR as the difference between the two measures of disease frequency, called risk difference/attributable risk
RR=IE/IU Incidence of exposed/Incidence of not exposed. Estimates the strength of an association between exposed and unexposed. Relative to baseline incidence. 1.0 = no relationship; >1 = Exposure is Risk Factor; <1 = Exposure is protective
# New Cases in a specified time/Total Population at Risk
# New Cases in a specified time/Total person-time of observation
(a/(a+b))/(c/(c+d)) CIE/CIU Relative risk when cumulative incidence is the incidence measure being used
(a/PYE)/(c/PYU) IDE/IDU Relative risk when Incidence Density is the incidence measure being used
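The two relative-risk formulas above, plus the risk difference, can be sketched like this. The 2x2 layout is an assumption consistent with the formulas (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without), and all counts are hypothetical.

```python
def cumulative_incidence_ratio(a, b, c, d):
    """RR = (a/(a+b)) / (c/(c+d)), i.e. CIE/CIU."""
    return (a / (a + b)) / (c / (c + d))


def incidence_density_ratio(a, py_exposed, c, py_unexposed):
    """RR = (a/PYE) / (c/PYU), i.e. IDE/IDU."""
    return (a / py_exposed) / (c / py_unexposed)


def risk_difference(a, b, c, d):
    """CIE - CIU: excess risk among the exposed."""
    return a / (a + b) - c / (c + d)


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
rr_ci = cumulative_incidence_ratio(30, 70, 10, 90)
rd = risk_difference(30, 70, 10, 90)
# Same case counts, with 1,200 and 1,600 person-years of follow-up.
rr_id = incidence_density_ratio(30, 1200, 10, 1600)
print(f"CI ratio = {rr_ci:.2f}, risk difference = {rd:.2f}, ID ratio = {rr_id:.2f}")
```

The ratio answers "how many times more likely", while the difference answers "how much extra risk"; the two ratios differ here only because the groups accumulated different amounts of person-time.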
Prevalence Rate Ratio
Used in Cross-sectional studies. Prevalence Rate Among Exposed/Prevalence among Unexposed.
The population from which the cases are derived.
Cases should be representative of all those with disease in the source population.
Problems when selecting cases
Incident cases are newly diagnosed, so it takes time for them to occur and the study takes longer. Prevalent cases are existing cases; they are available now, but may include many long-term survivors and so may not be representative of all cases.
Controls should be representative of all those without the disease in source population. We use them to compare the history of exposure among the cases with that of individuals from the same source population who are free of the disease. They are a sample of the source population that produced the cases—they provide information on the exposure distribution of the source population. Must be sampled independent of exposure status.
Selection of an appropriate comparison group is the most difficult and critical issue in the design of a case-control study. There is no perfect group; choose several if possible. If study results are consistent across control groups, it suggests results are valid. If different effects are observed, it still provides useful information about the nature of the association or about biases that might be present.
Can use dead controls, but we usually don’t. You can use proxy respondents for both, but they are not representative of source population.
General Population Controls
Used when cases are selected from the population in which the study is based, or as an alternate control group in hospital based case control studies. Can be collected from canvassing, random-digit telephone dialing, etc. Advantages: Can be selected randomly, minimizes potential for selecting certain characteristics which may alter results, and is representative of the source population. Disadvantages: Can be costly/time consuming, population lists for sampling are not always available, quality of information may differ between cases and controls, healthy individuals may be less motivated to participate (give less thoughtful answers), some might actually have the disease (you are only asking them, not testing them)
Advantages: Easily available, readily identifiable, more likely to be aware of prior exposure, more likely to be affected by same intangible factors that influenced cases to come to a particular hospital, more willing to participate. Disadvantages: People are sick, their disease may be associated with the exposure of interest, may not be representative of the source population in terms of exposure
Special control series
Controls which came from family/friends/relatives. Advantages: healthy, more likely to participate, may offer a degree of control regarding potential confounding factors. Disadvantage: Potential for inadvertently selecting a group too similar to cases
May be of the same type or different (one set of hospital controls, one set of community controls). Using more controls of the same type increases the power of the study, but the gain levels off at about 4 controls per case
Selecting controls so that they are similar to cases in certain characteristics which may be associated with the disease being studied (age, sex, race, etc)
Selecting controls so the proportion of controls with a characteristic such as age or sex is the same proportion as in the case group (if you have 70% male cases, pick 70% male controls). You must select the cases first then select controls in the same proportions for selected variables.
Individual (Pair) Matching
A control is selected for each case based on similarity in specified characteristics (age, race, sex). Often used with hospital controls. Problems: Finding a matching control is more difficult the more characteristics are matched. Cannot study a characteristic on which the cases and controls are matched. Unplanned matching may occur (the hospital neighborhood might be same race). Match only on characteristics that are risk factors for the disease and are NOT to be investigated in the study. If either partner leaves, both are eliminated
Obtaining Exposure Information
In a case control study, after choosing controls/cases, you obtain exposure information usually by interview. Interviewee may not know information or may not remember it, which may lead to misclassification
Sources of Exposure Information
Interviews, medical records, questionnaires, surrogates
Differing recall between cases and controls. Cases may be more motivated to try to remember things that might have made them sick.
An interviewer may know who has the disease and who does not, and treat subjects differently.
Case-Control Study Advantages
Good for studying relatively rare diseases, relatively fast to conduct, relatively inexpensive, relatively few subjects, existing records may be available, minimal risks to subjects, study multiple causes of disease (exposures), evaluation of disease with long latency periods
Case-Control Study Disadvantages
Selection of appropriate controls may be difficult, relies on recall or existing records, validation of information difficult, control of other variables may be difficult, cannot calculate relative risk directly or determine prevalence, temporal sequence difficult to establish, prone to selection/observer/recall bias
OR=AD/BC Also called the cross-product ratio. Odds is the number of ways an event can occur divided by the number of ways it cannot occur. Can be calculated for cohort, case-control and cross-sectional studies
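A minimal sketch of the cross-product ratio, assuming the usual 2x2 layout (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls) and invented counts:

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio: OR = (a*d) / (b*c)."""
    return (a * d) / (b * c)


def odds(ways_event_occurs, ways_event_does_not_occur):
    """Odds = # ways an event can occur / # ways it cannot occur."""
    return ways_event_occurs / ways_event_does_not_occur


# Hypothetical case-control data: 40 of 100 cases and 20 of 100 controls exposed.
a, b, c, d = 40, 20, 60, 80
or_cross = odds_ratio(a, b, c, d)
# Same result as dividing the exposure odds in cases by the exposure odds in controls.
or_from_odds = odds(a, c) / odds(b, d)
print(f"OR = {or_cross:.2f}")
```

The second calculation is there only to show that the cross-product shortcut and the ratio of two odds agree.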
In a matched-pair analysis, both case and control had the same exposure
In a matched-pair analysis, case and control had different exposures
Odds Ratio for Matched Pairs
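In a matched-pair analysis the concordant pairs carry no information about the association, so the standard estimate is the ratio of the two kinds of discordant pairs: pairs where only the case was exposed divided by pairs where only the control was exposed. A small sketch with made-up pairs (the function name and data are illustrative):

```python
def matched_pair_odds_ratio(pairs):
    """OR = (# pairs: case exposed, control unexposed) /
            (# pairs: case unexposed, control exposed).

    `pairs` is a list of (case_exposed, control_exposed) booleans.
    """
    case_only = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
    control_only = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
    return case_only / control_only


# Hypothetical 100 matched pairs.
pairs = (
    [(True, True)] * 10      # concordant: both exposed
    + [(True, False)] * 30   # discordant: only the case exposed
    + [(False, True)] * 15   # discordant: only the control exposed
    + [(False, False)] * 45  # concordant: neither exposed
)
matched_or = matched_pair_odds_ratio(pairs)
print(matched_or)  # 30 / 15 = 2.0
```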
Interpretation of OR as RR
Generally speaking in traditional case-control studies OR ~ RR when the following assumptions are met: 1) When the cases are representative, with regard to history of exposure, of all people with the disease in the population from which the cases were drawn. 2) When the controls are representative, with regard to history of exposure, of all people without the disease in the population from which the cases were drawn. 3) When the disease being studied does not occur frequently.
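The third assumption (rare disease) can be checked numerically. The sketch below uses two invented cohorts with the same relative risk of 2.0, one where the disease is rare and one where it is common, with a = exposed diseased, b = exposed non-diseased, c = unexposed diseased, d = unexposed non-diseased.

```python
def relative_risk(a, b, c, d):
    """RR = (a/(a+b)) / (c/(c+d))."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (a*d) / (b*c)."""
    return (a * d) / (b * c)


# Hypothetical rare disease: 1% of exposed and 0.5% of unexposed affected.
rare = (10, 990, 5, 995)
# Hypothetical common disease: 40% of exposed and 20% of unexposed affected.
common = (400, 600, 200, 800)

rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)
rr_common, or_common = relative_risk(*common), odds_ratio(*common)
print(f"rare:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
print(f"common: RR = {rr_common:.2f}, OR = {or_common:.2f}")
```

With the rare disease the OR sits almost on top of the RR; with the common disease the OR overstates it, which is why the rare-disease condition matters.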
Nested Case-Control Studies
A case-control study conducted within a cohort study. The cohort study provides the roster for control selection. Cases consist of all incident cases generated by the source population (cohort) over the study period. For each case a set of controls is selected from subjects at risk at the time of the disease occurrence. The same individual can randomly be selected as a control for more than one case. A participant is eligible to be a control until they develop the disease, at which point they become a case
Selecting Controls in Nested Case-Control Studies
Risk set sampling for controls (controls are selected from the population at risk as cases are diagnosed). No need for the rare disease assumption for OR to estimate RR (because it comes from an existing cohort).
An alternative to the nested case-control study without matching on time. Controls are selected from those at risk at the beginning of the study period, usually a random sample of all members of the cohort. As controls are a random sample for the study base, they can serve as controls for multiple diseases.
Case Cross Over Studies
Developed for situations when a brief exposure causes a transient change in risk of a rare acute onset disease (MI immediately following heavy physical exertion). A variant on matched-pair design; each person serves as their own control. A case's person-time is divided into an index period (case) and a reference period (control). The brief period of increased risk following the transient exposure is called the hazard period. The exposure frequency during the hazard period is compared to that during a control period.