Epidemiology Glossary – Flashcards
question
Attributable risk
answer
Attributable risk = Incidence in the exposed - Incidence in the unexposed. A measure of exposure effect that indicates, on an absolute scale, how much greater the frequency of disease in the exposed group is compared with the unexposed, assuming the relationship between exposure and disease is causal (an important assumption). It is the difference between the incidence rate in the exposed and non exposed groups, i.e. it represents the risk attributable to the exposure of interest. The attributable risk is especially useful in evaluating the impact of introduction or removal of risk factors. Its value indicates the number of cases of the disease among the exposed group that could be prevented if the exposure were completely eliminated.
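The formula above can be sketched in a few lines of Python. The function name and the cohort counts are hypothetical, chosen only to illustrate the subtraction.

```python
def attributable_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Incidence in the exposed minus incidence in the unexposed."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    return incidence_exposed - incidence_unexposed

# Hypothetical cohort: 30 of 1,000 exposed and 10 of 1,000 unexposed develop the disease.
ar = attributable_risk(30, 1000, 10, 1000)
print(f"Attributable risk: {ar * 1000:.0f} cases per 1,000 exposed")
```

If the association is causal, this difference is the number of cases per 1,000 exposed people that removing the exposure could prevent.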
question
Bradford Hill Criteria
answer
Mnemonic: ACCESS Drunk Robots Trash Paris
Analogy - an effect has already been shown for analogous exposures and outcomes
Consistency - the relationship is observed repeatedly in different persons, places, circumstances, and times
Coherence - a causal conclusion should not fundamentally contradict generally known facts of the natural history and biology of the disease
Experimental evidence - causation is more likely if the evidence is based on an appropriate study design
Strength - a strong association is more likely to have a causal component than a modest association
Specificity - a single putative cause produces a specific effect
Dose-response relationship - an increasing amount of exposure increases the risk
Reversibility - the condition can be altered (prevented or ameliorated) by an appropriate experimental regimen
Temporality - the factor must precede the outcome it is assumed to affect
Plausibility - causation is biologically plausible
question
Case
answer
an individual with the outcome under study (in a case-control study). Epidemiological research is based on the ability to quantify the occurrence of disease in populations. This requires a clear definition of what is meant by a case. This could be a person who has the disease, health disorder, or suffers the event of interest (by "event" we mean a change in health status, e.g. death in studies of mortality or becoming pregnant in fertility studies). The epidemiological definition of a case is not necessarily the same as the clinical definition.
question
Case-control study
answer
study in which individuals are selected on the basis of whether or not they have the outcome of interest; usually some relatively rare outcome. Exposure (risk factor) status is explored to establish whether the exposure is more common in the case (those that have the outcome) or control (those that do not have the outcome) group. This type of study always results in an odds ratio, for example comparing the odds of being exposed (e.g. a smoker) in those who had the outcome (e.g. pancreatic cancer), with the odds of being a smoker in those who did not have pancreatic cancer.
question
Cause
answer
the key question in most medical research. Did exposure to electromagnetic radiation cause the leukaemia in children living near mobile phone masts? Did HRT cause the higher DVT rates in women taking it? Research works by trying to disprove alternative explanations (e.g. chance, bias, confounding). If this can be done, then the relationship between the exposure and the outcome is more likely to be one of causation.
question
Count
answer
the most basic measure of disease frequency is a simple count of affected individuals. The number (count) of cases that occurred in a particular population is of little use in comparing populations and groups. For instance, knowing that there were 100 cases of lung cancer in city A and 50 in city B does not tell us that lung cancer is more frequent in city A than B. There may simply be more people in city A. The number of cases may, however, be useful in planning services. For instance, if you wanted to set up an incontinence clinic, you would want to know the number of people with incontinence in your population.
question
Chi squared test
answer
a statistical procedure for testing whether two proportions are similar (e.g. whether the proportion of lung cancer cases in males who smoke is significantly different to the proportion of lung cancer cases in males who do not smoke).
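As a minimal sketch, the test for a 2x2 table can be computed with the standard library alone. It assumes Pearson's statistic without continuity correction and 1 degree of freedom; the function name and the counts are hypothetical.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: 30 of 100 male smokers and 10 of 100 male non-smokers have lung cancer.
chi2, p = chi_squared_2x2(30, 70, 10, 90)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

For small expected counts an exact test would usually be preferred over this approximation.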
question
Cohort study
answer
study in which individuals are selected on the basis of exposure status and are followed over a period of time to allow the frequency of occurrence of the outcome of interest in the exposed and non-exposed groups to be compared. Take a group of people, note whether they've been exposed or not, observe them over time and wait for them to get ill, to die etc. This type of study typically produces a relative risk.
question
(95%) Confidence interval
answer
an estimated range of values, calculated from a given set of sample data, which is likely to contain the 'true' population value. E.g. a range of values around a relative risk measure which would, in 95% of such studies, contain the 'true' risk (the true risk being the relative risk that would be obtained if the study had included the entire population of patients). By "contain (or 'span') the true value", we mean that the true value lies above the lower value of the confidence interval but below the upper value of the confidence interval. For example, for a 95% confidence interval of 1.2 - 3.4, we can say that we are 95% confident that the true value of risk is not lower than 1.2 and not higher than 3.4. If we find that our confidence interval for the relative risk or odds ratio for group A compared with group B does not include 1, then we typically reject the null hypothesis of no difference. However, if our study is not on rates of disease or on proportions of patients exposed but is on a measure such as blood pressure or weight, we would typically reject the null hypothesis if the confidence interval for the average difference in blood pressure or weight between group A and group B does not include 0, not 1. Why is this? See entry for null hypothesis.
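One common way to obtain such an interval for a relative risk is the large-sample log method, sketched below. The method choice, the function name and the cohort counts are assumptions for illustration; the card itself does not say how the interval is calculated.

```python
import math

def relative_risk_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Relative risk with an approximate 95% confidence interval (log method)."""
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    # Standard error of ln(RR) for cohort data.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical cohort: 30 of 1,000 exposed and 10 of 1,000 unexposed develop the disease.
rr, lower, upper = relative_risk_ci(30, 1000, 10, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("Interval excludes 1:", not (lower <= 1 <= upper))
```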
question
Confounding
answer
a possible explanation for the study finding if confounding variables have not been taken into account in the study.
question
Confounding variable
answer
a factor that is associated with both the exposure and outcome of interest. Common confounders include age, smoking, socio-economic deprivation. Smoking is a confounder because smoking tends to be more prevalent in people exposed to non-tobacco-related toxins and carcinogens, and also more prevalent in people with a range of diseases.
question
Control (as opposed to a case)
answer
a person without the outcome under study (in a case-control study), or a person not receiving the intervention (in a clinical trial). The choice of an appropriate group of controls requires care, as we need to be able to draw useful comparisons between these controls and the cases/intervention group.
question
Exposure
answer
when people have been 'exposed', they have been in contact with something that is hypothesised to have an effect on health e.g. tobacco, nuclear radiation, pesticides in food, HRT. Contact may be via any route: oral, inhalation, through the skin etc. These are typically called 'risk factors' of disease. We are interested in whether the exposure results in higher (or sometimes lower) outcome rates.
question
Forest plot
answer
Graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. It was developed for use in medical research as a means of graphically representing a meta-analysis of the results of randomized controlled trials.
question
Funnel plot
answer
useful graph designed to check the existence of publication bias in systematic reviews and meta-analyses.
question
Galbraith plot
answer
one way of displaying several estimates of the same quantity that have different standard errors. It can be used to examine heterogeneity in a meta-analysis
question
Hierarchy of studies
answer
1. Systematic reviews and meta-analyses
2. Randomised controlled trials
3. Cohort studies
4. Case-control studies
5. Descriptive/cross-sectional studies
6. Case reports
question
Incidence
answer
Incidence = Number of new cases of disease in a given time period / Number of disease-free persons at the beginning of that time period. The number of new cases of the outcome of interest occurring in a defined population over a defined period of time. Note that this is not the same as prevalence, which includes new and old cases. Incidence measures events (a change from a healthy state to a diseased state). This measure of incidence can be interpreted as the probability, or risk, that an individual will develop the disease during a specific time period.
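A minimal sketch of the calculation, with a hypothetical function name and made-up counts:

```python
def incidence(new_cases, disease_free_at_start):
    """New cases in the period divided by persons disease-free at its start."""
    return new_cases / disease_free_at_start

# Hypothetical: 25 new cases in one year among 5,000 people who were disease-free at the start.
risk = incidence(25, 5000)
print(f"One-year incidence: {risk:.3%}")
```

People who already have the disease at the start are excluded from the denominator, because they cannot become new cases.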
question
Matching
answer
A method for "controlling for" (i.e. effectively removing) the effect of confounding at the design stage of a case-control study; controls are selected to have a similar distribution of potentially confounding variables to the cases, e.g. they are said to be "matched" for sex if there are similar proportions of men and women in both groups.
question
Normal distribution
answer
A set of values and frequencies that describe many things in nature, at least approximately, e.g. height, weight, blood pressure. This symmetrical distribution (see Figure 1) is the basis of many statistical tests because, if you know the average value (usually called the mean) and the standard deviation, then you can draw every point of a normal distribution and you know what proportion of values are greater than (or less than) any given point, e.g. the % of men more than two metres tall. Some things are not normally distributed (e.g. proportions of anything, serum concentrations of electrolytes) but can be made to fit quite well after some simple mathematical trickery.
question
Null hypothesis
answer
Formulating a null hypothesis is the first stage in performing any statistical test. Typically, when two groups (A and B) are being compared, the null hypothesis that the statistical test tries to disprove is that there is no difference between the two groups in the measure being tested. If we are comparing rates, then the null hypothesis would be that rate A equals rate B, which means that the relative risk (rate A divided by rate B) equals 1. For case-control studies, the null hypothesis would be that the odds of exposure for group A equal the odds of exposure for group B, i.e. the odds ratio (odds of exposure for A divided by the odds of exposure for B) equals 1. A statistical test is then performed on the relative risk or the odds ratio and a confidence interval for it is derived. We can reject the null hypothesis if the confidence interval does not include the value expected under the null. In this case, the null has RR=1 or OR=1, so we would reject it if the confidence interval does not include 1. However, for normally distributed variables such as blood pressure (BP) in Question 4, the null hypothesis would be that the average BP for group A equals the average BP for group B, i.e. the difference between the two average BPs equals 0. The statistical test would then be performed on this difference in average BPs and the resulting confidence interval would also relate to the difference in average BPs. We therefore would reject the null hypothesis if the confidence interval did not include 0, which is the value expected under the null. If, when faced with a confidence interval around some measure and wondering whether to reject the null hypothesis or not, you can't remember whether it should include 1 or 0, always think in terms of what value the null hypothesis expects your measure to have and then see if that value falls within the range of values covered by the confidence interval.
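The rule of thumb at the end of this answer can be written as a one-line check. The helper name and the intervals below are hypothetical examples.

```python
def rejects_null(ci_lower, ci_upper, null_value):
    """True if the confidence interval does not include the value expected under the null."""
    return not (ci_lower <= null_value <= ci_upper)

# Ratio measures (relative risk, odds ratio): the null expects 1.
print(rejects_null(1.2, 3.4, null_value=1))   # interval excludes 1

# Differences in means (e.g. blood pressure): the null expects 0.
print(rejects_null(-2.0, 5.0, null_value=0))  # interval includes 0
```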
question
Odds
answer
the odds is another way to express probability, e.g. the odds of exposure is the number of people who have been exposed divided by the number of people who have not been exposed. The mathematical relationship between odds and probability is: Odds = probability / (1 - probability)
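The conversion in both directions, as a short sketch (function names are illustrative):

```python
def probability_to_odds(p):
    """Odds = probability / (1 - probability)."""
    return p / (1 - p)

def odds_to_probability(odds):
    """Inverse of the relationship above: probability = odds / (1 + odds)."""
    return odds / (1 + odds)

# Hypothetical: 20 of 100 people are exposed, so the probability of exposure is 0.2.
odds = probability_to_odds(0.2)
print(f"Odds of exposure: {odds:.2f}")  # 20 exposed to 80 unexposed
print(f"Back to probability: {odds_to_probability(odds):.2f}")
```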
question
Odds ratio
answer
Odds ratio = odds of exposure in the diseased group (cases) / odds of exposure in the disease-free group (controls). The relative risk can be calculated from cohort studies, since the incidence of disease in the exposed and non-exposed is known. In case-control studies, however, the subjects are selected on the basis of their disease status (a sample of subjects with a particular disease (cases) and a sample of subjects without that disease (controls)), not on the basis of exposure. Therefore, it is not possible to calculate the incidence of disease in the exposed and non-exposed individuals. It is, however, possible to calculate the odds of exposure. The odds ratio (of exposure) is the ratio between two odds, e.g. the odds of exposure in the cases divided by the odds of exposure in the controls. This ratio is the measure reported in case-control studies instead of the relative risk. It can be mathematically shown that the odds ratio of exposure is generally a good estimate of the relative risk when the outcome is rare. An odds ratio of 1 tells us that exposure is no more likely in the cases than controls (which implies that exposure has no effect on case/control status); an odds ratio greater than 1 tells us that exposure is more likely in the case group (which implies that exposure might increase the risk of the disease). An odds ratio less than 1 tells us that exposure is less likely in the case group (which implies that exposure might have a protective effect).
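A short sketch of the calculation from a case-control 2x2 table. The function name and the counts are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure in the cases divided by odds of exposure in the controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Hypothetical study: 60 of 100 cases and 30 of 100 controls were smokers.
or_smoking = odds_ratio(60, 40, 30, 70)
print(f"Odds ratio: {or_smoking:.2f}")
```

A value above 1, as here, means exposure is more common among the cases than among the controls.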
question
Outcome
answer
the event or main quantity of interest in a particular study, e.g. death, contracting a disease, blood pressure
question
Population attributable risk (also known as the population excess risk)
answer
a measure of the risk of outcome in the study population which is attributable to the exposure of interest.
question
Population excess fraction (also known as the population attributable fraction)
answer
a measure of the proportion (fraction) of the cases observed in the study population attributable to the exposure of interest.
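The card gives no formula, so the sketch below assumes the usual textbook definitions: population attributable risk = incidence in the whole population minus incidence in the unexposed, and population attributable fraction = that difference divided by incidence in the whole population. The function name and counts are hypothetical.

```python
def population_attributable_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Return (population attributable risk, population attributable fraction)."""
    incidence_total = (cases_exposed + cases_unexposed) / (n_exposed + n_unexposed)
    incidence_unexposed = cases_unexposed / n_unexposed
    par = incidence_total - incidence_unexposed  # absolute excess risk in the population
    paf = par / incidence_total                  # share of all cases attributable to exposure
    return par, paf

# Hypothetical population: 60 cases among 2,000 exposed, 80 cases among 8,000 unexposed.
par, paf = population_attributable_measures(60, 2000, 80, 8000)
print(f"PAR = {par * 1000:.1f} per 1,000; PAF = {paf:.1%}")
```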
question
Prevalence
answer
Prevalence = Number of cases in a defined population at one point in time / Number of persons in a defined population at the same point in time. The number of cases of an outcome of interest in a defined population at a particular point of time, hence it is often called point prevalence. This includes both new (also called "incident") cases and existing cases.
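A minimal sketch of point prevalence, with a hypothetical function name and made-up survey counts:

```python
def point_prevalence(existing_cases, population_size):
    """All cases (new and existing) at one point in time divided by the population at that time."""
    return existing_cases / population_size

# Hypothetical survey: 300 people with the condition in a population of 20,000 on census day.
prevalence = point_prevalence(300, 20000)
print(f"Point prevalence: {prevalence:.1%}")
```

Unlike incidence, the denominator here includes people who already have the condition.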
question
P-value
answer
the probability of obtaining the study result (relative risk, odds ratio etc), or one more extreme, if the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis, and the easier it is for us to reject it and accept that the result was not just due to chance. A p-value greater than 0.05 is usually seen as providing insufficient evidence against the null hypothesis, so we do not reject the null.
question
Randomisation
answer
a method for ensuring that both groups in a clinical trial (i.e. those receiving the intervention and those not receiving the intervention (controls)), have similar proportions of confounding variables, such as age.
question
Rate and risk
answer
these words are often taken to mean the same thing (though to some epidemiological purists they are not always the same). We talk of someone's risk/chance/probability of getting a disease (or getting pregnant or dying etc.) and a population having a disease rate. Both terms imply a proportion, i.e. the number of people with the outcome of interest divided by the total number of people at risk of the outcome.
question
Regression
answer
a method for controlling the effect of confounding at the analysis stage of a study - statistical modelling is used to control for one or many confounding variables.
question
Relative risk
answer
Relative risk = incidence in the exposed group / incidence in the unexposed group. The relative risk is used as a measure of association between an exposure and disease. For example, the proportion of people with high cholesterol who developed ischaemic heart disease divided by the proportion of people with normal cholesterol who developed ischaemic heart disease. A value of 1.0 indicates that the incidence of disease in the exposed and the unexposed are identical and thus the data shows no association between the exposure and the disease. A value greater than 1.0 indicates a positive association or an increased risk among those exposed to a factor. Similarly, a relative risk less than 1.0 means there is an inverse association or a decreased risk among those exposed, i.e. the exposure is protective.
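The cholesterol example can be sketched as follows. The function name and the counts are hypothetical and are not taken from any study.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Incidence in the exposed divided by incidence in the unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Hypothetical cohort: ischaemic heart disease develops in 45 of 500 people with
# high cholesterol and in 30 of 1,000 people with normal cholesterol.
rr = relative_risk(45, 500, 30, 1000)
print(f"Relative risk: {rr:.1f}")
```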
question
Restriction
answer
a method for controlling the effect of confounding at the design stage of a study, e.g. by including patients in a clinical trial only between the ages of 18 and 65 without pre-existing illness so that the results of the trial are not confused ('confounded') by different levels of age or morbidity in the two treatment groups.
question
Sample
answer
a relatively small number of observations (or patients) from which we try to describe the whole population from which the sample has been taken. Typically, we calculate the mean for the sample and use the confidence interval to describe the range within which we think the population mean lies. This is one of the absolutely key concepts behind all medical research (and much non-medical research too).
question
Standardisation
answer
a method for controlling the effect of confounding at the analysis stage of a study. Used to produce a Standardised Mortality Ratio, a commonly used measure in epidemiology.
question
Standardized mortality ratio
answer
Ratio of observed deaths to expected deaths, where expected deaths are calculated for a typical area with the same age and gender mix by looking at the death rates for different ages and genders in the larger population. Represents the ratio of the number of observed deaths (or cases of disease) (O) in a particular population to the number that would be expected (E), if that population had the same mortality or morbidity experience as a standard population, corrected for differences in age and sex structure.
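A sketch of the observed/expected calculation, assuming two age bands for brevity. The function name, band labels, rates and counts are all hypothetical.

```python
def standardised_mortality_ratio(observed_deaths, local_population, standard_rates):
    """Observed deaths divided by the deaths expected at the standard population's rates."""
    expected_deaths = sum(
        standard_rates[group] * local_population[group] for group in local_population
    )
    return observed_deaths / expected_deaths, expected_deaths

# Hypothetical area: people per age band, and death rates per person in the standard population.
local_population = {"under 65": 8000, "65 and over": 2000}
standard_rates = {"under 65": 0.002, "65 and over": 0.03}

smr, expected = standardised_mortality_ratio(95, local_population, standard_rates)
print(f"Expected deaths: {expected:.0f}, SMR: {smr:.2f}")
```

An SMR above 1 means more deaths were observed than the area's age structure alone would predict. It is often reported multiplied by 100.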
question
Statistical test
answer
the only way to decide whether the results of your analysis, e.g. your measure for group A compared with your measure for group B, are likely to be due to chance or could be real. The procedure for doing a statistical test is to take one value representing the observed difference in your study between groups A and B and compare that value against tables of an appropriate mathematical distribution such as the normal distribution to see how extreme it is (we use computers instead of printed tables, thankfully, these days). For example, to see if someone is unusually tall, we would need to compare their height with a normal distribution with the mean and standard deviation taken from members of the population of the same age and sex. This would be done by subtracting the population mean from the person's height and dividing by the population standard deviation and looking up the result (called the "test statistic") in a table of the standard normal distribution (so-called because it has a mean of 0 and standard deviation of 1) to find out what proportion of values are greater than this. This proportion is therefore the proportion of the population who are taller than the person. Something similar is routinely done on infants to monitor their growth.
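The height example can be sketched with the standard library, using the complementary error function in place of a printed table. The function name, population mean and standard deviation are hypothetical.

```python
import math

def proportion_taller(height, population_mean, population_sd):
    """Test statistic (z) and the proportion of a normal population above the given height."""
    z = (height - population_mean) / population_sd
    # Upper-tail area of the standard normal distribution.
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))
    return z, upper_tail

# Hypothetical: a man 190 cm tall, population mean 176 cm, standard deviation 7 cm.
z, taller = proportion_taller(190, 176, 7)
print(f"z = {z:.1f}; about {taller:.1%} of the population are taller")
```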
question
Stratification
answer
a method for controlling the effect of confounding at the analysis stage of a study - risks are calculated separately for each category of confounding variable, e.g. each age group and each sex separately.
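A small sketch of the idea, using invented counts in which age confounds the association. The function names and all numbers are hypothetical.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

def stratum_relative_risks(strata):
    """Relative risk calculated separately within each category of the confounder."""
    return {
        name: relative_risk(*groups["exposed"], *groups["unexposed"])
        for name, groups in strata.items()
    }

# Hypothetical data as (cases, people). Exposure is more common in the older stratum.
strata = {
    "under 50": {"exposed": (2, 200), "unexposed": (8, 800)},
    "50 and over": {"exposed": (40, 800), "unexposed": (10, 200)},
}

crude = relative_risk(42, 1000, 18, 1000)  # all ages combined
by_age = stratum_relative_risks(strata)
print(f"Crude RR: {crude:.2f}")
print("Stratum-specific RRs:", by_age)
```

In this made-up example the crude relative risk is well above 1, but within each age group it is 1, so the crude association is explained by age.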
question
Misclassification bias
answer
Failure to ascertain disease incidence or vital status for any appreciable segment of a study group may lead to erroneous or misleading conclusions. (e.g. diagnostic improvements over time)
question
Attrition bias
answer
People lost to follow-up may be atypical, either because of vital status or of previous exposure, or factors (such as age) associated either with exposure or outcome. (e.g. sick people may not fill in follow-up questionnaires)
question
Healthy volunteer or healthy worker effect:
answer
Refers to the low mortality/low disease incidence that healthy population volunteers or industrial cohorts may experience when compared to the general population.
Selection of healthy persons into employment.
Selection of unhealthy persons out of the cohort / workforce.
question
Nested case-control study
answer
Case-control study done in the population of an ongoing cohort. Thus the case-control study is "nested" within the cohort: cases of a disease that occur in a defined cohort are identified and, for each, a specified number of matched controls are selected from among those in the cohort who have not developed the disease by the time of disease occurrence in the case. Potentially offers reductions in the cost and effort of data collection and analysis compared with the full cohort approach, with relatively minor loss in statistical efficiency. Advantageous for studies of biologic precursors of disease. Compared with case-control studies, nested case-control studies can reduce 'recall bias' and temporal ambiguity, and compared with cohort studies they can reduce cost and save time. The drawback of nested case-control studies is that the non-diseased persons from whom the controls are selected may not be fully representative of the original cohort, due to death or loss to follow-up.
question
Type II or β error
answer
the probability of failing to detect a difference that really exists (false negative). Power = 1 - β
question
Type I or α error
answer
the probability of finding a significant difference when in actuality there is none (i.e. false positive)
question
Blinding
answer
a. Triple blind - doctor, patient and person doing measurements don't know
b. Double blind - doctor and patient don't know
c. Single blind - doctor knows, patient doesn't know
Removes:
a. Doctor's knowledge bias
b. Patient's knowledge bias
c. Withdrawal bias - occurs when subjects who leave the study (drop-outs) differ significantly from those who stay
question
Cross-over design (in clinical trial)
answer
Wash-out period with placebo to get rid of the drugs.
Helpful because people become their own controls, so the standard deviation (e.g. variability in blood pressure) comes down.
Reduces the number of people needed in the study.
Does not require a baseline.
question
Parallel study (in clinical trial)
answer
Groups of people are given specific treatments and outcomes (e.g. death rates) are monitored years afterwards.
Useful for big numbers.
Can be done for things like surgery (which cannot be done with crossover studies).
No concerns about carry-over, which occurs when two experimental conditions are applied to the same sample or participant and the effect of the first condition "carries over" to the second.
question
Critical appraisal checklist
answer
Mnemonic: Queen Dorothy Perkins MAC BEI
Question - Is there a hypothesis? Is the question relevant?
Design - Is it cross-sectional, cohort, case-control, ecological, RCT? Hierarchy of studies. Is it appropriate?
Population - Is the sample size appropriate? Are the results generalizable to other populations?
Methods - Exposure measurements
Analysis - Appropriate statistical tests? Chance?
Confounders - Presence of confounders?
Bias - Measurement/selection?
Ethics - Is the study ethical?
Interpretation - Do the authors interpret the results correctly? Do they make a causal inference?
question
Twin studies
answer
Compares the similarity of MZ and DZ twins for the disease under study.
The MZ/DZ concordance ratio may let us infer the mode of inheritance:
- Strictly genetic trait: MZ = 100%, DZ = 25-50%
- Complex trait: low concordance rate
question
Adoption studies
answer
Distinguish genetic and environmental influences on family resemblance by comparing rates of a disorder in biological family members to those in adoptive family members. If the disease is genetic, biological (genetically related) family members should resemble each other more than do adoptive (environmentally related) family members.
question
Parametric linkage analysis
answer
Loci that are close enough together on the same chromosome segregate together more often than do loci on different chromosomes. Assumes a mode of inheritance.
question
Model-free linkage analysis
answer
Based on similarity of phenotype between relatives. No assumption of a genetic model.
question
Candidate gene
answer
Gene for which there is evidence of a possible role in the trait or disease under study. Gene hypothesized to play a role in disease because of its location in a region of linkage.
question
Genome wide
answer
Surveys most of the genome for causal genetic variants. No assumptions are made about the genomic location of the causal variants.
question
Population-based case-control design
answer
Ascertain two groups of individuals from the population: unrelated affected cases and unrelated healthy controls. Standard statistical tests are used to compare the relative frequencies of alleles (genotypes) at a single marker locus in cases and controls (Pearson's χ² test, logistic regression).
question
Family-based association design
answer
Ascertain small nuclear families and extended pedigrees containing affected and unaffected individuals. Focus on the transmission of marker alleles from parents to offspring. Standard statistical tests are used to compare transmissions of marker alleles to affected and unaffected offspring.
question
Population stratification
answer
Presence of systematic differences in allele frequencies between subpopulations in a population, possibly due to different ancestry. Can lead to artefactual evidence of association due to confounding.