The problem with being ‘Average’
The word “average” is an everyday term that we all use almost without thought. In general terms, an average is a value that is meant to be typical of a set of measurements, usually thought to be roughly in the middle. But what does it actually mean?
For example, what does this statement mean: “The average age at the start of menopause is 52 years”? If you ask a statistician, they are likely to respond in their usual cryptic way – with a question: “What type of average is it?” You may well respond with your own question: “How many types are there?” which of course sets the statistician off and you will be lucky to get an answer to your first question. You may eventually find out that there are three main types of averages: the mean, median and mode.
Mean: Imagine a sample of women who are all at the start of menopause. If you placed equal weights along a beam, one for each woman, at positions corresponding to their ages, the mean is the point at which the beam would balance. However, if one of the women is much younger or older than the others (an outlier in statistical terms), the point of balance will move towards her age. Therefore, the mean may not give a reliable indication of what is typical of a set of values in the presence of outliers.
Median: Going back to the beam illustration, the median is the value that divides the group of women into two equal halves: 50% of the women in the group will have an age less than or equal to the median age. Because the median depends only on the order of the values, and not on how far the extreme ones lie from the rest, a single outlier barely moves it. Therefore, in the presence of outliers, the median gives a more reliable indication of what is typical of a set of values than the mean does.
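The effect of a single outlier on the mean and the median can be checked with a few lines of Python. This is only a sketch: the ages below are made up for illustration and do not come from any study.

```python
from statistics import mean, median

# Hypothetical ages at the start of menopause (illustrative values only).
ages = [48, 50, 51, 52, 53, 54, 55]

# The same sample with one much younger woman added as an outlier.
ages_with_outlier = ages + [30]

# How far each type of average moves when the outlier is added.
mean_shift = mean(ages) - mean(ages_with_outlier)
median_shift = median(ages) - median(ages_with_outlier)

print(f"mean:   {mean(ages):.2f} -> {mean(ages_with_outlier):.2f}")
print(f"median: {median(ages):.2f} -> {median(ages_with_outlier):.2f}")
print(f"shift in mean = {mean_shift:.2f}, shift in median = {median_shift:.2f}")
```

With these made-up values the mean drops by nearly three years, while the median moves by only half a year.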
Mode: The mode is the most frequently occurring value and, at face value, appears to be the most intuitive of the three types of averages. It is notoriously unstable, however. Statements such as “Most women start menopause at the age of 50 years” refer to the mode and seem quite sensible. Imagine a group of ten women who start menopause at the following ages: 40, 45, 45, 47, 50, 53, 56, 58, 60, 61. The mode in this case is 45 – would you consider this to be typical of this group of women? Suppose you later find an error for the sixth patient (recorded as 53) and she is actually 50 years of age. Now there are two possible values for the mode: 45 and 50. Which do you present?
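The instability of the mode in the ten-woman example can be reproduced directly. The sketch below uses the ages given above and Python's standard `statistics` module; the variable names are ours.

```python
from statistics import mean, median, multimode

# Ages as first recorded, and after correcting the sixth value from 53 to 50.
recorded = [40, 45, 45, 47, 50, 53, 56, 58, 60, 61]
corrected = [40, 45, 45, 47, 50, 50, 56, 58, 60, 61]

# multimode returns every value that ties for most frequent.
modes_recorded = multimode(recorded)    # [45]
modes_corrected = multimode(corrected)  # [45, 50]

for label, ages in (("recorded", recorded), ("corrected", corrected)):
    print(label, "mean:", mean(ages), "median:", median(ages),
          "mode(s):", multimode(ages))
```

Correcting one value leaves the mean and median close to where they were, but turns a single mode into two.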
The moral of the story: When presenting or interpreting results that include an average, think about what the most suitable type of average is for the values of interest. In practice, the mean is what is presented in most cases. However, where there are outlying values that are not typical of the rest, almost all patients may be above or below this ‘average’. In such cases, the median is preferable.
Measures of effect: The trouble with Odds
The odds ratio is a difficult concept to comprehend. Many researchers do not differentiate it from its counterpart, the risk ratio, and end up using it inappropriately. For example, statements such as the one that follows are commonly encountered in reference to an odds ratio of x: “… intervention A was associated with an ‘x-fold’ or ‘x-times’ increase in the risk of …”. This statement would be correct if x were the risk ratio, but it is incorrect when x is the odds ratio, as is the case here. Because the odds ratio lies further from 1 than the corresponding risk ratio, the treatment effect is inflated, which may help the article get more attention, but is misleading.
Table 1 illustrates the differences in the computation of odds and risk ratios. Table 2 provides examples emphasising the differences between these ratios. Odds and risk ratios can sometimes be nearly equal, such as when the event rate is small, but even then, it is not helpful to present the wrong result in support of a supposedly ‘correct’ statement.
Table 1: Illustration of calculation steps taken to derive odds and risk ratios
Table 2: Illustration of calculated odds and risk ratios for comparison
(Tables 1 and 2 have been copied, with permission, from a free online illustrative tutorial available at www.MedStats.org/Tutorials.htm.)
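As a rough sketch of the two calculations, the Python below computes both ratios from the counts of a two-group comparison. The function names and the counts are hypothetical and are not taken from Table 1 or Table 2.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Risk in group A divided by risk in group B (risk = events / total)."""
    return (events_a / total_a) / (events_b / total_b)


def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds in group A divided by odds in group B (odds = events / non-events)."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Hypothetical common event: 60/100 in group A versus 30/100 in group B.
common_rr = risk_ratio(60, 100, 30, 100)
common_or = odds_ratio(60, 100, 30, 100)

# Hypothetical rare event: 2/100 in group A versus 1/100 in group B.
rare_rr = risk_ratio(2, 100, 1, 100)
rare_or = odds_ratio(2, 100, 1, 100)

print(f"common event: risk ratio = {common_rr:.2f}, odds ratio = {common_or:.2f}")
print(f"rare event:   risk ratio = {rare_rr:.2f}, odds ratio = {rare_or:.2f}")
```

In both made-up scenarios the risk doubles. The odds ratio is 3.5 when the event is common and about 2.02 when it is rare, so describing the first as a “3.5-fold increase in risk” would overstate the effect.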
Our advice: Whenever possible, calculate and use the risk ratio rather than the odds ratio. The target audience of these reports is medical doctors and patients, who understand and use the term “risk” every day, and a risk ratio translates correctly into terms such as “twice” or “two-fold”. Computation of the risk ratio also easily extends to the provision of other useful measures such as risk reduction and the number needed to treat (NNT).
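The measures mentioned above follow from the same two risks. The sketch below assumes the usual definitions (absolute risk reduction = control risk minus treated risk; NNT = 1 / absolute risk reduction, rounded up) and uses hypothetical counts; `effect_measures` is an illustrative name, not an established function.

```python
from fractions import Fraction
from math import ceil


def effect_measures(events_treated, n_treated, events_control, n_control):
    """Risk-based effect measures; assumes the treatment lowers the risk."""
    # Fractions keep the arithmetic exact, so rounding up the NNT is safe.
    risk_treated = Fraction(events_treated, n_treated)
    risk_control = Fraction(events_control, n_control)
    arr = risk_control - risk_treated  # absolute risk reduction
    return {
        "risk_ratio": risk_treated / risk_control,
        "absolute_risk_reduction": arr,
        "relative_risk_reduction": arr / risk_control,
        "nnt": ceil(1 / arr),  # undefined if the two risks are equal
    }


# Hypothetical trial: 20/100 events on treatment versus 30/100 on control.
measures = effect_measures(20, 100, 30, 100)
for name, value in measures.items():
    print(f"{name}: {float(value):.3f}")
```

For these made-up counts the risk ratio is 2/3, the absolute risk reduction is 10 percentage points and the NNT is 10.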
A word of caution: There is a time and place for the odds ratio in medical research. For example, it is used in case-control studies where a risk ratio cannot be calculated directly, and is also used in meta-analysis and in cohort studies or trials that use regression methods for analysis. “Odds” is also a real favourite for gamblers. We, however, wonder if even they can correctly explain “odds ratio”.
The birth-weight paradox
The strong association between low birth-weight (LBW) and infant mortality is widely accepted and appears to be simple and direct. Mortality among newborn babies is very much higher among babies with the lowest birth weight, even when considering full-term babies only. Yet, this relationship appears to fall apart in the face of paradoxical findings: small babies from high-risk groups (e.g. mothers who smoke, twins) have been found to have lower mortality than small babies in corresponding low-risk groups (e.g. non-smoking mothers, singletons). This phenomenon is called the “birth-weight paradox” and has led to several controversies over the years.
The fallacy: What is happening here? This is a typical case where an association is found and causality is assumed without considering the effect of other unmeasured underlying factors (confounders). For example, some have suggested that the effect of maternal smoking is modified by birth weight in such a way that smoking is beneficial for LBW babies. Is it?
The moral of the story: In observational studies, associations should not be taken to prove causality. Confounding variables should always be adjusted for in order to correctly interpret relationships between variables. Did anyone bother to ask: Why were the babies born to so-called healthy mothers so much lighter in the first place? Is there something else we are missing (a confounder not accounted for) which might explain higher mortality?
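One way to see how such a paradox could arise is to work through expected counts under a set of assumptions. The sketch below is purely illustrative: every probability is invented, the unmeasured factor is hypothetical (for example, a serious condition that causes both low birth weight and high mortality), and mortality is assumed to depend only on that factor and on smoking, not on birth weight itself.

```python
def cohort(n, p_factor, p_lbw_factor, p_lbw_no_factor,
           p_death_factor, p_death_no_factor):
    """Expected mortality, overall and among LBW babies (hypothetical inputs)."""
    n_factor = n * p_factor
    n_no_factor = n * (1 - p_factor)
    # Expected numbers of LBW babies with and without the unmeasured factor.
    lbw_factor = n_factor * p_lbw_factor
    lbw_no_factor = n_no_factor * p_lbw_no_factor
    deaths_all = n_factor * p_death_factor + n_no_factor * p_death_no_factor
    deaths_lbw = lbw_factor * p_death_factor + lbw_no_factor * p_death_no_factor
    return {
        "overall": deaths_all / n,
        "lbw": deaths_lbw / (lbw_factor + lbw_no_factor),
    }


# Smoking makes LBW more likely (20% vs 5%) and is assumed slightly harmful
# (1.5% vs 1.0% mortality); the unmeasured factor is equally common (2%) in
# both groups and carries 80% LBW and 30% mortality.
smokers = cohort(10_000, 0.02, 0.8, 0.20, 0.30, 0.015)
non_smokers = cohort(10_000, 0.02, 0.8, 0.05, 0.30, 0.010)

print(f"overall mortality: smokers {smokers['overall']:.4f}, "
      f"non-smokers {non_smokers['overall']:.4f}")
print(f"LBW mortality:     smokers {smokers['lbw']:.4f}, "
      f"non-smokers {non_smokers['lbw']:.4f}")
```

Under these assumptions smoking is harmful overall, yet LBW babies of non-smokers show the higher mortality, simply because a larger share of them are small on account of the unmeasured factor. The numbers demonstrate the mechanism only; they say nothing about real birth-weight data.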
The popular “case-mix” excuse
You attend a departmental meeting at one of the large UK institutions. Results from a regional benchmarking study are being presented to a large audience of enthusiastic local practitioners who are known to be very proud of their unit. The results are not so complimentary, and many in the audience are becoming visibly anxious through the presentation. In order to make it all “clear to everyone” and avoid undue concern, a senior member of staff interrupts the presentation and makes a statement that goes as follows: “People shouldn’t read too much into these figures. The reason our figures for outcomes A to D are not as good as those from hospital X and the rest of the region is because our unit looks after high risk cases. We are comparing apples with pears here”. This is met with nods of agreement, and satisfaction all round. What a timely intervention, it seems. You have a question, but the meeting is over-running. The coffee break is next.
Some questions you could ask:
- What are these high risk patients you look after actually at risk of?
- What proportion of all patients are actually “high risk”?
- Who is actually suffering these adverse outcomes: is it the “high risk” cases or the “low risk” ones?
Do not be surprised to get the following answers: “Well, not sure. Don’t really know. We have never really looked at it, to be honest. We just know we aren’t bad. In fact, we are the best in the region”.
The moral of the story: Yes, case-mix may well explain some of the differences observed, but have you checked? Having a larger proportion of high risk cases should not be blindly taken as a sufficient explanation for poor overall results. You may well be excellent with ‘high risk’ cases and really poor with those at ‘low risk’. Do not hide behind the “case mix” excuse. Check and verify the results.
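Checking the case-mix explanation amounts to comparing outcomes within each risk group, and then comparing the units on a common case mix. The sketch below does both with hypothetical counts; the unit names, the two risk strata and the reference mix (the pooled mix of the two units) are all assumptions made for illustration.

```python
# (adverse events, patients) by risk stratum; all counts are hypothetical.
units = {
    "our unit": {"high": (90, 600), "low": (40, 400)},
    "hospital X": {"high": (40, 200), "low": (24, 800)},
}


def crude_rate(strata):
    """Overall event rate, ignoring case mix."""
    events = sum(e for e, _ in strata.values())
    patients = sum(n for _, n in strata.values())
    return events / patients


def stratum_rates(strata):
    """Event rate within each risk stratum."""
    return {risk: e / n for risk, (e, n) in strata.items()}


def standardised_rate(strata, reference_mix):
    """Rate the unit would have if it treated the reference case mix."""
    rates = stratum_rates(strata)
    return sum(reference_mix[risk] * rates[risk] for risk in reference_mix)


# Pooled case mix of both units: 800 high risk and 1200 low risk patients.
reference_mix = {"high": 0.4, "low": 0.6}

for name, strata in units.items():
    print(name, "crude:", round(crude_rate(strata), 3),
          "by stratum:", stratum_rates(strata),
          "standardised:", round(standardised_rate(strata, reference_mix), 3))
```

In this made-up example our unit does treat more high risk patients and does better with them than hospital X (15% versus 20%), but it does far worse with low risk patients (10% versus 3%) and remains worse after standardisation (12% versus 9.8%).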
Finally: Statisticians and post mortems
Statisticians are often confronted by researchers proudly brandishing reams of data which they feel are good enough to coerce into a research publication. A typical question is: “I have collected this data, what analysis can I perform on it? I would like to publish the results.” Unfortunately, such statements are symptomatic of what so many ‘research studies’ die of: no clear research question, lack of study design and no analysis plan. All these elements are essential for the success of any research project and confronting a statistician without any of them is simply asking for a post-mortem to be performed on your study.
Our advice, as is that of many statisticians, is: talk to a statistician early on in the project, ideally at the planning stage. It might save a few tears. They will help you define your research question (a study research question must be specific and precise), design the study (the number of subjects in your study is an important determinant of the success of your study) and plan your analysis (you need to know what analysis you are going to do in order to collect the appropriate data).
References
- Huff D, Geis I. How to Lie with Statistics. [S.l.]: Norton; 1954.
- Hernandez-Diaz S, Schisterman EF, Hernan MA. The birth weight “paradox” uncovered? American Journal of Epidemiology. 2006 Dec 1;164(11):1115-20.
- Davies HT, Crombie IK, Tavakoli M. When can odds ratios mislead? BMJ (Clinical research ed.). 1998 Mar 28;316(7136):989-91.
- Holcomb WL Jr, Chaiworapongsa T, Luke DA, Burgdorf KD. An odd measure of risk: use and misuse of the odds ratio. Obstetrics and Gynecology. 2001 Oct;98(4):685-8.
- The MedStats Club: www.medstats.org.
- Moses L, Louis TA. Statistical consulting in clinical research: the two-way street. Statistics in Medicine. 1984 Jan-Mar;3(1):1-5.