INTRODUCTION
Apart from the mean, the median and the mode are the two most commonly used measures of central tendency. The median is sometimes referred to as a measure of location as it tells us where the data are.[1] This article describes the median and the mode, and gives guidelines for selecting the appropriate measure of central tendency.
MEDIAN
The median is the value that occupies the middle position when all the observations are arranged in ascending or descending order. It divides the frequency distribution exactly into two halves. Fifty percent of the observations in a distribution have scores at or below the median. Hence the median is the 50th percentile.[2] The median is also known as the ‘positional average’.[3]
It is easy to calculate the median. If the number of observations (n) is odd, the ((n + 1)/2)th observation in the ordered set is the median. When the number of observations is even, the median is the mean of the (n/2)th and (n/2 + 1)th observations.[2]
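The positional rule above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `median` and the sample data are assumptions made for the example.

```python
def median(values):
    """Return the median of a non-empty sequence using the positional rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1)/2-th observation (1-based) sits at index n // 2 (0-based).
        return ordered[mid]
    # Even n: mean of the n/2-th and (n/2 + 1)-th observations (1-based).
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical sample data, chosen only for illustration.
odd_result = median([7, 3, 9, 1, 5])        # ordered: 1 3 5 7 9      -> 5
even_result = median([7, 3, 9, 1, 5, 11])   # ordered: 1 3 5 7 9 11   -> 6.0
print(odd_result, even_result)
```

Sorting first is what makes the rule positional: the result depends only on the rank of the middle observation(s), not on the size of the values at either end.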
Advantages
- It is easy to compute and comprehend.
- It is not distorted by outliers/skewed data.[4]
- It can be determined for ratio, interval, and ordinal scale.
Disadvantages
- It does not take into account the precise value of each observation and hence does not use all information available in the data.
- Unlike the mean, the median is not amenable to further mathematical calculation and hence is not used in many statistical tests.
- If we pool the observations of two groups, the median of the pooled group cannot be expressed in terms of the medians of the individual groups.
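Two of the points listed above, robustness to outliers and the pooling limitation, can be checked numerically. The sketch below uses Python's standard `statistics` module; all data values are made up for illustration.

```python
from statistics import mean, median

# Robustness: replacing one value with an extreme outlier moves the mean,
# but leaves the median where it was.
scores = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]
print(mean(scores), median(scores))              # 4 4
print(mean(with_outlier), median(with_outlier))  # 14.8 4

# Pooling: both pairs of groups below have medians 2 and 5 and equal sizes,
# yet the pooled medians differ, so the group medians alone cannot
# determine the pooled median.
group_a, group_b = [1, 2, 3], [4, 5, 6]
group_c, group_d = [2, 2, 2], [3, 5, 5]
pooled_ab = median(group_a + group_b)   # 3.5
pooled_cd = median(group_c + group_d)   # 2.5
print(pooled_ab, pooled_cd)
```

By contrast, the pooled mean can always be recovered from the group means and group sizes, which is one reason the mean is easier to carry through further calculation.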
MODE
The mode is defined as the value that occurs most frequently in the data. Some data sets do not have a mode because each value occurs only once. On the other hand, some data sets can have more than one mode. This happens when the data set has two or more values of equal frequency which is greater than that of any other value. The mode is rarely used as a summary statistic except to describe a bimodal distribution. In a bimodal distribution, the taller peak is called the major mode and the shorter one is the minor mode.
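The three cases described above (no mode, one mode, several modes) can be illustrated with a short Python sketch. The function name `modes` is an assumption, and returning an empty list when every value occurs once follows the convention used in this article; other conventions exist (for example, Python's `statistics.multimode` returns every value in that case).

```python
from collections import Counter


def modes(values):
    """Return all modes in sorted order, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs only once: no mode under this article's convention.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical data sets for illustration.
print(modes([1, 2, 2, 3]))      # [2]      one mode
print(modes([1, 2, 2, 3, 3]))   # [2, 3]   two values tie for the highest frequency
print(modes([1, 2, 3]))         # []       no mode
```

Because the function only counts occurrences, it also works for nominal data such as `["A", "B", "B", "O"]`, where a mean or median would be meaningless.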
Advantages
- It is the only measure of central tendency that can be used for data measured in a nominal scale.[5]
- It can be calculated easily.
Disadvantages
- It is not used in statistical analysis because it is not algebraically defined, and the frequencies of individual values fluctuate more when the sample size is small.
POSITION OF MEASURES OF CENTRAL TENDENCY
The relative position of the three measures of central tendency (mean, median, and mode) depends on the shape of the distribution. All three measures are identical in a normal distribution [Figure 1a]. As the mean is always pulled toward the extreme observations, it is shifted toward the tail in a skewed distribution [Figure 1b and c]. The mode is the most frequently occurring score and hence lies in the hump of the skewed distribution. The median lies between the mean and the mode in a skewed distribution.[6,7]
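The ordering described above can be seen on a small example. The sketch below uses Python's standard `statistics` module on two made-up data sets, one with a long right tail and its mirror image with a long left tail. The ordering holds for these data; it is a typical pattern for skewed data rather than a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Mirror image of the same data, giving a left-skewed set.
left_skewed = [20 - x for x in right_skewed]


def summarize(values):
    """Collect the three measures of central tendency for one data set."""
    return {"mode": mode(values), "median": median(values), "mean": mean(values)}


right = summarize(right_skewed)
left = summarize(left_skewed)

print(right)  # mode 2 < median 3.0 < mean 5: the mean is pulled toward the right tail
print(left)   # mean 15 < median 17.0 < mode 18: the mean is pulled toward the left tail
```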
Figure 1
The relative position of the various measures of central tendency. (a) Normal distribution (b) Positively (right) skewed distribution (c) Negatively (left) skewed distribution.
- Arithmetic mean (or simply, mean) – the sum of all measurements divided by the number of observations in the data set.
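- Mode – the most frequent value in the data set. It is the only measure of central tendency that can be used with nominal data, which have purely qualitative category assignments.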
- Geometric mean – the nth root of the product of the data values, where there are n of these. This measure is valid only for data that are measured absolutely on a strictly positive scale.
- Harmonic mean – the reciprocal of the arithmetic mean of the reciprocals of the data values. This measure too is valid only for data that are measured absolutely on a strictly positive scale.
- Weighted mean – an arithmetic mean in which some data elements are given more weight than others.
- Truncated mean – the arithmetic mean of data values after a certain number or proportion of the highest and lowest data values have been discarded.
- Interquartile mean (a type of truncated mean)
- Midrange – the arithmetic mean of the maximum and minimum values of a data set.
- Midhinge – the arithmetic mean of the first and third quartiles.
- Trimean – the weighted arithmetic mean of the median and two quartiles.
- Winsorized mean – an arithmetic mean in which extreme values are replaced by values closer to the median.
- Geometric median – the point that minimizes the sum of distances to the data points. This is the same as the median when applied to one-dimensional data, but it is not the same as taking the median of each dimension independently. It is not invariant to different rescaling of the different dimensions.
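
Several of the measures listed above can be computed with Python's standard `statistics` module plus a few small helpers. The sketch below is illustrative only: the helper names, the sample data, and the parameter `k` (the number of values trimmed or replaced at each end) are assumptions. Quartile conventions also differ between textbooks and software, so the midhinge and trimean shown here depend on the default method of `statistics.quantiles`.

```python
from statistics import geometric_mean, harmonic_mean, mean, median, quantiles


def weighted_mean(values, weights):
    """Arithmetic mean in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def midrange(values):
    """Arithmetic mean of the minimum and maximum values."""
    return (min(values) + max(values)) / 2


def truncated_mean(values, k):
    """Mean after discarding the k lowest and k highest values."""
    ordered = sorted(values)
    if 2 * k >= len(ordered):
        raise ValueError("k is too large for this data set")
    return mean(ordered[k:len(ordered) - k])


def winsorized_mean(values, k):
    """Mean after replacing the k lowest and k highest values with their nearest kept neighbours."""
    ordered = sorted(values)
    if 2 * k >= len(ordered):
        raise ValueError("k is too large for this data set")
    low, high = ordered[k], ordered[-k - 1]
    return mean(min(max(v, low), high) for v in ordered)


def midhinge_and_trimean(values):
    """Return (midhinge, trimean) using the default quartile method of statistics.quantiles."""
    q1, _, q3 = quantiles(values, n=4)
    return (q1 + q3) / 2, (q1 + 2 * median(values) + q3) / 4


# Hypothetical strictly positive data, so geometric and harmonic means are valid.
data = [1, 2, 4, 8, 16]

print(mean(data))                # 6.2
print(geometric_mean(data))      # approximately 4
print(harmonic_mean(data))       # approximately 2.58
print(midrange(data))            # 8.5
print(truncated_mean(data, 1))   # mean of [2, 4, 8]
print(winsorized_mean(data, 1))  # mean of [2, 2, 4, 8, 8] = 4.8
print(midhinge_and_trimean(data))
print(weighted_mean([70, 80, 90], [1, 1, 2]))  # 82.5
```

For this data set the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean, the usual ordering for positive values that are not all equal.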