Quartiles and percentiles

  • EDEXCEL A Level

Topic summary

A median is a 'half-way' point in our data. We can use quartiles and percentiles to find any fraction (or percentage) of the way into our data.

Quartiles

Quartiles divide a data set into four equal parts. The three quartiles are:

  • Lower Quartile (Q1): The value below which 25% of the data falls. This is also known as the 25th percentile.
  • Median (Q2): The middle value of the data set, also known as the 50th percentile.
  • Upper Quartile (Q3): The value below which 75% of the data falls, also known as the 75th percentile.

To find the quartiles for a small discrete data set, use these formulas for the positions of the quartiles:

\[Q_1 = \frac{n}{4}\]

\[Q_2 = \frac{n+1}{2}\]

\[Q_3 = \frac{3n}{4}\]

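Where \(n\) is the total number of data points, listed in ascending order. For \(Q_1\) and \(Q_3\), if the position is not an integer, round it up and take the data value at that position. If it is an integer, add 0.5: take the midpoint of the value at that position and the next value.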
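As a minimal sketch of the rule above, the following Python function finds \(Q_1\) or \(Q_3\) for a small discrete data set. The function name `quartile` and the sample data are illustrative assumptions, not part of the original notes.

```python
def quartile(data, which):
    """Return Q1 (which=1) or Q3 (which=3) of a small discrete data set.

    Position = which * n / 4. If it is a whole number, take the midpoint
    of that value and the next one; otherwise round the position up.
    """
    if which not in (1, 3):
        raise ValueError("which must be 1 (for Q1) or 3 (for Q3)")
    values = sorted(data)
    n = len(values)
    whole, remainder = divmod(which * n, 4)
    if remainder == 0:
        # Whole-number position: average the whole-th and (whole+1)-th values.
        # Positions are counted from 1, list indices from 0.
        return (values[whole - 1] + values[whole]) / 2
    # Fractional position: round up to position whole + 1, i.e. index whole.
    return values[whole]


# Hypothetical data, n = 8: positions 2 and 6 are whole numbers.
even_sample = [2, 4, 4, 5, 7, 8, 9, 10]
print(quartile(even_sample, 1), quartile(even_sample, 3))  # 4.0 8.5

# Hypothetical data, n = 9: positions 2.25 and 6.75 round up to 3 and 7.
odd_sample = [1, 3, 4, 6, 7, 8, 10, 12, 15]
print(quartile(odd_sample, 1), quartile(odd_sample, 3))  # 4 10
```

Integer arithmetic (`divmod`) is used so that the test for a whole-number position does not depend on floating-point rounding.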

To find the quartiles for a large grouped data set, use these formulas for the positions of the quartiles:

\[Q_1 = \frac{n}{4}\]

\[Q_2 = \frac{n}{2}\]

\[Q_3 = \frac{3n}{4}\]

For grouped data we do not round these positions. We use the exact values in our further calculations.

Percentiles

Percentiles divide a data set into 100 equal parts. The k-th percentile is the value below which \(k\%\) of the data falls.

To find a percentile for a large grouped data set, use this formula for the position of the k-th percentile:

\[P_k = \frac{kn}{100}\]

Where:

  • \(P_k\) is the position of the k-th percentile.
  • \(k\) is the desired percentile (e.g., 20 for the 20th percentile).
  • \(n\) is the number of data points.
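
For instance, assuming a hypothetical grouped data set with \(n = 120\) data points (a figure chosen purely for illustration), the positions of the 35th and 90th percentiles would be:

\[P_{35} = \frac{35 \times 120}{100} = 42\]

\[P_{90} = \frac{90 \times 120}{100} = 108\]

Setting \(k = 25\), \(50\) and \(75\) in the same formula gives \(\frac{n}{4}\), \(\frac{n}{2}\) and \(\frac{3n}{4}\), the grouped-data quartile positions, which here are 30, 60 and 90.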

Interpolation

When the data is grouped and we have the position of the quartile or percentile, we use linear interpolation to estimate the exact value. This method assumes the data is evenly distributed within the class interval.

For interpolation, the formula is:

\[\text{Estimated Value} = L + \left( \frac{P - F}{f} \right) \times h\]

Where:

  • \(L\) is the lower boundary of the class interval containing the quartile or percentile.
  • \(P\) is the position of the quartile or percentile you are calculating.
  • \(F\) is the cumulative frequency before the class.
  • \(f\) is the frequency of the class interval containing the quartile or percentile.
  • \(h\) is the width of the class interval.
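
The formula can be sketched as a short Python function. The function name, parameter names and the numbers in the usage line are illustrative assumptions. The parameters map one-to-one onto \(L\), \(P\), \(F\), \(f\) and \(h\) above.

```python
def interpolate(lower_boundary, position, cum_freq_before, class_freq, class_width):
    """Estimate a quartile or percentile by linear interpolation.

    Implements L + ((P - F) / f) * h, assuming the data is spread
    evenly across the class interval.
    """
    if class_freq <= 0:
        raise ValueError("the class containing the position must have a positive frequency")
    if not cum_freq_before <= position <= cum_freq_before + class_freq:
        raise ValueError("the position does not fall inside this class")
    return lower_boundary + (position - cum_freq_before) / class_freq * class_width


# Hypothetical class: L = 20, h = 10, F = 18, f = 14, and position P = 25.
print(interpolate(20, 25, 18, 14, 10))  # 25.0
```

The second check guards against a common slip: if the position is larger than \(F + f\), the wrong class has been chosen and the estimate would land outside the class interval.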

Example of Interpolation

Suppose you want to calculate the 30th percentile of a data set with \(n = 100\), and the 30th percentile falls within a class interval with:

  • Lower boundary \(L = 10\)
  • Class width \(h = 5\)
  • Cumulative frequency before the class \(F = 20\)
  • Frequency of the class \(f = 16\)
  • The position \(P = \frac{30 \times 100}{100} = 30\)

The class covers positions 20 to 36, so it contains position 30. The estimated value would be:

\[
\text{Estimated Value} = 10 + \left( \frac{30 - 20}{16} \right) \times 5 = 10 + \left( \frac{10}{16} \right) \times 5 = 10 + 3.125 = 13.125
\]
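
In practice the class containing the position has to be found from the frequency table first. The sketch below, under the assumption that the table is given as a list of (lower boundary, upper boundary, frequency) tuples in ascending order, accumulates the frequencies, locates the class and then interpolates. The function name `grouped_percentile` and the table are hypothetical.

```python
def grouped_percentile(classes, k):
    """Estimate the k-th percentile of grouped data by linear interpolation.

    classes: list of (lower_boundary, upper_boundary, frequency) tuples,
    in ascending order with no gaps between class boundaries.
    """
    if not 0 <= k <= 100:
        raise ValueError("k must be between 0 and 100")
    n = sum(freq for _, _, freq in classes)
    position = k * n / 100  # exact position, not rounded
    cum_freq_before = 0
    for lower, upper, freq in classes:
        if freq > 0 and position <= cum_freq_before + freq:
            # L + ((P - F) / f) * h for the class that contains the position
            return lower + (position - cum_freq_before) / freq * (upper - lower)
        cum_freq_before += freq
    raise ValueError("the table contains no data")


# Hypothetical frequency table with n = 50.
table = [(0, 10, 5), (10, 20, 15), (20, 30, 20), (30, 40, 10)]
print(grouped_percentile(table, 25))  # 15.0   (Q1, position 12.5)
print(grouped_percentile(table, 50))  # 22.5   (Q2, position 25)
print(grouped_percentile(table, 75))  # 28.75  (Q3, position 37.5)
```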
