If you are working on a thesis, dissertation or journal article, the chances are that the very first numbers your reader will encounter are descriptive statistics. They sit at the start of your results chapter and tell the examiner what your data actually look like before you run any inferential tests. Get them right, and the rest of your analysis stands on solid ground. Get them wrong, and even the most sophisticated regression model in the world will not save your viva.
This guide is written for PhD and Master’s researchers in the US, UK, Canada, Australia, the Middle East, Africa and Southeast Asia who want a practical, examiner-friendly understanding of descriptive statistics — what they are, the types you should know, how to choose between them and how to report them in APA or Vancouver style.
Quick Answer
Descriptive statistics are numerical and graphical methods that summarise the basic features of a dataset without drawing inferences about a wider population. The four core types are measures of central tendency (mean, median, mode), measures of dispersion (range, variance, standard deviation, interquartile range), measures of distribution shape (skewness, kurtosis) and measures of frequency (counts, percentages, proportions). Choice depends on the level of measurement — nominal, ordinal, interval or ratio — and the distribution of the variable.
Why Descriptive Statistics Matter Before Anything Else
Examiners and journal reviewers consistently flag the same problem: researchers jump straight into p-values without first describing the sample. Descriptive statistics serve four purposes that no inferential test can replicate.
1. They establish the credibility of your sample
A reader needs to know who or what you measured. Reporting the mean age, gender split and education level of your participants tells the examiner whether your sample is plausibly representative of the population you claim to study.
2. They check assumptions for inferential tests
Most parametric tests (t-tests, ANOVA, Pearson correlation, multiple regression) assume approximately normal distributions (strictly, of the residuals or of the scores within each group) and, for group comparisons, roughly equal variances across groups. Skewness, kurtosis and standard deviation values let you decide whether those assumptions hold or whether a non-parametric alternative is safer.
3. They reveal data quality issues
Impossible minima or maxima, suspicious modes, or a mean that diverges sharply from the median often signal data-entry errors, missing-value codes treated as real values, or outliers that need to be investigated rather than ignored.
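The checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a full data-cleaning routine: the `ages` list, the 999 missing-value code and the plausible range of 16 to 100 are all assumptions made up for the example.

```python
from statistics import mean, median

# Hypothetical age variable; 999 stands in for a missing-value code
# that was accidentally left in the data as if it were a real age.
ages = [23, 25, 24, 31, 28, 999, 26, 27]

VALID_MIN, VALID_MAX = 16, 100  # assumed plausible range for this sample


def screen(values, low, high):
    """Return simple checks that tend to expose data-entry problems."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        # Values outside the plausible range need investigating.
        "out_of_range": [v for v in values if not low <= v <= high],
    }


report = screen(ages, VALID_MIN, VALID_MAX)
for name, value in report.items():
    print(f"{name}: {value}")
```

Here the impossible maximum and the wide gap between mean and median both point to the same stray code, which should be recoded as missing before any statistics are reported.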
4. They make your thesis readable
A well-laid-out descriptive table is often the first element of a results chapter that a reader consults. It lets a busy examiner understand your dataset in 60 seconds.
The Four Types of Descriptive Statistics
Descriptive statistics fall into four widely accepted families. Most quantitative theses report at least three of them in the same table.
Measures of central tendency
These describe the “centre” of a distribution and answer the question: what is a typical value?
- Mean — the arithmetic average. Best for symmetric, interval/ratio data.
- Median — the middle value when data are ordered. Robust to outliers and skewed distributions; the default for ordinal data and income-style variables.
- Mode — the most frequently occurring value. The only valid measure of central tendency for nominal data such as gender, blood group or country of origin.
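As a small illustration of the three measures, the sketch below uses Python's built-in `statistics` module. Both datasets are invented for the example: `scores` includes one deliberate outlier, and `blood_groups` is a nominal variable for which only the mode makes sense.

```python
from statistics import mean, median, multimode

# Hypothetical ratio-level scores with one extreme value.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 40]

# Hypothetical nominal variable.
blood_groups = ["O", "A", "O", "B", "A", "O", "AB"]

score_mean = mean(scores)      # pulled upwards by the outlier
score_median = median(scores)  # unaffected by the size of the outlier
group_modes = multimode(blood_groups)  # list of the most frequent categories

print(f"Mean:   {score_mean:.2f}")
print(f"Median: {score_median}")
print(f"Mode:   {group_modes}")
```

`multimode` returns a list because a variable can have more than one most-frequent value; reporting all of them is usually safer than silently picking one.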
Measures of dispersion
These describe how spread out the values are around the centre. A mean without a measure of dispersion is almost meaningless.
- Range — maximum minus minimum. Quick but unstable.
- Variance — the average of squared deviations from the mean (for a sample, the sum of squared deviations divided by n − 1 rather than n). Used in subsequent inferential tests.
- Standard deviation (SD) — the square root of variance, expressed in the original unit of measurement. The default companion to the mean.
- Interquartile range (IQR) — the spread of the middle 50 % of the data. The natural companion to the median.
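The four dispersion measures can be computed with the standard library as in the sketch below. The `data` list is made up for illustration. Two assumptions are worth flagging: `variance` and `stdev` use the sample formulas (dividing by n − 1), and `quantiles` uses Python's default "exclusive" method, so quartiles may differ slightly from those produced by SPSS, R or Excel, which apply their own conventions.

```python
from statistics import pvariance, quantiles, stdev, variance

data = [4, 8, 15, 16, 23, 42]  # hypothetical measurements

data_range = max(data) - min(data)
sample_var = variance(data)        # sum of squared deviations / (n - 1)
population_var = pvariance(data)   # sum of squared deviations / n
sample_sd = stdev(data)            # square root of the sample variance
q1, q2, q3 = quantiles(data, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1

print(f"Range: {data_range}")
print(f"Sample variance: {sample_var:.2f}")
print(f"Population variance: {population_var:.2f}")
print(f"SD: {sample_sd:.2f}")
print(f"IQR: {iqr:.2f} (Q1 = {q1:.2f}, Q3 = {q3:.2f})")
```

Whichever quartile convention your software uses, it is worth naming the software in the methods section so a reader can reproduce the IQR.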
Measures of distribution shape
These describe the symmetry and tail behaviour of the data and tell you whether parametric tests are safe.
- Skewness — the asymmetry of the distribution. Values between −1 and +1 are usually treated as acceptably symmetric.
- Kurtosis — the heaviness of the tails. Excess kurtosis values close to 0 indicate a near-normal shape; some software reports raw kurtosis instead, for which the normal distribution scores 3.
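To make the two shape measures concrete, the sketch below computes them from simple population moments. The function name `moment_shape` and both datasets are hypothetical. Note the assumption: these are the unadjusted moment formulas, whereas SPSS and several other packages apply a small-sample correction, so their output will differ somewhat for small n.

```python
from statistics import fmean


def moment_shape(values):
    """Return (skewness, excess kurtosis) from unadjusted population moments."""
    n = len(values)
    centre = fmean(values)
    m2 = sum((x - centre) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - centre) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # subtract 3 so a normal curve scores 0
    return skewness, excess_kurtosis


symmetric = [1, 2, 3, 4, 5]           # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 20]    # hypothetical data with a long right tail

for label, sample in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    skew, kurt = moment_shape(sample)
    print(f"{label}: skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

The symmetric sample has a skewness of zero, while the long right tail in the second sample pushes skewness well above the +1 rule of thumb.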
Measures of frequency
For categorical and ordinal variables, descriptive statistics are most usefully reported as counts and percentages. A frequency table for “programme of study” or “country of residence” is more informative than any mean.
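A frequency table of this kind takes only a few lines to build. The sketch below is illustrative: the `programmes` list and the `frequency_table` helper are invented for the example, and percentages are rounded to one decimal place.

```python
from collections import Counter

# Hypothetical "programme of study" responses.
programmes = ["MSc", "PhD", "MSc", "MA", "PhD", "MSc", "MSc", "MA"]


def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [
        (category, count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    ]


table = frequency_table(programmes)
for category, count, percentage in table:
    print(f"{category:<6}{count:>4}{percentage:>8.1f}%")
```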
Choosing the Right Descriptive Statistic
The single biggest mistake postgraduate researchers make is reporting a mean and standard deviation for every variable in their dataset, regardless of measurement level. The right choice depends on two things: what kind of variable you have, and how that variable is distributed.
Match the statistic to the level of measurement
- Nominal (gender, ethnicity, programme): mode, frequency, percentage. Never a mean.
- Ordinal (Likert items, rating scales, education level): median, IQR, frequency. Mean and SD are tolerated in some social-science journals but should be justified.
- Interval (temperature in °C or °F, IQ score, calendar year): mean, SD, skewness, kurtosis.
- Ratio (height, income, reaction time, GDP): mean, SD, geometric mean where appropriate, plus dispersion and shape measures.
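One way to keep yourself honest is to encode the list above as a lookup and check each planned statistic against it. The sketch below is only that, a sketch: the `RECOMMENDED` table and the helper names are hypothetical, and the contents simply restate the guidance in this section rather than any formal standard.

```python
# Recommended statistics per level of measurement, restating the list above.
RECOMMENDED = {
    "nominal": ("mode", "frequency", "percentage"),
    "ordinal": ("median", "IQR", "frequency"),
    "interval": ("mean", "SD", "skewness", "kurtosis"),
    "ratio": ("mean", "SD", "geometric mean", "skewness", "kurtosis"),
}


def recommended_statistics(level):
    """Return the statistics suggested for a level of measurement."""
    try:
        return RECOMMENDED[level.lower()]
    except KeyError:
        raise ValueError(f"Unknown level of measurement: {level!r}") from None


def is_appropriate(statistic, level):
    """Check whether a statistic appears in the suggestions for that level."""
    return statistic in recommended_statistics(level)


print(recommended_statistics("ordinal"))
print(is_appropriate("mean", "nominal"))
```

A check like this would flag a "mean gender" before it reaches a results table.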
Match the statistic to the distribution
For symmetric, roughly normal data, report mean and SD. For skewed data — income, time-to-event, citation counts — report the median and IQR (or median with 25th and 75th percentiles). When in doubt, run a normality check: a Shapiro–Wilk test on samples up to about 50, visual inspection of histograms and Q–Q plots on larger samples, plus the skewness and kurtosis values themselves.
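The decision rule can be sketched as follows. This is a deliberate simplification: it uses only the skewness rule of thumb (absolute value up to 1) and does not replace a Shapiro–Wilk test or a look at the histogram and Q–Q plot. The `summarise` helper, the threshold default and both datasets are assumptions made for illustration, and skewness is computed from unadjusted moments.

```python
from statistics import fmean, median, quantiles, stdev


def skewness(values):
    """Unadjusted moment-based skewness."""
    n = len(values)
    centre = fmean(values)
    m2 = sum((x - centre) ** 2 for x in values) / n
    m3 = sum((x - centre) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def summarise(values, threshold=1.0):
    """Pick mean/SD for roughly symmetric data, median/IQR otherwise."""
    if abs(skewness(values)) <= threshold:
        return {"centre": "mean", "value": fmean(values),
                "spread": "SD", "spread_value": stdev(values)}
    q1, _, q3 = quantiles(values, n=4)
    return {"centre": "median", "value": median(values),
            "spread": "IQR", "spread_value": q3 - q1}


test_scores = [1, 2, 3, 4, 5]                # hypothetical symmetric data
incomes = [18, 20, 22, 25, 27, 30, 150]      # hypothetical incomes (thousands)

print(summarise(test_scores))
print(summarise(incomes))
```

In the income example a single very high earner drags the mean far above the typical value, so the median and IQR give the more honest summary.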
Decide on summary versus full description
For a primary outcome variable, give the full descriptive picture: n, mean, SD, median, IQR, minimum, maximum, skewness, kurtosis. For demographic covariates, a tighter summary — n and percentage for categorical, mean and SD for continuous — is enough.
How to Report Descriptive Statistics in a Thesis or Journal
Reporting style matters as much as the statistics themselves. Examiners and editors look for consistency, the correct number of decimal places, and a clear table that can be read independently of the surrounding text.
APA 7 style (most social-science theses)
- Italicise statistical symbols: M, SD, Mdn, n, p.
- Round to two decimal places for most descriptive values; use one decimal for percentages above 10.
- Inline form: “Participants rated their satisfaction on a 5-point scale (M = 4.32, SD = 0.87).”
- Always state the sample size: n = 127.
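If you generate results text programmatically, a small formatter helps keep decimals consistent. The sketch below is an illustration with a hypothetical helper name and invented ratings; it produces plain text only, so the italics for M, SD and n still have to be applied in your word processor.

```python
from statistics import fmean, stdev


def apa_mean_sd(values, decimals=2):
    """Format a sample as 'M = x.xx, SD = x.xx, n = N' with fixed decimals."""
    return (
        f"M = {fmean(values):.{decimals}f}, "
        f"SD = {stdev(values):.{decimals}f}, "
        f"n = {len(values)}"
    )


ratings = [4, 5, 3, 4, 5, 4]  # hypothetical 5-point satisfaction ratings
summary = apa_mean_sd(ratings)
print(summary)  # M = 4.17, SD = 0.75, n = 6
```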
Vancouver / IEEE / journal-specific styles
Most STEM journals expect descriptive results in a single comprehensive Table 1, with means and standard deviations in the format “4.32 ± 0.87” or “4.32 (0.87)”. Always check the target journal’s author guidelines and a recent published article in the same journal before finalising your tables.
Build a clean Table 1
Whatever style you follow, your descriptive table should include: variable name, n (or n and missing), central-tendency measure, dispersion measure, and minimum/maximum where useful. Group rows logically — demographics first, then study variables — and align the decimal points. Add a footnote that defines every abbreviation used.
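A plain-text draft of such a table can be generated as below, which is often a useful intermediate step before formatting it in your word processor. Everything here is hypothetical: the variable names, the values, the use of `None` for missing responses and the column widths. The fixed-width number formatting is what keeps the decimal points aligned.

```python
from statistics import fmean, stdev

# Hypothetical study variables; None marks a missing response.
variables = {
    "Age (years)": [23, 25, 31, None, 28, 26],
    "Satisfaction (1-5)": [4, 5, 3, 4, None, None],
}


def table_one(data, decimals=2):
    """Build a fixed-width descriptive table with n, missing, M, SD, min, max."""
    lines = [f"{'Variable':<20}{'n':>4}{'Missing':>9}{'M':>8}{'SD':>8}{'Min':>6}{'Max':>6}"]
    for name, values in data.items():
        observed = [v for v in values if v is not None]
        lines.append(
            f"{name:<20}{len(observed):>4}{len(values) - len(observed):>9}"
            f"{fmean(observed):>8.{decimals}f}{stdev(observed):>8.{decimals}f}"
            f"{min(observed):>6}{max(observed):>6}"
        )
    return "\n".join(lines)


table_text = table_one(variables)
print(table_text)
print("Note. M = mean; SD = standard deviation.")
```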
Descriptive vs Inferential Statistics: Where the Line Sits
One of the most common questions we receive from students preparing their methodology chapter is where descriptive statistics end and inferential statistics begin. The boundary is conceptually simple but procedurally important.
Descriptive statistics describe only the data you have actually collected. They make no claims beyond your sample. Inferential statistics — confidence intervals, hypothesis tests, regression coefficients, effect sizes with sampling distributions — use your sample to make probabilistic statements about a wider population you have not measured.
Both belong in a thesis. Your results chapter should typically open with a descriptive section that establishes the sample and the variables, then move into an inferential section that addresses each research question or hypothesis in turn. If you would like a deeper walk-through of how to build that structure, see our guide to writing a literature review and our 10 tips for better academic writing for the surrounding chapters.
Common Mistakes to Avoid
The same handful of errors appears in roughly four out of five of the descriptive sections we review.
- Reporting a mean for a nominal variable. “Mean gender = 1.45” is meaningless. Report the frequency and percentage instead.
- Forgetting the sample size. Every descriptive table must show n. Without it, the reader cannot assess precision.
- Inconsistent decimal places. Choose two decimals for descriptive values and apply that choice consistently across every table and the running text.
- Ignoring missing data. Report how many cases are missing per variable. “Listwise deletion” is not a substitute for transparency.
- Treating Likert items as interval without justification. If you report means for Likert data, justify the assumption explicitly with a citation; otherwise, use the median and IQR.
- Skipping distribution checks. Skewness and kurtosis values cost nothing to report and protect you from inferential-test mistakes later.
- Copying SPSS output verbatim. SPSS output is a working file, not a thesis table. Reformat it into your style guide’s expected structure.
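For the missing-data point in particular, a short report makes the cost of listwise deletion visible. The sketch below uses invented records and a hypothetical `missing_report` helper, with `None` standing in for a missing value.

```python
# Hypothetical participant records; None marks a missing value.
records = [
    {"age": 23, "income": 31000, "satisfaction": 4},
    {"age": 25, "income": None, "satisfaction": 5},
    {"age": None, "income": 28000, "satisfaction": 3},
    {"age": 31, "income": 45000, "satisfaction": None},
    {"age": 28, "income": 39000, "satisfaction": 4},
]


def missing_report(rows):
    """Return missing counts per variable and the number of complete cases."""
    names = list(rows[0].keys())
    per_variable = {
        name: sum(1 for row in rows if row[name] is None) for name in names
    }
    complete_cases = sum(
        1 for row in rows if all(row[name] is not None for name in names)
    )
    return per_variable, complete_cases


per_variable, complete_cases = missing_report(records)
print(per_variable)
print(f"Complete cases: {complete_cases} of {len(records)}")
```

Each variable is missing only one value, yet listwise deletion would keep just two of the five participants, which is why per-variable missingness belongs in the table.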
If your dataset is large, multi-variable or contains a mix of measurement levels, the descriptive section can quickly become unmanageable in a single sitting. Many of our clients hand the cleaned data to our team and receive a fully formatted, examiner-ready descriptive chapter built in SPSS, R or Python with every assumption check documented, alongside a Turnitin similarity report for the surrounding write-up.
Putting It All Together
Strong descriptive statistics are the unglamorous backbone of every credible empirical thesis. They tell your examiner who is in your sample, what your variables look like, and whether the inferential analysis that follows is built on safe foundations. Choose measures that match your level of measurement and your distribution, report them in a single well-formatted table with consistent decimals, and remember that descriptive statistics are not a substitute for inferential testing — they are the platform that makes inferential testing trustworthy.
If you find yourself second-guessing every choice between mean and median, every skewness value, every Table 1 layout, you are not alone. Descriptive analysis sounds basic but is one of the most frequently flagged sections in viva voce examinations. Bringing in an experienced second pair of eyes — ideally one with PhD-level training in your discipline — can save weeks of revision and protect the credibility of the inferential work that follows.
For more on the writing surrounding your statistics, see our companion piece on how to write a perfect thesis statement.