Reliability 101 in five minutes

If you have heard the word 'reliability' in a meeting and nodded along, this post is for you. Five minutes from here, you will have a working definition you can use, a sense of what counts as a good number, and the three questions to ask the next time someone shows you a reliability statistic.

Reliability is consistency, not accuracy

Reliability asks one question: are the items in this scale measuring the same thing consistently? It does not ask whether the scale is measuring the right thing. (That second question is called validity, and it is a separate post.)

An analogy that holds up well: imagine a bathroom scale that always reads exactly 5 pounds heavier than your true weight. Step on it ten times and you get the same number every time. The scale is reliable (consistent), even though it is wrong (inaccurate). Reliability is the consistency property. Validity is the accuracy property. A survey scale needs both, and reliability comes first because a scale that does not measure consistently cannot measure anything accurately.

The number you will see most often: Cronbach's α

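Cronbach's alpha (written α) is the reliability statistic survey tools report by default. In practice it falls between 0 and 1, and higher means more consistent. (A negative value is possible and usually signals a problem in the data, such as a reverse-scored item that was not recoded.) The convention most fields use: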

0.90 and above: excellent. The scale is very tightly internally consistent, sometimes too tightly: a value this high can mean some items are redundant.
0.80 to 0.89: strong. This is the comfortable range for a well-designed scale.
0.70 to 0.79: acceptable for most uses.
Below 0.70: this is where teams should start asking whether the scale is doing its job. The items are not consistently measuring the same thing: either they need revision or the scale is measuring more than one construct.
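
For readers who want to see where the number comes from, here is a minimal sketch of the standard formula: α = (k / (k - 1)) × (1 - sum of the item variances / variance of the total score), where k is the number of items. The function names and the response data are made up for illustration, and this is not a description of how any particular survey tool computes it.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def band(alpha):
    """Map alpha onto the conventional labels listed above."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.80:
        return "strong"
    if alpha >= 0.70:
        return "acceptable"
    return "below 0.70: needs review"


# Hypothetical responses: 6 respondents, 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f} ({band(alpha)})")
```

Six respondents is far too few for a real estimate; the data is only there to make the arithmetic visible.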

Three questions to ask when you see a reliability number

1. What was the sample size?

Reliability is a property of scores in a sample, not a fact about the instrument in the abstract. With 25 respondents, a single α is a fuzzy estimate; with 200 respondents, it is much sharper. Always read α and n in the same breath.
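
To make "fuzzy" concrete, the sketch below puts a rough interval around α at each sample size. It uses a percentile bootstrap, which is one common approach and an assumption of this example rather than anything prescribed above. The data is simulated (one shared trait plus random noise per item), and all function names are hypothetical.

```python
import random


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def cronbach_alpha(rows):
    k = len(rows[0])
    item_vars = [sample_variance([r[j] for r in rows]) for j in range(k)]
    total_var = sample_variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def simulate(n, k, rng):
    """Hypothetical data: one shared trait plus independent item noise."""
    rows = []
    for _ in range(n):
        trait = rng.gauss(0, 1)
        rows.append([trait + rng.gauss(0, 1) for _ in range(k)])
    return rows


def bootstrap_interval(rows, rng, resamples=500, level=0.95):
    """Percentile interval: resample respondents with replacement,
    recompute alpha each time, and read off the middle `level` share."""
    n = len(rows)
    estimates = sorted(
        cronbach_alpha([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(resamples)
    )
    lo_idx = int((1 - level) / 2 * resamples)
    hi_idx = int((1 + level) / 2 * resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


rng = random.Random(42)
for n in (25, 200):
    rows = simulate(n, 5, rng)
    low, high = bootstrap_interval(rows, rng)
    print(f"n={n:3d}  alpha={cronbach_alpha(rows):.2f}  "
          f"95% interval: {low:.2f} to {high:.2f}")
```

The exact figures depend on the random seed, but the interval at n = 25 should come out noticeably wider than the one at n = 200.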

2. Are any items dragging the scale down?

An item-total correlation shows how closely each individual item tracks the rest of the scale. An item with a correlation below 0.30 is weak; below 0.20, it is probably not measuring the same construct as the others. ReliCheck flags weak items automatically.
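
As an illustration of the idea, here is a small sketch that correlates each item with the sum of the other items (the "corrected" form, which keeps an item from being correlated with itself). Whether ReliCheck uses the corrected form is not stated here, so treat that as an assumption; the function names and the toy data are hypothetical too.

```python
from math import sqrt


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column cannot correlate with anything
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(rows):
    """Correlate each item with the sum of all the other items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]
        result.append(pearson(item, rest))
    return result


def flag(r):
    """Apply the 0.30 and 0.20 cut-offs from the text above."""
    if r < 0.20:
        return "probably a different construct"
    if r < 0.30:
        return "weak"
    return "ok"


# Toy data: items 1 and 2 move together, item 3 does not follow them.
rows = [
    [1, 1, 2],
    [2, 2, 1],
    [3, 3, 2],
    [4, 4, 1],
]
for number, r in enumerate(item_rest_correlations(rows), start=1):
    print(f"item {number}: r = {r:+.2f}  ({flag(r)})")
```

On this toy data the third item is the one that gets flagged, since it does not rise and fall with the other two.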

3. Is the scale measuring one thing or two?

α can stay respectable when a scale is quietly measuring two related things at once (a "wellbeing" scale that mixes physical and emotional items, for example). The check is to read α alongside the inter-item correlation matrix and the Kaiser-Meyer-Olkin (KMO) statistic: if the items correlate in two distinct clusters, the scale is probably measuring two things.
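
The sketch below simulates that situation so the pattern is easy to see. It assumes six items, three driven by a "physical" trait and three by an "emotional" trait, with the two traits moderately correlated; every number and name in it is made up for illustration. It prints the inter-item correlation matrix and α, and leaves KMO out because computing it needs matrix inversion that would make the example much longer.

```python
import random
from math import sqrt


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def columns(rows):
    return [list(col) for col in zip(*rows)]


def correlation_matrix(rows):
    cols = columns(rows)
    return [[pearson(a, b) for b in cols] for a in cols]


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def cronbach_alpha(rows):
    k = len(rows[0])
    item_vars = sum(sample_variance(c) for c in columns(rows))
    total_var = sample_variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_vars / total_var)


def simulate_two_cluster(n, rng, rho=0.4, noise=0.6):
    """Hypothetical data: items 0-2 follow a 'physical' trait, items 3-5
    an 'emotional' trait, and the two traits correlate at rho."""
    rows = []
    for _ in range(n):
        physical = rng.gauss(0, 1)
        emotional = rho * physical + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        rows.append(
            [physical + rng.gauss(0, noise) for _ in range(3)]
            + [emotional + rng.gauss(0, noise) for _ in range(3)]
        )
    return rows


def mean_r(matrix, pairs):
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


WITHIN = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
BETWEEN = [(i, j) for i in range(3) for j in range(3, 6)]

rows = simulate_two_cluster(300, random.Random(7))
matrix = correlation_matrix(rows)
for line in matrix:
    print("  ".join(f"{r:5.2f}" for r in line))
print(f"mean r within clusters:  {mean_r(matrix, WITHIN):.2f}")
print(f"mean r between clusters: {mean_r(matrix, BETWEEN):.2f}")
print(f"alpha for all six items: {cronbach_alpha(rows):.2f}")
```

The thing to look for in the printed matrix is two blocks of high correlations (items 1 to 3 with each other, items 4 to 6 with each other) and visibly lower values between the blocks, even though α for the combined scale still looks healthy.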

That is the working knowledge

A reliability number tells you whether the items in a scale are pulling in the same direction. A high number is necessary but not sufficient for trustworthy survey results. A low number is a real warning that the scale needs work before any conclusion can rest on it.

For more depth: the long-form companion covers what α does not tell you, and the reliability guide walks through every reliability and validity statistic ReliCheck computes.

Related posts

What a reliability number actually tells you

How to validate a new survey scale