Features in ReliCheck

Features that know credibility.

Twenty-nine AI tools work alongside ReliCheck's statistical analysis system to help you build stronger surveys and tests before decisions are made from the results. The platform can review question quality, evaluate survey readiness, explain findings in plain language, and help identify whether results are reliable enough to support reporting, evaluation, or organizational action. Advanced tools for reliability, factor analysis, IRT, trend analysis, equity gaps, and predictive modeling remain available when deeper analysis is needed. You stay in control of the wording, interpretation, and what gets shared.


AI grounded in your data, audited against reliability standards


Four stages, twenty-nine tools

In every step, not bolted onto one.

Generic AI survey tools write questions. ReliCheck's AI writes, audits, collects, narrates, and delivers a verdict across the whole survey, test, or analysis lifecycle. Each tool reads the same underlying statistics, so the plain-language output and the math always agree.

✎

Before you collect

AI proposes constructs from your items, audits the draft against the decision you're trying to make, writes alternates for vague or double-barreled wording, generates draft Likert items from a topic, and recommends validated scales that match your research aim.

Question Reviewer, Construct Mapper, Purpose Checker, Item Generator, Scale Recommender

◉

While you collect

Respondents can fill out the survey by chatting with an AI in plain language instead of clicking radio buttons. The AI maps each freeform reply to the right Likert position, single-choice option, or multi-choice set, and confirms it back in one sentence. Lower abandonment rate, higher accessibility.

Conversational Take Mode

∑

While you analyze

Plain-language paragraphs sit at the top of every analytics surface, and every dashboard adds an HR-friendly verdict card that translates the math into a tiered headline, a body paragraph, and a "what to do" action list. Reliability, descriptives, factor structure, measurement invariance, group comparisons, pre/post change, subgroup gaps, regression predictors, mediation, moderation, item response theory, and multilevel modeling each get both layers. 360 panels add a per-subject narrator that names strengths, development areas, and blind spots in plain language.

Reliability Explainer, Dashboard Narrators, Advanced Methodology Narrators, HR-friendly Verdict Cards, 360 Subject Narrator, Suite Roll-Up Narrator

★

When you report

A four-tier verdict tells you what these results can and cannot support. A data-quality check flags straight-lining, duplicate vectors, and short open-ended answers. A draft-report writer turns the live analytics snapshot into a two-to-three paragraph draft in researcher, HR, or teacher voice. A teacher-friendly narrator turns classroom test analytics into a department-ready summary.

Response Quality Check, "Can I Use These Results?" Advisor, Draft Report Writer, Test Analytics Narrator

Before you collect

Build a focused survey, not a kitchen-sink one.

✎

Question Reviewer

Spots double-barreled, leading, ambiguous, and biased wording as you type. Suggested rewrites preserve scale anchors and sentiment, with the original always one click away.

Lives in the Questions builder

⋈

Construct Mapper

You give ReliCheck a list of items. AI proposes two to six constructs and assigns every item to one, with a one-sentence rationale per group. You review the proposal in a preview modal, edit construct names inline, and apply the assignments only for the items where the construct field was blank.

Lives in the Questions builder toolbar

◎

Survey Purpose Checker

State the decision your survey is meant to inform. AI audits the items against that purpose, tags each one as core, supporting, tangential, or off-topic, names the aspects of the purpose that are underrepresented, and proposes specific items you can add with one click.

Lives at the top of the Questions builder

✦

Item Generator

Type a topic (team psychological safety, classroom engagement, customer trust) and AI returns five, ten, or fifteen draft Likert items, each tagged with a construct and a reverse-score flag. You pick which ones to add. The builder drops the chosen statements straight into the Questions tab as new items, ready to edit.

Lives in the Questions builder toolbar

⚖

Scale Recommender

Describe what you're trying to measure (resilience in first-year teachers, loneliness in graduate students, perceived stress in oncology nurses). AI returns two to four validated scales with the scale name, the construct, the recommended citation, the typical sample size, and a one-sentence rationale. When the recommendation matches ReliCheck's built-in library, one click stands up a starter survey.

Lives in the Questions builder toolbar

↻

Authoring assists chain together

Run the scale recommender first to anchor the survey in a validated instrument. Use the item generator to draft any supplementary items you want next to it. Then run the construct mapper and the purpose checker on the combined draft. The four assists are deliberately small and composable so you can stop at any point and still ship a focused survey.

Recommended order: Scale Recommender to Item Generator to Construct Mapper to Purpose Checker

While you collect

Let respondents answer in their own words.

◉

Conversational Take Mode

Append ?chat=1 to any survey's public link and the take page becomes a chat. The AI asks each question in plain language and lists the options inline. The respondent answers naturally: "agree," "5," "I love this," "the first and the third." The AI maps every reply to the right Likert position, single-choice option, or multi-choice set, and confirms in one sentence. Then the next question. Skip logic, autosave with invitation tokens, and the same submit endpoint as the form mode all work unchanged. The standard form mode keeps working at the bare URL, so chat mode is purely additive and opt-in.

Append ?chat=1 to any public share link
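
As a rough illustration of the opt-in link, the sketch below adds chat=1 to a share URL while keeping any query parameters already on it. The example URLs and the token parameter name are hypothetical placeholders, not ReliCheck's actual link format.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_chat_link(share_url: str) -> str:
    """Return the share URL with chat=1 set, keeping existing query parameters."""
    parts = urlsplit(share_url)
    # Drop any existing chat parameter so the flag is never duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "chat"
    ]
    query.append(("chat", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical share links, for illustration only.
print(to_chat_link("https://example.com/s/abc123"))
print(to_chat_link("https://example.com/s/abc123?token=xyz"))
```

Removing the parameter again returns the respondent to the standard form mode at the bare URL.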

↧

Why it works

Long surveys lose respondents to form fatigue. A chat interface reads faster, feels less like work, and lets people answer with shades of meaning a radio button does not capture ("I love this part but the workload is brutal"). The AI extracts the structured value and surfaces it back so the respondent can correct it if needed. Same data, lower abandonment rate, better accessibility for users who struggle with form layouts.

Bonus: works well for screen readers and assistive tech

🛡

Cost and abuse guardrails

The endpoint that powers the chat is rate-limited to 240 calls per hour per IP so a single noisy respondent cannot run up the AI bill. Each call uses a small token budget (max 400 output tokens) and a question-shape-constrained prompt. Server-side validation clamps Likert values to the survey's scale and rejects out-of-range single-choice indices, so an off-topic reply cannot poison the dataset.

Per-IP rate limit, validated output, no PII to the AI beyond what the respondent types
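
The guardrails above can be sketched in a few lines. This is a minimal in-memory illustration, not ReliCheck's implementation: the class and function names are hypothetical, the sliding-window approach is one of several ways to enforce an hourly limit, and zero-based choice indices are an assumption.

```python
import time
from collections import defaultdict, deque

MAX_CALLS_PER_HOUR = 240  # limit stated in the text
WINDOW_SECONDS = 3600


class HourlyRateLimiter:
    """Sliding-window limiter keyed by client IP (in-memory, single process)."""

    def __init__(self, limit=MAX_CALLS_PER_HOUR, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._calls = defaultdict(deque)

    def allow(self, ip, now=None):
        """Record the call and return True, or return False if the IP is over the limit."""
        now = time.monotonic() if now is None else now
        calls = self._calls[ip]
        # Forget calls that have fallen out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True


def clamp_likert(value, scale_min, scale_max):
    """Clamp a mapped Likert value to the survey's scale."""
    return max(scale_min, min(scale_max, value))


def validate_choice_index(index, option_count):
    """Reject single-choice indices outside the option list (zero-based, by assumption)."""
    if not 0 <= index < option_count:
        raise ValueError(f"choice index {index} is out of range")
    return index
```

Clamping keeps a stray "11" on a five-point item inside the scale, while an impossible option index is rejected outright rather than guessed at.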

While you analyze

Every dashboard speaks plain English.

🛡

Reliability Narrator

The Reliability tab opens with a plain-language paragraph in researcher voice. It interprets the scale's internal consistency, names the weakest items by their actual prompt text, and recommends a next step. Cronbach's alpha, McDonald's omega, item-total correlation, and split-half all get translated into sentences the team can read.

Top of the Reliability analytics tab
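
For readers who want to see the number behind the sentence, here is a minimal sketch of Cronbach's alpha using the standard formula: k / (k - 1) times (1 minus the sum of item variances divided by the variance of total scores). The function name and data layout are illustrative assumptions, not ReliCheck's internals.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    items = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(scores) for scores in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Three respondents, three items that move together perfectly.
alpha = cronbach_alpha([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(round(alpha, 2))
```

Real scales rarely move in lockstep like this toy data, which is why the narrator also names the weakest items instead of reporting alpha alone.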

Dashboard Narrators

A plain-language paragraph at the top of every analytics tab. Description explains response patterns and ceiling effects. Validity reads the factor structure. Open-Ended describes engagement without touching respondent text. Compare interprets group differences. Pre/Post explains learning gains. Subgroups names the biggest gaps. Predictors describes what drives the outcome.

Top of Description, Validity, Open-Ended, Compare, Pre/Post, Subgroups, and Predictors tabs

⊕

Advanced Methodology Narrators

The methodologies academic reviewers and accreditation reports expect, written in plain language. Confirmatory factor analysis explains CFI, TLI, RMSEA, and SRMR. Measurement invariance reports whether the survey works the same way across groups. Item response theory narrates discrimination, item information, and reliability at trait level (Graded Response, 2PL, 3PL, and two-dimensional MIRT each get their own interpretation, including which items cross-load on two traits). Mediation reports the indirect effect with bootstrap confidence intervals, including BCa intervals, and handles binary-outcome cases on the log-odds scale. Moderation interprets the Johnson-Neyman region. Multilevel models translate intraclass correlation and variance components into a plain-English answer about whether the grouping matters, with separate treatment for the two-level linear, two-level logistic GLMM, and three-level linear modes.

CFA, Measurement Invariance, IRT, Mediation, Moderation, and Multilevel Model cards

✓

HR-friendly Verdict Cards

Every analytics dashboard now has a plain-language verdict card above the math card. Reliability reports "Strong / Adequate / Modest / Weak" with what the alpha and omega numbers mean for the work. Measurement invariance answers "Can I compare these groups?" with a Yes / Compare with care / Not on equal footing / Not yet ready tier and an action list per tier. Predictors translates the top coefficients into "A one-standard-deviation increase in Manager support is associated with about a 0.4-standard-deviation increase in Engagement (standardized beta = 0.41)." Mediation, moderation, group comparison, pre/post change, subgroup gaps, key drivers, IRT, and MLM each get the same treatment. Every body paragraph names the actual outcome and grouping variable; every actions list is tier-aware. The reviewer sees what the analysis means before reading the table.

Top of every analytics tab, above the existing math card

▣

Key Driver Narrator

The Key Drivers tab ranks every survey factor by its share of explained variance using Johnson's Relative Weights for continuous outcomes (or standardized log-odds for binary outcomes). The AI summary at the top of the tab names the top one to three drivers in full, translates the importance metric into plain language ("workload accounts for 28 percent of the explained variance in engagement"), and flags any driver that shows a strong bivariate correlation but a near-zero standardized coefficient as a multicollinearity story rather than a finding. The output is built for an HR or evaluation lead, not a statistician.

Top of the Key Drivers analytics tab

◉

360 Subject Narrator

For every subject of a 360 / multi-rater panel, a two-to-four sentence summary card sits at the top of the subject report. The AI names the strongest theme by quoting the highest-rated item, names the biggest development area by quoting the lowest-rated item, and flags blind spots when self-ratings exceed others' ratings by 0.75 points or more on the five-point scale (or underestimation when the gap runs in the reverse direction). The closing sentence nudges what the manager or HR partner should do next. The tone pill (Strong picture, Solid with notes, Gaps to address, Significant gaps) gives a fast visual read before any number is shown.

Top of every 360 subject report; full report also downloadable as a PDF
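
The blind-spot rule is simple enough to sketch. The example below assumes others' ratings are averaged before comparison and that underestimation uses the same 0.75-point threshold in reverse; both are assumptions for illustration, as are the function name and labels.

```python
from statistics import mean

BLIND_SPOT_GAP = 0.75  # threshold on the five-point scale, from the text


def classify_gap(self_rating, other_ratings, threshold=BLIND_SPOT_GAP):
    """Compare a self-rating with the mean of others' ratings on one item."""
    gap = self_rating - mean(other_ratings)
    if gap >= threshold:
        return "blind spot"  # self-rating runs well above others' view
    if gap <= -threshold:
        return "underestimation"  # self-rating runs well below others' view
    return "aligned"


print(classify_gap(4.0, [3.0, 3.5]))  # gap of 0.75 meets the threshold
print(classify_gap(3.0, [4.0, 4.0]))  # gap of -1.0
print(classify_gap(4.0, [4.0, 3.5]))  # gap of 0.25
```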

∎

Suite Roll-Up Narrator

Open a workflow suite that has two or more surveys attached and a quarterly roll-up card pins to the top of the suite page. The narrator reads response volume this quarter versus last, the average Strength Index across every survey in the suite, and the Likert-mean direction on any construct tagged across two or more surveys. The output is the line you would actually read aloud in a leadership meeting ("HR Suite is steady this quarter; engagement is essentially flat, exit volume is up modestly, response volume slipped 12 percent"). The tone pill (Steady or improving, Mixed quarter, Watch this quarter, Slipping this quarter) is the one-word read; the paragraph and the three highlights name specific constructs and surveys. Numbers come from the live response data, never invented.

Top of the suite detail view, above the Templates and Surveys cards. Gated to suites with 2+ attached surveys.

⚒

Response Quality Narrator

The Response Quality dashboard opens with a plain-language read of how trustworthy the responses are. The narrator reads the straight-lining count, the duplicate-vector pairs, the share of low-effort open-ended answers, and per-question missingness; it then writes a paragraph in plain HR voice ("data quality is solid; two respondents straight-lined every Likert item and one open-ended question is being skipped at 38 percent"). It surfaces channel-level skew when one distribution path is attracting noticeably lower-effort responses, and never identifies individual respondents.

Top of the Response Quality analytics tab

↷

Completion Narrator

The Completion & Missing Data dashboard opens with a paragraph that names the Completion Score, the modal drop-off question by its actual prompt text, and any questions skipped at much higher rates than the rest. When a single-choice grouping variable is present, the narrator flags any group with a much higher skip rate so accessibility or engagement gaps surface before they become a story.

Top of the Completion analytics tab

↗

Trends Narrator

The Trends dashboard reads the wave-over-wave story. The narrator reports whether the composite is trending up, steady, or slipping, names the most-moved construct by its actual label, and translates the current-vs-previous-wave significance flag into plain English ("the drop is statistically significant" / "the change is within sampling noise" / "the sample is too small to call this significant yet"). When wave detection comes from the Pulse channel-tag convention, the narrator references the wave labels directly so the read aligns with how the survey was sent.

Top of the Trends analytics tab (in More analyses)

✔

Survey Readiness Narrator

Before a single response is in, the Survey Readiness tab grades the survey design itself across six weighted domains and the AI narrator translates the score into a plain answer: ready to send, almost ready with two small fixes, workable with real gaps, or not ready until blockers clear. The narrator names the single most pressing fix by its issue title or domain, references what unlocks if it is fixed (omega, Compare, per-construct reliability, equity gaps), and skips fix-first framing entirely when the score is 85 or higher. The same paragraph appears at the top of the Distribute view's pre-publish readiness card and on the Analytics tab, and a Readiness pill in the survey context bar hints at it.

Top of the Readiness analytics tab and the Distribute pre-publish card

⚖

Equity Gaps Narrator

The Equity Gap analysis tab surfaces outcome differences across protected-class or program groups (gender, race or ethnicity, age band, role, tenure, etc.) and the narrator reads the result in HR-friendly language. It names axes by the survey's actual question label, names groups by their actual option labels, and translates Cohen's d into plain English ("engagement is 0.71 standard deviations lower for one group than another on the race or ethnicity axis"). The framing is patterns to understand and act on, never accusations; when subgroups were hidden for k-anonymity, the narrator mentions the hidden count once so the reader knows a smaller-group story may be missing from the analysis.

Top of the Equity Gaps analytics tab (in More analyses)
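
To make the effect size concrete, the sketch below computes Cohen's d with a pooled standard deviation and turns it into a sentence in the style quoted above. The pooled-variance form is one common definition and is assumed here; the function names and sample scores are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


def describe_gap(d, outcome, group_a, group_b):
    """Phrase the effect size as a plain-language sentence."""
    direction = "lower" if d < 0 else "higher"
    return (
        f"{outcome} is {abs(d):.2f} standard deviations {direction} "
        f"for {group_a} than for {group_b}"
    )


# Made-up scores for two hypothetical groups.
d = cohens_d([2.0, 3.0, 4.0], [3.0, 4.0, 5.0])
print(describe_gap(d, "engagement", "Group A", "Group B"))
```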

When you report

Know what these results can support before you publish.

◉

Response Quality Check

Three deterministic checks scan the dataset: straight-lining (same Likert answer across every item), duplicate response vectors (identical answer patterns across respondents), and very short open-ended answers. AI then narrates the severity in plain language and recommends whether to review, recollect, or proceed.

Strength Index tab, under the verdict card
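
The three deterministic checks can be sketched as plain functions. This is an illustration under stated assumptions: the function names are hypothetical, and the three-word cutoff for a "very short" answer is a placeholder, since the text does not give the actual threshold.

```python
from collections import Counter


def straight_liners(likert_rows):
    """Indices of respondents who gave the same answer to every Likert item."""
    return [
        i for i, row in enumerate(likert_rows)
        if len(row) > 1 and len(set(row)) == 1
    ]


def duplicate_vectors(likert_rows):
    """Answer patterns shared by two or more respondents, with their counts."""
    counts = Counter(tuple(row) for row in likert_rows)
    return {pattern: n for pattern, n in counts.items() if n > 1}


def short_open_ended(answers, min_words=3):
    """Indices of open-ended answers with fewer than min_words words (assumed cutoff)."""
    return [i for i, text in enumerate(answers) if len(text.split()) < min_words]


rows = [[3, 3, 3], [1, 2, 3], [1, 2, 3], [5, 4, 5]]
print(straight_liners(rows))
print(duplicate_vectors(rows))
print(short_open_ended(["ok", "The workload is too heavy", ""]))
```

Because the checks are deterministic, the same dataset always produces the same flags; the AI only narrates severity on top of them.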

★

"Can I Use These Results?" Advisor

A four-tier verdict sits directly under the Survey Strength Index ring: yes, yes with cautions, use with care, or not yet. Three buckets break it down further: "Safe to use for," "Use caution for," and "Not recommended for." Specific cautions reference the actual numbers the verdict is built on.

Top of the Strength Index tab

📝

Test Analytics Narrator

For classroom tests and quizzes. Reads reliability, pass rate, item difficulty, and item quality, then writes a teacher-friendly paragraph that names your strongest and weakest items by their label, flags possibly miskeyed questions before they cost another semester, and tells you whether the test reliably ranked your students. Plain language, no jargon, ready to forward to a department chair.

Top of every test analysis dashboard

✎

Draft Report Writer

One click on the Strength Index turns the live analytics snapshot into a two-to-three paragraph draft you can paste into a write-up. Pick the voice. Researcher writes a formal Methods-and-Results paragraph with alpha, omega, factor count, and cumulative variance cited precisely. HR lead writes a leadership summary with the headline first and the numbers in parentheses. Teacher writes a department-friendly paragraph with a specific next step. Numbers come from the snapshot only, never invented.

Top of the Strength Index tab

⇪

The reporting trio works together

Open the Strength Index. The verdict tells you what these results can support. The data-quality check tells you whether the response set itself is clean. The draft-report writer turns the verified analytics into a paragraph you can edit. Three cards stack down the page in the order a careful reviewer would actually use them.

Strength Index tab, top to bottom

↧

And then the PDF

When you're ready to send, the export buttons mirror the dashboard exactly. The full survey report walks all twelve analytics tabs and produces one multi-page PDF. The per-tab export captures just the cards on the current tab. The test report is built for classroom and department review. The pages look the way the screen looks because the export is a clean screenshot of the rendered dashboard, not a flat text dump.

Strength Index, each analytics tab, and the test analytics view

Principles

What we will not do

Reliability

Pretend a result is solid

If the sample is small, alpha is low, or missingness is structural, the AI summary says so before it gives you the headline. The verdict tiers exist precisely to keep weak data from getting reported as strong.

Privacy

Train on your data

Survey content and responses are not used to train any model, ours or our providers'. Open-ended respondent text never crosses the wire to the dashboard narrator. Your data stays your data.

Transparency

Hide its work

Every AI claim links to a number, a method, and the items behind it. You can disagree with the framing, edit the draft, or fall back to the offline template if you want a deterministic read.

AI plus reliability, in one survey platform

Start free, run a survey, and watch the AI summaries appear next to the actual numbers. The pairing is the point.
