What AI Video Interview Software Actually Analyzes

HireVue processes 25,000 data points per video interview session. That number gets cited a lot. What nobody explains is what those data points actually are.

If you’re walking into a video interview in 2026 without understanding what the software is measuring, you’re preparing for the wrong test.

I’ve spent the past year digging through how AI-assisted video interview platforms work. Not marketing copy. The actual technical breakdowns, the research papers, the patent filings. Here’s the mechanic’s view of what’s actually happening when you hit record.

How AI Video Interview Screening Works in 2026

Let me back up for a second.

Traditional phone screens are expensive. A recruiter spending 30 minutes per candidate across 200 applicants is 100 hours of labor before a single in-person interview happens. AI video screening tools were built to solve that math problem.

The dominant players: HireVue (used by Unilever, Delta, Goldman Sachs), Spark Hire (used by 6,000+ companies), and a field of smaller competitors. The market has also consolidated: Montage merged into Modern Hire in 2019, and HireVue acquired Modern Hire in 2023.

The basic flow: you receive a link, you record yourself answering pre-set questions within a time limit, and the software evaluates your responses before a human ever watches the video. In many cases, humans never watch it at all. The AI scores you and routes your application.

That’s the part people don’t fully absorb. You may be eliminated by an algorithm. Not by a person.

Understanding what that algorithm is measuring changes how you prepare.

The 5 Categories AI Interview Tools Actually Analyze

1. Language and Word Choice

This is the most significant category. The AI transcribes your answer and analyzes the text for several signals.

What it’s looking for:

Keyword presence. Job-relevant language. If the role requires “cross-functional collaboration” and you say “worked with different teams,” you may score lower than a candidate who used the actual phrase. The same ATS keyword logic that applies to your resume applies here.
Answer completeness. Did you actually answer the question? Systems trained on high-performing candidate responses know that good answers to “Tell me about a time you handled conflict” contain a situation, an action, and a result. Rambling without resolution scores poorly.
Filler word density. “Um,” “like,” “you know,” “basically.” Systems flag high filler word rates as a communication signal. This isn’t about perfection. It’s about whether your speech patterns suggest disorganized thinking under pressure.
Sentiment. Some systems analyze positive vs. negative language patterns. Candidates who describe challenges in constructive language (“I learned that…”) score differently than those who describe the same challenges in purely negative terms.
AI-generated language detection. Here’s the part nobody’s prepared for. Some systems are now trained to flag responses that appear scripted or AI-assisted: formulaic phrasing, overly polished sentence structure, answers that sound rehearsed verbatim from a prep guide. The same way employers screen AI-generated resumes, some are now screening AI-coached interview answers. Sounding like yourself is something these language models are built to reward, not a side effect.
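To make the keyword and filler-word signals concrete, here’s a rough sketch of how they could be computed from a transcript. This is not any vendor’s actual scoring code. The filler list, the function names (filler_density, keyword_coverage), and the exact-phrase matching rule are all assumptions for illustration.

```python
import re

# Hypothetical filler list; real systems do not publish theirs.
FILLERS = ("um", "uh", "like", "you know", "basically")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_phrase(phrase, joined):
    """Count whole-word occurrences of a phrase in a space-joined transcript."""
    return len(re.findall(rf"\b{re.escape(phrase)}\b", joined))


def filler_density(transcript):
    """Return filler occurrences per 100 words.

    Naive on purpose: "like" is counted even when it is used as a verb.
    """
    words = tokenize(transcript)
    if not words:
        return 0.0
    joined = " ".join(words)
    total = sum(count_phrase(f, joined) for f in FILLERS)
    return 100.0 * total / len(words)


def keyword_coverage(transcript, keywords):
    """Return the fraction of job-description phrases that appear verbatim."""
    if not keywords:
        return 0.0
    joined = " ".join(tokenize(transcript))
    hits = 0
    for keyword in keywords:
        phrase = " ".join(tokenize(keyword))
        if phrase and count_phrase(phrase, joined) > 0:
            hits += 1
    return hits / len(keywords)
```

Under this exact-match assumption, an answer that says “worked with different teams” gets no credit for the keyword “cross-functional collaboration,” which is the gap described above. A production system would likely be more forgiving about synonyms, but how forgiving is not something vendors disclose.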

2. Vocal Delivery

The AI analyzes the audio track separately from the transcript.

What it measures:

Speaking pace. Average words per minute. Extremes in either direction flag as potential issues. Too fast reads as anxious. Too slow reads as uncertain. The sweet spot varies by role but generally falls between 120-180 words per minute.
Tone variation. Flat monotone speech scores lower than speech with appropriate modulation. The system is measuring engagement signals, not performance. Candidates who sound energized about the topic score differently than candidates who deliver the same content in a flat register.
Pause patterns. Brief pauses before answering (thinking) are neutral to positive. Long pauses mid-answer register as hesitation. Very fast answers with no pauses register as rehearsed.
Volume consistency. Fading volume at the end of sentences is a pattern the software tracks. It correlates with confidence decline in the answer.
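If you want to check your own pace against that 120-180 range, the arithmetic is simple. Here’s a minimal sketch. The function names are made up, the 2-second “long pause” threshold is my assumption rather than a published cutoff, and the word timings are assumed to come from whatever transcription tool you use.

```python
def words_per_minute(transcript, duration_seconds):
    """Average speaking pace: word count scaled to a 60-second minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds


def pace_label(wpm, low=120.0, high=180.0):
    """Label a pace against the 120-180 wpm range cited above."""
    if wpm < low:
        return "slow"
    if wpm > high:
        return "fast"
    return "in range"


def long_pauses(word_times, threshold=2.0):
    """Return the silent gaps (in seconds) at or above the threshold.

    word_times is a list of (start, end) times for each spoken word.
    The 2.0-second default is an illustrative assumption.
    """
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(word_times, word_times[1:])]
    return [gap for gap in gaps if gap >= threshold]
```

Record a practice answer, paste the transcript in with the clip length, and see which label you get. A 90-second answer of 240 words works out to 160 words per minute, comfortably in range.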

3. Visual Signals and Body Language

This is where the data points pile up fast. The 25,000 number HireVue cites includes frame-by-frame analysis of the video track.

What it’s analyzing:

Eye contact with the camera. Looking at the camera is “looking at the interviewer.” Looking at the screen (where you see your own face) is looking away. Looking off-screen is a different signal. Most people look at their own face during video calls. That reads as poor eye contact to the AI. Look at the camera. Put a dot of tape next to it if it helps.
Facial expression consistency. The system identifies micro-expression patterns associated with high-confidence responses. Candidates whose facial expressions are consistent with their verbal content score differently than those with incongruent expression patterns (saying confident things with an anxious face, for example).
Posture. Systems with advanced video analysis track forward lean (engagement), slumped posture (disengagement), and head tilt patterns. These are subtle signals but they’re part of the dataset.
Gesture. Some systems analyze hand gesture frequency as a signal of emotional engagement and communication style. No gestures (rigid delivery) and excessive gestures (distraction) both score differently than moderate, natural movement.

Important context: The research on whether these visual signals actually predict job performance is genuinely contested. Multiple academic studies have questioned whether AI video analysis correlates with long-term employee outcomes, and HireVue itself announced in 2021 that it had stopped using facial analysis in its assessments. Some jurisdictions have restricted the practice. Illinois passed its Artificial Intelligence Video Interview Act in 2019 (in effect since January 2020), requiring employers to disclose when AI is used and to get the candidate’s consent.

But contested research doesn’t mean the systems aren’t deployed. They are. Widely. And your application is being scored by them whether you agree with the methodology or not.

4. Technical and Environmental Signals

This is the category people underestimate.

Audio quality. Systems flag recordings with significant background noise, echo, or poor microphone quality. The model treats audio quality as a proxy for preparation: a candidate who shows up with bad audio is a different profile than a candidate who set up a proper recording environment.
Lighting. Backlighting (window behind you) often makes your face underexposed. The visual analysis systems need to see your face clearly to analyze it. Backlighting literally reduces the data points available for analysis, which tends to hurt your score.
Connectivity. Freezing, pixelation, lag spikes. These affect transcript accuracy, which affects language analysis. Poor connectivity isn’t just an annoyance. It actively degrades your score in text-based analysis categories.
Framing. Sitting too far from the camera, cutting off the top of your head, being slightly off-center. The systems are calibrated for standard video interview framing. Non-standard framing produces lower-confidence analysis results, which trend toward lower scores.

Fix these before you record anything. They’re table stakes. Set up your recording environment the same way you’d set up for a high-stakes presentation, because that’s what it is.
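You can sanity-check the backlighting problem yourself before recording. The sketch below is a crude heuristic of my own, not how any interview platform works: it assumes your face sits in the middle of the frame, takes a grayscale frame as rows of 0-255 pixel values, and flags the frame when the center is much darker than the edges. The 0.6 ratio is an arbitrary illustrative threshold.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def is_backlit(frame, ratio=0.6):
    """Guess whether a grayscale frame is backlit.

    frame: list of rows, each a list of pixel brightness values (0-255).
    Assumes the face occupies the central half of the frame. Returns True
    when the center is darker than `ratio` times the border brightness.
    """
    if not frame or not frame[0]:
        return False
    height, width = len(frame), len(frame[0])
    top, bottom = height // 4, height - height // 4
    left, right = width // 4, width - width // 4
    center, border = [], []
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if top <= y < bottom and left <= x < right:
                center.append(pixel)
            else:
                border.append(pixel)
    if not center or not border:
        return False
    return mean(center) < ratio * mean(border)
```

A bright window behind you produces exactly this pattern: bright edges, dark middle. If a frame from your test recording trips the check, move the light source in front of you.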

5. Answer Depth and Structure

The AI is trained on thousands of high-performing and low-performing interview responses. It has learned patterns.

What structural signals it reads:

STAR framework presence. Situation, Task, Action, Result. Answers that follow this arc contain the narrative elements the model associates with quality responses. Not because STAR is some magic formula, but because it forces answer completeness. The AI has seen enough responses to know that “here’s what happened, here’s what I did, here’s what changed” is a more informative answer than a wandering narrative that never reaches a result.
Specificity markers. “I improved customer satisfaction” vs. “I improved customer satisfaction scores from 3.8 to 4.6 over six months.” The system recognizes quantification as a quality signal. This mirrors what applicant tracking systems (ATS) do with resume bullet points. Quantified answers score higher. Specific numbers aren’t just good storytelling. They’re what the model was trained to look for.
Answer length relative to question type. A question that asks you to describe a complex situation should produce a longer answer than a question about your availability. Systems flag responses that are dramatically too short (lack of depth) or dramatically too long (inability to be concise).
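Two of these signals are easy to approximate yourself: specificity markers and answer length. The sketch below is an assumption-heavy illustration, not a vendor algorithm. The regex only catches digits (so “six months” spelled out is missed), and the expected length bounds passed to length_flag are hypothetical values you would pick per question type.

```python
import re

# Matches integers, decimals, and percentages written with digits.
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")


def quantified_claims(answer):
    """Return every digit-based figure found in the answer text."""
    return NUMBER.findall(answer)


def length_flag(answer, expected_min, expected_max):
    """Compare word count against a hypothetical expected range."""
    count = len(answer.split())
    if count < expected_min:
        return "too short"
    if count > expected_max:
        return "too long"
    return "ok"
```

Run your practice answers through quantified_claims. An empty list means the story has no hard numbers in it yet, which is the first thing to fix.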

The 2026 Shift: AI Collaboration Questions Are Now Standard

Here’s something that wasn’t in most interview prep guides six months ago.

AI video interviews in 2026 routinely include a question like: “Tell me how you’ve used AI tools to improve your productivity or achieve a result.”

This matters for three reasons.

First, it’s a role-readiness filter. Companies that are building AI-first workflows want candidates who have actually used AI tools, not candidates who can describe what they are. The question is a fast screener.

Second, the AI analyzing your answer is itself trained to recognize authentic AI-tool experience versus theoretical awareness. Candidates who have genuinely used tools like Claude, Gemini, Copilot, or specialized industry AI give different answers than candidates who’ve read about them. The specificity markers apply here too: “I used Claude to draft outreach sequences and A/B tested the output against manually written versions” is a different answer than “I’ve explored various AI tools for productivity.”

Third, it’s become a behavioral interview category. Expect follow-ups: “What was the result?” “What limitations did you find?” “How did you verify the AI’s output before using it?” Prepare a real example. If you don’t have one, create one in the weeks before your interview by using an AI tool on a real task and tracking the result.

The Practical Prep Checklist for AI Video Interviews

Most interview prep focuses on what to say. AI video screening means you also need to optimize the environment and delivery, not just the content.

Before you record:

Test your setup. Record a 60-second sample and play it back. Check audio quality, lighting, framing. Fix what’s wrong before the actual session.
Put a piece of tape or a sticky note next to your camera lens. This is your focal point. Look at it when you answer questions, not at your own face on the screen.
Do the recording in a quiet room with controlled lighting. A lamp in front of you, slightly off to one side and a little above eye level, is the classic portrait lighting setup. It softens shadows and keeps your face evenly lit.
Close all other applications. Connectivity issues hurt transcription accuracy, which hurts your language analysis scores.

On content:

Prepare 8-10 STAR stories that can flex across different question types. You don’t need a unique story for every possible question. You need versatile stories. The 8-story interview bank approach still holds and applies directly here.
Practice out loud. Not in your head. The vocal delivery analysis rewards answers you’ve actually spoken before. Speech you’ve practiced has a steadier rhythm and pace than speech you’re composing on the spot. Practice until the answer flows, but don’t memorize it word for word, because verbatim recitation is exactly what reads as scripted.
Add quantification to every story. If you can’t remember the specific number, estimate and note it’s approximate. “Around 35%” is a signal to the system. “Better results” is not.
Prepare your AI collaboration example now. One real story with a specific tool, a specific use case, and a specific result.

On delivery:

Pause before you answer. Not three seconds. One second. It signals thought, not hesitation. This is a real pattern difference between high-scoring and low-scoring responses.
Vary your tone. If you’re monitoring yourself for this, you’re probably overdoing it. The goal is to talk about the work with the same energy you’d use describing it to a colleague. That naturalness is what the vocal analysis is trying to detect.

One More Thing: Know When AI Was Used

Illinois requires disclosure. New York City’s Local Law 144, passed in 2021 and enforced since July 2023, requires bias audits and candidate notice for automated hiring tools. More jurisdictions are moving in this direction.

If you receive a video interview link that doesn’t disclose AI analysis, you can ask before recording. “Will my video response be analyzed by AI screening software?” is a reasonable question. The recruiter’s answer tells you something useful about the company’s approach to transparency in hiring.

This doesn’t change your prep. But it’s information worth having.

Before you submit your next application, make sure your resume is already optimized for the role. Sign up free at JobCanvas.ai, upload your resume, and see exactly which skills and terms you’re missing before you even get to the video interview stage. That’s the filter before the filter.

AI video interviews are part of the process now. Understand what they’re measuring and prepare accordingly. The 25,000 data points are just inputs. The algorithm is trying to predict something. Know what it’s predicting, and you can give it the signals it’s looking for.