Pillar Guide · 22 min read

The Complete Guide to AI Interviews (2026)

What an AI interview is, how AI is rewriting both sides of the loop, and how to choose an AI interview tool — from the team that builds Acedly AI.

Devon Park

Head of Research, Acedly

What an AI-conducted interview actually is

An AI-conducted interview is an asynchronous video screening. The recruiter never appears on camera. You receive a link, the page shows a prompt, a recording timer starts, and you have a fixed number of takes — usually one, occasionally two — to deliver an answer. Your video and audio are uploaded, transcribed, and scored by a combination of NLP models and (depending on the vendor) facial-expression and prosody analysis. A hiring manager then sees a ranked list of candidates with model-generated scores attached, often weeks after the recording.

This is a different workflow from a live recruiter call and a different category from a real-time AI copilot. The defining traits are:

  • Asynchronous. The recording happens on your timetable; the review happens on theirs. There is no live audience whose reactions you can read.
  • Constrained takes. Most platforms allow a single take per question. A minority allow one re-record. None allow unlimited iteration.
  • Algorithmic first pass. The first reviewer of your answer is a model. Humans see a shortlist that has already been ranked.
  • Volume tooling. Employers reach for this format when they have hundreds or thousands of applicants for a role. It is rare for senior or specialist hiring, common for graduate and high-volume front-line roles.

How AI-conducted screening actually scores you

The black-box framing is mostly marketing. Most vendors publish enough about their scoring stack that you can prepare against it directly. Three families of signals dominate.

Content scoring

The dominant signal in every credible vendor is the content of your answer: a transcript is generated from the audio, and an NLP model scores it against a rubric that was either hand-built by an industrial-organisational psychologist or distilled from historical hire / no-hire data at that employer. The rubric usually weights structured answer shape (something close to STAR), use of role-relevant keywords drawn from the job description, the presence of concrete numbers and named outcomes, and the absence of disqualifying signals (excessive filler words, evasive phrasing, mismatches with the résumé).

You can prepare against content scoring directly. The mechanics are the same as preparing for a structured behavioural round with a human — your answer should have a clear setup, a clear action, and a clear quantified outcome — but with two adjustments. First, you do not get the credit for natural human warmth that a live interviewer gives. Second, you are scored against the rubric every time, so consistency matters more than peak performance.
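As a rough illustration of how a transcript rubric of this kind might combine signals, here is a minimal sketch. It is not any vendor's actual model: the filler list, the three signals, and the weights are all hypothetical, chosen only to mirror the rubric elements described above (role keywords, concrete numbers, filler words). Keywords are assumed to be lowercase single words.

```python
import re

# Hypothetical filler list and weights; no vendor publishes these exact values.
FILLERS = {"um", "uh", "like", "basically", "literally"}
WEIGHTS = {"keywords": 0.4, "numbers": 0.3, "low_filler": 0.3}


def score_transcript(transcript: str, jd_keywords: set[str]) -> dict[str, float]:
    """Score a transcript on three toy rubric signals, each in [0, 1]."""
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    if not words:
        return {"keywords": 0.0, "numbers": 0.0, "low_filler": 0.0, "total": 0.0}

    vocab = set(words)
    # Share of job-description keywords that appear at least once.
    keywords = len(jd_keywords & vocab) / len(jd_keywords) if jd_keywords else 0.0
    # 1.0 if the answer contains at least one standalone number, else 0.0.
    numbers = 1.0 if any(w.isdigit() for w in words) else 0.0
    # Penalise filler words: 0% filler scores 1.0, 10% or more scores 0.0.
    filler_ratio = sum(w in FILLERS for w in words) / len(words)
    low_filler = max(0.0, 1.0 - filler_ratio * 10)

    signals = {"keywords": keywords, "numbers": numbers, "low_filler": low_filler}
    total = sum(WEIGHTS[name] * value for name, value in signals.items())
    return {**signals, "total": round(total, 3)}
```

Running a rehearsed answer through something like `score_transcript(answer, {"led", "shipped", "owned"})` is a quick way to check whether the answer names an outcome in numbers and stays light on filler. The absolute score means nothing outside this toy rubric.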

Vocal and prosodic features

A smaller but real signal is how you say what you say. Pace, pitch variation, energy, and the ratio of speech to silence are all measurable from the audio without any transcription. The vendors that publish their feature lists name things like words per minute, average pause duration, and pitch range. The defensible reading of this is that you should sound engaged, not monotone, and you should not rush — pace under stress is the most common failure mode candidates self-report.
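To make the pacing features concrete, the sketch below computes words per minute, average pause duration, and the speech-to-silence ratio from per-word timestamps. It assumes you already have start and end times for each word (many transcription tools can export these); the `Word` type and the 0.25-second pause threshold are illustrative assumptions, not a vendor specification. Pitch range is left out because it requires analysing the audio signal itself.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def pacing_features(words: list[Word], min_pause: float = 0.25) -> dict[str, float]:
    """Compute words per minute, average pause, and speech-to-silence ratio."""
    if len(words) < 2:
        raise ValueError("need at least two timed words")
    duration = words[-1].end - words[0].start
    if duration <= 0:
        raise ValueError("timestamps must span a positive duration")

    # Gaps between consecutive words; only gaps >= min_pause count as pauses.
    gaps = [b.start - a.end for a, b in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    silence = sum(pauses)
    speech = duration - silence
    return {
        "wpm": len(words) / duration * 60,
        "avg_pause_s": silence / len(pauses) if pauses else 0.0,
        "speech_to_silence": speech / silence if silence else float("inf"),
    }
```

Applied to a practice recording, these numbers give a rough view of whether you are rushing or leaving long gaps, which is what the published feature lists appear to measure.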

Read the vendor's policy and your local employment law. The boundaries around prosody-based screening tightened sharply between 2023 and 2026, and several large vendors have publicly dropped facial-expression scoring in particular under regulatory pressure (the Illinois Artificial Intelligence Video Interview Act and the EU AI Act both target this category). Some vendors still score prosody as a "neutral" engagement signal; others no longer score anything from the audio beyond the transcript.

Facial expression and gaze (mostly retired)

A few years ago, several vendors scored facial expressions, eye gaze, and microexpressions as proxies for engagement, honesty, and enthusiasm. The category was always contested on validity grounds and is now contested on legal grounds in most jurisdictions where async screening is common. HireVue publicly dropped facial-analysis scoring in 2021; most credible competitors have followed. If your screening vendor still claims to score facial expressions, treat the result as low-signal and weight your content and vocal preparation accordingly.

The platforms that run AI-conducted interviews

Five vendors run the bulk of AI-conducted screening for English-speaking employers in 2026. Knowing which one you're facing changes the details of your preparation, not the fundamentals.

AI-conducted screening platforms candidates actually encounter
| Feature | HireVue | Spark Hire | Modern Hire (HireVue) | Hireflix | myInterview |
|---|---|---|---|---|---|
| Format | Async video, structured prompts | Async video, simple prompts | Async video + assessments (now under HireVue) | Async video, recruiter-customised | Async video, AI personality scoring |
| Takes per question | Usually 1 | Often 1, sometimes 3 | Usually 1 | Configurable per recruiter | Usually 1 |
| Scoring layer | NLP content + light prosody | Light, mostly transcript | NLP + role-fit assessments | Light, recruiter-led | Personality model + transcript |
| Where you'll see it | Fortune 500 grad / volume hiring | SMB and franchise hiring | Same families as HireVue | Mid-market and tech | Service, retail, hospitality |
| Typical answer time | 60–120 s per Q | 30–90 s per Q | 60–120 s per Q | 30–180 s per Q | 30–120 s per Q |

The other platform worth naming is the in-house async screener that some large employers (notably Goldman Sachs, McKinsey, and several big tech grad programmes) have built on top of generic video tooling. The mechanics are identical to the named vendors; the rubric is just internal.

Preparing for an AI-conducted interview

The preparation has six elements. Most candidates do the first two, neglect the rest, and score below what their content deserves.

Set up the environment. Wired internet if possible. A neutral background. A single light source roughly behind your camera. A glass of water. Your phone face-down, on silent. Test the platform's webcam and microphone preview, in the actual browser, before the recording window opens. Most failed recordings are device-permission issues, not content issues.

Read the prompt carefully and reset. Most platforms give you 30 to 90 seconds of preview time before the recording begins. Use it. Read the prompt slowly and silently. Identify the three or four points you want to make. Take one deep breath. Then start.

Open with a one-sentence summary. The transcript scorer is looking for structured answer shape. Open with a single sentence that names the situation and the outcome — "I led a payments migration last year that finished six business days late after we added a dual-write rollback for safety" — and then expand. The summary primes the rubric the way a thesis statement primes an essay grader.

Hit the role-relevant keywords without stuffing. Recruiters configure the rubric against the job description. The relevant keywords are usually obvious from the JD: the named technologies, the named methodologies (OKR, agile), the named verbs (led, shipped, owned). Use them where they genuinely apply. Do not pad the answer with them.

End with a quantified outcome. "We migrated six terabytes with zero downtime over a three-week window." "Cycle time dropped from 18 to 6 days within a quarter." Numbers anchor the answer in the rubric. If your role didn't produce numbers, name the second-order outcome — "the manager who reviewed the rollback later adopted the dual-write pattern for two other migrations."

Pace yourself. Most credible vendors penalise the bottom and top quintile of words-per-minute. The unpenalised middle band is roughly 140 to 170 WPM in English. Practise to the timer: if the platform gives you 90 seconds, rehearse three or four answers that come in at 75 to 85 seconds. The most common timing failure is running out of time mid-thought.
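The pacing arithmetic can be turned into a word budget for a scripted answer. The sketch below uses the 140 to 170 WPM band and the 75-to-85-second target for a 90-second limit from the paragraph above; treating that target as a 5-to-15-second buffer that carries over to other time limits is an assumption, and the function names are illustrative.

```python
import math


def word_budget(limit_s: int, low_wpm: int = 140, high_wpm: int = 170,
                min_buffer_s: int = 5, max_buffer_s: int = 15) -> tuple[int, int]:
    """Return the (min, max) word count for a rehearsed answer.

    The minimum is the shortest target duration spoken at the slowest pace;
    the maximum is the longest target duration spoken at the fastest pace.
    """
    shortest = limit_s - max_buffer_s
    longest = limit_s - min_buffer_s
    if shortest <= 0:
        raise ValueError("time limit is too short for the requested buffer")
    return math.ceil(shortest * low_wpm / 60), math.floor(longest * high_wpm / 60)


def rehearsal_wpm(word_count: int, duration_s: float) -> float:
    """Words per minute for one timed rehearsal take."""
    return word_count / duration_s * 60


print(word_budget(90))         # (175, 240)
print(rehearsal_wpm(200, 80))  # 150.0
```

For a 90-second prompt this suggests a script of roughly 175 to 240 words. A 200-word answer delivered in 80 seconds works out to 150 WPM, inside the band.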

Common pitfalls

The pitfalls are remarkably consistent across vendors and roles.

  • Treating the screen like a live conversation. There is no one nodding back. Your prosody flattens, the silences feel awkward, and you over-fill them with hedges. The fix is to imagine one specific person, such as a friend or a former manager, and speak to the camera as if it were them.
  • Trying to game the model. Stuffing keywords, repeating phrases verbatim from the JD, or padding answers with industry jargon all show up as anomalies in the rubric. Several vendors explicitly flag this and route the answer for human review with the flag attached. The flag is worse than a slightly weaker answer.
  • Ignoring the platform's interface. Each vendor has small interface quirks — a 5-second countdown before recording, a "review" button that doesn't actually let you re-record, a question-of-the-day at the start that's part of the scoring. Read the help page for the vendor before you sit down.
  • Skipping the practice question. Most platforms offer one or two practice prompts that don't count toward your score. Candidates routinely skip these to "save mental energy" and then waste their first real take on an interface mistake. Use the practice prompt every time.

How an AI-conducted screen differs from a live human interview

The two formats reward different muscles. Treating them as the same product is the most common mistake candidates make when they hear the words "AI interview."

|  | AI-conducted async screen | Live human interview |
|---|---|---|
| Who reviews first | A model | A human |
| Real-time adaptation | None: script and deliver | Constant: read and adjust |
| Tools you can use | Notes, prep, practice | Notes, prep, optionally a real-time copilot |
| Signal weighting | Transcript content > prosody | Substance + rapport + chemistry |
| What rewards preparation | Drilled structured answers | Drilled structured answers + live-cadence practice |
| Acedly's role | Not for this format | Real-time copilot during the live call |

Acedly is a real-time AI copilot built for the live side of this distinction — a human recruiter on the other end of a Zoom or Teams call, the copilot running silently on the candidate's machine, sub-200 ms latency, hidden from screen sharing. It is not designed for async screens; using a real-time copilot during a HireVue recording is both pointless (there is no live cadence to assist) and risky (most async vendors run their own anti-cheat heuristics). The right preparation for async is rehearsal; the right tool for live is a copilot. See our AI interview assistant pillar for the live-interview side.

Frequently asked questions

AI-conducted interview FAQ