Sierra Killed the Algorithms Round. Your Boolean String Still Grades for It.
Sierra, Meta, and Google now grade AI fluency, not LeetCode. Here's why recruiter Boolean strings still source the wrong engineer, and what to search instead.
Sierra just published "The AI-native interview" and announced it removed coding and algorithms rounds entirely. In the same six-month window, Meta rolled its AI-enabled CoderPad round to E7+ and M1, and a leaked Google doc confirmed Gemini-assisted "code comprehension" rounds with explicit scoring on AI fluency. Three top-tier loops now grade tool steering, output validation, and debugging of AI output. Recruiter Boolean strings are still keyed to algorithms, data structures, and "7+ years of Python."
That gap is the story. The interview moved. Sourcing didn't.
What actually changed at Sierra, Meta, and Google
Sierra's new onsite is structured as Plan, Build, Review. Candidates get roughly two hours in the Build phase to actually ship the product using whatever AI tools, frameworks, and approaches they want. They're explicitly encouraged to pivot, cut scope, or change direction when they hit a wall. There is no whiteboard. There is no quicksort. Sierra is also piloting a debugging round where the candidate pulls down a medium-sized codebase plus a draft PR from a colleague that introduces a cross-cutting feature, then has to review, inspect output, and iterate with coding agents to make it better.
Meta began piloting the AI-enabled CoderPad round in October 2025 and is rolling it out to all SWE roles in 2026. The bar that matters for senior sourcing: candidates at E7+ and M1 will only have one coding round during the onsite, and it will be AI-assisted. The environment includes a model switcher across GPT-4o mini, GPT-5, Claude Sonnet 4 and 4.5, Claude Haiku 3.5 and 4.5, Gemini 2.5 Pro, and Llama 4 Maverick. Meta's framing is explicit: it's a productivity booster, not an end-to-end solver.
Google's leaked internal doc says interviewers will evaluate "AI fluency, including prompt engineering, output validation, and debugging skills," and describes the format as "human-led, AI-assisted." VP of Recruiting Brian Ong has gone on the record confirming the Gemini pilot. Sundar Pichai disclosed in an April 22, 2026 blog post that 75% of all new code at Google is now AI-generated and approved by engineers, up from 50% the previous fall.
Canva moved earliest. In June 2025 it announced it now expects candidates for backend, frontend, and ML roles to use Copilot, Cursor, or Claude during technical interviews, and rewrote its questions to be ambiguous enough that they "can't be solved with a single prompt; they require iterative thinking, requirement clarification, and good decision-making."
Your Boolean string is selecting against the new bar
Here's the uncomfortable mechanical fact. A standard senior-SWE search looks something like ("Senior Software Engineer" OR "Staff Engineer") AND ("Python" OR "Go") AND ("data structures" OR "algorithms") AND "5+ years". That string was tuned for a LeetCode-graded loop. It surfaces candidates who optimize for tenure-with-a-language and pattern recognition. Those are exactly the candidates Sierra's blog says it no longer needs, because "the role is shifting from building the machine to designing and honing it, and we should focus less on the precise lines of code that are written and more about whether it produces the right outcomes over time."
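To make the mechanics concrete, here is a minimal sketch of how a string like that evaluates against profile text. The query groups come from the example above; the two profiles are hypothetical and invented purely for illustration, and real recruiter tools tokenize and rank differently.

```python
import re

# The example query from the paragraph above: AND-ed groups of OR-ed phrases.
BOOLEAN_QUERY = [
    ["senior software engineer", "staff engineer"],
    ["python", "go"],
    ["data structures", "algorithms"],
    ["5+ years"],
]


def contains_phrase(text: str, phrase: str) -> bool:
    """Case-insensitive phrase match that must not sit inside a longer word."""
    pattern = r"(?<!\w)" + re.escape(phrase.lower()) + r"(?!\w)"
    return re.search(pattern, text.lower()) is not None


def matches(profile_text: str, query=BOOLEAN_QUERY) -> bool:
    """True when every AND group has at least one of its OR phrases present."""
    return all(
        any(contains_phrase(profile_text, phrase) for phrase in group)
        for group in query
    )


# Two hypothetical profiles (assumptions, not real people).
tenure_profile = (
    "Senior Software Engineer. 5+ years of Python. "
    "Strong in data structures and algorithms."
)
behavior_profile = (
    "Senior Software Engineer. Shipped an agent-backed product end to end; "
    "maintains an eval harness and reviews AI-generated PRs."
)

print(matches(tenure_profile))    # True
print(matches(behavior_profile))  # False
```

The second profile describes the behavior the new loops grade, and the string drops it because none of the tenure keywords appear.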
Years-of-Python is now an anti-signal at the top of the market. Not a neutral signal. An anti-signal. The layer of abstraction has moved up. A candidate whose differentiation is deep tenure with a single language is differentiating on the layer that AI tooling has commoditized fastest. Greg Brockman's line on this is worth quoting: AI coding tools went from writing 20% of code to 80% "over the course of December" alone.
The interview can no longer be gamed by LeetCode prep. Sourcing is now the bottleneck, not the screen.
When the loop graded algorithms, a bad source could be salvaged. The candidate crammed for two weeks and squeaked through. When the loop is two hours of product-building with self-chosen tools plus a debugging round on a real codebase, a mis-sourced candidate fails on contact. The blast radius of a bad Boolean string just got much larger.
"AI fluency" is not "prompt engineer"
This is the trap most sourcing teams are about to walk into. The instinct is to swap one keyword set for another: replace "data structures" with "prompt engineering," replace "5 years Python" with "LangChain," and call it a day. That selects for the wrong profile again.
Read the Google doc carefully. The evaluated skills are prompt engineering, output validation, and debugging. Two of the three are skepticism, not enthusiasm. Sierra's debugging round is literally about rejecting or improving a draft PR. Meta treats the model as a productivity booster, "not an end-to-end solver." The signal you want is a person who reviewed a Claude-generated PR and caught a subtle off-by-one, or pushed back on a Cursor suggestion that compiled but violated an invariant.
That signal does not live in a LinkedIn headline. A search keyed to "Prompt Engineering Certified" in a candidate's About section selects against the new bar. The signal lives in GitHub PR review comments, in commit messages that walk back AI-generated diffs, in eval harnesses, and in repos that look like working notebooks for steering agents.
The "AI Engineer" title is a red herring
A search across our index for current US holders of "AI Engineer," "Applied AI Engineer," or "Agent Engineer" returns roughly 4,050 people, heavily concentrated in SF, the broader Bay Area, and NYC. That's a thin, geographically clustered pool that every well-funded company in the world is already running the same Boolean string against.
The candidates who would actually pass Sierra's Plan/Build/Review are mostly not in that 4,050. They're sitting under titles like "Senior Software Engineer," "Founding Engineer," "Full Stack Engineer," and "Tech Lead." You find them by behavioral signal, not by title match. This is the specific reason we built Refolk: you describe the engineer in plain English ("ships agent demos on GitHub, has a CLAUDE.md in at least one repo, currently at a Series A or B in NYC, not titled AI Engineer") and get a ranked shortlist back, drawing from GitHub, LinkedIn, and the open web at once. The whole point is that the right candidate for an AI-native loop is defined by what they've done, not by what HR put in their job slot.
What to actually source for
Here are the four substitutions worth printing out.
Replace "5+ years Python" with shipped 0 to 1 evidence
Sierra grades 0 to 1 in the Plan/Build phase and 1 to N in the debugging round. Neither correlates with LeetCode pattern recognition. Both correlate with engineers who have actually shipped side projects end-to-end, owned a messy migration, or carried a feature from spec ambiguity through to production. Source on commit history, not on tenure.
Replace "prompt engineering" with output-validation artifacts
Look for eval harnesses, golden-set repos, regression suites for LLM outputs, and PRs where the candidate rejected or rewrote AI-generated code with a substantive comment. Addy Osmani's agent-skills repo on GitHub is the canonical artifact to model your search against. It organizes 24 skills across Define, Plan, Build, Verify, Review, and Ship. Note the near-identical vocabulary to Sierra's Plan/Build/Review onsite. That is not a coincidence. That is the new shape of the work.
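For a sourcer who has not seen one, a golden-set regression harness can be as small as the sketch below. Everything here is hypothetical: the cases, the `run_golden_set` helper, and the `fake_model` stand-in (used so the sketch runs offline) are illustrative assumptions, not taken from any repo named above.

```python
from typing import Callable, List

# Hypothetical golden set: each prompt is paired with the answer that must not drift.
GOLDEN_SET = [
    {"prompt": "slugify: Hello World", "must_equal": "hello-world"},
    {"prompt": "slugify:  Plan  Build  Review ", "must_equal": "plan-build-review"},
]


def run_golden_set(generate: Callable[[str], str], cases=GOLDEN_SET) -> List[str]:
    """Return the prompts whose output no longer matches the golden answer."""
    failures = []
    for case in cases:
        if generate(case["prompt"]) != case["must_equal"]:
            failures.append(case["prompt"])
    return failures


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption, keeps the example offline)."""
    text = prompt.split(":", 1)[1]
    return "-".join(text.lower().split())


print(run_golden_set(fake_model))  # []
```

A repo containing something shaped like this, wired into CI and updated over time, is the kind of output-validation artifact the paragraph above describes.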
Replace "AI Engineer" title with file-path signals
The cheapest, highest-yield signal right now is the presence of specific files in a candidate's repos: CLAUDE.md, .cursorrules, agents.md, AGENTS.md, eval directories, or commit trailers like Co-authored-by: Claude or Co-authored-by: Cursor. These files barely existed 18 months ago. Their presence in a senior engineer's personal or work repos is a near-perfect proxy for the AI fluency Google's doc describes. None of these signals are visible to LinkedIn Recruiter. All of them are visible to anyone running GitHub sourcing for AI fluency the right way, which is one of the cases Refolk was designed for: behavioral GitHub signals fused with LinkedIn context in a single query.
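As a rough sketch, the check above can be run over a repo's file listing and commit messages. The file names and trailers come from the paragraph above; the `repo_signals` helper, the `evals` directory variant, and the sample data are assumptions for illustration.

```python
from pathlib import PurePosixPath

# File names from the paragraph above, lowercased for case-insensitive matching.
SIGNAL_FILES = {"claude.md", ".cursorrules", "agents.md"}
# "eval" is from the text; "evals" is an assumed common variant.
SIGNAL_DIRS = {"eval", "evals"}
# Commit trailer prefixes from the paragraph above.
SIGNAL_TRAILERS = ("co-authored-by: claude", "co-authored-by: cursor")


def repo_signals(paths, commit_messages):
    """Return the sorted list of AI-fluency signals found in one repo."""
    found = set()
    for raw in paths:
        path = PurePosixPath(raw)
        if path.name.lower() in SIGNAL_FILES:
            found.add(path.name.lower())
        # Check directory components only, not the file name itself.
        if any(part.lower() in SIGNAL_DIRS for part in path.parts[:-1]):
            found.add("eval directory")
    for message in commit_messages:
        for line in message.splitlines():
            if line.strip().lower().startswith(SIGNAL_TRAILERS):
                found.add("co-authored-by trailer")
    return sorted(found)


# Hypothetical repo contents, invented for illustration.
paths = ["CLAUDE.md", "src/app.py", "evals/golden/cases.json"]
commits = ["Fix pagination bug\n\nCo-authored-by: Claude <bot@example.com>"]

print(repo_signals(paths, commits))
# ['claude.md', 'co-authored-by trailer', 'eval directory']
```

Matching on file names and trailer prefixes is deliberately crude; it flags repos worth a human look rather than scoring anyone.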
Replace "Boolean string" with a behavioral description
This is the meta-point. The interview moved from a test candidates could cram for to a behavioral one. Your sourcing has to move with it. A behavioral description ("engineer who has shipped an agent-backed product, writes thoughtful PR review comments on AI-generated diffs, currently at a company under 200 people") doesn't translate cleanly into Boolean. It translates cleanly into plain-English search, which is the whole reason that mode of querying now exists.
The market context, briefly
A CoderPad survey found that 42% of organizations now use AI in technical assessments, with vendors reporting 25 to 30% reductions in time-to-hire. But 71% of engineering leaders believe AI makes skills harder to assess, and 62% of organizations still prohibit AI use in live technical interviews. Meanwhile 63% of professional developers are currently using AI in their development process, and another 14% plan to start soon.
Read those numbers together. A majority of working engineers already build with AI. A majority of interview loops still ban it. The top of the market (Sierra, Meta, Google, Canva) has broken ranks and started grading the actual working behavior. Everyone else is interviewing for a job that increasingly does not exist.
The companies that will win the next two hiring cycles are the ones that fix sourcing first, before they touch the loop. Because once your loop grades Plan/Build/Review, your old pipeline doesn't just convert worse. It actively wastes onsite slots on candidates who were never going to pass.
FAQ
Is LeetCode-style prep actually dead for senior candidates?
Not dead, but increasingly irrelevant for the top-tier loops. Meta still has at least one coding round for senior candidates, and many companies haven't moved at all. But at Sierra it's gone entirely, at Meta the senior round is AI-assisted, and at Google it's drifting toward code comprehension. For agent engineer hiring specifically, optimizing your candidate pipeline for algorithms prep is fighting the last war.
What's the single best GitHub search to start with today?
Look for recent commits or PRs touching CLAUDE.md, .cursorrules, agents.md, or AGENTS.md in repos with non-trivial history, then cross-reference the author against LinkedIn for current company and tenure. That one filter approximates the bar set by Sierra's interview process better than any title-based search. The catch is that GitHub's native search is bad at cross-referencing identities to employers, which is the gap plain-English sourcing tools fill.
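The filter-then-cross-reference step might look like the sketch below once commit data has been collected. The record shapes, the `shortlist` helper, and the thresholds (180 days for "recent", 50 commits for "non-trivial history") are all assumptions chosen for illustration; the employer lookup stands in for whatever identity matching your tooling provides.

```python
from datetime import date, timedelta

# File names from the answer above, lowercased for case-insensitive matching.
SIGNAL_FILES = {"claude.md", ".cursorrules", "agents.md"}


def shortlist(commits, employers, today, max_age_days=180, min_repo_commits=50):
    """Map authors with a recent signal-file commit in a repo with real history
    to their current employer ("unknown" when no match is on file)."""
    cutoff = today - timedelta(days=max_age_days)
    hits = {}
    for commit in commits:
        touches_signal = any(
            path.rsplit("/", 1)[-1].lower() in SIGNAL_FILES
            for path in commit["files"]
        )
        is_recent = commit["date"] >= cutoff
        has_history = commit["repo_commit_count"] >= min_repo_commits
        if touches_signal and is_recent and has_history:
            hits[commit["author"]] = employers.get(commit["author"], "unknown")
    return hits


# Hypothetical records, invented for illustration.
today = date(2026, 5, 1)
commits = [
    {"author": "dev-a", "repo_commit_count": 420, "date": date(2026, 4, 2),
     "files": ["AGENTS.md", "src/api.py"]},
    {"author": "dev-b", "repo_commit_count": 3, "date": date(2026, 4, 20),
     "files": [".cursorrules"]},
    {"author": "dev-c", "repo_commit_count": 900, "date": date(2024, 1, 5),
     "files": ["docs/CLAUDE.md"]},
]
employers = {"dev-a": "Series B startup"}

print(shortlist(commits, employers, today))  # {'dev-a': 'Series B startup'}
```

Here dev-b drops out because the repo has almost no history, and dev-c because the commit is stale.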
How do I screen for "AI fluency" without it becoming a buzzword filter?
Stop searching for the phrase. Search for the artifact. Eval harnesses, golden-set tests, PR comments that critique AI-generated code, side-project repos that show iteration on agent steering. If a candidate's GitHub shows them rejecting a Copilot diff for a real reason, you have more signal than any certification or self-described headline will ever give you.
Does this mean the 4,050 "AI Engineer" titled candidates aren't worth sourcing?
They're worth sourcing, but they're not the prize. They're the obvious pool that every competitor is also working. The actual edge in sourcing AI engineers right now is the much larger group of senior generalists whose recent behavior shows AI fluency but whose title still reads "Senior Software Engineer." That's where the math is favorable, and that's where the new interview loops are quietly being filled from.