Refolk
May 19, 2026

IBM Research's Mellea Hire: Find the 60 Engineers Who Speak LLM and Coq

IBM Research is hiring LLM plus formal methods plus Rust engineers for Mellea. LinkedIn returns about 60. Here's where the real pool actually lives.

IBM Research dropped a comment into the May 2026 Hacker News "Who is hiring" thread asking for early-career scientists and engineers fluent in LLMs combined with programming languages, formal methods, or compilers, with Rust as a plus. Run that as a LinkedIn title search and you get roughly sixty profiles globally, most of them mislabeled. The job description is correct. The sourcing method everyone is reaching for is wrong.

The role IBM is actually staffing

The comment is tied to Mellea, a library built over the last year by Nathan Fulton and Hendrik Strobelt at the MIT-IBM Watson AI Lab. Mellea is the engineering tip of what IBM Research VP David Cox calls "generative computing," a worldview that splits computing into three phases: imperative (explicit instructions), inductive (learning from examples), and now generative. Cox's framing is not a marketing flourish. It is a hiring rubric.

The core technical pattern Strobelt describes is "instruct-validate-repair": send instructions to the model, validate what comes back against a set of requirements, repair on failure. The library ships a standard library of opinionated prompting patterns, sampling strategies for inference-time scaling, batteries-included verifiers, support for efficient checking of specialized requirements using activated LoRAs, and a @generative decorator that turns typed function signatures into LLM specifications. It integrates with OpenAI, Ollama, vLLM, HuggingFace, Watsonx, LiteLLM, and Bedrock, and now plugs structured validation into CrewAI multi-agent systems.
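To make the loop concrete, here is a minimal sketch of instruct-validate-repair in plain Python. It is not Mellea's API: `Requirement`, `Result`, `instruct_validate_repair`, and `stub_model` are hypothetical names invented for illustration, and the "model" is a stub standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


# Hypothetical types for illustration; these are not Mellea's classes.
@dataclass
class Requirement:
    description: str
    check: Callable[[str], bool]


@dataclass
class Result:
    output: Optional[str]   # None if no attempt passed validation
    attempts: int
    failed: List[str]       # descriptions of requirements still failing


def instruct_validate_repair(
    model: Callable[[str], str],
    instruction: str,
    requirements: List[Requirement],
    max_attempts: int = 3,
) -> Result:
    prompt = instruction
    failed: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = model(prompt)  # instruct
        # validate: collect every requirement the output does not meet
        failed = [r.description for r in requirements if not r.check(output)]
        if not failed:
            return Result(output, attempt, [])
        # repair: restate the instruction along with what went wrong
        prompt = instruction + "\nFix these problems: " + "; ".join(failed)
    return Result(None, max_attempts, failed)


def stub_model(prompt: str) -> str:
    # Stand-in for an LLM: rambles first, complies once it sees repair feedback.
    return "ok" if "Fix these problems" in prompt else "a very long rambling answer"


one_word = Requirement("must be a single word", lambda s: len(s.split()) == 1)
result = instruct_validate_repair(stub_model, "Answer in one word.", [one_word])
```

The point of the shape is the one Fulton makes later in this piece: a failure is either repaired or reported with the specific requirement it violated, so it is never silent.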

Read that paragraph again. "Verifiers." "Samplers." "Typed function signatures." "Activated LoRAs as checkers." This is type-theory and program-verification vocabulary applied to LLM I/O. It is not the vocabulary of an ML engineer who fine-tunes Llama on Modal.

Fulton's own bio gives the game away. He is a manager at the MIT-IBM Watson AI Lab, earned his PhD at Carnegie Mellon, and his published work is in formal verification of cyber-physical systems via KeYmaera X. His framing of why this hire matters:

A 10% failure rate when you don't know where the system will fail is unacceptable. If you're trying to automate a task where failure matters and there's no way to detect your failure modes, then it doesn't work.

That sentence is the entire job spec. It is verification-engineer language, and the candidates who think that way are not searchable by the keywords most recruiters reach for.

Why LinkedIn returns essentially nothing

Try the standard recruiter combinations. "LLM engineer" AND "Rust." "Machine learning" AND "formal methods." "AI" AND "Coq." Add a region filter. The result on a major professional-network index is effectively nothing: a few dozen matches worldwide once you strip out students who once took a class. The combination is so rare it does not register as a profession.

~60 LinkedIn profiles globally match LLM + formal methods + Rust as titled skills. The hybrid does not exist as a named job. The candidates do, just under different labels.

The bug is the title taxonomy, not the candidate pool. The right people exist. They are labeled "Research Engineer (Programming Languages)," "Verification Engineer," "Compilers PhD Student," "Formal Methods Scientist," or simply "Researcher." Their skills sections list Lean, Coq, Dafny, F*, CBMC, SMT, separation logic. Many of them have never put "LLM" anywhere on a profile even though they ship LLM tooling at work.

This is the classic shape of a sourcing problem we built Refolk to solve: the title is wrong, the skills are scattered across two unrelated taxonomies, and the only way to find the person is to describe them. Ask in plain English ("research engineer with PRs to Kani or Creusot who has also touched vLLM or LangChain in the last year"), and let the index do the joining across GitHub, LinkedIn, and the open web.

GitHub is the primary signal. LinkedIn is the cross-check.

For this stack, commit history beats job history. The candidates IBM Research wants are contributors to a short, named list of repositories:

  • Kani (github.com/model-checking/kani), AWS's bounded model checker for Rust.
  • Creusot, Xavier Denis's deductive verifier for safe Rust with Pearlite as a specification language.
  • Aeneas, Son Ho and Jonathan Protzenko's tool that translates Rust's intermediate representation to Coq, Lean, F*, and HOL4.
  • Gillian-Rust, the separation-logic hybrid verifier.
  • Verus, Prusti, RustBelt.
  • Mellea itself (github.com/generative-computing/mellea).
  • KeYmaera X, Fulton's own theorem prover for cyber-physical systems.
  • Lean mathlib and adjacent Coq libraries.
  • ESBMC, which the Rust Foundation flagged in 2026 as "an exciting new frontier for Rust safety and verification tooling," with the team working to integrate it as an alternative backend for Kani.

The unicorn is anyone with a PR into one of those verifiers in the last 24 months and a recent project touching vLLM, Ollama, HuggingFace, LangChain, or an agent framework. That intersection is what IBM is buying. It is also the intersection that no LinkedIn skills filter can express.
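That intersection is easy to state as code once you have the two activity lists. The sketch below assumes you have already exported, for each GitHub login, the date of their most recent contribution on each side. The logins and dates are made-up sample data, `recent` is a hypothetical helper, and the 12-month window for "recent" LLM work is an assumption (the 24-month verifier window comes from the paragraph above).

```python
from datetime import date, timedelta
from typing import Dict, Set


def recent(last_activity: Dict[str, date], today: date, months: int) -> Set[str]:
    """Logins whose latest activity falls within the last `months` months."""
    cutoff = today - timedelta(days=30 * months)  # approximate month length
    return {login for login, last in last_activity.items() if last >= cutoff}


# Made-up sample data: login -> date of most recent contribution.
verifier_prs = {
    "alice": date(2025, 11, 3),
    "bob": date(2022, 1, 15),
    "carol": date(2026, 2, 20),
}
llm_activity = {
    "alice": date(2026, 3, 1),
    "carol": date(2024, 1, 5),
    "dave": date(2026, 4, 9),
}

today = date(2026, 5, 19)
shortlist = recent(verifier_prs, today, 24) & recent(llm_activity, today, 12)
```

In this toy data only `alice` clears both windows. The logic is trivial. Assembling the two input lists is where the time goes.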

GitHub topic searches help: rust + formal-verification, smt-solver, bounded-model-checking. Contributor graphs on each repo's insights tab help more. But the manual cross-walk from "active Kani contributor" to "has a LinkedIn that says they're open to work" is brutal at any volume.
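For anyone scripting this rather than clicking through the UI, the sketch below builds the two GitHub REST API URLs involved: a repository search by topic and language, and a repository's contributor list. It only constructs URLs and makes no network calls. The helper names are hypothetical, and authentication, pagination, and rate limits are left out.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

API = "https://api.github.com"


def topic_search_url(topics: Iterable[str], language: Optional[str] = None,
                     per_page: int = 50) -> str:
    """URL for GitHub's repository search, filtered by topic and language."""
    qualifiers = [f"topic:{t}" for t in topics]
    if language:
        qualifiers.append(f"language:{language}")
    query = urlencode({"q": " ".join(qualifiers), "per_page": per_page})
    return f"{API}/search/repositories?{query}"


def contributors_url(owner: str, repo: str, per_page: int = 100) -> str:
    """URL for a repository's contributor list."""
    return f"{API}/repos/{owner}/{repo}/contributors?" + urlencode({"per_page": per_page})


search_url = topic_search_url(["formal-verification"], language="rust")
kani_url = contributors_url("model-checking", "kani")
```

Fetching these URLs with an authenticated client gives you the raw repo and contributor lists. The cross-walk to LinkedIn described above is still a separate step.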

"Generative computing" is hiring code for a specific conference circuit

When IBM publishes a worldview that LLMs need PL-style infrastructure (typed APIs, samplers, verifiers, activated LoRAs as checkers), they have signaled exactly which academic communities to recruit from. The conferences are not the main tracks of NeurIPS and ICML. They are POPL, PLDI, CAV, OOPSLA, and the smaller PL workshops held at NeurIPS. The mailing lists are coq-club and the Lean Zulip. The interest group is RFMIG, the Rust Formal Methods Interest Group.

If you are sourcing for generative computing engineers and your funnel does not touch any of those venues, you are fishing in the wrong lake. The standard MLSys recruiting funnel will surface people who can fine-tune a model and serve it on Bedrock. It will not surface anyone who can write a verifier that proves the output of that model satisfies a contract.

A practical shortcut: pull the program committees of CAV 2024 and 2025, the POPL 2025 author list, and the Rust Foundation's AWS-sponsored stdlib verification challenge participants. Cross-reference against GitHub activity in any LLM-adjacent repo from the last 12 months. The list you get back is small, named, and accurate.

The Rust signal is downstream of the verification signal

It is tempting to treat "Rust" as the binding constraint in the JD. It is not. IBM lists Rust as "a plus" because Rust is the language formal-methods PhDs are choosing in 2025 and 2026. Kani, Creusot, Verus, and RustBelt are all Rust-centric, and ESBMC is being integrated as an alternative backend for Kani. The Rust Foundation and AWS co-sponsor a stdlib verification effort partially underpinned by the aspiration "to demonstrate that formal verification is a way to keep AI programmers on track, by being able to verify programs they generate actually do what they say they do."

Filtering for Rust first is backwards. Filter for verification interest, and Rust falls out for free in most candidates under 35 who came up through the PL track. Filtering for Rust first will also flood you with embedded-systems and crypto-infra engineers who do not have the LLM half of the stack.

This is the second place sourcing Rust engineers in 2026 breaks down with conventional tools. The Rust population on LinkedIn is large enough to look productive (tens of thousands of self-identified Rust engineers), so a recruiter pulling on that string feels like they are making progress. They are. Just not toward the IBM Research role.

AWS Automated Reasoning is the comparable

If you want to know what IBM Research's Mellea team will look like in 18 months, study AWS's Automated Reasoning Group. They have been hiring this exact profile for years: Daniel Schwartz-Narbonne, Zyad Hassan, Rahul Kumar (who leads the Rust stdlib verification effort). ARG alumni are the warmest pool in the world for this kind of role, and they cluster in specific places when they leave: Galois, Microsoft Research RiSE, Meta's Probability and Infer teams, Anthropic's interpretability-adjacent groups, and increasingly the smaller applied shops like Modal and Chroma.

If you are recruiting against IBM Research, the move is not to fight them for Mellea contributors directly. The move is to source the next concentric ring: ARG alumni who left in the last 18 months, RiSE researchers whose last paper cited an LLM, and the 30-or-so Lean and Coq contributors who have recently pushed code into an LLM tooling repo. That ring is roughly an order of magnitude larger than the visible pool, and almost none of it shows up on a title search.

For founders trying to staff a verification-heavy AI team without IBM Research's brand pull, the realistic play is to describe the candidate to your sourcing tool in the same plain-English shape you would describe them to a colleague over coffee. Refolk runs that query across GitHub, LinkedIn, and the open web, surfaces the academic homepages and personal blogs that hold the real evidence, and returns a ranked shortlist rather than a 4,000-row CSV of false positives.

What to do in the next 30 days

  1. Save five queries, not one. "Kani contributors with vLLM commits." "Lean mathlib contributors with any LLM repo in the last year." "POPL 2024-2025 authors with industry email domains." "CAV program committee members under 40." "Rust Foundation stdlib verification challenge participants." Run them weekly.

  2. Read the Mellea repo's issue tracker and discussion forum. External contributors who file substantive bugs or land PRs are pre-qualified. They have already context-switched into the instruct-validate-repair model.

  3. Mine the AWS-Rust Foundation challenge winners. The list is public. Rahul Kumar's podcast appearances name several of them directly.

  4. Stop writing "LLM Engineer" as the job title. The candidates you want will not apply. "Research Engineer, Verification and Generative Systems" pulls the right resumes. So does "Programming Languages Engineer (Applied LLM)."

  5. Set a 90-day timer. IBM Research's HN post will not stay up. Mellea's contributor graph is still small enough that you can know every external committer by name. By Q3 2026, the early-mover window on this hybrid closes, and the pool starts negotiating against multiple offers.

The candidates exist. The job exists. The middle layer, the taxonomy that connects them, is the part that is broken, and it will stay broken on title-based tools. Describe the person. Search the commits. Then talk to them before IBM does.

FAQ

How big is the global pool of LLM plus formal methods plus Rust engineers right now?

Realistically a few hundred people who could do the job on day one, of which roughly sixty are findable by any combination of titles and skills on LinkedIn. The pool widens to a few thousand if you include verification PhDs and PL researchers who could pick up the LLM half in a quarter, and that wider ring is where most actual hires will come from. The IBM Research Mellea hiring effort is sized around early-career scientists for exactly this reason.

Why is Rust specifically listed as a plus if the role is about LLMs?

Because much of the formal-methods ecosystem has converged on Rust as a preferred target language in 2025 and 2026. Kani (AWS), Creusot, Verus, and Aeneas are all Rust-centric, ESBMC is being integrated as an alternative backend for Kani, and the Rust Foundation runs a verification-focused stdlib initiative co-sponsored by AWS. Filtering candidates by Rust experience first will mislead you. Filtering by verification work and observing that Rust appears in their stack as a side effect is the right order.

What's the fastest way to find program verification AI talent without paying agency fees?

Pull recent contributor lists from Kani, Creusot, Aeneas, Verus, and Mellea on GitHub. Cross-reference against POPL, PLDI, and CAV author lists from the last two cycles. Run those names against LinkedIn and personal sites to find current employer and contact info. This is a multi-hour manual job per query, which is why most teams use a sourcing tool that joins the three indexes for them.

Is "generative computing" just IBM marketing or a real hiring signal?

It is a real hiring signal. David Cox's three-phase framing (imperative, inductive, generative) maps directly onto which research community IBM is recruiting from, and the technical primitives in Mellea (verifiers, samplers, typed generative functions, activated LoRAs as checkers) are PL and verification primitives, not ML primitives. When a research lab publishes that level of detail about its agenda, treat the vocabulary as a candidate-targeting document.
