BombadilLabs

An applied AI research & product lab

Agents that finish what they start.

Bombadil Labs is an independent lab working on long-horizon autonomy — systems that hold up over thousands of steps, and the instruments that prove it. We publish the math, build the products, and take a small number of serious engagements.

Trusted by teams at

aws (Amazon Web Services)Microsoft

What we do

Three practices, one thesis: long horizons are won, not waited for.

Research

Long-horizon reliability is arithmetic before it's engineering. We work out the math of agents that hold up — horizons, verifiers, failure economics — and publish the instruments next to the claims.

Agentic R&D

Warden, Loremaster, Goldberry: products built end to end in the lab, run on our own harnesses, pointed at problems where autonomy compounds instead of merely impressing.

Consulting

A few embedded engagements at a time, for teams whose agentic systems have to work. Architecture, evaluation, and the judgment calls between them.

In the workshop

What we're building.

Three systems in active development, each a bet on the same thesis: the next order of magnitude in software comes from agents you can leave alone.

How we work

House rules, written in ink.

01
Long horizons or nothing.
Demos sprint; value runs marathons. We build for the thousandth step, where the interesting failures and the interesting products both live.
02
Every claim ships with its instrument.
If you can't check us, we haven't finished. The math in our field notes runs live on the page, against the same code we unit-test.
03
Buy nines where they're cheap.
Verifier nines cost an engineering sprint; actor nines cost a training run. Knowing which store to shop in is half the discipline.
04
Small, sharp tools.
Capability is subtraction done well. The best systems carry the fewest moving parts that still clear the bar.
05
Leave things better.
Codebases, datasets, teams — we hand back the keys in better shape than we found them.

Field notes

Dispatches from the lab notebook.

What we learn, written down while it's still sharp — with the instruments embedded, so you can check the math as you read.

Building something with a long horizon?

Tell us what you want to leave running unattended. We read everything that isn't a template, and we reply to everything we read.