AI RESEARCHER · WRITER · ADVISOR

Florin

building frontier AI

I research how large models reason, build systems that put them to work, and write — clearly — about where this is all going. No hype, just the signal.

0
Essays published
0
Newsletter readers
0
Years in AI
0
Total reads
◢◣ [ portrait.jpg ]
SF · Remote est. 2015
// ABOUT

Translating frontier research into things people can actually ship.

I'm Florin — an AI researcher and writer focused on large language models, reasoning, and the messy engineering that makes them useful. I've spent the last decade building machine-learning products and turning dense papers into working systems.

This is where I think out loud: deep dives, build logs, and honest takes on what's working, what isn't, and what's coming next.

AREAS OF EXPERTISE
Large Language Models Reasoning & Agents Retrieval (RAG) Model Evaluation Fine-tuning AI Strategy MLOps Multimodal

Building a research agent that actually finishes the job

Most "autonomous" agents stall halfway through anything that matters. Here's the architecture I landed on after a year of long-horizon experiments — planning, memory, and the verification loop that keeps it honest.

▸ Watch on YouTube · 18:42
Walkthrough — architecting a long-horizon research agent end to end.

Start with the verification loop, not the planner

Everyone reaches for the planner first. But a planner without a way to check its own work just produces confident, expensive nonsense. The component that made the difference for me was a verifier that scores each step against the original goal before the agent is allowed to move on.

Once the agent could tell when it was off-track, everything downstream got simpler. Re-planning became cheap. Memory stopped accumulating garbage. The video above walks through the exact graph — feel free to pause on the architecture diagram around the eight-minute mark.

"An agent that knows when it's wrong is worth ten that are merely fast."

In the full write-up I share the prompt scaffolding, the eval set I used to tune the verifier, and the failure cases that still trip it up. Subscribe below to get the next part — where I put this thing in production.

// ARCHIVE

Everything I've published