Carlos Hernández — Engineer & maker, San Salvador

I build production LLM systems — observable, versioned, costed, deletable.

Eight years in production infrastructure (DevOps and platform engineering at scale, and the teams behind them), now aimed at production AI. I lead AI Platform Engineering at EsePlus / Alilo, and work as a DevOps Engineer at Deckers Brands.

I Production work

Agnostic, composable AI engines · In production

Built at EsePlus / Alilo · Head of AI Platform Engineering

Two AI engines I designed and own end-to-end, exposed as agnostic, composable APIs that the product team builds into product experiences. One is an async generative-AI backend that turns uploaded documents into corporate microlearning: a multi-stage pipeline that coordinates document analysis, content generation, and activity design, with backpressure handling and streaming feedback. The other is a multi-tenant RAG conversational assistant with hybrid retrieval and reranking, modern embedding strategies, an agent runtime with multimodal tool calling, and a defense-in-depth safety stack validated by automated adversarial regression tests.

Highlights: 95% → 99.5% availability uplift · 20%+ cloud cost reduction · ~90% token savings on validation retries · sub-3s p50 latency · zero jailbreaks in adversarial testing.

II Field notes

Jun '26 When classical wins EN
4 min

I grew the training data 7.4× and the F1 score moved four hundredths of a point. A controlled NLP benchmark on what actually limits a problem, and on why the cheapest model that fits beats the impressive one that doesn't earn its cost.

Jun '26 A jailbreak is a regression, not a surprise EN
5 min

Most AI safety in production is one untested sentence in a system prompt. What changed when I started treating every attack that worked — fictional-framing, instruction-override, injection — as a regression test that runs forever. 442 cases, zero through.

Jun '26 Why "deletable" is the most important word in LLM systems EN
6 min

You can delete the document. Its embeddings, cached answers, and summaries stay behind. The four layers of deletability, the architecture call that makes "delete everything" one line, and the cache that outlived a tenant I thought I'd erased.

May '26 Prompts deserve migrations, not vibes EN
5 min

Schema migrations have versioning, rollback, and a paper trail. Most prompts in production don't. What changed for me when I started treating them the same way, and where it broke first.

Mar '26 What "observable" actually means for an LLM app EN
7 min

Infrastructure traces tell you the request landed. Quality metrics tell you the answer was wrong. Until both share an ID, you're guessing. A set of patterns from running this on call, and the one thing I'd build first if I had to start over.

Feb '26 My IDE finally feels like a coworker EN
8 min

"Tool calling with extra steps" is the wrong framing. The shift is smaller and weirder: the editor reads our Jira, our repo, our traces, and starts answering the questions I used to walk over to a teammate for.

III Research

Comparative NLP for transaction classification · In progress

Master's thesis · Universitat de Barcelona (OBS) · defense Sep 2026

Comparative evaluation of four NLP approaches for automatic bank-transaction categorization — from classical baselines (TF-IDF + SVM, XGBoost) and embedding-based neural networks to transformer fine-tuning (DistilBERT). An evaluation pipeline measuring precision, recall, F1, and inference latency across 68k+ labeled transactions. Read the field note →

IV Workshop

Small tools I build for my own life. Each one started because something annoyed me enough to write code about it.

Cadence

A local-first planner that marries GTD with time blocking. Built because every productivity app either fights your method or assumes you don't have one yet. It works offline, syncs across devices via Supabase Realtime, and puts every action behind ⌘K.

Next.js · Supabase · TypeScript · cadence.carloshdez.com

Fitness Dashboard

A personal training and body-composition tracker — split routine, recomposition goals, observable progress. The dashboard I wanted but couldn't find as an app.

React · Tailwind · personal

V Speaking

Cursor + MCP — when your IDE knows your stack

Internal tech talk · Recording available

2026.02

VI About

I'm a Senior AI Platform Engineer in El Salvador. I work from the same desk where I started in cloud infrastructure eight years ago. The systems have grown, and so have the teams behind them, but the principles haven't changed. Only the layer I apply them to has.

I care about systems you can operate, not just demo. About prompts that fail loudly. About observability that survives the third on-call rotation. About cost lines that don't blow up the first time a feature gets traction.

Outside of paid work, I build small tools for my own life — usually with the same disciplines, because I don't know how to build software any other way.