Index 001 · Liverpool, UK · MMXXVI

Jeorge Johns.

AI evaluation specialist and design engineer working at the seam between frontier models and physical systems.

250K+ Evaluations
98% Approval rate
§ 01

About - the work

I work in two parallel registers - frontier model evaluation and mechanical design engineering - and live where they overlap.

As a frontier model evaluator across OpenAI, Alphabet, Hugging Face, and Microsoft, I specialise in adversarial red-teaming, failure-mode taxonomy, and rubric design for reasoning and long-form tasks. More than a quarter of a million evaluations completed, with 98%+ approval sustained across all four platforms.

As a design engineer, I work on physical vapour deposition systems - designing sputtering equipment, vacuum components, and the surrounding mechanical hardware - with self-educated fluency in plasma physics, sputter deposition, and ultra-high vacuum engineering.

The overlap - evaluating AI on physical, technical, and engineering tasks where domain fluency is uncommon - is the work I do best.

§ 02

Practice - three pillars

I / Evaluation

Frontier model evaluation

Adversarial prompting, rubric design, failure-mode analysis. Specialism in engineering and physical-reasoning evaluation where most evaluators lack domain fluency.

Red-team · Rubric design · Long-form · Reasoning
II / Engineering

Design for hard tech

Mechanical design and full-lifecycle development of vacuum and PVD systems. Cathodic arc, magnetron source architecture, sputter deposition.

PTC Creo · Windchill PLM · UHV · PVD
III / Writing

Long-form prose

Two literary manuscripts in progress. Essays on AI evaluation methodology and industrial design now live on Substack.

Substack · Manuscript · Criticism
§ 03

Recent writing - Substack

§ 04

Elsewhere - get in touch