Expert Biography

Aparna Dhinakaran

Co-founder and Chief Product Officer at Arize AI, pioneering ML observability and LLM evaluation platforms. Forbes 30 Under 30 honoree and Forbes AI Columnist, bringing deep expertise from building production ML infrastructure at Uber, Apple, and TubeMogul.

Pioneer in AI Observability & Evaluation Engineering

Aparna Dhinakaran has established herself as a leading voice in the critical intersection of ML operations, agent evaluation, and production AI systems. Her work bridges the gap between cutting-edge research and practical deployment challenges at scale.

Current Work

As CPO of Arize AI, Aparna leads product strategy for the ML observability platform serving leading enterprises and AI-first companies. Her focus areas include:

  • LLM Evaluation Systems - Developing LLM-as-a-judge frameworks and prompt optimization techniques
  • Agent Observability - Monitoring and improving AI agents in production environments
  • Continuous Learning - Applying RL-inspired techniques to system prompt optimization, achieving a +10% improvement on SWE-bench
  • Evaluation Engineering - Advocating that evaluation infrastructure is as foundational as model training itself

She writes actively on Medium and the Arize AI blog, covering prompt learning, LLM evaluation methodologies, and production AI challenges. Her LinkedIn following of 31,000+ reflects her influence as a thought leader in the space.

Background

Prior to founding Arize AI, Aparna built critical ML infrastructure at scale:

  • Uber: Led development of Michelangelo, Uber’s core ML infrastructure platform, spending three years on both modeling and production deployment
  • Apple: ML engineer focused on production AI systems
  • TubeMogul (acquired by Adobe): ML engineer working on ad tech and ML deployment

Holds a B.S. in Electrical Engineering and Computer Science from UC Berkeley, where she published research with the Berkeley AI Research (BAIR) group. Currently on leave from Cornell University’s Computer Vision Ph.D. program to focus on building Arize AI.

Philosophy on Evaluation Engineering

Aparna’s work emphasizes a contrarian thesis in agent development:

Evaluation determines what’s possible - Better evaluation frameworks unlock better agent behavior. She advocates that eval engineering delivers higher ROI than most other optimization efforts.

LLM judges with explanations - Using LLMs as evaluators that provide structured feedback and explanations, not just scores. Opaque metrics don’t enable improvement; detailed feedback does.
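
For illustration, a minimal judge following this pattern might look like the Python sketch below. It uses the OpenAI SDK; the model choice, rubric, and 1-5 scale are placeholder assumptions, not Arize’s actual implementation:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any chat-capable LLM works

JUDGE_PROMPT = """\
You are evaluating an AI coding agent's answer.

Task: {task}
Agent answer: {answer}

Respond with a JSON object containing:
  "score": an integer from 1 (unusable) to 5 (fully correct)
  "explanation": a short paragraph naming concrete strengths and failures
"""

def judge(task: str, answer: str) -> dict:
    """LLM-as-a-judge: return a score *and* an explanation of the score."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice of judge model
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(task=task, answer=answer)}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
```

The explanation field is the point: it is what a downstream optimizer, or an engineer, can actually act on.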

Prompt-level optimization over weight-level - Demonstrating that systematic prompt learning using evaluation feedback can achieve significant gains (6-15% improvements) without expensive model fine-tuning or full RL.
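
A minimal sketch of one such prompt-learning step, reusing the judge above: here run_agent is a hypothetical stand-in for whatever agent harness is being optimized, and the score threshold and rewrite instruction are assumptions.

```python
def improve_prompt(system_prompt: str, examples: list[dict]) -> str:
    """One prompt-learning step: collect judge explanations on weak answers,
    then ask an LLM to rewrite the system prompt to address them."""
    feedback = []
    for ex in examples:
        answer = run_agent(system_prompt, ex["task"])  # hypothetical agent harness
        verdict = judge(ex["task"], answer)
        if verdict["score"] < 4:  # keep the explanations, not just the scores
            feedback.append("- " + verdict["explanation"])
    rewrite_request = (
        "Current system prompt:\n" + system_prompt
        + "\n\nJudge feedback on recent failures:\n" + "\n".join(feedback)
        + "\n\nRewrite the system prompt to fix the recurring failure modes. "
          "Return only the new prompt."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": rewrite_request}],
    )
    return response.choices[0].message.content
```

No gradients or fine-tuning are involved: the “update step” is a natural-language rewrite driven by evaluation feedback, which is what keeps the approach cheap relative to full RL.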

Arize AI

Arize AI is the leading ML observability and model monitoring platform for production AI systems. The platform provides real-time visibility into model performance, detects data drift and degradation, enables root cause analysis, and supports LLM evaluation and agent monitoring for organizations deploying AI at scale.

Conference Appearance

Event: AI Engineering Code Summit 2025
Date: November 21, 2025
Time: 4:00 PM - 4:20 PM
Session: Continual System-Prompt Learning for Code Agents

Aparna presented a practical approach to continuously improving AI agents through LLM-based evaluation and RL-inspired techniques applied to system prompts. Her methodology achieved 6-15% performance gains using only 150 examples, offering an accessible alternative to expensive full-scale RL training.
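
As a rough sketch of how such a gain might be measured, reusing the judge and prompt-learning step sketched earlier: this is not her actual pipeline, and the train/held-out split, iteration count, and pass threshold are assumptions, with dataset and BASELINE_SYSTEM_PROMPT as hypothetical inputs.

```python
def pass_rate(system_prompt: str, examples: list[dict]) -> float:
    """Fraction of examples the judge scores 4 or higher (threshold assumed)."""
    verdicts = [judge(ex["task"], run_agent(system_prompt, ex["task"]))
                for ex in examples]
    return sum(v["score"] >= 4 for v in verdicts) / len(verdicts)

train, held_out = dataset[:100], dataset[100:150]  # ~150 labeled examples in total
prompt = BASELINE_SYSTEM_PROMPT
baseline = pass_rate(prompt, held_out)
for _ in range(5):  # a handful of prompt-learning steps
    prompt = improve_prompt(prompt, train)
print(f"Held-out gain: {pass_rate(prompt, held_out) - baseline:+.1%}")
```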

