Natalie Serrino

Co-Founder

Gimlet Labs

Expert Biography

Co-Founder of Gimlet Labs, building AI infrastructure that makes agentic workloads 10X more efficient. Previously Founding Engineer at Pixie Labs (acquired by New Relic) and Observe Inc. Brown University Computer Engineering graduate.

Making AI Workloads 10X More Efficient

Natalie co-founded Gimlet Labs to solve the infrastructure crisis created by agentic AI systems. While traditional chat models are resource-intensive, agentic systems generate 5-15X more tokens and require heterogeneous compute orchestration across GPUs, memory-bound accelerators, and network-optimized nodes.

Gimlet’s platform automatically decomposes AI workloads into stages and maps each to the optimal hardware—decoupling agentic systems from specific accelerators. The company offers both a hosted platform and kforge, a standalone toolkit for developers.

Research: AI-Generated Kernels

Natalie co-authored groundbreaking research on AI-generated Metal kernels for PyTorch, demonstrating that frontier models can automatically optimize GPU code for Apple devices. Key findings:

  • 1.87x speedup across 215 PyTorch modules using AI-generated kernels
  • Agentic swarm approach (Best of N) outperforms single-model generation
  • Context enhancement (CUDA references, Metal profiling) nearly triples performance gains
  • Some workloads achieved 4.65x speedups through kernel fusion
  • Requires zero kernel engineering expertise—works automatically on existing PyTorch code

This work validates Gimlet’s core thesis: AI can automate performance optimization that previously required specialized GPU programming skills.

Background

Pixie Labs (2019-2023) - Founding Engineer on Kubernetes observability platform acquired by New Relic. Built scalable monitoring infrastructure for cloud-native environments.

Observe Inc. (2018-2019) - Founding Engineer on observability and data pipeline platform. Early expertise in large-scale data systems.

Benchmark Capital (2017-2018) - Entrepreneur in Residence, exploring startup opportunities in infrastructure and AI.

Trifacta (2013-2017) - Senior Software Engineer on data transformation platform (acquired by Alteryx).

Philosophy: Automated Performance Engineering

Natalie’s work embodies a shift from manual GPU optimization to AI-driven automation. Rather than requiring deep CUDA/Metal expertise, her approach:

  • Uses frontier models as kernel generation agents
  • Employs ensemble techniques for reliability
  • Validates correctness and performance automatically
  • Operates on existing codebases without framework changes

This mirrors her broader vision at Gimlet Labs: abstract away infrastructure complexity, letting AI handle hardware optimization while developers focus on building agentic applications.
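The generate, validate, and select loop listed above can be illustrated with a small, self-contained sketch. It is not Gimlet's implementation: the candidate "kernels" here are plain Python functions standing in for model-generated code, and every name (`reference`, `candidate_fused`, `candidate_wrong`, `is_correct`, `median_runtime`, `best_of_n`) is hypothetical. It only shows the general Best of N idea: discard candidates whose output disagrees with a trusted baseline, then keep the fastest survivor.

```python
import time
from typing import Callable, List, Sequence

Kernel = Callable[[Sequence[float]], List[float]]


def reference(xs: Sequence[float]) -> List[float]:
    """Trusted baseline: computes 2*x + 1 one element at a time."""
    out = []
    for x in xs:
        out.append(x * 2.0 + 1.0)
    return out


def candidate_fused(xs: Sequence[float]) -> List[float]:
    """Hypothetical generated candidate: same math in a single pass."""
    return [x * 2.0 + 1.0 for x in xs]


def candidate_wrong(xs: Sequence[float]) -> List[float]:
    """Hypothetical generated candidate with a bug (drops the +1)."""
    return [x * 2.0 for x in xs]


def is_correct(candidate: Kernel, baseline: Kernel,
               inputs: Sequence[float], tol: float = 1e-6) -> bool:
    """Accept a candidate only if it runs and matches the baseline within tol."""
    try:
        got = candidate(inputs)
    except Exception:
        return False
    want = baseline(inputs)
    return len(got) == len(want) and all(
        abs(g - w) <= tol for g, w in zip(got, want)
    )


def median_runtime(fn: Kernel, inputs: Sequence[float], repeats: int = 5) -> float:
    """Median wall-clock time over a few runs, to damp timing noise."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(inputs)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def best_of_n(candidates: Sequence[Kernel], baseline: Kernel,
              inputs: Sequence[float]) -> Kernel:
    """Return the fastest correct candidate, or the baseline if none beats it."""
    best, best_time = baseline, median_runtime(baseline, inputs)
    for cand in candidates:
        if not is_correct(cand, baseline, inputs):
            continue  # incorrect candidates are never timed or selected
        t = median_runtime(cand, inputs)
        if t < best_time:
            best, best_time = cand, t
    return best


data = [float(i) for i in range(10_000)]
chosen = best_of_n([candidate_wrong, candidate_fused], reference, data)
```

Falling back to the baseline when no candidate is both correct and faster means the selection step can never make results wrong or slower than the starting point, which is the property an automated pipeline of this kind would need before it could run unattended.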

Conference Appearance

Event: AI Engineering Code Summit 2025
Date: November 21, 2025
Time: 2:45 PM - 3:04 PM
Session: Using AI-Generated Kernels to Instantly Speed Up PyTorch

Demonstrated how AI can automatically generate and optimize compute kernels for custom PyTorch operations without manual engineering effort. Covered kernel generation using transformer architectures, validation approaches, and real-world performance improvements.


Last Updated: November 24, 2025
