ML Engineer - Agent Infrastructure
Engineering
San Francisco, CA
Full-time
About the role
We're building Monad, our agent runtime that orchestrates complex workflows with memory, reasoning, and tool integration. You'll own critical infrastructure that powers AI systems across clinical research, cybersecurity, and scientific computing.
What you'll do
- Build and maintain the Monad DAG execution engine
- Implement trace logging and observability for agent workflows
- Design tool integration patterns for domain-specific capabilities
- Optimize inference latency and throughput for production workloads
- Work closely with the research team to integrate new memory and reasoning capabilities
Requirements
- 5+ years of software engineering experience, with a focus on ML systems
- Strong Python skills; experience with async programming
- Experience building production ML/AI systems at scale
- Understanding of LLM inference, tokenization, and prompting
- Familiarity with distributed systems and message queues
Nice to have
- Experience with agent frameworks (LangChain, AutoGPT, etc.)
- Background in compilers or runtime systems
- Contributions to open-source ML projects
