Frank Goortani
Staff AI/ML Engineer • Distributed Systems • Production LLM Systems
Seeking Staff/Principal IC Role at FAANG or FAANG-Tier Company • Cutting-Edge AI/ML Systems
Professional Summary
Staff-level AI/ML engineer with 25+ years building production systems at scale. Currently Solution Architect at Uber; designed and built the ELLE AI Decision Engine, a production LLM system processing millions of decisions across Uber's global platform and winner of multiple innovation awards. Deep expertise in distributed systems (Go microservices, Cadence workflows, Kafka), production LLM applications (LangChain, RAG, AI Agents), and full-stack engineering (React, TypeScript, Python, cloud-native). Proven ability to architect and ship complex systems serving millions of users with sub-200ms latency and 99.95%+ uptime.
Seeking Staff/Principal IC role at FAANG or FAANG-tier company working on cutting-edge AI/ML systems.
Tools, Frameworks, Libraries, Languages, Databases
AI/ML & LLM Frameworks
LangChain • LangGraph • FastMCP • MCP Development • CrewAI • Ollama • Prompt Engineering • Agentic Workflows • AI Agents • RAG • Multi-Agent Systems • ReAct Patterns • Self-Reflection • Model Fine-Tuning
Agentic Coding & AI Tools
Claude Code • Cursor • GitHub Copilot • Codex • Aider • AI-Assisted Development • MCP Server Development • 3-5x Productivity Gains
LLM Platforms
OpenAI (GPT-4, GPT-3.5) • Anthropic Claude (Sonnet, Opus) • Google Gemini • Cohere • Meta LLMs • Hugging Face Models • Open Source LLMs • Model Evaluation • Model Selection
Vector Databases & Embeddings
OpenSearch • Pinecone • Qdrant • Chroma • Weaviate • Embedding Models • Semantic Search • Vector Similarity Search • Re-ranking
Backend Frameworks
Python (FastAPI, LangFx, Pydantic, Uvicorn, async/await) • Go (YARPC, gRPC, microservices) • Java (Spring Boot, Spring Cloud) • Node.js (Express, NestJS) • C# (.NET Core)
Distributed Systems & Workflow
YARPC (Uber RPC) • gRPC • Apache Kafka • Cadence Workflows • Temporal • Apache Thrift • Protocol Buffers • Service Mesh (Istio) • Circuit Breakers • Distributed Tracing • Event-Driven Architecture
Frontend Frameworks
React 18 • TypeScript 5.7 • Fusion.js (Universal Rendering) • Next.js • TanStack React Query • Redux • Angular • BaseUI • Styletron • Server-Side Rendering • Micro-Frontends
Cloud Platforms
Google Cloud (GCP Cloud Run, Firestore, Firebase, GKE) • AWS (EC2, Lambda, S3, RDS, DynamoDB, ECS, EKS) • Azure (VMs, Functions, CosmosDB, AKS) • Serverless Architecture
Container & Orchestration
Docker • Kubernetes (K8s) • Helm • Istio • Rancher • Google Kubernetes Engine • Container Orchestration • Service Discovery • Health Checks
API & Integration
GraphQL • gRPC • REST • Protocol Buffers • Apache Thrift • Swagger/OpenAPI • OAuth2/OIDC • JWT • Webhooks • API Gateway • Rate Limiting
Data & Messaging
Apache Kafka (Kafka Streams, Kafka Connect) • Cadence • Temporal • Apache Spark (PySpark) • Presto • Hive • HDFS • Airflow • Stream Processing • Real-Time Analytics
Databases
Cassandra (Large-Scale Distributed) • MySQL • PostgreSQL • MongoDB • Redis (Caching, Rate Limiting) • DynamoDB • Elasticsearch • Firestore • CouchDB • SQL Server
DevOps & CI/CD
Bazel (Large-Scale Monorepo) • GitHub Actions • Jenkins • Bamboo • CircleCI • Buildkite • Terraform • Ansible • Infrastructure as Code • GitOps • Continuous Deployment
Monitoring & Observability
M3 (Uber Metrics Platform) • Prometheus • Grafana • Datadog • Splunk • Zap Logging • Tally Metrics • Jaeger (Distributed Tracing) • Kibana • Application Performance Monitoring
Security & Identity
Charter (Uber IAM) • SPIFFE/SPIRE (Workload Identity) • RBAC • OAuth2/OIDC • JWT • mTLS • PKI • Encryption • Security Middleware
Build & Tooling
Bazel • Yarn Workspaces • Vite • Webpack • Monorepo Architecture • Code Generation • Static Analysis
Skills and Achievements
Distributed Systems & Scale:
- Production Systems at Scale: Built systems serving millions of users with sub-200ms p95 latency and 99.95%+ uptime
- Microservices Architecture: Designed and deployed 40+ Go microservices using YARPC, gRPC, and service mesh patterns
- Event-Driven Architecture: Apache Kafka for real-time streaming, Cadence/Temporal for workflow orchestration
- Distributed Tracing: Jaeger integration for debugging complex multi-service requests across 30+ microservices
- Performance Optimization: Achieved 3x latency reduction on LLM inference, 60% API response time reduction with caching
- Reliability Engineering: Circuit breakers, health checks, fallback strategies for resilient service communication
- Data Pipelines: Apache Spark processing 2M+ events/day, Kafka Streams for real-time analytics
AI/ML & Production LLM Systems:
- Production LLM Systems: Multi-agent architectures, RAG pipelines, semantic search, prompt engineering at Uber scale
- AI Agents: ReAct patterns, self-reflection, multi-agent orchestration with LangGraph (6+ specialized agents)
- RAG Systems: OpenSearch vector store indexing 1M+ documents, semantic search with re-ranking
- Model Integration: OpenAI, Anthropic Claude, Google Gemini with fallback strategies and circuit breakers
- Prompt Engineering: 92% accuracy on privacy classification tasks with optimized prompts and few-shot learning
- Vector Databases: OpenSearch, Pinecone, Qdrant for semantic search and document retrieval
- ML Pipelines: Data preprocessing, feature engineering, model evaluation, continuous monitoring
Technical Leadership & Impact:
- Staff-Level IC: Technical leadership without direct reports, cross-team collaboration with 5+ platform teams
- System Design: Architected ELLE AI Decision Engine processing millions of decisions with 70% efficiency gain
- Innovation Awards: Won multiple Uber internal awards for production AI systems and automation workflows
- Quality Excellence: 96% test coverage across codebases, zero security incidents in 18+ months
- Cross-Functional Collaboration: Partnered with Identity, Workflows, Data, Messaging, Observability teams
- Technical Writing: Authored comprehensive technical documentation and architectural decision records (ADRs)
- Code Reviews: Conducted 500+ code reviews ensuring quality, security, and architectural consistency
Full-Stack Engineering:
- Frontend Excellence: React 18, TypeScript, universal rendering (Fusion.js), SSR achieving 40% faster page loads
- Backend Development: Python (FastAPI, async/await), Go (microservices), Java (Spring Boot)
- Type Safety: Protocol Buffers for RPC, TypeScript for frontend, Pydantic for Python
- Testing: 96% code coverage with Jest, Pytest, React Testing Library, Playwright E2E tests
- Build Systems: Bazel monorepo builds, Yarn Workspaces, CI/CD with GitHub Actions
- Observability: M3 metrics, Zap logging, Jaeger tracing integrated across all applications
Professional Experience
ELLE (EngsecLLM Engine) - AI Decision Engine - Staff-Level Impact
Scale & Complexity:
- Built production LLM system processing millions of security/privacy decisions across Uber's global platform
- Architected system handling 100K+ requests/month with sub-200ms p95 latency
- Designed multi-agent architecture with 6+ specialized agents (Technical Privacy Review, Security Guidance, Legal Review, DSAR Automation, Data Classification, Compliance Checking)
- Implemented RAG system with OpenSearch vector store indexing 1M+ documents (security policies, privacy regulations, compliance frameworks)
Technical Leadership:
- System Design: Designed distributed architecture using Python microservices, Go authorization layer, Cadence workflows, Kafka event streaming
- Cross-Team Collaboration: Partnered with 5+ platform teams (Identity, Workflows, Data, Messaging, Observability) to integrate with Uber infrastructure
- Performance Optimization: Optimized LLM inference, reaching 92% accuracy on privacy classification alongside a 3x latency reduction
- Observability: Built comprehensive monitoring with Zap logging, Tally metrics, M3 dashboards, Jaeger distributed tracing
Technical Achievements:
- Won multiple internal innovation awards for automating security/privacy workflows (70% reduction in manual review time)
- 96% test coverage across authorization middleware, LLM workflows, and RAG pipelines
- Zero security incidents since launch (18+ months in production)
- 99.95% uptime serving global compliance teams (500+ users)
Technologies: Python (FastAPI, LangChain, LangGraph), Go (YARPC, gRPC), OpenAI API, OpenSearch (vector store), Cadence workflows, Kafka, Cassandra, Redis, Protocol Buffers, Bazel, Kubernetes, M3, Jaeger
System Design Patterns:
- Multi-agent orchestration with LangGraph
- RAG with semantic search and re-ranking
- Circuit breakers and fallback strategies for LLM calls
- Authorization middleware with dual inbound/outbound protection
- Event-driven architecture with Kafka
- Distributed tracing for debugging microservices
Distributed Systems & Platform Engineering - Scale & Reliability
Contributed to 40+ Engineering Security projects in Uber's Go monorepo, focusing on distributed systems, authorization, and platform services:
Major Platform Contributions:
1. Charter (Authorization Platform):
- Integrated Charter Decision Service (Uber's IAM equivalent) for fine-grained access control
- Designed RBAC policies protecting 100+ GenAI platform APIs
- Built authorization middleware with 96% test coverage
- Implemented dual middleware architecture (inbound/outbound) achieving comprehensive endpoint coverage
2. SPIFFE/SPIRE (Workload Identity):
- Implemented service-to-service authentication using SPIFFE/SPIRE
- Integrated mTLS for secure communication between microservices
- Designed workload identity attestation for containerized services
- Built PKI for certificate rotation and management
3. USSO (Uber SSO):
- Integrated Single Sign-On across all internal tools
- Implemented OAuth2/OIDC flows with JWT validation
- Built session management and token refresh logic
- Designed secure token storage and rotation mechanisms
4. Distributed Systems Architecture:
- Go microservices using YARPC (Uber's RPC framework) and gRPC
- Cadence workflows for long-running asynchronous processes
- Kafka messaging for event-driven communication
- Cassandra and MySQL for distributed data storage
- Redis for caching and rate limiting
Technical Achievements:
- Resolved complex circular dependencies using an MCP Gateway pattern with fx.Lifecycle hooks
- Built distributed tracing integration with Jaeger for debugging multi-service requests spanning 30+ services
- Designed health check and circuit breaker patterns for resilient service communication
- Achieved 99.95% uptime for authorization services serving 10K+ requests/second
Technologies: Go (Golang), YARPC, gRPC, Protocol Buffers, Apache Thrift, Charter, SPIFFE/SPIRE, authfx, OAuth2, JWT, Cadence, Kafka, Cassandra, MySQL, Redis, Bazel, Docker, Kubernetes, Zap, Tally, Jaeger
Full-Stack Engineering at Scale - 9 Enterprise Web Applications
Led full-stack development of 9+ mission-critical privacy web applications serving Uber's global compliance teams:
Scale & Performance:
- Fusion.js universal rendering achieving 40% faster page load times (SSR + code splitting)
- Type-safe RPC communication via Protocol Buffers for 30+ backend services
- Comprehensive testing: 34+ Jest test suites with 96% code coverage, Playwright E2E tests
- Global deployment: Serving users across 70+ countries with sub-second page loads
- Micro-frontends: Modular deployment enabling independent team velocity
Applications:
- Elle Web - EngsecLLM Engine Frontend (React 18, TypeScript, LangChain UI integration)
- IRIS Web - Privacy Web Frontend (GDPR/CCPA compliance automation)
- DSAR Web - Subject Access Request Automation (60% efficiency gain)
- UDE (User Data Explorer) - Data discovery and lineage tracking
- Consents Manager Web - Contextual consent management (GDPR/CCPA compliant)
- Web CMP - IAB TCF v2 Compliant Consent Management Platform (70+ countries)
- TPRM QC Copilot Web - Third-Party Risk Management quality control
- Agentic AI Observability Web - Real-time AI agent monitoring and debugging
- Design Review Web - Privacy design review workflows
Technical Stack:
- React 18, TypeScript 5.7, Fusion.js (universal rendering)
- TanStack React Query (data fetching), Redux (state management)
- BaseUI design system, Styletron CSS-in-JS
- Protocol Buffers (RPC), GraphQL (Apollo Client)
- Jest (unit tests), Playwright (E2E), React Testing Library
- Bazel (monorepo builds), Yarn Workspaces, Vite
Architecture Patterns:
- Universal rendering with SSR and code splitting
- Micro-frontends for modular deployment
- Type-safe API contracts via Protocol Buffers
- Feature flags with Flipr for gradual rollouts
- Observability with M3 metrics and Zap logging
Python Monorepo - 50+ Security Infrastructure Projects - AI/ML & Data Engineering
Led development of 50+ security infrastructure projects spanning AI/ML, data engineering, and automation:
AI/ML Projects:
- LLM-powered security automation using LangChain, LangGraph, FastMCP (Model Context Protocol)
- RAG systems for security policy search (OpenSearch vector store, semantic embeddings)
- Multi-agent systems with self-reflection and ReAct patterns
- Container image remediation assistant (LangFx service with JIRA integration)
Data Engineering:
- Apache Spark pipelines for security event processing (2M+ events/day)
- Kafka streaming for real-time security alerts and anomaly detection
- Presto SQL queries for compliance reporting and analytics
Technologies: Python 3.9/3.11, LangChain, LangGraph, FastMCP, Apache Spark, Kafka, Presto, OpenSearch, Pinecone, Qdrant, Bazel, pytest
Scale & Impact:
- Team Leadership: Led a team of 10+ onshore and 50+ offshore developers for the homedepot.ca website
- Customer 360 Platform: Unified customer data platform serving 15M+ customers with personalized experiences
- Revenue Impact: Enabled $50M+ in targeted marketing revenue through personalized customer insights
- Performance: Improved customer satisfaction by 25% through personalized recommendations
Technical Achievements:
- Designed microservices architecture (25+ services) for real-time customer data processing
- Built Kafka/Spark pipeline processing 2M+ events/day for real-time analytics
- Implemented Angular Universal server-side rendering for SEO and social sharing (40% traffic increase)
- Architected a Redis caching layer, reducing API response times by 60%
- Led mobile DevOps automation using Jenkins and Fastlane, achieving 90% deployment automation
Technologies: Java Spring Boot, Kafka, Cassandra, Redis, React, Angular, AWS, Kubernetes, Jenkins
Scale & Impact:
- API Platform: Designed RESTful API platform serving 100+ microservices across insurance operations
- Mobile Architecture: Led iOS/Android native development for customer-facing mobile apps (2M+ users)
- Microservices Migration: Architected migration from monolith to microservices (Kubernetes, Docker, Rancher)
Technical Achievements:
- Designed event-driven architecture using Kafka for real-time policy updates and claims processing
- Built API gateway with rate limiting, authentication, and circuit breakers
- Implemented Kubernetes cluster with automated scaling and self-healing
Technologies: Java Spring, Kubernetes, Docker, Rancher, Kafka, PostgreSQL, Redis, iOS (Swift), Android (Kotlin)
Scale & Impact:
- Team Leadership: Managed team of 12+ developers delivering BI projects for banking operations
- Payment Processing: Real-time payments platform processing $2B+/year with 99.99% uptime
- Omnichannel Banking: React Native mobile + React web serving 8M+ users
Technical Achievements:
- Designed Banking Services Data Mart for OMEGA-PEGA project (PowerDesigner, SSIS, SSRS, SSAS)
- Implemented microservices migration (25+ services) reducing deployment time from weeks to hours
- Built real-time fraud detection system using streaming analytics
Technologies: Java Spring, React, React Native, PostgreSQL, Redis, Docker, Jenkins, SQL Server, SSIS/SSAS/SSRS
Enterprise Architecture:
- Designed complex Data Warehouse projects in Finance, Insurance, and Travel industries
- Implemented ESB web projects using SOA and TOGAF patterns (WCF, Web Services, .NET 4.5)
- Led SQL Server Infrastructure design and upgrades (clustering, performance optimization, HA/DR)
- Architected Enterprise Data Dictionary solution for data warehouse governance
Technologies: SQL Server 2012, SSIS/SSAS/SSRS, QlikView, C#, .NET 4.5, WCF, TOGAF, SharePoint, Windows Azure
Open Source & Community
Contributions:
- LangChain ecosystem contributions (bug fixes, feature development, documentation)
- Active in AI Engineer community (AI Engineer Summit, LangChain Discord, local meetups)
- Open source tools for AI/ML workflows and distributed systems patterns
Thought Leadership:
- LinkedIn: 13K+ followers, AI/ML thought leader focusing on production LLM systems, distributed systems
- Medium: @FrankGoortani - Published articles on LangChain, RAG systems, distributed architectures
- Speaking: Available for conference talks on production AI systems, distributed systems, and scale
Education and Professional Development
M.Sc. in Management
Amirkabir University of Technology (Tehran Polytechnic), Iran | 2003
B.Sc. in Computer Software Engineering
Amirkabir University of Technology (Tehran Polytechnic), Iran | 2001
Certifications:
- Certified Data Scientist, Johns Hopkins University, 2015
- PMP Certified (Project Management Professional), PMI, 2013
- TOGAF 9.1 Certified, The Open Group Architecture Framework, 2013
- ITIL Foundation, Mind Leaders, 2013
- MCPD - Web Application Development 2008, Microsoft, 2012
- MCITP - Database Developer 2008, Microsoft, 2012
- Agile Project Management Strategy and Analysis, 2013
- Data Warehousing Management, Mind Leaders, 2013
- OWASP Top 10, Security Compass, 2012
Key Strengths
- Staff-Level IC: Technical leadership without direct reports, cross-team collaboration, architectural decision-making
- Scale & Performance: Systems serving millions of users with sub-200ms latency, 99.95%+ uptime
- Distributed Systems: Deep expertise in Go microservices, Kafka, Cadence, distributed tracing, service mesh
- AI/ML Production: Shipped production LLM systems (LangChain, RAG, AI Agents) at Uber scale
- Full-Stack Depth: React/TypeScript frontend + Python/Go backend + cloud infrastructure
- Quality-Focused: 96% test coverage, zero security incidents, comprehensive observability
- System Design: Architected complex distributed systems, API platforms, workflow orchestration