Geoffrey Meric

Applied AI Engineer

I build & evaluate agentic systems, deploy them to production, and monitor & iterate on them.

  • LLMOps & MLOps (improving reliability and reducing latency)
  • Model development & evaluation, with a focus on LLM classification (intent & routing, named-entity recognition, tool usage, LLM-as-a-judge) and generation (RAG generation, agent planning)
  • A/B experimentation at scale
  • Improving programmatic answering capabilities (routing, retrieval, generation, guard-railing) and agent planning capabilities
  • Developing context engineering & experience personalization solutions
  • Digging into product usage & system performance data to identify issues & improvement opportunities and understand user wants

Experience

Applied AI Engineer

Autodesk

Mar 2025 - Present Montreal, Canada
  • Distributed Backend Engineering (AWS), Science and LLMOps for Autodesk's Commerce & Support Agent
  • Designing full-lifecycle feature improvement experiments, progressive rollout & success metric measurement strategies, configuring A/B tests on the LaunchDarkly platform, and performing causal inference analysis on outcomes
  • Developing a planning evaluation framework & named-entity recognition tool for ReAct LangChain Deep Agents
  • Primary MLOps responder in US timezones: monitoring operations, handling error spikes, vector DB outages & alerts, shipping hotfixes, and improving observability via CloudWatch, Dynatrace, and the Opik LLMOps/eval platform
  • Latency optimization: deploying ECS + DynamoDB context engineering infrastructure to parallelize context computation (e.g. conversation summarization), reducing answer latency by ~1s (~10%) across 15k daily conversations
  • Backfilling LLM-as-a-judge conversation evaluation data for ~1M past conversations using Airflow & the OpenAI Batch API; creating a dashboard that semantically clusters user requests & source documents, shows query performance (latency, errors, escalations, LLM-as-a-judge metrics) filtered by product & request type, and identifies content gaps
  • Creating personalization PoCs using inferred user intent from activity & recommender outputs to steer answering
  • Developing a synthetic evaluation pipeline using LLMs to generate test queries and ground-truth answers from support docs, distill RAG outputs into fact-only representations with stylistic noise removed, and classify responses as semantically equivalent, incomplete, hallucinated, or contradictory while measuring document retrieval recall
  • Fixing repeated client reinitialization that caused high latency & resource leaks by using singletons (10x Weaviate error reduction), rewriting high-latency event instrumentation services to buffer events asynchronously and send them in batches, and resolving faulty-data bugs that caused A/B test SRM (sample ratio mismatch), UX inconsistencies, and errors for all LATAM & Norway users

Machine Learning Engineer

Serifos Technologies

Sep 2024 - Dec 2024 Montreal, Canada
  • Developed online economic sentiment analysis tools for institutional clients using GPT and BART-NLI models
  • Scraped and analyzed online forum discussions (X, Reddit, etc.), using GPT and BART-NLI language models to identify salient topics and topic modeling to surface trending themes outside of predefined indicators
  • Built a Streamlit UI to visualize user sentiment over time across economic indicators and search comments via a vector database

Machine Learning Operations Engineering Intern

Autodesk

May 2024 - Aug 2024 Montreal, Canada
  • Created a Python-based programmatic evaluation system for Retrieval-Augmented Generation (RAG) models, using LLMs & NLI models to diagnose model failures and identify trends in queries that yield poor-quality responses
  • Improved conversation summarization by designing evaluation methods to compare summarization prompts

Software Development Intern

Autodesk

May 2023 - Aug 2023 Montreal, Canada
  • Built live job logging & monitoring tools for an Autodesk Maya cloud rendering plug-in & tripled its throughput

Software Engineering Intern

Procter & Gamble

May 2022 - Aug 2022 Geneva, Switzerland
  • Built customer data platform architecture visualization and CRM campaign operation automation dashboards
