OpenAI AI engineer interview questions (2026)
19 real interview questions reported by engineers who interviewed at OpenAI, spanning AI Engineering, AI Security Engineering, ML Engineering, and Software Engineering. Every question is scored against a golden answer on the things OpenAI actually grades (architecture, token efficiency, security, and correctness), not just whether your code runs.
OpenAI AI Engineering questions
- Multi-tenant RAG: Secure Vector Index Isolation for Enterprise
Design a multi-tenant RAG system that serves 50 enterprise customers from a shared vector index without leaking documents across tenants. Cover index layout, query-time filtering,
- Concurrent dictionary writes cause data loss without synchronization locks
A multi-agent research system has agents that write findings to a shared results dictionary. Occasionally results from one agent overwrite results from another and some findings ar
- Vector Embedding Normalization Missing Causes Low Similarity Scores
A vector search system returns completely irrelevant results even though the documents were indexed correctly. The similarity scores are suspiciously low for all queries. Find the
- Improve Low Embedding Accuracy Through Fine-tuning and Domain Adaptation
Suppose you are working with an OpenAI embedding model and, after benchmarking, its accuracy is low. How would you further improve the accuracy of the embedding-based search model?
- Streaming LLM tool calls: partial parsing, client buffering, progressive UI render
You need to ship a streaming chat UI backed by an LLM with tool use. Walk through how you handle partial tool calls in the stream and how the client renders them.
- ChatGPT Processing Pipeline: Request To Response Generation
Inside ChatGPT: What Happens After You Hit Enter?
- ChatGPT Product Strategy and Metrics Design
Why do you like ChatGPT? Who are its users, what metrics would you use to track its success, and how would you improve it?
- Detect and Prevent Model Overfitting and Underfitting Issues
What is overfitting or underfitting? Which models are most likely to experience this, and why?
- Enterprise RAG Pipeline: Scalable Search with Sub-200ms Latency Guarantees
You're building a RAG pipeline for enterprise document search. The corpus is 10 million documents updated daily. Design for sub-200ms p99 latency with cited sources.
- Implement a RAG pipeline with citation tracking
You are building a RAG pipeline for a legal research platform. The corpus is 2 million case law documents updated weekly. Users ask complex multi-part questions. Requirements: sub-
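As a practice aid for the concurrent dictionary writes question above, here is a minimal sketch of one common fix: guarding the read-modify-write on the shared dictionary with a lock. The names `SharedResults`, `add_finding`, `snapshot`, and `run_agents` are hypothetical, and the question's actual buggy code is not shown on this page, so treat this as an illustration under those assumptions rather than the golden answer.

```python
import threading


class SharedResults:
    """Hypothetical thread-safe wrapper around a shared results dictionary."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}

    def add_finding(self, topic, finding):
        # Reading the current list and writing back the extended list is a
        # read-modify-write. Without the lock, two agents can read the same
        # list and one agent's write can overwrite the other's.
        with self._lock:
            findings = self._results.get(topic, [])
            self._results[topic] = findings + [finding]

    def snapshot(self):
        # Return a copy so callers never iterate over state that is being mutated.
        with self._lock:
            return {topic: list(items) for topic, items in self._results.items()}


def run_agents(num_agents=8, findings_per_agent=200):
    """Simulate several agents writing findings for the same topic."""
    shared = SharedResults()

    def agent(agent_id):
        for i in range(findings_per_agent):
            shared.add_finding("topic", (agent_id, i))

    threads = [threading.Thread(target=agent, args=(n,)) for n in range(num_agents)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.snapshot()
```

Calling `run_agents(4, 100)` should leave every finding from every agent in the snapshot. An alternative design gives each agent its own key or queue and merges the results at the end, which avoids lock contention entirely.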
OpenAI Software Engineering questions
- Token Budget Manager for Multi-turn LLM Applications
Implement a top-level `solve(payload)` that estimates the token cost of a chat history. The solve contract: `payload` is a dict: `{"messages": [{"role": str, "content": str}],
- Semantic cache using embeddings for similarity matching
Implement a top-level `solve(payload)` for a semantic cache: return a stored response when a new query's embedding is similar enough (by cosine similarity) to a stored query's emb
- Token Counter Mismatch Between Dashboard Estimates And Actual LLM Costs
Your team's cost monitoring dashboard is showing LLM spend 40% lower than the actual Stripe invoices. The token counter used for cost estimation is producing wrong numbers. Find th
- Agent Function Calling JSON Generation Produces Malformed Output Intermittently
Your agent uses function calling to extract structured data from user messages. Occasionally the downstream system crashes because it receives malformed JSON. The bug is intermitte
- Agent hallucinating confident incorrect data from tool composition failures
A financial data agent calls three tools in sequence to generate investment summaries. In production it intermittently returns confident summaries with completely wrong numbers. Th
- Embedding Model Version Mismatch Between Development and Production Environments
A semantic search feature worked fine in development but crashes in production with a dimension mismatch error. A new engineer recently upgraded the embedding model to "improve qua
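For the semantic cache question above, the core idea (return a stored response when a new query's embedding is similar enough, by cosine similarity, to a stored one) can be sketched as follows. The full `solve(payload)` contract is truncated on this page, so this sketch does not attempt to reproduce it. The class `SemanticCache`, its methods, and the `0.9` threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        # Surfacing a dimension mismatch early is easier to debug than a
        # silently wrong score (for example after an embedding model upgrade).
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory cache keyed by query embeddings."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cutoff; tune per embedding model
        self._entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self._entries.append((list(embedding), response))

    def get(self, embedding):
        # Linear scan for the most similar stored query.
        best_score, best_response = -1.0, None
        for stored, response in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_response = score, response
        # Only a sufficiently similar entry counts as a hit.
        return best_response if best_score >= self.threshold else None
```

A short usage example: after `cache.put([1.0, 0.0], "hello")`, a lookup with a nearby vector such as `[0.99, 0.05]` is a hit, while an orthogonal vector such as `[0.0, 1.0]` is a miss. The linear scan is fine for a small cache; a larger one would likely use an approximate nearest-neighbor index instead.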
OpenAI AI Security Engineering questions
- Red-team harness for automated prompt-injection bypass discovery
Walk through how you would build a red-team harness that automatically discovers prompt-injection bypasses against a production agent.
- Secure GPT-4o Agent Against Indirect MCP Prompt Injection Attacks
Secure a GPT-4o agent with MCP tool access against indirect prompt injection. Your team is deploying a GPT-4o agent using the Model Context Protocol (MCP) with access to: terminal
OpenAI ML Engineering questions
- Handling Multicollinearity: Select Features From Highly Correlated Sets
How do you select input features for modeling when some features are highly correlated with each other?
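One simple approach to the correlated-features question above is a greedy filter: keep a feature only if its absolute Pearson correlation with every already-kept feature stays below a threshold. The sketch below assumes features arrive as a dict of equal-length numeric lists. The function names and the `0.9` threshold are hypothetical, and other answers (variance inflation factors, regularization, PCA) are equally valid.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedily keep features whose |correlation| with all kept features is below threshold.

    `features` maps a feature name to its list of values. Iteration order
    decides which member of a correlated group survives, so put the features
    you prefer (for example the more interpretable ones) first.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

With `{"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 10.5], "c": [5, 1, 4, 2, 3]}`, feature `b` is nearly a multiple of `a` and is dropped, while `c` is only weakly correlated with `a` and is kept.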
Prepare for your OpenAI interview
Velocode turns reported OpenAI interview questions into scored practice. Free accounts get the full community problem set and one OpenAI-tagged question per domain; Pro unlocks every company-verified question, the interview simulator, and your domain readiness radar.