**Azure OpenAI vs OpenAI**

Azure OpenAI refers to OpenAI models hosted on the Microsoft Azure platform. Models hosted on Azure come with added enterprise features, including support for keyless authentication with Microsoft Entra ID.
**Use `ChatOpenAI` with the v1 API (recommended)**

Azure OpenAI’s v1 API (generally available as of August 2025) allows you to use `ChatOpenAI` directly with Azure endpoints. This removes the need for dated `api-version` parameters and provides native support for Microsoft Entra ID authentication with automatic token refresh.

We continue to support `AzureChatOpenAI`; it now shares the same underlying base implementation as `ChatOpenAI`, which interfaces with OpenAI services directly.

This page serves as a quickstart for authenticating and connecting your Azure OpenAI chat models to LangChain.

## Overview
### Integration details
| Class | Package | Serializable | JS/TS support |
|---|---|---|---|
| `AzureChatOpenAI` | `langchain-openai` | beta | ✅ (npm) |
### Model features
| Tool calling | Structured output | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
|---|---|---|---|---|---|---|---|---|
| ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
## Setup

To access Azure OpenAI models you’ll need to create an Azure account, create a deployment of an Azure OpenAI model, get the name and endpoint for your deployment, and install the `langchain-openai` integration package.
### Installation
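Install the integration package; `azure-identity` is only needed if you plan to authenticate with Microsoft Entra ID:

```shell
pip install -U langchain-openai azure-identity
```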
Credentials
BothChatOpenAI and AzureChatOpenAI support authenticating to Azure OpenAI with either Microsoft Entra ID (recommended) or an API key.
#### Microsoft Entra ID

Microsoft Entra ID provides keyless authentication with automatic token refresh. Install the `azure-identity` package and create a token provider; the same provider works with both `ChatOpenAI` and `AzureChatOpenAI`:
#### API key

Head to the Azure docs to create your deployment and generate an API key. Set the `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` environment variables:
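For example (the endpoint below is a placeholder; use your own resource name):

```shell
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
```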
## Instantiation

### ChatOpenAI with v1 API

Set `base_url` to your Azure endpoint with `/openai/v1/` appended. With the v1 API you can call any model deployed in Microsoft Foundry (including OpenAI, Llama, DeepSeek, Mistral, and Phi) through a single interface by pointing `model` at your deployment name.
To authenticate with Entra ID (recommended), pass the token provider to `api_key`; otherwise supply an API key directly.

### AzureChatOpenAI
Use `AzureChatOpenAI` when working with traditional Azure OpenAI API versions that require `api_version`.
To authenticate with Entra ID (recommended), pass the token provider to `azure_ad_token_provider`; otherwise supply an API key.

## Invocation
## Tool calling

Bind tools to the model using Pydantic classes, dict schemas, LangChain tools, or functions:
## Build an agent

Use `create_agent` to build an agent with Azure OpenAI and tools:
## Streaming usage metadata

OpenAI’s Chat Completions API does not stream token usage statistics by default (see the OpenAI API reference for stream options). To recover token counts when streaming, set `stream_usage=True` as an initialization parameter or on invocation:
## Responses API

Azure OpenAI supports the Responses API, which provides stateful conversations, built-in server-side tools (code interpreter, image generation, file search, and remote MCP), and structured reasoning summaries. `ChatOpenAI` automatically routes to the Responses API when you set the `reasoning` parameter, or you can opt in explicitly with `use_responses_api=True`:
Authentication works the same as above: Entra ID (recommended) or an API key.
## Reasoning effort and summary

Azure OpenAI reasoning models (for example, `o4-mini`, `gpt-5`) spend extra tokens thinking through a request before producing their final answer. With `ChatOpenAI` on the v1 API, you can configure how much effort the model spends reasoning and optionally request a summary of its chain of thought.
### Reasoning effort

Set `reasoning_effort` to `"low"`, `"medium"`, or `"high"`. Higher settings let the model spend more tokens on reasoning, which typically improves quality for complex tasks at the cost of latency:
Reasoning models use tokens for internal reasoning (`reasoning_tokens` in `completion_tokens_details`). These tokens aren’t returned in the message content but count toward the output token limit. If you see empty responses, increase `max_tokens` or leave it unset so the model has room for both reasoning and output.

### Reasoning summary
When using a reasoning model via the Responses API, you can request a summary of the model’s chain of thought by passing a `reasoning` dict. Setting `reasoning` automatically routes `ChatOpenAI` to the Responses API:
Even when enabled, reasoning summaries aren’t guaranteed for every step or request—this is expected behavior.
## Specifying model version (legacy API)
This section applies only when using `AzureChatOpenAI` with traditional API versions; the v1 API does not require `api_version` parameters.

With `AzureChatOpenAI`, Azure OpenAI responses contain a `model_name` response metadata property. Unlike native OpenAI responses, it does not contain the specific version of the model, which is set on the deployment in Azure. Pass `model_version` to distinguish between different versions:
## API reference

For detailed documentation of all features and configuration options, head to the `AzureChatOpenAI` API reference.

