Every major .NET framework has Microsoft-provided abstractions that make your code portable: ILogger for logging, IDistributedCache for caching, IHttpClientFactory for HTTP. Microsoft.Extensions.AI does the same thing for AI — it’s the standard abstraction layer for chat, embeddings, and AI middleware in .NET.
Build against it once, deploy with any provider. Here’s how it all works.
Table of Contents
- Why MEAI Exists: The Problem It Solves
- Core Interfaces: IChatClient and IEmbeddingGenerator
- Setup and Provider Registration
- Chat Completion in Practice
- Streaming Responses
- Embeddings and Semantic Search
- Middleware Pipeline
- Structured Output
- Function Calling / Tool Use
- Supported Providers
- FAQ
- Conclusion
Why MEAI Exists: The Problem It Solves
Before MEAI, every AI provider had its own SDK with different method signatures, different streaming patterns, different error types, and different configuration approaches. Switching from OpenAI to Azure OpenAI meant touching dozens of files. Using a local model in dev and a cloud model in production required conditional code paths everywhere.
The MEAI Solution
MEAI defines two core interfaces that all providers implement:
- IChatClient — for text generation, chat completion, function calling
- IEmbeddingGenerator&lt;TInput, TEmbedding&gt; — for generating vector embeddings
Your application code uses only these interfaces. The provider implementation is registered in the DI container — change one line to switch providers.
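The "local model in dev, cloud model in production" problem collapses into a single registration decision. A minimal sketch, assuming the provider packages covered later in this article and illustrative configuration keys:

```csharp
// Program.cs: pick the provider per environment; the rest of the app
// only ever sees IChatClient. (Config keys here are illustrative.)
var builder = WebApplication.CreateBuilder(args);

if (builder.Environment.IsDevelopment())
{
    // Local Ollama model; no API key, no per-token cost
    builder.Services.AddSingleton<IChatClient>(_ =>
        new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2"));
}
else
{
    // Cloud provider in production
    builder.Services.AddSingleton<IChatClient>(_ =>
        new OpenAIClient(builder.Configuration["OpenAI:ApiKey"]!)
            .AsChatClient("gpt-4o-mini"));
}
```

Every service downstream injects `IChatClient` and never knows which branch ran.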
Core Interfaces: IChatClient and IEmbeddingGenerator
IChatClient
public interface IChatClient : IDisposable
{
// Single completion (full response when done)
Task<ChatCompletion> CompleteAsync(
IList<ChatMessage> chatMessages,
ChatOptions? options = null,
CancellationToken cancellationToken = default);
// Streaming completion (tokens as they arrive)
IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
IList<ChatMessage> chatMessages,
ChatOptions? options = null,
CancellationToken cancellationToken = default);
ChatClientMetadata Metadata { get; }
}
Key Types
// ChatMessage — represents one turn in a conversation
var messages = new List<ChatMessage>
{
new ChatMessage(ChatRole.System, "You are a helpful .NET developer assistant."),
new ChatMessage(ChatRole.User, "What is the difference between Task and ValueTask?"),
};
// ChatRole values: System, User, Assistant, Tool
// ChatOptions — configure the request
var options = new ChatOptions
{
Temperature = 0.7f,
MaxOutputTokens = 500,
StopSequences = ["---END---"],
ModelId = "gpt-4o-mini" // Override the default model
};
// ChatCompletion — the response
ChatCompletion completion = await chatClient.CompleteAsync(messages, options);
string responseText = completion.Message.Text;
int tokensUsed = completion.Usage?.TotalTokenCount ?? 0;
Setup and Provider Registration
Install Packages
# Core abstractions (always needed)
dotnet add package Microsoft.Extensions.AI
# OpenAI provider
dotnet add package Microsoft.Extensions.AI.OpenAI
# Azure AI Inference provider (Azure OpenAI, GitHub Models)
dotnet add package Microsoft.Extensions.AI.AzureAIInference
# Ollama provider (local models)
dotnet add package Microsoft.Extensions.AI.Ollama
Register in DI — ASP.NET Core
// Program.cs
var builder = WebApplication.CreateBuilder(args);
// Option A: OpenAI
builder.Services.AddSingleton<IChatClient>(_ =>
new OpenAIClient(builder.Configuration["OpenAI:ApiKey"]!)
.AsChatClient("gpt-4o-mini"));
// Option B: Azure OpenAI
builder.Services.AddSingleton<IChatClient>(_ =>
new AzureOpenAIClient(
new Uri(builder.Configuration["Azure:OpenAI:Endpoint"]!),
new AzureKeyCredential(builder.Configuration["Azure:OpenAI:ApiKey"]!))
.AsChatClient("gpt-4o"));
// Option C: Ollama (local, no API key)
builder.Services.AddSingleton<IChatClient>(_ =>
new OllamaChatClient(
new Uri("http://localhost:11434"),
"llama3.2"));
// Inject and use the same way regardless of provider
builder.Services.AddScoped<IAiAssistantService, AiAssistantService>();
Using IChatClient in a Service
public class AiAssistantService : IAiAssistantService
{
private readonly IChatClient _chatClient;
public AiAssistantService(IChatClient chatClient)
{
_chatClient = chatClient;
}
public async Task<string> AnswerQuestionAsync(string question, CancellationToken ct = default)
{
var messages = new List<ChatMessage>
{
new(ChatRole.System, "You are a concise .NET expert. Answer in under 150 words."),
new(ChatRole.User, question)
};
var completion = await _chatClient.CompleteAsync(messages, cancellationToken: ct);
return completion.Message.Text ?? string.Empty;
}
}
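Because the service depends only on IChatClient, it can be unit tested with no network calls at all. A sketch using a hand-rolled fake — the FakeChatClient type is invented here for illustration, and the constructor shapes follow the interface as shown above (they may differ slightly across package versions):

```csharp
// A minimal fake IChatClient that returns a canned reply,
// enough to unit test AiAssistantService without hitting a provider.
public sealed class FakeChatClient : IChatClient
{
    private readonly string _reply;
    public FakeChatClient(string reply) => _reply = reply;

    public Task<ChatCompletion> CompleteAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        Task.FromResult(new ChatCompletion(
            new ChatMessage(ChatRole.Assistant, _reply)));

    public IAsyncEnumerable<StreamingChatCompletionUpdate> CompleteStreamingAsync(
        IList<ChatMessage> chatMessages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        throw new NotSupportedException();

    public ChatClientMetadata Metadata { get; } = new("fake");
    public void Dispose() { }
}

// Usage in a test:
// var service = new AiAssistantService(new FakeChatClient("ValueTask avoids allocation."));
// var answer = await service.AnswerQuestionAsync("Task vs ValueTask?");
// Assert.Equal("ValueTask avoids allocation.", answer);
```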
Chat Completion in Practice
Maintaining Conversation History
public class ConversationSession
{
private readonly IChatClient _client;
private readonly List<ChatMessage> _history = new();
public ConversationSession(IChatClient client, string systemPrompt)
{
_client = client;
_history.Add(new ChatMessage(ChatRole.System, systemPrompt));
}
public async Task<string> SendAsync(string userMessage, CancellationToken ct = default)
{
_history.Add(new ChatMessage(ChatRole.User, userMessage));
var completion = await _client.CompleteAsync(_history, cancellationToken: ct);
var response = completion.Message;
_history.Add(response); // Add assistant reply to history
return response.Text ?? string.Empty;
}
public void ClearHistory() =>
_history.RemoveAll(m => m.Role != ChatRole.System);
}
Content Parts — Text and Images
// Multimodal message with image
var message = new ChatMessage(ChatRole.User)
{
Contents = new List<AIContent>
{
new TextContent("What's wrong with this code?"),
new ImageContent(imageBytes, "image/png")
// Or: new ImageContent(new Uri("https://example.com/screenshot.png"))
}
};
var messages = new List<ChatMessage> { message };
var result = await _chatClient.CompleteAsync(messages);
Streaming Responses
// Stream tokens as they arrive
public async IAsyncEnumerable<string> StreamAnswerAsync(
string question,
[EnumeratorCancellation] CancellationToken ct = default)
{
var messages = new List<ChatMessage>
{
new(ChatRole.User, question)
};
await foreach (var update in _chatClient.CompleteStreamingAsync(messages, cancellationToken: ct))
{
if (update.Text is { } text && !string.IsNullOrEmpty(text))
yield return text;
}
}
// Consuming in ASP.NET Core — Server-Sent Events
app.MapGet("/stream", async (string question, IChatClient client, HttpResponse response) =>
{
response.Headers["Content-Type"] = "text/event-stream";
var messages = new List<ChatMessage> { new(ChatRole.User, question) };
await foreach (var update in client.CompleteStreamingAsync(messages))
{
if (update.Text is not null)
await response.WriteAsync($"data: {update.Text}\n\n");
}
});
Embeddings and Semantic Search
// Register embedding generator
builder.Services.AddSingleton<IEmbeddingGenerator<string, Embedding<float>>>(_ =>
new OpenAIClient(apiKey)
.AsEmbeddingGenerator("text-embedding-3-small"));
// Generate embeddings
public class SemanticSearchService
{
private readonly IEmbeddingGenerator<string, Embedding<float>> _embedder;
public SemanticSearchService(IEmbeddingGenerator<string, Embedding<float>> embedder)
{
_embedder = embedder;
}
public async Task<float[]> GetEmbeddingAsync(string text)
{
var result = await _embedder.GenerateAsync([text]);
return result[0].Vector.ToArray();
}
public async Task<List<Document>> SearchAsync(string query, List<Document> documents)
{
var queryEmbedding = await GetEmbeddingAsync(query);
return documents
.Select(doc => new
{
Document = doc,
Score = CosineSimilarity(queryEmbedding, doc.Embedding)
})
.OrderByDescending(x => x.Score)
.Take(5)
.Select(x => x.Document)
.ToList();
}
private static float CosineSimilarity(float[] a, float[] b)
{
float dot = 0, normA = 0, normB = 0;
for (int i = 0; i < a.Length; i++)
{
dot += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}
}
Middleware Pipeline
MEAI’s middleware pipeline is one of its most powerful features. Add cross-cutting concerns — logging, caching, retries, content filtering — without touching your application code.
builder.Services.AddSingleton<IChatClient>(sp =>
{
var openAiClient = new OpenAIClient(apiKey);
return openAiClient
.AsChatClient("gpt-4o-mini")
.AsBuilder()
// Logging — logs prompts and responses
.UseLogging(sp.GetRequiredService<ILoggerFactory>())
// Distributed cache — caches identical prompts
.UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
// OpenTelemetry traces
.UseOpenTelemetry(logContents: false) // set true only in dev
// Custom middleware
.Use(async (messages, options, next, ct) =>
{
// Pre-processing: inject a system message if none exists
if (!messages.Any(m => m.Role == ChatRole.System))
{
var withSystem = messages.Prepend(
new ChatMessage(ChatRole.System, "You are a helpful assistant.")).ToList();
return await next(withSystem, options, ct);
}
return await next(messages, options, ct);
})
.Build(sp);
});
Custom Middleware Class
public class RateLimitingChatClient : DelegatingChatClient
{
private readonly SemaphoreSlim _semaphore = new(1, 1);
private DateTime _lastCall = DateTime.MinValue;
private readonly TimeSpan _minInterval = TimeSpan.FromMilliseconds(500);
public RateLimitingChatClient(IChatClient innerClient) : base(innerClient) { }
public override async Task<ChatCompletion> CompleteAsync(
IList<ChatMessage> chatMessages,
ChatOptions? options = null,
CancellationToken cancellationToken = default)
{
await _semaphore.WaitAsync(cancellationToken);
try
{
var elapsed = DateTime.UtcNow - _lastCall;
if (elapsed < _minInterval)
await Task.Delay(_minInterval - elapsed, cancellationToken);
_lastCall = DateTime.UtcNow;
return await base.CompleteAsync(chatMessages, options, cancellationToken);
}
finally
{
_semaphore.Release();
}
}
}
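A custom DelegatingChatClient like this plugs into the same pipeline via the builder's Use(Func&lt;IChatClient, IChatClient&gt;) overload. A sketch, reusing the apiKey variable from the earlier registration snippets:

```csharp
builder.Services.AddSingleton<IChatClient>(sp =>
    new OpenAIClient(apiKey)
        .AsChatClient("gpt-4o-mini")
        .AsBuilder()
        // Wrap the inner client with the rate limiter defined above
        .Use(inner => new RateLimitingChatClient(inner))
        .Build(sp));
```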
Structured Output
// Return typed objects instead of raw text
public record ProductReview(
string Summary,
int SentimentScore, // 1-5
List<string> Pros,
List<string> Cons);
public async Task<ProductReview> AnalyzeReviewAsync(string reviewText)
{
var messages = new List<ChatMessage>
{
new(ChatRole.System, "Analyze the provided product review and return structured data."),
new(ChatRole.User, reviewText)
};
// MEAI structured output — model returns JSON matching the type
var result = await _chatClient.CompleteAsync<ProductReview>(messages);
return result.Result;
}
Function Calling / Tool Use
[Description("Gets the current weather for a location")]
static string GetWeather(
[Description("City name, e.g. 'London'")] string location)
{
// Call a real weather API here
return $"Sunny, 22°C in {location}";
}
// Register tools and let the model call them automatically
var options = new ChatOptions
{
Tools = [AIFunctionFactory.Create(GetWeather)],
ToolMode = ChatToolMode.Auto
};
var messages = new List<ChatMessage>
{
new(ChatRole.User, "What's the weather like in Tokyo right now?")
};
// With function-invocation middleware enabled on the client,
// MEAI runs the tool-call loop automatically
var completion = await _chatClient.CompleteAsync(messages, options);
Console.WriteLine(completion.Message.Text);
// Output: "The weather in Tokyo is currently Sunny at 22°C."
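One caveat: the automatic tool-call loop is itself middleware. If the registered client was not built with function invocation enabled, the model's tool requests come back to your code unexecuted. A sketch of the registration, using the builder pattern from the middleware section:

```csharp
builder.Services.AddSingleton<IChatClient>(_ =>
    new OpenAIClient(apiKey)
        .AsChatClient("gpt-4o-mini")
        .AsBuilder()
        // Executes tools from ChatOptions.Tools and keeps looping
        // until the model produces a final text answer
        .UseFunctionInvocation()
        .Build());
```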
Supported Providers
- OpenAI — Microsoft.Extensions.AI.OpenAI — GPT-4o, GPT-4o-mini, o1, o3
- Azure OpenAI — same package, different client constructor
- Azure AI Inference — Microsoft.Extensions.AI.AzureAIInference — GitHub Models, Azure AI Foundry models
- Ollama — Microsoft.Extensions.AI.Ollama — local Llama, Phi, Gemma, Mistral
- Google Gemini — via community package
- Anthropic (Claude) — via community package
- Any OpenAI-compatible API — use the OpenAI provider with a custom base URL
// Use any OpenAI-compatible endpoint (Groq, Together AI, etc.)
var client = new OpenAIClient(
new ApiKeyCredential("your-groq-key"),
new OpenAIClientOptions { Endpoint = new Uri("https://api.groq.com/openai/v1") }
).AsChatClient("llama-3.1-70b-versatile");
FAQ
How is MEAI different from using the OpenAI SDK directly?
The OpenAI SDK gives you full access to OpenAI-specific features (fine-tuning, assistants API, moderation). MEAI gives you provider portability with a simpler API covering 80% of use cases. Use MEAI for standard chat/embedding scenarios; drop down to the provider SDK when you need provider-specific features.
Does MEAI support function calling with multiple tools?
Yes — register multiple tools in ChatOptions.Tools. The model selects which tool to call, the function-invocation middleware executes it, and the loop continues until the model returns a final text response, so multi-step tool chains work without extra code on your side.
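For instance, two independently described tools can sit side by side in the same options object (both functions here are hypothetical stand-ins for real implementations):

```csharp
[Description("Gets the current weather for a location")]
static string GetWeather([Description("City name")] string location) =>
    $"Sunny, 22°C in {location}";

[Description("Converts an amount between currencies")]
static decimal ConvertCurrency(
    [Description("Amount to convert")] decimal amount,
    [Description("ISO code, e.g. 'USD'")] string from,
    [Description("ISO code, e.g. 'JPY'")] string to) =>
    amount * 150m; // placeholder rate; call a real FX API here

var options = new ChatOptions
{
    Tools =
    [
        AIFunctionFactory.Create(GetWeather),
        AIFunctionFactory.Create(ConvertCurrency)
    ]
};
// A question like "What's the weather in Tokyo, and what is 100 USD
// in yen?" can now trigger both tools within one conversation turn.
```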
Is MEAI stable for production use in 2026?
Yes — MEAI shipped its first previews in late 2024, reached a stable release in 2025, and is actively maintained by Microsoft. The core interfaces (IChatClient, IEmbeddingGenerator) are considered stable, with no breaking changes expected in minor versions.
Can I use MEAI in a .NET Framework project?
MEAI targets .NET Standard 2.0 for the core abstractions, so it works with .NET Framework 4.6.1+ (4.7.2 or later is recommended for consuming .NET Standard 2.0 libraries). However, provider packages may target .NET 8+ — check each provider package’s target framework.
How do I handle token limits and conversation context management?
MEAI doesn’t manage context automatically — you maintain the message list and decide what to include. For long conversations, implement a sliding window: keep the system prompt + last N messages. The ChatCompletion.Usage property reports token counts so you can track usage.
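A minimal sliding-window trim might look like the sketch below. The window size is an arbitrary example; a production budget should be based on token counts rather than message counts:

```csharp
// Keep the system prompt plus only the most recent N non-system messages.
static void TrimHistory(List<ChatMessage> history, int maxNonSystemMessages = 20)
{
    var system = history.Where(m => m.Role == ChatRole.System).ToList();
    var recent = history.Where(m => m.Role != ChatRole.System)
                        .TakeLast(maxNonSystemMessages)
                        .ToList();
    history.Clear();
    history.AddRange(system);
    history.AddRange(recent);
}
```

Call it before each request; for finer control, accumulate ChatCompletion.Usage token counts across turns and trim until the history fits the model's context budget.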
Conclusion
Microsoft.Extensions.AI is the ILogger of AI in .NET — a standard abstraction that makes your application code portable across providers, testable (mock IChatClient easily), and future-proof. Whether you use OpenAI today and switch to a local model tomorrow, or mix providers per environment, your application code stays unchanged.
Start with IChatClient for chat completion. Add the middleware pipeline for logging and caching. Use IEmbeddingGenerator when you need semantic search. These three pieces cover the majority of AI integration scenarios in .NET applications.
[INTERNAL_LINK: Adding AI chat to .NET MAUI with Microsoft.Extensions.AI] [INTERNAL_LINK: Azure OpenAI SDK for .NET guide]