How to use OpenAI for PDF in C# with IronPDF
IronPDF's AI extension enables OpenAI-powered PDF enhancement in C# applications. Add summarization, querying, and memorization features using Microsoft Semantic Kernel with minimal code.
OpenAI is an AI research laboratory that develops advanced artificial intelligence technologies. It provides powerful language models accessible through APIs, enabling developers to integrate AI capabilities into their applications.
The IronPdf.Extensions.AI NuGet package brings OpenAI to PDF processing: summarization, querying, and memorization. Built on Microsoft Semantic Kernel, this SDK simplifies AI service integration in .NET applications. Extract insights, answer questions, and generate summaries from PDF documents automatically.
Key use cases include processing large document volumes, extracting information from reports, creating quick-review summaries, and building intelligent document management systems. The integration supports both one-time summarization and continuous querying for various applications. For more PDF features, explore IronPDF's comprehensive documentation or learn about creating PDFs from HTML.
Quickstart: Summarize PDFs with IronPDF and OpenAI
Start integrating OpenAI into your PDF processing workflow with IronPDF in C#. This example demonstrates quick PDF summarization with just a few lines of code.
Install the IronPdf.Extensions.AI package with NuGet Package Manager, then copy and run this snippet:
// Install-Package IronPdf.Extensions.AI
await IronPdf.AI.PdfAIEngine.Summarize("input.pdf", "summary.txt", azureEndpoint, azureApiKey);
Minimal Workflow (5 steps)
- Download the C# library to utilize OpenAI for PDF
- Prepare the Azure Endpoint and API Key for OpenAI
- Import the target PDF document
- Use the Summarize method to generate a summary of the PDF
- Use the Query method for continuous querying
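Required packages:
- IronPdf (the main PDF library)
- IronPdf.Extensions.AI (the AI extension)
- Microsoft.SemanticKernel.Plugins.Memory (Semantic Kernel memory support)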
Before implementing AI features, set up Azure OpenAI. You need an Azure subscription with Azure OpenAI Service access. The service provides enterprise-grade security and compliance for production applications. See the IronPDF installation overview for detailed instructions.
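Rather than pasting the Azure endpoint and API key into source code, they can be read from the environment at startup. The following is a minimal sketch of that idea, not part of IronPDF: the variable names `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_API_KEY` and the helpers `RequireSetting` and `RequireHttpsEndpoint` are hypothetical names chosen for illustration.

```csharp
using System;

// Reads a required setting from the environment and fails fast if it is missing.
static string RequireSetting(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
        throw new InvalidOperationException($"Environment variable '{name}' is not set.");
    return value.Trim();
}

// Azure OpenAI endpoints are HTTPS URLs, so anything else is rejected early.
static Uri RequireHttpsEndpoint(string name)
{
    var raw = RequireSetting(name);
    if (!Uri.TryCreate(raw, UriKind.Absolute, out var uri) || uri.Scheme != Uri.UriSchemeHttps)
        throw new InvalidOperationException($"'{name}' must be an absolute https URL.");
    return uri;
}

try
{
    // Hypothetical variable names; use whatever your deployment defines.
    Uri azureEndpoint = RequireHttpsEndpoint("AZURE_OPENAI_ENDPOINT");
    string azureApiKey = RequireSetting("AZURE_OPENAI_API_KEY");
    Console.WriteLine($"Azure OpenAI endpoint configured: {azureEndpoint.Host}");
}
catch (InvalidOperationException ex)
{
    // Report the configuration problem instead of sending a bad request later.
    Console.Error.WriteLine(ex.Message);
}
```

Failing at startup with a clear message tends to be easier to diagnose than an authentication error surfacing in the middle of a summarization call.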
How Do I Summarize PDFs with OpenAI?
To use OpenAI features, configure the Semantic Kernel with your Azure Endpoint and API Key. Import the PDF document and use the Summarize method to generate summaries. Download the sample PDF from the OpenAI for PDF Summarization Example.
The summarization feature works with various PDF types:
- Scanned documents (when combined with OCR)
- Complex layouts with multiple columns
- Documents containing images and tables
IronPDF extracts text content and processes it through the AI model. For different formats, see converting DOCX to PDF or converting Markdown to PDF.
Semantic Kernel marks many of its APIs as experimental and raises SKEXP compiler diagnostics for them. Suppress these in the project's .csproj file:
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <NoWarn>$(NoWarn);SKEXP0001,SKEXP0010,SKEXP0050</NoWarn>
  </PropertyGroup>
</Project>
Here's how to summarize a PDF using Semantic Kernel in C#:
using IronPdf;
using IronPdf.AI;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using System;
using System.Threading.Tasks;
// Setup OpenAI
var azureEndpoint = "<<enter your azure endpoint here>>";
var apiKey = "<<enter your azure API key here>>";
var builder = Kernel.CreateBuilder()
    .AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
    .AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey);
var kernel = builder.Build();
// Setup Memory
var memory_builder = new MemoryBuilder()
    // optionally use new ChromaMemoryStore("http://127.0.0.1:8000") (see https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/09-memory-with-chroma.ipynb)
    .WithMemoryStore(new VolatileMemoryStore())
    .WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey);
var memory = memory_builder.Build();
// Initialize IronAI
IronDocumentAI.Initialize(kernel, memory);
License.LicenseKey = "<<enter your IronPdf license key here>>";
// Import PDF document
PdfDocument pdf = PdfDocument.FromFile("wikipedia.pdf");
// Summarize the document
Console.WriteLine("Please wait while I summarize the document...");
string summary = await pdf.Summarize(); // optionally pass AI instance or use AI instance directly
Console.WriteLine($"Document summary: {summary}\n\n");
The code initializes both the Semantic Kernel and a memory store. Memory stores maintain context during continuous queries. Choose from:
- VolatileMemoryStore: In-memory storage for development and testing
- ChromaMemoryStore: Persistent vector database for production
- Other stores: Azure Cognitive Search, Qdrant, and more
For production, implement error handling and custom logging to track AI operations. Explore async and multithreading for processing multiple documents simultaneously.
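One way to combine error handling with concurrent processing is to cap the number of in-flight AI calls and record per-document failures instead of aborting the whole batch. The sketch below is illustrative only: `SummarizeAllAsync` is a hypothetical helper, and the `summarize` delegate is a stub so the code runs without credentials. In a real application the delegate would presumably load each file with `PdfDocument.FromFile` and call `Summarize` as in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Summarizes each path with at most `maxParallel` calls in flight.
// A failure on one document is recorded instead of aborting the batch.
static async Task<Dictionary<string, string>> SummarizeAllAsync(
    IEnumerable<string> paths, Func<string, Task<string>> summarize, int maxParallel)
{
    using var gate = new SemaphoreSlim(maxParallel);
    var tasks = paths.Select(async path =>
    {
        await gate.WaitAsync();
        try
        {
            return (path, result: await summarize(path));
        }
        catch (Exception ex)
        {
            // Log and keep going; the caller can inspect the error marker.
            Console.Error.WriteLine($"Failed to summarize {path}: {ex.Message}");
            return (path, result: $"ERROR: {ex.Message}");
        }
        finally
        {
            gate.Release();
        }
    }).ToList();

    var results = await Task.WhenAll(tasks);
    return results.ToDictionary(r => r.path, r => r.result);
}

// Stub standing in for the real AI call (assumption, for demonstration only).
Func<string, Task<string>> summarize = async file =>
{
    await Task.Delay(10);
    if (file.EndsWith("broken.pdf"))
        throw new InvalidOperationException("could not read file");
    return $"summary of {file}";
};

var summaries = await SummarizeAllAsync(
    new[] { "a.pdf", "broken.pdf", "c.pdf" }, summarize, maxParallel: 2);

foreach (var entry in summaries)
    Console.WriteLine($"{entry.Key}: {entry.Value}");
```

The semaphore limit is a tuning knob: a low value helps stay within service rate limits, while a higher one shortens total batch time.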
What Does the Summary Output Look Like?

The summary provides a concise document overview, extracting main topics, important facts, and relevant details. The AI model identifies and prioritizes significant content, enabling quick understanding of lengthy documents.
How Do I Query PDFs Continuously?
Single queries don't suit all scenarios. The IronPdf.Extensions.AI package offers a Query method for continuous queries. Build conversational interfaces, research tools, or document analysis applications where users ask multiple questions about the same document.
Continuous querying maintains conversation context, allowing follow-up questions and clarifications. Ideal for:
- Customer support systems referencing documentation
- Legal document analysis requiring clause interpretation
- Educational applications for studying complex materials
- Research tools extracting specific information
For enhanced processing, consider extracting text and images separately or implementing PDF compression to optimize large documents before AI processing.
Initialize the kernel, memory store, and IronDocumentAI exactly as in the summarization example, then call the Query method once per user question instead of calling Summarize.
The continuous query system uses embeddings to understand question semantics, providing accurate, contextual responses. Each query is processed against the document content, with the AI maintaining conversation history for increasingly relevant answers.
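The conversation loop around the Query method can be sketched as follows. This is an illustration under assumptions rather than the library's own sample: `RunQueryLoopAsync` is a hypothetical helper, and the `ask` delegate is a stub so the code runs without Azure credentials. With IronPDF initialized as in the summarization example, the delegate would presumably be `q => pdf.Query(q)`, assuming `Query` accepts the question text and returns the answer as a string.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Reads questions until an empty line or end of input, answering each via `ask`.
// Returns the number of questions answered.
static async Task<int> RunQueryLoopAsync(
    Func<string, Task<string>> ask, TextReader input, TextWriter output)
{
    int answered = 0;
    while (true)
    {
        output.Write("User Input: ");
        var question = input.ReadLine();
        if (string.IsNullOrWhiteSpace(question))
            break; // an empty line or end of input ends the session

        string response = await ask(question);
        output.WriteLine($"\n{response}");
        answered++;
    }
    return answered;
}

// Stub so the sketch runs on its own. Assumed real-world replacement:
//   Func<string, Task<string>> ask = q => pdf.Query(q);
Func<string, Task<string>> ask = q => Task.FromResult($"(stub answer to: {q})");

await RunQueryLoopAsync(ask, Console.In, Console.Out);
```

Passing the reader and writer as parameters keeps the loop independent of the console, so the same code can sit behind a web endpoint or a test harness.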
For optimal performance with large documents or concurrent users, implement caching strategies and explore IronPDF's performance optimization techniques. Consider rate limiting and proper license key management for production deployments.
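As one possible caching strategy, answers can be stored per document and question so that repeated questions skip the AI call entirely. The sketch below is a minimal in-memory version under stated assumptions: `AskCachedAsync` and `AskBackend` are hypothetical names, and `AskBackend` is a stub standing in for the real query call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var answers = new ConcurrentDictionary<string, Lazy<Task<string>>>();
int backendCalls = 0;

// Stub standing in for the real AI call (assumption, for demonstration only).
Task<string> AskBackend(string documentId, string question)
{
    backendCalls++;
    return Task.FromResult($"answer for {documentId}: {question}");
}

// Returns a cached answer when the same question was already asked about the document.
Task<string> AskCachedAsync(string documentId, string question)
{
    // Normalize so trivially different phrasings share one cache entry.
    string key = documentId + "\n" + question.Trim().ToLowerInvariant();
    // Lazy ensures the backend is invoked once per key even under concurrent requests.
    return answers.GetOrAdd(
        key, _ => new Lazy<Task<string>>(() => AskBackend(documentId, question))).Value;
}

string first = await AskCachedAsync("report.pdf", "What is the total?");
string second = await AskCachedAsync("report.pdf", "  what is the total?");

Console.WriteLine(first == second);                  // prints "True"
Console.WriteLine($"Backend calls: {backendCalls}"); // prints "Backend calls: 1"
```

This version never evicts entries and also caches failed calls, so a production cache would likely add expiry and remove faulted tasks before reuse.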
When handling sensitive documents, implement appropriate security measures. IronPDF offers various security and encryption options to protect PDFs before and after AI processing.
Frequently Asked Questions
What is the purpose of the AI extension for PDF processing?
The IronPdf.Extensions.AI NuGet package enables OpenAI-powered PDF enhancement in C# applications. It allows you to add summarization, querying, and memorization features to your PDFs using Microsoft Semantic Kernel with minimal code, helping extract insights and answer questions from documents automatically.
What are the key use cases for AI-powered PDF processing?
IronPDF's AI extension is ideal for processing large document volumes, extracting information from reports, creating quick-review summaries, and building intelligent document management systems. The integration supports both one-time summarization and continuous querying for various applications.
How can I quickly summarize a PDF using OpenAI?
With IronPDF's AI extension, you can summarize any PDF with just one line of code: await IronPdf.AI.PdfAIEngine.Summarize("input.pdf", "summary.txt", azureEndpoint, azureApiKey). This simple implementation makes it easy to generate summaries from PDF documents.
What packages do I need to install for AI PDF processing?
To implement AI features with IronPDF, you need three packages: IronPdf (the main PDF library), IronPdf.Extensions.AI (the AI extension), and Microsoft.SemanticKernel.Plugins.Memory (for semantic kernel functionality).
What are the prerequisites for using OpenAI with PDFs?
Before implementing AI features with IronPDF, you need to set up Azure OpenAI with an Azure subscription that has Azure OpenAI Service access. The service provides enterprise-grade security and compliance for production applications, requiring an Azure Endpoint and API Key.
What is the minimal workflow for AI PDF processing?
The minimal workflow with IronPDF consists of 5 steps: 1) Download the C# library, 2) Prepare the Azure Endpoint and API Key, 3) Import the target PDF document, 4) Use the Summarize method to generate a summary, and 5) Use the Query method for continuous querying.
How does the AI extension integrate with Microsoft Semantic Kernel?
IronPDF's AI extension is built on Microsoft Semantic Kernel, which simplifies AI service integration in .NET applications. This SDK handles the complexity of connecting to OpenAI services and provides a straightforward API for PDF-specific AI operations.






