How to use OpenAI for PDF

Chatgpt related to How to use OpenAI for PDF

OpenAI is an artificial intelligence research laboratory, consisting of the for-profit OpenAI LP and its non-profit parent company, OpenAI Inc. It was founded with the goal of advancing digital intelligence in a way that benefits humanity as a whole. OpenAI conducts research in various areas of artificial intelligence (AI) and aims to develop AI technologies that are safe, beneficial, and accessible.

The IronPdf.Extensions.AI NuGet package now enables OpenAI for PDF enhancement: summarization, querying, and memorization. The package utilizes Microsoft Semantic Kernel.

Nuget IconGet started making PDFs with NuGet now:

  1. Install IronPDF with NuGet

    PM > Install-Package IronPdf

  2. Copy the code

    IronPdf.AI.PdfAIEngine.Summarize("input.pdf", "summary.txt", azureEndpoint, azureApiKey);
  3. Deploy to test on your live environment

    Start using IronPDF in your project today with a free trial
    arrow pointer

Get started with IronPDF

Start using IronPDF in your project today with a free trial.

First Step:
green arrow pointer



Besides the IronPdf package, you will also need the following two packages:

Summarize PDF Example

To use the OpenAI feature, an Azure Endpoint, and an API Key are needed. Configure the Semantic Kernel according to the code example below. Import the PDF document and utilize the Summarize method to generate a summary of the PDF document. You can download the sample PDF file from the OpenAI for PDF Summarization Example.

Please note Note: You may encounter the SKEXP0001, SKEXP0010, and SKEXP0050 errors because the Semantic Kernel methods are experimental. Add the following code to your .csproj file to suppress these errors:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <NoWarn>$(NoWarn);SKEXP0001,SKEXP0010,SKEXP0050</NoWarn>
  </PropertyGroup>
</Project>
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <NoWarn>$(NoWarn);SKEXP0001,SKEXP0010,SKEXP0050</NoWarn>
  </PropertyGroup>
</Project>
XML

Here is an example of how you can summarize a PDF using the Semantic Kernel in C#:

:path=/static-assets/pdf/content-code-examples/how-to/openai-summarize.cs
using IronPdf;
using IronPdf.AI;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using System;
using System.Threading.Tasks;

// Setup OpenAI
var azureEndpoint = "<<enter your azure endpoint here>>";
var apiKey = "<<enter your azure API key here>>";
var builder = Kernel.CreateBuilder()
    .AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
    .AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey);
var kernel = builder.Build();

// Setup Memory
var memory_builder = new MemoryBuilder()
    // optionally use new ChromaMemoryStore("http://127.0.0.1:8000") (see https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/09-memory-with-chroma.ipynb)
    .WithMemoryStore(new VolatileMemoryStore())
    .WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey);
var memory = memory_builder.Build();

// Initialize IronAI
IronDocumentAI.Initialize(kernel, memory);

License.LicenseKey = "<<enter your IronPdf license key here";

// Import PDF document
PdfDocument pdf = PdfDocument.FromFile("wikipedia.pdf");

// Summarize the document
Console.WriteLine("Please wait while I summarize the document...");
string summary = await pdf.Summarize(); // optionally pass AI instance or use AI instance directly
Console.WriteLine($"Document summary: {summary}\n\n");
Imports Microsoft.VisualBasic
Imports IronPdf
Imports IronPdf.AI
Imports Microsoft.SemanticKernel
Imports Microsoft.SemanticKernel.Connectors.OpenAI
Imports Microsoft.SemanticKernel.Memory
Imports System
Imports System.Threading.Tasks

' Setup OpenAI
Private azureEndpoint = "<<enter your azure endpoint here>>"
Private apiKey = "<<enter your azure API key here>>"
Private builder = Kernel.CreateBuilder().AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey).AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey)
Private kernel = builder.Build()

' Setup Memory
Private memory_builder = (New MemoryBuilder()).WithMemoryStore(New VolatileMemoryStore()).WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
Private memory = memory_builder.Build()

' Initialize IronAI
IronDocumentAI.Initialize(kernel, memory)

License.LicenseKey = "<<enter your IronPdf license key here"

' Import PDF document
Dim pdf As PdfDocument = PdfDocument.FromFile("wikipedia.pdf")

' Summarize the document
Console.WriteLine("Please wait while I summarize the document...")
Dim summary As String = Await pdf.Summarize() ' optionally pass AI instance or use AI instance directly
Console.WriteLine($"Document summary: {summary}" & vbLf & vbLf)
$vbLabelText   $csharpLabel

Output Summary

Summarize PDF document

Continuous Query Example

A single query may not be suitable for all scenarios. The IronPdf.Extensions.AI package also offers a Query method that allows users to perform continuous queries.

:path=/static-assets/pdf/content-code-examples/how-to/openai-query.cs
using IronPdf;
using IronPdf.AI;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using System;
using System.Threading.Tasks;

// Setup OpenAI
var azureEndpoint = "<<enter your azure endpoint here>>";
var apiKey = "<<enter your azure API key here>>";
var builder = Kernel.CreateBuilder()
    .AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
    .AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey);
var kernel = builder.Build();

// Setup Memory
var memory_builder = new MemoryBuilder()
    // optionally use new ChromaMemoryStore("http://127.0.0.1:8000") (see https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/09-memory-with-chroma.ipynb)
    .WithMemoryStore(new VolatileMemoryStore())
    .WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey);
var memory = memory_builder.Build();

// Initialize IronAI
IronDocumentAI.Initialize(kernel, memory);

License.LicenseKey = "<<enter your IronPdf license key here";

// Import PDF document
PdfDocument pdf = PdfDocument.FromFile("wikipedia.pdf");

// Continuous query
while (true)
{
    Console.Write("User Input: ");
    var response = await pdf.Query(Console.ReadLine());
    Console.WriteLine($"\n{response}");
}
Imports Microsoft.VisualBasic
Imports IronPdf
Imports IronPdf.AI
Imports Microsoft.SemanticKernel
Imports Microsoft.SemanticKernel.Connectors.OpenAI
Imports Microsoft.SemanticKernel.Memory
Imports System
Imports System.Threading.Tasks

' Setup OpenAI
Private azureEndpoint = "<<enter your azure endpoint here>>"
Private apiKey = "<<enter your azure API key here>>"
Private builder = Kernel.CreateBuilder().AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey).AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey)
Private kernel = builder.Build()

' Setup Memory
Private memory_builder = (New MemoryBuilder()).WithMemoryStore(New VolatileMemoryStore()).WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
Private memory = memory_builder.Build()

' Initialize IronAI
IronDocumentAI.Initialize(kernel, memory)

License.LicenseKey = "<<enter your IronPdf license key here"

' Import PDF document
Dim pdf As PdfDocument = PdfDocument.FromFile("wikipedia.pdf")

' Continuous query
Do
	Console.Write("User Input: ")
	Dim response = Await pdf.Query(Console.ReadLine())
	Console.WriteLine($vbLf & "{response}")
Loop
$vbLabelText   $csharpLabel

Frequently Asked Questions

How can I use OpenAI to enhance PDF documents in C#?

To enhance PDF documents in C#, you can use the `IronPdf.Extensions.AI` NuGet package, which enables functionalities like summarization and querying with the help of OpenAI. This involves using an Azure Endpoint and API Key for OpenAI integration.

What is required to integrate OpenAI with PDF processing in C#?

To integrate OpenAI with PDF processing in C#, you need the `IronPdf` and `IronPdf.Extensions.AI` packages, Microsoft Semantic Kernel, an Azure Endpoint, and an API Key.

How do I summarize a PDF document using OpenAI in C#?

You can summarize a PDF document using the `Summarize` method from the `IronPdf.Extensions.AI` package. Import your PDF document and apply this method by providing your Azure Endpoint and API Key.

Can I perform continuous queries on a PDF using AI in C#?

Yes, you can perform continuous queries on a PDF using the `Query` method from the `IronPdf.Extensions.AI` package, which allows for dynamic information extraction from PDF documents.

How do I suppress experimental error warnings in my C# project?

To suppress experimental error warnings such as SKEXP0001, SKEXP0010, and SKEXP0050, add the following code to your .csproj file: <NoWarn>$(NoWarn);SKEXP0001,SKEXP0010,SKEXP0050</NoWarn>.

What role does Microsoft Semantic Kernel play in PDF enhancement?

Microsoft Semantic Kernel is used to configure and run methods like `Summarize` and `Query` on PDF documents, enabling OpenAI functionalities through the `IronPdf.Extensions.AI` package.

What are the benefits of using OpenAI for PDF summarization?

Using OpenAI for PDF summarization provides concise summaries of large documents, making it easier to extract key information quickly. This is achieved using the `Summarize` method from the `IronPdf.Extensions.AI` package.

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.