How to use OpenAI for PDF

OpenAI is an artificial intelligence research laboratory, consisting of the for-profit OpenAI LP and its non-profit parent company, OpenAI Inc. It was founded with the goal of advancing digital intelligence in a way that benefits humanity as a whole. OpenAI conducts research in various areas of artificial intelligence (AI) and aims to develop AI technologies that are safe, beneficial, and accessible.

The IronPdf.Extensions.AI NuGet package now enables OpenAI for PDF enhancement: summarization, querying, and memorization. The package utilizes Microsoft Semantic Kernel.

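Besides the IronPdf package, you will also need two more packages. The FAQ at the end of this article lists IronPdf.Extensions.AI and Microsoft.SemanticKernel.Plugins.Memory as the requirements.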

Summarize PDF Example

To use the OpenAI feature, you need an Azure endpoint and an API key. Configure the Semantic Kernel according to the code example below. Then import the PDF document and use the Summarize method to generate a summary of it. You can download the sample PDF file from the OpenAI for PDF Summarization Example.
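
The examples in this article place the endpoint and API key directly in the source as placeholders. As a minimal sketch of one alternative, the snippet below reads both values from environment variables and fails early if either is missing. The variable names AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY and the ReadRequiredSetting helper are assumptions made for illustration; they are not part of IronPDF or Semantic Kernel.

```csharp
using System;

// Hypothetical variable names; choose whatever fits your deployment.
const string EndpointVariable = "AZURE_OPENAI_ENDPOINT";
const string ApiKeyVariable = "AZURE_OPENAI_API_KEY";

// Reads a required setting and throws with a clear message if it is missing or blank.
static string ReadRequiredSetting(string variableName)
{
    string? value = Environment.GetEnvironmentVariable(variableName);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException(
            $"Environment variable '{variableName}' is not set.");
    }
    return value;
}

try
{
    string azureEndpoint = ReadRequiredSetting(EndpointVariable);
    string apiKey = ReadRequiredSetting(ApiKeyVariable);

    // Both values can now be passed to the kernel and memory builders
    // in place of the hard-coded placeholder strings.
    Console.WriteLine($"Using Azure endpoint: {azureEndpoint}");
}
catch (InvalidOperationException ex)
{
    // Report the missing setting instead of calling the service with empty credentials.
    Console.Error.WriteLine(ex.Message);
}
```

Keeping the key out of the source file means it does not end up in version control, and the early check surfaces a configuration mistake before any request is sent.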

Note: You may encounter the SKEXP0001, SKEXP0010, and SKEXP0050 errors because the Semantic Kernel methods are experimental. Add the following code to your .csproj file to suppress these errors:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <NoWarn>$(NoWarn);SKEXP0001,SKEXP0010,SKEXP0050</NoWarn>
  </PropertyGroup>
</Project>
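
If you would rather not change the project file, the same diagnostics can in principle be suppressed per source file with the standard C# #pragma warning directive. The sketch below only shows where the directives go; it is an alternative offered here under that assumption, not something the IronPDF documentation prescribes.

```csharp
// Suppress the experimental-API diagnostics for this file only.
#pragma warning disable SKEXP0001, SKEXP0010, SKEXP0050

using System;

// The Semantic Kernel setup code from this article would go here.
Console.WriteLine("SKEXP diagnostics are suppressed for this file.");

// Re-enable the diagnostics for any code that follows.
#pragma warning restore SKEXP0001, SKEXP0010, SKEXP0050
```

The project-wide NoWarn setting is simpler when many files use Semantic Kernel; the pragma keeps the suppression limited to the files that actually need it.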

Here is an example of how you can summarize a PDF using the Semantic Kernel in C#:

using IronPdf;
using IronPdf.AI;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using System;
using System.Threading.Tasks;

// This code demonstrates how to use IronPdf with Semantic Kernel OpenAI components
// to summarize the content of a PDF document. Ensure that you have the necessary
// API keys and endpoints properly set up.

// Setup OpenAI credentials
var azureEndpoint = "<<enter your azure endpoint here>>";
var apiKey = "<<enter your azure API key here>>";

// Build the Semantic Kernel with OpenAI capabilities
var builder = Kernel.CreateBuilder()
    .AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
    .AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey);
var kernel = builder.Build();

// Setup Memory configuration
var memoryBuilder = new MemoryBuilder()
    // Optionally, to use a ChromaMemoryStore instead of the volatile store, replace the
    // WithMemoryStore call below with the following line and provide the correct URL.
    // .WithMemoryStore(new ChromaMemoryStore("http://127.0.0.1:8000"))
    .WithMemoryStore(new VolatileMemoryStore())
    .WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey);
var memory = memoryBuilder.Build();

// Initialize IronAI with the configured kernel and memory
IronDocumentAI.Initialize(kernel, memory);

// Configure the IronPdf license
License.LicenseKey = "<<enter your IronPdf license key here>>";

// Import and process the PDF document
PdfDocument pdf = PdfDocument.FromFile("wikipedia.pdf");

// Asynchronous method to summarize the document
async Task SummarizeDocumentAsync()
{
    Console.WriteLine("Please wait while I summarize the document...");

    // Summarize the document's contents using AI
    // Ensure the proper async method call format
    string summary = await pdf.SummarizeAsync();

    // Output the generated summary
    Console.WriteLine($"Document summary: {summary}\n\n");
}

// Use an async context to call the summarization method
// Ensure the project supports async Main or is invoked within an async scope.
await SummarizeDocumentAsync();

The equivalent VB.NET code:

Imports IronPdf
Imports IronPdf.AI
Imports Microsoft.SemanticKernel
Imports Microsoft.SemanticKernel.Connectors.OpenAI
Imports Microsoft.SemanticKernel.Memory
Imports System
Imports System.Threading.Tasks

' This code demonstrates how to use IronPdf with Semantic Kernel OpenAI components
' to summarize the content of a PDF document. Ensure that you have the necessary
' API keys and endpoints properly set up.

Module Program
    Sub Main()
        ' VB.NET does not allow an Async Main, so block on the task at the entry point
        SummarizeDocumentAsync().GetAwaiter().GetResult()
    End Sub

    ' Asynchronous method that configures the AI services and summarizes the document
    Private Async Function SummarizeDocumentAsync() As Task
        ' Setup OpenAI credentials
        Dim azureEndpoint = "<<enter your azure endpoint here>>"
        Dim apiKey = "<<enter your azure API key here>>"

        ' Build the Semantic Kernel with OpenAI capabilities
        ' (VB.NET is case-insensitive, so the locals avoid the names "kernel" and "memory")
        Dim builder = Kernel.CreateBuilder().
            AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey).
            AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey)
        Dim semanticKernel = builder.Build()

        ' Setup Memory configuration
        Dim memoryBuilder = (New MemoryBuilder()).
            WithMemoryStore(New VolatileMemoryStore()).
            WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
        Dim semanticMemory = memoryBuilder.Build()

        ' Initialize IronAI with the configured kernel and memory
        IronDocumentAI.Initialize(semanticKernel, semanticMemory)

        ' Configure the IronPdf license
        License.LicenseKey = "<<enter your IronPdf license key here>>"

        ' Import and process the PDF document
        Dim pdf As PdfDocument = PdfDocument.FromFile("wikipedia.pdf")

        Console.WriteLine("Please wait while I summarize the document...")

        ' Summarize the document's contents using AI
        Dim summary As String = Await pdf.SummarizeAsync()

        ' Output the generated summary
        Console.WriteLine("Document summary: " & summary & vbLf & vbLf)
    End Function
End Module

Continuous Query Example

A single query may not be suitable for all scenarios. The IronPdf.Extensions.AI package also offers a Query method that allows users to perform continuous queries.

using IronPdf;
using IronPdf.AI;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using System;
using System.Threading.Tasks;

// This setup code integrates the OpenAI services using Azure and initializes
// necessary dependencies for processing text and document queries.

// Set up the OpenAI integration with Azure
var azureEndpoint = "<<enter your azure endpoint here>>";
var apiKey = "<<enter your azure API key here>>";

// Build and configure the kernel to use OpenAI for text embedding and chat completion
var builder = Kernel.CreateBuilder()
    .AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
    .AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey);

var kernel = builder.Build();

// Set up memory for the semantic kernel, leveraging a simple volatile store.
// Optionally, ChromaMemoryStore can be used for additional features.
var memoryBuilder = new MemoryBuilder()
    .WithMemoryStore(new VolatileMemoryStore())
    .WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey);

var memory = memoryBuilder.Build();

// Initialize IronAI with the kernel and memory configuration
IronDocumentAI.Initialize(kernel, memory);

// Set the license key for Iron Pdf to ensure it functions correctly
License.LicenseKey = "<<enter your IronPdf license key here>>";

// Import a PDF document that will be queried
PdfDocument pdf = PdfDocument.FromFile("wikipedia.pdf");

while (true)
{
    Console.Write("User Input: ");
    string userInput = Console.ReadLine();

    // Query the PDF document with the user's input and await the response.
    // Awaiting here avoids the deadlock risk of blocking on `.Result`.
    string response = await pdf.QueryAsync(userInput);

    Console.WriteLine($"\n{response}");
}

The equivalent VB.NET code:

Imports Microsoft.VisualBasic
Imports IronPdf
Imports IronPdf.AI
Imports Microsoft.SemanticKernel
Imports Microsoft.SemanticKernel.Connectors.OpenAI
Imports Microsoft.SemanticKernel.Memory
Imports System
Imports System.Threading.Tasks

' This setup code integrates the OpenAI services using Azure and initializes
' necessary dependencies for processing text and document queries.

Module Program
    Sub Main()
        ' Set up the OpenAI integration with Azure
        Dim azureEndpoint = "<<enter your azure endpoint here>>"
        Dim apiKey = "<<enter your azure API key here>>"

        ' Build and configure the kernel to use OpenAI for text embedding and chat completion
        ' (VB.NET is case-insensitive, so the locals avoid the names "kernel" and "memory")
        Dim builder = Kernel.CreateBuilder().
            AddAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey).
            AddAzureOpenAIChatCompletion("oaichat", azureEndpoint, apiKey)
        Dim semanticKernel = builder.Build()

        ' Set up memory for the semantic kernel, leveraging a simple volatile store.
        ' Optionally, ChromaMemoryStore can be used for additional features.
        Dim memoryBuilder = (New MemoryBuilder()).
            WithMemoryStore(New VolatileMemoryStore()).
            WithAzureOpenAITextEmbeddingGeneration("oaiembed", azureEndpoint, apiKey)
        Dim semanticMemory = memoryBuilder.Build()

        ' Initialize IronAI with the kernel and memory configuration
        IronDocumentAI.Initialize(semanticKernel, semanticMemory)

        ' Set the license key for IronPdf to ensure it functions correctly
        License.LicenseKey = "<<enter your IronPdf license key here>>"

        ' Import a PDF document that will be queried
        Dim pdf As PdfDocument = PdfDocument.FromFile("wikipedia.pdf")

        Do
            Console.Write("User Input: ")
            Dim userInput As String = Console.ReadLine()

            ' Query the PDF document using the user input.
            ' VB.NET does not allow an Async Main, so this sample blocks on `.Result`;
            ' blocking can deadlock in some contexts, so prefer `Await` inside an Async method in production code.
            Dim queryTask As Task(Of String) = pdf.QueryAsync(userInput)
            Dim response As String = queryTask.Result

            Console.WriteLine(vbLf & response)
        Loop
    End Sub
End Module
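
The loop in the example above runs until the process is terminated. As a hedged sketch of one way to give it an exit condition, the helper below stops when the user types "exit" or when input ends, and skips blank lines. RunQueryLoopAsync and its `ask` delegate are hypothetical names introduced for illustration; the delegate stands in for the document query call (for example, q => pdf.QueryAsync(q) from the example above), so the sketch runs without IronPDF or any API access.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Reads questions line by line until input ends or the user types "exit".
// `ask` is a stand-in for the document query call; returns how many questions were answered.
static async Task<int> RunQueryLoopAsync(
    TextReader input, TextWriter output, Func<string, Task<string>> ask)
{
    int answered = 0;
    while (true)
    {
        output.Write("User Input: ");
        string? userInput = input.ReadLine();

        // A null line means the input stream has ended.
        if (userInput is null ||
            userInput.Trim().Equals("exit", StringComparison.OrdinalIgnoreCase))
        {
            break;
        }

        // Skip blank lines instead of sending an empty query.
        if (string.IsNullOrWhiteSpace(userInput))
        {
            continue;
        }

        string response = await ask(userInput);
        output.WriteLine($"\n{response}");
        answered++;
    }
    return answered;
}

// Demo with a stub responder so the sketch runs without credentials.
var demoInput = new StringReader("What is this document about?\n\nexit\nignored");
int count = await RunQueryLoopAsync(
    demoInput, Console.Out, q => Task.FromResult($"(stub answer to: {q})"));
Console.WriteLine($"Questions answered: {count}");
```

In a real console application, the call would pass Console.In and Console.Out together with a delegate that forwards each question to the PDF query method. Passing the reader, writer, and query function as parameters also makes the loop easy to exercise in tests.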

Frequently Asked Questions

What is OpenAI?

OpenAI is an artificial intelligence research laboratory, consisting of the for-profit OpenAI LP and its non-profit parent company, OpenAI Inc. It aims to develop AI technologies that are safe, beneficial, and accessible.

How can I use OpenAI for PDF in C#?

To use OpenAI for PDF in C#, you need to download the IronPDF C# library, prepare the Azure Endpoint and API Key, import the target PDF document, and utilize methods like Summarize and Query for PDF enhancement.

What is the IronPdf.Extensions.AI NuGet package?

The IronPdf.Extensions.AI NuGet package enables OpenAI functionalities for PDF documents such as summarization, querying, and memorization using Microsoft Semantic Kernel.

How do I summarize a PDF using OpenAI?

To summarize a PDF, import the PDF document using IronPDF, and utilize the Summarize method provided by the IronPdf.Extensions.AI package.

What do I need to start using OpenAI for PDF?

You need the IronPDF package, the IronPdf.Extensions.AI package, and Microsoft.SemanticKernel.Plugins.Memory. Additionally, an Azure Endpoint and API Key for OpenAI are required.

What is the Query method used for?

The Query method is used for performing continuous queries on a PDF document to extract information dynamically.

How do I handle SKEXP errors in my project?

To suppress SKEXP errors, add the following element to a PropertyGroup in your .csproj file: <NoWarn>$(NoWarn);SKEXP0001,SKEXP0010,SKEXP0050</NoWarn>

What is the purpose of Microsoft Semantic Kernel in this context?

Microsoft Semantic Kernel is utilized to configure and run methods like Summarize and Query for PDF documents, enabling OpenAI functionalities.

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.