PDF OCR & Text Extraction

C# + VB.Net: PDF OCR & Text Extraction
// Extracting PDF image and text content
using System.Collections.Generic; // needed for IEnumerable<T>
using System.Drawing;
using IronPdf;

// Open a 128-bit encrypted PDF by supplying its password
PdfDocument PDF = PdfDocument.FromFile("encrypted.pdf", "password");

// Get all text, for example to feed a search index
string AllText = PDF.ExtractAllText();

// Get all images
IEnumerable<System.Drawing.Image> AllImages = PDF.ExtractAllImages();

// Or extract the text and images of each page individually
for (var index = 0; index < PDF.PageCount; index++) {
    int PageNumber = index + 1; // loop index is zero-based; page numbers start at 1
    string Text = PDF.ExtractTextFromPage(index);
    IEnumerable<System.Drawing.Image> Images = PDF.ExtractImagesFromPage(index);
    // ...
}

' Extracting PDF image and text content
Imports System.Collections.Generic ' needed for IEnumerable(Of T)
Imports System.Drawing
Imports IronPdf

' Open a 128-bit encrypted PDF by supplying its password
Dim PDF As PdfDocument = PdfDocument.FromFile("encrypted.pdf", "password")

' Get all text, for example to feed a search index
Dim AllText As String = PDF.ExtractAllText()

' Get all images
Dim AllImages As IEnumerable(Of System.Drawing.Image) = PDF.ExtractAllImages()

' Or extract the text and images of each page individually
For index = 0 To PDF.PageCount - 1
	Dim PageNumber As Integer = index + 1 ' loop index is zero-based; page numbers start at 1
	Dim Text As String = PDF.ExtractTextFromPage(index)
	Dim Images As IEnumerable(Of System.Drawing.Image) = PDF.ExtractImagesFromPage(index)
	' ...
Next index

IronPDF lets developers extract the full text and images from almost any PDF file, including password-protected ones, either for the whole document at once or page by page. The extracted text is particularly useful when building search indexes.
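
To illustrate the search-index use case, here is a minimal sketch of an inverted index that maps each word to the pages it appears on. It assumes the per-page strings were already collected with ExtractTextFromPage as in the sample above. The class and method names (PageSearchIndex, Build) are hypothetical and not part of IronPDF, and the tokenizer is a deliberately simple split on whitespace and common punctuation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of IronPDF: a tiny inverted index
// from word to the 1-based page numbers on which the word occurs.
public static class PageSearchIndex
{
    // Assumption: splitting on whitespace and basic punctuation is good enough
    // for illustration; a real index would use a proper tokenizer.
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    // pageTexts[i] is the text of page i (zero-based), e.g. gathered with:
    //   var pageTexts = new List<string>();
    //   for (var i = 0; i < PDF.PageCount; i++) pageTexts.Add(PDF.ExtractTextFromPage(i));
    public static Dictionary<string, SortedSet<int>> Build(IReadOnlyList<string> pageTexts)
    {
        // Case-insensitive keys so "Invoice" and "invoice" share one entry.
        var index = new Dictionary<string, SortedSet<int>>(StringComparer.OrdinalIgnoreCase);

        for (int i = 0; i < pageTexts.Count; i++)
        {
            int pageNumber = i + 1; // report 1-based page numbers
            string text = pageTexts[i] ?? string.Empty;

            foreach (string word in text.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            {
                if (!index.TryGetValue(word, out var pages))
                {
                    pages = new SortedSet<int>();
                    index[word] = pages;
                }
                pages.Add(pageNumber);
            }
        }

        return index;
    }
}
```

Looking up a term is then a dictionary access, for example index["invoice"], which yields the sorted set of page numbers containing that word. Keeping page numbers rather than whole-document hits is a design choice that lets search results link straight to the relevant page.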