iText7 Read PDF in C# Alternatives (VS IronPDF)

PDF (Portable Document Format) is a file format created by Adobe and widely used for sharing information digitally over the internet. It preserves the formatting of a document and supports features such as security permissions and password protection. As a C# developer, you may have encountered scenarios where you need to integrate PDF functionality into your application. Building that functionality from scratch is time-consuming and tedious, so it is worth weighing the performance, effectiveness, and efficiency of a prebuilt library against the cost of writing a new service yourself.

There are several PDF libraries available for C#. In this article, we will explore two of the most popular PDF libraries for reading PDF documents in C#.

iText software

iText 7 Core, also referred to as iText Core, is a PDF library for creating and processing PDF documents in .NET (C#) and Java. It is available under an open-source license (AGPL) and can also be licensed for commercial applications.

iText Core is a high-level API that provides easy-to-use methods for generating and editing PDFs. With iText 7 Core you can split, merge, annotate, fill in forms, digitally sign, and do much more with PDF files. iText 7 also provides an HTML-to-PDF converter.

IronPDF

IronPDF is an API for .NET and .NET Framework (C#), also available for Java, that generates PDF documents from HTML, CSS, and JavaScript, whether the source is a URL, an HTML file, or an HTML string. IronPDF also lets you manipulate existing PDF files: splitting, merging, annotating, digitally signing, and much more.

IronPDF offers 50+ features for creating, reading, and editing PDF files. It prioritizes speed, ease of use, and accuracy when you need to deliver high-quality, pixel-perfect, professional PDF files. The API is well documented, and plenty of sample source code can be found on its code examples page.

Create a Console Application

We will use the Visual Studio 2022 IDE to create the application. Visual Studio is the official IDE for C# development, and it needs to be installed before you continue. If it is not installed, you can download it from the Microsoft Visual Studio website.

The following steps create a new project named "DemoApp".

  1. Open Visual Studio and click on "Create a New Project".

    Figure 1 - New project

  2. Select "Console Application" and click "Next".

    Figure 2

  3. Set the name of the project.

    Figure 3

  4. Select the .NET version. Choose the stable version .NET 6.0.

    Figure 4

Install IronPDF Library

Once the project is created, the IronPDF library needs to be installed before it can be used. Follow these steps to install it.

  1. Open the NuGet Package Manager, either from Solution Explorer or from the Tools menu.

    Figure 5

  2. Browse for the IronPDF library and select it for the current project. Click "Install".

    Figure 6

Add the following namespace at the top of the Program.cs file:

using IronPdf;

Install iText 7 Library

Once the project is created, the iText 7 library needs to be installed before it can be used. Follow these steps to install it.

  1. Open the NuGet Package Manager, either from Solution Explorer or from the Tools menu.

    Figure 7

  2. Browse for the iText 7 library and select it for the current project. Click "Install".

    Figure 8

Add the following namespaces at the top of the Program.cs file:

using iText.Kernel.Pdf.Canvas.Parser.Listener;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf;

Open PDF files

We will extract text from the following PDF file. It is a two-page PDF document.

Figure 9

Using iText library

Opening a PDF file with the iText library is a two-step process. First, we create a PdfReader object and pass the file location as a parameter. Then we pass that reader to the PdfDocument class, which represents the opened PDF document. The code is as follows:

PdfReader pdfReader = new PdfReader("sample.pdf");
PdfDocument pdfDoc = new PdfDocument(pdfReader);

Using IronPDF

Opening PDF files using IronPDF is easy. Use the PdfDocument class's FromFile method to open a PDF from any file location. The following single line of code opens a PDF file for reading data:

var pdf = PdfDocument.FromFile("sample.pdf");

Read Data from PDF files

Using iText7 library

Reading PDF data is not as straightforward with the iText 7 library. We have to loop through the pages of the PDF document manually and extract the text from each one. The following source code extracts text from the PDF document page by page:

for (int page = 1; page <= pdfDoc.GetNumberOfPages(); page++)
{
    ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
    string pageContent = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(page), strategy);
    Console.WriteLine(pageContent);
}
pdfDoc.Close();
pdfReader.Close();

There is a lot going on in the code above. First, we declare the text extraction strategy, and then we use the PdfTextExtractor class's GetTextFromPage method to read the text. This method accepts two parameters: the first is the PDF document page and the second is the strategy. To get the page, call the GetPage method on the PdfDocument instance and pass the page number as a parameter; as the loop shows, page numbers start at 1. The output is returned as a string, which is then printed to the console. Finally, the PdfDocument and PdfReader objects are closed.
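For reference, the opening and reading steps can be combined into one small program. This is a minimal sketch rather than a definitive implementation: it assumes the installed iText 7 version exposes PdfDocument as IDisposable (so a using declaration can stand in for the explicit Close calls) and that closing the document also closes its reader, which is the default behavior as far as we know. "sample.pdf" is the same placeholder path used above.

```csharp
using System;
using System.Text;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

class Program
{
    static void Main()
    {
        // Assumption: PdfDocument implements IDisposable, so it is closed
        // when this scope ends, even if text extraction throws.
        using var pdfDoc = new PdfDocument(new PdfReader("sample.pdf"));

        var allText = new StringBuilder();

        // iText page numbers start at 1, not 0.
        for (int page = 1; page <= pdfDoc.GetNumberOfPages(); page++)
        {
            // Use a fresh strategy per page so text does not accumulate across pages.
            ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
            string pageContent = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(page), strategy);
            allText.AppendLine(pageContent);
        }

        Console.WriteLine(allText.ToString());
    }
}
```

Collecting the pages in a StringBuilder is only a convenience for handling the whole document as one string; printing each page inside the loop, as in the original snippet, works equally well.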

Output

Figure 10

Using IronPDF

Just as opening the PDF file took one line of code, reading text from it is also a one-line process. The PdfDocument class provides the ExtractAllText method to read the entire content of the file. Console.WriteLine is then used to print the text on the screen. The code is as follows:

string text = pdf.ExtractAllText();
Console.WriteLine(text);

Output

Figure 11

The output is accurate and free of errors. However, the ExtractAllText method requires a license key, as it only works in production mode. A 30-day trial license key is available from IronPDF.
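The sketch below shows one way the license key might be applied before reading, together with a page-by-page variant of the extraction for comparison with the iText loop. It is a hedged example: the key string is a placeholder, and the License.LicenseKey property, the PageCount property, and the ExtractTextFromPage method with a zero-based page index are used as we understand the IronPDF API, so check them against the version you install.

```csharp
using System;
using IronPdf;

class Program
{
    static void Main()
    {
        // Placeholder only: substitute your own trial or commercial key.
        IronPdf.License.LicenseKey = "YOUR-LICENSE-KEY";

        // Same placeholder path as in the earlier snippets.
        var pdf = PdfDocument.FromFile("sample.pdf");

        // Assumption: IronPDF page indexes are zero-based, unlike iText's 1-based pages.
        for (int i = 0; i < pdf.PageCount; i++)
        {
            Console.WriteLine($"--- Page {i + 1} ---");
            Console.WriteLine(pdf.ExtractTextFromPage(i));
        }
    }
}
```

When the whole document is needed as a single string, ExtractAllText remains the shorter option; the per-page loop is useful mainly when pages have to be processed separately.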

Comparison

In this example, both libraries extracted the text of the sample PDF document accurately, so there is nothing to separate them on accuracy. However, the IronPDF version is shorter and easier to read.

IronPDF takes only two lines of code to achieve the same task as iText. It provides text extraction methods out of the box, with no extra logic to implement. The iText code is a bit trickier, and you have to close the two instances created when the PDF document was opened, whereas IronPDF clears the memory automatically once the task is complete.

Summary

In this article, we looked at how to read PDF documents using the iText library in C# and then compared it with IronPDF. Both libraries give accurate results and provide numerous PDF manipulation methods to work with. You can create, edit, and read data from PDF files using either library.

iText is open source and free to use, but with restrictions. It can be licensed for commercial use. IronPDF is also free to use and can be licensed for commercial use, with a 30-day free trial.

Download IronPDF and give it a try.