C# Read PDF File: Easy Tutorial

If you are a developer, you have probably encountered problems trying to read text from a PDF file. Perhaps one or more of the following scenarios apply to you:

  1. You are developing an application which takes two PDF documents as input and finds the similarity between the documents.
  2. You are developing an application which needs to read PDF documents and return the word count.
  3. You are developing an application which extracts data from a PDF file and puts it in a structured database.
  4. You are developing an application which needs to extract PDF text content and convert it into a string.

Extracting data from PDF files using C# was a difficult and complex task until the development of IronPdf. IronPdf is a library which makes it much easier for developers to read PDF files.

You can explore more about IronPdf at this link.

You can read PDF files and display the data in a C# TextBox by using just two lines of code. Yes, just two lines of code. You can also extract all the images in your PDF files. Further, you can create another document with those images or display them in your application as per your requirements.

Let us show you how it's done. We will proceed step by step, building an application that lets you select any PDF file and then displays its content.

Prerequisite Knowledge:

  1. Basic Knowledge of C# Programming
  2. Basic Knowledge of C# GUI Controls

I have designed this tutorial in such a way that even a person with no programming background will be able to progress.

Who should read this

Any newcomer learning C# should know how to read PDF files, because this is something you are definitely going to use in your career.

Professional developers should also read this to be able to understand the IronPdf library, which helps us to read, generate and manipulate PDF documents.

Now, how can we use this library in our project to read a PDF file?

I am using a Windows Forms App for the purposes of demonstration. You can use a Console Application, a WPF Application or an ASP.NET web application according to your preference.

Another major advantage of the IronPdf library is that it can be used with both C# and VB.NET.

Let's begin the demonstration without further delay.


Step #1: Create a Visual Studio Project

Open Visual Studio. I am using Visual Studio 2019.

Click on "Create New Project":

Create New Project

Now, select the Windows Forms App template and press "Next". The following window will appear. Enter a project name; I have used "Read Pdf using IronPdf".

Now, click "Next", and the following window will appear. Select .NET Core 3.1 from the drop-down menu.

.NET Core 3.1 version

Click on the "Create" button, and the Project will be created as shown below.


Step #2: Install the IronPdf NuGet Package

Click on the Project menu in the menu bar, and a drop-down list will appear. Select "Manage NuGet Packages". The following window will appear:

Now, click on "Browse". The following window will appear:

Type IronPdf in the search box, and press "Enter". The following window will appear:

NuGet Solution

Select and click on IronPdf. The following window will appear:

Install Free IronPDF

Press the "Install" button and wait for the installation to complete. The following window will appear after successful installation:

IronPdf for .Net

Press the "OK" button, and you are good to go.

Note: There are other ways to install the NuGet package. You can also install IronPdf by using the Package Manager Console; to do this, open the Package Manager Console and run the following command:

Install-Package IronPdf

You can also download it by clicking here on the NuGet website.

The following Readme.Txt file will open:

I suggest you go through all links and explore more about this Library.


Step #3: Design a Windows Form

Once the project is created and the NuGet package installed, the next step is to design a Windows Form that will ask the user to browse for a file and display its content.

Open Form1 Design:

Click on the Toolbox on the left-hand side of the window:

Search for Label, and drag and drop it into the Form Design.

Set the label text. Here, I have used "C# Read Pdf using IronPdf".

Next, drag and drop one text box (to show the file path), three buttons (one for browsing for files, one for reading PDF files using IronPdf, and one for clearing the text fields), and one Rich Text Box (for displaying the file contents).

Set the "ReadOnly" property of the Text Box and the Rich Text Box to "True". This is so that users can only read the contents and file path.


Step #4: Add the Back End Code for Browsing Pdf Files

Double-click on the "Browse" button and the following window will appear:

private void Browse_Click(object sender, EventArgs e)
{
}

The VB.NET equivalent:

Private Sub Browse_Click(ByVal sender As Object, ByVal e As EventArgs)
End Sub

Next, write the following code inside the Browse_Click function:

private void Browse_Click(object sender, EventArgs e)
{
    // Configure a file dialog that offers only PDF files.
    OpenFileDialog BrowseFile = new OpenFileDialog
    {
        InitialDirectory = @"D:\",
        Title = "Browse Pdf Files",
        CheckFileExists = true,
        CheckPathExists = true,
        DefaultExt = "pdf",
        Filter = "pdf files (*.pdf)|*.pdf",
        FilterIndex = 1, // 1-based index; there is only one filter entry
        RestoreDirectory = true,
        ReadOnlyChecked = true,
        ShowReadOnly = true
    };

    // Show the chosen path in the FilePath text box once the user confirms.
    if (BrowseFile.ShowDialog() == DialogResult.OK)
    {
        FilePath.Text = BrowseFile.FileName;
    }
}

The VB.NET equivalent:

Private Sub Browse_Click(ByVal sender As Object, ByVal e As EventArgs)
    ' Configure a file dialog that offers only PDF files (FilterIndex is 1-based).
    Dim BrowseFile As New OpenFileDialog With {
        .InitialDirectory = "D:\",
        .Title = "Browse Pdf Files",
        .CheckFileExists = True,
        .CheckPathExists = True,
        .DefaultExt = "pdf",
        .Filter = "pdf files (*.pdf)|*.pdf",
        .FilterIndex = 1,
        .RestoreDirectory = True,
        .ReadOnlyChecked = True,
        .ShowReadOnly = True
    }

    ' Show the chosen path in the FilePath text box once the user confirms.
    If BrowseFile.ShowDialog() = DialogResult.OK Then
        FilePath.Text = BrowseFile.FileName
    End If
End Sub

OpenFileDialog creates an instance of the Windows Forms file dialog control.

I have set the initial directory to the D drive; you can set it to any folder.

I have set DefaultExt = "pdf" as we only have to read PDF files.

I have used a filter so that the file dialog shows only PDF files for selection.

When the user clicks "Open", the dialog returns DialogResult.OK and the file path is shown in the File Path field.
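
The Filter string used in Browse_Click follows a "description|pattern" layout, with further entries appended using the same separator. As a rough sketch of how that string is put together, the helper below builds it from description/pattern pairs. DialogFilter and Build are hypothetical names introduced only for illustration; they are not part of Windows Forms or IronPdf.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of Windows Forms or IronPdf): builds the
// "description|pattern|description|pattern" string that OpenFileDialog.Filter expects.
public static class DialogFilter
{
    public static string Build(params (string Description, string Pattern)[] entries)
    {
        if (entries == null || entries.Length == 0)
            throw new ArgumentException("At least one filter entry is required.", nameof(entries));

        // Each entry becomes "Description (pattern)|pattern"; entries are joined with '|'.
        return string.Join("|", entries.Select(e => $"{e.Description} ({e.Pattern})|{e.Pattern}"));
    }
}

// Usage inside Browse_Click (assumes the BrowseFile dialog from this tutorial):
//     BrowseFile.Filter = DialogFilter.Build(("PDF files", "*.pdf"), ("All files", "*.*"));
//     BrowseFile.FilterIndex = 1; // FilterIndex is 1-based, so 1 selects "PDF files"
```

Adding an "All files" entry is optional; it is shown here only to make the role of FilterIndex visible when more than one entry exists.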

Let us run the solution and test the "Browse" button.

Press the "Browse" button, and the following window will appear:

Select the file (I am selecting IronPdfTest.pdf) and press "Open". The following window will appear:

pdf in C#

Now let's write the code behind the "Read" button to read the file.


Step #5: Add the Back End Code for Reading PDF Documents Using IronPdf

You might be thinking that code for reading a PDF file would be complex and difficult to write and understand.

Don’t worry. IronPdf has simplified things and made it all so much easier. We can easily read the PDF file using just two lines of code.

Go to Form1 Design and double-click on the "Read" button. The following window will appear:

private void Read_Click(object sender, EventArgs e)
{
}

The VB.NET equivalent:

Private Sub Read_Click(ByVal sender As Object, ByVal e As EventArgs)
End Sub

Add a using directive for the IronPdf namespace to import the IronPdf library:

using System;
using IronPdf;

The VB.NET equivalent:

Imports System
Imports IronPdf

Write the following code inside the Read_Click function:

private void Read_Click(object sender, EventArgs e)
{
    // Open the PDF chosen with the Browse button; the using declaration disposes it afterwards.
    using PdfDocument PDF = PdfDocument.FromFile(FilePath.Text);

    // Extract the text of every page and show it in the Rich Text Box.
    FileContent.Text = PDF.ExtractAllText();
}

The VB.NET equivalent:

Private Sub Read_Click(ByVal sender As Object, ByVal e As EventArgs)
    ' Open the PDF chosen with the Browse button; the Using block disposes it afterwards.
    Using PDF As PdfDocument = PdfDocument.FromFile(FilePath.Text)
        ' Extract the text of every page and show it in the Rich Text Box.
        FileContent.Text = PDF.ExtractAllText()
    End Using
End Sub

FilePath is the name of the text box that displays the location of the PDF document we want to read. Its value is set at run time by the "Browse" button.
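
Read_Click passes FilePath.Text straight to PdfDocument.FromFile, so pressing "Read" with an empty or invalid path is likely to fail with an exception. One way to guard against that, sketched here under the assumption that a simple pre-check is enough, is to validate the path first. PdfPathCheck and TryValidate are hypothetical names, not part of IronPdf or of this tutorial's project.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of IronPdf): checks a path before it is
// handed to PdfDocument.FromFile, so the Read button can fail gracefully.
public static class PdfPathCheck
{
    // Returns true when the path is non-empty, ends in ".pdf", and exists on disk.
    // On failure, 'error' holds a message suitable for showing to the user.
    public static bool TryValidate(string path, out string error)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            error = "Please browse for a PDF file first.";
            return false;
        }
        if (!string.Equals(Path.GetExtension(path), ".pdf", StringComparison.OrdinalIgnoreCase))
        {
            error = "The selected file does not have a .pdf extension.";
            return false;
        }
        if (!File.Exists(path))
        {
            error = "The file could not be found: " + path;
            return false;
        }
        error = string.Empty;
        return true;
    }
}

// Usage at the top of Read_Click (assumes the FilePath text box from this tutorial):
//     if (!PdfPathCheck.TryValidate(FilePath.Text, out string error))
//     {
//         MessageBox.Show(error);
//         return;
//     }
```

The check only looks at the file name and its existence; it does not confirm that the file is a well-formed PDF.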

ExtractAllText is the IronPdf method that extracts all the text from the pages of the PDF. This text is then displayed in the Rich Text Box, which is named FileContent.
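
Once the text is available as a string, it can be processed like any other text. For instance, the word-count scenario mentioned at the start of this tutorial could be sketched as follows. PdfTextStats and CountWords are hypothetical names, not IronPdf APIs, and "word" here simply means a run of characters separated by white space.

```csharp
using System;

// Hypothetical helper (not part of IronPdf): counts whitespace-separated words
// in text such as the string returned by ExtractAllText.
public static class PdfTextStats
{
    public static int CountWords(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            return 0;

        // A null separator array makes Split break on white-space characters;
        // RemoveEmptyEntries collapses runs of spaces, tabs and newlines.
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

// Usage inside Read_Click (assumes the PDF variable from this tutorial):
//     int words = PdfTextStats.CountWords(PDF.ExtractAllText());
```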

Next, let’s write the code behind the "Clear" button. This is just an optional extra in case you wish to clear the screen once you have read the PDF document.

Double-click on the "Clear" button, and it will take you to the following code:

private void Clear_Click(object sender, EventArgs e)
{
}

The VB.NET equivalent:

Private Sub Clear_Click(ByVal sender As Object, ByVal e As EventArgs)
End Sub

Write the following code inside the Clear_Click method:

private void Clear_Click(object sender, EventArgs e)
{
    // Reset both fields so another file can be selected.
    FileContent.Text = "";
    FilePath.Text = "";
}

The VB.NET equivalent:

Private Sub Clear_Click(ByVal sender As Object, ByVal e As EventArgs)
    ' Reset both fields so another file can be selected.
    FileContent.Text = ""
    FilePath.Text = ""
End Sub

Run the Solution

Click on the "Browse" button and select the document you want to read. In my case, I am reading the IronPdf.pdf file as an example:

pdf documents

Press the "Open" button and the following window will appear:

Example

Press the "Read" button. It will read the file and display the content as shown below:

Example Software


Summary

This is an example solution. No matter how many pages or images your PDF files contain, or how their text is formatted, IronPdf will extract all the text and images for you to use for any purpose. You simply need to get a license for the library and begin using it.

This completes the tutorial. I hope you have understood everything, and if you have any queries, feel free to post them in the comments section.

You can download the project from this link. If you wish to buy the complete package of Iron Software products, our special offer means that you can now buy all of them for the price of just two. If you need more details about the license and support, please click this link. You can also get a free 30-day trial license by clicking on this link.