iText 7 Extract Text From PDF vs IronPDF (Code Example Tutorial)

Updated February 2, 2023

In this tutorial, we will learn how to read data from a PDF (Portable Document Format) document in C#, with examples using two different libraries.

Many parser and reader libraries are available online that can extract text and images from PDFs. We will extract text from a PDF file using two widely used libraries, and then compare them to find out which of the two is better suited to the task.

We will be comparing iText 7 and IronPDF. Before going forward, we will introduce both libraries.

iText 7

iText 7 is the successor to iTextSharp and is used in both .NET and Java applications. It is equipped with a document engine (like Adobe Acrobat Reader), high-level and low-level programming capabilities, an event listener, and PDF editing capabilities. iText 7 can create, edit, and enhance the pages of PDF documents. Other features include adding passwords, creating encoding strategies, and saving permission options to a PDF document. It is also used to add or change content or canvas images, append PDF elements (dictionaries, etc.), create watermarks and bookmarks, change font sizes, and sign sensitive data.

iText 7 allows us to build custom PDF processing applications for web, mobile, desktop, kernel, or cloud apps in .NET.

IronPDF

IronPDF is a library developed by Iron Software that helps C# and Java software engineers create, edit, and extract PDF content. It is commonly used to generate PDFs from HTML, from webpages, or from images, and to read PDFs and extract their text. Other features include adding headers/footers, signatures, attachments, passwords, and security questions. It supports multithreading and asynchronous operation for better performance.

IronPDF is cross-platform and compatible with .NET 5, .NET 6, and .NET 7, as well as .NET Core, .NET Standard, and .NET Framework. It also runs on Windows, macOS, Linux, Docker, Azure, and AWS.

Now, let's see a demonstration of both libraries.

Extract Text from a PDF File Using iText 7

We will use a sample PDF file, TestDocument.pdf, as the input for text extraction.

The following source code extracts text from the PDF using iText 7.

using System;
using System.Text;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

// Assign the PDF location to a string and create a new StringBuilder
string pdfPath = @"D:/TestDocument.pdf";
var pageText = new StringBuilder();

// Read the PDF using a new PdfDocument and a new PdfReader
using (PdfDocument document = new PdfDocument(new PdfReader(pdfPath)))
{
    int pageCount = document.GetNumberOfPages();
    for (int page = 1; page <= pageCount; page++)
    {
        // A new LocationTextExtractionStrategy collects the text of one page
        var strategy = new LocationTextExtractionStrategy();
        var parser = new PdfCanvasProcessor(strategy);
        // Process the current page (page numbers are 1-based), not only the first page
        parser.ProcessPageContent(document.GetPage(page));
        pageText.Append(strategy.GetResultantText());
    }
    Console.WriteLine(pageText.ToString());
}

' Requires Imports for System.Text, iText.Kernel.Pdf, iText.Kernel.Pdf.Canvas.Parser
' and iText.Kernel.Pdf.Canvas.Parser.Listener
' Assign the PDF location to a string and create a new StringBuilder
Dim pdfPath As String = "D:/TestDocument.pdf"
Dim pageText = New StringBuilder()

' Read the PDF using a new PdfDocument and a new PdfReader
Using document As New PdfDocument(New PdfReader(pdfPath))
    Dim pageCount = document.GetNumberOfPages()
    For page As Integer = 1 To pageCount
        ' A new LocationTextExtractionStrategy collects the text of one page
        Dim strategy As New LocationTextExtractionStrategy()
        Dim parser As New PdfCanvasProcessor(strategy)
        ' Process the current page (page numbers are 1-based), not only the first page
        parser.ProcessPageContent(document.GetPage(page))
        pageText.Append(strategy.GetResultantText())
    Next page
    Console.WriteLine(pageText.ToString())
End Using
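
The per-page processor set-up above can also be delegated to iText 7's PdfTextExtractor helper. The following is a minimal sketch rather than a definitive implementation. It assumes the same sample path as the example above and the standard iText 7 for .NET namespaces. The loop over the pages and the assembly of the text still have to be written by hand.

```csharp
using System;
using System.Text;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

// Same sample file as in the example above (the path is an assumption)
string pdfPath = @"D:/TestDocument.pdf";
var allText = new StringBuilder();

using (var document = new PdfDocument(new PdfReader(pdfPath)))
{
    for (int page = 1; page <= document.GetNumberOfPages(); page++)
    {
        // Use a fresh strategy per page so text from earlier pages is not carried over
        string text = PdfTextExtractor.GetTextFromPage(
            document.GetPage(page), new LocationTextExtractionStrategy());
        allText.AppendLine(text);
    }
}

Console.WriteLine(allText.ToString());
```

AppendLine is used here so that the text of consecutive pages is separated by a line break in the combined output.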

Extracted Text Output

Now, let's extract text from the same PDF using IronPDF.

Extract Text from PDF Documents using IronPDF

The following source code extracts text from a PDF using IronPDF.

using System;
using IronPdf;
// Load the PDF and extract the text of all pages in a single call
var pdf = PdfDocument.FromFile(@"D:/TestDocument.pdf");
string text = pdf.ExtractAllText();
Console.WriteLine(text);

Dim pdf = PdfDocument.FromFile("D:/TestDocument.pdf")
Dim text As String = pdf.ExtractAllText()
Console.WriteLine(text)
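
When the text is needed page by page, as in the iText 7 loop, IronPDF can also extract a single page at a time. The following is a minimal sketch, assuming the same sample path as above and that the PageCount property and the ExtractTextFromPage method take a zero-based page index, as in IronPDF releases current at the time of writing. The page separator line is purely illustrative.

```csharp
using System;
using IronPdf;

// Same sample file as in the example above (the path is an assumption)
var pdf = PdfDocument.FromFile(@"D:/TestDocument.pdf");

// Page indexes are assumed to be zero-based in this sketch
for (int pageIndex = 0; pageIndex < pdf.PageCount; pageIndex++)
{
    string pageText = pdf.ExtractTextFromPage(pageIndex);
    Console.WriteLine($"--- Page {pageIndex + 1} ---");
    Console.WriteLine(pageText);
}
```

This form is useful when only some pages matter or when the page boundaries need to be preserved in the output.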

Extracted Text Using IronPDF

Comparison

With IronPDF, it takes two lines to extract text from PDFs. With iText 7, on the other hand, we have to write about 10 lines of code for the same task.

IronPDF provides a convenient text extraction method out of the box, whereas iText 7 requires us to loop over the pages and assemble the text ourselves.

IronPDF is efficient in terms of both performance and code readability.

In terms of accuracy, the two libraries are equal: both extracted the text of the test document with 100% accuracy.