PDF is one of the most widely used document formats for transferring data, mainly because it preserves formatting and presents data in the same form in which it was sent. To load, open, and view a PDF document, a PDF reader is required. Many PDF readers are available, but if you want to open a PDF file programmatically from your own software application, you need a suitable class library.
Here we look at one such library, which helps open and read a PDF file by filename in a Java program.
The IronPDF Java library is built on top of the already successful IronPDF for .NET. This makes IronPDF a versatile tool for working with PDF documents compared to other class libraries such as Apache PDFBox. It can extract and parse content, load text, and load images. It also provides options to customize PDF pages, such as page layout, margins, headers and footers, page orientation, and more.
In addition, IronPDF supports conversion from other file formats, password protection, digital signing, and merging and splitting PDF documents.
To build a Java PDF reader with IronPDF, first make sure the following components are installed on the computer:
IronPDF - Finally, IronPDF is required to read the PDF file in Java. It needs to be added as a dependency in your Java Maven project. Include the IronPDF artifact along with the slf4j dependency in the pom.xml file.
First, add the following import at the top of the Java source file to reference the required IronPDF classes. Imports from the org namespace are optional in this example.
import com.ironsoftware.ironpdf.*;
Next, configure IronPDF with a valid license key to use its methods. Call the setLicenseKey method in the main method.
License.setLicenseKey("Your license key");
Note: You can get a free trial license key to create, read and print PDFs.
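Hard-coding the key is fine for a quick test, but it is often kept out of source control. The following is a minimal sketch of one way to do that using only the Java standard library. The environment variable name IRONPDF_LICENSE_KEY and the helper resolveKey are assumptions made for illustration and are not part of IronPDF; the actual setLicenseKey call is left as a comment so the sketch runs without the library.

```java
class LicenseKeyLookup {
    // Hypothetical helper: returns the trimmed candidate key, or the fallback
    // when the candidate is missing or blank.
    static String resolveKey(String candidate, String fallback) {
        if (candidate == null || candidate.trim().isEmpty()) {
            return fallback;
        }
        return candidate.trim();
    }

    public static void main(String[] args) {
        // IRONPDF_LICENSE_KEY is an assumed variable name, not one defined by IronPDF.
        String key = resolveKey(System.getenv("IRONPDF_LICENSE_KEY"), "Your license key");

        // With IronPDF on the classpath, the key would then be applied as in the article:
        // License.setLicenseKey(key);
        System.out.println("License key resolved (" + key.length() + " characters).");
    }
}
```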
To read a PDF file, we need either an existing PDF or a newly created one. Here we use an existing PDF file. Extracting text from the document is a simple two-step process.
PdfDocument pdf = PdfDocument.fromFile(Paths.get("assets/sample.pdf"));
String text = pdf.extractAllText();
System.out.println(text);
In the above code, fromFile opens a PDF document. The Paths.get method converts the path string into a Path object that points to the file. Then, extractAllText reads all the text in the document.
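Because Paths.get only builds a Path and does not touch the file system, it can help to confirm that the file exists before handing the path to fromFile. The sketch below uses only the standard java.nio.file API; the helper name looksLikeReadablePdf is hypothetical, and it checks only the file name and readability, not the file contents.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

class PdfPathCheck {
    // Hypothetical helper: true when the path points to a readable regular file
    // whose name ends in ".pdf" (case-insensitive). It does not inspect the contents.
    static boolean looksLikeReadablePdf(Path path) {
        Path fileName = path.getFileName();
        String name = fileName == null ? "" : fileName.toString();
        return name.toLowerCase(Locale.ROOT).endsWith(".pdf")
                && Files.isRegularFile(path)
                && Files.isReadable(path);
    }

    public static void main(String[] args) {
        // Same relative path as in the article; it is resolved against the working directory.
        Path path = Paths.get("assets/sample.pdf");
        System.out.println("Resolved to: " + path.toAbsolutePath());

        if (looksLikeReadablePdf(path)) {
            System.out.println("File found; it can be passed to PdfDocument.fromFile.");
        } else {
            System.out.println("No readable PDF at that location.");
        }
    }
}
```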
The output is below:
IronPDF can also read content from a specific page in a PDF. The extractTextFromPage method takes a PageSelection object that specifies the page or range of pages from which text will be read.
In the following example, we extract the text from the second page of the PDF document. PageSelection.singlePage takes the zero-based index of the page to extract, so index 1 refers to the second page.
PdfDocument pdf = PdfDocument.fromFile(Paths.get("assets/sample.pdf"));
String text = pdf.extractTextFromPage(PageSelection.singlePage(1));
System.out.println(text);
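Since the example passes index 1 to read the second page, it is easy to be off by one when working with the page numbers shown in a PDF viewer. The following small sketch converts a 1-based page number into the zero-based index used above. The helper toPageIndex is hypothetical and not part of IronPDF; the IronPDF call is shown only as a comment.

```java
class PageIndex {
    // Hypothetical helper: converts a 1-based page number (as shown in a PDF viewer)
    // to the zero-based index that the article passes to PageSelection.singlePage.
    static int toPageIndex(int pageNumber) {
        if (pageNumber < 1) {
            throw new IllegalArgumentException("Page numbers start at 1, got " + pageNumber);
        }
        return pageNumber - 1;
    }

    public static void main(String[] args) {
        int index = toPageIndex(2);
        // With IronPDF on the classpath, this index would be used as in the article:
        // String text = pdf.extractTextFromPage(PageSelection.singlePage(index));
        System.out.println("Page 2 corresponds to index " + index);
    }
}
```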
Other methods available in the PageSelection class that can be used to extract text from different pages include [firstPage](/java/object-reference/api/com/ironsoftware/ironpdf/edit/PageSelection.html#firstPage()), [lastPage](/java/object-reference/api/com/ironsoftware/ironpdf/edit/PageSelection.html#lastPage()), pageRange, and [allPages](/java/object-reference/api/com/ironsoftware/ironpdf/edit/PageSelection.html#allPages()).
We can also extract text from a PDF newly generated from either an HTML file or a URL. The following sample code generates a PDF from a URL and extracts all the text from the website.
PdfDocument pdf = PdfDocument.renderUrlAsPdf("https://unsplash.com/");
String text = pdf.extractAllText();
System.out.println("Text extracted from the website: " + text);
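Once the text is available as a String, it can be searched with ordinary Java string handling. The sketch below lists the lines that contain a keyword, ignoring case. The helper linesContaining and the sample text are assumptions for illustration; the sample string merely stands in for the value returned by extractAllText.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TextSearch {
    // Hypothetical helper: returns the trimmed lines of the text that contain
    // the keyword, compared case-insensitively.
    static List<String> linesContaining(String text, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        // "\\R" matches any line break sequence (\n, \r\n, and so on).
        for (String line : text.split("\\R")) {
            if (line.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(line.trim());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the String returned by pdf.extractAllText().
        String text = "Free photos\nLog in\nPHOTOS for everyone";
        for (String match : linesContaining(text, "photos")) {
            System.out.println("Match: " + match);
        }
    }
}
```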
IronPDF can also be used to extract images from PDF files.
The complete code is as follows:
import com.ironsoftware.ironpdf.License;
import com.ironsoftware.ironpdf.PdfDocument;
import com.ironsoftware.ironpdf.edit.PageSelection;
import java.io.IOException;
import java.nio.file.Paths;

public class Main {
    public static void main(String[] args) throws IOException {
        // Apply the license key before calling any other IronPDF method
        License.setLicenseKey("YOUR LICENSE KEY HERE");

        // Open an existing PDF and print the text of its second page (index 1)
        PdfDocument pdf = PdfDocument.fromFile(Paths.get("assets/sample.pdf"));
        String text = pdf.extractTextFromPage(PageSelection.singlePage(1));
        System.out.println(text);

        // Render a web page as a PDF and print all of its text
        pdf = PdfDocument.renderUrlAsPdf("https://unsplash.com/");
        text = pdf.extractAllText();
        System.out.println("Text extracted from the website: " + text);
    }
}
In this article, we looked at how to open and read PDFs in Java using IronPDF.
IronPDF makes it easy to create PDFs from HTML or a URL and to convert from other file formats. It also helps get PDF tasks done quickly.
Try IronPDF for 30 days and find out how well it works for you in production. Commercial licenses start from $749.