How To Convert PDF to PDF/A in C++

The Portable Document Format (PDF) has been the go-to format for sharing documents across various platforms. However, with the increasing significance of digital archiving, there is a growing need to archive documents in a standardized format that ensures long-term preservation and accessibility. The PDF/A (PDFA) format is specifically designed for this purpose, offering a stable and self-contained format that is suitable for archiving.

Understanding PDF/A

PDF/A, an ISO-standardized PDF variant, is designed to preserve documents over time, maintaining content accessibility and integrity. In contrast to a regular PDF file, a PDF/A file comes with specific restrictions that keep it robust and self-contained; for example, fonts must be embedded and encryption is not allowed.

This article guides you through the process of converting PDF documents into PDF/A format using the QPDF library in C++. QPDF, a powerful C++ library, empowers developers to programmatically manage and convert PDF documents, including PDF-to-PDF/A conversion in just a few lines of code.

QPDF C++ Library

QPDF is a C++ library and command-line tool for parsing, inspecting, and transforming PDF files. It allows developers to programmatically access and manipulate the structure of PDF files. QPDF is an open-source project that offers a range of features, including encryption, decryption, linearization, and optimization.

Prerequisites

Before you begin, ensure that you have the following set up:

  1. Code::Blocks IDE: Download and install Code::Blocks from the official website (http://www.codeblocks.org/).
  2. QPDF Library: Download the latest version of the QPDF library for your operating system from the QPDF website (https://qpdf.sourceforge.io/).

Creating a PDF to PDF/A Project

  1. Open Code::Blocks: Launch the Code::Blocks IDE on your computer.
  2. Create a New Project: Click File in the top menu, then select New, followed by Project.
  3. Choose Project Type: In the New from template window, choose Console application and click Go. Select C++ as the language and click Next.
  4. Enter Project Details: Provide a project name in the Project title field (e.g., "PDFtoPDFA"). Choose where to save project files and click Next.
  5. Select Compiler: Choose a compiler, and if needed, manually select one from the list. Click Finish.

Adding QPDF to the Project

To include the QPDF header files in Code::Blocks, follow these steps:

  1. Click Project in the menu.
  2. Select Build options from the drop-down menu.
  3. In the Build options dialog, go to the Search directories tab.
  4. In the Compiler sub-tab, click the Add button and browse to the directory containing the QPDF header files (usually located in the include folder).
  5. Then, in the Linker sub-tab, click the Add button and add the lib and bin directories.
  6. Click OK to close the Build options dialog.

Additionally, you need to link your project against the QPDF library. Follow these steps:

  1. In the Build options dialog, go to the Linker settings tab.
  2. Under Link Libraries, click Add, then browse to the directory with QPDF library files (usually with a .a or .lib file extension on Windows).
  3. Click Add again, and enter the QPDF library's name (e.g., libqpdf.a or qpdf.lib).
  4. Click OK to close the Build options dialog.

Steps to Convert PDF to PDF/A File

Include Necessary Header Files

#include <iostream>
#include <string>
#include <qpdf/QPDF.hh>
#include <qpdf/QPDFWriter.hh>

This code includes the header files needed for C++ standard input/output (iostream), for std::string (string), and for the QPDF library.

Set Paths for Input and Output Files

In the main function (starting point of the C++ program), set paths for input and output files:

std::string input_pdf_path = "input.pdf";
std::string output_pdf_a_path = "output-pdfa.pdf";

These two lines declare and initialize two std::string variables: the path of the input file ("input.pdf") and the path of the output file ("output-pdfa.pdf"). PDF/A files use the regular .pdf extension, so the output keeps it.
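Hard-coded paths are fine for a first test, but the output path can also be derived from the input path. The sketch below uses std::filesystem (C++17) and is only an illustration: pdfa_output_path is a hypothetical helper, and the "-pdfa" suffix is an assumed naming convention, not something QPDF requires.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: maps "reports/input.pdf" to "reports/input-pdfa.pdf".
// The "-pdfa" suffix is an assumed convention for this example.
fs::path pdfa_output_path(const fs::path& input)
{
    // stem() is the file name without its last extension.
    const std::string name = input.stem().string() + "-pdfa.pdf";
    // Keep the output next to the input file.
    return input.parent_path() / name;
}
```

The resulting path can be converted with .string() and then .c_str() wherever a C string is expected.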

Input File

The input file is an editable form:

How To Convert PDF to PDF/A in C++: Figure 1

Create QPDF Object for PDF File

QPDF input_pdf;
input_pdf.processFile(input_pdf_path.c_str());

A QPDF object named input_pdf is created. The processFile function is then called on it with the input file path as its argument, which opens and parses the file.
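Before handing a path to processFile, it can help to reject files that are clearly not PDFs. The following is a minimal sketch using only the standard library; has_pdf_header is a hypothetical helper and not part of QPDF. It only checks that the file starts with the bytes "%PDF-", so it is a cheap pre-check rather than a validity test.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true if the file can be opened and its
// first five bytes are "%PDF-". It does not parse the file any further.
bool has_pdf_header(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;  // missing or unreadable file
    }
    char magic[5] = {};
    in.read(magic, sizeof(magic));
    return in.gcount() == 5 && std::string(magic, 5) == "%PDF-";
}
```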

Create QPDFWriter Object for Writing PDF/A File

QPDFWriter writer(input_pdf, output_pdf_a_path.c_str());

A QPDFWriter object, writer, is created to write the output PDF/A file.

Set PDF/A Conformance and Convert PDF Documents

writer.setQDFMode(true);
writer.write();

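This code calls setQDFMode(true) on the writer object and then write(), which processes the input PDF document and saves the result to the output path. Note that in QPDF, QDF mode produces an uncompressed, normalized form of the file intended for inspection and editing. It does not by itself enforce PDF/A rules, so check the output with a PDF/A validator before relying on it for archiving.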

Output

The output file is non-editable and PDF/A-compliant:

How To Convert PDF to PDF/A in C++: Figure 2
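A PDF/A file identifies itself through XMP metadata that contains a pdfaid:part entry. As a quick, heuristic check on the output, the sketch below scans the file for that marker using only the standard library. declares_pdfa is a hypothetical helper; finding the marker only shows that the file claims PDF/A conformance, so a dedicated validator is still needed to confirm it.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns true if the file contains the XMP
// identifier "pdfaid:part". This is a heuristic, not a validation.
bool declares_pdfa(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;  // missing or unreadable file
    }
    // Read the whole file into memory; fine for small test documents.
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return bytes.find("pdfaid:part") != std::string::npos;
}
```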

Convert PDF to PDF/A in C#

IronPDF is a .NET PDF library that provides comprehensive functionality for working with PDF documents. It allows developers to create, modify, and convert PDF documents using C# or VB.NET. IronPDF can generate PDF files dynamically from HTML, ASPX, Word documents (.doc), or images, and can incorporate rich elements such as charts, tables, and images (for example, JPG and PNG). It also supports merging, splitting, and editing pages of existing PDF files, along with text extraction and content manipulation.

IronPDF can convert a PDF to the PDF/A-3b standard with just a few lines of code. The following code shows how:

using IronPdf;

// Create a PdfDocument object or open any PDF file
PdfDocument pdf = PdfDocument.FromFile("wikipedia.pdf");

// Use the SaveAsPdfA method to save to file
pdf.SaveAsPdfA("pdf-a3-wikipedia.pdf", PdfAVersions.PdfA3);

The equivalent code in VB.NET:

Imports IronPdf

' Create a PdfDocument object or open any PDF file
Dim pdf As PdfDocument = PdfDocument.FromFile("wikipedia.pdf")

' Use the SaveAsPdfA method to save to file
pdf.SaveAsPdfA("pdf-a3-wikipedia.pdf", PdfAVersions.PdfA3)

Conclusion

This article walked through converting a standard PDF document into PDF/A format in C++ using the QPDF library. PDF/A compliance ensures content preservation and accessibility, making it ideal for archiving.

IronPDF also supports various format conversions such as HTML, images, and Word documents to PDF files. For further information, please visit this site.

IronPDF provides a free trial to test the functionality for commercial production. You can download the software from here.