How to Access All PDF DOM Objects

Chaknith Bin

October 14, 2024

Updated October 15, 2024

Accessing the PDF DOM means interacting with the structure of a PDF file much as you would manipulate a webpage's DOM (Document Object Model). For PDFs, the DOM is a representation of the document’s internal structure that lets developers programmatically access and manipulate elements such as text, images, annotations, and metadata.

How to Access All PDF DOM Objects

Download the C# library to access PDF DOM Objects
Import or render the targeted PDF document
Access the PDF's pages collection and select the desired page
Use the ObjectModel property to view and interact with the DOM objects
Save or export the modified PDF document


Access DOM Objects Example

The ObjectModel can be accessed from the PdfPage object. First, import the target PDF and access its Pages property. From there, select any page, and you will have access to the ObjectModel property.

Warning

This feature is still experimental. It leaks memory when accessing text objects from the DOM.


using IronPdf;
using System.Linq;

// Instantiate Renderer
ChromePdfRenderer renderer = new ChromePdfRenderer();

// Create a PDF from a URL
PdfDocument pdf = renderer.RenderUrlAsPdf("https://ironpdf.com/");

// Access DOM Objects
var objects = pdf.Pages.First().ObjectModel;

Imports IronPdf
Imports System.Linq

Module Program
    Sub Main()
        ' Instantiate Renderer
        Dim renderer As New ChromePdfRenderer()

        ' Create a PDF from a URL
        Dim pdf As PdfDocument = renderer.RenderUrlAsPdf("https://ironpdf.com/")

        ' Access DOM Objects
        Dim objects = pdf.Pages.First().ObjectModel
    End Sub
End Module

The ObjectModel property currently exposes three kinds of objects: ImageObject, PathObject, and TextObject. Each object records the index of the page it is on, along with its bounding box, scale, and translation. These values can also be modified.

ImageObject:

Height: Height of the image.
Width: Width of the image.
ExportBytesAsJpg: A method to export the image as a byte array in JPG format.

PathObject:

FillColor: The fill color of the path.
StrokeColor: The stroke color of the path.
Points: A collection of points defining the path.

TextObject:

Color: The color of the text.
Contents: The actual text content.
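
The properties listed above can be read by looping over each kind of object on a page. The following is a minimal sketch, not a definitive implementation. It assumes the ObjectModel exposes the three object types through collections named TextObjects, ImageObjects, and PathObjects; those collection names are an assumption and do not appear in this article, so check them against the API reference for your IronPDF version. The output file names are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;
using IronPdf;

// Render a page to inspect (same calls as the example above)
ChromePdfRenderer renderer = new ChromePdfRenderer();
PdfDocument pdf = renderer.RenderUrlAsPdf("https://ironpdf.com/");

// Get the DOM objects of the first page
var objects = pdf.Pages.First().ObjectModel;

// ASSUMPTION: the collection names TextObjects, ImageObjects, and PathObjects
// are not confirmed by this article; verify them in the API reference.

// Print the content and color of every text object
foreach (var text in objects.TextObjects)
{
    Console.WriteLine($"Text: {text.Contents} (color: {text.Color})");
}

// Print the size of every image and save it as a JPG file
int imageIndex = 0;
foreach (var image in objects.ImageObjects)
{
    Console.WriteLine($"Image: {image.Width} x {image.Height}");
    byte[] jpgBytes = image.ExportBytesAsJpg();
    File.WriteAllBytes($"image-{imageIndex}.jpg", jpgBytes); // hypothetical file name
    imageIndex++;
}

// Print the fill and stroke colors of every path
foreach (var path in objects.PathObjects)
{
    Console.WriteLine($"Path: fill {path.FillColor}, stroke {path.StrokeColor}");
}
```

Since the warning above notes a memory leak when text objects are accessed, it may be worth limiting the text loop to the pages you actually need.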
