How to Access All PDF DOM Objects

Accessing the PDF DOM means interacting with the structure of a PDF file in much the same way as manipulating a webpage's DOM (Document Object Model). For PDFs, the DOM is a representation of the document’s internal structure that lets developers programmatically access and manipulate elements such as text, images, annotations, and metadata.

Access DOM Objects Example

The ObjectModel can be accessed from the PdfPage object. First, import the target PDF and access its Pages property. From there, select any page, and you will have access to the ObjectModel property.

Warning
This feature is still experimental. It leaks memory when accessing text objects from the DOM.

using IronPdf;
using System.Linq;

// Instantiate Renderer
ChromePdfRenderer renderer = new ChromePdfRenderer();

// Create a PDF from a URL
PdfDocument pdf = renderer.RenderUrlAsPdf("https://ironpdf.com/");

// Access DOM Objects
var objects = pdf.Pages.First().ObjectModel;

' VB.NET equivalent of the C# example above
Imports IronPdf
Imports System.Linq

' Instantiate Renderer
Dim renderer As New ChromePdfRenderer()

' Create a PDF from a URL
Dim pdf As PdfDocument = renderer.RenderUrlAsPdf("https://ironpdf.com/")

' Access DOM Objects
Dim objects = pdf.Pages.First().ObjectModel
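
The example above renders a web page, while the text describes importing an existing PDF. A minimal sketch of that variant follows. The file name "input.pdf" is a placeholder, not something from the original example, and the `using` declaration assumes the document object is disposable, which seems prudent given the memory warning above.

```csharp
using System.Linq;
using IronPdf;

// "input.pdf" is a hypothetical path; substitute your own file.
// The using declaration releases the document when it goes out of scope.
using PdfDocument pdf = PdfDocument.FromFile("input.pdf");

// Select the first page and read its ObjectModel, as in the example above
var objects = pdf.Pages.First().ObjectModel;
```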

The ObjectModel property currently consists of ImageObject, PathObject, and TextObject. Each object contains information about the page index it is on, its bounding box, scale, and translation. This information can also be modified.
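
As a rough sketch of modifying that information, the snippet below moves the first text object on a page. It assumes the object model exposes a `TextObjects` collection and that the translation is a settable `Translate` property of type `PointF`. Neither name is confirmed by this article, so check them against the API reference for your IronPDF version. The HTML string and output file name are placeholders.

```csharp
using System.Drawing;
using System.Linq;
using IronPdf;

// Render a small document so the page contains at least one text object
ChromePdfRenderer renderer = new ChromePdfRenderer();
PdfDocument pdf = renderer.RenderHtmlAsPdf("<h1>Hello DOM</h1>");

// Assumption: text objects are exposed through a TextObjects collection
var firstText = pdf.Pages.First().ObjectModel.TextObjects.First();

// Assumption: Translate is a settable PointF offset
firstText.Translate = new PointF(50, 50);

// Save the modified document (hypothetical output path)
pdf.SaveAs("translated.pdf");
```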

ImageObject:

  • Height: Height of the image.
  • Width: Width of the image.
  • ExportBytesAsJpg: A method to export the image as a byte array in JPG format.

PathObject:

  • FillColor: The fill color of the path.
  • StrokeColor: The stroke color of the path.
  • Points: A collection of points defining the path.

TextObject:

  • Color: The color of the text.
  • Contents: The actual text content.
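
The properties listed above could be read along the following lines. This is a sketch under assumptions: the collection names `TextObjects`, `ImageObjects`, and `PathObjects` are not given in this article, `ExportBytesAsJpg` is assumed to take no arguments, and the file names are placeholders. Because the feature is experimental, treat it as illustrative rather than definitive.

```csharp
using System;
using System.IO;
using System.Linq;
using IronPdf;

// "input.pdf" is a hypothetical path
PdfDocument pdf = PdfDocument.FromFile("input.pdf");
var objects = pdf.Pages.First().ObjectModel;

// Assumption: text objects live in a TextObjects collection
foreach (var text in objects.TextObjects)
{
    Console.WriteLine($"Text: \"{text.Contents}\" (color: {text.Color})");
}

// Assumption: image objects live in an ImageObjects collection
int imageIndex = 0;
foreach (var image in objects.ImageObjects)
{
    Console.WriteLine($"Image: {image.Width} x {image.Height}");

    // Assumption: ExportBytesAsJpg() takes no arguments and returns byte[]
    File.WriteAllBytes($"image-{imageIndex++}.jpg", image.ExportBytesAsJpg());
}

// Assumption: path objects live in a PathObjects collection
foreach (var path in objects.PathObjects)
{
    Console.WriteLine($"Path: fill {path.FillColor}, stroke {path.StrokeColor}");
}
```

Note that the text loop touches text objects, which is the case the warning above flags as leaking memory, so it may be best kept out of long-running processes.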


Frequently Asked Questions

What is the PDF DOM object?

The PDF DOM object refers to the internal structure of a PDF document, allowing developers to access and manipulate elements such as text, images, annotations, and metadata programmatically.

How can I access PDF DOM objects in C#?

To access PDF DOM objects with IronPDF, download the C# library, import or render the PDF document, select a page from its Pages collection, and use that page's ObjectModel property to interact with the DOM objects.

What are the main types of objects in the PDF DOM?

The main types of objects in the PDF DOM include ImageObject, PathObject, and TextObject, each with properties that can be accessed and modified.

What properties can be accessed in a TextObject?

In a TextObject, you can access properties like Color and Contents, which represent the text color and the actual text content, respectively.

How can I manipulate text objects in a PDF?

You can manipulate text objects in a PDF by using IronPDF to access the ObjectModel from a PdfPage, iterating through TextObjects, and modifying properties such as Color and Contents.

What is the purpose of the ObjectModel property?

The ObjectModel property provides access to the PDF DOM, allowing developers to interact with and manipulate PDF elements programmatically using IronPDF.

Are there any known issues with accessing PDF DOM?

Yes, when using IronPDF, the feature is still experimental and may leak memory when accessing text objects from the DOM.

Chaknith
Software Engineer
Chaknith is the Sherlock Holmes of developers. It first occurred to him that he might have a future in software engineering when he was doing code challenges for fun. His focus is on IronXL and IronBarcode, but he takes pride in helping customers with every product. Chaknith leverages what he learns from talking directly with customers to help improve the products themselves. His anecdotal feedback goes beyond Jira tickets and supports product development, documentation, and marketing to improve customers' overall experience. When he isn’t in the office, he can be found learning about machine learning, coding, and hiking.