How to Access All PDF DOM Objects
Accessing PDF DOM objects means interacting with the structure of a PDF file in much the same way you would manipulate a webpage's DOM (Document Object Model). For a PDF, the DOM is a representation of the document's internal structure that lets developers access and modify elements such as text, images, annotations, and metadata programmatically.
- Download the C# library to access PDF DOM Objects
- Import or render the targeted PDF document
- Access the PDF's pages collection and select the desired page
- Use the ObjectModel property to view and interact with the DOM objects
- Save or export the modified PDF document
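The steps above can be sketched roughly as follows for a PDF that already exists on disk. This is a minimal sketch rather than a definitive implementation: the file names input.pdf and output.pdf are placeholders, and it assumes IronPDF is installed and licensed in the project.

```csharp
using IronPdf;
using System;

// Import an existing PDF ("input.pdf" is a placeholder path).
PdfDocument pdf = PdfDocument.FromFile("input.pdf");

// Guard against an empty document before indexing into the pages collection.
if (pdf.PageCount > 0)
{
    // Select the desired page from the pages collection (here, the first one).
    var page = pdf.Pages[0];

    // Read the page's DOM through the ObjectModel property.
    var objectModel = page.ObjectModel;

    // Inspect or modify the DOM objects here.
    Console.WriteLine("Loaded the object model for page 1: " + objectModel);
}

// Save the document ("output.pdf" is a placeholder path).
pdf.SaveAs("output.pdf");
```

Loading from a file instead of rendering a URL keeps the sketch independent of network access; either route ends with a PdfDocument whose pages expose ObjectModel.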
Access DOM Objects Example
The ObjectModel can be accessed from the PdfPage object. First, import the target PDF and access its Pages property. From there, select any page, and you will have access to the ObjectModel property.
Warning: This feature is still experimental and may leak memory when accessing text objects.
using IronPdf;
using System;
using System.Linq;

// Instantiate the ChromePdfRenderer, which renders HTML content to PDF.
ChromePdfRenderer renderer = new ChromePdfRenderer();

try
{
    // Fetch the webpage at the URL and convert it to a PDF document.
    PdfDocument pdf = renderer.RenderUrlAsPdf("https://ironpdf.com/");

    // Ensure the document has at least one page before calling First().
    if (pdf.Pages.Any())
    {
        // Access the first page's DOM objects through the ObjectModel property.
        var objects = pdf.Pages.First().ObjectModel;

        // Example: print the string representation of each DOM object on the first page.
        foreach (var obj in objects)
        {
            Console.WriteLine(obj.ToString());
        }
    }
    else
    {
        // Inform the user if the PDF has no pages.
        Console.WriteLine("The PDF contains no pages.");
    }
}
catch (Exception ex)
{
    // Handle any errors that occur while rendering or reading the PDF.
    Console.WriteLine("An error occurred while rendering the PDF: " + ex.Message);
}
Imports IronPdf
Imports System
Imports System.Linq

' Instantiate the ChromePdfRenderer, which renders HTML content to PDF.
Dim renderer As New ChromePdfRenderer()

Try
    ' Fetch the webpage at the URL and convert it to a PDF document.
    Dim pdf As PdfDocument = renderer.RenderUrlAsPdf("https://ironpdf.com/")

    ' Ensure the document has at least one page before calling First().
    If pdf.Pages.Any() Then
        ' Access the first page's DOM objects through the ObjectModel property.
        Dim objects = pdf.Pages.First().ObjectModel

        ' Example: print the string representation of each DOM object on the first page.
        For Each obj In objects
            Console.WriteLine(obj.ToString())
        Next obj
    Else
        ' Inform the user if the PDF has no pages.
        Console.WriteLine("The PDF contains no pages.")
    End If
Catch ex As Exception
    ' Handle any errors that occur while rendering or reading the PDF.
    Console.WriteLine("An error occurred while rendering the PDF: " & ex.Message)
End Try

The ObjectModel property currently consists of ImageObject, PathObject, and TextObject. Each object contains information about the page index it is on, its bounding box, scale, and translation. This information can also be modified.
ImageObject:
- Height: Height of the image.
- Width: Width of the image.
- ExportBytesAsJpg: A method to export the image as a byte array in JPG format.
PathObject:
- FillColor: The fill color of the path.
- StrokeColor: The stroke color of the path.
- Points: A collection of points defining the path.
TextObject:
- Color: The color of the text.
- Contents: The actual text content.
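As a rough illustration of reading the properties listed above, the sketch below prints a few of them for each object type on the first page. Note the assumptions: the collection names TextObjects, ImageObjects, and PathObjects on the object model are not stated in this article and should be checked against the API reference for your IronPDF version, and input.pdf is a placeholder path.

```csharp
using IronPdf;
using System;
using System.Linq;

// Import an existing PDF ("input.pdf" is a placeholder path).
PdfDocument pdf = PdfDocument.FromFile("input.pdf");

// Stop early if there is no page to inspect.
if (pdf.PageCount == 0)
{
    Console.WriteLine("The PDF contains no pages.");
    return;
}

var objects = pdf.Pages.First().ObjectModel;

// Assumption: text objects are exposed through a TextObjects collection.
foreach (var text in objects.TextObjects)
{
    Console.WriteLine($"Text: \"{text.Contents}\" (color: {text.Color})");
}

// Assumption: image objects are exposed through an ImageObjects collection.
foreach (var image in objects.ImageObjects)
{
    Console.WriteLine($"Image: {image.Width} x {image.Height}");
}

// Assumption: path objects are exposed through a PathObjects collection.
foreach (var path in objects.PathObjects)
{
    Console.WriteLine($"Path: fill {path.FillColor}, stroke {path.StrokeColor}");
}
```

Because the feature is experimental, it is worth keeping this kind of inspection code short-lived and disposing of documents you no longer need.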
Frequently Asked Questions
What is the PDF DOM object?
The PDF DOM object refers to the internal structure of a PDF document, allowing developers to access and manipulate elements such as text, images, annotations, and metadata programmatically.
How can I access PDF DOM objects in C#?
To access PDF DOM objects, you can use IronPDF by downloading the C# library, importing or rendering the PDF document, accessing the pages collection, and using the ObjectModel property to interact with the DOM objects.
What are the main types of objects in the PDF DOM?
The main types of objects in the PDF DOM include ImageObject, PathObject, and TextObject, each with properties that can be accessed and modified.
What properties can be accessed in a TextObject?
In a TextObject, you can access properties like Color and Contents, which represent the text color and the actual text content, respectively.
How can I manipulate text objects in a PDF?
You can manipulate text objects in a PDF by using IronPDF to access the ObjectModel from a PdfPage, iterating through TextObjects, and modifying properties such as Color and Contents.
What is the purpose of the ObjectModel property?
The ObjectModel property provides access to the PDF DOM, allowing developers to interact with and manipulate PDF elements programmatically using IronPDF.
Are there any known issues with accessing PDF DOM?
Yes. In IronPDF, the feature is still experimental and may leak memory when accessing text objects from the DOM.