How to Access All PDF DOM Objects

Accessing the PDF DOM object refers to interacting with the structure of a PDF file in a way similar to manipulating a webpage's DOM (Document Object Model). In the context of PDFs, the DOM is a representation of the document’s internal structure, allowing developers to access and manipulate different elements such as text, images, annotations, and metadata programmatically.

Quickstart: Access and Update PDF DOM Elements with IronPDF

IronPDF's DOM access features let you start working with the contents of a PDF quickly. This quick guide shows how to render or load a PDF, select a page, and reach that page's object model in a single line of code, with no complex setup required.

Get started making PDFs with NuGet now:

  1. Install IronPDF with NuGet Package Manager

    PM > Install-Package IronPdf

  2. Copy and run this code snippet.

    var objs = new IronPdf.ChromePdfRenderer().RenderUrlAsPdf("https://example.com").Pages.First().ObjectModel;

  3. Deploy to test on your live environment

    Start using IronPDF in your project today with a free trial

Access DOM Objects Example

The ObjectModel can be accessed from the PdfPage object. First, import or render the target PDF and access its Pages property. From there, select any page to reach its ObjectModel property.

:path=/static-assets/pdf/content-code-examples/how-to/access-pdf-dom-object.cs
using IronPdf;
using System.Linq;

// Instantiate Renderer
ChromePdfRenderer renderer = new ChromePdfRenderer();

// Create a PDF from a URL
PdfDocument pdf = renderer.RenderUrlAsPdf("https://ironpdf.com/");

// Access DOM Objects
var objects = pdf.Pages.First().ObjectModel;

' VB.NET equivalent
Imports IronPdf
Imports System.Linq

' Instantiate Renderer
Dim renderer As New ChromePdfRenderer()

' Create a PDF from a URL
Dim pdf As PdfDocument = renderer.RenderUrlAsPdf("https://ironpdf.com/")

' Access DOM Objects
Dim objects = pdf.Pages.First().ObjectModel

The ObjectModel property currently exposes three object types: ImageObject, PathObject, and TextObject. Each object reports the index of the page it is on, its bounding box, its scale, and its translation. These values can also be modified.

ImageObject:

  • Height: Height of the image.
  • Width: Width of the image.
  • ExportBytesAsJpg: A method to export the image as a byte array in JPG format.

PathObject:

  • FillColor: The fill color of the path.
  • StrokeColor: The stroke color of the path.
  • Points: A collection of points defining the path.

TextObject:

  • Color: The color of the text.
  • Contents: The actual text content.

Retrieving Glyph Information and Bounding Boxes

Unicode values alone are not always enough to make text appear exactly as intended, for example when the text is combined with custom fonts. In those cases you need to specify exact glyphs, and being able to retrieve glyph information and bounding boxes is useful. IronPDF provides a way to retrieve this information.

We first access the ObjectModel from the PdfPage object. Next, we access its TextObjects collection. Finally, we call the GetGlyphInfo method on the first element to retrieve its glyph and bounding box information.

:path=/static-assets/pdf/content-code-examples/how-to/access-pdf-dom-object-retrieve-glyph.cs
using IronPdf;
using System.Linq;

PdfDocument pdf = PdfDocument.FromFile("invoice.pdf");

var glyph = pdf.Pages.First().ObjectModel.TextObjects.First().GetGlyphInfo();

Translate PDF Object

There are times when you need to adjust a PDF's layout by repositioning elements, such as text or images. You can easily move an object to a new spot on the page by changing its Translate property.

The code example below renders an HTML string that uses CSS Flexbox to center text in the middle of the PDF. Then, we access the first TextObject, which is the word "Centered."

Finally, we translate the TextObject by assigning a new PointF to its Translate property. This shifts the text 200 points to the right and 150 points up. We then save the modified PDF.

Code Example

:path=/static-assets/pdf/content-code-examples/how-to/access-pdf-dom-object-translate.cs
using IronPdf;
using System.Drawing;
using System.Linq;

// Setup the Renderer
var renderer = new ChromePdfRenderer();

// We use CSS Flexbox to perfectly center the text vertically and horizontally.
var html = @"
<div style='display: flex; justify-content: center; align-items: center; font-size: 48px;'>
    Centered
</div>";

// Render the HTML to a PDF
PdfDocument pdf = renderer.RenderHtmlAsPdf(html);

// Save the original PDF to see the "before" state
pdf.SaveAs("BeforeTranslate.pdf");

// Access the first text object on the first page
// In this simple HTML, this will be our "Centered" text block.
var textObject = pdf.Pages.First().ObjectModel.TextObjects.First();

// Apply the translation
// This moves the object 200 points to the right and 150 points up from its original position.
textObject.Translate = new PointF(200, 150);

// Save the modified PDF to see the "after" state
pdf.SaveAs("AfterTranslate.pdf");

Output

As you can see in the output, the word "Centered" has shifted 200 points to the right and 150 points up from its original position.

Scale PDF Object

You can resize any PDF object, such as text or an image, using the Scale property. This property acts as a multiplier. A factor greater than 1 increases the object's size, while a factor between 0 and 1 decreases it.

In this example, we render an HTML string containing an image. Then, we access the first ImageObject and scale it to 70% of its original size. We do this by assigning its Scale property a new PointF with a value of 0.7 for both axes. Finally, we save the modified PDF.

Code Example

:path=/static-assets/pdf/content-code-examples/how-to/access-pdf-dom-object-scale.cs
using IronPdf;
using System.Drawing;
using System.Linq;

// Setup the Renderer
var renderer = new ChromePdfRenderer();

// HTML containing a single image, which we will scale after rendering.
string html = @"<img src='https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTi8LuOR6_A98euPLs-JRwoLU7Nc31nVP15rw&s'>";

// Render the HTML to a PDF
PdfDocument pdf = renderer.RenderHtmlAsPdf(html);

// Save the PDF before scaling for comparison
pdf.SaveAs("BeforeScale.pdf");

// Access the first image object on the first page
var image = pdf.Pages.First().ObjectModel.ImageObjects.First();

// We scale the image to 70% of its original size on both the X and Y axes.
image.Scale = new PointF(0.7f, 0.7f);

// Save the modified PDF to see the result
pdf.SaveAs("AfterScale.pdf");

Output

The output shows the image scaled to 70% of its original size.

Remove PDF Object

You can clean up a PDF by completely removing objects, such as text blocks, shapes, or images. To do so, access the relevant PDF DOM collection, such as ImageObjects or TextObjects, and call its RemoveAt method with the index of the object you want to delete.

In the following code, we load the BeforeScale.pdf file created in the previous example and remove the first image from the first page.

:path=/static-assets/pdf/content-code-examples/how-to/access-pdf-dom-object-remove.cs
using IronPdf;
using IronSoftware.Pdfium.Dom;
using System.Linq;

// Load the PDF file we created in the Scale example
PdfDocument pdf = PdfDocument.FromFile("BeforeScale.pdf");

// Access DOM Objects
IPdfPageObjectModel objects = pdf.Pages.First().ObjectModel;

// Remove first image
objects.ImageObjects.RemoveAt(0);

// Save the modified PDF
pdf.SaveAs("removedFirstImage.pdf");
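
The same RemoveAt call can be used to clear a whole collection. The sketch below is an illustration under stated assumptions, not a documented IronPDF recipe: it assumes the collection can be counted with LINQ's Count() and that RemoveAt shifts later items down by one, as it does for standard .NET lists. The output file name is hypothetical.

```csharp
using IronPdf;
using System.Linq;

// Load the PDF file created in the Scale example
PdfDocument pdf = PdfDocument.FromFile("BeforeScale.pdf");

// Access the image collection of the first page
var images = pdf.Pages.First().ObjectModel.ImageObjects;

// Remove from the last index down to 0 so that the indexes of the
// remaining items do not shift underneath the loop
// (assumption: list-like RemoveAt semantics).
for (int i = images.Count() - 1; i >= 0; i--)
{
    images.RemoveAt(i);
}

// Save the modified PDF
pdf.SaveAs("removedAllImages.pdf");
```

Iterating backwards avoids skipping items, which is what happens when a forward loop removes index i and then advances to i + 1.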

Ready to see what else you can do? Check out our tutorial page here: Edit PDFs

Frequently Asked Questions

How can I access PDF DOM objects in C#?

To access PDF DOM objects in C#, you can use IronPDF. Download the IronPDF library, import or render the PDF document, and then access its Pages collection. From there, use the ObjectModel property of a page to interact with DOM objects such as text, images, and paths.

Which types of objects can I interact with in the PDF DOM?

In the PDF DOM, you can interact with ImageObject, PathObject, and TextObject. These objects let you read and modify attributes such as size, color, and content.

How do I modify text content in a PDF using C#?

You can modify text content in a PDF by using IronPDF to access a TextObject within the ObjectModel of a PdfPage. You can then change properties such as Color and Contents to update the text.
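
As a minimal sketch of that answer: the code below assumes that Contents has a public setter, which is what the statement above implies, and that the renderer emits the word as a single text object. The HTML string and file name are hypothetical.

```csharp
using IronPdf;
using System.Linq;

// Render a one-word page to work with
var renderer = new ChromePdfRenderer();
PdfDocument pdf = renderer.RenderHtmlAsPdf("<p>Draft</p>");

// Access the first text object on the first page
var textObject = pdf.Pages.First().ObjectModel.TextObjects.First();

// Replace the text (assumption: Contents is writable)
textObject.Contents = "Final";

// Save the modified PDF
pdf.SaveAs("ModifiedText.pdf");
```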

What are some common members of ImageObject in the PDF DOM?

ImageObject in the PDF DOM includes properties such as Height and Width, and methods such as ExportBytesAsJpg, which exports the image as a byte array in JPG format.

Can I change the fill color of a path in a PDF document?

Yes. You can change the fill color of a path in a PDF document by using IronPDF to access the PathObject within the PDF DOM and then modifying its FillColor property.

Is accessing the PDF DOM with IronPDF fully stable?

Accessing the PDF DOM with IronPDF is currently an experimental feature and may cause memory leaks when accessing text objects, so it should be used with caution.

What is the ObjectModel in IronPDF?

The ObjectModel in IronPDF is a property of the PdfPage object that provides access to the PDF DOM, allowing you to interact programmatically with PDF elements such as text, images, and paths.

How can I export images from a PDF to JPEG format?

You can export images from a PDF to JPEG format by using IronPDF to access the ImageObject in the PDF DOM and then calling its ExportBytesAsJpg method, which returns the image as a byte array in JPEG format.

Is IronPDF compatible with .NET 10 when working with PDF DOM access?

Yes. IronPDF is fully compatible with .NET 10, including features such as PDF DOM access through ObjectModel. It works out of the box in .NET 10 projects, just as in earlier versions, with no special workarounds needed. ([ironpdf.com](https://ironpdf.com/blog/net-help/net-10-features/?utm_source=openai))

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping to improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.