Class PdfPageModel
Represents the document object model (DOM) for a single PDF page. Provides access to text, image, and path objects for content analysis and extraction.
This class exposes the internal structure of a PDF page so that individual content elements can be accessed programmatically. Use it for content extraction, analysis, or inspecting how a page is composed.
Example - Analyze page content:
using System;
using IronPdf;

// Load the document and build the object model for its first page:
var pdf = PdfDocument.FromFile("document.pdf");
var pageModel = pdf.Pages[0].GetPageModel();
// Count content elements:
Console.WriteLine($"Images: {pageModel.ImageObjects.Count}");
Console.WriteLine($"Text blocks: {pageModel.TextObjects.Count}");
Console.WriteLine($"Paths/shapes: {pageModel.PathObjects.Count}");
// Get page dimensions:
var bounds = pageModel.BoundingBox;
Console.WriteLine($"Page size: {bounds.Width} x {bounds.Height} points");
// Export to JSON:
string json = pageModel.ToJson();
Implements
Namespace: IronPdf.Pages
Assembly: IronPdf.dll
Syntax
public class PdfPageModel : PdfClientAccessor, IPdfPageObjectModel, IDocumentPageObjectModel<TextObjectCollection, PathObjectCollection, ImageObjectCollection>, IBounded, IJsonSerializable
Properties
BoundingBox
Declaration
public RectangleF BoundingBox { get; set; }
Property Value
| Type | Description |
|---|---|
| System.Drawing.RectangleF | Bounding rectangle of the page, in points. |
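Since the page size is reported in points, converting it to physical units is a simple scaling. The following is a minimal sketch, assuming the default PDF unit of 1 point = 1/72 inch. `PageUnits` and its methods are hypothetical helpers, not part of IronPdf; they only take a `RectangleF` such as the value returned by `BoundingBox`.

```csharp
using System;
using System.Drawing;

// Hypothetical helper (not part of IronPdf): converts sizes in PDF points to physical units.
public static class PageUnits
{
    private const float PointsPerInch = 72f;        // default PDF unit: 1 point = 1/72 inch
    private const float MillimetersPerInch = 25.4f;

    public static float PointsToInches(float points) => points / PointsPerInch;

    public static float PointsToMillimeters(float points) =>
        points / PointsPerInch * MillimetersPerInch;

    // Formats a bounding box as inches and millimeters, using culture-invariant number formatting.
    public static string Describe(RectangleF bounds) =>
        FormattableString.Invariant(
            $"{PointsToInches(bounds.Width):F2} x {PointsToInches(bounds.Height):F2} in ") +
        FormattableString.Invariant(
            $"({PointsToMillimeters(bounds.Width):F0} x {PointsToMillimeters(bounds.Height):F0} mm)");
}
```

For example, `PageUnits.Describe(pageModel.BoundingBox)` on a US Letter page (612 x 792 points) would report 8.50 x 11.00 in.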
ImageObjects
Gets the collection of embedded images on this page. Each image includes position, dimensions, and raw image data.
Example - Extract all images:
foreach (var img in pageModel.ImageObjects)
{
    Console.WriteLine($"Image at ({img.BoundingBox.X}, {img.BoundingBox.Y})");
    Console.WriteLine($"Size: {img.BoundingBox.Width} x {img.BoundingBox.Height}");
}
Declaration
public ImageObjectCollection ImageObjects { get; }
Property Value
| Type | Description |
|---|---|
| ImageObjectCollection | Collection of ImageObject instances on this page. |
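One use of image positions and dimensions is to estimate how much of the page an image covers, for instance to spot scanned pages that consist of a single full-page image. The sketch below is a hypothetical helper (`ImageCoverage` is not part of IronPdf). It assumes that an image's `BoundingBox` is a `RectangleF` expressed in the same coordinate system as the page's `BoundingBox`, which this page does not confirm.

```csharp
using System.Drawing;

// Hypothetical helper (not part of IronPdf): measures how much of a page an item covers.
public static class ImageCoverage
{
    // Returns the fraction (0..1) of the page area that overlaps the item's bounding box.
    public static float Ratio(RectangleF page, RectangleF item)
    {
        float pageArea = page.Width * page.Height;
        if (pageArea <= 0f)
        {
            return 0f; // degenerate page bounds: nothing meaningful to report
        }

        // Intersect returns an empty rectangle when the two boxes do not overlap.
        RectangleF overlap = RectangleF.Intersect(page, item);
        return overlap.Width * overlap.Height / pageArea;
    }
}

// Possible usage, assuming img.BoundingBox is a RectangleF:
// foreach (var img in pageModel.ImageObjects)
// {
//     if (ImageCoverage.Ratio(pageModel.BoundingBox, img.BoundingBox) > 0.9f)
//         Console.WriteLine("Page is probably a scan.");
// }
```

The 0.9 threshold in the usage comment is an arbitrary illustration; tune it for the documents at hand.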
PageIndex
Declaration
public uint PageIndex { get; }
Property Value
| Type | Description |
|---|---|
| System.UInt32 | Index of this page within the document. |
PathObjects
Gets the collection of vector path objects (shapes, lines, curves) on this page. Includes rectangles, lines, Bézier curves, and complex vector graphics.
Example - Analyze shapes:
foreach (var path in pageModel.PathObjects)
{
    var box = path.BoundingBox;
    Console.WriteLine($"Shape at ({box.X}, {box.Y}), size: {box.Width} x {box.Height}");
}
Declaration
public PathObjectCollection PathObjects { get; }
Property Value
| Type | Description |
|---|---|
| PathObjectCollection | Collection of PathObject instances representing vector graphics. |
Remarks
Path objects include borders, lines, shapes, and decorative elements. They are defined using PDF path operators (moveto, lineto, curveto, etc.).
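Because borders and rule lines show up as very thin paths, a bounding box alone is often enough to tell lines from filled shapes. The following is a rough heuristic sketch, not an IronPdf feature: `PathShape`, `PathClassifier`, and the `maxThickness` default are all assumptions made for illustration, and the input is assumed to be a `RectangleF` like `path.BoundingBox`.

```csharp
using System.Drawing;

// Hypothetical classification result (not an IronPdf type).
public enum PathShape
{
    HorizontalLine,
    VerticalLine,
    Box
}

// Hypothetical helper (not part of IronPdf): guesses a path's role from its bounding box.
public static class PathClassifier
{
    // maxThickness is the largest extent, in points, that still counts as a "line".
    public static PathShape Classify(RectangleF box, float maxThickness = 1.5f)
    {
        if (box.Height <= maxThickness && box.Width > box.Height)
        {
            return PathShape.HorizontalLine; // wide and very flat
        }

        if (box.Width <= maxThickness && box.Height > box.Width)
        {
            return PathShape.VerticalLine;   // tall and very narrow
        }

        return PathShape.Box;                // anything else: rectangle, curve, complex graphic
    }
}

// Possible usage, assuming path.BoundingBox is a RectangleF:
// foreach (var path in pageModel.PathObjects)
//     Console.WriteLine(PathClassifier.Classify(path.BoundingBox));
```

Counting horizontal and vertical lines this way can serve as a first hint that a page contains a ruled table, although diagonal lines and curves all fall into the `Box` bucket.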
TextObjects
Gets the collection of text objects (characters, words, text runs) on this page. Provides granular access to text content with position, font, and style information.
Example - Extract text with positions:
foreach (var text in pageModel.TextObjects)
{
    Console.WriteLine($"Text: '{text.Text}'");
    Console.WriteLine($"Position: ({text.BoundingBox.X}, {text.BoundingBox.Y})");
    Console.WriteLine($"Font: {text.FontName}, Size: {text.FontSize}");
}
Declaration
public TextObjectCollection TextObjects { get; }
Property Value
| Type | Description |
|---|---|
| TextObjectCollection | Collection of TextObject instances with text content and metadata. |
Remarks
For simpler text extraction without position data, use PdfDocument.ExtractAllText(TextExtractionOrder) or PdfDocument.ExtractTextFromPage(Int32).
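When the position data is needed, a common next step is to regroup individual text objects into lines. The sketch below shows one simple way to do that, under stated assumptions: `TextRun` and `LineGrouper` are hypothetical types (not part of IronPdf) that would be filled from each object's `Text` and `BoundingBox`; the `tolerance` default is arbitrary; and the direction of the Y axis is left as a parameter because this page does not say where the origin lies. It requires C# 9 or later for the record type.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

// Hypothetical container (not an IronPdf type) for one piece of positioned text.
public sealed record TextRun(string Text, RectangleF Box);

// Hypothetical helper (not part of IronPdf): groups positioned text runs into lines.
public static class LineGrouper
{
    public static List<string> ToLines(
        IEnumerable<TextRun> runs,
        float tolerance = 2f,          // max Y difference, in points, within one line
        bool yGrowsDownward = false)   // false: origin at the bottom-left of the page
    {
        // Visit runs from the top of the page to the bottom.
        List<TextRun> ordered = yGrowsDownward
            ? runs.OrderBy(r => r.Box.Y).ToList()
            : runs.OrderByDescending(r => r.Box.Y).ToList();

        var lines = new List<List<TextRun>>();
        foreach (TextRun run in ordered)
        {
            // Join the current line if this run sits at (almost) the same height as its first run.
            if (lines.Count > 0 &&
                Math.Abs(lines[lines.Count - 1][0].Box.Y - run.Box.Y) <= tolerance)
            {
                lines[lines.Count - 1].Add(run);
            }
            else
            {
                lines.Add(new List<TextRun> { run });
            }
        }

        // Within each line, read left to right.
        return lines
            .Select(line => string.Join(" ", line.OrderBy(r => r.Box.X).Select(r => r.Text)))
            .ToList();
    }
}
```

The input list could be built inside the `foreach` loop shown in the example above, for instance with `runs.Add(new TextRun(text.Text, text.BoundingBox))`, assuming `text.BoundingBox` is a `RectangleF`.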
Methods
ToJson()
Declaration
public string ToJson()
Returns
| Type | Description |
|---|---|
| System.String | JSON representation of this page model. |
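The returned string can be inspected or stored with the standard `System.Text.Json` APIs. The following is a minimal sketch, assuming only that `ToJson()` returns well-formed JSON; it does not rely on any particular property names, since the schema is not described here. `PageJson` is a hypothetical helper, not part of IronPdf.

```csharp
using System.IO;
using System.Text.Json;

// Hypothetical helper (not part of IronPdf): pretty-prints and saves a JSON string.
public static class PageJson
{
    // Re-serializes the JSON with indentation, without assuming any schema.
    public static string Indent(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json); // throws JsonException on invalid input
        return JsonSerializer.Serialize(
            doc.RootElement,
            new JsonSerializerOptions { WriteIndented = true });
    }

    // Writes the indented JSON to the given file path.
    public static void Save(string json, string path) =>
        File.WriteAllText(path, Indent(json));
}

// Possible usage:
// PageJson.Save(pageModel.ToJson(), "page0.json");
```

Parsing before saving also acts as a cheap validity check, because malformed input fails at `JsonDocument.Parse` rather than producing a broken file.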