Class TextContent
Extracted text content outside of tables
Provides methods to access text content for the entire document or specific pages.
Inheritance
System.Object
TextContent
Namespace: IronPdf.Extractions
Assembly: IronPdf.dll
Syntax
public class TextContent : Object
Remarks
When table extraction is enabled, this class returns only the text that is outside detected table regions.
If table extraction is disabled, all text on the page is returned (including text inside tables), because table boundaries are not available.
Constructors
TextContent()
Declaration
public TextContent()
Properties
PageTexts
Text organized by page
Dictionary containing text content for each page.
Key is the page number (1-based), value is the PageText for that page.
Declaration
public Dictionary<int, PageText> PageTexts { get; }
Property Value
| Type | Description |
|---|---|
| System.Collections.Generic.Dictionary<System.Int32, PageText> | Text organized by page; keys are 1-based page numbers |
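The 1-based keying described above can be illustrated with a minimal sketch. It assumes a `TextContent` instance has already been obtained elsewhere (this section does not show how), and the class and method names `PageTextsExample` and `ListPages` are hypothetical. Only the documented `PageTexts` property is used; the members of `PageText` are not covered here, so the sketch does not touch them.

```csharp
using System;
using System.Linq;
using IronPdf.Extractions;

public static class PageTextsExample
{
    // Hypothetical helper: lists the page numbers that have an entry in PageTexts.
    public static void ListPages(TextContent content)
    {
        // Dictionary<TKey, TValue> does not guarantee enumeration order,
        // so the keys are sorted before printing.
        foreach (int pageNumber in content.PageTexts.Keys.OrderBy(n => n))
        {
            Console.WriteLine($"Page {pageNumber} has an entry.");
        }

        // Keys are 1-based, so the first page is looked up with 1, not 0.
        if (content.PageTexts.ContainsKey(1))
        {
            Console.WriteLine("Found an entry for page 1.");
        }
    }
}
```

Sorting the keys is a defensive choice: the documentation states what the keys mean, not the order in which they are enumerated.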
RawText
Raw text from all pages concatenated
Contains the text content of all pages in the document, concatenated together.
When table extraction is enabled, this contains only text located outside detected tables.
When table extraction is disabled, this contains all text including text inside tables.
Declaration
public string RawText { get; }
Property Value
| Type | Description |
|---|---|
| System.String | Raw text from all pages concatenated |
Methods
GetText()
Gets the raw text of the entire document
Returns the text content of all pages in the document, concatenated together.
When table extraction is enabled, only text located outside detected tables is returned; otherwise all text is returned, including text inside tables.
Declaration
public string GetText()
Returns
| Type | Description |
|---|---|
| System.String | Raw text of the entire document |
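As a rough illustration of consuming the result of `GetText()`, the sketch below counts whitespace-separated words in the document text. It assumes a `TextContent` instance obtained elsewhere; `GetTextExample` and `CountWords` are hypothetical names. Whether `GetText()` can return `null` is not stated in this reference, so the sketch guards against it.

```csharp
using System;
using IronPdf.Extractions;

public static class GetTextExample
{
    // Hypothetical helper: counts whitespace-separated words in the document text.
    public static int CountWords(TextContent content)
    {
        string text = content.GetText();

        // Defensive guard: null or blank text counts as zero words.
        if (string.IsNullOrWhiteSpace(text))
        {
            return 0;
        }

        char[] separators = { ' ', '\t', '\r', '\n' };
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

Keep in mind that when table extraction is enabled, the count excludes words inside detected tables, because `GetText()` returns only the text outside them.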
GetTextByPage(Int32)
Gets the raw text for a specific page
Returns the text content of the specified page.
Declaration
public string GetTextByPage(int pageNumber)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | pageNumber | Page number (1-based) |
Returns
| Type | Description |
|---|---|
| System.String | Text content of the specified page |
Exceptions
| Type | Condition |
|---|---|
| System.ArgumentOutOfRangeException | Thrown when: |
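Because `GetTextByPage` is documented to throw `System.ArgumentOutOfRangeException`, callers that cannot guarantee a valid page number may want a non-throwing wrapper. The following is a sketch only: it assumes a `TextContent` instance obtained elsewhere, and `GetTextByPageExample` and `TryGetPageText` are hypothetical names, not part of the library.

```csharp
using System;
using IronPdf.Extractions;

public static class GetTextByPageExample
{
    // Hypothetical wrapper: returns false instead of throwing when the
    // page number is rejected by GetTextByPage.
    public static bool TryGetPageText(TextContent content, int pageNumber, out string text)
    {
        try
        {
            // pageNumber is 1-based, as documented.
            text = content.GetTextByPage(pageNumber);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            text = string.Empty;
            return false;
        }
    }
}
```

A call such as `TryGetPageText(content, 0, out var text)` would be expected to return `false` if page 0 is rejected, since page numbers start at 1.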
GetTextByPageRange(Int32, Int32)
Gets the raw text for a page range
Returns the text content of all pages in the specified range, concatenated together.
Declaration
public string GetTextByPageRange(int startPage, int endPage)
Parameters
| Type | Name | Description |
|---|---|---|
| System.Int32 | startPage | Starting page number (1-based, inclusive) |
| System.Int32 | endPage | Ending page number (1-based, inclusive) |
Returns
| Type | Description |
|---|---|
| System.String | Text content of the specified page range |
Exceptions
| Type | Condition |
|---|---|
| System.ArgumentOutOfRangeException | Thrown when: |
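Since both bounds are 1-based and inclusive, processing a document in fixed-size batches requires a small amount of index arithmetic. The sketch below shows one way to do it. It assumes a `TextContent` instance obtained elsewhere and a page count supplied by the caller; `PageRangeExample`, `ReadInBatches`, `pageCount`, and `batchSize` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using IronPdf.Extractions;

public static class PageRangeExample
{
    // Hypothetical helper: yields the text of consecutive page batches.
    // Because this is an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<string> ReadInBatches(TextContent content, int pageCount, int batchSize)
    {
        if (batchSize < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }

        for (int start = 1; start <= pageCount; start += batchSize)
        {
            // The end page is inclusive, so subtract 1 and clamp to the last page.
            int end = Math.Min(start + batchSize - 1, pageCount);
            yield return content.GetTextByPageRange(start, end);
        }
    }
}
```

With `pageCount` of 7 and `batchSize` of 3, the loop requests the ranges 1 to 3, 4 to 6, and 7 to 7.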