    Class TextContent

    Extracted text content outside of tables

    Provides methods to access text content for the entire document or specific pages.

    Inheritance
    System.Object
    TextContent
    Namespace: IronPdf.Extractions
    Assembly: IronPdf.dll
    Syntax
    public class TextContent : Object
    Remarks

    When table extraction is enabled, this class returns only the text that is outside detected table regions.

    If table extraction is disabled, all text on the page is returned (including text inside tables), because table boundaries are not available.

    Constructors

    TextContent()

    Declaration
    public TextContent()

    Properties

    PageTexts

    Text organized by page

    Dictionary containing text content for each page.

    Key is the page number (1-based), value is the PageText for that page.

    Declaration
    public Dictionary<int, PageText> PageTexts { get; }
    Property Value
    Type Description
    System.Collections.Generic.Dictionary<System.Int32, PageText>
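Because PageTexts is a plain Dictionary keyed by page number, its enumeration order is not guaranteed to follow page order. The sketch below shows one way to walk the keys in ascending page order. The helper name `PageOrder.SortedPageNumbers` is hypothetical and not part of IronPDF; it is generic over the value type so it compiles without referencing PageText.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of IronPDF.
public static class PageOrder
{
    // Returns the page-number keys in ascending order.
    // Dictionary<TKey, TValue> does not promise any enumeration order,
    // so the keys are sorted explicitly before use.
    public static List<int> SortedPageNumbers<TValue>(Dictionary<int, TValue> pageTexts)
    {
        return pageTexts.Keys.OrderBy(pageNumber => pageNumber).ToList();
    }
}
```

Given a TextContent instance named `content` (obtained elsewhere), `PageOrder.SortedPageNumbers(content.PageTexts)` would yield the 1-based page numbers in order, and each one can then be used to index `content.PageTexts`.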

    RawText

    Raw text from all pages concatenated

    Contains the text content of all pages in the document, concatenated together.

    When table extraction is enabled, this contains only text located outside detected tables.

    When table extraction is disabled, this contains all text including text inside tables.

    Declaration
    public string RawText { get; }
    Property Value
    Type Description
    System.String

    Methods

    GetText()

    Gets the raw text of the entire document

    Returns the text content of all pages in the document, concatenated together.

    Returns only outside-table text when table extraction is enabled; otherwise returns all text (tables included).

    Declaration
    public string GetText()
    Returns
    Type           Description
    System.String  Raw text of the entire document

    GetTextByPage(Int32)

    Gets the raw text for a specific page

    Returns the text content of the specified page.

    Declaration
    public string GetTextByPage(int pageNumber)
    Parameters
    Type          Name        Description
    System.Int32  pageNumber  Page number (1-based)

    Returns
    Type           Description
    System.String  Text content of the specified page

    Exceptions
    Type Condition
    System.ArgumentOutOfRangeException

    Thrown when:

    • pageNumber is less than 1
    • pageNumber is greater than the total number of pages
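Callers that receive page numbers from user input may prefer a non-throwing wrapper around this method. The following is a minimal sketch under that assumption. `PageTextHelpers.TryGetPageText` is a hypothetical name, and the method takes a delegate instead of a TextContent so that the example is self-contained; it relies only on the documented behavior that an out-of-range page number raises ArgumentOutOfRangeException.

```csharp
using System;

// Hypothetical helper, not part of IronPDF.
public static class PageTextHelpers
{
    // getTextByPage is expected to behave like TextContent.GetTextByPage:
    // it takes a 1-based page number and throws ArgumentOutOfRangeException
    // when that number is below 1 or beyond the last page.
    public static bool TryGetPageText(Func<int, string> getTextByPage, int pageNumber, out string text)
    {
        try
        {
            text = getTextByPage(pageNumber);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            // Out-of-range page: report failure instead of propagating the exception.
            text = string.Empty;
            return false;
        }
    }
}
```

With a TextContent instance named `content`, the call would look like `PageTextHelpers.TryGetPageText(content.GetTextByPage, 3, out var text)`. Only ArgumentOutOfRangeException is caught, so any other failure still surfaces to the caller.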

    GetTextByPageRange(Int32, Int32)

    Gets the raw text for a page range

    Returns the text content of all pages in the specified range, concatenated together.

    Declaration
    public string GetTextByPageRange(int startPage, int endPage)
    Parameters
    Type          Name       Description
    System.Int32  startPage  Starting page number (1-based, inclusive)
    System.Int32  endPage    Ending page number (1-based, inclusive)

    Returns
    Type           Description
    System.String  Text content of the specified page range

    Exceptions
    Type Condition
    System.ArgumentOutOfRangeException

    Thrown when:

    • startPage is less than 1
    • endPage is less than 1
    • endPage is less than startPage
    • Either startPage or endPage exceeds the total number of pages in the document
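The four conditions above can be checked before calling the method, for example to show a validation message instead of catching the exception. The sketch below restates them as a single predicate. `PageRangeCheck.IsValidRange` is a hypothetical helper, not an IronPDF API, and `totalPages` is an input the caller must supply; using `PageTexts.Count` for it assumes the dictionary holds one entry per page.

```csharp
// Hypothetical helper, not part of IronPDF.
public static class PageRangeCheck
{
    // Mirrors the documented exception conditions for GetTextByPageRange.
    // Returns true only when none of them would be triggered.
    public static bool IsValidRange(int startPage, int endPage, int totalPages)
    {
        if (startPage < 1) return false;          // startPage is less than 1
        if (endPage < 1) return false;            // endPage is less than 1
        if (endPage < startPage) return false;    // endPage is less than startPage
        if (startPage > totalPages || endPage > totalPages)
            return false;                         // either bound exceeds the page count
        return true;
    }
}
```

Both bounds are inclusive and 1-based, so a single-page range such as `IsValidRange(2, 2, 3)` is valid, while `IsValidRange(2, 4, 3)` is not.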