    Namespace IronPdf.Extractions

    Classes

    CsvExportOptions

    Configuration options for CSV exports

    Provides control over delimiters, quoting, and cell formatting specific to CSV format.

    ------------------------------------------------

    Usage:

    var options = new CsvExportOptions
    {
        CsvDelimiter = ";",
        CsvQuoteStrings = true,
        CsvNewlineReplacement = " ",
        IncludeHeaders = true
    };

    ------------------------------------------------
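
    The delimiter, quoting, and newline-replacement options above follow common CSV conventions. The sketch below is a standalone illustration of what such settings usually do to a row; it does not call IronPDF, and the CsvRowSketch class and its methods are hypothetical names introduced only for this example. Whether the library quotes or escapes cells in exactly this way is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of IronPdf: shows the conventional effect of
// a delimiter, string quoting, and newline replacement on one CSV row.
public static class CsvRowSketch
{
    public static string FormatCell(string value, string delimiter, bool quoteStrings, string newlineReplacement)
    {
        // Collapse embedded line breaks so each table row stays on one CSV line.
        string text = (value ?? string.Empty)
            .Replace("\r\n", newlineReplacement)
            .Replace("\n", newlineReplacement)
            .Replace("\r", newlineReplacement);

        // Quote when asked to, or when the text would otherwise break the row.
        bool mustQuote = quoteStrings || text.Contains(delimiter) || text.Contains("\"");

        // Embedded quotes are doubled, the usual CSV escaping rule.
        return mustQuote ? "\"" + text.Replace("\"", "\"\"") + "\"" : text;
    }

    public static string FormatRow(IEnumerable<string> cells, string delimiter, bool quoteStrings, string newlineReplacement)
    {
        return string.Join(delimiter, cells.Select(c => FormatCell(c, delimiter, quoteStrings, newlineReplacement)));
    }
}
```

    With the values from the usage block (";" as delimiter, quoting enabled, " " as newline replacement), a cell containing a line break is flattened to a single line and every cell is wrapped in double quotes.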

    DocumentMetadata

    Document level metadata

    Contains information about the entire document, such as total pages and table counts.

    Also includes per-page metadata.

    ExportConfiguration

    Configuration for batch export operations

    Controls how tables and text are exported from a complete extraction result.

    Provides options for file organization and table option selection.

    ------------------------------------------------

    Usage:

    var config = new ExportConfiguration
    {
        ExportTables = true,
        ExportText = true,
        TableOptions = new CsvExportOptions(),
        SeparateFilePerTable = true,
        FileNamePattern = "table_{page}_{index}"
    };

    ExportManager.ExportResult(result, "output", config);

    ------------------------------------------------
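
    The FileNamePattern value above contains {page} and {index} placeholders. This page does not document how they are expanded, so the following is only a sketch of one plausible substitution scheme. FileNamePatternSketch and Expand are hypothetical names, and both the appended extension and the index base (zero or one) are assumptions.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of IronPdf: expands the two placeholders
// that appear in the FileNamePattern example.
public static class FileNamePatternSketch
{
    public static string Expand(string pattern, int page, int index, string extension)
    {
        // Invariant culture keeps digits stable regardless of the machine locale.
        string name = pattern
            .Replace("{page}", page.ToString(CultureInfo.InvariantCulture))
            .Replace("{index}", index.ToString(CultureInfo.InvariantCulture));

        // Assumption: the exporter appends an extension matching the format.
        return name + "." + extension;
    }
}
```

    Under these assumptions, Expand("table_{page}_{index}", 5, 0, "csv") produces "table_5_0.csv".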

    ExportFormat

    Supported export formats

    Defines the file formats available for exporting extracted data.

    ExportManager

    Provides methods to export tables and text to various formats.

    Acts as a factory for format-specific exporters.

    The export format is automatically determined by the type of ExportOptions provided.

    ------------------------------------------------

    Usage:

    // Export a single table with default options (uses format parameter)
    ExportManager.ExportTable(table, "output.csv", ExportFormat.Csv);

    // Export multiple tables with custom options (format inferred from options type)
    var csvOptions = new CsvExportOptions { CsvDelimiter = ";" };
    ExportManager.ExportTables(tables, "output.csv", csvOptions);

    // Export with custom HTML options
    var htmlOptions = new HtmlExportOptions { HtmlResponsive = true };
    ExportManager.ExportTable(table, "output.html", htmlOptions);

    // Export entire extraction result
    var config = new ExportConfiguration
    {
        ExportTables = true,
        ExportText = true,
        TableOptions = new JsonExportOptions(),
        SeparateFilePerTable = true
    };
    ExportManager.ExportResult(result, "output", config);

    ------------------------------------------------
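
    The statement that the export format is determined by the type of the options object describes a type-based dispatch. A minimal sketch of that pattern is shown below using stand-in types; none of these classes are the real IronPdf.Extractions types, and the behavior for an unrecognized options type (throwing) is an assumption, not documented library behavior.

```csharp
using System;

// Stand-in types for illustration only; the real IronPdf.Extractions
// classes are not reproduced here.
public class BaseOptionsSketch { }
public class CsvOptionsSketch : BaseOptionsSketch { }
public class HtmlOptionsSketch : BaseOptionsSketch { }
public class JsonOptionsSketch : BaseOptionsSketch { }

public enum FormatSketch { Csv, Html, Json }

public static class FormatInferenceSketch
{
    // Picks a format from the runtime type of the options object.
    public static FormatSketch Infer(BaseOptionsSketch options)
    {
        switch (options)
        {
            case CsvOptionsSketch _: return FormatSketch.Csv;
            case HtmlOptionsSketch _: return FormatSketch.Html;
            case JsonOptionsSketch _: return FormatSketch.Json;
            default:
                // Assumption: a null or plain base options object carries no format information.
                throw new ArgumentException(
                    "No format is associated with " + options?.GetType().Name, nameof(options));
        }
    }
}
```

    A factory built this way lets callers pass a single options argument instead of a separate format parameter, which matches the two call shapes in the usage block above.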

    ExportOptionsBase

    Base configuration options for exporting extracted data

    Contains common options applicable to all export formats.

    ------------------------------------------------

    Usage:

    // Use base options for generic export
    var options = new ExportOptionsBase
    {
        IncludeHeaders = true,
        SpanMode = SpanHandlingMode.Repeat
    };

    // Or use format-specific options
    var csvOptions = new CsvExportOptions { CsvDelimiter = ";", IncludeHeaders = true };

    ------------------------------------------------

    ExtractionProgress

    Information about the progress of an asynchronous extraction operation.

    Used to report progress to the caller during long-running extraction operations.

    HtmlExportOptions

    Configuration options for HTML exports

    Controls styling, responsiveness, and CSS class application for HTML table output.

    ------------------------------------------------

    Usage:

    var options = new HtmlExportOptions
    {
        HtmlIncludeStyles = true,
        HtmlResponsive = true,
        HtmlTableClass = "custom-table"
    };

    ------------------------------------------------

    JsonExportOptions

    Configuration options for JSON exports

    Currently inherits all options from ExportOptionsBase.

    PageMetadata

    Per-page metadata

    Contains information about a specific page, such as page number, table count, and word count.

    PageText

    Text content for a single page

    Contains text extracted from a single page of a PDF document.

    Includes both the raw text and positioned lines for layout reconstruction.

    PdfExtractionOptions

    Configuration options for PDF extraction behavior

    Provides control over how tables and text are extracted from PDF documents.

    Use this class to customize extraction parameters such as text mode, table detection strategy, and various tolerance values that affect extraction accuracy.

    ------------------------------------------------

    Usage:

    var options = new PdfExtractionOptions
    {
        TextMode = TextExtractionMode.Stream,
        TableStrategy = TableDetectionStrategy.Hybrid,
        EnableTableExtraction = true,
        EnableTextExtraction = true,
        CellMergeThreshold = 2.0,
        ColumnDetectionSensitivity = 15.0
    };
    var result = PdfExtractor.Extract("document.pdf", options);

    ------------------------------------------------

    PdfExtractionResult

    Represents the output of a PDF extraction operation, containing all extracted tables and text content, along with document metadata.

    Provides convenient methods to access specific portions of the extracted content.

    ------------------------------------------------

    Usage:

    var result = PdfExtractor.Extract("document.pdf");

    // Access all tables
    foreach (var table in result.Tables)
    {
        Console.WriteLine($"Table on page {table.PageNumber} with {table.RowCount} rows");
    }

    // Get tables from a specific page
    var pageTables = result.GetTablesByPage(5);

    // Get text from a specific page
    var pageText = result.GetRawTextByPage(5);

    // Get full text including tables
    var fullText = result.FullText;

    ------------------------------------------------

    PdfExtractor

    Provides methods to extract tables and text from PDF documents with various options.

    Supports both synchronous and asynchronous extraction operations.

    ------------------------------------------------

    Usage:

    // Extract entire document
    var result = PdfExtractor.Extract("document.pdf");

    // Extract with custom options
    var options = new PdfExtractionOptions
    {
        TableStrategy = TableDetectionStrategy.Hybrid,
        EnableTextExtraction = false
    };
    var tablesOnly = PdfExtractor.Extract("document.pdf", options);

    // Extract specific page
    var pageResult = PdfExtractor.ExtractPage("document.pdf", 5);

    // Extract specific table
    var table = PdfExtractor.ExtractTable("document.pdf", 5, 0);

    ------------------------------------------------

    SpanHandlingMode

    Specifies how cells with rowspan/colspan are handled in exports

    Controls how merged cells are represented in different export formats.
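
    To make the idea of span handling concrete, the sketch below flattens one merged cell into a flat row in two ways. It is a standalone illustration, not IronPDF code. The mode names Repeat and Empty appear elsewhere on this page, but their behavior here is inferred from the names alone and is an assumption. The Merge and Annotate values are left out because this page does not describe their output. SpanSketchMode and SpanFlattenSketch are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Stand-in enum covering only the two modes whose meaning is guessed from their names.
public enum SpanSketchMode { Repeat, Empty }

public static class SpanFlattenSketch
{
    // Expands one cell that covers colSpan columns into colSpan flat values.
    public static List<string> Flatten(string text, int colSpan, SpanSketchMode mode)
    {
        if (colSpan < 1) throw new ArgumentOutOfRangeException(nameof(colSpan));

        // The first column always carries the cell text.
        var values = new List<string> { text };

        // Remaining columns either repeat the text or stay blank.
        for (int i = 1; i < colSpan; i++)
        {
            values.Add(mode == SpanSketchMode.Repeat ? text : string.Empty);
        }
        return values;
    }
}
```

    Under this reading, a cell "Total" spanning three columns becomes three "Total" values with Repeat, and "Total" followed by two empty values with Empty. Formats without native merged cells, such as CSV, need some rule of this kind to keep columns aligned.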

    TableCell

    Represents a table cell with span support

    Contains the content and metadata for a cell in a table.

    Supports merged cells (spans) across rows and columns.

    TableDetectionStrategy

    Table detection strategies

    Determines which algorithm(s) to use for detecting tables in PDF documents.

    TableObject

    Represents an extracted table with structural information

    Contains the data and metadata for a table extracted from a PDF document.

    Provides convenient methods to access table data and structure.

    TableRow

    Represents a table row

    Contains a collection of cells that make up a row in a table.

    TextContent

    Extracted text content outside of tables

    Provides methods to access text content for the entire document or specific pages.

    TextExtractionMode

    Text extraction modes

    Determines how text is extracted from PDF documents.

    TxtExportOptions

    Configuration options for plain text exports

    Currently inherits all options from ExportOptionsBase.

    The TXT exporter does not support different span-handling strategies. Regardless of the value provided in SpanMode, the TXT export always behaves as though SpanHandlingMode.Merge is used.

    Other values (Repeat, Empty, Annotate) have no effect on TXT output.

    XmlExportOptions

    Configuration options for XML exports

    Controls XML schema inclusion and formatting options.

    ------------------------------------------------

    Usage:

    var options = new XmlExportOptions
    {
        XmlIncludeSchema = true,
        XmlPrettyPrint = true
    };

    ------------------------------------------------
