Invoice Processing in C#: Generate, Extract, and Automate PDF Invoices with .NET

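Invoice processing in C# .NET: IronPDF for .NET covers the entire document lifecycle, from generating professional PDF invoices from HTML templates and complying with the ZUGFeRD and Factur-X e-invoicing standards, to extracting structured data from incoming invoices with text parsing and AI-powered processing, to building batch automation pipelines that integrate with accounting systems such as QuickBooks, Xero, and SAP.

TL;DR: Quickstart Guide

This tutorial covers generating, extracting, and automating PDF invoices in C# .NET, including e-invoicing compliance, AI-powered parsing, and accounting system integration.

  • Who this is for: .NET developers building invoicing modules, accounts payable automation, or e-invoicing compliance.
  • What you'll build: HTML-template invoice generation with line items and tax calculation, QR codes for payment links, PDF/A-3 output that conforms to ZUGFeRD/Factur-X, text extraction with regex, AI-powered invoice parsing, and batch processing integrated with accounting systems.
  • Where it runs: .NET 10, .NET 8 LTS, .NET Framework 4.6.2+, and .NET Standard 2.0. No dependency on external services.
  • When to use this approach: when you need to generate invoice PDFs, meet EU e-invoicing requirements, or extract data from vendor invoices for accounts payable.
  • Why it matters technically: IronPDF renders HTML to PDF accurately, supports PDF/A-3 with embedded XML, and provides text extraction APIs that pair with regex or AI to turn unstructured invoices into structured data.

Generate your first PDF invoice in just a few lines of code:

Get started creating PDFs with NuGet now:

  1. Install IronPDF with the NuGet Package Manager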

    PM > Install-Package IronPdf

  2. Copy and run this code.

    var renderer = new IronPdf.ChromePdfRenderer();
    var pdf = renderer.RenderHtmlAsPdf("<h1>Invoice #1001</h1><p>Total: $500.00</p>");
    pdf.SaveAs("invoice.pdf");
  3. Deploy to test in your live environment


After purchasing IronPDF or signing up for a 30-day trial, add your license key at the start of your application.

IronPdf.License.LicenseKey = "KEY";

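What Is the Invoice Lifecycle, and Why Is PDF Still the Standard?

Before diving into code, it helps to understand how an invoice moves through a modern business system. The invoice lifecycle has five distinct stages: generation, distribution, receipt, data extraction, and accounting integration.

The process starts with generation. A business creates an invoice that includes line items, pricing, tax calculations, payment terms, and branding. The invoice must look professional and meet all legal requirements. Next comes distribution, where the invoice is sent to the customer by email, through a customer portal, or by traditional mail. Once the customer receives the document, their accounts payable team captures it and prepares it for processing. Data extraction pulls key information from the invoice, such as vendor details, line items, totals, and due dates, so it can be verified and matched against purchase orders. Finally, accounting integration moves that data into financial systems such as QuickBooks, Xero, or SAP for payment and record keeping.

Why has PDF remained the most widely used format after all these years? It comes down to a distinctive combination of strengths. A PDF keeps invoice formatting consistent regardless of device or operating system. Whether someone opens your invoice on Windows, on a Mac, or on a phone, it looks exactly as you designed it. PDFs are also hard to alter by accident, so they protect document integrity better than formats such as Word or Excel. You can add digital signatures for authenticity and encryption for security. Most importantly, PDF has become a universal standard that virtually every business system recognizes and supports.

There is, of course, a catch. PDFs are built for people to read, not for computers to process. Rather than storing information as structured data, a PDF records text, lines, shapes, and images according to where they appear on the page. That is why tools like IronPDF are so useful: they make it possible to turn human-friendly documents into data that software can use.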


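How to Generate Professional PDF Invoices in C#

Generating invoices programmatically means converting structured data (such as customer information, line items, and calculated totals) into a polished PDF document. IronPDF makes this conversion straightforward by building on HTML and CSS, technologies most developers already know.

In this tutorial, we'll walk through scenarios you are likely to meet in real-world projects. You can also download the projects shown below.

How to Build an Invoice HTML Template

The foundation of invoice generation with IronPDF is HTML. Instead of wrestling with low-level PDF drawing commands, you design the invoice with standard HTML and CSS and let IronPDF's Chrome-based rendering engine convert it to a pixel-perfect PDF.

Here is a basic invoice template that demonstrates the approach: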

using IronPdf;

// Define the HTML template for a basic invoice
// Uses inline CSS for styling headers, tables, and totals
string invoiceHtml = @"
<!DOCTYPE html>
<html>
<head>
<style>
    body { font-family: Arial, sans-serif; padding: 40px; }
    .header { text-align: right; margin-bottom: 40px; }
    .company-name { font-size: 24px; font-weight: bold; color: #333; }
    .invoice-title { font-size: 32px; margin: 20px 0; }
    .bill-to { margin: 20px 0; }
    table { width: 100%; border-collapse: collapse; margin: 20px 0; }
    th { background-color: #2A95D5; color: white; padding: 10px; text-align: left; }
    td { padding: 10px; border-bottom: 1px solid #ddd; }
    .total { text-align: right; font-size: 20px; font-weight: bold; margin-top: 20px; }
</style>
</head>
<body>
<div class='header'>
    <div class='company-name'>Your Company Name</div>
    <div>123 Business Street</div>
    <div>City, State 12345</div>
</div>

<div class='invoice-title'>INVOICE</div>

<div class='bill-to'>
    <strong>Bill To:</strong><br>
    Customer Name<br>
    456 Customer Avenue<br>
    City, State 67890
</div>

<table>
<tr>
    <th>Description</th>
    <th>Quantity</th>
    <th>Price</th>
    <th>Total</th>
</tr>
<tr>
    <td>Web Development Services</td>
    <td>10 hours</td>
    <td>$100.00</td>
    <td>$1,000.00</td>
</tr>
<tr>
    <td>Consulting</td>
    <td>5 hours</td>
    <td>$150.00</td>
    <td>$750.00</td>
</tr>
</table>

<div class='total'>Total: $1,750.00</div>
</body>
</html>";

// Initialize the Chrome-based PDF renderer
var renderer = new ChromePdfRenderer();

// Convert the HTML string to a PDF document
var pdf = renderer.RenderHtmlAsPdf(invoiceHtml);

// Save the generated PDF to disk
pdf.SaveAs("basic-invoice.pdf");

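Sample Output

This approach is extremely flexible. Any CSS that works in Chrome works in your PDF, including modern features such as flexbox, grid layout, and custom fonts. You can even use external stylesheets and images by referencing URLs or local file paths.

How to Add Dynamic Line Items and Calculate Totals

Real invoices rarely have static content. You need to populate line items from a database, calculate subtotals, apply tax rates, and format currency values. The following example demonstrates a production-ready pattern for dynamic invoice generation: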

using IronPdf;
using System;
using System.Collections.Generic;
using System.Linq;

// Represents a single line item on an invoice
public class InvoiceLineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }

    // Auto-calculates line total from quantity and unit price
    public decimal Total => Quantity * UnitPrice;
}

// Represents a complete invoice with customer details and line items
public class Invoice
{
    public string InvoiceNumber { get; set; }
    public DateTime InvoiceDate { get; set; }
    public string CustomerName { get; set; }
    public string CustomerAddress { get; set; }
    public List<InvoiceLineItem> LineItems { get; set; }

    // Computed properties for invoice totals
    public decimal Subtotal => LineItems.Sum(item => item.Total);
    public decimal TaxRate { get; set; } = 0.08m;  // Default 8% tax rate
    public decimal Tax => Subtotal * TaxRate;
    public decimal Total => Subtotal + Tax;
}

// Generates PDF invoices from Invoice objects using HTML templates
public class InvoiceGenerator
{
    public PdfDocument GenerateInvoice(Invoice invoice)
    {
        // Build HTML table rows dynamically from line items
        string lineItemsHtml = string.Join("", invoice.LineItems.Select(item => $@"
            <tr>
                <td>{item.Description}</td>
                <td>{item.Quantity}</td>
                <td>${item.UnitPrice:F2}</td>
                <td>${item.Total:F2}</td>
            </tr>
        "));

        // Build the complete HTML invoice using string interpolation
        // All invoice data is injected into the template dynamically
        string invoiceHtml = $@"
<!DOCTYPE html>
<html>
<head>
    <style>
        body {{ font-family: Arial, sans-serif; padding: 40px; }}
        .header {{ text-align: right; margin-bottom: 40px; }}
        .company-name {{ font-size: 24px; font-weight: bold; color: #333; }}
        .invoice-details {{ margin: 20px 0; }}
        table {{ width: 100%; border-collapse: collapse; margin: 20px 0; }}
        th {{ background-color: #2A95D5; color: white; padding: 10px; text-align: left; }}
        td {{ padding: 10px; border-bottom: 1px solid #ddd; }}
        .totals {{ text-align: right; margin-top: 20px; }}
        .totals div {{ margin: 5px 0; }}
        .grand-total {{ font-size: 20px; font-weight: bold; color: #2A95D5; }}
    </style>
</head>
<body>
    <div class='header'>
        <div class='company-name'>Your Company Name</div>
    </div>

    <h1>INVOICE</h1>

    <div class='invoice-details'>
        <strong>Invoice Number:</strong> {invoice.InvoiceNumber}<br>
        <strong>Date:</strong> {invoice.InvoiceDate:MMM dd, yyyy}<br>
        <strong>Bill To:</strong> {invoice.CustomerName}<br>
        {invoice.CustomerAddress}
    </div>

    <table>
        <tr>
            <th>Description</th>
            <th>Quantity</th>
            <th>Unit Price</th>
            <th>Total</th>
        </tr>
        {lineItemsHtml}
    </table>

    <div class='totals'>
        <div>Subtotal: ${invoice.Subtotal:F2}</div>
        <div>Tax ({invoice.TaxRate:P0}): ${invoice.Tax:F2}</div>
        <div class='grand-total'>Total: ${invoice.Total:F2}</div>
    </div>
</body>
</html>";

        // Render HTML to PDF and return the document
        var renderer = new ChromePdfRenderer();
        return renderer.RenderHtmlAsPdf(invoiceHtml);
    }
}

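Sample Output

The Invoice class encapsulates all invoice data and exposes computed properties for the subtotal, tax, and total. The generator uses string interpolation to turn that data into HTML, then renders it to PDF. This separation of concerns keeps the code maintainable and testable.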
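Because the template is assembled by string interpolation, free-text values such as a line item description go into the markup exactly as stored, and a computed tax can carry more than two decimal places. The sketch below shows one way to guard against both. It assumes two helpers, Html and RoundMoney, which are hypothetical names introduced here for illustration and are not part of IronPDF or of the example above.

```csharp
using System;
using System.Net;

// Hypothetical helper: encode free text before it is placed inside the HTML template,
// so characters such as & and < cannot break the markup.
static string Html(string value) => WebUtility.HtmlEncode(value ?? string.Empty);

// Hypothetical helper: round a monetary amount to whole cents.
// MidpointRounding.AwayFromZero is an assumption; use the rule your tax authority requires.
static decimal RoundMoney(decimal amount) =>
    Math.Round(amount, 2, MidpointRounding.AwayFromZero);

string description = "Design & <Build> package";

// 3 x 19.995 = 59.985, which rounds to 59.99
decimal lineTotal = RoundMoney(3m * 19.995m);

// 59.99 x 0.08 = 4.7992, which rounds to 4.80
decimal tax = RoundMoney(lineTotal * 0.08m);

// The description is encoded; the numeric cells use the already rounded values.
string row = $"<tr><td>{Html(description)}</td><td>{lineTotal:F2}</td><td>{tax:F2}</td></tr>";
Console.WriteLine(row);
```

In the generator shown earlier, the same idea would apply to item.Description, invoice.CustomerName, and invoice.CustomerAddress before they are interpolated.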

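How to Add Company Branding and Watermarks to Invoices

Professional invoices need branding elements such as a logo, and sometimes a watermark to show payment status. IronPDF supports both images embedded in the HTML and programmatic watermarks applied after rendering.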

using IronPdf;

var renderer = new ChromePdfRenderer();

// Invoice HTML template with company logo embedded via URL
// Logo can also be Base64-encoded or a local file path
string htmlWithLogo = @"
<!DOCTYPE html>
<html>
<head>
<style>
    body { font-family: Arial, sans-serif; padding: 40px; }
    .logo { width: 200px; margin-bottom: 20px; }
</style>
</head>
<body>
<div style='text-align: center;'>
    <img src='https://yourcompany.com/logo.png' alt='Company Logo' class='logo' />
</div>
<h1>INVOICE</h1>
<p><strong>Invoice Number:</strong> INV-2024-001</p>
<p><strong>Total:</strong> $1,250.00</p>
</body>
</html>";

// Render the HTML to PDF
var pdf = renderer.RenderHtmlAsPdf(htmlWithLogo);

// Apply a diagonal "UNPAID" watermark to mark invoice status
// 30% opacity keeps the content readable while the watermark is visible
pdf.ApplyWatermark("<h1 style='color: red;'>UNPAID</h1>",
    opacity: 30,
    rotation: 45,
    verticalAlignment: IronPdf.Editing.VerticalAlignment.Middle);

pdf.SaveAs("invoice-with-watermark.pdf");

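Sample Output

The ApplyWatermark method accepts HTML content, giving you full control over the watermark's appearance. You can adjust opacity, rotation, and positioning to get the effect you need. This is especially useful for marking invoices as "PAID", "DRAFT", or "CANCELLED" without regenerating the whole document.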
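Since the watermark is plain HTML, the status stamp can be produced by a small function instead of a hard-coded string. The following is a minimal sketch under stated assumptions: the function name BuildStatusWatermarkHtml, the status names, and the colours are all hypothetical choices made for illustration, not IronPDF constants.

```csharp
using System;
using System.Net;

// Hypothetical helper: maps a payment status name to the HTML used as a watermark.
// Status names and colours are illustrative assumptions.
static string BuildStatusWatermarkHtml(string status)
{
    (string label, string color) = status switch
    {
        "Paid" => ("PAID", "green"),
        "Draft" => ("DRAFT", "gray"),
        "Cancelled" => ("CANCELLED", "red"),
        _ => throw new ArgumentOutOfRangeException(nameof(status), status, "Unknown invoice status.")
    };

    // Encode the label so the helper stays safe if labels later come from configuration.
    return $"<h1 style='color: {color};'>{WebUtility.HtmlEncode(label)}</h1>";
}

Console.WriteLine(BuildStatusWatermarkHtml("Paid"));
Console.WriteLine(BuildStatusWatermarkHtml("Draft"));
```

The returned string would take the place of the UNPAID markup passed as the first argument to ApplyWatermark in the example above.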

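How to Embed QR Codes for Payment Links

Modern invoices often include a QR code that customers can scan to pay quickly. IronPDF focuses on PDF generation, but it works seamlessly with IronQR, which creates the code: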

using IronPdf;
using IronQr;
using IronSoftware.Drawing;

string invoiceNumber = "INV-2026-002";
decimal amount = 1500.00m;

// Create a payment URL with invoice details as query parameters
string paymentUrl = $"https://yourcompany.com/pay?invoice={invoiceNumber}&amount={amount}";

// Generate QR code from the payment URL using IronQR
QrCode qrCode = QrWriter.Write(paymentUrl);
AnyBitmap qrImage = qrCode.Save();
qrImage.SaveAs("payment-qr.png", AnyBitmap.ImageFormat.Png);

// Build invoice HTML with the QR code image embedded
// Customers can scan the QR to pay directly from their phone
string invoiceHtml = $@"
<!DOCTYPE html>
<html>
<head>
<style>
    body {{ font-family: Arial, sans-serif; padding: 40px; }}
    .payment-section {{ margin-top: 40px; text-align: center;
                       border-top: 2px solid #eee; padding-top: 20px; }}
    .qr-code {{ width: 150px; height: 150px; }}
</style>
</head>
<body>
<h1>INVOICE {invoiceNumber}</h1>
<p><strong>Amount Due:</strong> ${amount:F2}</p>

<div class='payment-section'>
    <p><strong>Scan to Pay Instantly:</strong></p>
    <img src='payment-qr.png' alt='Payment QR Code' class='qr-code' />
    <p style='font-size: 12px; color: #666;'>
        Or visit: {paymentUrl}
    </p>
</div>
</body>
</html>";

// Convert HTML to PDF and save
var renderer = new ChromePdfRenderer();
var pdf = renderer.RenderHtmlAsPdf(invoiceHtml);
pdf.SaveAs($"invoice-{invoiceNumber}.pdf");

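Sample Output

The QR code links directly to your payment page, reducing friction for the customer and speeding up cash flow. The pattern works with any payment provider that supports URL-based payment initiation.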
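Two details of the example above are worth hardening in practice: the query string is built from raw values, and the image is referenced by a relative file name that the renderer may not be able to resolve. The sketch below illustrates one possible approach using only the .NET base library. BuildPaymentUrl and ToPngDataUri are hypothetical helper names, and the byte array is a stand-in for the real QR image bytes.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: builds the payment link with an escaped invoice number
// and an amount formatted independently of the server's regional settings.
static string BuildPaymentUrl(string baseUrl, string invoiceNumber, decimal amount)
{
    string amountText = amount.ToString("0.00", CultureInfo.InvariantCulture);
    return $"{baseUrl}?invoice={Uri.EscapeDataString(invoiceNumber)}&amount={amountText}";
}

// Hypothetical helper: turns PNG bytes into a data URI usable directly in an <img> src,
// which removes the dependency on a file path at render time.
static string ToPngDataUri(byte[] pngBytes) =>
    "data:image/png;base64," + Convert.ToBase64String(pngBytes);

string url = BuildPaymentUrl("https://yourcompany.com/pay", "INV 2026/002", 1500m);
Console.WriteLine(url);

// Stand-in bytes for illustration; in the example above they would be read
// from the saved QR image, for instance with File.ReadAllBytes("payment-qr.png").
byte[] sampleBytes = { 0x89, 0x50, 0x4E, 0x47 };
string imgTag = $"<img src='{ToPngDataUri(sampleBytes)}' alt='Payment QR Code' class='qr-code' />";
Console.WriteLine(imgTag);
```

Embedding the image as a data URI keeps the HTML self-contained, at the cost of a larger template string.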


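How to Comply with the ZUGFeRD and Factur-X E-Invoicing Standards in C#

Across Europe, e-invoicing is rapidly becoming mandatory. Germany led the way with ZUGFeRD, and France followed with Factur-X. These standards embed machine-readable XML data inside the PDF invoice, enabling automated processing while preserving a human-readable document. For businesses operating in European markets, understanding and implementing these standards is increasingly important.

What Is ZUGFeRD and How Does It Work?

ZUGFeRD (Zentraler User Guide des Forums elektronische Rechnung Deutschland) is a German e-invoicing standard that embeds invoice data as an XML file attachment in a PDF/A-3 compliant document. The embedded XML allows data to be extracted automatically, with no OCR or text parsing required.

The standard defines three conformance levels, each providing progressively richer structured data:

  • Basic: contains the core invoice data, suitable for simple automated processing
  • Comfort: adds detail that enables fully automated invoice processing
  • Extended: includes comprehensive data for complex business scenarios across industries

The XML follows the UN/CEFACT Cross-Industry Invoice (CII) schema, which has become the foundation of e-invoicing standardization in Europe.

What Is Factur-X and How Does It Differ from ZUGFeRD?

Factur-X is the French implementation of the same underlying standard. ZUGFeRD 2.0 and Factur-X are technically identical: they share the same XML schema and the same conformance profiles, based on the European norm EN 16931. The difference is purely regional naming. An invoice created to the ZUGFeRD specification is valid under Factur-X, and vice versa.

How to Embed XML Data in a PDF/A-3 Invoice

IronPDF provides the attachment capability needed to create compliant e-invoices. The process involves generating the invoice as a PDF, creating the XML data according to the CII schema, and embedding the XML as an attachment with the correct naming convention: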

using IronPdf;
using System;
using System.Xml.Linq;

// Generates ZUGFeRD-compliant invoices by embedding structured XML data
// ZUGFeRD allows automated processing while keeping a human-readable PDF
public class ZUGFeRDInvoiceGenerator
{
    public void GenerateZUGFeRDInvoice(Invoice invoice)
    {
        // First, create the visual PDF that humans will read
        var renderer = new ChromePdfRenderer();
        string invoiceHtml = BuildInvoiceHtml(invoice);
        var pdf = renderer.RenderHtmlAsPdf(invoiceHtml);

        // Define the UN/CEFACT namespaces required by the ZUGFeRD standard
        // These are mandatory for compliance with European e-invoicing regulations
        XNamespace rsm = "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100";
        XNamespace ram = "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100";
        XNamespace udt = "urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100";

        // Build the ZUGFeRD XML structure following the Cross-Industry Invoice schema
        var zugferdXml = new XDocument(
            new XDeclaration("1.0", "UTF-8", null),
            new XElement(rsm + "CrossIndustryInvoice",
                new XAttribute(XNamespace.Xmlns + "rsm", rsm.NamespaceName),
                new XAttribute(XNamespace.Xmlns + "ram", ram.NamespaceName),
                new XAttribute(XNamespace.Xmlns + "udt", udt.NamespaceName),

                // Document context identifies which e-invoicing guideline is being followed
                new XElement(rsm + "ExchangedDocumentContext",
                    new XElement(ram + "GuidelineSpecifiedDocumentContextParameter",
                        new XElement(ram + "ID", "urn:cen.eu:en16931:2017")
                    )
                ),

                // Core document identification: invoice number, type, and date
                new XElement(rsm + "ExchangedDocument",
                    new XElement(ram + "ID", invoice.InvoiceNumber),
                    new XElement(ram + "TypeCode", "380"), // 380 = Commercial Invoice per UN/CEFACT
                    new XElement(ram + "IssueDateTime",
                        new XElement(udt + "DateTimeString",
                            new XAttribute("format", "102"),
                            invoice.InvoiceDate.ToString("yyyyMMdd")
                        )
                    )
                ),

                // A complete implementation would include additional sections:
                // - Seller information (ram:SellerTradeParty)
                // - Buyer information (ram:BuyerTradeParty)
                // - Line items (ram:IncludedSupplyChainTradeLineItem)
                // - Payment terms (ram:SpecifiedTradePaymentTerms)
                // - Tax summaries (ram:ApplicableTradeTax)

                // Financial summary with all monetary totals
                new XElement(rsm + "SupplyChainTradeTransaction",
                    new XElement(ram + "ApplicableHeaderTradeSettlement",
                        new XElement(ram + "InvoiceCurrencyCode", "EUR"),
                        new XElement(ram + "SpecifiedTradeSettlementHeaderMonetarySummation",
                            new XElement(ram + "TaxBasisTotalAmount", invoice.Subtotal),
                            new XElement(ram + "TaxTotalAmount",
                                new XAttribute("currencyID", "EUR"),
                                invoice.Tax),
                            new XElement(ram + "GrandTotalAmount", invoice.Total),
                            new XElement(ram + "DuePayableAmount", invoice.Total)
                        )
                    )
                )
            )
        );

        // Save the XML to a temp file for embedding
        string xmlPath = $"zugferd-{invoice.InvoiceNumber}.xml";
        zugferdXml.Save(xmlPath);

        // Attach the XML to the PDF - filename must follow ZUGFeRD conventions
        pdf.Attachments.AddFile(xmlPath, "zugferd-invoice.xml", "ZUGFeRD Invoice Data");

        // Final PDF contains both visual invoice and machine-readable XML
        pdf.SaveAs($"invoice-{invoice.InvoiceNumber}-zugferd.pdf");
    }

    // Generates simple HTML for the visual portion of the invoice
    private string BuildInvoiceHtml(Invoice invoice)
    {
        return $@"
<!DOCTYPE html>
<html>
<head>
    <style>
        body {{ font-family: Arial, sans-serif; padding: 40px; }}
        h1 {{ color: #333; }}
        .zugferd-notice {{ 
            margin-top: 30px; padding: 10px; 
            background: #f0f0f0; font-size: 11px; 
        }}
    </style>
</head>
<body>
    <h1>RECHNUNG / INVOICE</h1>
    <p><strong>Rechnungsnummer:</strong> {invoice.InvoiceNumber}</p>
    <p><strong>Datum:</strong> {invoice.InvoiceDate:dd.MM.yyyy}</p>
    <p><strong>Betrag:</strong> €{invoice.Total:F2}</p>

    <div class='zugferd-notice'>
        This invoice contains embedded ZUGFeRD data for automated processing.
    </div>
</body>
</html>";
    }
}

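Sample Output

The key aspects of compliance are using the correct XML namespaces, following the CII schema structure, and embedding the XML under the appropriate filename. The type code "380" identifies the document as a commercial invoice in the UN/CEFACT standard.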
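To see why the namespaces matter, it helps to look at the receiving side. The sketch below reads the invoice number, type code, and grand total back out of a CII document with System.Xml.Linq. The XML literal is a hand-written sample shaped like the structure built above, with made-up values; how the XML is obtained from the PDF attachment is outside the scope of this sketch.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// The same UN/CEFACT namespaces used when the XML was generated.
XNamespace rsm = "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100";
XNamespace ram = "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100";

// Hand-written sample payload for illustration only.
string xml = @"<?xml version='1.0' encoding='UTF-8'?>
<rsm:CrossIndustryInvoice
    xmlns:rsm='urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100'
    xmlns:ram='urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100'>
  <rsm:ExchangedDocument>
    <ram:ID>INV-2026-002</ram:ID>
    <ram:TypeCode>380</ram:TypeCode>
  </rsm:ExchangedDocument>
  <rsm:SupplyChainTradeTransaction>
    <ram:ApplicableHeaderTradeSettlement>
      <ram:SpecifiedTradeSettlementHeaderMonetarySummation>
        <ram:GrandTotalAmount>1620.00</ram:GrandTotalAmount>
      </ram:SpecifiedTradeSettlementHeaderMonetarySummation>
    </ram:ApplicableHeaderTradeSettlement>
  </rsm:SupplyChainTradeTransaction>
</rsm:CrossIndustryInvoice>";

var root = XDocument.Parse(xml).Root
    ?? throw new InvalidOperationException("The XML document has no root element.");

// Element lookups only match when the namespace is part of the name.
var header = root.Element(rsm + "ExchangedDocument");
var invoiceId = (string)header?.Element(ram + "ID");
var typeCode = (string)header?.Element(ram + "TypeCode");

// The explicit conversion to decimal parses the value using XML rules, not the current culture.
decimal grandTotal = (decimal)root.Descendants(ram + "GrandTotalAmount").First();

Console.WriteLine($"Invoice {invoiceId}, type {typeCode}");
Console.WriteLine(grandTotal);
```

A lookup such as root.Element("ExchangedDocument") without the namespace would return null, which is the most common reason a consumer fails to read an otherwise valid file.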

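How to Make Invoices Comply with EU Requirements

The EU is rolling out e-invoicing progressively across member states. Italy already requires e-invoicing for B2B transactions, France is phasing in its requirements through 2026, and Germany has announced mandatory B2B e-invoicing starting in 2025. Building ZUGFeRD/Factur-X support now prepares your systems for these regulatory requirements.

Here is a pattern for a compliance-aware invoice generator that can target different standards: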

using IronPdf;
using System;

// Enum representing supported European e-invoicing standards
public enum InvoiceStandard
{
    None,
    ZUGFeRD,    // German standard - uses CII XML format
    FacturX,    // French standard - technically identical to ZUGFeRD 2.0
    Peppol      // Pan-European standard - uses UBL XML format
}

// Factory class that generates invoices compliant with different e-invoicing standards
// Allows switching between standards without changing core invoice generation logic
public class CompliantInvoiceGenerator
{
    public PdfDocument GenerateCompliantInvoice(Invoice invoice, InvoiceStandard standard)
    {
        // Generate the base PDF from HTML
        var renderer = new ChromePdfRenderer();
        string html = BuildInvoiceHtml(invoice);
        var pdf = renderer.RenderHtmlAsPdf(html);

        // Attach the appropriate XML format based on target market/regulation
        switch (standard)
        {
            case InvoiceStandard.ZUGFeRD:
            case InvoiceStandard.FacturX:
                // Both use Cross-Industry Invoice format, just different filenames
                EmbedCIIXmlData(pdf, invoice, standard);
                break;
            case InvoiceStandard.Peppol:
                // Peppol uses Universal Business Language format
                EmbedUBLXmlData(pdf, invoice);
                break;
        }

        return pdf;
    }

    // Creates and embeds CII-format XML (used by ZUGFeRD and Factur-X)
    private void EmbedCIIXmlData(PdfDocument pdf, Invoice invoice, InvoiceStandard standard)
    {
        string xml = GenerateCIIXml(invoice);

        // Filename convention differs between German and French standards
        string filename = standard == InvoiceStandard.ZUGFeRD
            ? "zugferd-invoice.xml"
            : "factur-x.xml";

        System.IO.File.WriteAllText("temp-invoice.xml", xml);
        pdf.Attachments.AddFile("temp-invoice.xml", filename, $"{standard} Invoice Data");
    }

    // Creates and embeds UBL-format XML for Peppol network compliance
    private void EmbedUBLXmlData(PdfDocument pdf, Invoice invoice)
    {
        // UBL (Universal Business Language) is the Peppol standard format
        string xml = $@"<?xml version='1.0' encoding='UTF-8'?>
<Invoice xmlns='urn:oasis:names:specification:ubl:schema:xsd:Invoice-2'>
    <ID>{invoice.InvoiceNumber}</ID>
    <IssueDate>{invoice.InvoiceDate:yyyy-MM-dd}</IssueDate>
    <DocumentCurrencyCode>EUR</DocumentCurrencyCode>
    <LegalMonetaryTotal>
        <PayableAmount currencyID='EUR'>{invoice.Total}</PayableAmount>
    </LegalMonetaryTotal>
</Invoice>";

        System.IO.File.WriteAllText("peppol-invoice.xml", xml);
        pdf.Attachments.AddFile("peppol-invoice.xml", "invoice.xml", "Peppol UBL Invoice");
    }

    // Generates minimal CII XML structure for demonstration
    private string GenerateCIIXml(Invoice invoice)
    {
        return $@"<?xml version='1.0' encoding='UTF-8'?>
<rsm:CrossIndustryInvoice
    xmlns:rsm='urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100'
    xmlns:ram='urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100'>
    <rsm:ExchangedDocument>
        <ram:ID>{invoice.InvoiceNumber}</ram:ID>
        <ram:TypeCode>380</ram:TypeCode>
    </rsm:ExchangedDocument>
</rsm:CrossIndustryInvoice>";
    }

    private string BuildInvoiceHtml(Invoice invoice)
    {
        return $"<html><body><h1>Invoice {invoice.InvoiceNumber}</h1></body></html>";
    }
}
using IronPdf;
using System;

// Enum representing supported European e-invoicing standards
public enum InvoiceStandard
{
    None,
    ZUGFeRD,    // German standard - uses CII XML format
    FacturX,    // French standard - technically identical to ZUGFeRD 2.0
    Peppol      // Pan-European standard - uses UBL XML format
}

// Factory class that generates invoices compliant with different e-invoicing standards
// Allows switching between standards without changing core invoice generation logic
public class CompliantInvoiceGenerator
{
    public PdfDocument GenerateCompliantInvoice(Invoice invoice, InvoiceStandard standard)
    {
        // Generate the base PDF from HTML
        var renderer = new ChromePdfRenderer();
        string html = BuildInvoiceHtml(invoice);
        var pdf = renderer.RenderHtmlAsPdf(html);

        // Attach the appropriate XML format based on target market/regulation
        switch (standard)
        {
            case InvoiceStandard.ZUGFeRD:
            case InvoiceStandard.FacturX:
                // Both use Cross-Industry Invoice format, just different filenames
                EmbedCIIXmlData(pdf, invoice, standard);
                break;
            case InvoiceStandard.Peppol:
                // Peppol uses Universal Business Language format
                EmbedUBLXmlData(pdf, invoice);
                break;
        }

        return pdf;
    }

    // Creates and embeds CII-format XML (used by ZUGFeRD and Factur-X)
    private void EmbedCIIXmlData(PdfDocument pdf, Invoice invoice, InvoiceStandard standard)
    {
        string xml = GenerateCIIXml(invoice);

        // Filename convention differs between German and French standards
        string filename = standard == InvoiceStandard.ZUGFeRD
            ? "zugferd-invoice.xml"
            : "factur-x.xml";

        System.IO.File.WriteAllText("temp-invoice.xml", xml);
        pdf.Attachments.AddFile("temp-invoice.xml", filename, $"{standard} Invoice Data");
    }

    // Creates and embeds UBL-format XML for Peppol network compliance
    private void EmbedUBLXmlData(PdfDocument pdf, Invoice invoice)
    {
        // UBL (Universal Business Language) is the Peppol standard format
        string xml = $@"<?xml version='1.0' encoding='UTF-8'?>
<Invoice xmlns='urn:oasis:names:specification:ubl:schema:xsd:Invoice-2'>
    <ID>{invoice.InvoiceNumber}</ID>
    <IssueDate>{invoice.InvoiceDate:yyyy-MM-dd}</IssueDate>
    <DocumentCurrencyCode>EUR</DocumentCurrencyCode>
    <LegalMonetaryTotal>
        <PayableAmount currencyID='EUR'>{invoice.Total}</PayableAmount>
    </LegalMonetaryTotal>
</Invoice>";

        System.IO.File.WriteAllText("peppol-invoice.xml", xml);
        pdf.Attachments.AddFile("peppol-invoice.xml", "invoice.xml", "Peppol UBL Invoice");
    }

    // Generates minimal CII XML structure for demonstration
    private string GenerateCIIXml(Invoice invoice)
    {
        return $@"<?xml version='1.0' encoding='UTF-8'?>
<rsm:CrossIndustryInvoice
    xmlns:rsm='urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100'
    xmlns:ram='urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100'>
    <rsm:ExchangedDocument>
        <ram:ID>{invoice.InvoiceNumber}</ram:ID>
        <ram:TypeCode>380</ram:TypeCode>
    </rsm:ExchangedDocument>
</rsm:CrossIndustryInvoice>";
    }

    private string BuildInvoiceHtml(Invoice invoice)
    {
        return $"<html><body><h1>Invoice {invoice.InvoiceNumber}</h1></body></html>";
    }
}

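This architecture lets you add new standards as they emerge without restructuring the core invoice generation logic. The enum-based approach makes it easy for users or configuration to decide which compliance mode to use.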
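
As a rough illustration of configuration-driven selection, the sketch below maps a settings string to the enum. The helper name `InvoiceStandardSelector`, the `FromConfig` method, and the fallback choice are assumptions made for this example, not part of IronPDF. The enum is repeated here only so the snippet compiles on its own.

```csharp
using System;

// Standalone copy of the enum so this sketch compiles by itself;
// in a real project, reuse the InvoiceStandard enum from the generator.
public enum InvoiceStandard { ZUGFeRD, FacturX, Peppol }

// Hypothetical helper: resolves a configuration value to a compliance mode.
public static class InvoiceStandardSelector
{
    public static InvoiceStandard FromConfig(
        string value,
        InvoiceStandard fallback = InvoiceStandard.FacturX)
    {
        if (string.IsNullOrWhiteSpace(value))
            return fallback;

        // Accept spellings such as "Factur-X" or "factur_x"
        string normalized = value.Replace("-", "").Replace("_", "").Trim();

        // IsDefined rejects numeric strings that do not map to a named member
        bool parsedOk = Enum.TryParse(normalized, ignoreCase: true, out InvoiceStandard parsed)
                        && Enum.IsDefined(typeof(InvoiceStandard), parsed);

        return parsedOk ? parsed : fallback;
    }
}
```

A call such as `InvoiceStandardSelector.FromConfig(settingValue)` could then feed the `switch` shown in the generator, so that changing the target market becomes a configuration change rather than a code change.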


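How to Extract Data from PDF Invoices in C#

Generating invoices is only half the job. Most businesses also receive invoices from suppliers and need to extract their data for processing. IronPDF provides text extraction that serves as the foundation for invoice data capture.

How to Extract Text from a PDF Invoice

The most basic extraction operation retrieves all text content from a PDF. IronPDF's ExtractAllText method handles the complexity of PDF text encoding and positioning: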

using IronPdf;
using System;

// Extracts raw text content from PDF invoices for further processing
public class InvoiceTextExtractor
{
    // Extracts all text from a PDF in one operation
    // Best for single-page invoices or when you need the complete content
    public string ExtractInvoiceText(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);

        // IronPDF handles the complexity of PDF text encoding and positioning
        string allText = pdf.ExtractAllText();
        Console.WriteLine("Full invoice text:");
        Console.WriteLine(allText);

        return allText;
    }

    // Extracts text page by page - useful for multi-page invoices
    // Allows you to process header info separately from line items
    public void ExtractTextByPage(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);

        // Iterate through each page (0-indexed)
        for (int i = 0; i < pdf.PageCount; i++)
        {
            string pageText = pdf.ExtractTextFromPage(i);
            Console.WriteLine($"\n--- Page {i + 1} ---");
            Console.WriteLine(pageText);
        }
    }
}

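Page-by-page extraction is particularly useful for multi-page invoices where you need to locate specific sections, for example line items that span several pages while the header information appears only on the first page.

How to Extract Table Data for Line Items

Invoice line items usually appear in tabular form. PDFs lack a native table structure, but you can extract the text and parse it to reconstruct the table data: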

using IronPdf;
using System;
using System.Collections.Generic;

// Data model for a single invoice line item
public class InvoiceLineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total { get; set; }
}

// Extracts tabular line item data from PDF invoices
// Note: PDFs don't have native table structure, so this uses text parsing
public class InvoiceTableExtractor
{
    public List<InvoiceLineItem> ExtractLineItems(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);
        string text = pdf.ExtractAllText();

        var lineItems = new List<InvoiceLineItem>();
        string[] lines = text.Split('\n');

        foreach (string line in lines)
        {
            // Currency symbols indicate potential line items with amounts
            if (line.Contains("$") || line.Contains("€"))
            {
                Console.WriteLine($"Potential line item: {line.Trim()}");

                // Split on whitespace to separate columns
                // Actual parsing logic depends on your invoice format
                string[] parts = line.Split(new[] { '\t', ' ' },
                    StringSplitOptions.RemoveEmptyEntries);

                // Try to find numeric values that could be amounts
                foreach (string part in parts)
                {
                    string cleaned = part.Replace("$", "").Replace("€", "").Replace(",", "");
                    if (decimal.TryParse(cleaned, out decimal amount))
                    {
                        Console.WriteLine($"  Found amount: {amount:C}");
                    }
                }
            }
        }

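        // This demonstration only logs candidate rows; populate lineItems here
        // once you have parsing rules for your specific invoice layout
        return lineItems;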
    }
}

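The parsing logic will vary with your invoice format. For invoices from known suppliers with a consistent layout, you can build format-specific parsers. For varied formats, consider the AI-based extraction covered later in this article.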
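
To make the idea of a format-specific parser concrete, here is a minimal sketch for one assumed layout in which each row reads "description, quantity, unit price, line total". The class names `ParsedLineItem` and `FixedLayoutLineParser` and the column order are hypothetical; a real supplier layout will need its own pattern.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical result type for this sketch
public class ParsedLineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total { get; set; }
}

// Parses rows of one assumed layout: <description> <qty> <unit price> <line total>
public static class FixedLayoutLineParser
{
    private static readonly Regex RowPattern = new Regex(
        @"^\s*(?<desc>.+?)\s+(?<qty>\d+(?:\.\d+)?)\s+[\$€]?(?<price>[\d,]+\.\d{2})\s+[\$€]?(?<total>[\d,]+\.\d{2})\s*$",
        RegexOptions.Compiled);

    // Returns false for rows that do not fit the layout (headers, subtotals, notes)
    public static bool TryParse(string line, out ParsedLineItem item)
    {
        item = null;
        if (string.IsNullOrWhiteSpace(line))
            return false;

        Match m = RowPattern.Match(line);
        if (!m.Success)
            return false;

        item = new ParsedLineItem
        {
            Description = m.Groups["desc"].Value.Trim(),
            Quantity = ParseNumber(m.Groups["qty"].Value),
            UnitPrice = ParseNumber(m.Groups["price"].Value),
            Total = ParseNumber(m.Groups["total"].Value)
        };
        return true;
    }

    // Strips thousands separators and parses with "." as the decimal point
    private static decimal ParseNumber(string s) =>
        decimal.Parse(s.Replace(",", ""), CultureInfo.InvariantCulture);
}
```

Each line produced by splitting the extracted text can be passed to `TryParse`, and only rows that match are added to the result list. Rows such as "Subtotal: $30.00" are skipped because they lack the quantity and second amount.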

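How to Use Pattern Matching for Invoice Numbers, Dates, and Totals

Regular expressions are useful for extracting specific data points from invoice text. Key fields such as invoice numbers, dates, and totals usually follow recognizable patterns: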

using IronPdf;
using System;
using System.Text.RegularExpressions;

// Data model for extracted invoice information
public class InvoiceData
{
    public string InvoiceNumber { get; set; }
    public string InvoiceDate { get; set; }
    public decimal TotalAmount { get; set; }
    public string VendorName { get; set; }
}

// Extracts key invoice fields using regex pattern matching
// Multiple patterns handle variations across different vendors
public class InvoiceParser
{
    public InvoiceData ParseInvoice(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);
        string text = pdf.ExtractAllText();

        var invoiceData = new InvoiceData();

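        // Try multiple patterns to find invoice number, most specific first
        // Handles: "Invoice Number: 123", "Invoice #123", "INV-123", German "Rechnungsnummer"
        // The capture must contain a digit so that, with IgnoreCase, label words
        // such as "Number" or "Date" are not captured as the invoice number
        string[] invoiceNumberPatterns = new[]
        {
            @"Invoice\s+Number\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
            @"Rechnungsnummer\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
            @"Invoice\s*#?\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
            @"INV[-\s]?(\d+)"
        };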

        foreach (string pattern in invoiceNumberPatterns)
        {
            var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
            if (match.Success)
            {
                invoiceData.InvoiceNumber = match.Groups[1].Value;
                Console.WriteLine($"Found Invoice Number: {invoiceData.InvoiceNumber}");
                break;
            }
        }

        // Date patterns for US, European, and written formats
        string[] datePatterns = new[]
        {
            @"Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
            @"Invoice\s+Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
            @"(\d{1,2}\.\d{1,2}\.\d{4})",  // European: DD.MM.YYYY
            @"(\w+\s+\d{1,2},?\s+\d{4})"   // Written: January 15, 2024
        };

        foreach (string pattern in datePatterns)
        {
            var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
            if (match.Success)
            {
                invoiceData.InvoiceDate = match.Groups[1].Value;
                Console.WriteLine($"Found Date: {invoiceData.InvoiceDate}");
                break;
            }
        }

        // Look for total amount with various labels, most specific first
        // \b keeps the plain "Total" pattern from matching inside "Subtotal"
        string[] totalPatterns = new[]
        {
            @"Grand\s+Total\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})",
            @"Amount\s+Due\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})",
            @"Balance\s+Due\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})",
            @"\bTotal\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})"
        };

        foreach (string pattern in totalPatterns)
        {
            var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
            if (match.Success)
            {
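                // Remove thousands separators, then parse with the invariant culture
                // so "." is read as the decimal point regardless of machine locale
                string amountStr = match.Groups[1].Value.Replace(",", "");
                if (decimal.TryParse(amountStr, System.Globalization.NumberStyles.Number,
                        System.Globalization.CultureInfo.InvariantCulture, out decimal amount))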
                {
                    invoiceData.TotalAmount = amount;
                    Console.WriteLine($"Found Total: ${invoiceData.TotalAmount:F2}");
                    break;
                }
            }
        }

        return invoiceData;
    }
}

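This pattern-based approach works well for invoices with predictable formats. The multiple pattern variants handle common formatting differences between suppliers, such as "Invoice #" versus "Invoice Number:".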
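
Because the patterns above capture dates and amounts as raw strings in several formats, a normalization step is often useful before the values reach an accounting system. The sketch below is one possible approach. The class name `InvoiceFieldNormalizer` and the list of accepted formats are assumptions, and slash-separated dates are read month-first (US style), which will be wrong for suppliers that write day-first.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that converts captured strings into canonical values
public static class InvoiceFieldNormalizer
{
    // Assumed set of formats; extend it to match the suppliers you actually receive from
    private static readonly string[] DateFormats =
    {
        "M/d/yyyy", "M-d-yyyy",        // US style, month first (assumption)
        "d.M.yyyy", "dd.MM.yyyy",      // European dotted style, day first
        "MMMM d, yyyy", "MMMM d yyyy", // Written month name
        "yyyy-MM-dd"                   // ISO 8601
    };

    // Converts a captured date string to yyyy-MM-dd, or returns false if no format fits
    public static bool TryNormalizeDate(string raw, out string isoDate)
    {
        isoDate = null;
        if (string.IsNullOrWhiteSpace(raw))
            return false;

        if (DateTime.TryParseExact(raw.Trim(), DateFormats, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out DateTime parsed))
        {
            isoDate = parsed.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
            return true;
        }
        return false;
    }

    // Parses amounts written with "," as thousands separator and "." as decimal point
    public static bool TryParseAmount(string raw, out decimal amount)
    {
        amount = 0m;
        if (string.IsNullOrWhiteSpace(raw))
            return false;

        return decimal.TryParse(raw.Trim(),
            NumberStyles.AllowDecimalPoint | NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture, out amount);
    }
}
```

Using the invariant culture keeps the result independent of the locale of the machine that runs the extraction, which matters when the same code runs on servers configured for different regions.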

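What About Scanned or Image-Based Invoices?

The text extraction methods above work for PDFs that contain embedded text. Scanned documents and image-based PDFs, however, have no extractable text. They are essentially pictures of an invoice.

Note: To process scanned invoices, you need OCR (optical character recognition). IronOCR, part of the Iron Suite, integrates with IronPDF for these scenarios. Visit https://ironsoftware.com/csharp/ocr/ to learn more about extracting text from scanned documents and images.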
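
One simple way to decide whether a file needs the OCR route is to look at how much text the normal extraction returned. The sketch below is a heuristic only. The class name `ScannedPdfDetector` and the threshold of 20 visible characters per page are assumptions that you would tune against your own documents.

```csharp
using System;
using System.Linq;

// Hypothetical heuristic: very little extracted text suggests an image-only PDF
public static class ScannedPdfDetector
{
    public static bool LikelyNeedsOcr(string extractedText, int pageCount, int minCharsPerPage = 20)
    {
        if (pageCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageCount));

        // No text at all: almost certainly a scan or image-based PDF
        if (string.IsNullOrWhiteSpace(extractedText))
            return true;

        // Count only visible characters so that layout whitespace does not inflate the total
        int visibleChars = extractedText.Count(c => !char.IsWhiteSpace(c));

        return visibleChars < (long)minCharsPerPage * pageCount;
    }
}
```

In the extraction classes shown earlier, the inputs would be the result of `pdf.ExtractAllText()` and `pdf.PageCount`. Files flagged by the check can be routed to an OCR step, while the rest continue through the regular text parsing.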


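How to Process Invoices with AI in .NET

Traditional pattern matching works well for standardized invoices, but real-world accounts payable departments receive documents in countless formats. This is where AI-based extraction helps. Large language models can interpret invoice semantics and extract structured data even from unfamiliar layouts.

How to Integrate AI into Invoice Parsing

The AI-driven invoice processing pattern combines IronPDF's text extraction with LLM API calls. Below is a generic implementation that works with any OpenAI-compatible API: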

using IronPdf;
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Data model for extracted invoice information
// (same shape as InvoiceData in the pattern-matching example; declare it once per project)
public class InvoiceData
{
    public string InvoiceNumber { get; set; }
    public string InvoiceDate { get; set; }
    public string VendorName { get; set; }
    public decimal TotalAmount { get; set; }
}

// Leverages AI/LLM APIs to extract structured data from any invoice format
// Works with OpenAI or any compatible API endpoint
public class AIInvoiceParser
{
    private readonly HttpClient _httpClient;
    private readonly string _apiKey;
    private readonly string _apiUrl;

    public AIInvoiceParser(string apiKey, string apiUrl = "https://api.openai.com/v1/chat/completions")
    {
        _apiKey = apiKey;
        _apiUrl = apiUrl;
        _httpClient = new HttpClient();
        _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {_apiKey}");
    }

    public async Task<InvoiceData> ParseInvoiceWithAI(string pdfPath)
    {
        // First extract raw text from the PDF using IronPDF
        var pdf = PdfDocument.FromFile(pdfPath);
        string invoiceText = pdf.ExtractAllText();

        // Construct a prompt that instructs the AI to return structured JSON
        // Being explicit about the format reduces parsing errors
        string prompt = $@"Extract the following information from this invoice text.
Return ONLY valid JSON with no additional text or markdown formatting.

Required fields:
- InvoiceNumber: The invoice or document number
- InvoiceDate: The invoice date in YYYY-MM-DD format
- VendorName: The company or person who sent the invoice
- TotalAmount: The total amount due as a number (no currency symbols)

Invoice text:
{invoiceText}

JSON response:";

        // Build the API request with a system prompt for context
        var requestBody = new
        {
            model = "gpt-4",
            messages = new[]
            {
                new {
                    role = "system",
                    content = "You are an invoice data extraction assistant. Extract structured data from invoices and return valid JSON only."
                },
                new { role = "user", content = prompt }
            },
            temperature = 0.1  // Low temperature reduces run-to-run variation in the output
        };

        var json = JsonSerializer.Serialize(requestBody);
        var content = new StringContent(json, Encoding.UTF8, "application/json");

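        var response = await _httpClient.PostAsync(_apiUrl, content);
        // Fail fast on HTTP errors (invalid key, rate limit) instead of parsing an error body
        response.EnsureSuccessStatusCode();
        var responseJson = await response.Content.ReadAsStringAsync();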

        // Navigate the API response structure to get the extracted content
        using var doc = JsonDocument.Parse(responseJson);
        var messageContent = doc.RootElement
            .GetProperty("choices")[0]
            .GetProperty("message")
            .GetProperty("content")
            .GetString();

        Console.WriteLine("AI Extracted Data:");
        Console.WriteLine(messageContent);

        // Deserialize the AI's JSON response into our data class
        var invoiceData = JsonSerializer.Deserialize<InvoiceData>(messageContent,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

        return invoiceData;
    }
}

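A low temperature setting (0.1) encourages more deterministic output, which matters for data extraction tasks where you want the same input to give consistent results. It reduces variation but does not guarantee identical responses.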
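
Even with a strict prompt and a low temperature, a model may still wrap its JSON in a markdown fence or add a sentence around it, which would make direct deserialization fail. The sketch below shows one defensive step that could run before `JsonSerializer.Deserialize`. The class name `LlmJsonCleaner` is hypothetical, and the approach assumes the reply contains exactly one top-level JSON object.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: isolates the JSON object inside a model reply
public static class LlmJsonCleaner
{
    public static string ExtractJsonObject(string reply)
    {
        if (string.IsNullOrWhiteSpace(reply))
            throw new FormatException("The model reply was empty.");

        // Take everything from the first '{' to the last '}' and
        // discard surrounding prose or markdown fences
        int start = reply.IndexOf('{');
        int end = reply.LastIndexOf('}');
        if (start < 0 || end <= start)
            throw new FormatException("No JSON object found in the model reply.");

        string candidate = reply.Substring(start, end - start + 1);

        // Validate the candidate; JsonDocument.Parse throws JsonException on malformed JSON
        using (JsonDocument.Parse(candidate)) { }

        return candidate;
    }
}
```

The cleaned string can then be passed to the deserialization call in `ParseInvoiceWithAI`. Catching `FormatException` and `JsonException` at that point gives the caller a clear signal to retry or to send the invoice for manual review.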

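How to Extract Structured JSON Data from Invoices

For more complex invoices that include line items, supplier details, and customer information, you can request a richer JSON structure: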

using IronPdf;
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

// Comprehensive invoice data model with all details
public class DetailedInvoiceData
{
    public string InvoiceNumber { get; set; }
    public DateTime InvoiceDate { get; set; }
    public DateTime DueDate { get; set; }
    public VendorInfo Vendor { get; set; }
    public CustomerInfo Customer { get; set; }
    public List<LineItem> LineItems { get; set; }
    public decimal Subtotal { get; set; }
    public decimal Tax { get; set; }
    public decimal Total { get; set; }
}

public class VendorInfo
{
    public string Name { get; set; }
    public string Address { get; set; }
    public string TaxId { get; set; }
}

public class CustomerInfo
{
    public string Name { get; set; }
    public string Address { get; set; }
}

public class LineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total { get; set; }
}

// Extracts comprehensive invoice data including line items and party details
public class StructuredInvoiceExtractor
{
    private readonly AIInvoiceParser _aiParser;

    public StructuredInvoiceExtractor(string apiKey)
    {
        _aiParser = new AIInvoiceParser(apiKey);
    }

    public async Task<DetailedInvoiceData> ExtractDetailedData(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);
        string text = pdf.ExtractAllText();

        // Define the exact JSON structure we want the AI to return
        // This schema guides the AI to extract all relevant fields
        string jsonSchema = @"{
  ""InvoiceNumber"": ""string"",
  ""InvoiceDate"": ""YYYY-MM-DD"",
  ""DueDate"": ""YYYY-MM-DD"",
  ""Vendor"": {
    ""Name"": ""string"",
    ""Address"": ""string"",
    ""TaxId"": ""string or null""
  },
  ""Customer"": {
    ""Name"": ""string"",
    ""Address"": ""string""
  },
  ""LineItems"": [
    {
      ""Description"": ""string"",
      ""Quantity"": 0.0,
      ""UnitPrice"": 0.00,
      ""Total"": 0.00
    }
  ],
  ""Subtotal"": 0.00,
  ""Tax"": 0.00,
  ""Total"": 0.00
}";

        // Prompt includes both the schema and the extracted text
        string prompt = $@"Extract all invoice data and return it in this exact JSON structure:
{jsonSchema}

Invoice text:
{text}

Return only valid JSON, no markdown formatting or additional text.";

        // Call AI API and parse response (implementation as shown above)
        // Return deserialized DetailedInvoiceData

        return new DetailedInvoiceData(); // Placeholder
    }
}
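
Once a model returns JSON in the structure requested above, it is worth checking that the numbers agree with each other before trusting them, since a model can misread a digit. The sketch below deserializes a reduced version of that structure and checks the arithmetic. The types `SketchInvoice` and `SketchLineItem`, the class `InvoiceArithmeticCheck`, and the tolerance of 0.01 are assumptions for illustration. In a real project you would run the same checks on `DetailedInvoiceData`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Reduced, hypothetical stand-ins for the article's LineItem and DetailedInvoiceData
public class SketchLineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total { get; set; }
}

public class SketchInvoice
{
    public string InvoiceNumber { get; set; }
    public List<SketchLineItem> LineItems { get; set; } = new List<SketchLineItem>();
    public decimal Subtotal { get; set; }
    public decimal Tax { get; set; }
    public decimal Total { get; set; }
}

public static class InvoiceArithmeticCheck
{
    // Properties not declared on SketchInvoice (Vendor, Customer, dates) are ignored
    public static SketchInvoice Parse(string json) =>
        JsonSerializer.Deserialize<SketchInvoice>(json,
            new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

    // Returns a list of human-readable inconsistencies; an empty list means the totals agree
    public static List<string> FindProblems(SketchInvoice invoice, decimal tolerance = 0.01m)
    {
        var problems = new List<string>();
        var items = invoice.LineItems ?? new List<SketchLineItem>();

        foreach (var item in items)
        {
            if (Math.Abs(item.Quantity * item.UnitPrice - item.Total) > tolerance)
                problems.Add($"Line '{item.Description}': quantity x unit price differs from line total.");
        }

        if (Math.Abs(items.Sum(i => i.Total) - invoice.Subtotal) > tolerance)
            problems.Add("Sum of line totals differs from subtotal.");

        if (Math.Abs(invoice.Subtotal + invoice.Tax - invoice.Total) > tolerance)
            problems.Add("Subtotal plus tax differs from total.");

        return problems;
    }
}
```

An invoice that produces any problems can be routed to manual review instead of being posted automatically. Note that invoices with discounts or shipping charges would need additional fields before these checks apply.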

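How to Handle Inconsistent Invoice Formats

The real strength of AI extraction shows when you process invoices from many suppliers, each with its own format. A smart processor can try pattern-based extraction first (faster and free of API cost) and fall back to AI only when needed: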

using IronPdf;
using System;
using System.Threading.Tasks;

// Hybrid processor that optimizes for cost and capability
// Tries fast regex patterns first, uses AI only when patterns fail
public class SmartInvoiceProcessor
{
    private readonly AIInvoiceParser _aiParser;

    public SmartInvoiceProcessor(string aiApiKey)
    {
        _aiParser = new AIInvoiceParser(aiApiKey);
    }

    public async Task<InvoiceData> ProcessAnyInvoice(string pdfPath)
    {
        // First attempt: regex patterns (fast and free)
        // InvoiceParser.ParseInvoice, shown earlier, loads the PDF and extracts its text itself
        var patternParser = new InvoiceParser();
        var standardResult = patternParser.ParseInvoice(pdfPath);

        // If pattern matching found all required fields, use that result
        if (IsComplete(standardResult))
        {
            Console.WriteLine("Pattern extraction successful");
            return standardResult;
        }

        // Fallback: use AI for complex or unusual invoice formats
        // This costs money but handles any layout
        Console.WriteLine("Using AI extraction for complex invoice format");
        var aiResult = await _aiParser.ParseInvoiceWithAI(pdfPath);

        return aiResult;
    }

    // Validates that we have the minimum required fields
    private bool IsComplete(InvoiceData data)
    {
        return !string.IsNullOrEmpty(data.InvoiceNumber) &&
               !string.IsNullOrEmpty(data.InvoiceDate) &&
               data.TotalAmount > 0;
    }
}

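How to Build an Accounts Payable Automation Pipeline

Bringing all of these pieces together, here is a complete automation pipeline that processes incoming invoices, extracts the data, validates it, and prepares it for your accounting system: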

using IronPdf;
using System;
using System.IO;
using System.Threading.Tasks;
using System.Collections.Generic;
using System.Linq;

// Tracks the outcome of processing each invoice
public class ProcessingResult
{
    public string FileName { get; set; }
    public bool Success { get; set; }
    public string InvoiceNumber { get; set; }
    public string ErrorMessage { get; set; }
}

// Complete automation pipeline for accounts payable
// Watches a folder, extracts data, validates, and routes to accounting system
public class InvoiceAutomationPipeline
{
    private readonly SmartInvoiceProcessor _processor;
    private readonly string _inputFolder;
    private readonly string _processedFolder;
    private readonly string _errorFolder;

    public InvoiceAutomationPipeline(string apiKey, string inputFolder)
    {
        _processor = new SmartInvoiceProcessor(apiKey);
        _inputFolder = inputFolder;
        _processedFolder = Path.Combine(inputFolder, "processed");
        _errorFolder = Path.Combine(inputFolder, "errors");

        // Create output directories if they don't exist
        Directory.CreateDirectory(_processedFolder);
        Directory.CreateDirectory(_errorFolder);
    }

    // Main entry point - processes all PDFs in the input folder
    public async Task<List<ProcessingResult>> ProcessInvoiceBatch()
    {
        string[] invoiceFiles = Directory.GetFiles(_inputFolder, "*.pdf");
        Console.WriteLine($"Found {invoiceFiles.Length} invoices to process");

        var results = new List<ProcessingResult>();

        foreach (string invoicePath in invoiceFiles)
        {
            string fileName = Path.GetFileName(invoicePath);

            try
            {
                Console.WriteLine($"Processing: {fileName}");

                // Extract data using smart processor (patterns first, then AI)
                var invoiceData = await _processor.ProcessAnyInvoice(invoicePath);

                // Ensure we have minimum required fields before proceeding
                if (ValidateInvoiceData(invoiceData))
                {
                    // Send to accounting system (QuickBooks, Xero, etc.)
                    await SaveToAccountingSystem(invoiceData);

                    // Archive successful invoices
                    // (the File.Move overload with overwrite requires .NET Core 3.0 or later)
                    string destPath = Path.Combine(_processedFolder, fileName);
                    File.Move(invoicePath, destPath, overwrite: true);

                    results.Add(new ProcessingResult
                    {
                        FileName = fileName,
                        Success = true,
                        InvoiceNumber = invoiceData.InvoiceNumber
                    });

                    Console.WriteLine($"✓ Processed: {invoiceData.InvoiceNumber}");
                }
                else
                {
                    throw new Exception("Validation failed - missing required fields");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine($"✗ Failed: {fileName} - {ex.Message}");

                // Quarantine failed invoices for manual review
                string destPath = Path.Combine(_errorFolder, fileName);
                File.Move(invoicePath, destPath, overwrite: true);

                results.Add(new ProcessingResult
                {
                    FileName = fileName,
                    Success = false,
                    ErrorMessage = ex.Message
                });
            }
        }

        GenerateReport(results);
        return results;
    }

    // Checks for minimum required fields
    private bool ValidateInvoiceData(InvoiceData data)
    {
        return !string.IsNullOrEmpty(data.InvoiceNumber) &&
               !string.IsNullOrEmpty(data.VendorName) &&
               data.TotalAmount > 0;
    }

    // Placeholder for accounting system integration
    private async Task SaveToAccountingSystem(InvoiceData data)
    {
        // Integrate with your accounting system here
        // Examples: QuickBooks API, Xero API, SAP, or database storage
        Console.WriteLine($"  Saved invoice {data.InvoiceNumber} to accounting system");
        await Task.CompletedTask;
    }

    // Outputs a summary of the batch processing results
    private void GenerateReport(List<ProcessingResult> results)
    {
        int successful = results.Count(r => r.Success);
        int failed = results.Count(r => !r.Success);

        Console.WriteLine($"\n========== Processing Complete ==========");
        Console.WriteLine($"Total Processed: {results.Count}");
        Console.WriteLine($"Successful: {successful}");
        Console.WriteLine($"Failed: {failed}");

        if (failed > 0)
        {
            Console.WriteLine("\nFailed invoices requiring review:");
            foreach (var failure in results.Where(r => !r.Success))
            {
                Console.WriteLine($"  • {failure.FileName}: {failure.ErrorMessage}");
            }
        }
    }
}

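This pipeline implements a complete workflow: it scans a folder for incoming PDF files, processes each one, validates the extracted data, sends valid invoices to your accounting system, and quarantines failures for manual review. The summary report shows how each batch went.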
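
The pipeline above archives files with `File.Move(source, dest, overwrite: true)`. That three-argument overload is available on .NET Core 3.0 and later but not on .NET Framework, and it silently replaces an archived invoice when two files share a name. One way to sidestep both issues is to compute a non-colliding destination first. The following is a minimal sketch of that idea; `ArchivePathHelper`, `GetUniqueDestination`, and `MoveWithoutOverwrite` are hypothetical names introduced for illustration, not part of IronPDF or of the pipeline above.

```csharp
using System;
using System.IO;

// Hypothetical helper: picks an archive path that does not overwrite an existing file
public static class ArchivePathHelper
{
    // Returns "name.pdf" if it is free, otherwise "name_1.pdf", "name_2.pdf", ...
    public static string GetUniqueDestination(string folder, string fileName)
    {
        string candidate = Path.Combine(folder, fileName);
        if (!File.Exists(candidate))
            return candidate;

        string stem = Path.GetFileNameWithoutExtension(fileName);
        string extension = Path.GetExtension(fileName);

        // Try increasing numeric suffixes until an unused name is found
        for (int suffix = 1; ; suffix++)
        {
            candidate = Path.Combine(folder, $"{stem}_{suffix}{extension}");
            if (!File.Exists(candidate))
                return candidate;
        }
    }

    // Moves a file into the folder without overwriting; uses the two-argument File.Move
    public static string MoveWithoutOverwrite(string sourcePath, string destinationFolder)
    {
        Directory.CreateDirectory(destinationFolder);
        string destination = GetUniqueDestination(destinationFolder, Path.GetFileName(sourcePath));
        File.Move(sourcePath, destination);
        return destination;
    }
}
```

The existence check and the move are not atomic, so if several workers archive into the same folder at once, the move can still throw an `IOException`. In that case, catch the exception and retry with a fresh destination.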


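How to Integrate C# Invoice Processing with Accounting Systems

Extracted invoice data ultimately needs to flow into an accounting system for payment and record-keeping. The specifics vary by platform, but the integration pattern is consistent.

What Are the Common Integration Patterns for QuickBooks, Xero, and SAP?

Most accounting platforms expose REST APIs for creating bills or invoices programmatically. Here is a generic pattern you can adapt to your platform: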

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Generic integration layer for pushing invoice data to accounting systems
// Adapt the API calls based on your specific platform
public class AccountingSystemIntegration
{
    private readonly HttpClient _httpClient;
    private readonly string _apiKey;
    private readonly string _baseUrl;

    public AccountingSystemIntegration(string apiKey, string baseUrl)
    {
        _apiKey = apiKey;
        _baseUrl = baseUrl;
        _httpClient = new HttpClient();
        _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {_apiKey}");
    }

    // Creates a Bill in QuickBooks (vendor invoices are called "Bills")
    public async Task SendToQuickBooks(InvoiceData invoice)
    {
        // QuickBooks Bill structure - see their API docs for full schema
        var bill = new
        {
            VendorRef = new { name = invoice.VendorName },
            TxnDate = invoice.InvoiceDate,
            DocNumber = invoice.InvoiceNumber,
            TotalAmt = invoice.TotalAmount,
            Line = new[]
            {
                new
                {
                    Amount = invoice.TotalAmount,
                    DetailType = "AccountBasedExpenseLineDetail",
                    AccountBasedExpenseLineDetail = new
                    {
                        AccountRef = new { name = "Accounts Payable" }
                    }
                }
            }
        };

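        // NOTE: "{companyId}" is a literal placeholder, not an interpolated value.
        // Substitute your QuickBooks company ID before calling this endpoint.
        await PostToApi("/v3/company/{companyId}/bill", bill);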
    }

    // Creates an accounts payable invoice in Xero
    public async Task SendToXero(InvoiceData invoice)
    {
        // ACCPAY type indicates this is a bill to pay (not a sales invoice)
        var bill = new
        {
            Type = "ACCPAY",
            Contact = new { Name = invoice.VendorName },
            Date = invoice.InvoiceDate,
            InvoiceNumber = invoice.InvoiceNumber,
            Total = invoice.TotalAmount
        };

        await PostToApi("/api.xro/2.0/Invoices", bill);
    }

    // Generic POST helper with error handling
    private async Task PostToApi(string endpoint, object payload)
    {
        string json = JsonSerializer.Serialize(payload);
        var content = new StringContent(json, Encoding.UTF8, "application/json");

        var response = await _httpClient.PostAsync($"{_baseUrl}{endpoint}", content);

        if (!response.IsSuccessStatusCode)
        {
            string error = await response.Content.ReadAsStringAsync();
            throw new Exception($"API Error: {response.StatusCode} - {error}");
        }

        Console.WriteLine($"Successfully posted to {endpoint}");
    }
}

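Each platform has its own authentication mechanism (QuickBooks and Xero use OAuth; SAP supports several methods), required fields, and API conventions. Consult your target platform's documentation for the details. The pattern of converting extracted invoice data into an API payload stays the same.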
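
OAuth access tokens are typically short-lived, so setting the `Authorization` header once in a constructor, as `AccountingSystemIntegration` does, can leave a long-running process sending an expired token. A common alternative is to attach the current token to each request. The sketch below illustrates this under stated assumptions: `AccountingRequestFactory` and `CreateJsonPost` are hypothetical names, and obtaining or refreshing the token is left to whichever OAuth client you use.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

// Hypothetical helper: builds one authenticated JSON POST request per call
public static class AccountingRequestFactory
{
    public static HttpRequestMessage CreateJsonPost(
        string baseUrl, string endpoint, object payload, string accessToken)
    {
        if (string.IsNullOrWhiteSpace(accessToken))
            throw new ArgumentException("An access token is required.", nameof(accessToken));

        // Join base URL and endpoint with exactly one slash between them
        string url = $"{baseUrl.TrimEnd('/')}/{endpoint.TrimStart('/')}";

        var request = new HttpRequestMessage(HttpMethod.Post, url);

        // The token applies to this request only, so a refreshed token takes effect on the next call
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

        // Serialize the payload the same way PostToApi does
        string json = JsonSerializer.Serialize(payload);
        request.Content = new StringContent(json, Encoding.UTF8, "application/json");
        return request;
    }
}
```

The returned message can be sent with a shared `HttpClient` via `SendAsync`. Some platforms also require extra headers or query parameters beyond the bearer token, so check the platform's API reference before relying on this shape.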

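How to Batch Process Hundreds of Invoices

High-volume invoice processing requires careful attention to concurrency and resource management. Here is a pattern that uses parallel processing with controlled concurrency: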

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Tracks the result of processing a single invoice in a batch
public class BatchResult
{
    public string FilePath { get; set; }
    public bool Success { get; set; }
    public string InvoiceNumber { get; set; }
    public string Error { get; set; }
}

// High-volume invoice processor with controlled parallelism
// Prevents overwhelming APIs while maximizing throughput
public class BatchInvoiceProcessor
{
    private readonly SmartInvoiceProcessor _invoiceProcessor;
    private readonly AccountingSystemIntegration _accountingIntegration;
    private readonly int _maxConcurrency;

    public BatchInvoiceProcessor(string aiApiKey, string accountingApiKey,
        string accountingUrl, int maxConcurrency = 5)
    {
        _invoiceProcessor = new SmartInvoiceProcessor(aiApiKey);
        _accountingIntegration = new AccountingSystemIntegration(accountingApiKey, accountingUrl);
        _maxConcurrency = maxConcurrency;  // Adjust based on API rate limits
    }

    // Processes multiple invoices in parallel with controlled concurrency
    public async Task<List<BatchResult>> ProcessInvoiceBatch(List<string> invoicePaths)
    {
        // Thread-safe collection for gathering results from parallel tasks
        var results = new ConcurrentBag<BatchResult>();

        // Semaphore limits how many invoices process simultaneously
        var semaphore = new SemaphoreSlim(_maxConcurrency);

        // Create a task for each invoice
        var tasks = invoicePaths.Select(async path =>
        {
            // Wait for a slot to become available
            await semaphore.WaitAsync();
            try
            {
                var result = await ProcessSingleInvoice(path);
                results.Add(result);
            }
            finally
            {
                // Release slot for next invoice
                semaphore.Release();
            }
        });

        // Wait for all invoices to complete
        await Task.WhenAll(tasks);

        // Output summary statistics
        var resultList = results.ToList();
        int successful = resultList.Count(r => r.Success);
        int failed = resultList.Count(r => !r.Success);

        Console.WriteLine($"\nBatch Processing Complete:");
        Console.WriteLine($"  Total: {resultList.Count}");
        Console.WriteLine($"  Successful: {successful}");
        Console.WriteLine($"  Failed: {failed}");

        return resultList;
    }

    // Processes one invoice: extract data and send to accounting system
    private async Task<BatchResult> ProcessSingleInvoice(string pdfPath)
    {
        try
        {
            Console.WriteLine($"Processing: {pdfPath}");

            var invoiceData = await _invoiceProcessor.ProcessAnyInvoice(pdfPath);
            await _accountingIntegration.SendToQuickBooks(invoiceData);

            Console.WriteLine($"✓ Completed: {invoiceData.InvoiceNumber}");

            return new BatchResult
            {
                FilePath = pdfPath,
                Success = true,
                InvoiceNumber = invoiceData.InvoiceNumber
            };
        }
        catch (Exception ex)
        {
            Console.WriteLine($"✗ Failed: {pdfPath} - {ex.Message}");

            return new BatchResult
            {
                FilePath = pdfPath,
                Success = false,
                Error = ex.Message
            };
        }
    }
}

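SemaphoreSlim ensures that you don't flood external APIs or exhaust system resources. Adjust _maxConcurrency to match your API rate limits and server capacity. ConcurrentBag safely collects results from the parallel operations.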
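
On .NET 6 and later, `Parallel.ForEachAsync` provides the same bounded concurrency without a hand-managed semaphore. The sketch below is an alternative to the `BatchInvoiceProcessor` shown above, not a description of it. `ThrottledRunner` and `RunAsync` are hypothetical names, and the code will not compile on .NET Framework or .NET Standard 2.0.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs async work over a set of items with a concurrency cap (.NET 6+)
public static class ThrottledRunner
{
    public static async Task<List<TResult>> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        int maxConcurrency,
        Func<TItem, CancellationToken, Task<TResult>> work,
        CancellationToken cancellationToken = default)
    {
        // Thread-safe collection for results produced by concurrent workers
        var results = new ConcurrentBag<TResult>();

        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxConcurrency,
            CancellationToken = cancellationToken
        };

        // At most maxConcurrency invocations of 'work' are in flight at any time
        await Parallel.ForEachAsync(items, options, async (item, token) =>
        {
            results.Add(await work(item, token));
        });

        // ConcurrentBag does not preserve input order
        return results.ToList();
    }
}
```

An exception thrown by `work` stops the remaining items and surfaces from `RunAsync`. To record per-invoice failures instead, catch inside `work` and return a failure result, as `ProcessSingleInvoice` does.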


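Next Steps

Invoice automation is a significant opportunity to cut manual work, reduce errors, and speed up business processes. This guide walked through the full lifecycle: generating professional invoices from HTML templates, complying with the ZUGFeRD and Factur-X e-invoicing standards, extracting data from incoming invoices with pattern matching and AI-powered processing, and building scalable automation pipelines.

IronPDF is the foundation for these capabilities. It provides HTML-to-PDF rendering, reliable text extraction, and the attachment features needed for PDF/A-3 e-invoicing compliance. Its Chrome-based rendering engine ensures your invoices look exactly as designed, and its extraction methods handle the complexities of PDF text encoding automatically.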

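The patterns shown here are starting points. Real implementations need to be adapted to your specific invoice formats, accounting systems, and business rules. For high-volume scenarios, the batch processing tutorial covers parallel execution with controlled concurrency and error recovery.

Ready to start building? Download IronPDF and try it for free. The library includes a free development license, so you can fully evaluate invoice generation, data extraction, and PDF reporting before purchasing a production license. If you have questions about invoice automation or accounting system integration, contact our engineering support team.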

Frequently Asked Questions

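What is IronPDF used for in C# invoice processing?

IronPDF is used in C# invoice processing to generate professional PDF invoices, extract structured data, and automate invoice workflows while complying with standards such as ZUGFeRD and Factur-X.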

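How do I generate a PDF invoice with IronPDF in C#?

You can generate PDF invoices in C# by using IronPDF's API to create and customize PDF documents programmatically, including the text, tables, and images that make up the invoice.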

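What are ZUGFeRD and Factur-X, and how does IronPDF support them?

ZUGFeRD and Factur-X are e-invoicing standards that make an invoice readable by both humans and machines. IronPDF supports them by letting you generate PDF invoices that conform to these specifications.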

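How does IronPDF help automate accounts payable?

IronPDF extracts structured data from invoices and integrates with automation pipelines. This automates accounts payable, reduces manual data entry, and improves efficiency.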

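Can IronPDF extract data from existing PDF invoices?

Yes. IronPDF can extract structured data from existing PDF invoices, making it easier to process and analyze invoice information automatically.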

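What are the benefits of using IronPDF for invoice processing in C#?

The benefits include simplified invoice generation, compliance with international invoicing standards, efficient data extraction, and stronger automation.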

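Can I customize the appearance of PDF invoices with IronPDF?

Yes. IronPDF lets you customize the appearance of PDF invoices by adding design elements such as logos, text formatting, and layout adjustments to meet branding requirements.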

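What are the typical steps for automating invoice processing with IronPDF?

Automating invoice processing with IronPDF typically involves generating invoices, extracting the necessary data, and integrating with other systems or automation tools to streamline the workflow.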

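How does IronPDF handle different invoice formats?

IronPDF handles a variety of invoice formats by providing tools to generate, process, and read PDF documents, ensuring compatibility with common e-invoicing standards.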

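Curtis Chau
Technical Writer

Curtis Chau holds a Bachelor's degree in Computer Science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. He is passionate about crafting intuitive, attractive user interfaces, and enjoys working with modern frameworks and creating well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT), exploring new ways to integrate hardware and software. In his spare time, he enjoys gaming and building Discord bots, combining his love of technology with creativity.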
