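Invoice Processing in C#: Generate, Extract, and Automate PDF Invoices with .NET

Curtis Chau
Updated: January 20, 2026

Invoice processing in C# .NET with IronPDF covers the entire document lifecycle: generating professional PDF invoices from HTML templates, complying with the ZUGFeRD and Factur-X e-invoicing standards, extracting structured data from received invoices using text parsing and AI-powered processing, and building batch automation pipelines that integrate with accounting systems such as QuickBooks, Xero, and SAP.

TL;DR: Quickstart Guide

This tutorial covers generating, extracting, and automating PDF invoices in C# .NET, including e-invoicing compliance, AI-powered parsing, and accounting system integration.

Who it's for: .NET developers building invoicing modules, accounts payable automation, or e-invoicing compliance.
What you'll build: HTML-template invoice generation with line items and tax calculation, QR codes for payment links, ZUGFeRD/Factur-X-compliant PDF/A-3 output, text extraction with regex, AI-powered invoice parsing, and batch processing integrated with accounting systems.
Where it runs: .NET 10, .NET 8 LTS, .NET Framework 4.6.2+, and .NET Standard 2.0. No dependency on external services.
When to use this approach: when you need to generate invoice PDFs, meet EU e-invoicing requirements, or extract data from vendor invoices for accounts payable.
Why it matters technically: IronPDF renders HTML to PDF accurately, supports PDF/A-3 with embedded XML, and provides text extraction APIs that pair with regex or AI to turn unstructured invoices into structured data.

Generate your first PDF invoice in a few lines of code. Install IronPDF with the NuGet Package Manager:

PM > Install-Package IronPdf

Then copy and run this snippet:

var renderer = new IronPdf.ChromePdfRenderer();
var pdf = renderer.RenderHtmlAsPdf("<h1>Invoice #1001</h1><p>Total: $500.00</p>");
pdf.SaveAs("invoice.pdf");

After purchasing IronPDF or signing up for the 30-day trial, add your license key at the start of your application:

IronPdf.License.LicenseKey = "KEY";

Table of Contents

TL;DR: Quickstart Guide
Quick Overview
Generate Professional PDF Invoices
Build an Invoice HTML Template
Add Dynamic Line Items and Calculate Totals
Add Company Branding and Watermarks
Embed QR Codes for Payment Links
Comply with E-Invoicing Standards
What Is ZUGFeRD and How Does It Work?
What Is Factur-X?
Embed XML Data in PDF/A-3 Invoices
Future-Proof Invoices for EU Regulations
Extract Data from PDF Invoices
Extract Text from PDF Invoices
Extract Table Data for Line Items
Pattern Matching for Invoice Numbers, Dates, and Totals
AI-Powered Invoice Processing
Integrate AI for Invoice Parsing
Extract Structured JSON Data
Handle Inconsistent Invoice Formats
Build an Accounts Payable Automation Pipeline
Integrate with Accounting Systems
Integration Patterns for QuickBooks, Xero, and SAP
Batch Process Hundreds of Invoices

What Is the Invoice Lifecycle, and Why Is PDF Still the Standard?

Before diving into code, it helps to understand how an invoice moves through a modern business system. The invoice lifecycle has five distinct stages: generation, distribution, receipt, data extraction, and accounting integration.

The process starts with generation. A business creates an invoice with line items, pricing, tax calculations, payment terms, and branding. The invoice needs to look professional and meet all legal requirements. Next comes distribution, where the invoice is sent to the customer by email, through a customer portal, or by traditional mail. Once the customer receives the document, their accounts payable team captures it and prepares it for processing. Data extraction pulls key information from the invoice, such as vendor details, line items, totals, and due dates, so it can be verified and matched against purchase orders. Finally, accounting integration moves this data into financial systems such as QuickBooks, Xero, or SAP for payment and record keeping.

Why has PDF remained the most widely used format after all these years? It comes down to a combination of advantages. PDFs keep invoice formatting consistent regardless of device or operating system. Whether someone opens your invoice on Windows, Mac, or a phone, it looks exactly as you designed it. PDFs are also hard to alter accidentally, so they protect document integrity better than formats such as Word or Excel. You can add digital signatures for authenticity and encryption for security. Most importantly, PDF has become a universal standard that virtually every business system recognizes and supports.

There is a challenge too. PDFs are made for people to read, not for computers to process. Instead of storing information as structured data, a PDF records where text, lines, shapes, and images appear on the page. That is why tools like IronPDF are useful: they make it possible to turn human-friendly documents into data that software can use.

How to Generate Professional PDF Invoices in C#

Generating invoices programmatically means turning structured data (customer information, line items, and calculated totals) into a polished PDF document. IronPDF makes this conversion straightforward by using HTML and CSS, technologies most developers already know. This tutorial walks through scenarios you are likely to encounter in real projects.

How to Build an Invoice HTML Template

The foundation of invoice generation with IronPDF is HTML. Rather than wrestling with low-level PDF drawing commands, you design the invoice with standard HTML and CSS, then let IronPDF's Chrome-based rendering engine convert it into a pixel-perfect PDF. Here is a basic invoice template that shows the approach:

using IronPdf;

// Define the HTML template for a basic invoice
// Uses inline CSS for styling headers, tables, and totals
string invoiceHtml = @"
<!DOCTYPE html>
<html>
<head>
    <style>
        body { font-family: Arial, sans-serif; padding: 40px; }
        .header { text-align: right; margin-bottom: 40px; }
        .company-name { font-size: 24px; font-weight: bold; color: #333; }
        .invoice-title { font-size: 32px; margin: 20px 0; }
        .bill-to { margin: 20px 0; }
        table { width: 100%; border-collapse: collapse; margin: 20px 0; }
        th { background-color: #2A95D5; color: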
        white; padding: 10px; text-align: left; }
        td { padding: 10px; border-bottom: 1px solid #ddd; }
        .total { text-align: right; font-size: 20px; font-weight: bold; margin-top: 20px; }
    </style>
</head>
<body>
    <div class='header'>
        <div class='company-name'>Your Company Name</div>
        <div>123 Business Street</div>
        <div>City, State 12345</div>
    </div>
    <div class='invoice-title'>INVOICE</div>
    <div class='bill-to'>
        <strong>Bill To:</strong><br>
        Customer Name<br>
        456 Customer Avenue<br>
        City, State 67890
    </div>
    <table>
        <tr>
            <th>Description</th>
            <th>Quantity</th>
            <th>Price</th>
            <th>Total</th>
        </tr>
        <tr>
            <td>Web Development Services</td>
            <td>10 hours</td>
            <td>$100.00</td>
            <td>$1,000.00</td>
        </tr>
        <tr>
            <td>Consulting</td>
            <td>5 hours</td>
            <td>$150.00</td>
            <td>$750.00</td>
        </tr>
    </table>
    <div class='total'>Total: $1,750.00</div>
</body>
</html>";

// Initialize the Chrome-based PDF renderer
var renderer = new ChromePdfRenderer();

// Convert the HTML string to a PDF document
var pdf = renderer.RenderHtmlAsPdf(invoiceHtml);

// Save the generated PDF to disk
pdf.SaveAs("basic-invoice.pdf");

This approach is very flexible. Any CSS that works in Chrome works in your PDF, including modern features such as flexbox, grid layout, and custom fonts. You can also use external stylesheets and images by referencing a URL or a local file path.

How to Add Dynamic Line Items and Calculate Totals

Real invoices rarely have static content. You need to populate line items from a database, calculate subtotals, apply tax rates, and format currency values. The following example shows a practical pattern for dynamic invoice generation:

using IronPdf;
using System;
using System.Collections.Generic;
using System.Linq;

// Represents a single line item on an invoice
public class InvoiceLineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }

    // Auto-calculates line total from quantity and unit price
    public decimal Total => Quantity * UnitPrice;
}

// Represents a complete invoice with customer details and line items
public class Invoice
{
    public string InvoiceNumber { get; set; }
    public DateTime InvoiceDate { get; set; }
    public string CustomerName { get; set; }
    public string CustomerAddress { get; set; }
    public List<InvoiceLineItem> LineItems { get; set; }

    // Computed properties for invoice totals
    public decimal
    Subtotal => LineItems.Sum(item => item.Total);
    public decimal TaxRate { get; set; } = 0.08m; // Default 8% tax rate
    public decimal Tax => Subtotal * TaxRate;
    public decimal Total => Subtotal + Tax;
}

// Generates PDF invoices from Invoice objects using HTML templates
public class InvoiceGenerator
{
    public PdfDocument GenerateInvoice(Invoice invoice)
    {
        // Build HTML table rows dynamically from line items
        string lineItemsHtml = string.Join("", invoice.LineItems.Select(item => $@"
            <tr>
                <td>{item.Description}</td>
                <td>{item.Quantity}</td>
                <td>${item.UnitPrice:F2}</td>
                <td>${item.Total:F2}</td>
            </tr>
        "));

        // Build the complete HTML invoice using string interpolation
        // All invoice data is injected into the template dynamically
        string invoiceHtml = $@"
<!DOCTYPE html>
<html>
<head>
    <style>
        body {{ font-family: Arial, sans-serif; padding: 40px; }}
        .header {{ text-align: right; margin-bottom: 40px; }}
        .company-name {{ font-size: 24px; font-weight: bold; color: #333; }}
        .invoice-details {{ margin: 20px 0; }}
        table {{ width: 100%; border-collapse: collapse; margin: 20px 0; }}
        th {{ background-color: #2A95D5; color: white; padding: 10px; text-align: left; }}
        td {{ padding: 10px; border-bottom: 1px solid #ddd; }}
        .totals {{ text-align: right; margin-top: 20px; }}
        .totals div {{ margin: 5px 0; }}
        .grand-total {{ font-size: 20px; font-weight: bold; color: #2A95D5; }}
    </style>
</head>
<body>
    <div class='header'>
        <div class='company-name'>Your Company Name</div>
    </div>
    <h1>INVOICE</h1>
    <div class='invoice-details'>
        <strong>Invoice Number:</strong> {invoice.InvoiceNumber}<br>
        <strong>Date:</strong> {invoice.InvoiceDate:MMM dd, yyyy}<br>
        <strong>Bill To:</strong> {invoice.CustomerName}<br>
        {invoice.CustomerAddress}
    </div>
    <table>
        <tr>
            <th>Description</th>
            <th>Quantity</th>
            <th>Unit Price</th>
            <th>Total</th>
        </tr>
        {lineItemsHtml}
    </table>
    <div class='totals'>
        <div>Subtotal: ${invoice.Subtotal:F2}</div>
        <div>Tax ({invoice.TaxRate:P0}): ${invoice.Tax:F2}</div>
        <div
        class='grand-total'>Total: ${invoice.Total:F2}</div>
    </div>
</body>
</html>";

        // Render HTML to PDF and return the document
        var renderer = new ChromePdfRenderer();
        return renderer.RenderHtmlAsPdf(invoiceHtml);
    }
}

The Invoice class encapsulates all invoice data and exposes computed properties for the subtotal, tax, and total. The generator turns this data into HTML with string interpolation and then renders it to PDF. This separation of concerns keeps the code maintainable and testable.

How to Add Company Branding and Watermarks to Invoices

Professional invoices need branding elements such as a logo, and sometimes a watermark that shows payment status. IronPDF supports both images embedded in the HTML and watermarks applied programmatically after rendering.

using IronPdf;

var renderer = new ChromePdfRenderer();

// Invoice HTML template with company logo embedded via URL
// Logo can also be Base64-encoded or a local file path
string htmlWithLogo = @"
<!DOCTYPE html>
<html>
<head>
    <style>
        body { font-family: Arial, sans-serif; padding: 40px; }
        .logo { width: 200px; margin-bottom: 20px; }
    </style>
</head>
<body>
    <div style='text-align: center;'>
        <img src='https://yourcompany.com/logo.png' alt='Company Logo' class='logo' />
    </div>
    <h1>INVOICE</h1>
    <p><strong>Invoice Number:</strong> INV-2024-001</p>
    <p><strong>Total:</strong> $1,250.00</p>
</body>
</html>";

// Render the HTML to PDF
var pdf = renderer.RenderHtmlAsPdf(htmlWithLogo);

// Apply a diagonal "UNPAID" watermark to mark invoice status
// 30% opacity keeps the content readable while the watermark is visible
pdf.ApplyWatermark("<h1 style='color: red;'>UNPAID</h1>",
    opacity: 30,
    rotation: 45,
    verticalAlignment: IronPdf.Editing.VerticalAlignment.Middle);

pdf.SaveAs("invoice-with-watermark.pdf");

The ApplyWatermark method accepts HTML content, which gives you full control over how the watermark looks. You can adjust opacity, rotation, and position to get the effect you need. This is especially useful for marking an invoice as "PAID", "DRAFT", or "CANCELLED" without regenerating the whole document.

How to Embed a QR Code for Payment Links

Modern invoices often include a QR code that customers can scan to pay quickly. IronPDF focuses on PDF generation, but it works together with IronQR, which creates the QR code:

using IronPdf;
using IronQr;
using IronSoftware.Drawing;

string invoiceNumber = "INV-2026-002";
decimal amount = 1500.00m;

// Create a payment URL with invoice details as query parameters
string paymentUrl = $"https://yourcompany.com/pay?invoice={invoiceNumber}&amount={amount}";

// Generate QR code from the payment URL using IronQR
QrCode qrCode = QrWriter.Write(paymentUrl);
AnyBitmap qrImage = qrCode.Save();
qrImage.SaveAs("payment-qr.png", AnyBitmap.ImageFormat.Png);

// Build invoice HTML with the QR code image embedded
// Customers can scan the QR to pay directly from their phone
string invoiceHtml = $@"
<!DOCTYPE html>
<html>
<head>
    <style>
        body {{ font-family: Arial, sans-serif; padding: 40px; }}
        .payment-section {{ margin-top: 40px; text-align: center; border-top: 2px solid #eee; padding-top: 20px; }}
        .qr-code {{ width: 150px; height: 150px; }}
    </style>
</head>
<body>
    <h1>INVOICE {invoiceNumber}</h1>
    <p><strong>Amount Due:</strong> ${amount:F2}</p>
    <div class='payment-section'>
        <p><strong>Scan to Pay Instantly:</strong></p>
        <img src='payment-qr.png' alt='Payment QR Code' class='qr-code' />
        <p style='font-size: 12px; color: #666;'>
            Or visit: {paymentUrl}
        </p>
    </div>
</body>
</html>";

// Convert HTML to PDF and save
var renderer = new ChromePdfRenderer();
var pdf
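    = renderer.RenderHtmlAsPdf(invoiceHtml);
pdf.SaveAs($"invoice-{invoiceNumber}.pdf");

The QR code links directly to a payment page, which reduces friction for the customer and speeds up cash flow. The pattern works with any payment provider that supports URL-based payment initiation.

How to Comply with the ZUGFeRD and Factur-X E-Invoicing Standards in C#

E-invoicing is rapidly becoming mandatory across Europe. Germany led the way with ZUGFeRD, and France followed with Factur-X. These standards embed machine-readable XML data inside a PDF invoice, which enables automated processing while preserving a human-readable document. For businesses operating in European markets, understanding and implementing these standards is increasingly important.

What Is ZUGFeRD and How Does It Work?

ZUGFeRD (Zentraler User Guide des Forums elektronische Rechnung Deutschland) is a German e-invoicing standard that embeds invoice data as an XML file attachment inside a PDF/A-3 compliant document. The embedded XML lets software read the invoice data directly, without OCR or text parsing.

The standard originally defined three conformance levels (ZUGFeRD 2.x adds further profiles such as MINIMUM and BASIC WL), each offering progressively richer structured data:

Basic: contains core invoice data suitable for simple automated processing
Comfort: adds detail that enables fully automated invoice processing
Extended: includes comprehensive data for complex business scenarios across industries

The XML follows the UN/CEFACT Cross-Industry Invoice (CII) schema, which has become a foundation of European e-invoicing standardization.

What Is Factur-X, and How Does It Differ from ZUGFeRD?

Factur-X is the French implementation of the same underlying standard. ZUGFeRD 2.0 and Factur-X are technically identical. They share the same XML schema and conformance profiles based on the European norm EN 16931. The difference is purely regional naming: an invoice created to the ZUGFeRD specification is valid under Factur-X, and vice versa.

How to Embed XML Data in a PDF/A-3 Invoice

IronPDF provides the attachment functionality needed to create compliant e-invoices. The process involves generating the invoice as a PDF, building the XML data according to the CII schema, and embedding the XML as an attachment with the correct naming convention:

using IronPdf;
using System;
using System.Xml.Linq;

// Generates ZUGFeRD-compliant invoices by embedding structured XML data
// ZUGFeRD allows automated processing while keeping a human-readable PDF
public class ZUGFeRDInvoiceGenerator
{
    public void GenerateZUGFeRDInvoice(Invoice invoice)
    {
        // First, create the visual PDF that humans will read
        var renderer = new ChromePdfRenderer();
        string invoiceHtml = BuildInvoiceHtml(invoice);
        var pdf = renderer.RenderHtmlAsPdf(invoiceHtml);

        // Define the UN/CEFACT namespaces required by the ZUGFeRD standard
        // These are mandatory for compliance with European e-invoicing regulations
        XNamespace rsm = "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100";
        XNamespace ram = "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100";
        XNamespace udt = "urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100";

        // Build the ZUGFeRD XML structure following the Cross-Industry Invoice schema
        var zugferdXml = new XDocument(
            new XDeclaration("1.0", "UTF-8", null),
            new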
            XElement(rsm + "CrossIndustryInvoice",
                new XAttribute(XNamespace.Xmlns + "rsm", rsm.NamespaceName),
                new XAttribute(XNamespace.Xmlns + "ram", ram.NamespaceName),
                new XAttribute(XNamespace.Xmlns + "udt", udt.NamespaceName),

                // Document context identifies which e-invoicing guideline is being followed
                new XElement(rsm + "ExchangedDocumentContext",
                    new XElement(ram + "GuidelineSpecifiedDocumentContextParameter",
                        new XElement(ram + "ID", "urn:cen.eu:en16931:2017")
                    )
                ),

                // Core document identification: invoice number, type, and date
                new XElement(rsm + "ExchangedDocument",
                    new XElement(ram + "ID", invoice.InvoiceNumber),
                    new XElement(ram + "TypeCode", "380"), // 380 = Commercial Invoice per UN/CEFACT
                    new XElement(ram + "IssueDateTime",
                        new XElement(udt + "DateTimeString",
                            new XAttribute("format", "102"),
                            invoice.InvoiceDate.ToString("yyyyMMdd")
                        )
                    )
                ),

                // A complete implementation would include additional sections:
                // - Seller information (ram:SellerTradeParty)
                // - Buyer information (ram:BuyerTradeParty)
                // - Line items (ram:IncludedSupplyChainTradeLineItem)
                // - Payment terms (ram:SpecifiedTradePaymentTerms)
                // - Tax summaries (ram:ApplicableTradeTax)

                // Financial summary with all monetary totals
                new XElement(rsm + "SupplyChainTradeTransaction",
                    new XElement(ram + "ApplicableHeaderTradeSettlement",
                        new XElement(ram + "InvoiceCurrencyCode", "EUR"),
                        new XElement(ram + "SpecifiedTradeSettlementHeaderMonetarySummation",
                            new XElement(ram + "TaxBasisTotalAmount", invoice.Subtotal),
                            new XElement(ram + "TaxTotalAmount",
                                new XAttribute("currencyID", "EUR"),
                                invoice.Tax),
                            new XElement(ram + "GrandTotalAmount", invoice.Total),
                            new XElement(ram + "DuePayableAmount", invoice.Total)
                        )
                    )
                )
            )
        );

        // Save the XML to a temp file for embedding
        string xmlPath = $"zugferd-{invoice.InvoiceNumber}.xml";
        zugferdXml.Save(xmlPath);

        // Attach the XML to the PDF - filename must follow ZUGFeRD conventions
        pdf.Attachments.AddFile(xmlPath, "zugferd-invoice.xml", "ZUGFeRD Invoice Data");

        // Final PDF
        // contains both visual invoice and machine-readable XML
        pdf.SaveAs($"invoice-{invoice.InvoiceNumber}-zugferd.pdf");
    }

    // Generates simple HTML for the visual portion of the invoice
    private string BuildInvoiceHtml(Invoice invoice)
    {
        return $@"
<!DOCTYPE html>
<html>
<head>
    <style>
        body {{ font-family: Arial, sans-serif; padding: 40px; }}
        h1 {{ color: #333; }}
        .zugferd-notice {{ margin-top: 30px; padding: 10px; background: #f0f0f0; font-size: 11px; }}
    </style>
</head>
<body>
    <h1>RECHNUNG / INVOICE</h1>
    <p><strong>Rechnungsnummer:</strong> {invoice.InvoiceNumber}</p>
    <p><strong>Datum:</strong> {invoice.InvoiceDate:dd.MM.yyyy}</p>
    <p><strong>Betrag:</strong> €{invoice.Total:F2}</p>
    <div class='zugferd-notice'>
        This invoice contains embedded ZUGFeRD data for automated processing.
    </div>
</body>
</html>";
    }
}

The key aspects of compliance are using the correct XML namespaces, following the CII schema structure, and embedding the XML under the appropriate filename. Type code "380" identifies the document as a commercial invoice in the UN/CEFACT code list. A fully conformant file must also be saved as PDF/A-3 and carry the matching XMP metadata, which this example does not show.

How to Prepare Invoices for EU Requirements

The EU is rolling out e-invoicing progressively across member states. Italy already requires e-invoices for B2B transactions, France is phasing in its requirements from 2026, and Germany has announced mandatory B2B e-invoicing starting in 2025. Building ZUGFeRD/Factur-X support now prepares your system for these regulatory requirements. Here is a pattern for a compliance-aware invoice generator that can target different standards:

using IronPdf;
using System;

// Enum representing supported European e-invoicing standards
public enum InvoiceStandard
{
    None,
    ZUGFeRD,   // German standard - uses CII XML format
    FacturX,   // French standard - technically identical to ZUGFeRD 2.0
    Peppol     // Pan-European standard - uses UBL XML format
}

// Factory class that generates invoices compliant with different e-invoicing standards
// Allows switching between standards without changing core invoice generation logic
public class CompliantInvoiceGenerator
{
    public PdfDocument GenerateCompliantInvoice(Invoice invoice, InvoiceStandard standard)
    {
        // Generate the base PDF from HTML
        var renderer = new ChromePdfRenderer();
        string html = BuildInvoiceHtml(invoice);
        var pdf = renderer.RenderHtmlAsPdf(html);

        // Attach the appropriate XML format based on target market/regulation
        switch (standard)
        {
            case InvoiceStandard.ZUGFeRD:
            case InvoiceStandard.FacturX:
                // Both use Cross-Industry Invoice format, just different filenames
                EmbedCIIXmlData(pdf, invoice, standard);
                break;
            case InvoiceStandard.Peppol:
                // Peppol uses Universal Business Language format
                EmbedUBLXmlData(pdf, invoice);
                break;
        }

        return pdf;
    }

    // Creates and embeds CII-format XML (used by ZUGFeRD and Factur-X)
    private void EmbedCIIXmlData(PdfDocument pdf, Invoice invoice, InvoiceStandard standard)
    {
        string xml = GenerateCIIXml(invoice);

        // Filename convention differs between German and French standards
        string filename = standard == InvoiceStandard.ZUGFeRD
            ? "zugferd-invoice.xml"
            : "factur-x.xml";

        System.IO.File.WriteAllText("temp-invoice.xml", xml);
        pdf.Attachments.AddFile("temp-invoice.xml", filename, $"{standard} Invoice Data");
    }

    // Creates and embeds UBL-format XML for Peppol network compliance
    private void EmbedUBLXmlData(PdfDocument pdf, Invoice invoice)
    {
        // UBL (Universal Business Language) is the Peppol standard format
        string xml = $@"<?xml version='1.0' encoding='UTF-8'?>
<Invoice xmlns='urn:oasis:names:specification:ubl:schema:xsd:Invoice-2'>
    <ID>{invoice.InvoiceNumber}</ID>
    <IssueDate>{invoice.InvoiceDate:yyyy-MM-dd}</IssueDate>
    <DocumentCurrencyCode>EUR</DocumentCurrencyCode>
    <LegalMonetaryTotal>
        <PayableAmount currencyID='EUR'>{invoice.Total}</PayableAmount>
    </LegalMonetaryTotal>
</Invoice>";

        System.IO.File.WriteAllText("peppol-invoice.xml", xml);
        pdf.Attachments.AddFile("peppol-invoice.xml", "invoice.xml", "Peppol UBL Invoice");
    }

    // Generates minimal CII XML structure for demonstration
    private string GenerateCIIXml(Invoice invoice)
    {
        return $@"<?xml version='1.0' encoding='UTF-8'?>
<rsm:CrossIndustryInvoice
    xmlns:rsm='urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100'
    xmlns:ram='urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100'>
    <rsm:ExchangedDocument>
        <ram:ID>{invoice.InvoiceNumber}</ram:ID>
        <ram:TypeCode>380</ram:TypeCode>
    </rsm:ExchangedDocument>
</rsm:CrossIndustryInvoice>";
    }

    private string BuildInvoiceHtml(Invoice invoice)
    {
        return $"<html><body><h1>Invoice {invoice.InvoiceNumber}</h1></body></html>";
    }
}
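
The generator above takes an InvoiceStandard argument, so something has to choose that value at runtime. One possible way to read it from configuration is sketched here. The InvoiceStandardSelector class and the idea of storing the standard as a plain string setting are assumptions made for illustration; they are not part of IronPDF or the original sample. The enum is repeated only so the block compiles on its own.

```csharp
using System;

// Repeated from the listing above only so this sketch compiles on its own.
public enum InvoiceStandard { None, ZUGFeRD, FacturX, Peppol }

// Hypothetical helper for illustration; not part of IronPDF.
public static class InvoiceStandardSelector
{
    // Maps a configuration string such as "FacturX" or "zugferd" to the enum.
    // Missing or unrecognized values fall back to None.
    public static InvoiceStandard FromConfig(string value)
    {
        bool parsed = Enum.TryParse(value, ignoreCase: true, out InvoiceStandard standard);

        // Enum.TryParse also accepts numeric strings, so reject values
        // that do not correspond to a named member of the enum.
        return parsed && Enum.IsDefined(typeof(InvoiceStandard), standard)
            ? standard
            : InvoiceStandard.None;
    }
}
```

A caller might pass the result straight to GenerateCompliantInvoice, for example with a value read from an environment variable (the name INVOICE_STANDARD would be your own choice). Falling back to None keeps a typo in configuration from attaching the wrong XML format; whether that is the right default depends on your compliance requirements.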

This architecture lets you add new standards as they emerge without restructuring the core invoice generation logic. The enum-based approach makes it easy for users or configuration to decide which compliance mode to use.

How to Extract Data from PDF Invoices in C#

Generating invoices is only half the job. Most businesses also receive invoices from vendors and need to extract their data for processing. IronPDF provides text extraction capabilities that serve as the foundation for invoice data capture.

How to Extract Text from a PDF Invoice

The most basic extraction operation is retrieving all text content from a PDF. IronPDF's ExtractAllText method handles the complexities of PDF text encoding and positioning:

using IronPdf;
using System;

// Extracts raw text content from PDF invoices for further processing
public class InvoiceTextExtractor
{
    // Extracts all text from a PDF in one
    // operation
    // Best for single-page invoices or when you need the complete content
    public string ExtractInvoiceText(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);

        // IronPDF handles the complexity of PDF text encoding and positioning
        string allText = pdf.ExtractAllText();

        Console.WriteLine("Full invoice text:");
        Console.WriteLine(allText);
        return allText;
    }

    // Extracts text page by page - useful for multi-page invoices
    // Allows you to process header info separately from line items
    public void ExtractTextByPage(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);

        // Iterate through each page (0-indexed)
        for (int i = 0; i < pdf.PageCount; i++)
        {
            string pageText = pdf.ExtractTextFromPage(i);
            Console.WriteLine($"\n--- Page {i + 1} ---");
            Console.WriteLine(pageText);
        }
    }
}

Page-by-page extraction is especially useful for multi-page invoices where you need to locate specific sections, for example when line items span several pages but the header information appears only on the first.

How to Extract Table Data for Line Items

Invoice line items usually appear in a table. PDFs have no native table structure, but you can extract the text and parse it to reconstruct the tabular data:

using IronPdf;
using System;
using System.Collections.Generic;

// Data model for a single
// invoice line item
public class InvoiceLineItem
{
    public string Description { get; set; }
    public decimal Quantity { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal Total { get; set; }
}

// Extracts tabular line item data from PDF invoices
// Note: PDFs don't have native table structure, so this uses text parsing
public class InvoiceTableExtractor
{
    public List<InvoiceLineItem> ExtractLineItems(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);
        string text = pdf.ExtractAllText();
        var lineItems = new List<InvoiceLineItem>();
        string[] lines = text.Split('\n');

        foreach (string line in lines)
        {
            // Currency symbols indicate potential line items with amounts
            if (line.Contains("$") || line.Contains("€"))
            {
                Console.WriteLine($"Potential line item: {line.Trim()}");

                // Split on whitespace to separate columns
                // Actual parsing logic depends on your invoice format
                string[] parts = line.Split(new[] { '\t', ' ' }, StringSplitOptions.RemoveEmptyEntries);

                // Try to find numeric values that could be amounts
                foreach (string part in parts)
                {
                    string cleaned = part.Replace("$", "").Replace("€", "").Replace(",", "");
                    if (decimal.TryParse(cleaned, out decimal amount))
                    {
                        Console.WriteLine($"  Found amount: {amount:C}");
                    }
                }
            }
        }

        // This demo only logs candidates; map the parsed columns into
        // InvoiceLineItem objects here once the vendor layout is known
        return lineItems;
    }
}

The parsing logic will vary with your invoice format. For invoices from known vendors with a consistent layout, you can build format-specific parsers. For varied formats, consider the AI-powered extraction described later in this article.

How to Use Pattern Matching for Invoice Numbers, Dates, and Totals

Regular expressions are well suited to pulling specific data points out of invoice text. Key fields such as the invoice number, date, and total usually follow recognizable patterns:

using IronPdf;
using System;
using System.Text.RegularExpressions;

// Data model for extracted invoice information
public class InvoiceData
{
    public string InvoiceNumber { get; set; }
    public string InvoiceDate { get; set; }
    public decimal TotalAmount { get; set; }
    public string VendorName { get; set; }
}

// Extracts key invoice fields using regex pattern matching
// Multiple patterns handle variations across different vendors
public class InvoiceParser
{
    public InvoiceData ParseInvoice(string pdfPath)
    {
        var pdf = PdfDocument.FromFile(pdfPath);
        string text = pdf.ExtractAllText();
        var invoiceData = new InvoiceData();

        // Try multiple patterns to find invoice number
        // Handles: "Invoice Number: 123", "Invoice #123", German "Rechnungsnummer", "INV-123"
        // The captured value must contain a digit so that a label such as
        // "Invoice Number" or "Invoice Date" is not captured as the number itself
        string[] invoiceNumberPatterns = new[]
        {
            @"Invoice\s+(?:Number|No\.?)\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
            @"Invoice\s*#?\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
            @"Rechnungsnummer\s*:?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
            @"INV[-\s]?(\d+)"
        };

        foreach (string pattern in invoiceNumberPatterns)
        {
            var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
            if (match.Success)
            {
                invoiceData.InvoiceNumber = match.Groups[1].Value;
                Console.WriteLine($"Found Invoice Number: {invoiceData.InvoiceNumber}");
                break;
            }
        }

        // Date patterns for US, European, and written formats
        string[] datePatterns = new[]
        {
            @"Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
            @"Invoice\s+Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
            @"(\d{1,2}\.\d{1,2}\.\d{4})",      // European: DD.MM.YYYY
            @"(\w+\s+\d{1,2},?\s+\d{4})"       // Written: January 15, 2024
        };

        foreach (string pattern in datePatterns)
        {
            var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
            if (match.Success)
            {
                invoiceData.InvoiceDate = match.Groups[1].Value;
                Console.WriteLine($"Found Date: {invoiceData.InvoiceDate}");
                break;
            }
        }

        // Look for total amount with various labels
        // \b keeps "Total" from matching inside "Subtotal"
        string[] totalPatterns = new[]
        {
            @"\bTotal\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})",
            @"Amount\s+Due\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})",
            @"Grand\s+Total\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})",
            @"Balance\s+Due\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})"
        };

        foreach (string pattern in totalPatterns)
        {
            var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase);
            if (match.Success)
            {
                // Remove commas before parsing
                string amountStr = match.Groups[1].Value.Replace(",", "");
                if (decimal.TryParse(amountStr, out decimal amount))
                {
                    invoiceData.TotalAmount = amount;
                    Console.WriteLine($"Found Total: ${invoiceData.TotalAmount:F2}");
                    break;
                }
            }
        }

        return invoiceData;
    }
}
"Invoice Number: 123", German "Rechnungsnummer" string[] invoiceNumberPatterns = new[] { @"Invoice\s*#?\s*:?\s*([A-Z0-9-]+)", @"INV[-\s]?(\d+)", @"Invoice\s+Number\s*:?\s*([A-Z0-9-]+)", @"Rechnungsnummer\s*:?\s*([A-Z0-9-]+)" }; foreach (string pattern in invoiceNumberPatterns) { var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase); if (match.Success) { invoiceData.InvoiceNumber = match.Groups[1].Value; Console.WriteLine($"Found Invoice Number: {invoiceData.InvoiceNumber}"); break; } } // Date patterns for US, European, and written formats string[] datePatterns = new[] { @"Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})", @"Invoice\s+Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})", @"(\d{1,2}\.\d{1,2}\.\d{4})", // European: DD.MM.YYYY @"(\w+\s+\d{1,2},?\s+\d{4})" // Written: January 15, 2024 }; foreach (string pattern in datePatterns) { var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase); if (match.Success) { invoiceData.InvoiceDate = match.Groups[1].Value; Console.WriteLine($"Found Date: {invoiceData.InvoiceDate}"); break; } } // Look for total amount with various labels string[] totalPatterns = new[] { @"Total\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})", @"Amount\s+Due\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})", @"Grand\s+Total\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})", @"Balance\s+Due\s*:?\s*[\$€]?\s*([\d,]+\.\d{2})" }; foreach (string pattern in totalPatterns) { var match = Regex.Match(text, pattern, RegexOptions.IgnoreCase); if (match.Success) { // Remove commas before parsing string amountStr = match.Groups[1].Value.Replace(",", ""); if (decimal.TryParse(amountStr, out decimal amount)) { invoiceData.TotalAmount = amount; Console.WriteLine($"Found Total: ${invoiceData.TotalAmount:F2}"); break; } } } return invoiceData; } } $vbLabelText $csharpLabel 这种基于模式的方法对于具有可预测格式的发票非常有效。 多种模式变化可处理不同供应商之间常见的格式差异,如 "发票#"和 "发票号码:"。 扫描发票或基于图像的发票怎么办? 
The text extraction methods above work for PDFs that contain embedded text. Scanned documents and image-based PDFs, however, have no extractable text; they are essentially pictures of an invoice.

Note: to process scanned invoices, you need OCR (optical character recognition). IronOCR, part of the Iron Suite, integrates seamlessly with IronPDF for these scenarios. See https://ironsoftware.com/csharp/ocr/ to learn more about extracting text from scanned documents and images.

How to Use AI for Invoice Processing in .NET

Traditional pattern matching works well for standardized invoices, but real-world accounts payable departments receive documents in countless formats. This is where AI extraction shines. Large language models can understand invoice semantics and extract structured data even from unfamiliar layouts.

How to Integrate AI for Invoice Parsing

The AI-powered invoice processing pattern combines IronPDF's text extraction with an LLM API call. Here is a generic implementation that works with any OpenAI-compatible API:

    using IronPdf;
    using System;
    using System.Net.Http;
    using System.Text;
    using System.Text.Json;
    using System.Threading.Tasks;

    // Data model for extracted invoice information
    // (same shape as in the pattern-matching example; define it once per project)
    public class InvoiceData
    {
        public string InvoiceNumber { get; set; }
        public string InvoiceDate { get; set; }
        public string VendorName { get; set; }
        public decimal TotalAmount { get; set; }
    }

    // Leverages AI/LLM APIs to extract structured data from any invoice format
    // Works with OpenAI or any compatible API endpoint
    public class AIInvoiceParser
    {
        private readonly HttpClient _httpClient;
        private readonly string _apiKey;
        private readonly string _apiUrl;

        public AIInvoiceParser(string apiKey, string apiUrl = "https://api.openai.com/v1/chat/completions")
        {
            _apiKey = apiKey;
            _apiUrl = apiUrl;
            _httpClient = new HttpClient();
            _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {_apiKey}");
        }

        public async Task<InvoiceData> ParseInvoiceWithAI(string pdfPath)
        {
            // First extract raw text from the PDF using IronPDF
            var pdf = PdfDocument.FromFile(pdfPath);
            string invoiceText = pdf.ExtractAllText();

            // Construct a prompt that instructs the AI to return structured JSON
            // Being explicit about the format reduces parsing errors
            string prompt = $@"Extract the following information from this invoice text. Return ONLY valid JSON with no additional text or markdown formatting.

    Required fields:
    - InvoiceNumber: The invoice or document number
    - InvoiceDate: The invoice date in YYYY-MM-DD format
    - VendorName: The company or person who sent the invoice
    - TotalAmount: The total amount due as a number (no currency symbols)

    Invoice text:
    {invoiceText}

    JSON response:";

            // Build the API request with a system prompt for context
            var requestBody = new
            {
                model = "gpt-4",
                messages = new[]
                {
                    new { role = "system", content = "You are an invoice data extraction assistant. Extract structured data from invoices and return valid JSON only." },
                    new { role = "user", content = prompt }
                },
                temperature = 0.1  // Low temperature favors consistent, repeatable results
            };

            var json = JsonSerializer.Serialize(requestBody);
            var content = new StringContent(json, Encoding.UTF8, "application/json");

            var response = await _httpClient.PostAsync(_apiUrl, content);

            // Fail early on HTTP errors instead of trying to parse an error body
            response.EnsureSuccessStatusCode();
            var responseJson = await response.Content.ReadAsStringAsync();

            // Navigate the API response structure to get the extracted content
            using var doc = JsonDocument.Parse(responseJson);
            var messageContent = doc.RootElement
                .GetProperty("choices")[0]
                .GetProperty("message")
                .GetProperty("content")
                .GetString();

            Console.WriteLine("AI Extracted Data:");
            Console.WriteLine(messageContent);

            // Deserialize the AI's JSON response into our data class
            var invoiceData = JsonSerializer.Deserialize<InvoiceData>(messageContent,
                new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

            return invoiceData;
        }
    }

The low temperature setting (0.1) encourages deterministic output, which matters for data extraction tasks where you want the same input to produce consistent results.

How to Extract Structured JSON Data from Invoices

For more complex invoices containing line items, vendor details, and customer information, you can request a richer JSON structure:

    using IronPdf;
    using System;
    using System.Collections.Generic;
    using System.Text.Json;
    using System.Threading.Tasks;

    // Comprehensive invoice data model with all details
    public class DetailedInvoiceData
    {
        public string InvoiceNumber { get; set; }
        public DateTime InvoiceDate { get; set; }
        public DateTime DueDate { get; set; }
        public VendorInfo Vendor { get; set; }
        public CustomerInfo Customer { get; set; }
        public List<LineItem> LineItems { get; set; }
        public decimal Subtotal { get; set; }
        public decimal Tax { get; set; }
        public decimal Total { get; set; }
    }

    public class VendorInfo
    {
        public string Name { get; set; }
        public string Address { get; set; }
        public string TaxId { get; set; }
    }

    public class CustomerInfo
    {
        public string Name { get; set; }
        public string Address { get; set; }
    }

    public class LineItem
    {
        public string Description { get; set; }
        public decimal Quantity { get; set; }
        public decimal UnitPrice { get; set; }
        public decimal Total { get; set; }
    }

    // Extracts comprehensive invoice data including line items and party details
    public class StructuredInvoiceExtractor
    {
        private readonly AIInvoiceParser _aiParser;

        public StructuredInvoiceExtractor(string apiKey)
        {
            _aiParser = new AIInvoiceParser(apiKey);
        }

        public async Task<DetailedInvoiceData> ExtractDetailedData(string pdfPath)
        {
            var pdf = PdfDocument.FromFile(pdfPath);
            string text = pdf.ExtractAllText();

            // Define the exact JSON structure we want the AI to return
            // This schema guides the AI to extract all relevant fields
            string jsonSchema = @"{
      ""InvoiceNumber"": ""string"",
      ""InvoiceDate"": ""YYYY-MM-DD"",
      ""DueDate"": ""YYYY-MM-DD"",
      ""Vendor"": {
        ""Name"": ""string"",
        ""Address"": ""string"",
        ""TaxId"": ""string or null""
      },
      ""Customer"": {
        ""Name"": ""string"",
        ""Address"": ""string""
      },
      ""LineItems"": [
        {
          ""Description"": ""string"",
          ""Quantity"": 0.0,
          ""UnitPrice"": 0.00,
          ""Total"": 0.00
        }
      ],
      ""Subtotal"": 0.00,
      ""Tax"": 0.00,
      ""Total"": 0.00
    }";

            // Prompt includes both the schema and the extracted text
            string prompt = $@"Extract all invoice data and return it in this exact JSON structure:

    {jsonSchema}

    Invoice text:
    {text}

    Return only valid JSON, no markdown formatting or additional text.";

            // Call AI API and parse response (implementation as shown above)
            // Return deserialized DetailedInvoiceData
            return new DetailedInvoiceData(); // Placeholder
        }
    }

How to Handle Inconsistent Invoice Formats

The real power of AI extraction shows when handling invoices from multiple vendors, each with its own format. A smart processor can try pattern-based extraction first (faster and free of charge) and fall back to AI only when needed:

    using System;
    using System.Threading.Tasks;

    // Hybrid processor that optimizes for cost and capability
    // Tries fast regex patterns first, uses AI only when patterns fail
    public class SmartInvoiceProcessor
    {
        private readonly AIInvoiceParser _aiParser;

        public SmartInvoiceProcessor(string aiApiKey)
        {
            _aiParser = new AIInvoiceParser(aiApiKey);
        }

        public async Task<InvoiceData> ProcessAnyInvoice(string pdfPath)
        {
            // First attempt: regex patterns (fast and free)
            // Calls InvoiceParser.ParseInvoice from the pattern-matching example,
            // which loads the PDF and extracts its text itself
            var patternParser = new InvoiceParser();
            var standardResult = patternParser.ParseInvoice(pdfPath);

            // If pattern matching found all required fields, use that result
            if (IsComplete(standardResult))
            {
                Console.WriteLine("Pattern extraction successful");
                return standardResult;
            }

            // Fallback: use AI for complex or unusual invoice formats
            // This costs money but copes with far more varied layouts
            Console.WriteLine("Using AI extraction for complex invoice format");
            var aiResult = await _aiParser.ParseInvoiceWithAI(pdfPath);
            return aiResult;
        }

        // Validates that we have the minimum required fields
        private bool IsComplete(InvoiceData data)
        {
            return !string.IsNullOrEmpty(data.InvoiceNumber) &&
                   !string.IsNullOrEmpty(data.InvoiceDate) &&
                   data.TotalAmount > 0;
        }
    }

How to Build an Accounts Payable Automation Pipeline

Putting all these pieces together, here is a complete automation pipeline that processes incoming invoices, extracts the data, validates it, and prepares it for your accounting system:

    using IronPdf;
    using System;
    using System.IO;
    using System.Threading.Tasks;
    using System.Collections.Generic;
    using System.Linq;

    // Tracks the outcome of processing each invoice
    public class ProcessingResult
    {
        public string FileName { get; set; }
        public bool Success { get; set; }
        public string InvoiceNumber { get; set; }
        public string ErrorMessage { get; set; }
    }

    // Complete automation pipeline for accounts payable
    // Scans a folder, extracts data, validates, and routes to accounting system
    public class InvoiceAutomationPipeline
    {
        private readonly SmartInvoiceProcessor _processor;
        private readonly string _inputFolder;
        private readonly string _processedFolder;
        private readonly string _errorFolder;

        public InvoiceAutomationPipeline(string apiKey, string inputFolder)
        {
            _processor = new SmartInvoiceProcessor(apiKey);
            _inputFolder = inputFolder;
            _processedFolder = Path.Combine(inputFolder, "processed");
            _errorFolder = Path.Combine(inputFolder, "errors");

            // Create output directories if they don't exist
            Directory.CreateDirectory(_processedFolder);
            Directory.CreateDirectory(_errorFolder);
        }

        // Main entry point - processes all PDFs in the input folder
        public async Task<List<ProcessingResult>> ProcessInvoiceBatch()
        {
            string[] invoiceFiles = Directory.GetFiles(_inputFolder, "*.pdf");
            Console.WriteLine($"Found {invoiceFiles.Length} invoices to process");

            var results = new
                List<ProcessingResult>();

            foreach (string invoicePath in invoiceFiles)
            {
                string fileName = Path.GetFileName(invoicePath);

                try
                {
                    Console.WriteLine($"Processing: {fileName}");

                    // Extract data using smart processor (patterns first, then AI)
                    var invoiceData = await _processor.ProcessAnyInvoice(invoicePath);

                    // Ensure we have minimum required fields before proceeding
                    if (ValidateInvoiceData(invoiceData))
                    {
                        // Send to accounting system (QuickBooks, Xero, etc.)
                        await SaveToAccountingSystem(invoiceData);

                        // Archive successful invoices
                        // Note: the File.Move overload with 'overwrite' requires .NET Core 3.0 or later
                        string destPath = Path.Combine(_processedFolder, fileName);
                        File.Move(invoicePath, destPath, overwrite: true);

                        results.Add(new ProcessingResult
                        {
                            FileName = fileName,
                            Success = true,
                            InvoiceNumber = invoiceData.InvoiceNumber
                        });

                        Console.WriteLine($"✓ Processed: {invoiceData.InvoiceNumber}");
                    }
                    else
                    {
                        throw new Exception("Validation failed - missing required fields");
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"✗ Failed: {fileName} - {ex.Message}");

                    // Quarantine failed invoices for manual review
                    string destPath = Path.Combine(_errorFolder, fileName);
                    File.Move(invoicePath, destPath, overwrite: true);

                    results.Add(new ProcessingResult
                    {
                        FileName = fileName,
                        Success = false,
                        ErrorMessage = ex.Message
                    });
                }
            }

            GenerateReport(results);
            return results;
        }

        // Checks for minimum required fields
        // Note: the regex-based InvoiceParser shown earlier never sets VendorName,
        // so add a vendor pattern there or regex-only results will fail this check
        private bool ValidateInvoiceData(InvoiceData data)
        {
            return !string.IsNullOrEmpty(data.InvoiceNumber) &&
                   !string.IsNullOrEmpty(data.VendorName) &&
                   data.TotalAmount > 0;
        }

        // Placeholder for accounting system integration
        private async Task SaveToAccountingSystem(InvoiceData data)
        {
            // Integrate with your accounting system here
            // Examples: QuickBooks API, Xero API, SAP, or database storage
            Console.WriteLine($"  Saved invoice {data.InvoiceNumber} to accounting system");
            await Task.CompletedTask;
        }

        // Outputs a summary of the batch processing results
        private void GenerateReport(List<ProcessingResult> results)
        {
            int successful = results.Count(r => r.Success);
            int failed = results.Count(r => !r.Success);

            Console.WriteLine($"\n========== Processing Complete ==========");
            Console.WriteLine($"Total Processed: {results.Count}");
            Console.WriteLine($"Successful: {successful}");
            Console.WriteLine($"Failed: {failed}");

            if (failed > 0)
            {
                Console.WriteLine("\nFailed invoices requiring review:");
                foreach (var failure in results.Where(r => !r.Success))
                {
                    Console.WriteLine($"  • {failure.FileName}: {failure.ErrorMessage}");
                }
            }
        }
    }

The pipeline implements a complete workflow: it scans a folder for incoming PDFs, processes each one, validates the extracted data, routes successful extractions to your accounting system, and quarantines failures for manual review. A summary report provides visibility into the processing results.

How to Integrate C# Invoice Processing with Accounting Systems

Extracted invoice data ultimately needs to flow into an accounting system for payment and record keeping. The specifics vary by platform, but the integration pattern is consistent.

What Are the Common Integration Patterns for QuickBooks, Xero, and SAP?

Most accounting platforms provide REST APIs for creating bills or invoices programmatically. Here is a generic pattern you can adapt to your specific platform:

    using System;
    using System.Net.Http;
    using System.Text;
    using System.Text.Json;
    using System.Threading.Tasks;

    // Generic integration layer for pushing invoice data to accounting systems
    // Adapt the API calls based on your specific platform
    public class AccountingSystemIntegration
    {
        private readonly HttpClient _httpClient;
        private readonly string _apiKey;
        private readonly string _baseUrl;

        public AccountingSystemIntegration(string apiKey, string baseUrl)
        {
            _apiKey = apiKey;
            _baseUrl = baseUrl;
            _httpClient = new HttpClient();
            _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {_apiKey}");
        }

        // Creates a Bill in QuickBooks (vendor invoices are called "Bills")
        public async Task SendToQuickBooks(InvoiceData invoice)
        {
            // QuickBooks Bill structure - see their API docs for full schema
            var bill = new
            {
                VendorRef = new { name = invoice.VendorName },
                TxnDate = invoice.InvoiceDate,
                DocNumber = invoice.InvoiceNumber,
                TotalAmt = invoice.TotalAmount,
                Line = new[]
                {
                    new
                    {
                        Amount = invoice.TotalAmount,
                        DetailType = "AccountBasedExpenseLineDetail",
                        AccountBasedExpenseLineDetail = new
                        {
                            AccountRef = new { name = "Accounts Payable" }
                        }
                    }
                }
            };

            // {companyId} is a literal placeholder here; substitute your own company ID
            await PostToApi("/v3/company/{companyId}/bill", bill);
        }

        // Creates an accounts payable invoice in Xero
        public async Task SendToXero(InvoiceData invoice)
        {
            // ACCPAY type indicates this is a bill to pay (not a sales invoice)
            var bill = new
            {
                Type = "ACCPAY",
                Contact = new { Name = invoice.VendorName },
                Date = invoice.InvoiceDate,
                InvoiceNumber = invoice.InvoiceNumber,
                Total = invoice.TotalAmount
            };

            await PostToApi("/api.xro/2.0/Invoices", bill);
        }

        // Generic POST helper with error handling
        private async Task PostToApi(string endpoint, object payload)
        {
            string json = JsonSerializer.Serialize(payload);
            var content = new StringContent(json, Encoding.UTF8, "application/json");

            var response = await _httpClient.PostAsync($"{_baseUrl}{endpoint}", content);

            if (!response.IsSuccessStatusCode)
            {
                string error = await response.Content.ReadAsStringAsync();
                throw new Exception($"API Error: {response.StatusCode} - {error}");
            }

            Console.WriteLine($"Successfully posted to {endpoint}");
        }
    }

Each platform has its own authentication mechanism (QuickBooks and Xero use OAuth; SAP supports several methods), required fields, and API conventions. Consult your target platform's documentation for the specifics, but the pattern of transforming extracted invoice data into an API payload stays the same.

How to Batch Process Hundreds of Invoices

High-volume invoice processing requires careful attention to concurrency and resource management. Here is a pattern that uses parallel processing with controlled concurrency:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    // Tracks the result of processing a single invoice in a batch
    public class BatchResult
    {
        public string FilePath { get; set; }
        public bool Success { get; set; }
        public string InvoiceNumber { get; set; }
        public string Error { get; set; }
    }

    // High-volume invoice processor with controlled parallelism
    // Prevents overwhelming APIs while maximizing throughput
    public class BatchInvoiceProcessor
    {
        private readonly SmartInvoiceProcessor _invoiceProcessor;
        private readonly AccountingSystemIntegration _accountingIntegration;
        private readonly int _maxConcurrency;

        public BatchInvoiceProcessor(string aiApiKey, string accountingApiKey, string accountingUrl, int maxConcurrency = 5)
        {
            _invoiceProcessor = new SmartInvoiceProcessor(aiApiKey);
            _accountingIntegration = new AccountingSystemIntegration(accountingApiKey, accountingUrl);
            _maxConcurrency = maxConcurrency;  // Adjust based on API rate limits
        }

        // Processes multiple invoices in parallel with controlled concurrency
        public async Task<List<BatchResult>> ProcessInvoiceBatch(List<string> invoicePaths)
        {
            // Thread-safe collection for gathering results from parallel tasks
            var results = new ConcurrentBag<BatchResult>();

            // Semaphore limits how many invoices process simultaneously
            var semaphore = new SemaphoreSlim(_maxConcurrency);

            // Create a task for each invoice
            var tasks = invoicePaths.Select(async path =>
            {
                // Wait for a slot to become available
                await semaphore.WaitAsync();
                try
                {
                    var result = await ProcessSingleInvoice(path);
                    results.Add(result);
                }
                finally
                {
                    // Release slot for next invoice
                    semaphore.Release();
                }
            });

            // Wait for all invoices to complete
            await Task.WhenAll(tasks);

            // Output summary statistics
            var resultList = results.ToList();
            int successful = resultList.Count(r => r.Success);
            int failed = resultList.Count(r => !r.Success);

            Console.WriteLine($"\nBatch Processing Complete:");
Console.WriteLine($" Total: {resultList.Count}"); Console.WriteLine($" Successful: {successful}"); Console.WriteLine($" Failed: {failed}"); return resultList; } // Processes one invoice: extract data and send to accounting system private async Task<BatchResult> ProcessSingleInvoice(string pdfPath) { try { Console.WriteLine($"Processing: {pdfPath}"); var invoiceData = await _invoiceProcessor.ProcessAnyInvoice(pdfPath); await _accountingIntegration.SendToQuickBooks(invoiceData); Console.WriteLine($"✓ Completed: {invoiceData.InvoiceNumber}"); return new BatchResult { FilePath = pdfPath, Success = true, InvoiceNumber = invoiceData.InvoiceNumber }; } catch (Exception ex) { Console.WriteLine($"✗ Failed: {pdfPath}"); return new BatchResult { FilePath = pdfPath, Success = false, Error = ex.Message }; } } } $vbLabelText $csharpLabel SemaphoreSlim可确保您不会淹没外部 API 或耗尽系统资源。 根据您的 API 速率限制和服务器容量调整 _maxConcurrency 。 ConcurrentBag 可以安全地收集并行操作的结果。 下一步 发票自动化是减少人工工作、减少错误和加快业务流程的重要机会。 本指南将引导您了解整个生命周期:从 HTML 模板生成专业发票,符合 ZUGFeRD 和 Factur-X 电子发票标准、使用模式匹配和AI 驱动的处理从收到的发票中提取数据,并构建可扩展的自动化管道。 IronPDF是这些功能的基础,它提供了强大的 HTML 到 PDF 渲染功能、可靠的 文本提取功能以及 PDF/A-3 电子发票合规性所需的附件功能。 其基于 Chrome 浏览器的渲染引擎可确保您的发票看起来与设计完全一致,而其提取方法可自动处理 PDF 文本编码的复杂性。 此处显示的模式是起点。 实际实施时需要根据您的具体发票格式、会计系统和业务规则进行调整。 对于大容量场景,批处理教程涵盖了并行执行与受控并发和错误恢复。 准备好开始构建了吗? 下载 IronPDF 并免费试用。 该库包含一个免费的开发许可证,因此您可以在获得生产许可证之前充分评估发票生成、数据提取和PDF 报告功能。 如果您对发票自动化或会计系统集成有任何疑问,请联系我们的工程支持团队。 常见问题解答 IronPDF 在 C# 发票处理中的用途是什么? IronPDF 用于 C# 发票处理,可生成专业的 PDF 发票、提取结构化数据并自动执行发票工作流程,同时确保符合 ZUGFeRD 和 Factur-X 等标准。 如何在 C# 中使用 IronPDF 生成 PDF 发票? 通过利用 IronPDF 的 API 以编程方式创建和自定义 PDF 文档,您可以使用 IronPDF 在 C# 中生成 PDF 发票。这包括添加构成发票的文本、表格和图片等元素。 什么是 ZUGFeRD 和 Factur-X,IronPDF 如何支持它们? ZUGFeRD 和 Factur-X 是电子发票标准,可确保发票的人机可读性。IronPDF 支持这些标准,允许您生成符合这些规范的 PDF 发票。 IronPdf 如何帮助实现应付账款流程自动化? IronPdf 可以从发票中提取结构化数据并与自动化管道集成,从而实现应付账款流程的自动化,减少人工数据录入并提高效率。 IronPDF 能否从现有的 PDF 发票中提取数据? 是的,IronPDF 可以从现有的 PDF 发票中提取结构化数据,从而更容易自动处理和分析发票信息。 使用 IronPDF 在 C# 中处理发票有什么好处? 
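The benefits of using IronPDF for invoice processing in C# include simplified invoice generation, compliance with international invoicing standards, efficient data extraction, and stronger automation capabilities.

Can I customize the appearance of PDF invoices with IronPDF?
Yes. IronPDF lets you customize the appearance of PDF invoices by adding design elements such as logos, text formatting, and layout adjustments to meet your branding requirements.

What are the typical steps for automating invoice processing with IronPDF?
Automating invoice processing with IronPDF typically involves generating invoices, extracting the necessary data, and integrating with other systems or automation tools to streamline the workflow.

How does IronPDF handle different invoice formats?
IronPDF handles a variety of invoice formats by providing tools to generate, process, and read PDF documents, ensuring compatibility with common e-invoicing standards.

Curtis Chau, Technical Writer
Curtis Chau holds a Bachelor's degree in Computer Science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. He is passionate about crafting intuitive and aesthetically pleasing user interfaces, and he enjoys working with modern frameworks and creating well-structured, visually appealing manuals. Beyond development, Curtis has a strong interest in the Internet of Things (IoT) and explores new ways to integrate hardware and software. In his free time, he enjoys gaming and building Discord bots, combining his love of technology with creativity.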