How to Extract Data from PDF in C#

Curtis Chau
Updated: August 20, 2025

Extracting data from PDFs programmatically saves the time otherwise spent on manual data entry. This article explains how developers can use the IronPDF library to extract text and images from PDF documents.

How to Extract Data from PDF in C#

1. Download the C# library for extracting data from PDFs
2. Create a new project in Visual Studio
3. Install the library in your project
4. Extract data from the whole PDF or from specific pages
5. View the data output from the PDF document

IronPDF: C# PDF Library

IronPDF is a .NET library for creating, editing, and converting PDF files. It is a popular library for these tasks and provides an easy-to-use API for developers to use in their applications.

With IronPDF, you can build a straightforward PDF solution quickly, controlling the text, layout, and graphics of each document from your .NET program.

IronPDF also includes features for extracting data from PDF files, which this article covers. The first step is to create or open a C# project.
Create or Open a C# Project in Visual Studio

This tutorial recommends using the latest version of Visual Studio. Once Visual Studio is open, follow the steps below to create a new C# project. If you already have a project you would like to use, skip these steps and go directly to the next section.

1. Open Visual Studio.
2. Click the "Create a new project" button.
3. Select "C# Console Application" from the templates.
4. Give the project a name and click the Next button.
5. Select a .NET Framework version according to your project's requirements and click the Create button.

Visual Studio will now generate a new C# .NET project.

Install the IronPDF Library

The IronPDF library can be installed in multiple ways.

Using Package Manager Console

Open the Package Manager Console by going to Tools > NuGet Package Manager > Package Manager Console. Run the following command to install the IronPDF library:

```
Install-Package IronPdf
```

After installation, the IronPDF dependency appears in the dependencies section of the Solution Explorer.

Using the NuGet Package Manager

Another way to install the IronPDF library is through Visual Studio's integrated NuGet Package Manager UI.

1. Go to Tools in the main menu.
2. Hover over "NuGet Package Manager" in the drop-down menu and select "Manage NuGet Packages for Solution...". This opens the NuGet Package Manager window.
3. Go to the Browse tab, type IronPdf in the search box, and press Enter.
4. Select IronPDF from the search results and click the "Install" button to begin the installation.
Extract Data from PDF Files

The following code shows how to extract data using IronPDF:

```csharp
// Import necessary namespaces
using IronPdf;
using System.Collections.Generic;
using System.Drawing;

public class PDFExtractor
{
    public void ExtractDataFromPDF()
    {
        // Open a 128-bit encrypted PDF file by providing the filename and password
        using PdfDocument pdf = PdfDocument.FromFile("encrypted.pdf", "password");

        // Extract all text from the PDF document
        string allText = pdf.ExtractAllText();

        // Extract all images from the PDF document
        IEnumerable<Image> allImages = pdf.ExtractAllImages();

        // Iterate over each page in the PDF document (page indexes are zero-based)
        for (var index = 0; index < pdf.PageCount; index++)
        {
            // Human-readable page number, e.g. for logging or file names
            int pageNumber = index + 1;

            // Extract text from the specific page
            string text = pdf.ExtractTextFromPage(index);

            // Extract images from the specific page
            IEnumerable<Image> images = pdf.ExtractImagesFromPage(index);

            // Code to process the extracted text and images
            //...
        }
    }
}
```

The same example in Visual Basic:

```vb
' Import necessary namespaces
Imports IronPdf
Imports System.Collections.Generic
Imports System.Drawing

Public Class PDFExtractor
    Public Sub ExtractDataFromPDF()
        ' Open a 128-bit encrypted PDF file by providing the filename and password
        Using pdf As PdfDocument = PdfDocument.FromFile("encrypted.pdf", "password")
            ' Extract all text from the PDF document
            Dim allText As String = pdf.ExtractAllText()

            ' Extract all images from the PDF document
            Dim allImages As IEnumerable(Of Image) = pdf.ExtractAllImages()

            ' Iterate over each page in the PDF document
            For index = 0 To pdf.PageCount - 1
                Dim pageNumber As Integer = index + 1

                ' Extract text from the specific page
                Dim text As String = pdf.ExtractTextFromPage(index)

                ' Extract images from the specific page
                Dim images As IEnumerable(Of Image) = pdf.ExtractImagesFromPage(index)

                ' Code to process the extracted text and images
                '...
            Next index
        End Using
    End Sub
End Class
```

In this code example:

- The FromFile method loads the input PDF document, which is encrypted and requires a password.
- The ExtractAllText method extracts all textual content from the PDF.
- The ExtractAllImages method fetches all embedded images.
- A loop iterates over each page of the document and extracts the text and images of that specific page using ExtractTextFromPage and ExtractImagesFromPage.

Conclusion

IronPDF allows developers to extract text and images from PDF files with ease. With ExtractAllText and ExtractAllImages, the entire contents of a PDF file can be extracted in a single call each. Alternatively, ExtractTextFromPage and ExtractImagesFromPage extract content from a specific page. The code above demonstrated both approaches: reading the whole document at once, and reading text and images page by page.

Additionally, IronPDF offers features like rendering charts, adding barcodes, enhancing security with passwords, watermarking, and handling PDF forms programmatically.

IronPDF is available for free during development, with payment required for commercial use.
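The per-page loop in the example above leaves the processing step open ("Code to process the extracted text and images"). As one possible way to fill it in, the sketch below picks "Label: value" pairs out of a string such as the one returned by ExtractAllText or ExtractTextFromPage. The class name ExtractedTextParser, the method ParseFields, and the assumption that the PDF text contains one "Label: value" pair per line are all hypothetical and not part of IronPDF; real documents often need a layout-specific pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of IronPDF): turns extracted PDF text
// into a dictionary of "Label: value" pairs, one pair per line.
public static class ExtractedTextParser
{
    // Matches a single line shaped like "Label: value".
    // The label may not contain ':'; the value runs to the end of the line.
    private static readonly Regex FieldPattern = new Regex(
        @"^[ \t]*(?<label>[^:\r\n]+?)[ \t]*:[ \t]*(?<value>[^\r\n]+?)[ \t]*\r?$",
        RegexOptions.Multiline);

    public static Dictionary<string, string> ParseFields(string text)
    {
        // Case-insensitive keys, so "Total" and "total" refer to the same field.
        var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        if (string.IsNullOrEmpty(text))
        {
            return fields;
        }

        foreach (Match match in FieldPattern.Matches(text))
        {
            // If a label repeats, the last occurrence wins.
            fields[match.Groups["label"].Value] = match.Groups["value"].Value;
        }

        return fields;
    }
}
```

Inside the loop, a call such as `ExtractedTextParser.ParseFields(text)` would then return the fields found on that page. Lines without a colon are skipped, so free-form paragraphs in the PDF do not produce entries.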
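A free trial of IronPDF is also available for production use without payment. Purchase the full suite of Iron Software's document libraries for the cost of two IronPDF Lite Licenses. Download IronPDF now to start extracting data from PDFs today!

Frequently Asked Questions

How do I extract text from a PDF file using C#?
You can use IronPDF's ExtractAllText method to extract all text from a PDF file. The method gives you direct access to the PDF's text content in a single call.

How do I extract images from a PDF using C#?
With IronPDF, you can extract images from a PDF file through the ExtractAllImages method, which retrieves all images embedded in the document.

How do I install a PDF processing library in a C# project?
To install IronPDF in a C# project, either run the command Install-Package IronPdf in the Package Manager Console or install the package through the NuGet Package Manager UI in Visual Studio.

Can C# handle encrypted PDF files?
Yes. IronPDF lets you open and work with encrypted PDF files using the FromFile method; you access the content by providing the file name and password.

Can I extract data from specific pages of a PDF in C#?
IronPDF lets you iterate over each page of a PDF document and extract data from specific pages using methods such as ExtractTextFromPage and ExtractImagesFromPage.

What other features does the C# PDF library offer?
Besides data extraction, IronPDF offers chart rendering, adding barcodes, enhancing document security with passwords, watermarking, and handling PDF forms programmatically.

How do I convert HTML to PDF in C#?
You can use IronPDF's RenderHtmlAsPdf method to convert an HTML string to PDF, which is particularly useful for creating PDF documents from web content.

Is there a trial version of the C# PDF library?
IronPDF is free to use during development, so you can test its features. Production use requires a commercial license, but a free trial is also available.

How do I get started extracting data from PDFs with a C# library?
To get started with data extraction using IronPDF, download the library, create or open a C# project in Visual Studio, install IronPDF, and then follow the code example to extract text and images from PDFs.

.NET 10 compatibility: can I use IronPDF's data extraction features in .NET 10?
Yes. IronPDF fully supports .NET 10, including its data extraction features such as extracting text and images. You can use IronPDF in .NET 10 projects without special configuration. It supports .NET 10, .NET 9, .NET 8, and earlier versions, as well as .NET Standard and .NET Framework.

Curtis Chau
Technical Writer

Curtis Chau holds a bachelor's degree in computer science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. Curtis is passionate about creating intuitive, attractive user interfaces, and enjoys working with modern frameworks and producing well-structured, visually appealing manuals. Beyond development, Curtis has a strong interest in the Internet of Things (IoT) and explores innovative ways of combining hardware and software. In his spare time, he enjoys gaming and building Discord bots, combining technology with creativity.

Related Articles

- How to Merge Two PDF Byte Arrays in C# (published November 13, 2025). Merge two PDF byte arrays in C# using IronPDF. Learn how to combine multiple PDF files from byte arrays, memory streams, and databases with simple code examples.
- How to Create a PDF Viewer in ASP.NET MVC (published November 13, 2025). Build a robust PDF viewer for ASP.NET MVC applications. Display PDF documents, convert views to PDF, and add interactive features using IronPDF.
- How to Build a .NET HTML to PDF Converter (published November 13, 2025). Learn how to convert HTML to PDF in .NET using IronPDF.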