How to Convert HTML to PDF in Node.js Using Puppeteer

Curtis Chau
Updated: July 28, 2025

Converting web pages or HTML documents into PDF files is a common need, whether for generating reports, creating invoices, or sharing information in a more presentable format. This post explains how to convert HTML pages to PDF using Node.js and Puppeteer, an open-source library developed by Google.

Introduction to Puppeteer

Puppeteer is a Node.js library that lets developers control headless browsers, mainly Google Chrome or Chromium, and perform actions such as web scraping, taking screenshots, and generating PDFs. Its extensive API for interacting with the browser makes it a good choice for converting HTML to PDF.

Why Puppeteer?

- Ease of use: Puppeteer offers a simple API that abstracts away the complexities of working with headless browsers.
- Powerful: Puppeteer provides extensive capabilities for manipulating web pages and interacting with browser elements.
- Scalable: You can scale your PDF generation process by running multiple browser instances in parallel.

Setting Up Your Node.js Project

Before we begin, set up a new Node.js project:

- Install Node.js if you haven't already (you can download it from the Node.js website).
- Create a new folder for your project and open it in Visual Studio Code or any other code editor.
- Run npm init to create a new package.json file for your project.
  Follow the prompts and fill in the required information.
- Install Puppeteer by running npm install puppeteer.

Now that the project is set up, let's dive into the code.

Loading HTML Template and Converting to PDF File

To convert an HTML template to a PDF file using Puppeteer, start by creating a file named "HTML To PDF.js" in the project folder.

Importing Puppeteer and fs

    const puppeteer = require('puppeteer');
    const fs = require('fs');

The code starts by importing two libraries: puppeteer, which controls headless browsers such as Chrome and Chromium, and fs, a built-in Node.js module for file system operations. Puppeteer lets you automate a wide range of web-based tasks, including rendering HTML, capturing screenshots, and generating PDF files.

Defining the exportWebsiteAsPdf Function

    async function exportWebsiteAsPdf(html, outputPath) {
      // Create a browser instance
      const browser = await puppeteer.launch({
        headless: true // Launches the browser in headless mode
      });

      // Create a new page
      const page = await browser.newPage();

      // Set the HTML content for the page, waiting for DOM content to load
      await page.setContent(html, { waitUntil: 'domcontentloaded' });

      // To reflect CSS used for screens instead of print
      await page.emulateMediaType('screen');

      // Generate the PDF and write it to outputPath
      const PDF = await page.pdf({
        path: outputPath,
        margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
        printBackground: true,
        format: 'A4',
      });

      // Close the browser instance
      await browser.close();

      return PDF;
    }

The exportWebsiteAsPdf function is the core of this example. This asynchronous function accepts an html string and an outputPath, writes the PDF file to outputPath, and resolves with the generated PDF data. The function performs the following steps:

- Launches a new headless browser instance using Puppeteer.
- Creates a new browser page.
- Sets the provided html string as the page content, waiting for the DOM content to load.
- Emulates the 'screen' media type so that the CSS written for screens is applied instead of print-specific styles.
- Generates a PDF file from the loaded HTML content, specifying margins, background printing, and format (A4).
- Closes the browser instance.
- Returns the generated PDF data.

Using the exportWebsiteAsPdf Function

    // Usage example
    // Get HTML content from HTML file
    const html = fs.readFileSync('test.html', 'utf-8');

    // Convert the HTML content into a PDF and save it to the specified path
    exportWebsiteAsPdf(html, 'result.pdf').then(() => {
      console.log('PDF created successfully.');
    }).catch((error) => {
      console.error('Error creating PDF:', error);
    });

The last section of the code shows how to use the exportWebsiteAsPdf function:

- Read the HTML content from an HTML file using the fs module's readFileSync method.
- Call the exportWebsiteAsPdf function with the loaded html string and the desired outputPath.
- Use a .then block to handle successful PDF creation by logging a success message to the console.
- Use a .catch block to handle any errors that occur during the HTML to PDF conversion by logging an error message to the console.

Together, these snippets form a complete example of converting an HTML template to a PDF file using Node.js and Puppeteer.

Converting URLs to PDF Files

In addition to converting HTML templates, Puppeteer can convert URLs directly into PDF files.

Importing Puppeteer

    const puppeteer = require('puppeteer');

This version imports only the Puppeteer library. The fs module is not needed here, because the page content comes from a URL rather than from a local file.
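A side note that applies to both the HTML version above and the URL version that follows: as written, browser.close() is only reached when every earlier step succeeds, so a failure in setContent, goto, or pdf can leave a browser process running. The sketch below shows one possible way to guard against that with try/finally. The names withBrowser and exportHtmlAsPdfSafely are hypothetical helpers introduced for illustration, not part of Puppeteer or of the original code; only launch, newPage, setContent, emulateMediaType, pdf, and close are Puppeteer calls.

```javascript
// Hypothetical helper (not a Puppeteer API): runs `task(browser)` and always
// closes the browser afterwards, even when `task` throws.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Hypothetical variant of the HTML-to-PDF function from earlier in the article.
async function exportHtmlAsPdfSafely(html, outputPath) {
  // Loaded lazily so this file can be loaded even where Puppeteer is absent.
  const puppeteer = require('puppeteer');

  return withBrowser(
    () => puppeteer.launch({ headless: true }),
    async (browser) => {
      const page = await browser.newPage();
      await page.setContent(html, { waitUntil: 'domcontentloaded' });
      await page.emulateMediaType('screen');
      // Same PDF options as the article's version.
      return page.pdf({
        path: outputPath,
        margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
        printBackground: true,
        format: 'A4',
      });
    }
  );
}
```

Calling exportHtmlAsPdfSafely(html, 'result.pdf') would then be used the same way as exportWebsiteAsPdf, with the same .then and .catch handlers. Passing the launcher in as a function is a design choice made here so that the cleanup logic can be exercised without starting a real browser.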
Defining the exportWebsiteAsPdf Function

    async function exportWebsiteAsPdf(websiteUrl, outputPath) {
      // Create a browser instance
      const browser = await puppeteer.launch({
        headless: true // Launches the browser in headless mode
      });

      // Create a new page
      const page = await browser.newPage();

      // Open the URL in the current page
      await page.goto(websiteUrl, { waitUntil: 'networkidle0' });

      // To reflect CSS used for screens instead of print
      await page.emulateMediaType('screen');

      // Generate the PDF and write it to outputPath
      const PDF = await page.pdf({
        path: outputPath,
        margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
        printBackground: true,
        format: 'A4',
      });

      // Close the browser instance
      await browser.close();

      return PDF;
    }

The exportWebsiteAsPdf function is the core of this example. This asynchronous function accepts a websiteUrl and an outputPath, writes the PDF file to outputPath, and resolves with the generated PDF data. The function performs the following steps:

- Launches a new headless browser instance using Puppeteer.
- Creates a new browser page.
- Navigates to the provided websiteUrl and waits for the network to become idle, using the waitUntil option set to networkidle0.
- Emulates the 'screen' media type so that the CSS written for screens is applied instead of print-specific styles.
- Converts the loaded web page to a PDF file with the specified margins, background printing, and format (A4).
- Closes the browser instance.
- Returns the generated PDF data.

Using the exportWebsiteAsPdf Function

    // Usage example
    // Convert the URL content into a PDF and save it to the specified path
    exportWebsiteAsPdf('https://ironpdf.com/', 'result.pdf').then(() => {
      console.log('PDF created successfully.');
    }).catch((error) => {
      console.error('Error creating PDF:', error);
    });

The final section of the code shows how to use the exportWebsiteAsPdf function:

- Call the exportWebsiteAsPdf function with the desired websiteUrl and outputPath.
- Use a then block to handle successful PDF creation by logging a success message to the console.
- Use a catch block to handle any errors that occur during the website to PDF conversion by logging an error message to the console.

With this code in your project, you can convert URLs into PDF files using Node.js and Puppeteer.

Best HTML To PDF Library for C# Developers

IronPDF is a popular .NET library for generating, editing, and extracting content from PDF files. It provides a simple and efficient way to create PDFs from HTML, text, images, and existing PDF documents. IronPDF supports .NET Core, .NET Framework, and .NET 5.0+ projects, making it a versatile choice for various applications.

Key Features of IronPDF

- HTML to PDF Conversion: IronPDF converts HTML content, including CSS, to PDF files, so you can create pixel-perfect PDF documents from web pages or HTML templates.
- URL Rendering: IronPDF can fetch web pages directly from a server using a URL and convert them to PDF files, which makes it easy to archive web content or generate reports from dynamic web pages.
- Text, Image, and PDF Merging: IronPDF can merge text, images, and existing PDF files into a single PDF document. This is particularly useful for creating complex documents with multiple sources of content.
- PDF Manipulation: IronPDF provides tools for editing existing PDF files, such as adding or removing pages, modifying metadata, and extracting text and images from PDF documents.

Conclusion

Generating and manipulating PDF files is a common requirement in many applications, and having the right tools at your disposal matters. The solutions covered in this article, Puppeteer with Node.js and IronPDF with .NET, both convert HTML content and URLs into professional, high-quality PDF documents.
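IronPDF, in particular, stands out with its extensive feature set, making it a top choice for .NET developers. IronPDF offers a free trial that lets you explore its capabilities. Users can also benefit from the Iron Suite package, a suite of five professional .NET libraries including IronXL, IronPDF, IronOCR and more.

Curtis Chau
Technical Writer

Curtis Chau holds a bachelor's degree in computer science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. He is passionate about building intuitive, visually appealing user interfaces, and enjoys working with modern frameworks and creating well-structured, visually engaging manuals. Beyond development, Curtis has a strong interest in the Internet of Things (IoT) and explores new ways to integrate hardware and software. In his spare time, he enjoys gaming and building Discord bots, combining his love of technology with creativity.

Related Articles

- Discover the Best PDF Redaction Software of 2025 (updated June 22, 2025): Explore the top PDF redaction solutions of 2025, including Adobe Acrobat Pro DC, Nitro PDF Pro, Foxit PDF Editor, and PDF-XChange Editor. Learn how IronPDF automates redaction in .NET for enhanced security and compliance.
- Best PDF Readers for iPhone (Free and Paid Tools Compared) (updated June 22, 2025): This article explores some of the best PDF readers for iPhone and concludes why IronPDF is the best option.
- Best Free PDF Editors for Windows (Free and Paid Tools Compared) (updated June 26, 2025): This article explores the top free PDF editors available in 2025 and concludes with the most powerful and flexible option: IronPDF.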