Test in a live environment
Test in production without watermarks.
Works wherever you need it to.
In today's digital world, it is crucial to have the ability to convert web pages or HTML documents into PDF files. This can be useful for generating reports, creating invoices, or simply sharing information in a more presentable format. In this blog post, we will explore how to convert HTML pages to PDF using Node.js and Puppeteer, an open-source library developed by Google.
Puppeteer is a powerful Node.js library that allows developers to control headless browsers, mainly Google Chrome or Chromium, and perform various actions like web scraping, taking screenshots, and generating PDFs. Puppeteer provides an extensive API to interact with the browser, making it an excellent choice for converting HTML to PDF.
Before we begin, you'll need to set up a new NodeJS project. Follow these steps to get started:
Run npm init
to create a new package.json
file for your project. Follow the prompts and fill in the required information.
Install Puppeteer by running npm install puppeteer
.
Now that we have our project set up, let's dive into the code.
To convert HTML templates to a PDF file using Puppeteer, follow these steps:
Create a file named "HTML To PDF.js" in the folder.
const puppeteer = require('puppeteer');
const fs = require('fs');
The code starts by importing two essential libraries: puppeteer
, a versatile tool for controlling headless browsers like Chrome and Chromium, and fs
, a built-in NodeJS module for handling file system operations. Puppeteer enables you to automate a wide range of web-based tasks, including rendering HTML, capturing screenshots, and generating PDF files.
async function exportWebsiteAsPdf(html, outputPath) {
// Create a browser instance
const browser = await puppeteer.launch({
headless: 'new'
});
// Create a new page
const page = await browser.newPage();
await page.setContent(html, { waitUntil: 'domcontentloaded' });
// To reflect CSS used for screens instead of print
await page.emulateMediaType('screen');
// Download the PDF
const PDF = await page.pdf({
path: outputPath,
margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
printBackground: true,
format: 'A4',
});
// Close the browser instance
await browser.close();
return PDF;
}
The exportWebsiteAsPdf
function serves as the core of our code snippet. This asynchronous function accepts a html
string and an outputPath
as input parameters and returns a PDF file. The function performs the following steps:
html
string as the page content, waiting for the DOM content to load. We load html
templates as an HTML string to convert it into the PDF format.
// Usage example
// Get HTML content from HTML file
const html = fs.readFileSync('test.html', 'utf-8');
exportWebsiteAsPdf(html, 'result.PDF').then(() => {
console.log('PDF created successfully.');
}).catch((error) => {
console.error('Error creating PDF:', error);
});
The last section of the code illustrates how to use the exportWebsiteAsPdf
function. We perform the following steps:
fs
module's readFileSync
method. Here we are loading template files to generate PDF from HTML pages.exportWebsiteAsPdf
function with the loaded html
string and the desired outputPath
..then
block to handle the successful PDF creation, logging a success message to the console..catch
block to manage any errors that occur during the HTML to PDF conversion process, logging an error message to the console.This code snippet provides a comprehensive example of how to convert an HTML template to a PDF file using NodeJS and Puppeteer. By implementing this solution, you can efficiently generate high-quality PDFs, meeting the needs of various applications and users.
In addition to converting HTML templates, Puppeteer also allows you to convert URLs directly into PDF files.
const puppeteer = require('puppeteer');
The code starts by importing the Puppeteer library, which is a powerful tool for controlling headless browsers like Chrome and Chromium. Puppeteer allows you to automate a variety of web-based tasks, including rendering your HTML code, capturing screenshots, and in our case, generating PDF files.
async function exportWebsiteAsPdf(websiteUrl, outputPath) {
// Create a browser instance
const browser = await puppeteer.launch({
headless: 'new'
});
// Create a new page
const page = await browser.newPage();
// Open URL in current page
await page.goto(websiteUrl, { waitUntil: 'networkidle0' });
// To reflect CSS used for screens instead of print
await page.emulateMediaType('screen');
// Download the PDF
const PDF = await page.pdf({
path: outputPath,
margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
printBackground: true,
format: 'A4',
});
// Close the browser instance
await browser.close();
return PDF;
}
The exportWebsiteAsPdf
function is the core of our code snippet. This asynchronous function accepts a websiteUrl
and an outputPath
as its input parameters and returns a PDF file. The function performs the following steps:
websiteUrl
and waits for the network to become idle using the waitUntil
option set to networkidle0
.
// Usage example
exportWebsiteAsPdf('https://ironpdf.com/', 'result.pdf').then(() => {
console.log('PDF created successfully.');
}).catch((error) => {
console.error('Error creating PDF:', error);
});
The final section of the code demonstrates how to use the exportWebsiteAsPdf
function. We execute the following steps:
exportWebsiteAsPdf
function with the desired websiteUrl
and outputPath
.then
block to handle the successful PDF creation. In this block, we log a success message to the console.catch
block to handle any errors that occur during the website to PDF conversion process. If an error occurs, we log an error message to the console.By integrating this code snippet into your projects, you can effortlessly convert URLs into high-quality PDF files using NodeJS and Puppeteer.
Explore IronPDF is a popular .NET library used for generating, editing, and extracting content from PDF files. It provides a simple and efficient solution for creating PDFs from HTML, text, images, and existing PDF documents. IronPDF supports .NET Core, .NET Framework, and .NET 5.0+ projects, making it a versatile choice for various applications.
HTML to PDF Conversion with IronPDF: IronPDF allows you to convert HTML content, including CSS, to PDF files. This feature enables you to create pixel-perfect PDF documents from web pages or HTML templates.
URL Rendering: IronPDF can fetch web pages directly from a server using a URL and convert them to PDF files, making it easy to archive web content or generate reports from dynamic web pages.
Text, Image, and PDF Merging: IronPDF allows you to merge text, images, and existing PDF files into a single PDF document. This feature is particularly useful for creating complex documents with multiple sources of content.
PDF Manipulation: IronPDF provides tools for editing existing PDF files, such as adding or removing pages, modifying metadata, or even extracting text and images from PDF documents.
In conclusion, generating and manipulating PDF files is a common requirement in many applications, and having the right tools at your disposal is crucial. The solutions provided in this article, such as using Puppeteer with NodeJS or IronPDF with .NET, offer powerful and efficient methods for converting HTML content and URLs into professional, high-quality PDF documents.
IronPDF, in particular, stands out with its extensive feature set, making it a top choice for .NET developers. IronPDF offers a free trial allowing you to explore its capabilities.
Users can also benefit from the Iron Suite package, a suite of five professional .NET libraries including IronXL, IronPDF, IronOCR and more.
9 .NET API products for your office documents