HTML to PDF Converter C# Open Source (.NET Libraries Comparison)
Converting HTML to PDF is a common requirement in many software applications, such as generating reports, invoices, or saving web pages as PDFs. In this article, we'll explore three popular open-source libraries for HTML to PDF conversion in C#, review their strengths and limitations, and discuss why IronPDF is a better alternative in numerous instances.
1. PuppeteerSharp
PuppeteerSharp is a .NET port of Puppeteer, the Node.js library for controlling headless Chrome or Chromium. It enables developers to convert HTML documents to PDFs by leveraging the Chromium rendering engine.
PuppeteerSharp provides precise control over the rendering process. Here's an example:
using System;
using PuppeteerSharp;
using System.Threading.Tasks;
class Program
{
static async Task Main(string[] args)
{
// Download Chromium to ensure compatibility with PuppeteerSharp
await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultChromiumRevision);
// Launch a headless instance of Chromium browser
using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
{
// Open a new browser page
var page = await browser.NewPageAsync();
// Set the HTML content for the page
await page.SetContentAsync("<html><body><h1>Hello, PuppeteerSharp!</h1></body></html>");
// Generate a PDF from the rendered HTML content
await page.PdfAsync("output.pdf");
Console.WriteLine("PDF Generated Successfully!");
}
}
}
Imports PuppeteerSharp
Imports System.Threading.Tasks
Friend Class Program
Shared Async Function Main(ByVal args() As String) As Task
' Download Chromium to ensure compatibility with PuppeteerSharp
Await (New BrowserFetcher()).DownloadAsync(BrowserFetcher.DefaultChromiumRevision)
' Launch a headless instance of Chromium browser
Using browser = Await Puppeteer.LaunchAsync(New LaunchOptions With {.Headless = True})
' Open a new browser page
Dim page = Await browser.NewPageAsync()
' Set the HTML content for the page
Await page.SetContentAsync("<html><body><h1>Hello, PuppeteerSharp!</h1></body></html>")
' Generate a PDF from the rendered HTML content
Await page.PdfAsync("output.pdf")
Console.WriteLine("PDF Generated Successfully!")
End Using
End Function
End Class
Code Explanation
- Download Chromium: The BrowserFetcher call downloads a Chromium build that is compatible with PuppeteerSharp.
- Launch Browser: Start a headless instance of Chromium using Puppeteer.LaunchAsync().
- Set HTML Content: Load the desired HTML into the browser page using page.SetContentAsync().
- Generate PDF: Use the page.PdfAsync() method to generate a PDF of the rendered content.
The result is a high-quality PDF (output.pdf) that accurately replicates the HTML structure and design.
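Page size, margins, and background printing can be adjusted by passing a PdfOptions object to page.PdfAsync(). The following is a minimal sketch rather than a definitive setup: the output file name, the inline HTML, and the margin values are assumptions chosen for illustration, and the download call mirrors the example above.

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;
using PuppeteerSharp.Media;

class Program
{
    static async Task Main()
    {
        // Same download call as in the example above. Newer PuppeteerSharp
        // releases may expect a parameterless DownloadAsync() instead
        // (assumption: check the version you have installed).
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultChromiumRevision);

        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        {
            var page = await browser.NewPageAsync();
            await page.SetContentAsync(
                "<html><body style='background:#eef'><h1>Hello, PuppeteerSharp!</h1></body></html>");

            // Illustrative values: A4 paper, CSS backgrounds included, explicit margins
            var options = new PdfOptions
            {
                Format = PaperFormat.A4,
                PrintBackground = true,
                MarginOptions = new MarginOptions
                {
                    Top = "20mm",
                    Bottom = "20mm",
                    Left = "15mm",
                    Right = "15mm"
                }
            };

            await page.PdfAsync("output-a4.pdf", options);
            Console.WriteLine("PDF generated with custom page options.");
        }
    }
}
```

Without PrintBackground, Chromium omits CSS backgrounds from the PDF, so the page tint in this sample would not appear.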
Pros
- High Fidelity Rendering: Supports modern web technologies, including advanced CSS and JavaScript.
- Automation Capabilities: Besides PDFs, PuppeteerSharp can automate web browsing, testing, and data extraction.
- Active Development: PuppeteerSharp is actively maintained and regularly updated.
Cons
- Large File Size: Requires downloading and bundling the Chromium browser, increasing deployment size.
- Resource Intensive: Running a browser instance can be heavy on system resources, especially for large-scale applications.
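- Limited PDF-Specific Features: PuppeteerSharp focuses on rendering rather than working with existing PDFs. Header and footer templates are available through its PDF options, but tasks such as merging, editing, or annotating documents require another library.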
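One common way to reduce the resource cost noted above is to launch the browser once and reuse it for several documents, closing each page as soon as its PDF is written. The sketch below illustrates that pattern under stated assumptions: the file names and HTML snippets are hypothetical, and the download call mirrors the earlier example.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using PuppeteerSharp;

class Program
{
    static async Task Main()
    {
        // Same download call as in the earlier example (version-dependent).
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultChromiumRevision);

        // Hypothetical batch of documents: output file name -> HTML content
        var documents = new Dictionary<string, string>
        {
            ["invoice-1.pdf"] = "<html><body><h1>Invoice 1</h1></body></html>",
            ["invoice-2.pdf"] = "<html><body><h1>Invoice 2</h1></body></html>"
        };

        // Launch Chromium once for the whole batch
        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        {
            foreach (var entry in documents)
            {
                var page = await browser.NewPageAsync();
                try
                {
                    await page.SetContentAsync(entry.Value);
                    await page.PdfAsync(entry.Key);
                }
                finally
                {
                    // Release the tab even if rendering fails
                    await page.CloseAsync();
                }
            }
        }
    }
}
```

Starting Chromium is usually the most expensive step, so a single launch per batch tends to matter more than any per-page tuning.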
2. PdfSharp
PdfSharp is a powerful open-source library for creating and manipulating PDF files in C#. While it doesn't directly support HTML rendering, it excels at providing developers with tools to generate and edit PDF documents programmatically.
Key Features of PdfSharp
- PDF Creation: PdfSharp allows developers to generate new PDF files from scratch by defining page sizes, adding text, shapes, images, and more.
- Manipulation of Existing PDFs: You can modify existing PDF documents, such as merging, splitting, or extracting content.
- Drawing Capabilities: PdfSharp provides robust graphics capabilities for adding custom designs to PDF files using the XGraphics class.
- Lightweight: It is a lightweight library, making it ideal for projects where simplicity and speed are priorities.
using PdfSharp.Pdf;
using PdfSharp.Drawing;
using HtmlAgilityPack;
class Program
{
static void Main(string[] args)
{
// Example HTML content
string htmlContent = "<html><body><h1>Hello, PdfSharp!</h1><p>This is an example of HTML to PDF.</p></body></html>";
// Parse HTML using HtmlAgilityPack (You need to add HtmlAgilityPack via NuGet)
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(htmlContent);
// Create a new PDF document
PdfDocument pdfDocument = new PdfDocument
{
Info = { Title = "HTML to PDF Example" }
};
// Add a new page to the document
PdfPage page = pdfDocument.AddPage();
XGraphics gfx = XGraphics.FromPdfPage(page);
XFont titleFont = new XFont("Arial", 20, XFontStyle.Bold);
XFont textFont = new XFont("Arial", 12, XFontStyle.Regular);
// Draw the parsed HTML content
int yPosition = 50; // Starting Y position
foreach (var node in htmlDoc.DocumentNode.SelectNodes("//h1 | //p"))
{
if (node.Name == "h1")
{
gfx.DrawString(node.InnerText, titleFont, XBrushes.Black, new XRect(50, yPosition, page.Width - 100, page.Height - 100), XStringFormats.TopLeft);
yPosition += 30; // Adjust spacing
}
else if (node.Name == "p")
{
gfx.DrawString(node.InnerText, textFont, XBrushes.Black, new XRect(50, yPosition, page.Width - 100, page.Height - 100), XStringFormats.TopLeft);
yPosition += 20; // Adjust spacing
}
}
// Save the PDF document
string outputFilePath = "HtmlToPdf.pdf";
pdfDocument.Save(outputFilePath);
System.Console.WriteLine($"PDF file created: {outputFilePath}");
}
}
Imports PdfSharp.Pdf
Imports PdfSharp.Drawing
Imports HtmlAgilityPack
Friend Class Program
Shared Sub Main(ByVal args() As String)
' Example HTML content
Dim htmlContent As String = "<html><body><h1>Hello, PdfSharp!</h1><p>This is an example of HTML to PDF.</p></body></html>"
' Parse HTML using HtmlAgilityPack (You need to add HtmlAgilityPack via NuGet)
Dim htmlDoc = New HtmlDocument()
htmlDoc.LoadHtml(htmlContent)
' Create a new PDF document
Dim pdfDocument As New PdfDocument()
pdfDocument.Info.Title = "HTML to PDF Example"
' Add a new page to the document
Dim page As PdfPage = pdfDocument.AddPage()
Dim gfx As XGraphics = XGraphics.FromPdfPage(page)
Dim titleFont As New XFont("Arial", 20, XFontStyle.Bold)
Dim textFont As New XFont("Arial", 12, XFontStyle.Regular)
' Draw the parsed HTML content
Dim yPosition As Integer = 50 ' Starting Y position
For Each node In htmlDoc.DocumentNode.SelectNodes("//h1 | //p")
If node.Name = "h1" Then
gfx.DrawString(node.InnerText, titleFont, XBrushes.Black, New XRect(50, yPosition, page.Width - 100, page.Height - 100), XStringFormats.TopLeft)
yPosition += 30 ' Adjust spacing
ElseIf node.Name = "p" Then
gfx.DrawString(node.InnerText, textFont, XBrushes.Black, New XRect(50, yPosition, page.Width - 100, page.Height - 100), XStringFormats.TopLeft)
yPosition += 20 ' Adjust spacing
End If
Next node
' Save the PDF document
Dim outputFilePath As String = "HtmlToPdf.pdf"
pdfDocument.Save(outputFilePath)
System.Console.WriteLine($"PDF file created: {outputFilePath}")
End Sub
End Class
Code Explanation
- HTML Parsing: The example uses HtmlAgilityPack (an open-source library for parsing and manipulating HTML) to extract text content from <h1> and <p> tags.
- Drawing Content: PdfSharp's XGraphics class is used to render the parsed HTML content as text on a PDF page.
- Limitations: This approach works for simple HTML structures but won't handle complex layouts, styles, or JavaScript.
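As more tags are supported, the if/else chain in the example grows quickly. One possible way to keep the manual mapping manageable is a lookup table from tag name to text style. The sketch below is library-independent and only plans positions. TagStyle and LayoutPlanner are hypothetical names introduced for illustration (they are not part of PdfSharp or HtmlAgilityPack), and the font sizes and spacing values are assumptions that follow the example above. It uses a record, so it needs C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type: the text style to use for one HTML tag
record TagStyle(double FontSize, bool Bold, double LineAdvance);

static class LayoutPlanner
{
    // Illustrative style table; extend it as more tags need to be drawn
    static readonly Dictionary<string, TagStyle> Styles = new()
    {
        ["h1"] = new TagStyle(20, true, 30),
        ["h2"] = new TagStyle(16, true, 24),
        ["p"] = new TagStyle(12, false, 20)
    };

    // Assigns a Y position to each supported element; unknown tags are skipped
    public static List<(string Text, TagStyle Style, double Y)> Plan(
        IEnumerable<(string Tag, string Text)> elements, double startY)
    {
        var result = new List<(string Text, TagStyle Style, double Y)>();
        double y = startY;
        foreach (var (tag, text) in elements)
        {
            if (!Styles.TryGetValue(tag, out var style))
            {
                continue;
            }
            result.Add((text, style, y));
            y += style.LineAdvance;
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        var elements = new[]
        {
            ("h1", "Hello, PdfSharp!"),
            ("p", "This is an example of HTML to PDF."),
            ("table", "Not supported, so it is skipped")
        };

        foreach (var (text, style, y) in LayoutPlanner.Plan(elements, 50))
        {
            Console.WriteLine($"y={y}, size={style.FontSize}, bold={style.Bold}: {text}");
        }
    }
}
```

Each planned entry can then be passed to gfx.DrawString() with a font built from its style. With the sample input, the heading is placed at Y = 50 and the paragraph at Y = 80, matching the spacing used in the example.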
Pros and Cons of PdfSharp
Pros
- Lightweight and Easy to Use: PdfSharp is intuitive and straightforward, making it ideal for developers starting with PDF generation.
- Open-Source and Free: No licensing fees, and the source code is available for customization.
- Custom Drawing: Provides excellent capabilities for creating PDFs from scratch with custom designs.
Cons
- No HTML to PDF Conversion: PdfSharp does not natively support rendering HTML to PDF, requiring additional libraries for parsing HTML.
- Limited Support for Modern Features: Does not provide advanced capabilities like interactive PDFs, digital signatures, or annotations.
- Performance Constraints: May not be as optimized as professional libraries for large-scale or enterprise applications.
3. Pdfium.NET SDK
Pdfium.NET is a comprehensive library based on the open-source PDFium project, designed for viewing, editing, and manipulating PDF files in .NET applications. It provides developers with powerful tools to create, edit, and extract content from PDFs, making it suitable for a wide range of use cases. It does not include an HTML rendering engine, so HTML content has to be parsed and drawn onto the page manually.
Key Features of Pdfium.NET SDK
PDF Creation and Editing:
- Generate PDFs from scratch or from scanned images.
- Edit existing PDFs by adding text, images, or annotations.
Text and Image Extraction:
- Extract text and images from PDF file format documents for further processing.
- Search for specific text within a PDF document.
PDF Viewer Control:
- Embed a standalone PDF viewer in WinForms or WPF applications.
- Supports zooming, scrolling, bookmarks, and text search.
Compatibility:
- Works with .NET Framework, .NET Core, .NET Standard, and .NET 6+.
- Compatible with Windows and macOS platforms.
Advanced Features:
- Merge and split PDF files.
- Render PDFs as images for display or printing.
using Pdfium.Net.SDK;
using System;
class Program
{
static void Main(string[] args)
{
// Initialize Pdfium.NET SDK functionalities
PdfCommon.Initialize();
// Create a new PDF document
PdfDocument pdfDocument = PdfDocument.CreateNew();
// Add a page to the document (A4 size: 595 x 842 points, about 8.27 x 11.69 inches)
var page = pdfDocument.Pages.InsertPageAt(pdfDocument.Pages.Count, 595, 842);
// Sample HTML content to be parsed and rendered manually
var htmlContent = "<h1>Hello, Pdfium.NET SDK!</h1><p>This is an example of HTML to PDF.</p>";
// Example: Manually render text since Pdfium.NET doesn't render HTML directly
var font = PdfFont.CreateFont(pdfDocument, "Arial");
page.AddText(72, 750, font, 20, "Hello, Pdfium.NET SDK!");
page.AddText(72, 700, font, 14, "This is an example of HTML to PDF.");
// Save the document to a file
string outputFilePath = "HtmlToPdfExample.pdf";
pdfDocument.Save(outputFilePath, SaveFlags.Default);
Console.WriteLine($"PDF created successfully: {outputFilePath}");
}
}
Imports Pdfium.Net.SDK
Imports System
Friend Class Program
Shared Sub Main(ByVal args() As String)
' Initialize Pdfium.NET SDK functionalities
PdfCommon.Initialize()
' Create a new PDF document
Dim pdfDocument As PdfDocument = PdfDocument.CreateNew()
' Add a page to the document (A4 size: 595 x 842 points, about 8.27 x 11.69 inches)
Dim page = pdfDocument.Pages.InsertPageAt(pdfDocument.Pages.Count, 595, 842)
' Sample HTML content to be parsed and rendered manually
Dim htmlContent = "<h1>Hello, Pdfium.NET SDK!</h1><p>This is an example of HTML to PDF.</p>"
' Example: Manually render text since Pdfium.NET doesn't render HTML directly
Dim font = PdfFont.CreateFont(pdfDocument, "Arial")
page.AddText(72, 750, font, 20, "Hello, Pdfium.NET SDK!")
page.AddText(72, 700, font, 14, "This is an example of HTML to PDF.")
' Save the document to a file
Dim outputFilePath As String = "HtmlToPdfExample.pdf"
pdfDocument.Save(outputFilePath, SaveFlags.Default)
Console.WriteLine($"PDF created successfully: {outputFilePath}")
End Sub
End Class
Code Explanation
- SDK Initialization: The PdfCommon.Initialize() method initializes Pdfium.NET functionalities.
- Creating a PDF: A new PDF document is created using PdfDocument.CreateNew().
- Adding Pages: Pages are inserted into the PDF with specified dimensions (e.g., A4 size).
- Rendering HTML Content: Since Pdfium.NET SDK does not natively support HTML rendering, you need to manually parse and render HTML elements as text, shapes, or images.
- Saving the PDF: The document is saved to a file path with the Save() method.
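The page dimensions in the example are given in PDF points, where one point is 1/72 of an inch. The small helper below shows where the values 595 and 842 come from. PageUnits is a hypothetical name introduced for illustration, and the code does not depend on Pdfium.NET.

```csharp
using System;

// Hypothetical helper for converting physical page sizes to PDF points
static class PageUnits
{
    const double PointsPerInch = 72.0;
    const double MillimetersPerInch = 25.4;

    public static double InchesToPoints(double inches) => inches * PointsPerInch;

    public static double MillimetersToPoints(double millimeters) =>
        millimeters / MillimetersPerInch * PointsPerInch;
}

class Program
{
    static void Main()
    {
        // A4 is defined as 210 x 297 mm
        double width = PageUnits.MillimetersToPoints(210);
        double height = PageUnits.MillimetersToPoints(297);

        // Prints "A4: 595 x 842 points"
        Console.WriteLine($"A4: {Math.Round(width)} x {Math.Round(height)} points");
    }
}
```

The same conversion gives 612 x 792 points for US Letter (8.5 x 11 inches).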
Pros
- Allows full control over PDF creation and editing.
- Flexible for drawing and adding text, images, and shapes.
- Powerful capabilities for viewing and manipulating PDFs in desktop applications.
Cons
- Does not directly convert HTML to PDF.
- Parsing and rendering HTML manually can be complex and time-consuming.
- Best suited for applications focusing on PDF editing and manipulation rather than HTML conversion.
Introducing IronPDF
IronPDF is a professional-grade library designed for .NET developers to effortlessly convert HTML content into high-quality PDFs. Known for its reliability, advanced features, and ease of use, IronPDF streamlines the development process while delivering precise rendering and robust functionality. Here’s why IronPDF is a standout solution:
Key Features
- Direct HTML to PDF Conversion: IronPDF converts HTML content, including CSS and JavaScript, directly into fully formatted PDFs. With just a few lines of code, developers can generate PDFs from web pages, raw HTML strings, or local HTML files.
- Modern Rendering Capabilities: Supporting the latest web standards, IronPDF ensures accurate rendering of complex layouts, styles, and interactive elements when converting HTML pages to PDF.
- Advanced PDF Features: IronPDF offers extensive customization options, such as adding headers, footers, watermarks, annotations, and bookmarks. It also supports merging, splitting, and editing existing PDFs.
- Performance and Scalability: Optimized for both small-scale applications and enterprise environments, IronPDF delivers fast, reliable performance for projects of any size.
- Ease of Integration: Designed for .NET Framework and .NET Core, IronPDF integrates smoothly with C# applications, offering developers a straightforward setup process and comprehensive documentation.
Why Choose IronPDF?
IronPDF stands out among other solutions due to its combination of features, developer support, and performance. Unlike open-source alternatives that often require extensive configuration or external dependencies, IronPDF is a self-contained solution that simplifies development without sacrificing functionality. Whether it's for generating invoices, reports, or archiving web content, IronPDF empowers developers with the tools they need to achieve professional-grade results quickly and efficiently.
IronPDF is a practical choice for developers who value reliability, scalability, and ease of use in their HTML to PDF workflows.
How to convert HTML to PDF using IronPDF
using System;
using IronPdf;
class Program
{
    static void Main()
    {
        // Specify license key
        IronPdf.License.LicenseKey = "Your Key";
        // Create a ChromePdfRenderer, which performs the HTML to PDF conversion
        var renderer = new ChromePdfRenderer();
        // Define the HTML string to be converted
        string htmlContent = "<html><body><h1>IronPDF: Better than Open source</h1></body></html>";
        // Convert the HTML string to a PDF document
        var document = renderer.RenderHtmlAsPdf(htmlContent);
        // Save the PDF document to a file
        document.SaveAs("html2Pdf.pdf");
        Console.WriteLine("PDF generated and saved successfully!");
    }
}
Imports IronPdf
Friend Class Program
Shared Sub Main()
' Specify license key
IronPdf.License.LicenseKey = "Your Key"
' Create a ChromePdfRenderer, which performs the HTML to PDF conversion
Dim Renderer = New ChromePdfRenderer()
' Define the HTML string to be converted
Dim htmlContent As String = "<html><body><h1>IronPDF: Better than Open source</h1></body></html>"
' Convert the HTML string to a PDF document
Dim document = Renderer.RenderHtmlAsPdf(htmlContent)
' Save the PDF document to a file
document.SaveAs("html2Pdf.pdf")
Console.WriteLine("PDF generated and saved successfully!")
End Sub
End Class
Code Snippet Explanation
- License Key Setup: The program starts by setting the IronPDF license key, which is required to unlock the full functionality of the library.
- Creating the Renderer: An instance of ChromePdfRenderer is initialized. This component is responsible for converting HTML content into a PDF document, acting as a bridge between the raw HTML and the final output.
- Defining HTML Content: A string variable, htmlContent, is created to store the HTML structure that will be converted into a PDF. In this example, it contains a simple heading.
- Converting HTML to PDF: The RenderHtmlAsPdf() method is called on the ChromePdfRenderer instance, passing the HTML string as input. This function processes the content and transforms it into a PDF document.
- Saving the PDF: Finally, the generated PDF is saved to a file named "html2Pdf.pdf" using the SaveAs() method, storing it on the disk for future access.
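Besides raw HTML strings, the key features listed earlier mention web pages and local HTML files as input. A minimal sketch of those two cases follows. The URL, the invoice.html path, the output file names, and the margin values are placeholders assumed for illustration, so treat this as a starting point rather than a definitive configuration.

```csharp
using System;
using IronPdf;
using IronPdf.Rendering;

class Program
{
    static void Main()
    {
        // Specify license key
        IronPdf.License.LicenseKey = "Your Key";

        var renderer = new ChromePdfRenderer();

        // Illustrative page settings; margins are given in millimeters
        renderer.RenderingOptions.PaperSize = PdfPaperSize.A4;
        renderer.RenderingOptions.MarginTop = 20;
        renderer.RenderingOptions.MarginBottom = 20;

        // Render a live web page (placeholder URL)
        var fromUrl = renderer.RenderUrlAsPdf("https://example.com");
        fromUrl.SaveAs("from-url.pdf");

        // Render a local HTML file (hypothetical path; the file must exist)
        var fromFile = renderer.RenderHtmlFileAsPdf("invoice.html");
        fromFile.SaveAs("from-file.pdf");

        Console.WriteLine("PDFs generated from a URL and a local file.");
    }
}
```

The rendering options are set once on the renderer and apply to both conversions, which keeps page settings consistent across documents.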
License Information (Trial Available)
IronPDF requires a valid license key for full functionality. You can obtain a trial license from the official website. Before using the IronPDF library, set the license key as follows:
IronPdf.License.LicenseKey = "your key";
IronPdf.License.LicenseKey = "your key"
This ensures that the library operates without limitations.
Conclusion
PuppeteerSharp is an excellent choice for developers who need precise rendering of HTML to PDF, especially when dealing with complex web pages. However, for applications that require advanced PDF-specific features, performance optimization, and ease of integration, professional tools like IronPDF are often the better option.
PdfSharp is a great choice for lightweight, programmatic PDF creation and manipulation, especially for projects with simple requirements. However, if your application requires converting HTML to PDF or advanced PDF features, IronPDF provides a more efficient and feature-rich solution.
While Pdfium.NET SDK is a robust tool for PDF manipulation, IronPDF provides native support for direct HTML-to-PDF conversion, including rendering modern HTML, CSS, and JavaScript. IronPDF simplifies the workflow with built-in methods like ChromePdfRenderer.RenderHtmlAsPdf(), making it faster and more efficient for developers.
IronPDF is a practical choice for developers who value reliability, scalability, and ease of use in their HTML to PDF workflows.
Frequently Asked Questions
What is PuppeteerSharp and how does it convert HTML to PDF?
PuppeteerSharp is a .NET port of Puppeteer, the Node.js library for controlling headless Chrome or Chromium. It loads HTML into a headless browser page and uses the Chromium rendering engine to produce the PDF.
What are the advantages of using PdfSharp for PDF generation?
PdfSharp excels in creating and manipulating PDF files programmatically. It offers PDF creation, modification of existing PDFs, and robust graphics capabilities while being lightweight and open-source.
What features does Pdfium.NET SDK offer for PDF manipulation?
Pdfium.NET SDK provides tools for creating, editing, and extracting content from PDFs. It supports PDF creation, text/image extraction, and embedding a PDF viewer in applications.
What is a better alternative for HTML to PDF conversion?
IronPDF offers direct HTML to PDF conversion with support for modern web standards, advanced PDF features, and easy integration with .NET applications, providing a professional solution compared to open-source alternatives.
Can IronPDF handle complex HTML, CSS, and JavaScript in PDF conversion?
Yes, IronPDF supports the latest web standards, ensuring accurate rendering of complex layouts, styles, and interactive elements during HTML to PDF conversion.
What is required to use IronPDF for HTML to PDF conversion?
To use IronPDF, a valid license key is required. Developers can obtain a trial license from the official website to unlock full functionality.
How does Pdfium.NET SDK differ from IronPDF in terms of HTML to PDF conversion?
Pdfium.NET SDK does not natively support HTML to PDF conversion, requiring manual rendering of HTML elements, whereas IronPDF provides built-in methods for direct conversion.
What are some limitations of using PuppeteerSharp for PDF generation?
PuppeteerSharp requires downloading and bundling the Chromium browser, which increases file size and can be resource-intensive. It focuses on rendering rather than enhancing PDFs with additional features.
Is PdfSharp suitable for rendering complex HTML structures to PDF?
No. PdfSharp does not natively support HTML to PDF conversion, so HTML must be parsed with an additional library and drawn manually, which does not scale to complex layouts, styles, or JavaScript.
What makes IronPDF a practical choice for developers?
IronPDF is practical due to its reliability, scalability, ease of use, and robust features for HTML to PDF conversion, making it ideal for generating professional-grade PDFs efficiently.