HTML Prettifier (How it Works for Developers)

When working with HTML-to-PDF conversion in .NET, clean and well-structured HTML can make a significant difference in the quality of the final PDF. Formatting raw HTML properly ensures readability, correct rendering, and consistency. This is where an HTML formatter, or an HTML prettifier, comes into play.

In this article, we’ll explore how to use an HTML prettifier in .NET before converting HTML to PDF using IronPDF. We’ll discuss the benefits of prettification, showcase libraries that can help, and provide a practical code example.

What is an HTML Prettifier?

An HTML prettifier is a tool that reformats raw or minified HTML code into a readable, well-structured format. This process involves:

  • Properly indenting nested elements
  • Closing unclosed tags
  • Formatting attributes consistently
  • Removing unnecessary whitespace

Using an HTML prettifier before converting to PDF ensures that the content remains structured and visually coherent, reducing rendering issues in the generated PDF.
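
To make the idea of re-indenting concrete, here is a minimal sketch that uses only the .NET base class library. It assumes the input is already well-formed XML (every tag closed, attributes quoted), which real-world HTML often is not, so it stands in for a proper HTML parser rather than replacing one. The names NaivePrettifier and Indent are hypothetical, chosen for illustration.

```csharp
using System;
using System.Xml.Linq;

static class NaivePrettifier
{
    // Re-indents markup that is already well-formed XML.
    // XDocument.Parse discards insignificant whitespace by default, and
    // ToString() writes the tree back out with two-space indentation.
    public static string Indent(string wellFormedMarkup)
    {
        return XDocument.Parse(wellFormedMarkup).ToString();
    }
}

class Program
{
    static void Main()
    {
        string minified = "<html><body><h1>Hello World!</h1><p>This is a test.</p></body></html>";

        // Prints one element per line, with each nesting level indented by two spaces
        Console.WriteLine(NaivePrettifier.Indent(minified));
    }
}
```

Markup with unclosed tags makes XDocument.Parse throw an XmlException, which is why the libraries discussed in this article parse HTML with dedicated, more forgiving parsers.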

IronPDF: A Powerful PDF Solution

HTML Prettifier (How it Works for Developers): Figure 1

IronPDF is a comprehensive and feature-rich .NET library designed for seamless HTML-to-PDF conversion. It enables developers to convert HTML, URLs, or even raw HTML strings into high-quality PDFs with minimal effort. Unlike many other PDF libraries, IronPDF fully supports modern web standards, including HTML5, CSS3, and JavaScript, ensuring that rendered PDFs maintain their intended design and layout. This makes it an ideal choice for projects requiring precise PDF output from complex HTML structures.

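Some of the key features of IronPDF include:

  • Full HTML5 and CSS3 support
  • JavaScript execution
  • Headers, footers, and watermarks
  • PDF signing and security features
  • Efficient multithreaded performance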

By integrating IronPDF with an HTML prettifier, you ensure that your documents are not only visually appealing but also free of rendering issues, making your workflow smoother and more efficient.

Prettifying HTML in .NET

There are several tools and libraries you can use to prettify unformatted or messy HTML in a .NET workflow, including:

1. HtmlAgilityPack

  • A popular library for parsing and modifying HTML code in C#.
  • Can be used to format and clean up HTML code before processing.

2. AngleSharp

  • A modern HTML parser for .NET that provides detailed document manipulation capabilities.
  • Can format HTML in a way that makes it more readable.

3. HTML Beautifier (BeautifyTools)

  • Formats and indents messy HTML for better readability.
  • Online Tool that works directly in the browser—no installation required.

Using HtmlAgilityPack to Format HTML Code

HTML Prettifier (How it Works for Developers): Figure 2

HtmlAgilityPack is a popular .NET library that provides a fast and efficient way to parse and manipulate HTML documents. It can handle malformed or poorly structured HTML, making it a great choice for web scraping and data extraction. It is not designed as a "prettifier" and does not add indentation on its own, but parsing a document and saving it again normalizes the markup, which is a useful cleanup step before further processing.

Here’s how you can use HtmlAgilityPack to clean up HTML before passing it to IronPDF:

using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        string htmlContent = "<html><body><h1>Hello World!</h1><p>This is a test.</p></body></html>";

        // Load the HTML content into an HtmlDocument
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(htmlContent);

        // Serialize the parsed DOM back to a string.
        // Note: HtmlAgilityPack normalizes the markup but does not indent it.
        string cleanedHtml = doc.DocumentNode.OuterHtml;
        doc.Save("pretty.html"); // Save the cleaned HTML to a file
    }
}

Imports HtmlAgilityPack

Friend Class Program
	Shared Sub Main()
		Dim htmlContent As String = "<html><body><h1>Hello World!</h1><p>This is a test.</p></body></html>"

		' Load the HTML content into an HtmlDocument
		Dim doc As New HtmlDocument()
		doc.LoadHtml(htmlContent)

		' Serialize the parsed DOM back to a string.
		' Note: HtmlAgilityPack normalizes the markup but does not indent it.
		Dim cleanedHtml As String = doc.DocumentNode.OuterHtml
		doc.Save("pretty.html") ' Save the cleaned HTML to a file
	End Sub
End Class

Output HTML File

HTML Prettifier (How it Works for Developers): Figure 3
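
Because HtmlAgilityPack tolerates malformed input, it can help to find out what it had to tolerate before sending the markup on to a PDF renderer. The following is a small sketch, not a definitive recipe: it reads the ParseErrors collection that HtmlDocument exposes after loading. The names HtmlChecker and DescribeParseErrors are hypothetical, and which problems get reported depends on the HtmlAgilityPack version in use.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class HtmlChecker
{
    // Returns one readable message per problem HtmlAgilityPack recorded while parsing.
    public static string[] DescribeParseErrors(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.ParseErrors
                  .Select(e => $"Line {e.Line}, column {e.LinePosition}: {e.Reason}")
                  .ToArray();
    }
}

class Program
{
    static void Main()
    {
        // A stray closing tag that has no matching opening tag
        string suspectHtml = "<div><h1>Report</h1></span></div>";

        foreach (string message in HtmlChecker.DescribeParseErrors(suspectHtml))
        {
            Console.WriteLine(message);
        }
    }
}
```

An empty result does not prove the HTML is valid, only that the parser recorded nothing; it is a cheap sanity check rather than a validator.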

Using AngleSharp as an HTML Prettifier

HTML Prettifier (How it Works for Developers): Figure 4

AngleSharp is a .NET library designed for parsing and manipulating HTML, XML, and SVG documents. It provides a modern and flexible approach to DOM manipulation and formatting. AngleSharp’s PrettyMarkupFormatter class can be used to serialize a parsed document with indentation, producing readable output.

using AngleSharp;             // ToHtml(formatter) extension method
using AngleSharp.Html;        // PrettyMarkupFormatter
using AngleSharp.Html.Parser;
using System;

class Program
{
    static void Main()
    {
        string htmlContent = "<html><body><h1>Hello World!</h1><p>This is a test.</p></body></html>";

        // Parse the HTML content using HtmlParser
        var parser = new HtmlParser();
        var document = parser.ParseDocument(htmlContent);

        // Serialize with PrettyMarkupFormatter to get indented output
        // (ToHtml() with no formatter returns compact, unindented markup)
        var prettyHtml = document.ToHtml(new PrettyMarkupFormatter());
        Console.WriteLine(prettyHtml);
    }
}

Imports AngleSharp
Imports AngleSharp.Html
Imports AngleSharp.Html.Parser
Imports System

Friend Class Program
	Shared Sub Main()
		Dim htmlContent As String = "<html><body><h1>Hello World!</h1><p>This is a test.</p></body></html>"

		' Parse the HTML content using HtmlParser
		Dim parser = New HtmlParser()
		Dim document = parser.ParseDocument(htmlContent)

		' Serialize with PrettyMarkupFormatter to get indented output
		Dim prettyHtml = document.ToHtml(New PrettyMarkupFormatter())
		Console.WriteLine(prettyHtml)
	End Sub
End Class

HTML Output

HTML Prettifier (How it Works for Developers): Figure 5
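
If the default layout does not match your team's conventions, PrettyMarkupFormatter exposes Indentation and NewLine properties that can be set before serializing. The sketch below assumes a four-space indent and Unix line endings purely for illustration; AngleSharpPrettifier and its Prettify method are hypothetical wrapper names, not part of AngleSharp.

```csharp
using System;
using AngleSharp;             // ToHtml(formatter) extension method
using AngleSharp.Html;        // PrettyMarkupFormatter
using AngleSharp.Html.Parser;

static class AngleSharpPrettifier
{
    // Parses the HTML and serializes it again with the requested indentation string.
    public static string Prettify(string html, string indentation = "    ")
    {
        var document = new HtmlParser().ParseDocument(html);

        var formatter = new PrettyMarkupFormatter
        {
            Indentation = indentation, // repeated once per nesting level
            NewLine = "\n"             // line separator written between elements
        };

        return document.ToHtml(formatter);
    }
}

class Program
{
    static void Main()
    {
        string htmlContent = "<html><body><h1>Hello World!</h1><p>This is a test.</p></body></html>";
        Console.WriteLine(AngleSharpPrettifier.Prettify(htmlContent));
    }
}
```

Keeping the formatter settings in one helper means every template in a project is indented the same way, which makes diffs between generated HTML files easier to read.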

Online HTML Beautifier (BeautifyTools)

HTML Prettifier (How it Works for Developers): Figure 6

BeautifyTools.com provides an easy-to-use online HTML formatter that allows you to format and prettify messy HTML code. This is useful if you want a quick and free way to clean up your HTML without installing any libraries or writing code.

How to Use the Online HTML Beautifier

  1. Go to the Website

    Open BeautifyTools.com HTML Beautifier in your web browser.

  2. Paste Your HTML

    Copy your raw or minified HTML and paste it into the input box.

  3. Adjust the Settings (Optional)

    • Choose the indentation level (Spaces: 2, 4, etc.).
    • Enable/disable line breaks and formatting options.
  4. Click "Beautify HTML"

    The tool will process your HTML and display the prettified result in the output box.

  5. Copy the Formatted HTML

    Click "Copy to Clipboard" or manually copy the formatted HTML for use in your project.

HTML Prettifier (How it Works for Developers): Figure 7

Pros & Cons of Using an Online Beautifier

HTML Prettifier (How it Works for Developers): Figure 8

Pros & Cons of Using a Code-Based HTML Prettifier

HTML Prettifier (How it Works for Developers): Figure 9

Converting Prettified HTML to PDF with IronPDF

Once we have prettified our HTML, we can use IronPDF to convert it into a high-quality PDF. Here’s a simple example using AngleSharp:

using AngleSharp.Html;        // PrettyMarkupFormatter
using AngleSharp.Html.Parser;
using System.IO;
using IronPdf;
using System;

class Program
{
    static void Main()
    {
        string htmlContent = "<html><body><h1>Hello World!</h1><p>This was formatted using AngleSharp.</p><p>Then it was converted using IronPDF.</p></body></html>";
        string outputPath = "formatted.html";

        // Parse the HTML content using HtmlParser
        var parser = new HtmlParser();
        var document = parser.ParseDocument(htmlContent);

        // Format the HTML using PrettyMarkupFormatter
        using (var writer = new StringWriter())
        {
            document.ToHtml(writer, new PrettyMarkupFormatter()); // Format the HTML
            var prettyHtml = writer.ToString();

            // Save the formatted HTML to a file and echo it to the console
            File.WriteAllText(outputPath, prettyHtml);
            Console.WriteLine(prettyHtml);
        }

        // Convert the formatted HTML file to PDF using IronPDF
        var renderer = new ChromePdfRenderer();
        var pdf = renderer.RenderHtmlFileAsPdf(outputPath);
        pdf.SaveAs("output.pdf");
    }
}

Imports AngleSharp.Html
Imports AngleSharp.Html.Parser
Imports System.IO
Imports IronPdf
Imports System

Friend Class Program
	Shared Sub Main()
		Dim htmlContent As String = "<html><body><h1>Hello World!</h1><p>This was formatted using AngleSharp.</p><p>Then it was converted using IronPDF.</p></body></html>"
		Dim outputPath As String = "formatted.html"

		' Parse the HTML content using HtmlParser
		Dim parser = New HtmlParser()
		Dim document = parser.ParseDocument(htmlContent)

		' Format the HTML using PrettyMarkupFormatter
		Using writer = New StringWriter()
			document.ToHtml(writer, New PrettyMarkupFormatter()) ' Format the HTML
			Dim prettyHtml = writer.ToString()

			' Save the formatted HTML to a file and echo it to the console
			File.WriteAllText(outputPath, prettyHtml)
			Console.WriteLine(prettyHtml)
		End Using

		' Convert the formatted HTML file to PDF using IronPDF
		Dim renderer = New ChromePdfRenderer()
		Dim pdf = renderer.RenderHtmlFileAsPdf(outputPath)
		pdf.SaveAs("output.pdf")
	End Sub
End Class

Explanation

The above code demonstrates how to prettify HTML using AngleSharp and then convert it to a PDF using IronPDF. Here's how it works:

  1. Define the Raw HTML Content:

    The program starts with a simple HTML string containing an <h1> header and two paragraphs.

  2. Parse the HTML with AngleSharp:

    It initializes an HtmlParser instance and parses the raw HTML into a structured IDocument object.

  3. Format the HTML using PrettyMarkupFormatter:

    • The PrettyMarkupFormatter class is used to properly format and indent the HTML.
    • A StringWriter is used to capture the formatted HTML as a string.
    • After formatting, the formatted HTML is saved to a file named "formatted.html".
  4. Convert the Formatted HTML to PDF using IronPDF:

    • A ChromePdfRenderer instance is created to handle the conversion.
    • The formatted HTML file is loaded and converted into a PdfDocument.
    • The resulting PDF is saved as "output.pdf".
  5. Final Output:

    • The prettified HTML is displayed in the console.
    • The program produces two output files:
      • formatted.html (a well-structured version of the HTML)
      • output.pdf (the final PDF document generated from the formatted HTML).

This approach ensures that the HTML is neatly structured before converting it to a PDF, which improves readability and avoids potential rendering issues in the PDF output.
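
When the intermediate formatted.html file is not needed, the prettified string can be passed straight to the renderer instead. The sketch below is one possible variant under that assumption: it uses RenderHtmlAsPdf, which accepts an HTML string, in place of RenderHtmlFileAsPdf. PrettyPdf, Prettify, and Render are hypothetical helper names introduced here for illustration.

```csharp
using AngleSharp;             // ToHtml(formatter) extension method
using AngleSharp.Html;        // PrettyMarkupFormatter
using AngleSharp.Html.Parser;
using IronPdf;

static class PrettyPdf
{
    // Parses the raw HTML and returns an indented version of it.
    public static string Prettify(string html)
    {
        var document = new HtmlParser().ParseDocument(html);
        return document.ToHtml(new PrettyMarkupFormatter());
    }

    // Prettifies the HTML in memory and renders it to a PDF file at pdfPath.
    public static void Render(string html, string pdfPath)
    {
        string prettyHtml = Prettify(html);

        var renderer = new ChromePdfRenderer();
        var pdf = renderer.RenderHtmlAsPdf(prettyHtml);
        pdf.SaveAs(pdfPath);
    }
}

class Program
{
    static void Main()
    {
        string htmlContent = "<html><body><h1>Hello World!</h1><p>Rendered without a temporary file.</p></body></html>";
        PrettyPdf.Render(htmlContent, "output.pdf");
    }
}
```

Writing the HTML to disk, as the main example does, remains useful when you want to inspect exactly what was rendered; the in-memory variant simply avoids the extra file.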

Console Output

HTML Prettifier (How it Works for Developers): Figure 10

PDF Output

HTML Prettifier (How it Works for Developers): Figure 11

Why Use a Prettifier with IronPDF?

1. Better Readability and Debugging

Formatted HTML is easier to read, debug, and maintain. This is especially useful when working with dynamic content or large HTML templates.

2. Improved Styling Consistency

Prettified HTML maintains consistent spacing and structure, leading to a more predictable rendering in IronPDF.

3. Reduced Rendering Issues

Minified or unstructured HTML can sometimes cause unexpected issues in PDF generation. Prettification helps prevent missing elements or broken layouts.

4. Simplifies Automated Workflows

If your application programmatically generates PDFs, ensuring HTML is clean and well-formed before conversion improves stability and accuracy.
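
As a rough illustration of such an automated workflow, the sketch below prettifies and converts every .html file in a folder. The folder names, the BatchConverter class, and the ConvertAll method are all hypothetical, and the decision to skip a file that fails (rather than stop the whole batch) is an assumption that may not suit every application.

```csharp
using System;
using System.IO;
using AngleSharp;             // ToHtml(formatter) extension method
using AngleSharp.Html;        // PrettyMarkupFormatter
using AngleSharp.Html.Parser;
using IronPdf;

static class BatchConverter
{
    // Converts each *.html file in inputDir to a PDF in outputDir.
    // Returns the number of files converted successfully.
    public static int ConvertAll(string inputDir, string outputDir)
    {
        Directory.CreateDirectory(outputDir);

        var parser = new HtmlParser();
        var renderer = new ChromePdfRenderer();
        int converted = 0;

        foreach (string htmlPath in Directory.EnumerateFiles(inputDir, "*.html"))
        {
            try
            {
                // Prettify first so any saved or logged markup is readable
                string rawHtml = File.ReadAllText(htmlPath);
                string prettyHtml = parser.ParseDocument(rawHtml).ToHtml(new PrettyMarkupFormatter());

                string pdfName = Path.GetFileNameWithoutExtension(htmlPath) + ".pdf";
                renderer.RenderHtmlAsPdf(prettyHtml).SaveAs(Path.Combine(outputDir, pdfName));
                converted++;
            }
            catch (Exception ex)
            {
                // Report the failure and continue with the remaining files
                Console.Error.WriteLine($"Skipping {htmlPath}: {ex.Message}");
            }
        }

        return converted;
    }
}

class Program
{
    static void Main()
    {
        int count = BatchConverter.ConvertAll("templates", "pdfs");
        Console.WriteLine($"Converted {count} file(s).");
    }
}
```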

Conclusion

Using an HTML prettifier with IronPDF in .NET is a simple but effective way to enhance PDF conversion. By structuring your HTML correctly, you ensure better rendering, improved maintainability, and fewer debugging headaches.

With tools like HtmlAgilityPack, AngleSharp, and HTML Beautifier, prettifying HTML before PDF generation takes very little effort. If you frequently work with HTML-to-PDF conversions, consider integrating an HTML prettifier into your workflow.

Give it a try today and see how it enhances your IronPDF experience! Download the free trial and start exploring all that IronPDF has to offer within your own projects.

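Frequently Asked Questions

What is the purpose of using an HTML prettifier before converting HTML to PDF?

Using an HTML prettifier before converting HTML to PDF ensures that the HTML code is clean, well-structured, and readable. This process helps prevent rendering issues and helps the final PDF output keep its intended design and layout.

How do I convert HTML to PDF in .NET?

You can use IronPDF, a .NET library, to convert HTML to PDF. IronPDF supports HTML5, CSS3, and JavaScript, so complex HTML structures are rendered accurately in the PDF.

What HTML prettifier libraries are available in .NET?

Libraries such as HtmlAgilityPack and AngleSharp can be used to prettify HTML in .NET. They help parse, manipulate, and format HTML documents so that they are well-structured and clean.

How does HtmlAgilityPack help with formatting HTML?

HtmlAgilityPack helps by parsing and manipulating HTML documents, even when they are malformed. It can clean up and re-serialize the markup, although it does not add indentation by itself, and it is well suited to web scraping and data extraction tasks.

What are the benefits of using AngleSharp for HTML formatting?

AngleSharp provides modern DOM manipulation capabilities and formats HTML with its PrettyMarkupFormatter class. It lets developers parse HTML content and format it into readable output, which is especially useful before converting HTML to PDF.

Can I prettify HTML online without installing any software?

Yes. Online tools such as BeautifyTools.com let you prettify HTML quickly and for free, without installing any libraries or writing code.

What features should I look for when choosing an HTML-to-PDF conversion library?

When choosing an HTML-to-PDF conversion library, look for full HTML5 and CSS3 support, JavaScript execution, support for headers, footers, and watermarks, PDF signing and security features, and efficient multithreaded performance. IronPDF provides all of these.

How does HTML formatting improve PDF output quality?

HTML formatting improves PDF output quality by ensuring the HTML is neatly structured and free of errors before conversion. This helps prevent rendering issues and produces more accurate PDF documents.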

Curtis Chau
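Technical Writer

Curtis Chau holds a Bachelor's degree in Computer Science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about creating intuitive and aesthetically pleasing user interfaces, Curtis enjoys working with modern frameworks and producing well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT), exploring innovative ways to integrate hardware and software. In his free time, he enjoys gaming and building Discord bots, combining his love of technology with creativity.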