How to create PDF documents in C# and VB.NET

This article describes different ways to create PDF documents in .NET with the help of the Docotic.Pdf library. It is a high‑performance, pure C# .NET library without external dependencies for creating, editing, converting, and processing PDF documents.

Illustration highlighting PDF creation and document automation with Docotic.Pdf.

In the following sections, I present the main approaches to creating PDFs with Docotic.Pdf:

  • Using the Core API, which provides low‑level control over text, graphics, and PDF internals. This option is best for custom layouts, graphics‑heavy documents, and advanced features.
  • Using the high‑level Layout API, which supports paragraphs, tables, headers, footers, and automatic pagination. This API is ideal when you want structured documents without manually calculating positions.
  • Converting HTML to PDF with support for SVG and other web formats. This approach is especially useful when your solution already produces HTML documents and you need PDF versions of those HTML and CSS files.
  • Creating PDFs from images. This method is useful for scanned documents, image‑based reports, receipts, and any workflow that starts with raster images.
  • Merging or splitting PDFs. This is a good choice for assembling reports, processing user uploads, combining related documents, and restructuring large PDFs.
  • Creating PDFs from templates. This approach works well when consistent formatting is required for batch‑generated documents such as receipts, tax forms, employment contracts, and other repeatable document types.

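Extra topics covered in the guide include:

  • Adding interactive elements such as annotations, links, bookmarks, scripts, and open actions
  • Applying encryption and digital signatures
  • Setting PDF metadata
  • Configuring page labels and viewer preferences
  • Saving PDFs
  • Testing PDF output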

Creating PDFs with the Core API

The Core API is the foundation of PDF creation in Docotic.Pdf. It gives you full, low‑level control over placing text, images, and vector graphics on a PDF canvas through its Canvas API. This drawing API is a subset of the Core API and provides the methods and properties used to add content to pages and other objects with canvases. Beyond rendering, the Core API also supports annotations, form fields, layers, bookmarks, and other PDF features.

Here is C# code that creates a simple PDF using three fundamental operations: drawing text, placing an image, and rendering vector graphics on a page canvas.

using var pdf = new PdfDocument();
var canvas = pdf.Pages[0].Canvas;

canvas.Font = pdf.CreateFont(PdfBuiltInFont.HelveticaBold);
canvas.FontSize = 14;
canvas.DrawString(40, 100, "Core API demo: text, images, and vector graphics");

var image = pdf.CreateImage("image.png");
canvas.DrawImage(image, 40, 180, 120, 120, 0);

canvas.Pen.Color = new PdfRgbColor(30, 60, 160);
canvas.Pen.Width = 2;
canvas.Brush.Color = new PdfRgbColor(200, 230, 255);
canvas.DrawRectangle(new PdfRectangle(200, 200, 150, 80), PdfDrawMode.FillAndStroke);

pdf.Save("core-api-demo.pdf");

This overview introduces only a small portion of what the Core API can do. For advanced topics, see the detailed article on building PDFs with the Core API. The article covers measuring text, working with color spaces, applying clipping, filling areas with patterns, handling transparency, and other capabilities.

Generating PDFs with the Layout API

The Layout API is a high‑level document‑building engine that provides the easiest and most efficient way to generate complex, content‑rich PDFs.

When using the API, you compose PDFs from structural elements such as pages, containers, text spans, images, tables, links, headers, footers, and more. Instead of calculating coordinates or manually managing pagination, you describe the document's structure and let the layout engine handle the rest.

This example illustrates how to create a PDF using the Layout API, relying on declarative layout instead of manual positioning.

PdfDocumentBuilder.Create()
    .Info(info => info.Title = "Docotic.Pdf Layout API demo")
    .Generate("layout-api-demo.pdf", doc => doc.Pages(pages =>
    {
        pages.Content().Padding(100).Text(text =>
        {
            text.Span("The Layout API lets you compose PDFs from structural elements ");
            text.Line("without manually calculating coordinates or handling pagination.")
                .Style(s => s.Strong);
        });
    }));

Check out the in‑depth guide to learn how to use the Layout API when generating PDFs in .NET applications.

Converting web content with the HTML to PDF API

Docotic.Pdf, paired with its free HtmlToPdf add‑on, provides a Chrome‑based HTML‑to‑PDF engine. Using the add‑on's API, you can turn modern HTML and other web content, such as SVG or WebP images, into high‑quality PDF documents.

The HTML to PDF API can create PDFs from complete HTML pages or HTML fragments. You can convert content from URLs, raw HTML strings, and local HTML files. The latter two options make it easy to generate PDFs from HTML templates.

See an example of how to produce a PDF from an HTML template:

public static async Task HelloHtmlTemplate()
{
    static string GetUserName()
    {
        // Replace with real logic: form input, API call, config, etc.
        return "World";
    }

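    // Encode the value so that user input cannot inject markup into the template.
    string userName = System.Net.WebUtility.HtmlEncode(GetUserName());
    string html = $@"
        <h1>Hello, {userName}!</h1>
        <p>This PDF was generated from an HTML template.</p>";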

    using var converter = await HtmlConverter.CreateAsync();
    using var pdf = await converter.CreatePdfFromStringAsync(html);
    pdf.Save("hello-html-template.pdf");
}
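
When a template contains more than one placeholder, string interpolation quickly becomes hard to maintain. Below is a minimal sketch of token substitution that could run before the HTML is handed to the converter. The HtmlTemplate helper and the {{name}} token syntax are my own assumptions for illustration and are not part of Docotic.Pdf or the HtmlToPdf add‑on.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Docotic.Pdf or its HtmlToPdf add-on.
public static class HtmlTemplate
{
    // Matches tokens such as {{userName}}; the token syntax is an assumption.
    private static readonly Regex Token = new Regex(@"\{\{(\w+)\}\}");

    public static string Fill(string template, IReadOnlyDictionary<string, string> values)
    {
        return Token.Replace(template, match =>
        {
            string key = match.Groups[1].Value;

            // Leave unknown tokens untouched so that missing data is easy to spot.
            if (!values.TryGetValue(key, out var value))
                return match.Value;

            // Encode the value so that user input cannot inject markup.
            return WebUtility.HtmlEncode(value);
        });
    }
}
```

The string returned by Fill can then be passed to CreatePdfFromStringAsync in the same way as the html variable in the example above.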

For more details and examples, check out our in‑depth HTML‑to‑PDF overview.

Creating PDFs from images

Docotic.Pdf provides a flexible, developer‑friendly way to convert images to PDF. The library supports JPEG, BMP, GIF, PNG, TIFF, and JPEG 2000 image formats through the Core API.

Illustration showing how Docotic.Pdf converts multiple image files into a single PDF document.

When supported by the PDF format, Docotic.Pdf embeds image bytes as‑is, avoiding pixel decoding and re‑encoding to preserve the original compression. The library also preserves the color space when possible.

In addition, SVG and WebP formats are supported through the HTML to PDF API. When you need to place images alongside labels or descriptions, the Layout API helps you arrange and align elements with minimal effort.

How to combine several images into one PDF

With Docotic.Pdf, you can easily convert a set of images into a single PDF, placing one image on each page.

The example below loads images from files and draws each one on its own page. Each image is scaled to fit the page and centered for a clean, consistent layout.

public static void ImagesOnToPdf(string[] imagePaths, string outputPath)
{
    using var pdf = new PdfDocument();

    foreach (string path in imagePaths)
    {
        var image = pdf.CreateImage(path);

        var page = pdf.AddPage();
        var pageWidth = page.Width;
        var pageHeight = page.Height;

        var scale = Math.Min(pageWidth / image.Width, pageHeight / image.Height);
        var drawWidth = image.Width * scale;
        var drawHeight = image.Height * scale;
        var x = (pageWidth - drawWidth) / 2;
        var y = (pageHeight - drawHeight) / 2;

        page.Canvas.DrawImage(image, x, y, drawWidth, drawHeight, 0);
    }

    pdf.RemovePage(0);

    pdf.Save(outputPath);
}
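
The scale‑and‑center arithmetic from the loop above can be isolated into a small pure function, which makes it easy to check with concrete numbers. This is only a sketch: the ImageFit class and the FitAndCenter method are hypothetical names, and the function does not call Docotic.Pdf at all.

```csharp
using System;

// Hypothetical helper that isolates the scale-and-center arithmetic.
public static class ImageFit
{
    // Returns the rectangle that fits an image into a page while keeping
    // the aspect ratio and centering the image on the page.
    public static (double X, double Y, double Width, double Height) FitAndCenter(
        double pageWidth, double pageHeight, double imageWidth, double imageHeight)
    {
        // The smaller of the two ratios guarantees that both dimensions fit.
        double scale = Math.Min(pageWidth / imageWidth, pageHeight / imageHeight);
        double width = imageWidth * scale;
        double height = imageHeight * scale;

        // Split the leftover space evenly on both sides.
        return ((pageWidth - width) / 2, (pageHeight - height) / 2, width, height);
    }
}
```

For a 600 × 800 page and a 1200 × 800 image, the scale is min(0.5, 1.0) = 0.5, so the image is drawn at 600 × 400 with x = 0 and y = 200. Note that this formula also enlarges images smaller than the page. If that is not desired, one option is to clamp the scale with Math.Min(1.0, scale).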

Handling multipage TIFF and GIF images

Docotic.Pdf fully supports multipage TIFF and GIF files. When adding images to a PDF, use the OpenImage method instead of CreateImage if any of the images may contain multiple pages.

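The following code demonstrates how to convert TIFF to PDF, and it works for both single‑page and multipage images. To show how individual frames can be selected, the sample keeps only the odd‑numbered frames (the first, the third, and so on). Remove the index check to convert every frame: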

public static void OddFramesToPdf(string[] imagePaths, string outputPath)
{
    using var pdf = new PdfDocument();
    foreach (string path in imagePaths)
    {
        var imageFrames = pdf.OpenImage(path);
        for (int i = 0; i < imageFrames.Count; i++)
        {
            if (i % 2 != 0)
                continue;

            var image = pdf.CreateImage(imageFrames[i]);
            var page = pdf.AddPage();

            page.Width = image.Width;
            page.Height = image.Height;

            page.Canvas.DrawImage(image, 0, 0, image.Width, image.Height, 0);
        }
    }

    pdf.RemovePage(0);

    pdf.Save(outputPath);
}

You can use the same approach to convert GIF to PDF. It also applies to other image formats, though it is more elaborate than necessary for formats that contain only a single frame.

Merging and splitting PDFs

As a fully featured .NET library, Docotic.Pdf can create new PDFs by merging, extracting, and reorganizing pages from existing documents.

When you merge PDFs, the library not only adds pages from another document but also appends layers, bookmarks, page labels, shared JavaScript, destinations (link targets), and embedded files. For more details and for guidance on reducing the size of combined PDFs, see the article on merging PDFs.

Digitally signed PDFs cannot be merged without invalidating their existing signatures. To preserve signatures, create a PDF portfolio instead of appending documents. Another option is to merge the PDFs first and then apply a new digital signature to the combined document.

Docotic.Pdf also lets you copy and extract pages into new documents. All content associated with the copied pages is preserved, including annotations, form controls, structured content, layers, and other related data. For practical examples, refer to the article on splitting PDFs in .NET. It also explains how to extract or remove pages.
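
As a rough sketch of both operations, the code below appends one document to another and then copies the first two pages of the result into a new document. It assumes the PdfDocument.Append method that accepts a file name and the PdfDocument.CopyPages method that accepts a start index and a page count. Please check the API reference for the exact overloads in your version. The file names are placeholders.

```csharp
using BitMiracle.Docotic.Pdf;

// File names are placeholders for illustration.
using var pdf = new PdfDocument("report-part1.pdf");

// Append all pages of the second document to the first one.
// Assumption: Append has an overload that accepts a file name.
pdf.Append("report-part2.pdf");

// Copy the first two pages of the merged document into a new document.
// Assumption: CopyPages accepts a start index and a page count.
using PdfDocument firstPages = pdf.CopyPages(0, 2);

pdf.Save("report-merged.pdf");
firstPages.Save("report-first-two-pages.pdf");
```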

Working with PDF templates

PDF templates are pre‑designed PDF files that serve as a base structure for creating new documents. They are useful when you need to produce PDFs with a consistent layout while supplying different data. If you want to separate visual design from the data itself, PDF templates are also a good choice.

Templates can be either form‑based PDFs or static PDFs without forms. Both types serve the same purpose. In addition, form‑based templates include interactive elements that, if not flattened, can collect information from users.

Creating PDFs from form-based templates

Form‑based templates usually contain AcroForms, the standard and widely supported type of interactive PDF form. To generate a PDF from such a template, you typically need to:

  • fill each placeholder field
  • flatten the fields to prevent further editing
  • save the result as a new PDF

Here is C# code that finds a placeholder text field by name, assigns a value to it, flattens the field, and saves the result, creating a PDF from the template:

var nameOnCertificate = "Eva Marin";
using var pdf = new PdfDocument("certificate-template.pdf");
if (pdf.TryGetControl("name", out var field))
{
    if (field is PdfTextBox nameField)
    {
        nameField.Text = nameOnCertificate;
        nameField.Flatten();
    }
}

pdf.Save($"certificate-{nameOnCertificate}.pdf");

If your template contains many placeholders, you can import FDF data instead of filling each field individually. You can also use PdfDocument.FlattenControls to flatten all fields at once.
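
When FDF data is not available, a dictionary of field names and values is often enough. The sketch below reuses TryGetControl and PdfTextBox from the previous example and finishes with PdfDocument.FlattenControls. The field names, the values, and the output file name are made up for illustration.

```csharp
using System.Collections.Generic;
using BitMiracle.Docotic.Pdf;

// Field names and values are placeholders for illustration.
var values = new Dictionary<string, string>
{
    ["name"] = "Eva Marin",
    ["course"] = "PDF Automation",
    ["issuer"] = "Training Department",
};

using var pdf = new PdfDocument("certificate-template.pdf");
foreach (var (fieldName, value) in values)
{
    // Skip names that are missing from the template or are not text boxes.
    if (pdf.TryGetControl(fieldName, out var control) && control is PdfTextBox textBox)
        textBox.Text = value;
}

// Flatten every control in one call instead of flattening fields one by one.
pdf.FlattenControls();

pdf.Save("certificate-filled.pdf");
```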

Creating PDFs from static templates without forms

If your template does not contain form fields, you will draw names and other data directly onto the page canvas. Static PDF templates usually contain fixed visual placeholders such as text, images, or empty areas. To produce a PDF from the template, you will need to fill these empty areas and replace placeholder text and images programmatically.

Empty areas

Make use of the Canvas API to place text and images in empty areas. In simple cases, when you need only minor changes such as adding a name and a photo, this approach works well. You need to know the coordinates and size of the areas, and to position text properly, you may need to measure it first and then align it accordingly.

Working with variable‑length or multi‑line text is more challenging but still feasible. By combining DrawText, DrawString, and the methods for measuring text, you can wrap and position lines as needed. If your template contains more than a few such areas, consider an alternative approach such as generating PDFs with the Layout API.
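
To give an idea of what wrapping involves, here is a minimal sketch of a greedy word‑wrap routine. It is deliberately independent of Docotic.Pdf: the TextWrapper class is hypothetical, and the measureWidth delegate stands in for whichever canvas text‑measuring method you use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Docotic.Pdf.
public static class TextWrapper
{
    // Greedy word wrap: keeps adding words to a line while the measured width fits.
    // measureWidth is supplied by the caller; with Docotic.Pdf it would wrap
    // one of the canvas methods for measuring text (an assumption here).
    public static List<string> Wrap(string text, double maxWidth, Func<string, double> measureWidth)
    {
        var lines = new List<string>();
        string current = "";
        foreach (string word in text.Split(' ', StringSplitOptions.RemoveEmptyEntries))
        {
            string candidate = current.Length == 0 ? word : current + " " + word;
            if (current.Length > 0 && measureWidth(candidate) > maxWidth)
            {
                // The word does not fit, so it starts the next line.
                lines.Add(current);
                current = word;
            }
            else
            {
                current = candidate;
            }
        }

        if (current.Length > 0)
            lines.Add(current);

        return lines;
    }
}
```

Each returned line can then be drawn with DrawString, moving the y coordinate down by the line height after every call. A single word wider than maxWidth is kept on its own line and overflows the area, so real templates may need additional handling for that case.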

Placeholder text

Docotic.Pdf also provides methods for finding and replacing text. However, using text search as a templating mechanism is usually no easier than working with empty placeholder areas. Before inserting new content, you must locate the exact text fragment and remove it cleanly.

Placeholder images

Static templates may include placeholder images for user avatars or product photos. To find a placeholder image, enumerate the collection of images painted on each page. For each painted image, you can obtain its visible size and position. To replace the placeholder, use PdfImage.ReplaceWith.

using var pdf = new PdfDocument("invoice-template.pdf");
var paintedImages = pdf.Pages[0].GetPaintedImages();

var placeholder = paintedImages.First();
placeholder.Image.ReplaceWith("company-logo.jpg");

pdf.Save("invoice.pdf");

Another option is to draw a new image over the area occupied by the placeholder image, but this typically increases the size of the resulting PDF without good reason.

Designing placeholders for easy replacement

For static templates, it helps to design the layout with predictable, well‑defined regions for both text and images. Leave enough padding around areas that will contain variable‑length content, and use neutral placeholder images that match the aspect ratios you expect to insert later.

If your template uses placeholder text that you plan to replace, you can simplify the workflow by using text boxes instead of raw text. Add a read‑only, borderless text box to the template and place the placeholder text inside it. When generating the final PDF, open the template, locate the text box by name, and assign the new value directly, for example with box.Text = "new text". Then flatten the text box to prevent further edits.

Adding interactive elements

Interactive features turn a static PDF into a dynamic, easy‑to‑navigate document enriched with annotations and markup. Actions and JavaScript enable automation directly within the PDF.

Annotations

Annotations are objects attached to a page that represent comments, highlights, file attachments, and other interactive widgets. They are visible in the page content and support review workflows and collaboration.

The following C# example shows how to add text annotations, also known as sticky notes, to a PDF page using Docotic.Pdf.

using var pdf = new PdfDocument("example.pdf");
var page = pdf.Pages[0];

var textAnnot = page.AddTextAnnotation(new PdfPoint(50, 100), "Reviewer comment");
textAnnot.Contents = "Please check the figures on this page.";

pdf.Save("text-annotation.pdf");

The next example demonstrates how to highlight text and other content to draw attention to key parts of a document.

using var pdf = new PdfDocument("example.pdf");
var page = pdf.Pages[0];

var color = new PdfRgbColor(255, 255, 120);
var annotationText = "Please confirm this part.";
var bounds = new PdfRectangle(50, 250, 120, 40);
page.AddHighlightAnnotation(annotationText, bounds, color);

pdf.Save("highlight-annotation.pdf");

Links

PDF standards define several types of PDF links. The most important and widely used are internal links and hyperlinks.

Internal links, also called GoTo actions, allow jumps to a page or a named destination inside the same PDF. They are useful for cross‑references and internal navigation.

Here is C# code that creates a link from the first page to the page with index equal to 5:

using var pdf = new PdfDocument();
var page = pdf.Pages[0];

int targetPageIndex = 5;
for (int i = 0; i < targetPageIndex; i++)
    pdf.AddPage();

var rect = new PdfRectangle(50, 50, 100, 40);
page.Canvas.DrawRectangle(rect);
page.AddLinkToPage(rect, targetPageIndex);

pdf.Pages[targetPageIndex].Canvas.DrawString(50, 50, "Glad to have you here.");

pdf.Save("link-to-page.pdf");

The Layout API provides another way to create internal links that does not require absolute positioning.

External links, also called URI actions, open a web URL. You can add a hyperlink to a PDF page by using the PdfPage.AddHyperlink method. Otherwise, the approach is the same as with internal links.
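
For completeness, here is a short sketch of an external link. It assumes that PdfPage.AddHyperlink has an overload that accepts a rectangle and a Uri. The URL and the output file name are placeholders.

```csharp
using System;
using BitMiracle.Docotic.Pdf;

using var pdf = new PdfDocument();
var page = pdf.Pages[0];

// Draw a visible label so that readers can see where to click.
page.Canvas.DrawString(50, 50, "Visit example.com");

// Assumption: AddHyperlink accepts the clickable area and the target Uri.
var rect = new PdfRectangle(50, 50, 160, 20);
page.AddHyperlink(rect, new Uri("https://example.com"));

pdf.Save("link-to-url.pdf");
```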

Bookmarks

Bookmarks, also called outlines, are special shortcuts or links that help readers navigate to specific sections or pages quickly. When the reader clicks a bookmark, the viewer application jumps to the designated part of the document.

Outlines appear in the viewer's bookmarks panel and provide a hierarchical navigation tree similar to a table of contents in a book, but interactive. PDF outlines can include main bookmarks and nested bookmarks, which makes it easier to structure large documents.

The following example shows how to create bookmarks in PDFs using C# and Docotic.Pdf. The code creates three top‑level bookmarks. The second bookmark contains one nested bookmark.

using var pdf = new PdfDocument();

for (int i = 0; i < 5; i++)
{
    var page = i == 0 ? pdf.Pages[0] : pdf.AddPage();

    var canvas = page.Canvas;
    canvas.FontSize = 14;
    canvas.DrawString(50, 50, $"Page {i + 1}");
}

var root = pdf.OutlineRoot;
root.AddChild("Getting Started", 1);

var child = root.AddChild("Things You Can Do", 2);
child.AddChild("Making Quick Improvements", 3);

root.AddChild("Keeping Everything Running Smoothly", 4);

pdf.PageMode = PdfPageMode.UseOutlines;

pdf.Save("bookmarks.pdf");

Bookmarks differ from the table of contents you might find printed on the pages of a physical book or displayed in a PDF. You can create a table of contents programmatically by measuring headings and writing entries with page numbers.

To see an alternative approach to creating a table of contents by using the Layout API, take a look at the related code in our sample repository.

PDF scripting

JavaScript actions are among the most powerful interactive features. PDF JavaScript is based on the core JavaScript language and adds objects that expose document and viewer APIs. It is used for form validation, calculations, user interface dialogs, and small automation tasks.

You can attach scripts to annotations, bookmarks, form controls, or open actions. With Docotic.Pdf you can embed JavaScript code in a PDF. The code can validate form input, compute values, show or hide fields, or perform viewer interactions.

The collection of shared JavaScript contains scripts stored at the document level. These scripts can be reused by multiple actions. In other words, shared scripts are useful for utility functions and shared logic. They help reduce duplication and simplify maintenance.

The code below defines a shared script that displays an alert message in the PDF viewer and then shows how to trigger that script by assigning it to the button's click action.

using var pdf = new PdfDocument();

pdf.SharedScripts.Add(
    pdf.CreateJavaScriptAction("function messageBox(message) { app.alert(message,3); }")
);

var button = pdf.Pages[0].AddButton(50, 50, 100, 40);
button.Text = "Click me";
button.OnMouseUp = pdf.CreateJavaScriptAction("messageBox('Hello, dear!');");

pdf.Save("shared-javascript.pdf");

The script in the example is simple, but you can build JavaScript actions of any complexity. The Adobe JavaScript API reference provides many methods you can use. Keep in mind that non‑Adobe viewers usually support only a subset of the API.

Open actions

An open action is an action that the PDF viewer executes when the document is opened. Typical uses include opening at a specific page, running a JavaScript initialization routine, or setting viewer preferences. There are no restrictions on the type of open action.

The following example shows how to create a GoTo open action. The code adds text to the second page and sets an open action that makes the viewer automatically navigate to that page when the PDF is opened.

using var pdf = new PdfDocument();

var canvas = pdf.AddPage().Canvas;
canvas.FontSize = 14;

var message =
    "If you see this immediately after opening the file, " +
    "your PDF viewer supports open actions.";
var options = new PdfTextDrawingOptions(new PdfRectangle(100, 100, 100, 150));
canvas.DrawText(message, options);

pdf.OnOpenDocument = pdf.CreateGoToPageAction(1, 0);

pdf.Save("open-action.pdf");

Note that not all viewers execute JavaScript open actions. Some ignore them or prompt the user first. Some viewers block open actions entirely.

To check whether a PDF contains an open action, load it into a PdfDocument and inspect the OnOpenDocument property. If the property is null, the document has no open action defined.
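
The check can be sketched in a few lines. The code relies only on the OnOpenDocument property mentioned above and prints the runtime type of the action, if any. The input file name is a placeholder.

```csharp
using System;
using BitMiracle.Docotic.Pdf;

// The file name is a placeholder for any existing PDF.
using var pdf = new PdfDocument("open-action.pdf");

// OnOpenDocument is null when the document defines no open action.
if (pdf.OnOpenDocument is null)
    Console.WriteLine("The document has no open action.");
else
    Console.WriteLine($"Open action type: {pdf.OnOpenDocument.GetType().Name}");
```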

Applying encryption and digital signatures

Encryption and digital signatures address two complementary aspects of securing the PDFs you create. Encryption controls who can open a document and what they can do with it, while signatures prove who created or approved the file and confirm that it has not been altered.

Password protection lets you set access rules during creation. You can assign an open password to restrict viewing and an owner password to define permissions such as printing, copying, editing, or filling forms. Certificate encryption provides stronger, recipient‑specific protection and works well when distributing confidential PDFs to multiple people without relying on a shared password. For more detail, see the article on encrypting PDFs with passwords and certificates.

Digital signatures add authenticity and integrity at creation time. Docotic.Pdf can sign PDFs using certificates from files, the Windows certificate store, hardware tokens, HSMs, or cloud key services. You can include timestamps and long‑term validation data so signatures remain verifiable long after the document is produced. External signing workflows, including PKCS#11 and cloud KMS, are also supported.

Setting PDF metadata

PDF metadata is descriptive information embedded in a document, such as the title, author, subject, keywords, creation dates, and similar fields. It helps software, search engines, and document‑management systems understand what a file is about without opening it.

A PDF document can carry metadata in two coexisting systems:

  • XMP metadata
  • document information dictionary (Info dictionary)

Illustration of the process of adding XMP metadata to a PDF document using Docotic.Pdf.

XMP is the richer, structured, standardized format for embedding descriptive metadata. The Info dictionary is simple and widely supported, but limited, and is deprecated in the PDF 2.0 standard (ISO 32000‑2) in favor of XMP metadata. Docotic.Pdf can read and write both systems and provides a helper method to keep them in sync.

Docotic.Pdf updates some metadata automatically before saving a PDF file. For example, the library sets Producer and Creator values by default. Use save options to change this behavior and preserve explicitly set metadata values.

XMP metadata

Use the PdfDocument.Metadata property to access and modify XMP metadata in a PDF. Through this property you can work with well‑known schemas such as XMP Core, Dublin Core, and the PDF schema, as well as manage your own custom metadata.

using var pdf = new PdfDocument();
var xmp = pdf.Metadata;

xmp.Pdf.Creator = new XmpString("Second-line authoring terminal");
xmp.Pdf.Title = new XmpString("Quarterly Report");

var creators = new XmpArray(XmpArrayType.Ordered);
creators.Values.Add(new XmpString("Second-line authoring terminal"));
creators.Values.Add(new XmpString("Assistive authoring terminal"));
xmp.DublinCore.Creators = creators;

var descriptions = new XmpArray(XmpArrayType.Alternative);
descriptions.Values.Add(new XmpLanguageAlternative("x-default", "Quarterly Report"));
descriptions.Values.Add(new XmpLanguageAlternative("fr", "Rapport trimestriel"));
descriptions.Values.Add(new XmpLanguageAlternative("de", "Quartalsbericht"));
xmp.DublinCore.Descriptions = descriptions;

var author1 = new XmpString("First Author");
author1.Qualifiers.Add("role", "main author");

var author2 = new XmpString("Second Author");
author2.Qualifiers.Add("role", "co-author");

var authors = new XmpArray(XmpArrayType.Unordered);
authors.Values.Add(author1);
authors.Values.Add(author2);
xmp.Custom.Properties.Add("authors", authors);

pdf.Save("with-xmp-metadata.pdf");

XMP supports arrays, structures, and typed values, which makes it a good fit for rich metadata. The code above also shows how to store application‑specific properties in the custom XMP schema.

Document information dictionary

The Info dictionary primarily stores text string values. It is compact and broadly supported but limited. Use the Info dictionary for compatibility with older tools, and prefer XMP in other cases.
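
A minimal sketch of setting Info dictionary values might look like the code below. It assumes that PdfDocument exposes the dictionary through an Info property with Title, Author, Subject, and Keywords string properties. The values and the file name are placeholders.

```csharp
using BitMiracle.Docotic.Pdf;

using var pdf = new PdfDocument();

// Assumption: the Info property exposes these plain text entries.
pdf.Info.Title = "Quarterly Report";
pdf.Info.Author = "First Author";
pdf.Info.Subject = "Financial results";
pdf.Info.Keywords = "report, quarterly, finance";

pdf.Save("with-info-dictionary.pdf");
```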

Synchronizing metadata

It is a good practice to keep both metadata systems in sync to avoid inconsistencies that can confuse readers and automated tools.

Use PdfDocument.SyncMetadata to align XMP and Info values so that corresponding fields match. The method fills missing Info properties from XMP and, similarly, populates missing XMP fields from Info. Set preferXmp: true when XMP is your authoritative source, or false when the Info dictionary should take precedence.

pdf.SyncMetadata(preferXmp: true);

See the Remarks section in the SyncMetadata documentation for detailed information about which properties the method synchronizes.

Configuring page labels and viewer preferences

A newly created PDF can benefit from explicit page numbering, fine‑tuned viewer preferences, and a page layout chosen to present the document's content more effectively. These settings affect how readers first see and navigate the file.

Page labels

Page labels are metadata that tell PDF viewers what label to display for each page. Use them when the visible numbering must differ from the physical page index, for example, when you want i, ii, iii for the front matter and 1, 2, 3 for the main text in your PDF.

This C# code shows how to label PDF pages with lowercase Roman numerals for the first three pages and Arabic numbering starting at 1 for the rest.

using var pdf = new PdfDocument();

for (int i = 0; i < 8; i++)
    pdf.AddPage();

pdf.PageLabels.AddRange(0, 2, PdfPageNumberingStyle.LowercaseRoman);
pdf.PageLabels.AddRange(3, PdfPageNumberingStyle.DecimalArabic);

pdf.Save("with-page-labels.pdf");
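
To make the effect of the two ranges concrete, the following sketch computes the label a viewer would be expected to show for a given page index. It is plain C# and does not use Docotic.Pdf. The PageLabelPreview class is a hypothetical helper that simply mirrors the ranges defined above.

```csharp
using System.Text;

// Hypothetical helper that previews the labels defined by the two ranges above.
public static class PageLabelPreview
{
    // Converts a positive number to lowercase Roman numerals (1 -> "i", 4 -> "iv").
    public static string ToLowerRoman(int number)
    {
        int[] values = { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
        string[] symbols = { "m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i" };

        var result = new StringBuilder();
        for (int i = 0; i < values.Length; i++)
        {
            while (number >= values[i])
            {
                result.Append(symbols[i]);
                number -= values[i];
            }
        }

        return result.ToString();
    }

    // Pages 0..2 use Roman numerals; later pages use decimal numbers restarting at 1.
    public static string LabelFor(int pageIndex)
    {
        const int firstBodyPageIndex = 3;
        return pageIndex < firstBodyPageIndex
            ? ToLowerRoman(pageIndex + 1)
            : (pageIndex - firstBodyPageIndex + 1).ToString();
    }
}
```

For the nine‑page document created above, the labels are i, ii, iii for the first three pages and 1 through 6 for the remaining ones.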

PDF viewer preferences

PDF viewer preferences are recommendations embedded in the document that suggest how a viewer should present it. For example, you can specify that the viewer should hide toolbars, center the window, or fit the window to the page. Viewer preferences complement page layout and open action settings.

Here is how to change PDF viewing preferences using Docotic.Pdf:

using var pdf = new PdfDocument();

pdf.ViewerPreferences.DisplayTitle = false;
pdf.ViewerPreferences.FitWindow = true;
pdf.ViewerPreferences.HideToolBar = true;
pdf.ViewerPreferences.HideMenuBar = true;
pdf.ViewerPreferences.HideWindowUI = true;
pdf.ViewerPreferences.CenterWindow = true;

pdf.Save("with-viewer-prefs.pdf");

Please note that, depending on their configuration, Adobe Acrobat and other viewers may ignore these preferences.

Page layout and page mode

Page layout determines how pages are arranged when the document opens: as a single page at a time, one-column continuous, or two-page spreads. Page mode controls which UI panels are visible on open: bookmarks/outlines, attachments, thumbnails, or none.

Here is how to specify that the created PDF should display as a two‑page spread, left page first, with the thumbnails panel visible on open:

using var pdf = new PdfDocument();

for (int i = 0; i < 7; i++)
{
    var page = i > 0 ? pdf.AddPage() : pdf.Pages[0];
    page.Canvas.FontSize = 36;
    page.Canvas.DrawString(100, 100, $"Page {i + 1}");
}

pdf.PageLayout = PdfPageLayout.TwoPageLeft;
pdf.PageMode = PdfPageMode.UseThumbs;

pdf.Save("with-layout-and-mode.pdf");

Saving PDFs

Docotic.Pdf can produce different PDF files or streams from the same document you created or edited. These outputs can conform to different versions of the PDF format, vary in byte length, and require different amounts of memory to generate.

The way the library produces the bytes of a PDF depends on the save options. When you don't explicitly specify save options, the Save, SignAndSave, and TimestampAndSave methods of a PdfDocument object use default settings. These defaults are carefully chosen and work well for most scenarios, but you may still need to adjust them.

Refer to the documentation for the PdfSaveOptions class for detailed information about available options and their default values. The sections below highlight some of the more important options and provide practical recommendations.

PDF version

Docotic.Pdf uses object streams by default to achieve better compression of the files it produces. As a result, the library creates PDF 1.5 files and streams by default.

PDF 1.5 requires Adobe Reader 6 (released in 2003) or newer to view the produced documents. This is usually not a problem unless you must support legacy tools, older viewers, or embedded devices that only accept older PDF versions.

Here is how to save with an older PDF file version:

using var pdf = new PdfDocument();

var options = new PdfSaveOptions
{
    Version = PdfVersion.Pdf14,
    UseObjectStreams = false,
};
pdf.Save("version-1.4.pdf", options);

To save with the PDF 1.4 version, object streams must also be disabled. The library will not use an older version if the document contains features introduced in later versions.

File size reduction

Several save options, when set to true, cause Docotic.Pdf to produce smaller files (byte‑wise): RemoveUnusedObjects, OptimizeIndirectObjects, WriteWithoutFormatting, and UseObjectStreams.

Here is how to produce PDFs without unreferenced objects and extra whitespace, with data tightly packed into object streams:

using var pdf = new PdfDocument();

var options = new PdfSaveOptions
{
    UseObjectStreams = true,
    RemoveUnusedObjects = true,
    OptimizeIndirectObjects = true,
    WriteWithoutFormatting = true,
};
pdf.Save("optimized.pdf", options);

These options are most effective when the PDF is fully rewritten. During an incremental save, they apply only to the newly added revision and cannot clean or optimize earlier parts of the file.

Incremental updates

Docotic.Pdf can update PDFs incrementally. When WriteIncrementally is true, the library appends changes to the existing file rather than rewriting it. The previous cross‑reference and object data remain intact. The appended data is called an incremental update, and the original file together with all updates appended so far constitutes a revision of the file.

Incremental updates are not possible for newly created documents because there is no previous revision to append to. The library ignores this option for new documents and writes them in non‑incremental mode.
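
Here is a short sketch of an incremental save. It opens an existing file, adds a sticky note, and saves with the WriteIncrementally option enabled. The file names are placeholders, and the annotation is only there so that the new revision has something to record.

```csharp
using BitMiracle.Docotic.Pdf;

// "reviewed.pdf" is a placeholder for an existing file.
using var pdf = new PdfDocument("reviewed.pdf");

var note = pdf.Pages[0].AddTextAnnotation(new PdfPoint(50, 100), "Reviewer comment");
note.Contents = "Approved without changes.";

// Append the change as a new revision instead of rewriting the file.
var options = new PdfSaveOptions
{
    WriteIncrementally = true,
};
pdf.Save("reviewed-updated.pdf", options);
```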

When incremental updates are required

When adding a new digital signature to a document that already contains signatures, you must save the file incrementally. The same applies when updating a previously signed file with new annotations or form data. Rewriting the entire file in these cases would invalidate existing signatures.

At the same time, it is best to perform a non‑incremental (full) save before applying the first digital signature so that the signed baseline is a clean, fully rewritten file. Signing a document that contains structural issues in earlier revisions may lead to unexpected signature validation problems.

Incremental appends are also required in workflows that must preserve an auditable revision history or enforce append‑only document storage.

Benefits of using incremental updates

Incremental updates enable multiple signatures on the same file and allow a limited set of post‑signature modifications, such as filling form fields, without invalidating existing signatures.

This approach additionally offers faster saves for small changes because only the modified data is written. It also preserves the document's revision history, which is essential for auditing and other compliance‑driven workflows.

Problems and pitfalls to avoid

Incremental updates cannot apply global compression or remove obsolete objects across the entire file because they append only the modified objects. As a result, they generally produce larger and less optimized files than a full rewrite.

File size grows with each revision, even when no unused objects are present, because all previous revisions remain embedded in the file and continue to occupy space.

Sensitive or incorrect information from earlier revisions remains recoverable, and existing PDF format issues or structural defects in prior revisions are not corrected by appending new data.

Finally, some viewers and processing tools struggle with multi‑revision PDFs. Before relying on incremental updates, ensure that all consumers of your documents can handle files with multiple revisions.
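
To check whether a file already carries several revisions, one rough option is to count the `%%EOF` markers in its raw bytes, because each incremental update normally ends with its own marker. The sketch below is a heuristic and not part of Docotic.Pdf. The class and method names are hypothetical, and the count can be off: linearized files, for example, typically contain an additional marker, and the byte sequence may also occur inside stream data.

```csharp
using System;
using System.Text;

public static class PdfRevisionProbe
{
    // Counts occurrences of the "%%EOF" marker in raw PDF bytes.
    // Heuristic only: a count above 1 suggests, but does not prove,
    // that the file contains incremental updates.
    public static int CountEofMarkers(byte[] pdfBytes)
    {
        ReadOnlySpan<byte> data = pdfBytes;
        ReadOnlySpan<byte> marker = Encoding.ASCII.GetBytes("%%EOF");

        int count = 0;
        while (true)
        {
            int position = data.IndexOf(marker);
            if (position < 0)
                break;

            count++;
            // Continue searching after the marker that was just found.
            data = data.Slice(position + marker.Length);
        }

        return count;
    }
}
```

A typical call would pass the result of `File.ReadAllBytes` for the file under inspection. Treat the result as a hint for further checks rather than as a definitive revision count.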

Testing PDF output

Automated PDF testing protects releases from regressions in content and layout by comparing generated PDFs with baseline PDFs stored in your repository or artifact storage. Baselines help you detect accidental changes in text, fonts, images, or layout and reduce the need for manual QA in every build.

Combine structural checks, text extraction, and visual comparisons for the most reliable results.

Quick comparison of approaches

| Method | Speed | Sensitivity | Best for |
| --- | --- | --- | --- |
| Structural comparison | Fast | High: detects object-level changes | Regression tests that must confirm two versions of the same document are structurally identical |
| Text extraction | Fast | Medium: usually ignores layout changes | Verifying semantic content and tables |
| Visual diffing | Slower | High: detects both content and rendering/layout changes | Catching visual regressions |

Comparing document structure

Use PdfDocument.DocumentsAreEqual to compare PDF object graphs, the PDF version, and the document security store (DSS) while ignoring time‑dependent document properties. The method also ignores document metadata, trailer IDs, and other auto‑generated properties.

This method is ideal for PDF document test workflows that must ensure no unexpected objects were added or removed. DocumentsAreEqual supports file and stream overloads and can compare encrypted PDFs.

A complete example demonstrating this technique is available in the Docotic.Pdf samples. In addition to showing how to use the method in regular .NET applications, the sample also demonstrates how to use DocumentsAreEqual in Native AOT applications.

Verifying PDFs via extracted text

Extract text from the entire document at once, or from individual pages one after another, and compare the resulting strings with the baseline. You can use text extraction options to fine-tune the extraction process, for example, to exclude the rectangle that contains the footer. To ease the comparison, you may split the extracted text into lines or words.

For structured checks, first extract text with position, font, and other detailed information about each chunk, word, or character. Then compare each extracted element with the corresponding baseline element.
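
The line-based comparison can be sketched as follows. The helper is library-independent and assumes the baseline and actual text were already extracted into strings. The class and method names are hypothetical and not part of Docotic.Pdf.

```csharp
using System;
using System.Collections.Generic;

public static class TextBaseline
{
    // Compares two extracted texts line by line and returns a
    // human-readable description of every line that differs.
    public static List<string> FindLineDifferences(string baselineText, string actualText)
    {
        string[] expected = SplitLines(baselineText);
        string[] actual = SplitLines(actualText);

        var differences = new List<string>();
        int lineCount = Math.Max(expected.Length, actual.Length);
        for (int i = 0; i < lineCount; i++)
        {
            // A text with fewer lines reports the absent ones as "<missing>".
            string e = i < expected.Length ? expected[i] : "<missing>";
            string a = i < actual.Length ? actual[i] : "<missing>";

            if (!string.Equals(e, a, StringComparison.Ordinal))
                differences.Add($"Line {i + 1}: expected \"{e}\" but found \"{a}\"");
        }

        return differences;
    }

    // Splits on both Windows and Unix line endings so that the
    // comparison does not depend on the platform that produced the text.
    private static string[] SplitLines(string text) =>
        text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
}
```

An empty result means the texts match line for line. In a test, the returned list can serve as the assertion failure message, which makes it easy to see where the generated document diverged from the baseline.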

Detecting visual differences

Start by rendering PDF pages to images, then compare each image with its baseline. Use specialized libraries like ImageSharp.Compare or Magick.NET to detect differences in images.

Prefer strict pixel‑by‑pixel comparison so that every corresponding pixel in both images must match. If your requirements allow for small rendering variations, you can adjust the comparison logic to tolerate minor differences, but exact pixel equality provides the most reliable results.
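
As an illustration of strict versus tolerant comparison, the sketch below counts differing pixels in two raw pixel buffers. It assumes both images were already decoded into byte arrays with the same dimensions and pixel format, for example 4 bytes per pixel for RGBA. Decoding is left to the image library of your choice. The class, method, and parameter names are hypothetical.

```csharp
using System;

public static class PixelComparer
{
    // Counts pixels in which at least one channel differs by more than
    // channelTolerance. A tolerance of 0 gives strict pixel equality.
    public static int CountDifferentPixels(
        byte[] baseline, byte[] actual, int bytesPerPixel, int channelTolerance)
    {
        if (bytesPerPixel <= 0)
            throw new ArgumentOutOfRangeException(nameof(bytesPerPixel));
        if (baseline.Length != actual.Length)
            throw new ArgumentException("Images must have the same size and pixel format.");
        if (baseline.Length % bytesPerPixel != 0)
            throw new ArgumentException("Buffer length must be a multiple of bytesPerPixel.");

        int different = 0;
        for (int offset = 0; offset < baseline.Length; offset += bytesPerPixel)
        {
            for (int channel = 0; channel < bytesPerPixel; channel++)
            {
                int delta = Math.Abs(baseline[offset + channel] - actual[offset + channel]);
                if (delta > channelTolerance)
                {
                    // One differing channel is enough to count the pixel once.
                    different++;
                    break;
                }
            }
        }

        return different;
    }
}
```

With `channelTolerance` set to 0, any non-zero result indicates a visual difference. A small positive tolerance can absorb minor rendering variations at the cost of missing subtle changes.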

Consider using hashing as a fast pre‑check to determine whether two images are likely identical without performing a full pixel comparison. Compute a SHA‑256 hash for each rendered image, and if the hashes match, the images are almost certainly the same. If the hashes differ, then run the full pixel‑by‑pixel comparison.
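
A minimal sketch of the hash pre-check, assuming .NET 5 or later and image data already loaded into byte arrays. The class and method names are hypothetical. Note that hashing encoded image files only works as a pre-check when the encoder output is deterministic; if the encoder embeds timestamps or other variable metadata, hash the decoded pixel data instead.

```csharp
using System;
using System.Security.Cryptography;

public static class ImageHashCheck
{
    // Returns the SHA-256 hash of the data as an uppercase hex string.
    public static string ComputeSha256Hex(byte[] data) =>
        Convert.ToHexString(SHA256.HashData(data));

    // Fast pre-check: true when both buffers produce the same hash.
    // When this returns false, fall back to a full pixel comparison
    // to find out where and how much the images differ.
    public static bool HaveSameHash(byte[] first, byte[] second) =>
        ComputeSha256Hex(first) == ComputeSha256Hex(second);
}
```

Storing the baseline hash next to the baseline image lets a test skip loading and decoding the baseline whenever the hashes match.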

Conclusion

Docotic.Pdf provides a comprehensive, multi‑layered toolkit for creating and manipulating PDFs in .NET. Developers can choose between low‑level control with the Core API, high‑level document generation with the Layout API, or HTML‑to‑PDF conversion for workflows already built around web technologies.

The library also supports image‑based PDFs, template‑driven generation, and a rich set of interactive features such as annotations, links, bookmarks, JavaScript actions, and open actions.

To ensure reliability, Docotic.Pdf includes methods for testing PDF output so that changes in your application do not introduce regressions or unexpected differences.

Written by Vitaliy Shibaev

Vitaliy is a lead developer of Docotic.Pdf and a co-founder of Bit Miracle. He is a proponent of clean code and automated testing.