Merge PDF documents in C# and VB.NET

Businesses often merge PDF files for document archiving. PDF merging sounds like a simple task, but it has many pitfalls. You need to combine form fields, bookmarks, layers, and other PDF objects properly. You should also avoid duplicate objects to keep the output file compact.

The Docotic.Pdf library handles these merging nuances. It allows you to combine PDF documents in just a few lines of C# or VB.NET code.


Docotic.Pdf comes with free and paid licenses. You can download the library and get an evaluation license key on the Docotic.Pdf download page.

Docotic.Pdf library version: 9.4.17467-dev. Regression tests: 14,760 passed. Total NuGet downloads: 4,415,970.

PDF merging basics

The PdfDocument.Append methods allow you to append PDF documents from files, streams, or byte arrays. There are also options for appending protected files and for merging form fields.

Combine two PDF files

This sample code shows how to merge PDF files in C#:

using var pdf = new PdfDocument("first.pdf");
pdf.Append("second.pdf");
pdf.Save("merged.pdf");

Try the Merge two PDF documents code sample from GitHub.

Combine PDF streams

It is easy to adapt the previous example to work with streams instead of file paths. Here is the helper method for merging streams:

void Merge(Stream first, Stream second, Stream result)
{
    using var pdf = new PdfDocument(first);
    pdf.Append(second);
    pdf.Save(result);
}
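The same helper also covers in-memory data. The following is a minimal sketch that wraps byte arrays in MemoryStream objects and reuses Merge. MergeBytes is a hypothetical name, not part of the library, and the Merge helper is repeated so the snippet compiles on its own. It assumes that the library is referenced and that a license is already set up.

```csharp
using System.IO;
using BitMiracle.Docotic.Pdf;

// Hypothetical usage (file names are placeholders):
// byte[] merged = MergeBytes(File.ReadAllBytes("first.pdf"), File.ReadAllBytes("second.pdf"));
// File.WriteAllBytes("merged.pdf", merged);

// Hypothetical helper: merges two PDF documents held in memory.
static byte[] MergeBytes(byte[] first, byte[] second)
{
    using var firstStream = new MemoryStream(first);
    using var secondStream = new MemoryStream(second);
    using var result = new MemoryStream();

    Merge(firstStream, secondStream, result);

    // ToArray copies the whole buffer regardless of the stream position
    return result.ToArray();
}

// The stream helper from the article
static void Merge(Stream first, Stream second, Stream result)
{
    using var pdf = new PdfDocument(first);
    pdf.Append(second);
    pdf.Save(result);
}
```

This keeps all PDF handling in one place: any source that can produce a Stream or a byte array can go through the same Merge method.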

Combine multiple PDF files

You can repeatedly call the Append method to append multiple PDF files:

string[] filesToMerge = ..;
using var pdf = new PdfDocument();
foreach (string file in filesToMerge)
    pdf.Append(file);

// Remove the empty page added by the PdfDocument() call
pdf.RemovePage(0);

pdf.Save(pathToFile);
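One possible variation is to open the first file directly and append the rest, so there is no empty page to remove. The sketch below uses only the constructor, Append, and Save calls shown above. MergeFiles is a hypothetical helper name, and the guard for an empty list is an assumption about how you might want to handle that case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using BitMiracle.Docotic.Pdf;

// Hypothetical usage (file names are placeholders):
// MergeFiles(new[] { "a.pdf", "b.pdf", "c.pdf" }, "merged.pdf");

// Hypothetical helper: merges the files in the given order into outputPath.
static void MergeFiles(IReadOnlyList<string> files, string outputPath)
{
    if (files.Count == 0)
        throw new ArgumentException("Nothing to merge.", nameof(files));

    // The first file becomes the base document, so no empty page is created
    using var pdf = new PdfDocument(files[0]);
    foreach (string file in files.Skip(1))
        pdf.Append(file);

    pdf.Save(outputPath);
}
```

Note that with this approach the merged document starts from the first file rather than from a new empty document, which may or may not be what you want.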

Combine encrypted PDF files

There are Append overloads for merging encrypted documents:

using var pdf = new PdfDocument();
pdf.Append("encrypted.pdf", new PdfStandardDecryptionHandler("password"));
pdf.Save("merged.pdf");

You can find more information in the Decrypt PDF documents in C# and VB.NET article.

Combine PDF forms

Every form field in a PDF document must have a unique name. That might lead to a problem when the documents to merge contain fields with the same names. Docotic.Pdf provides the following merge strategies for conflicting form controls:

  • Rename appended controls when they conflict with existing controls
  • Merge appended controls to existing controls
  • Flatten appended controls
  • Do not append any controls
  • Append controls as is

By default, the library renames appended controls on conflict. You can choose an alternative strategy with the PdfMergingOptions class:

using var pdf = new PdfDocument("form.pdf");

var decryptionHandler = new PdfStandardDecryptionHandler(string.Empty);
var mergingOptions = new PdfMergingOptions()
{
    ControlMergingMode = PdfControlMergingMode.CopyAsKids
};
pdf.Append("form.pdf", decryptionHandler, mergingOptions);

pdf.Save("merged.pdf");

With the CopyAsKids mode, the library merges and synchronizes the conflicting controls. That is, when you change the value of one control, the other control gets the same value.

Reduce merged PDF file size

PDF documents may contain identical objects, like fonts or images. When you merge such documents, the resulting document will contain copies of the same objects. Use the PdfDocument.ReplaceDuplicateObjects() method to optimize the merge result:

using var pdf = new PdfDocument("2024-05-28.pdf");
pdf.Append("2024-05-29.pdf");

pdf.ReplaceDuplicateObjects();

pdf.Save("merged.pdf");
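To check how much the call saves on your own files, you can compare the output sizes with and without it. The following is a rough sketch that merges the same inputs twice and saves each result to a MemoryStream. MergedSize is a hypothetical helper name, and the numbers depend entirely on your input documents.

```csharp
using System;
using System.IO;
using BitMiracle.Docotic.Pdf;

// Hypothetical usage (file names are placeholders):
// string[] inputs = { "2024-05-28.pdf", "2024-05-29.pdf" };
// Console.WriteLine($"Plain merge: {MergedSize(inputs, false)} bytes");
// Console.WriteLine($"Deduplicated: {MergedSize(inputs, true)} bytes");

// Hypothetical helper: merges the files in memory and returns the output size in bytes.
static long MergedSize(string[] files, bool deduplicate)
{
    using var pdf = new PdfDocument(files[0]);
    for (int i = 1; i < files.Length; ++i)
        pdf.Append(files[i]);

    if (deduplicate)
        pdf.ReplaceDuplicateObjects();

    using var output = new MemoryStream();
    pdf.Save(output);
    return output.Length;
}
```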

You may reduce the output file size even more. For example, you can remove unused font glyphs or compress images. Read about supported compression options in the Compress PDF documents in C# and VB.NET article.

Customize PDF merging

Docotic.Pdf provides methods for extracting, reordering, or removing PDF pages. You can use them with the Append method to implement custom PDF merging tasks.

Append specific PDF pages

Docotic.Pdf also allows you to merge only a part of a PDF document. There are different ways to do that. For example, you can extract the pages you need from the other document and append only the extracted pages. The following C# helper appends selected pages to the PdfDocument:

private static void AppendPart(PdfDocument pdf, string filePath, params int[] pagesToAppend)
{
    using var streamToAppend = new MemoryStream();
    using var other = new PdfDocument(filePath);
    using var extracted = other.CopyPages(pagesToAppend);
    var options = new PdfSaveOptions
    {
        UseObjectStreams = false
    };
    extracted.Save(streamToAppend, options);

    pdf.Append(streamToAppend);
}

Or you can append an entire PDF document and then remove the unnecessary pages. The following code sample appends only the first two pages of second.pdf:

using var pdf = new PdfDocument(@"first.pdf");

int pageCountBefore = pdf.PageCount;
pdf.Append(@"second.pdf");
pdf.RemovePages(pageCountBefore + 2);

pdf.Save(pathToFile);

Another option is PDF imposition. You can read about it in the Impose PDF section.

Prepend PDF

The Append methods always add pages to the end of the current document. How do you merge PDF files in a different order? Sometimes you can simply change the order of the Append calls. That is, use

pdf.Append("first.pdf");
pdf.Append("second.pdf");

instead of

pdf.Append("second.pdf");
pdf.Append("first.pdf");

Or you can reorder pages after merging. This C# code moves the appended PDF document to the beginning:

using var pdf = new PdfDocument(@"second.pdf");

int pageCountBefore = pdf.PageCount;
pdf.Append(@"first.pdf");
pdf.MovePages(pageCountBefore, pdf.PageCount - pageCountBefore, 0);

pdf.Save(pathToFile);

Related code samples for reordering PDF pages are available in the Docotic.Pdf samples repository on GitHub.

Impose PDF

Docotic.Pdf allows you to combine multiple PDF pages on a single page. Use the PdfDocument.CreateXObject(PdfPage) method to create a PdfXObject object based on an existing page. Then draw this object with the desired scaling. Sample code:

using var src = new PdfDocument(@"src.pdf");
using var dest = new PdfDocument();
PdfXObject firstXObject = dest.CreateXObject(src.Pages[0]);
PdfXObject secondXObject = dest.CreateXObject(src.Pages[1]);

PdfPage page = dest.Pages[0];
page.Orientation = PdfPaperOrientation.Landscape;
double halfOfPage = page.Width / 2;
page.Canvas.DrawXObject(firstXObject, 0, 0, halfOfPage, page.Height, 0);
page.Canvas.DrawXObject(secondXObject, halfOfPage, 0, halfOfPage, page.Height, 0);

dest.Save("result.pdf");

Test the related Create XObject from page sample project from GitHub.
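The same approach extends to a grid of pages. The sketch below places up to columns × rows source pages on one page, using only the CreateXObject and DrawXObject calls from the sample above. ImposeGrid and CellOrigin are hypothetical helper names. The sketch assumes that the canvas origin is the top-left corner, so pages fill the grid left to right and top to bottom, and, like the sample above, it stretches each page to its cell without preserving the aspect ratio.

```csharp
using System;
using BitMiracle.Docotic.Pdf;

// Hypothetical usage (file names are placeholders):
// ImposeGrid("src.pdf", "result.pdf", columns: 2, rows: 2);

// Hypothetical helper: draws the first pages of the source document in a grid.
static void ImposeGrid(string sourcePath, string outputPath, int columns, int rows)
{
    using var src = new PdfDocument(sourcePath);
    using var dest = new PdfDocument();

    PdfPage page = dest.Pages[0];
    double cellWidth = page.Width / columns;
    double cellHeight = page.Height / rows;

    // Do not read past the end of the source document
    int count = Math.Min(columns * rows, src.PageCount);
    for (int i = 0; i < count; ++i)
    {
        PdfXObject xObject = dest.CreateXObject(src.Pages[i]);
        var (x, y) = CellOrigin(i, columns, cellWidth, cellHeight);
        page.Canvas.DrawXObject(xObject, x, y, cellWidth, cellHeight, 0);
    }

    dest.Save(outputPath);
}

// Returns the corner of the grid cell with the given zero-based index.
static (double X, double Y) CellOrigin(int index, int columns, double cellWidth, double cellHeight)
    => ((index % columns) * cellWidth, (index / columns) * cellHeight);
```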

Merge as attachment

Sometimes you may need to embed a PDF file in another one as an attachment. Docotic.Pdf supports that too. You can also add links to the embedded files on PDF pages:

using var pdf = new PdfDocument();

PdfFileSpecification first = pdf.CreateFileAttachment("first.pdf");
pdf.SharedAttachments.Add(first);

var bounds = new PdfRectangle(20, 70, 100, 100);
PdfFileSpecification fs = pdf.CreateFileAttachment("second.pdf");
pdf.Pages[0].AddFileAnnotation(bounds, fs);

pdf.Save("attachments.pdf");

You can find related code samples in the PDF attachments group on GitHub.

Merge in parallel threads

When you merge many PDF files, you can parallelize the work. The PdfDocument class is not thread-safe, so each thread needs its own PdfDocument objects. Look at the Merge PDF documents in parallel threads code sample for more detail.

This code shows how you can combine PDF streams in parallel:

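// Partitioner requires the System.Collections.Concurrent namespace,
// Parallel requires the System.Threading.Tasks namespace
Stream[] documentsToMerge = ..;

// Maximum number of documents merged by a single task
int rangeSize = 50;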
while (documentsToMerge.Length > rangeSize)
{
    int partitionCount = (int)Math.Ceiling(documentsToMerge.Length / (double)rangeSize);
    var result = new Stream[partitionCount];

    var partitioner = Partitioner.Create(0, documentsToMerge.Length, rangeSize);
    Parallel.ForEach(partitioner, range =>
    {
        int startIndex = range.Item1;
        int count = range.Item2 - range.Item1;
        result[startIndex / rangeSize] = MergeToStream(documentsToMerge, startIndex, count);
    });
    documentsToMerge = result;
}

using PdfDocument final = GetMergedDocument(documentsToMerge, 0, documentsToMerge.Length);
final.Save("merged.pdf");


private static Stream MergeToStream(Stream[] streams, int startIndex, int count)
{
    using PdfDocument pdf = GetMergedDocument(streams, startIndex, count);

    var result = new MemoryStream();

    var options = new PdfSaveOptions
    {
        UseObjectStreams = false // speed up writing of intermediate documents
    };
    pdf.Save(result, options);
    return result;
}

private static PdfDocument GetMergedDocument(Stream[] streams, int startIndex, int count)
{
    var pdf = new PdfDocument();
    try
    {
        for (int i = 0; i < count; ++i)
        {
            var s = streams[startIndex + i];
            pdf.Append(s);
            s.Dispose();
        }

        // Remove the empty page added by the PdfDocument() call
        pdf.RemovePage(0);

        pdf.ReplaceDuplicateObjects();

        return pdf;
    }
    catch
    {
        pdf.Dispose();
        throw;
    }
}

The code above splits the input documents into groups of rangeSize documents. Then it merges each group into an intermediate document, processing the groups in parallel. The process repeats until the number of remaining documents is small enough for a simple sequential merge.
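To get a feel for how many rounds the loop performs, you can replay its arithmetic without touching any PDF files. The sketch below repeats the same Math.Ceiling calculation for a given number of inputs. IntermediateCounts is a hypothetical helper name, and the guard against a rangeSize below 2 is an addition, because the loop would never finish with a group size of 1.

```csharp
using System;
using System.Collections.Generic;

// 12,000 inputs with groups of 50: the first round produces 240 intermediate
// documents, the second round produces 5, and those are merged sequentially.
Console.WriteLine(string.Join(" -> ", IntermediateCounts(12000, 50))); // prints "240 -> 5"

// Hypothetical helper: returns the number of documents left after each parallel round.
static List<int> IntermediateCounts(int documentCount, int rangeSize)
{
    if (rangeSize < 2)
        throw new ArgumentOutOfRangeException(nameof(rangeSize));

    var counts = new List<int>();
    while (documentCount > rangeSize)
    {
        // Same partition count formula as in the merging loop
        documentCount = (int)Math.Ceiling(documentCount / (double)rangeSize);
        counts.Add(documentCount);
    }
    return counts;
}
```

When the number of inputs does not exceed rangeSize, the list is empty: no parallel round runs and all documents go straight to the final sequential merge.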

A parallel solution is not necessarily faster than the single-threaded version. Results vary depending on the number of input documents and their sizes. The optimal rangeSize value for your data might be greater or less than the 50 used in the sample code. Benchmark your application to find the most effective implementation.

Conclusion

You can use the Docotic.Pdf library to combine PDF documents in C# and VB.NET. It allows you to merge files, streams, or byte arrays. You can merge encrypted files, PDF forms, and specific PDF pages. Docotic.Pdf also helps you compress the resulting files and save disk space.

Try code samples from the Docotic.Pdf samples repository on GitHub. You can get an evaluation license key and download the library on the Docotic.Pdf download page.