Merge PDF documents in C# and VB.NET
Businesses often merge PDF files for document archiving. PDF merging sounds like a simple task, but it has many pitfalls. You need to combine form fields, bookmarks, layers, and other PDF objects properly. You should also avoid duplicate objects to keep the output file compact.
The Docotic.Pdf library handles these merging nuances. It allows you to combine PDF documents in just a few lines of C# or VB.NET code.
Docotic.Pdf comes with free and paid licenses. Get the library and a free time-limited license key on the Download C# .NET PDF library page.
PDF merging basics
The PdfDocument.Append methods allow you to append PDF documents from files, streams, or byte arrays. There are also options for appending protected files and for merging form fields.
Combine two PDF files
This sample code shows how to merge PDF files in C#:
using var pdf = new PdfDocument("first.pdf");
pdf.Append("second.pdf");
pdf.Save("merged.pdf");
Try the Merge two PDF documents code sample from GitHub.
Combine PDF streams
It is easy to adapt the previous example to work with streams instead of file paths. Here is the helper method for merging streams:
void Merge(Stream first, Stream second, Stream result)
{
    using var pdf = new PdfDocument(first);
    pdf.Append(second);
    pdf.Save(result);
}
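If your documents are already in memory as byte arrays, one option is to wrap them in MemoryStream objects and reuse the stream-based calls shown above. The following is a minimal sketch under that assumption. The MergeBytes name is hypothetical and is not part of the library.

```csharp
using System.IO;
using BitMiracle.Docotic.Pdf;

// Hypothetical helper: merges two in-memory PDF files and returns the result as bytes.
// It relies only on the stream-based constructor, Append, and Save calls shown above.
static byte[] MergeBytes(byte[] first, byte[] second)
{
    using var firstStream = new MemoryStream(first);
    using var secondStream = new MemoryStream(second);
    using var result = new MemoryStream();

    using var pdf = new PdfDocument(firstStream);
    pdf.Append(secondStream);
    pdf.Save(result);

    // Copy the bytes out before the streams are disposed at the end of the method.
    return result.ToArray();
}
```

The input streams stay open until the document is saved, which avoids any assumption about when the library reads the source data.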
Combine multiple PDF files
You can repeatedly call the Append method to append multiple PDF files:
string[] filesToMerge = ..;
using var pdf = new PdfDocument();
foreach (string file in filesToMerge)
    pdf.Append(file);

// Remove the empty page added by the PdfDocument() call
pdf.RemovePage(0);
pdf.Save(pathToFile);
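The pages in the merged file follow the order of the array, so it helps to build that array in a predictable order. The sketch below shows one possible way to collect the files of a folder with standard .NET calls only. The GetFilesToMerge name and the choice of sorting by file name are assumptions for illustration, not something the library requires.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: returns the *.pdf files of a directory sorted by file name.
// Directory.EnumerateFiles does not guarantee any order, so the sort is explicit.
static string[] GetFilesToMerge(string directory)
{
    return Directory.EnumerateFiles(directory, "*.pdf")
        .OrderBy(path => Path.GetFileName(path), StringComparer.OrdinalIgnoreCase)
        .ToArray();
}

// Possible usage with a hypothetical folder name:
// string[] filesToMerge = GetFilesToMerge("invoices");
```

Note that this ordinal sort places 10.pdf before 2.pdf. Use zero-padded names or a custom comparer if you need numeric order.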
Combine encrypted PDF files
There are Append overloads for merging encrypted documents:
using var pdf = new PdfDocument();
pdf.Append("encrypted.pdf", new PdfStandardDecryptionHandler("password"));
pdf.Save("merged.pdf");
You can find more information in the Decrypt PDF documents in C# and VB.NET article.
Combine PDF forms
Every form field in a PDF document must have a unique name. That might lead to a problem when the documents to merge contain fields with the same names. Docotic.Pdf provides the following merge strategies for conflicting form controls:
- Rename appended controls when they conflict with existing controls
- Merge appended controls to existing controls
- Flatten appended controls
- Do not append any controls
- Append controls as is
By default, the library renames appended controls on conflict. You can choose an alternative strategy with the PdfMergingOptions class:
using var pdf = new PdfDocument("form.pdf");
var decryptionHandler = new PdfStandardDecryptionHandler(string.Empty);
var mergingOptions = new PdfMergingOptions()
{
    ControlMergingMode = PdfControlMergingMode.CopyAsKids
};
pdf.Append("form.pdf", decryptionHandler, mergingOptions);
pdf.Save("merged.pdf");
With the CopyAsKids mode, the library merges and synchronizes the conflicting controls. That is, when you change the value of one control, the other one gets the same value.
Reduce merged PDF file size
PDF documents may contain identical objects, like fonts or images. When you merge such documents, the resulting document will contain copies of the same objects. Use the PdfDocument.ReplaceDuplicateObjects() method to optimize the merge result:
using var pdf = new PdfDocument("2024-05-28.pdf");
pdf.Append("2024-05-29.pdf");
pdf.ReplaceDuplicateObjects();
pdf.Save("merged.pdf");
You may reduce the output file size even more. For example, you can remove unused font glyphs or compress images. Read about supported compression options in the Compress PDF documents in C# and VB.NET article.
Customize PDF merging
Docotic.Pdf provides methods for extracting, reordering, or removing PDF pages. You can use them with the Append method to implement custom PDF merging tasks.
Append specific PDF pages
Docotic.Pdf also allows you to merge a part of a PDF document. There are different ways to do that. For example, you can split an added PDF document and append extracted pages. The following C# helper appends selected pages to the PdfDocument:
private static void AppendPart(PdfDocument pdf, string filePath, params int[] pagesToAppend)
{
    using var streamToAppend = new MemoryStream();
    using var other = new PdfDocument(filePath);
    using var extracted = other.CopyPages(pagesToAppend);
    var options = new PdfSaveOptions
    {
        UseObjectStreams = false
    };
    extracted.Save(streamToAppend, options);
    pdf.Append(streamToAppend);
}
Or you can append an entire PDF document and remove unnecessary pages. The following code sample appends the first two pages of second.pdf:
using var pdf = new PdfDocument(@"first.pdf");
int pageCountBefore = pdf.PageCount;
pdf.Append(@"second.pdf");
pdf.RemovePages(pageCountBefore + 2);
pdf.Save(pathToFile);
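The same index arithmetic extends to keeping an arbitrary set of appended pages. The sketch below only computes which zero-based page indexes of the merged document would have to be removed. It does not call the library, and the GetAppendedPagesToRemove name is hypothetical. Check the library reference for a page-removal overload that accepts such a list before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: lists the zero-based indexes (in the merged document) of the
// appended pages that are NOT in pagesToKeep. The pagesToKeep values are zero-based
// indexes within the appended document.
static int[] GetAppendedPagesToRemove(int pageCountBefore, int appendedPageCount, params int[] pagesToKeep)
{
    var keep = new HashSet<int>(pagesToKeep);
    return Enumerable.Range(0, appendedPageCount)
        .Where(i => !keep.Contains(i))
        .Select(i => pageCountBefore + i)
        .ToArray();
}

// first.pdf has 3 pages, second.pdf has 5, and only the first two appended pages stay.
Console.WriteLine(string.Join(", ", GetAppendedPagesToRemove(3, 5, 0, 1))); // prints "5, 6, 7"
```

The first index to remove is pageCountBefore + 2, which is the same starting point the sample above uses.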
Another solution relates to PDF imposition. You can read about that in the corresponding section.
Prepend PDF
The Append methods always append pages to the end of the current document. How do you merge PDF files in a different order? Sometimes you can change the order of the Append calls. That is, use
pdf.Append("first.pdf");
pdf.Append("second.pdf");
instead of
pdf.Append("second.pdf");
pdf.Append("first.pdf");
Or you can reorder pages after merging. This C# code moves the appended PDF document to the beginning:
using var pdf = new PdfDocument(@"second.pdf");
int pageCountBefore = pdf.PageCount;
pdf.Append(@"first.pdf");
pdf.MovePages(pageCountBefore, pdf.PageCount - pageCountBefore, 0);
pdf.Save(pathToFile);
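To see what the index arguments mean, you can model the merged document as a plain list of page labels. This is only a sketch of the intended outcome, not the library's implementation. The MoveBlock name and the page labels are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Models moving a contiguous block of pages to a new position in a list of page labels.
// destIndex is interpreted against the list after the block has been taken out. For the
// destination 0 used in the sample above, this distinction does not matter.
static List<string> MoveBlock(List<string> pages, int startIndex, int count, int destIndex)
{
    var result = new List<string>(pages);
    List<string> block = result.GetRange(startIndex, count);
    result.RemoveRange(startIndex, count);
    result.InsertRange(destIndex, block);
    return result;
}

// second.pdf (2 pages) was opened first, then first.pdf (3 pages) was appended.
var merged = new List<string> { "second-1", "second-2", "first-1", "first-2", "first-3" };
int pageCountBefore = 2;
List<string> reordered = MoveBlock(merged, pageCountBefore, merged.Count - pageCountBefore, 0);
Console.WriteLine(string.Join(", ", reordered));
// prints "first-1, first-2, first-3, second-1, second-2"
```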
For more information about reordering PDF pages, read:
Impose PDF
Docotic.Pdf allows you to combine multiple PDF pages on a single page. Use the PdfDocument.CreateXObject(PdfPage) method to create a PdfXObject object based on an existing page. Then, draw this object with desired scaling. Sample code:
using var src = new PdfDocument(@"src.pdf");
using var dest = new PdfDocument();
PdfXObject firstXObject = dest.CreateXObject(src.Pages[0]);
PdfXObject secondXObject = dest.CreateXObject(src.Pages[1]);
PdfPage page = dest.Pages[0];
page.Orientation = PdfPaperOrientation.Landscape;
double halfOfPage = page.Width / 2;
page.Canvas.DrawXObject(firstXObject, 0, 0, halfOfPage, page.Height, 0);
page.Canvas.DrawXObject(secondXObject, halfOfPage, 0, halfOfPage, page.Height, 0);
dest.Save("result.pdf");
Test the related Create XObject from page sample project from GitHub.
Merge as attachment
Sometimes you need to embed a PDF file in another one as an attachment. That is also possible. You can also add links to the embedded file on PDF pages:
using var pdf = new PdfDocument();
PdfFileSpecification first = pdf.CreateFileAttachment("first.pdf");
pdf.SharedAttachments.Add(first);
var bounds = new PdfRectangle(20, 70, 100, 100);
PdfFileSpecification fs = pdf.CreateFileAttachment("second.pdf");
pdf.Pages[0].AddFileAnnotation(bounds, fs);
pdf.Save("attachments.pdf");
You can find related code samples in the PDF attachments group on GitHub.
Merge in parallel threads
When merging many PDF files, it is possible to parallelize the code. The PdfDocument class is not thread-safe, so you need to use separate PdfDocument objects in different threads. Look at the Merge PDF documents in parallel threads code sample for more detail.
This code shows how you can combine PDF streams in parallel:
Stream[] documentsToMerge = ..;
int rangeSize = 50;
while (documentsToMerge.Length > rangeSize)
{
    int partitionCount = (int)Math.Ceiling(documentsToMerge.Length / (double)rangeSize);
    var result = new Stream[partitionCount];
    var partitioner = Partitioner.Create(0, documentsToMerge.Length, rangeSize);
    Parallel.ForEach(partitioner, range =>
    {
        int startIndex = range.Item1;
        int count = range.Item2 - range.Item1;
        result[startIndex / rangeSize] = MergeToStream(documentsToMerge, startIndex, count);
    });
    documentsToMerge = result;
}

using PdfDocument final = GetMergedDocument(documentsToMerge, 0, documentsToMerge.Length);
final.Save("merged.pdf");

private static Stream MergeToStream(Stream[] streams, int startIndex, int count)
{
    using PdfDocument pdf = GetMergedDocument(streams, startIndex, count);
    var result = new MemoryStream();
    var options = new PdfSaveOptions
    {
        UseObjectStreams = false // speed up writing of intermediate documents
    };
    pdf.Save(result, options);
    return result;
}

private static PdfDocument GetMergedDocument(Stream[] streams, int startIndex, int count)
{
    var pdf = new PdfDocument();
    try
    {
        for (int i = 0; i < count; ++i)
        {
            var s = streams[startIndex + i];
            pdf.Append(s);
            s.Dispose();
        }
        pdf.RemovePage(0);
        pdf.ReplaceDuplicateObjects();
        return pdf;
    }
    catch
    {
        pdf.Dispose();
        throw;
    }
}
The above code splits the input documents into groups of rangeSize documents. Then, the code merges each group into an intermediate document in parallel. The process continues until the number of input documents is small enough for simple merging.
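To get a feeling for how many rounds this takes, you can replay the loop arithmetic without touching any PDF files. The sketch below reuses the loop condition and the Math.Ceiling expression from the sample. The PlanRounds name and the rangeSize guard are additions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Returns the number of intermediate documents produced in each parallel round.
// Mirrors the loop condition and the Math.Ceiling arithmetic of the sample above.
static List<int> PlanRounds(int documentCount, int rangeSize)
{
    if (rangeSize < 2)
        throw new ArgumentOutOfRangeException(nameof(rangeSize),
            "Groups of fewer than two documents never shrink the input.");

    var rounds = new List<int>();
    while (documentCount > rangeSize)
    {
        int partitionCount = (int)Math.Ceiling(documentCount / (double)rangeSize);
        rounds.Add(partitionCount);
        documentCount = partitionCount;
    }
    return rounds;
}

// 10,000 inputs in groups of 50: 200 intermediate documents, then 4.
Console.WriteLine(string.Join(" -> ", PlanRounds(10_000, 50))); // prints "200 -> 4"
```

In this example, the final single-threaded merge combines the 4 documents left after the second round.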
A parallel solution is not necessarily faster than the single-threaded version. Results may vary depending on the number of input documents and their sizes. In the sample code, the optimal value of the rangeSize parameter might be greater or less than 50. You should benchmark your application to find the most effective implementation.
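A simple way to start benchmarking is to time the same merge with several rangeSize values. The following sketch uses only System.Diagnostics.Stopwatch. The Measure and MergeWithRangeSize names are hypothetical, and MergeWithRangeSize is an empty placeholder for your own merge routine.

```csharp
using System;
using System.Diagnostics;

// Times a single run of the given action with a wall-clock stopwatch.
static TimeSpan Measure(Action action)
{
    var stopwatch = Stopwatch.StartNew();
    action();
    stopwatch.Stop();
    return stopwatch.Elapsed;
}

// Hypothetical placeholder: replace the body with your own merge routine that
// takes rangeSize as a parameter, such as the parallel loop shown above.
static void MergeWithRangeSize(int rangeSize)
{
}

// The candidate values are arbitrary examples, not recommendations.
foreach (int rangeSize in new[] { 10, 50, 200 })
{
    TimeSpan elapsed = Measure(() => MergeWithRangeSize(rangeSize));
    Console.WriteLine($"rangeSize={rangeSize}: {elapsed.TotalMilliseconds:F0} ms");
}
```

Run each candidate several times on representative input and ignore the first run, because JIT compilation and file caching can distort it.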
Conclusion
You can use the Docotic.Pdf library to combine PDF documents in C# and VB.NET. It allows you to merge files, streams, or byte arrays. You can merge encrypted files, PDF forms, and specific PDF pages. Docotic.Pdf also helps you compress the resulting files and save disk space.
Try code samples from the Docotic.Pdf samples repository on GitHub. You can get an evaluation license key and download the library on the Docotic.Pdf download page.