Improved extraction of text and images, PDF rasterizer and other improvements in Docotic.Pdf 3.4

Published on 27 April 2012

Hello!

We have released a new version of the Docotic.Pdf library.

Version 3.4 adds a major new feature: a PDF rasterizer. The library can now be used to draw and print PDF documents, and you can save images of document pages in PNG and JPEG formats. Take a look at the PdfPage.Save and PdfPage.Draw methods. You might find the new group of samples interesting, too.

This version also improves support for extracting text and images. You can now extract text as a collection of words (with their bounding rectangles) or even as individual characters. We added a new sample, Extract text by words, that demonstrates how to do this.

From now on, the library can also be used to extract page objects: you can get a collection of the text and image objects on a page and perform sophisticated analysis of what’s drawn there. Take a look at the Extract text and images sample to get an idea of what information can be retrieved.

The new version adds support for extracting previously unsupported image types. You might also be interested in the new ability to scale and resize existing images in PDF documents, which is useful for optimizing existing documents.

As with any release of Docotic.Pdf, we also fixed some bugs. This version fixes bugs related to opening existing PDF documents and processing fonts and images. We also made the library use less time and memory when opening existing PDFs.

Read about all the new features and improvements in Docotic.Pdf 3.4 in the Version History document.

As always, we encourage you to download and try the new version.

Please write to us about your findings with Docotic.Pdf by e-mail or via the support form. Don’t hesitate to send us your questions or ask for help.