Docotic.Pdf 5.7 extracts text better


We have released Docotic.Pdf 5.7 on our site and on NuGet.

In this version we added the ability to extract text as vector paths. To support this, we added the PdfPage.GetObjects(PdfObjectExtractionOptions) method and the PdfObjectExtractionOptions.ExtractTextAsPath property. Please take a look at them.
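As a rough illustration, the new method and property might be used as in the sketch below. It assumes the library is referenced through the BitMiracle.Docotic.Pdf namespace, that PdfObjectExtractionOptions has a parameterless constructor, and that the result of GetObjects can be enumerated. The file name "input.pdf" is a placeholder.

```csharp
using System;
using BitMiracle.Docotic.Pdf;

class ExtractTextAsPathSketch
{
    static void Main()
    {
        // "input.pdf" is a placeholder; use any document that contains text.
        using (PdfDocument pdf = new PdfDocument("input.pdf"))
        {
            // Ask the library to report text as vector paths
            // instead of as text objects.
            PdfObjectExtractionOptions options = new PdfObjectExtractionOptions();
            options.ExtractTextAsPath = true;

            // Assumption: the returned collection is enumerable.
            PdfPage page = pdf.Pages[0];
            int count = 0;
            foreach (var pageObject in page.GetObjects(options))
                count++;

            Console.WriteLine("Objects extracted from the first page: " + count);
        }
    }
}
```

With ExtractTextAsPath left at its default value, the same call should report text as text objects, so comparing the two results is a quick way to see what the option changes.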

We also improved extraction of text with vertical writing mode and fixed some bugs related to text extraction.

There are new features and improvements related to forms. You can now flatten individual form fields using the PdfControl.Flatten() method. The PdfDocument.GetControl method now searches not only by control name but also by full control name. Thanks to our customers, we fixed some bugs related to form filling.
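Flattening a single field might look like the sketch below. The file names and the full name "applicant.name" are hypothetical, and the null check assumes that GetControl returns null when no control matches.

```csharp
using BitMiracle.Docotic.Pdf;

class FlattenFieldSketch
{
    static void Main()
    {
        // "form.pdf" and "applicant.name" are placeholders for illustration.
        using (PdfDocument pdf = new PdfDocument("form.pdf"))
        {
            // GetControl accepts either a control name or a full name.
            PdfControl control = pdf.GetControl("applicant.name");

            // Assumption: a missing control is reported as null.
            if (control != null)
                control.Flatten(); // only this field is flattened

            pdf.Save("form-flattened.pdf");
        }
    }
}
```

The other fields in the document stay interactive, which is the difference from flattening the whole form.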

As our users suggested, in the new version we added the ability to extract file specifications associated with rich media annotations. Take a look at the new PdfRichMediaAnnotation class. We also added the ability to extract the raw contents of XMP metadata using one of the new XmpMetadata.Extract() methods.

As always, we improved support for broken and incorrect documents, and we fixed some bugs of our own.

Read about all new features and improvements in Docotic.Pdf 5.7 in the Version History document.

We encourage you to download and try the new version. This version is also available on NuGet.

Please tell us what you think about the new version by e-mail or via the support form. Don’t hesitate to send us your questions, suggest features or ask for help.
