You can extract text from PDF files using the Docotic.Pdf library.
Text can be extracted one page at a time or from the whole document at once.
The library supports extraction of both plain and formatted text. Additionally, you can extract individual words, characters, or text chunks together with their coordinates.
If you need to perform more sophisticated analysis, you can also extract text, path, and image objects together in one collection.
Extracted images can be saved as TIFF or JPEG files.
The library does not recompress images while extracting them, so you get images with the same quality as in the PDF.
You can also get information about where on a page each image is actually drawn.
You can also extract text as vector paths using the PdfPage.GetObjects(PdfObjectExtractionOptions) overload. This feature can be used to flatten text.
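
The text extraction options described above can be sketched roughly as follows. This is a minimal illustration, not an official sample: the file name "input.pdf" is a placeholder, and the member names used here (PdfDocument.GetText, PdfPage.GetText, PdfPage.GetTextWithFormatting, PdfPage.GetWords, PdfTextData.GetText, PdfTextData.Bounds) are written from memory of the library's public samples and may differ between library versions, so check them against the API reference for the version you use.

```csharp
using System;
using BitMiracle.Docotic.Pdf;

class ExtractTextSketch
{
    static void Main()
    {
        // "input.pdf" is a placeholder path, not a file shipped with the library.
        using (var pdf = new PdfDocument("input.pdf"))
        {
            // Plain text of the whole document at once.
            Console.WriteLine(pdf.GetText());

            // Plain and formatted text of a single page.
            PdfPage page = pdf.Pages[0];
            Console.WriteLine(page.GetText());
            Console.WriteLine(page.GetTextWithFormatting());

            // Separate words with their coordinates on the page.
            // Assumption: PdfTextData exposes GetText() and Bounds in your version.
            foreach (PdfTextData word in page.GetWords())
                Console.WriteLine($"{word.GetText()} at {word.Bounds}");
        }
    }
}
```

Running the sketch requires a reference to the Docotic.Pdf package and an existing PDF file at the given path.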