Releases: ChristophWenk/PDFSorter

Switch PDF parsing approach from text to OCR

10 Nov 20:01
e98bb79

The text-based parsing approach did not work for many PDFs; it produced only gibberish. The PDF reader library was therefore switched to PyMuPDF, which renders the PDF pages as images that EasyOCR can then parse.

Installing CUDA is not required, but it enables GPU processing, which reduces the processing time.
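The render-then-OCR flow described in this release can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `ocr_pdf` and `merge_fragments`, the `dpi` default, and the language list are assumptions made for the example. It relies on PyMuPDF's `get_pixmap` and `tobytes` and on EasyOCR's `Reader.readtext`.

```python
from typing import Iterable, List


def merge_fragments(fragments: Iterable[str]) -> str:
    """Join OCR text fragments into one string, dropping empty pieces."""
    cleaned = (fragment.strip() for fragment in fragments)
    return " ".join(piece for piece in cleaned if piece)


def ocr_pdf(pdf_path: str, languages=("en",), use_gpu: bool = False,
            dpi: int = 200) -> List[str]:
    """Return one OCR'd text string per page (hypothetical helper)."""
    # Imported lazily so merge_fragments works without the OCR stack installed.
    import fitz  # PyMuPDF
    import easyocr

    # gpu=True only helps when a CUDA-enabled PyTorch build is available.
    reader = easyocr.Reader(list(languages), gpu=use_gpu)

    pages = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Render the page to a raster image, then encode it as PNG bytes.
            pixmap = page.get_pixmap(dpi=dpi)
            png_bytes = pixmap.tobytes("png")
            # detail=0 makes readtext return plain strings without boxes.
            fragments = reader.readtext(png_bytes, detail=0)
            pages.append(merge_fragments(fragments))
    return pages
```

Calling `ocr_pdf("scan.pdf", use_gpu=True)` (with a placeholder file name) would request GPU processing; with `use_gpu=False` the same code runs on the CPU, only more slowly.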

CI and Config Versioning

12 Aug 07:12
e12251c
  • CI
  • Test setup
  • Multi-page reading
  • Config versioning
  • Cleanups

Minor Cleanups for first Release

10 Jul 14:35
f6143e3
  • Set Dry run to false
  • Update Readme
  • Conda environment file

PDFSorter Initial Release

10 Jul 11:31
1ac79f6
1.0.0

License (#6)