Releases: ChristophWenk/PDFSorter
Switch PDF parsing approach from text to OCR
The text-based parsing approach did not work for many PDFs; the extracted text was just gibberish. The PDF reader library was therefore switched to PyMuPDF, which renders the PDF pages as images that EasyOCR can then parse.
Installing CUDA is not necessary, but it allows the GPU to be used for processing, which decreases the processing time.
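The render-then-OCR approach described above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in PDFSorter: the function names, the 200 dpi default, and the English language setting are assumptions made for the example. It uses PyMuPDF to turn each page into PNG bytes and EasyOCR to read the text, with GPU use left optional.

```python
def render_pages_as_png(pdf_path, dpi=200):
    """Yield every page of the PDF as PNG bytes (dpi is an assumed default)."""
    import fitz  # PyMuPDF; imported lazily so the helpers below work without it

    with fitz.open(pdf_path) as doc:
        for page in doc:  # iterating the document covers multi-page PDFs
            yield page.get_pixmap(dpi=dpi).tobytes("png")


def make_easyocr_reader(use_gpu=False, languages=("en",)):
    """Return a callable mapping PNG bytes to a list of recognized strings.

    use_gpu=True only helps when CUDA is installed; the CPU path works without it.
    """
    import easyocr

    reader = easyocr.Reader(list(languages), gpu=use_gpu)
    # detail=0 makes readtext return plain strings without boxes or scores
    return lambda png_bytes: reader.readtext(png_bytes, detail=0)


def ocr_pages(png_pages, read_text):
    """Run the OCR callable on each page image and return one string per page."""
    return ["\n".join(read_text(png)) for png in png_pages]


# Hypothetical usage (the file name is a placeholder):
# read_text = make_easyocr_reader(use_gpu=False)
# page_texts = ocr_pages(render_pages_as_png("scan.pdf"), read_text)
```

Passing the OCR step in as a callable keeps the page loop independent of EasyOCR, so it can be exercised with a stub and without a GPU.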
CI and Config Versioning
- CI
- Test setup
- Multi page reading
- Config versioning
- Cleanups
Minor Cleanups for first Release
- Set Dry run to false
- Update Readme
- Conda environment file
PDFSorter Initial Release
1.0.0 License (#6)