Caution
This project is dedicated to helping analyze the massive amounts of data released every semester by Texas A&M University's Registrar's office.
- Parse grade report PDFs published by Texas A&M University's Registrar:
- Automatically adds parsed course data to a mySQL database.
-
Grade report parsing -
Automatically add data to a mySQL database backend -
Full rewrite with multithreading -
Fully automated grade report updates (auto add new reports)
- Open MySQL:
# open sql prompt $ sudo mysql
- Create mySQL database and user:
mysql> CREATE DATABASE database_name_here; mysql> CREATE USER 'database_user_name_here'@'localhost' IDENTIFIED BY 'database_user_password_here'; mysql> GRANT ALL PRIVILEGES ON database_name_here.* TO 'database_user_name_here'@'localhost'; mysql> FLUSH PRIVILEGES; mysql> exit;
- Install dependencies:
# automatically install python dependencies $ screen -SRD tamugd-parser $ python3 -m venv venv $ source venv/bin/activate $ pip install -r requirements.txt
- Generate prefs.json and update file:
# run in tamugd-parser/ $ python3 src/main.py $ nano prefs.json
By default, this tool only processes PDFs that are CURRENTLY available from the Registrar but it supports parsing PDFs from 2012 onwards (perhaps also older ones, but they have not been tested). It will automatically download PDFs as needed. If you have additional PDFs you can supply your own in the pdfs/
folder however you must use the -s or --start-year
flag.
Run python3 src/main.py --help
for more detailed information.
# Process all PDFs from the Registrar
$ python3 src/main.py
# Process PDFs from 2014 to present
$ python3 src/main.py --start-year 2014
# Process PDFs from 2014 to 2018
$ python3 src/main.py --start-year 2014 --end-year 2018
Once the script is running you can monitor its progress by using the following command:
# Building the database will take a while...
# Detach screen with CTRL+A then CTRL+D while running the main script
# Then run this to display a live feed from the newest logfile
$ cd logs && tail -f $(ls -t | head -1)