This repository contains the code for parsing approximately 1,800 HTML pages of UofT PEY Co-op job postings (from September 2023 to May 2024) into a single sqlite3 database file.
See releases to access the latest sqlite3 database with PEY Co-op job postings.
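Since the table layout of the released database is not described here, a reasonable first step is to inspect it. The following is a minimal sketch using only Python's standard sqlite3 module; the helper names `list_tables` and `preview` are illustrative assumptions and are not part of this repository.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all user tables in the connected database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def preview(conn, table, limit=5):
    """Return up to `limit` rows from `table`."""
    # Table names cannot be bound as SQL parameters, so only accept
    # names that actually exist in the database.
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table!r}")
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,)).fetchall()
```

To try it, open the downloaded file with `sqlite3.connect(...)`, passing the path of the release file (its name is not given here), then call `list_tables(conn)` to see which tables exist and `preview(conn, name)` to look at a few rows from one of them.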
parse_to_db.py: main Python script used for parsing the HTML pages.
requirements.txt: required Python dependencies.
To get started with this project, clone the repository and install the necessary Python dependencies.
git clone https://github.com/sadmanca/uoft-pey-coop-job-postings-analysis.git
cd uoft-pey-coop-job-postings-analysis
pip install -r requirements.txt
To parse the HTML pages and store the data in a database, run the parse_to_db.py script.
python parse_to_db.py
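To illustrate the general idea of turning HTML pages into database rows, here is a minimal sketch using only the standard library. It is not the logic of parse_to_db.py: the `postings` table, its columns, and the choice to extract only the page `<title>` are all assumptions made for illustration, and the real script may use different libraries and fields.

```python
import sqlite3
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._chunks.append(data)

    @property
    def title(self):
        return "".join(self._chunks).strip()


def extract_title(html):
    """Return the stripped <title> text of an HTML document ('' if none)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def store_pages(conn, pages):
    """Insert one row per page; `pages` is an iterable of (source, html)."""
    # Hypothetical schema, chosen only for this example.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings (source TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO postings (source, title) VALUES (?, ?)",
        [(source, extract_title(html)) for source, html in pages],
    )
    conn.commit()
```

Using the source name as the primary key with `INSERT OR REPLACE` means re-running the sketch over the same pages updates rows instead of duplicating them.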
The index.qmd file contains additional information about the job postings. This includes details such as the job title, company, location, and job description. This information is used to enrich the data obtained from the HTML pages.
Contributions are welcome! Please read the contributing guidelines before making any changes.
This project is licensed under the MIT License. See the LICENSE file for more details.