alexandre-75/ETL_Extract-Transform-Load

ETL_Extract-Transform-Load

INTRODUCTION

This project scrapes books.toscrape.com with BeautifulSoup4 and Requests, exports the data to .csv files, and downloads the cover images.

Implementation of the ETL process:

  • Extract relevant and specific data from the source website;
  • Transform, filter and clean data;
  • Load data into searchable and retrievable files.
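
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`extract`, `transform`, `load`), the sample markup and the CSS selectors are assumptions made for the example, and the HTML is embedded as a string so the sketch runs without network access.

```python
import csv
import io
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration; the real pages may differ.
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="book-1.html" title="A Light in the Attic">A Light in ...</a></h3>
  <p class="price_color">£51.77</p>
</article>
<article class="product_pod">
  <h3><a href="book-2.html" title="  Tipping the Velvet ">Tipping ...</a></h3>
  <p class="price_color">£53.74</p>
</article>
"""


def extract(html):
    """Extract: pull the raw title and price strings out of the markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("article.product_pod"):
        rows.append({
            "title": card.h3.a["title"],
            "price": card.select_one("p.price_color").get_text(),
        })
    return rows


def transform(rows):
    """Transform: trim whitespace and convert the price text to a float."""
    cleaned = []
    for row in rows:
        match = re.search(r"\d+(?:\.\d+)?", row["price"])
        if match is None:
            continue  # filter out rows without a usable price
        cleaned.append({
            "title": row["title"].strip(),
            "price": float(match.group()),
        })
    return cleaned


def load(rows, fileobj):
    """Load: write the cleaned rows as CSV to any text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)


books = transform(extract(SAMPLE_HTML))
buffer = io.StringIO()  # stands in for a .csv file opened with newline=""
load(books, buffer)
print(buffer.getvalue())
```

In a real run, the HTML string would typically come from `requests.get(url).text`, and `load` would receive a file opened with `open(path, "w", newline="", encoding="utf-8")` instead of the in-memory buffer.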

THE PROJECT

Tested on Windows 10, Python 3.10.6.

1. Creating a virtual environment

python<version> -m venv nom_env_virtuel

Activate the environment (Windows): `nom_env_virtuel\Scripts\activate`

2. Installing packages

pip<version> install -r requirements.txt