This project analyzes Wordle games published on Twitter, builds models that imitate players, and defines metrics to evaluate those models. The bulk of the work is in the Analysis folder, which contains the self-contained steps of the project, each with its own Jupyter Notebook holding all the code and graphs for that step. The anonymized dataset can be found here. The project uses the following structure:
- Notebook 0: An example dataset is used to get familiar with the data.
- Notebook 1: Various libraries are used to extract over seven million tweets containing Wordle games.
- Notebook 2: The data retrieved is cleaned and processed.
- Notebook 3: An analysis is performed on the data.
- Notebook 4: The data is transformed and labelled using clustering techniques.
- Notebook 5: A genetic algorithm is developed to imitate the groups found in Notebook 4.
- Notebook 6: Various tests are performed on the models created in order to assess their performance.
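
As a rough illustration of the kind of processing done when cleaning the tweets, the sketch below parses the text of a single shared Wordle result into a game number, an attempt count and a numeric grid. It is not the code from the notebooks: the function name `parse_wordle_tweet`, the returned field names and the 0/1/2 encoding are assumptions made for this example, and it only handles the standard share format (`Wordle <number> <attempts>/6` followed by rows of five colored squares), not the high-contrast color scheme.

```python
import re
from typing import Optional

# Header of a shared result, e.g. "Wordle 210 3/6" or "Wordle 210 X/6*".
# The trailing "*" marks hard mode; "X" marks a failed game.
HEADER = re.compile(r"Wordle\s+(\d+)\s+([1-6X])/6(\*?)")

# Hypothetical encoding chosen for this sketch:
# 0 = letter absent, 1 = letter present elsewhere, 2 = letter in the right place.
SQUARES = {"⬛": 0, "⬜": 0, "🟨": 1, "🟩": 2}


def parse_wordle_tweet(text: str) -> Optional[dict]:
    """Return the parsed game, or None if the text holds no valid game."""
    match = HEADER.search(text)
    if match is None:
        return None

    rows = []
    for line in text[match.end():].splitlines():
        # Drop whitespace and emoji variation selectors before checking the row.
        line = line.strip().replace("\ufe0f", "")
        if len(line) == 5 and all(ch in SQUARES for ch in line):
            rows.append([SQUARES[ch] for ch in line])

    solved = match.group(2) != "X"
    attempts = int(match.group(2)) if solved else None

    # Discard games whose grid does not agree with the header.
    expected_rows = attempts if solved else 6
    if len(rows) != expected_rows:
        return None

    return {
        "game": int(match.group(1)),
        "attempts": attempts,  # None when the player did not solve the word
        "hard_mode": match.group(3) == "*",
        "rows": rows,
    }


if __name__ == "__main__":
    sample = "Wordle 210 3/6\n\n⬛🟨⬛⬛⬛\n🟨🟩⬛⬛🟨\n🟩🟩🟩🟩🟩\n"
    print(parse_wordle_tweet(sample))
```

Rejecting tweets whose grid does not match the header is one simple way to filter out edited or truncated results. A real pipeline may prefer to keep them and flag the mismatch instead.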