This repository contains the necessary code and resources for participating in the Zindi NLP Challenge titled "To Vaccinate or Not to Vaccinate." The challenge revolves around developing a machine learning model that can effectively determine the sentiment (positive, neutral, negative) in Twitter posts related to vaccination topics. By analyzing public sentiment, the solution aims to aid public health organizations and policymakers in devising effective strategies for vaccine communication and promotion.
Jupyter Notebook | Published Article | Link To Working App on Hugging Face |
---|---|---|
Notebook with code and full analysis | Published Article | Link to Deployed App |
Add the text you want to analyze and click on the SUBMIT button.
Work has begun towards developing a COVID-19 vaccine, and monitoring public sentiment towards vaccinations is crucial. The challenge involves classifying Twitter posts as positive, neutral, or negative regarding vaccinations.
The dataset consists of labeled Twitter posts, each assigned a sentiment label (-1 for negative, 0 for neutral, 1 for positive). The `data` folder contains the following files:
- `Train.csv`: Labeled tweets for model training.
- `Test.csv`: Tweets for model testing.
- `SampleSubmission.csv`: Example submission format.
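As a rough illustration of the label scheme described above, the sketch below parses rows in a `Train.csv`-like layout and remaps the -1/0/1 sentiment labels to the class ids 0/1/2, since classification heads generally expect non-negative class indices. The column names (`tweet_id`, `text`, `label`), the sample rows, and the helper `load_rows` are assumptions made for illustration; check the actual header of `Train.csv` before reusing this.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in an assumed Train.csv layout.
# The column names are assumptions, not taken from the real file.
SAMPLE = """tweet_id,text,label
a1,Vaccines save lives,1
a2,Not sure about this vaccine,0
a3,I will never vaccinate,-1
a4,Got my shot today,1
"""

# Map the dataset's sentiment labels (-1, 0, 1) to class ids (0, 1, 2).
LABEL_TO_CLASS = {-1: 0, 0: 1, 1: 2}


def load_rows(handle):
    """Return (text, class_id) pairs from a CSV file handle."""
    rows = []
    for row in csv.DictReader(handle):
        # Labels may be stored as "1" or "1.0", so parse via float first.
        label = int(float(row["label"]))
        rows.append((row["text"], LABEL_TO_CLASS[label]))
    return rows


rows = load_rows(io.StringIO(SAMPLE))
counts = Counter(class_id for _, class_id in rows)
print(counts)  # class distribution of the sample
```

To read the real file instead of the inline sample, pass an open file handle, for example `open("data/Train.csv", newline="", encoding="utf-8")`, assuming that path matches your checkout.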
- Data Preprocessing: Tokenization, lowercasing, removing special characters, etc.
- Model Selection: Choosing a pre-trained Hugging Face transformer model.
- Fine-Tuning: Training the model on the training data.
- Validation: Evaluating the model on the validation set.
- Gradio App: Creating a user-friendly interface for the model using Gradio.
- Model Deployment: Uploading the model and pipeline to the Hugging Face platform.
- Dockerization: Containerizing the Gradio app for cloud deployment.
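The preprocessing step listed above could look roughly like the following. This is a minimal sketch using only the standard library, not the notebook's actual code: the function name `preprocess` and the choice to keep only ASCII letters, digits, and whitespace are assumptions. In practice, a pre-trained transformer's own tokenizer would normally handle tokenization, so the whitespace split here is purely illustrative.

```python
import re


def preprocess(text):
    """Lowercase, strip special characters, and split into whitespace tokens."""
    text = text.lower()
    # Replace anything that is not an ASCII letter, digit, or whitespace.
    # Note: this also drops accented and other non-ASCII characters.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return text.split()


tokens = preprocess("Vaccines SAVE lives!!! #health")
print(tokens)  # ['vaccines', 'save', 'lives', 'health']
```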
pip install -r requirements.txt
Explore the `app` folder for the Gradio app code and the `docker` folder for Docker-related files.
- Run the Jupyter notebooks in the `notebooks` folder to preprocess data, train the model, and evaluate its performance.
- Navigate to the `app` folder and run the Gradio app: `cd app`, then `python app.py`.
- For cloud deployment, follow the Dockerization instructions in the `docker` folder.
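For orientation, a Gradio app of the kind run from the `app` folder might be wired up roughly as below. This is a sketch under stated assumptions, not the contents of the repository's `app.py`: the names `scores_to_output`, `build_demo`, and `predict_scores` are hypothetical, and `predict_scores` stands in for whatever function returns three non-negative class scores from the fine-tuned model. Gradio is imported lazily so the score-formatting helper can be used without it installed.

```python
# Class ids in the order negative, neutral, positive (an assumed ordering).
LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def scores_to_output(scores):
    """Normalize three non-negative class scores into a {label: confidence} dict."""
    total = sum(scores)
    return {LABELS[i]: score / total for i, score in enumerate(scores)}


def build_demo(predict_scores):
    """Wrap a hypothetical predict_scores(text) -> [neg, neu, pos] in a Gradio UI."""
    import gradio as gr  # lazy import: only needed when the UI is built

    def classify(text):
        return scores_to_output(predict_scores(text))

    # "text" and "label" are Gradio's string shortcuts for a text box
    # and a label component that displays per-class confidences.
    return gr.Interface(fn=classify, inputs="text", outputs="label")


example = scores_to_output([1, 1, 2])
print(example)
```

With a real scoring function in hand, `build_demo(my_predict_fn).launch()` would start the local web interface, where `my_predict_fn` is whatever wraps the deployed model.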
Contributions are welcome! Feel free to open issues and pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.