This project contains code for downloading tweets on a given topic with Tweepy, a simple notebook for labelling them manually, and the classification code. The neural network (NN) is implemented separately in its own notebook.
If the Positive and Neutral classes are combined, the task becomes vaccine misinformation classification, with negative tweets treated as misinformation. In that setting, Multinomial Naive Bayes achieves an accuracy of 82.5%.
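To illustrate the idea of merging Positive and Neutral into one class and then fitting Multinomial Naive Bayes, here is a minimal from-scratch sketch in plain Python. It is not the code from Prediction Model.ipynb: the toy tweets, the label strings, and the function names (`merge_labels`, `train_mnb`, `predict_mnb`) are all assumptions made for the example, and it will not reproduce the 82.5% figure.

```python
import math
from collections import Counter, defaultdict


def merge_labels(label):
    """Collapse three sentiment classes into two (label names are assumed)."""
    return "misinformation" if label == "negative" else "other"


def train_mnb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on tokenised docs with Laplace smoothing."""
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    model = {}
    for label, n_docs in class_docs.items():
        denom = sum(token_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_likelihood": {
                tok: math.log((token_counts[label][tok] + alpha) / denom)
                for tok in vocab
            },
        }
    return model


def predict_mnb(model, doc):
    """Return the class with the highest log posterior for one tokenised doc."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for tok in doc:
            # Tokens never seen in training are skipped.
            if tok in params["log_likelihood"]:
                score += params["log_likelihood"][tok]
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data, invented for illustration only.
raw = [
    ("vaccines cause autism", "negative"),
    ("the vaccine is poison", "negative"),
    ("got my vaccine today", "positive"),
    ("vaccine clinic opens monday", "neutral"),
    ("vaccine is safe and effective", "positive"),
]
docs = [text.split() for text, _ in raw]
labels = [merge_labels(label) for _, label in raw]

model = train_mnb(docs, labels)
pred_a = predict_mnb(model, "the vaccine is poison".split())
pred_b = predict_mnb(model, "vaccine clinic opens today".split())
print(pred_a, pred_b)
```

On real data you would hold out a test split and compare predictions against the merged labels to measure accuracy.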
Code:
- Collect Tweets.ipynb
- Label Tweets.ipynb
- Prediction Model.ipynb
- Sentiment_Analysis_of_Tweets_using_NN.ipynb
Dataset:
- tweets.xlsx: around 340 of the 999 tweets are labelled, using one-hot encoding.
- https://www.kaggle.com/muhammadmdurrani/vaccinetweets
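Since only part of the dataset is labelled and the labels are one-hot encoded, a small helper can turn each row of label columns back into a class name and skip unlabelled rows. This is a sketch under assumptions: the column order (negative, neutral, positive), the idea that unlabelled rows have no column set, and the name `decode_one_hot` are hypothetical and should be checked against the actual spreadsheet.

```python
from collections import Counter

# Assumed column order of the one-hot label columns.
CLASS_NAMES = ("negative", "neutral", "positive")


def decode_one_hot(row):
    """Return the class name for a one-hot row, or None if it is unlabelled."""
    if len(row) != len(CLASS_NAMES):
        raise ValueError("expected one value per class")
    flags = [1 if value else 0 for value in row]  # treats None/0/'' as unset
    if sum(flags) != 1:
        return None  # unlabelled (or ambiguous) row
    return CLASS_NAMES[flags.index(1)]


# Toy rows, invented for illustration only.
rows = [[1, 0, 0], [0, 0, 1], [0, 0, 0], [0, 1, 0], [1, 0, 0]]
labels = [decode_one_hot(row) for row in rows]
labelled = [label for label in labels if label is not None]
counts = Counter(labelled)
print(len(labelled), dict(counts))
```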
Other:
- auth.py: contains the API keys you get from your Twitter developer account
- composition.py: text preprocessing
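The contents of composition.py are not reproduced here. As a rough idea of what tweet preprocessing often looks like, the sketch below lowercases the text and strips URLs, mentions, punctuation, and the retweet marker. The function name `clean_tweet` and the specific cleaning steps are assumptions for illustration, not the repo's actual implementation.

```python
import re


def clean_tweet(text):
    """Normalise a raw tweet into space-separated lowercase word tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only (also drops '#')
    tokens = [tok for tok in text.split() if tok != "rt"]  # drop retweet marker
    return " ".join(tokens)


example = clean_tweet("RT @user: Vaccines WORK! https://t.co/abc #GetVaccinated")
print(example)
```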
Clone this repo into a virtual environment folder, then install the dependencies:
$ pip install -r requirements.txt
Muhammad Mubashirullah Durrani – mdurrani.cs@gmail.com