This repository contains various datasets focused on tweets with locations in Australia.
1. Tweets on Australian Election
- Time Span/Geopolitical Info: Covers tweets related to the Australian Election during the COVID-19 period.
- Data Source/Credit: Dataset provided by Kaggle, available at https://www.kaggle.com/code/ratan123/data-analysis-of-tweets-on-australian-election/notebook#Checking-the-head-of-the-dataframes
- Cleaned Status: The dataset is cleaned.
- License: Please refer to the Kaggle page for specific licensing details.
2. IEEE
- Time Span/Geopolitical Info: Geo-tagged tweets related to the coronavirus (COVID-19) pandemic.
- Data Source/Credit: Dataset provided by IEEE, available at https://ieee-dataport.org/open-access/coronavirus-covid-19-geo-tagged-tweets-dataset
- Cleaned Status: The dataset is not cleaned. Tweets need to be hydrated and cleaned.
- License: Please check the IEEE page for specific licensing information.
3. Australian Cities Tweets
- Time Span/Geopolitical Info: Focuses on tweets from various cities across Australia.
- Data Source/Credit: Dataset provided by Kaggle, available at https://www.kaggle.com/datasets/wjia26/australian-cities-tweets
- Cleaned Status: The dataset is cleaned.
- License: Please refer to the Kaggle page for specific licensing details.
4. Australian Cricket Tweets
- Time Span/Geopolitical Info: Contains tweets related to cricket.
- Data Source/Credit: Dataset provided by Kaggle, available at https://www.kaggle.com/datasets/gpreda/cricket-tweets
- Cleaned Status: The dataset is cleaned.
- License: Please refer to the Kaggle page for specific licensing details.
5. Tweets Using the Hashtag #australianvalues (22-27 April 2017)
- Time Span/Geopolitical Info: Covers tweets using the hashtag #australianvalues from 22-27 April 2017.
- Data Source/Credit: Dataset available on Figshare, accessible at https://figshare.com/articles/dataset/Tweets_using_the_hashtag_australianvalues_22-27_April_2017/4982747?file=8387669
- Cleaned Status: The dataset is not cleaned. Tweets need to be hydrated and cleaned.
- License: Check the Figshare page for licensing information.
6. Lpheada: Labelled Public Health Dataset
- Time Span/Geopolitical Info: This dataset is focused on public health-related tweets.
- Data Source/Credit: Available on GitHub, provided by the Data Intelligence for Health Lab, accessible at https://github.com/data-intelligence-for-health-lab/Lpheada-Labelled-Public-HEAlth-DAtaset
- Cleaned Status: The dataset is not cleaned. Tweets need to be hydrated and cleaned.
- License: Please refer to the GitHub repository for licensing details.
7. TBCOV: Two Billion Multilingual COVID-19 Tweets with Sentiment, Entity, Geo, and Gender Labels
- Time Span/Geopolitical Info: Contains tweets related to the COVID-19 pandemic over a 14-month period, from February 1, 2020 to March 31, 2021.
- Data Source/Credit: Available at https://crisisnlp.qcri.org/tbcov and https://github.com/CrisisComputing/TBCOV
- Cleaned Status: The dataset is not cleaned. Tweets need to be hydrated and cleaned.
- License: Please refer to the GitHub repository for licensing details.
8. (🌇Sunset) 🇺🇦 Ukraine Conflict Twitter Dataset
- Time Span/Geopolitical Info: Tweets about the ongoing Ukraine-Russia conflict.
- Data Source/Credit: Available at https://www.kaggle.com/datasets/bwandowando/ukraine-russian-crisis-twitter-dataset-1-2-m-rows
- Cleaned Status: The dataset is not cleaned.
- License: Please refer to the Kaggle page for licensing details.
9. Tweets on ChatGPT - #ChatGPT
- Time Span/Geopolitical Info: Tweets on #ChatGPT from 30th Nov 2022 to 29th Jan 2023.
- Data Source/Credit: Available at https://www.kaggle.com/datasets/manishabhatt22/tweets-onchatgpt-chatgpt
- Cleaned Status: The dataset is not cleaned.
- License: Please refer to the Kaggle page for licensing details.
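Several of the datasets above are marked as not cleaned. This README does not define what "cleaned" means, so the following is only a minimal sketch of one common text-cleaning pass, assuming the tweet text has already been obtained (for example, after hydration). The function name `clean_tweet` and the specific rules (dropping retweet prefixes, URLs, and mentions, and keeping hashtag words without the `#`) are illustrative assumptions, not the procedure used for the datasets marked as cleaned.

```python
import html
import re

# Assumed cleaning rules; adjust them to the needs of your analysis.
RT_PREFIX_RE = re.compile(r"^RT\s+@\w+:\s*")   # leading "RT @user: " retweet marker
URL_RE = re.compile(r"https?://\S+")           # http/https links
MENTION_RE = re.compile(r"@\w+")               # @username mentions
WHITESPACE_RE = re.compile(r"\s+")             # runs of spaces, tabs, newlines


def clean_tweet(text: str) -> str:
    """Return a simplified version of a tweet's text."""
    text = html.unescape(text)           # "&amp;" -> "&", etc.
    text = RT_PREFIX_RE.sub("", text)    # drop the retweet prefix, if any
    text = URL_RE.sub("", text)          # drop links
    text = MENTION_RE.sub("", text)      # drop remaining mentions
    text = text.replace("#", "")         # keep the hashtag word, drop the symbol
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_tweet("RT @abc: Vote now https://t.co/xyz #auspol &amp; more"))
```

Whether to keep hashtags, mentions, or emoji depends on the downstream task, so treat these rules as a starting point rather than a fixed definition of a cleaned dataset.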
To access these datasets:
- For Kaggle Datasets:
  - Visit the provided Kaggle URLs.
  - Log in to your Kaggle account (or create one if you don't have it).
  - Click on the "Download" button on the dataset page.
- For IEEE DataPort Dataset:
  - Go to the IEEE DataPort URL.
  - Sign in or register to access the dataset.
  - Follow the instructions to download the data.
- For Figshare Dataset:
  - Access the dataset through the Figshare link.
  - Download the dataset directly by clicking the file link.
- For GitHub Repository:
  - Navigate to the GitHub repository using the provided link.
  - Clone the repository using the git clone command or download the dataset files directly.
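As an alternative to the manual steps above, the Kaggle and GitHub downloads can also be scripted. The sketch below only builds the command lines and does not run them. It assumes the official `kaggle` command-line tool is installed and configured with an API token, and that `git` is available; the helper names `kaggle_dataset_slug`, `kaggle_download_command`, and `git_clone_command` are hypothetical and introduced here for illustration.

```python
from urllib.parse import urlparse


def kaggle_dataset_slug(url: str) -> str:
    """Extract the "owner/name" slug from a Kaggle dataset URL.

    Only URLs of the form https://www.kaggle.com/datasets/<owner>/<name>
    are accepted; notebook (/code/...) URLs raise ValueError.
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if (
        not parsed.netloc.endswith("kaggle.com")
        or len(parts) < 3
        or parts[0] != "datasets"
    ):
        raise ValueError(f"Not a Kaggle dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def kaggle_download_command(url: str) -> list[str]:
    """Build the Kaggle CLI command that downloads the dataset archive."""
    return ["kaggle", "datasets", "download", "-d", kaggle_dataset_slug(url)]


def git_clone_command(url: str) -> list[str]:
    """Build the git command that clones a GitHub repository."""
    return ["git", "clone", url]


if __name__ == "__main__":
    # Print the commands; pass them to subprocess.run(...) to execute them.
    print(" ".join(kaggle_download_command(
        "https://www.kaggle.com/datasets/wjia26/australian-cities-tweets")))
    print(" ".join(git_clone_command(
        "https://github.com/CrisisComputing/TBCOV")))
```

The IEEE DataPort and Figshare datasets are left out of this sketch because the steps above describe them only as browser downloads.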
This repository is licensed under the MIT License. For more details on the licensing of individual datasets, please refer to the specific dataset source pages mentioned above.