Main author of the repository: Maris Sala. The code was developed together with CHCAA and Ida Nissen.
Done as a project related to AU DATALAB under the European Union's Next Generation Internet initiative.
The code base was developed at CHCAA as the newsFluxus model; more information here.
This codebase contains code developed to study potential trends in Reddit subreddits. The analysis uses newsFluxus and infodynamics to calculate novelty, resonance, and the Hurst exponent from LDA topic models of subreddits related to human rights and technology.
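To make the novelty and resonance signals concrete, here is a minimal sketch of how they could be computed from per-document topic distributions (for example, the rows of an LDA document-topic matrix). It assumes the common windowed formulation in which novelty is the mean divergence from the preceding `window` documents, transience is the mean divergence from the following `window` documents, and resonance is novelty minus transience. The function names, the choice of KL divergence, and the smoothing constant `eps` are illustrative assumptions; the implementation in `src/tekisuto/` may differ in these details.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits; eps is an assumed smoothing term to avoid log(0)."""
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log2((a / sp) / (b / sq)) for a, b in zip(p, q))


def novelty_resonance(theta, window=3):
    """Windowed novelty, transience, and resonance for a time-ordered
    list of topic distributions. Positions without a full window on
    both sides are left as NaN."""
    n = len(theta)
    novelty = [math.nan] * n
    transience = [math.nan] * n
    resonance = [math.nan] * n
    for j in range(window, n - window):
        past = [kl_divergence(theta[j], theta[j - d]) for d in range(1, window + 1)]
        future = [kl_divergence(theta[j], theta[j + d]) for d in range(1, window + 1)]
        novelty[j] = sum(past) / window
        transience[j] = sum(future) / window
        resonance[j] = novelty[j] - transience[j]
    return novelty, transience, resonance


# Toy example: the topic mix shifts at index 2 and then persists.
theta = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
nov, tra, res = novelty_resonance(theta, window=2)
print(nov[2], tra[2], res[2])
```

In the toy example the document at index 2 differs from what came before but matches what follows, so it is novel and its content persists, which gives a positive resonance.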
.
├── dat/                                  # Output data
│   └── subreddits_incl_comments/
│       └── output/
│           ├── extra/                    # Sampled subreddits
│           ├── fig/                      # Figures: adaptline of top posts, beta timeseries, regline
│           └── mdl/                      # Topic models and contents of topic modeling
│               └── testing_phase/        # Top tokens per subreddit
├── fig/                                  # Example figures of adaptline and regline
├── mdl/                                  # Danish language models (not relevant to this project)
├── res/                                  # Resources: stopwords
├── src/                                  # Source code
│   ├── __pycache__
│   ├── archive/                          # Original versions of the main code before it was modified for the project
│   ├── preparations/                     # Preparing for topic modeling
│   │   ├── __init__.py
│   │   ├── betatimeseries.py
│   │   └── preptopicmodeling.py
│   ├── saffine/                          # Detrending time series
│   │   ├── detrending_coeff.py
│   │   ├── detrending_method.py
│   │   └── multi_detrending.py
│   ├── tekisuto/                         # Infodynamics source code
│   │   ├── datasets/
│   │   ├── metrics/
│   │   ├── models/
│   │   ├── preprocessing/
│   │   ├── __init__.py
│   │   └── tekiutil.py
│   ├── visualsrc/                        # Visualizing trends
│   ├── bow_mdl.py
│   ├── import_ndjson_files_incl_comments.py              # Main code that loads subreddit posts and comments and joins them
│   ├── main_extractor_Ida_incl_comments_grundtvig.py     # Main code
│   ├── main_extractor.py
│   ├── news_uncertainty.py
│   ├── signal_extraction.py
│   ├── spacy_parsing.py
│   └── topic_modeling_top_posts.py       # Topic modeling of top trending posts per subreddit
├── .gitignore
├── LICENCE
└── README.md                             # Main information for this repository
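
The loading step described for `import_ndjson_files_incl_comments.py` (read posts and comments from NDJSON, then join them) can be sketched roughly as follows. This is not the repository's code: the helper names and the field names `id`, `post_id`, and `text` are hypothetical placeholders, and the actual Reddit export may use different keys.

```python
import json
from collections import defaultdict


def read_ndjson(lines):
    """Parse an iterable of NDJSON lines (one JSON object per line),
    skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def join_posts_and_comments(posts, comments):
    """Append the text of every comment to the text of its parent post.
    Field names here are hypothetical, not taken from the repository."""
    by_post = defaultdict(list)
    for comment in comments:
        by_post[comment["post_id"]].append(comment["text"])
    return [
        {"id": post["id"], "text": " ".join([post["text"], *by_post[post["id"]]])}
        for post in posts
    ]


# Inline sample data; with real files, pass an open file handle to read_ndjson.
post_lines = ['{"id": "a1", "text": "Post about privacy"}', ""]
comment_lines = [
    '{"post_id": "a1", "text": "First reply"}',
    '{"post_id": "a1", "text": "Second reply"}',
]
documents = join_posts_and_comments(read_ndjson(post_lines), read_ndjson(comment_lines))
print(documents)
```

Joining each post with its comments yields one document per thread, which is a convenient unit for the topic modeling step.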