This repository is about the group polarization detection model proposed in our paper.

Levia-Mobius/HG-PD-Model

HG-PD-Model

This repository is the official implementation of the group polarization detection model proposed in our paper "Heterogeneous Graph-based Polarization Detection (HG-PD): a Model Balancing Crude Processing with Rich Semantics".

What is HG-PD and Why?

Previous studies on detecting and analyzing online group polarization have primarily concentrated on a single data type, such as self-reported measures, texts, or graph structure. These approaches typically depict group polarization as two factions with opposing viewpoints. However, such methodologies encounter limitations when applied to entertainment topics. To fill this research gap, this paper proposes a novel self-supervised model, termed HG-PD, for polarization detection on online social media.

Leveraging a heterogeneous graph, the model integrates multiple data types. Graph Neural Networks are then used to learn the user nodes' representations, guided by the MCR2 loss function, for the downstream clustering task. On a real-world dataset, our model discerns nuanced differences among users with similar stances, going beyond the traditional dichotomy. Furthermore, ablation experiments demonstrate that incorporating multifaceted information enriches the semantic depth of the graph, thereby providing meaningful interpretations that facilitate group polarization detection.
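
As a rough illustration of the MCR2 objective mentioned above, the sketch below computes the coding rate reduction ΔR = R(Z, ε) − R^c(Z, ε | Π) with NumPy and returns its negative as a loss. It is not the code in mcrLoss: the function names, the default `eps`, and the use of a hard label vector in place of a full membership matrix are assumptions made for illustration only.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z, eps) = 1/2 * logdet(I + d / (m * eps^2) * Z Z^T), Z has shape (d, m)."""
    d, m = Z.shape
    scale = d / (m * eps ** 2)
    _, logdet = np.linalg.slogdet(np.eye(d) + scale * Z @ Z.T)
    return 0.5 * logdet

def coding_rate_per_class(Z, labels, eps=0.5):
    """R^c: sum over classes of (m_c / m) * coding rate of that class's columns."""
    d, m = Z.shape
    total = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]          # columns assigned to class c
        mc = Zc.shape[1]
        scale = d / (mc * eps ** 2)
        _, logdet = np.linalg.slogdet(np.eye(d) + scale * Zc @ Zc.T)
        total += (mc / m) * 0.5 * logdet
    return total

def mcr2_loss(Z, labels, eps=0.5):
    """Negative rate reduction, so that minimizing the loss maximizes Delta R."""
    return -(coding_rate(Z, eps) - coding_rate_per_class(Z, labels, eps))

# Toy usage: 4 two-dimensional features forming two orthogonal groups.
Z_demo = np.array([[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])
loss_matching = mcr2_loss(Z_demo, np.array([0, 0, 1, 1]))  # labels follow the groups
loss_mixed = mcr2_loss(Z_demo, np.array([0, 1, 0, 1]))     # labels cut across the groups
```

In this toy case the labeling that follows the two groups gives the lower loss, which is the behavior the objective is meant to reward when it guides representation learning for clustering.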

Requirements

  • Python version: 3.10.13
  • All package versions are listed in package_version.txt

Usage Instructions

  1. Refer to Data_description.txt for more information about the .csv, .xlsx, .pt and .npy files used in our code.
  2. Code files come in 2 types:
    • Python scripts (.py) for collecting Sina Weibo data (Sina_crawl) and for model-training functions
    • Jupyter notebooks (.ipynb) for (a) data processing; (b) all experiments in the paper; and (c) visualization for HG-PD (i.e., exp3)
    • All Jupyter notebooks are provided in 2 language versions, Chinese and English, for better understanding :D
  3. Python scripts (.py)
    • Sina_crawl: Crawls the data we need from Sina Weibo (you can also use it to crawl other Sina Weibo posts)
    • userInter: HomoG-based model framework for exp2
    • mcr_HGPD: HG-PD model framework for exp3
    • mcrLoss: MCR2 loss function
    • augment: Data augmentation
    • other_func: Used for constructing the membership matrix Π
    • savePara: Used for saving the loss (.csv) and model states (.pt)
  4. Jupyter files (.ipynb)
    • Data_processing: Includes all data processing steps for the 3 experiments
    • K-Prototype: Implementation of exp1 in our paper; results are saved in Train_record/KPrototype
    • Ablation: Implementation of exp2 in our paper, with related visualizations; training results are saved in Train_record/Ablation and visualizations in Visualization
    • Model: Implementation of exp3 in our paper; results are saved in Train_record/Model
    • Analysis_visualize: Visualizations of exp3 in our paper; figures are saved in Visualization
    • Abnormal_compar: Model comparison experiments in our paper, including 5 GAD models for GP detection and the corresponding analysis & visualization (uploaded on 14/09/2024)
  5. About Train_record
    • We only include the best model state in the Train_record folder rather than all .pt files (otherwise there would be far too many files).
    • If you are interested in all the files, I will put a Google Drive link here
  6. About Abnormal_result (uploaded on 14/09/2024)
    • Includes (a) detection results from all 5 GAD models; (b) the .csv file involved in the analysis of all 5 GAD models
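
To make the membership matrix Π listed under other_func more concrete, here is a minimal sketch of one common way to build it from hard cluster labels: one diagonal 0/1 matrix per class, stacked into an array of shape (k, m, m). The function name `labels_to_membership` and this layout are assumptions for illustration; the helper in other_func may use a different name, shape, or soft assignments.

```python
import numpy as np

def labels_to_membership(labels, num_classes):
    """Build Pi with Pi[j, i, i] = 1 if sample i belongs to class j, else 0."""
    labels = np.asarray(labels)
    m = labels.shape[0]
    Pi = np.zeros((num_classes, m, m))
    idx = np.arange(m)
    Pi[labels, idx, idx] = 1.0   # one diagonal entry per sample, in its class slice
    return Pi

# Toy usage: 4 samples assigned to 2 classes.
Pi_demo = labels_to_membership([0, 0, 1, 1], num_classes=2)
```

With this layout, `Z @ Pi_demo[j] @ Z.T` keeps only the columns of a feature matrix `Z` (shape d × m) that belong to class j, and `np.trace(Pi_demo[j])` gives the class size, which are the two quantities the per-class term of the MCR2 objective needs.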

Also feel free to post any issues via GitHub.

Contact

If you have any questions about the code, feel free to contact ag.wrld.s@gmail.com or zili7472@uni.sydney.edu.au.