
A multi-modal hashtag generation pipeline that suggests new hashtags for an Instagram post containing images and text. The approach uses a CNN for image classification and GloVe embeddings to recommend semantically similar hashtags.


VatsalSoni301/IRE-hashtag-generation

 
 


IRE-Hashtag-Generation

Introduction

  • The goal is to generate hashtags for a multi-modal post, which may contain text, an image, or both. The generated hashtags should be relevant to any hashtags already present in the post.
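
As a rough illustration of recommending hashtags that are semantically close to the ones already on a post, the sketch below ranks candidate hashtags by cosine similarity to the average vector of the existing hashtags. The function names (`cosine`, `suggest_hashtags`) and the tiny `TOY_EMBEDDINGS` table are hypothetical and stand in for real word2vec or GloVe vectors; this is not the repository's code, only a minimal sketch of the idea.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def suggest_hashtags(existing, embeddings, k=3):
    """Return up to k hashtags closest to the mean vector of the existing hashtags."""
    known = [embeddings[tag] for tag in existing if tag in embeddings]
    if not known:
        return []
    dim = len(known[0])
    # Average the vectors of the hashtags already on the post.
    query = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    scored = [
        (cosine(query, vec), tag)
        for tag, vec in embeddings.items()
        if tag not in existing
    ]
    # Highest similarity first; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in scored[:k]]


# Hypothetical 3-dimensional vectors; real embeddings have far more dimensions.
TOY_EMBEDDINGS = {
    "beach": [0.9, 0.1, 0.0],
    "ocean": [0.8, 0.2, 0.1],
    "sunset": [0.7, 0.3, 0.0],
    "pizza": [0.0, 0.1, 0.9],
    "pasta": [0.1, 0.0, 0.8],
}

suggestions = suggest_hashtags(["beach"], TOY_EMBEDDINGS, k=2)
print(suggestions)
```

With the toy vectors above, a post tagged only with `beach` gets the two travel-like hashtags ranked ahead of the food-related ones.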

Prerequisites

  • python3
  • pandas
  • numpy
  • nltk
  • word2vec
  • glove

Dataset used

https://drive.google.com/drive/u/2/folders/1IakHwyTgaTKxZa0RqQ99TRuamhEi-8xp

Code

Training Text-Model

Each model is kept in its own notebook.

  • baseline.ipynb: the baseline model.
  • generate_hashtags_from_text_all.ipynb: a global model trained on the complete corpus (all 8 topics).
  • generate_hashtags_from_text_multiple.ipynb: word2vec and GloVe models trained separately for each topic.
  • generate_hashtags_from_text.ipynb: word2vec and GloVe models trained only on the travel topic.
  • generate_top_hashtags.ipynb: code to generate the top 100 hashtags per topic from a corpus.
  • train_on_top.ipynb: a word2vec model merged with pre-trained Wikipedia GloVe embeddings to predict hashtags.
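
The topic-wise top-hashtag step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `top_hashtags_by_topic`, the `(topic, text)` input shape, and the sample posts are assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Matches "#word" and captures the word without the leading "#".
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags_by_topic(posts, n=100):
    """Map each topic to its n most frequent hashtags (lowercased).

    `posts` is an iterable of (topic, text) pairs; this layout is assumed
    for illustration and may differ from the dataset's real schema.
    """
    counts = defaultdict(Counter)
    for topic, text in posts:
        counts[topic].update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return {
        topic: [tag for tag, _ in counter.most_common(n)]
        for topic, counter in counts.items()
    }


# Hypothetical sample corpus.
posts = [
    ("travel", "Sunrise hike #travel #mountains"),
    ("travel", "City break #travel #citylife"),
    ("food", "Homemade #pasta #foodie"),
    ("food", "Weekend brunch #foodie"),
]

top = top_hashtags_by_topic(posts, n=1)
print(top)
```

Passing `n=100` would give the top-100 lists described above; `n=1` is used here only to keep the sample output small.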

Testing Text-Model

Use the following notebooks to test the different models and measure their accuracy.

  • test_text.ipynb: evaluates the word2vec model and measures its accuracy.
  • test_text_glove.ipynb: evaluates the GloVe model and measures its accuracy.
  • test_text_global.ipynb: evaluates the global word2vec model and measures its accuracy.
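
The README does not say how accuracy is defined in these notebooks. One plausible choice, shown here purely as an assumption, is a hit rate: the fraction of test posts for which at least one predicted hashtag appears among the post's actual hashtags. The function name `hit_rate` and the sample data are hypothetical.

```python
def hit_rate(predictions, ground_truth):
    """Fraction of posts with at least one predicted hashtag in the true set.

    This metric is an assumed stand-in for "accuracy"; the notebooks may
    compute it differently.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        1
        for predicted, actual in zip(predictions, ground_truth)
        if set(predicted) & set(actual)
    )
    return hits / len(predictions)


# Hypothetical predictions and true hashtags for three posts.
predicted = [["beach", "ocean"], ["pizza"], ["snow"]]
actual = [["ocean", "sunset"], ["pasta"], ["snow", "ski"]]

score = hit_rate(predicted, actual)
print(f"hit rate: {score:.2f}")
```

A stricter alternative would be precision at k, which counts how many of the top k predictions are correct rather than whether any one of them is.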

Links

  • Web page link
    • https://rebrand.ly/ire-hashtag-generation
  • YouTube video link
    • https://rebrand.ly/ire-hashtags-generation-video
