Text Summariser

Generates summaries of text from three sources: Wikipedia articles, a textbox, and PDF files. Uses NLTK for Python for tokenisation and core NLP features in extractive summarisation, Hugging Face Transformers for abstractive summarisation, and Streamlit for the front end.

PDF Summariser

Uses the Streamlit upload feature to receive the file and PDFPlumber to parse the text in the PDF. Works well on non-technical text. Academic papers are a known issue: some of their text comes out garbled.
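
A minimal sketch of the parsing step, not the repo's actual code. It assumes pdfplumber is installed; the function names `join_pages` and `extract_pdf_text` are hypothetical. `page.extract_text()` can return nothing for a page with no extractable text, so empty pages are skipped.

```python
def join_pages(page_texts):
    """Join per-page text, skipping pages where extraction returned nothing."""
    return "\n".join(t.strip() for t in page_texts if t and t.strip())


def extract_pdf_text(file_obj):
    """Read all text from a PDF given a path or a binary file-like object."""
    # Third-party dependency, imported lazily so join_pages works without it.
    import pdfplumber

    with pdfplumber.open(file_obj) as pdf:
        # extract_text() is called once per page, in page order.
        return join_pages(page.extract_text() for page in pdf.pages)
```

In a Streamlit app, the object returned by `st.file_uploader("Upload a PDF", type="pdf")` is file-like, so it could be passed straight to `extract_pdf_text` once the user has uploaded something.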

Wikipedia Summariser

Uses BeautifulSoup to extract the article text from the page HTML before passing it to the text summarisation engine.
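
A minimal sketch of that extraction step, assuming the article body lives in `<p>` elements and that bracketed reference markers such as [1] should be dropped. Both are assumptions, and the function names are hypothetical. Fetching the HTML is left out.

```python
import re


def strip_citation_markers(text):
    """Remove bracketed numeric reference markers such as [1] or [23]."""
    return re.sub(r"\[\d+\]", "", text)


def extract_article_text(html):
    """Collect the text of every <p> element in an HTML page."""
    # Third-party dependency (beautifulsoup4), imported lazily.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    paragraphs = [p.get_text() for p in soup.find_all("p")]
    return strip_citation_markers(" ".join(paragraphs)).strip()
```

The returned string can then be handed to whichever summariser is selected.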

Textbox Summariser

A basic textbox for copying and pasting in text to summarise.

Extractive vs Abstractive

Extractive Summarisation focusses on determining key themes using frequency analysis of the sentences in the corpus of text, and builds the summary from sentences that already appear there. Abstractive Summarisation uses Transformers to "understand" the key themes at a deeper level and write an entirely new summary, often with newly generated text which does not appear in the corpus itself.
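
The extractive idea can be sketched in a few lines. This is an illustration under stated assumptions, not the repo's implementation: it swaps NLTK's tokenisers for simple regular expressions so it runs without downloading any NLTK data, uses a tiny hand-written stop-word list, and scores each sentence by the normalised frequency of its words. All names are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; NLTK ships a fuller one).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    if not freq:
        return ""
    top = max(freq.values())

    def score(sentence):
        # Sum of word frequencies, normalised by the most frequent word.
        return sum(freq[w] / top for w in tokenize(sentence))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Note that this scoring favours long sentences, since every extra word adds to the total; dividing by sentence length is one common adjustment.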

Whilst Abstractive Summarisation is more effective at generating summaries, the model is large, so it takes significantly longer to run than the Extractive approach.

Running Abstractive Summarisation

The abstractive model is too large for Heroku deployment: its total size is ~1.2GB, and Heroku limits deployments to 500MB. To run the abstractive model, download the repo and run it locally.
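
For orientation, a local abstractive run with Hugging Face Transformers might look like the sketch below. It is not taken from the repo: it assumes the `transformers` package and a backend such as PyTorch are installed, and it uses the library's default summarisation model because the model the app uses is not stated here. The helper names, the 400-word chunk size, and the length settings are all illustrative assumptions.

```python
def chunk_words(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def abstractive_summary(text, max_words=400):
    """Summarise each chunk with a Transformers pipeline and join the results."""
    # Third-party dependency; the model is downloaded on first use.
    from transformers import pipeline

    # No model named: the library's default is used (an assumption).
    summarizer = pipeline("summarization")
    parts = [
        summarizer(chunk, max_length=130, min_length=30,
                   do_sample=False)[0]["summary_text"]
        for chunk in chunk_words(text, max_words)
    ]
    return " ".join(parts)
```

Chunking is there because summarisation models accept a limited input length; splitting on word count is a rough stand-in for counting model tokens.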

Installation Instructions

Install requirements - pip install -r requirements.txt
Run streamlit - streamlit run app.py

The hosted demo lets you try out extractive summarisation.

Live demo here: https://summary-generator.herokuapp.com/

harvinder-power/Text-Summariser