Warning: This project relies on a version of the Instagram API Platform that was deprecated in early 2020, and the project is now read-only.
A simple Ruby application that scrapes Instagram posts, built on a set of helpers and proxies that work around the Instagram API's rate limit.
The app doesn't need an API key and can be used through a CLI or a Sinatra-based API.
You must have Ruby 2.6.2 in order to run this application. If you don't already, install it with rbenv install 2.6.2 or rvm install ruby-2.6.2.
Then, install the required gems with Bundler:
bundle install
require "date" # Date is used to build the options below
require_relative "lib/igscraper"
options = {
min_likes: 50,
start_date: Date.new(2017, 1, 1), # plain integers: zero-padded literals such as 08 are invalid octal in Ruby
end_date: Date.new(2020, 1, 1),
keywords: ["coding", "lewagon"],
}
@igscraper = InstagramScraper.new(options)
# Scrape posts by username or hashtag
@igscraper.scrape("@gabrisquonk")
@igscraper.scrape("@lewagonlisbon")
@igscraper.scrape("#lewagon")
# Read posts
@igscraper.posts # =>
# [
# {
# :target=>"@gabrisquonk",
# :target_url=>"https://www.instagram.com/gabrisquonk",
# :shortcode=>"BgnpVPEHusZ",
# :url=>"https://www.instagram.com/p/BgnpVPEHusZ",
# :likes=>92,
# :comments=>1,
# :caption=> "Can we do it again, please? 🙏 #Batch122 #DemoDay Last Friday @lewagon 🎤 🙌 #coding #learning #erasmusforadults",
# :date=>"2018-03-22",
# :country_code=>"PT",
# :publisher=>"Le Wagon Lisbon",
# :publisher_username=>"lewagonlisbon",
# :publisher_url=>"https://www.instagram.com/lewagonlisbon"
# },
# {
# ...,
# }
# ]
# Remove a post by shortcode
@igscraper.remove_post("BgnpVPEHusZ")
# Update the options
new_options = {
min_likes: 0,
start_date: Date.new(2019, 1, 1),
end_date: Date.today,
keywords: [],
}
@igscraper.update(new_options)
# Scrape new posts using the updated options
@igscraper.scrape("@fedesquonk")
# Generate a CSV file containing the current posts
@igscraper.to_csv("Desktop/posts.csv") # path relative to $HOME
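To make the effect of the options more concrete, here is a minimal sketch of how a post could be checked against them. It is not taken from igscraper: the helper name matches_options? is hypothetical, and it assumes that a post passes the keyword filter when its caption contains at least one keyword, and that an empty keyword list disables the filter. The library's real behaviour may differ.

```ruby
require "date"

# Hypothetical helper, not part of igscraper. Returns true when a post hash
# (shaped like the entries of @igscraper.posts) satisfies the options.
def matches_options?(post, options)
  date = Date.parse(post[:date])
  caption = post[:caption].to_s.downcase
  keywords = options.fetch(:keywords, [])

  return false if post[:likes] < options.fetch(:min_likes, 0)
  return false if options[:start_date] && date < options[:start_date]
  return false if options[:end_date] && date > options[:end_date]

  # Assumption: an empty keyword list means "no keyword filtering",
  # and a single matching keyword is enough.
  keywords.empty? || keywords.any? { |keyword| caption.include?(keyword.downcase) }
end

post = {
  likes: 92,
  date: "2018-03-22",
  caption: "Last Friday @lewagon #coding #learning"
}
options = {
  min_likes: 50,
  start_date: Date.new(2017, 1, 1),
  end_date: Date.new(2020, 1, 1),
  keywords: ["coding", "lewagon"]
}

puts matches_options?(post, options)
```

Under these assumptions, the sample post is kept with the first set of options and would be dropped if min_likes were raised above 92.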
⚠️ Make sure to have ./bin in your $PATH and be in the igscraper folder.
Run locally, specifying one or more comma-separated usernames/hashtags as targets (without the # before a hashtag):
igscraper -T @gabrisquonk,lewagon -l 50 -k coding,lisbon -o Desktop/posts.csv
Print the usage message in your terminal:
igscraper --help
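As an illustration of how the flags in the example command could be turned into scraper options, here is a rough sketch using Ruby's standard OptionParser. Only the short flags -T, -l, -k and -o come from the command above. The long option names, the parse_cli helper and the rule that a target without @ is treated as a hashtag are assumptions made for this example, not the actual igscraper executable.

```ruby
require "optparse"

# Hypothetical parser, not the real bin/igscraper script.
def parse_cli(argv)
  options = { targets: [], min_likes: 0, keywords: [], output: nil }

  parser = OptionParser.new do |opts|
    # Array arguments are split on commas by OptionParser.
    opts.on("-T", "--targets LIST", Array) { |list| options[:targets] = list }
    opts.on("-l", "--min-likes N", Integer) { |n| options[:min_likes] = n }
    opts.on("-k", "--keywords LIST", Array) { |list| options[:keywords] = list }
    opts.on("-o", "--output PATH", String) { |path| options[:output] = path }
  end
  parser.parse(argv)

  # Assumption: targets without a leading @ are hashtags, so the # that the
  # shell would treat as a comment marker is added back here.
  options[:targets] = options[:targets].map do |target|
    target.start_with?("@") ? target : "#" + target
  end
  options
end

parsed = parse_cli(%w[-T @gabrisquonk,lewagon -l 50 -k coding,lisbon -o Desktop/posts.csv])
p parsed
```

The normalized targets have the same form as the strings passed to scrape in the library example, which is why the sketch restores the # prefix.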
A Procfile has also been included for easy deployment! 🚀
Start a local server on port 5000:
foreman start
Returns a CSV file containing the posts that match the specified parameters or, if something goes wrong, an error in JSON format.
- URL: /
- Method: GET
- URL Params:
  users=[list]*
  hashtags=[list]*
  keywords=[list]
  min_likes=[number]
  start_date=[date]
  end_date=[date]
  output=[filename]

  *at least one has to be specified
If you wish to contribute, please create a new issue, or fork the repository and open a pull request.