web-scraping

Here are 5,786 public repositories matching this topic...

scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

python crawler framework scraping crawling web-scraping hacktoberfest web-scraping-python

Updated Nov 14, 2024
Python

Free, open source web page change detection, website watcher, restock monitor, and notification service. Designed for simplicity: monitor which websites had a text change. Also covers website defacement monitoring and price change notification.

notifications monitoring self-hosted web-scraping website-monitor url-monitor change-alert change-detection changedetection website-change-monitor website-change-tracker website-monitoring change-monitoring website-watcher website-change-detector restock-monitor website-change-detection website-change-notification back-in-stock website-defacement-monitoring

Updated Nov 10, 2024
Python

apify / crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Nov 13, 2024
TypeScript

Evil0ctal / Douyin_TikTok_Download_API

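🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.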

Updated Sep 26, 2024
Python

lorien / awesome-web-scraping

List of libraries, tools and APIs for web scraping and data processing.

crawler spider scraping crawling web-scraping captcha-recaptcha webscraping crawling-framework scraping-framework captcha-bypass scraping-tool crawling-tool scraping-python crawling-python

Updated Oct 27, 2024
Makefile

alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

python crawler machine-learning scraper automation ai scraping artificial-intelligence web-scraping scrape webscraping webautomation

Updated Oct 12, 2024
Python

go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.

testing go golang scraper automation web chrome-devtools headless devtools crawling web-scraping cdp chrome-headless rod chrome-devtools-protocol devtools-protocol gorod

Updated Aug 19, 2024
Go

seleniumbase / SeleniumBase

📊 Blazing fast Python framework for web crawling, scraping, testing, and reporting. Supports pytest. Stealth options: UC Mode and CDP Mode. Multiple tools and integrations.

Updated Nov 14, 2024
Python

mherrmann / helium

Lighter web automation with Python

python firefox chrome webdriver selenium python3 web-scraping helium web-automation selenium-python

Updated Aug 23, 2024
Python

apify / crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.