Scrapy project
An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
Repositories
-
scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
-
itemloaders
Library to populate items using XPath and CSS with a convenient API
-
scrapyd-client
Command line client for Scrapyd server
-
itemadapter
Common interface for data container classes
-
protego
A pure-Python robots.txt parser with support for modern conventions.
-
parsel
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
-
scrapy-bench
A CLI for benchmarking Scrapy.
-
queuelib
Collection of persistent (disk-based) queues
-
scrapyd
A service daemon to run Scrapy spiders
-
quotesbot
This is a sample Scrapy project for educational purposes
-
scrapy-itemloader
[Archived] Library to populate Scrapy items using XPath and CSS with a convenient API
-
scrapely
A pure-Python HTML screen-scraping library
-
loginform
Fill HTML login forms automatically
-
url-chromium
url component from Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url
-
base-chromium
base component forked from Chromium source https://chromium.googlesource.com/chromium/src/base/
-
dirbot
Scrapy project to scrape public web directories (educational) [DEPRECATED]
-
scrapy-bench-speedcenter
Forked from Parth-Vader/scrapy-bench-speedcenter. Codespeed for scrapy-bench
-
pypydispatcher
A fork of http://pydispatcher.sourceforge.net/ with PyPy support
-
gsoc2014-integration-tests
GSoC2014 - Scrapy Integration tests project

