scrapy-kafka

Kafka-based components for Scrapy. There are two components:

A custom Spider that waits for URLs to crawl via a Kafka topic. When there are no more messages to read for the topic, the Spider just stays idle.
A custom ItemPipeline component that stores a JSON-ified Item back into another Kafka topic.
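
As a rough illustration of the two ideas above, here is a minimal sketch that runs without a Kafka broker or Scrapy installed. It is not the code shipped in scrapy_kafka: the names InMemoryProducer, JsonToTopicPipeline and urls_from_messages, and the topic name used below, are hypothetical and exist only for this example. The only things borrowed from real APIs are the process_item(item, spider) method shape that Scrapy item pipelines use and a producer exposing send(topic, value).

```python
import json


class InMemoryProducer:
    """Hypothetical stand-in for a Kafka producer; records messages in a list."""

    def __init__(self):
        self.sent = []

    def send(self, topic, value):
        # A real producer would publish to a broker; here we only remember the call.
        self.sent.append((topic, value))


class JsonToTopicPipeline:
    """Hypothetical pipeline: serialize each item to JSON and send it to a topic."""

    def __init__(self, producer, topic):
        self.producer = producer
        self.topic = topic

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for dict-like item objects.
        payload = json.dumps(dict(item), sort_keys=True).encode("utf-8")
        self.producer.send(self.topic, payload)
        # Return the item so that any later pipeline stage still receives it.
        return item


def urls_from_messages(messages):
    """Decode raw message payloads (bytes) into URLs, skipping blank payloads."""
    urls = []
    for raw in messages:
        url = raw.decode("utf-8").strip()
        if url:
            urls.append(url)
    return urls


if __name__ == "__main__":
    producer = InMemoryProducer()
    pipeline = JsonToTopicPipeline(producer, "scraped_items")
    pipeline.process_item({"url": "http://example.com", "title": "Example"}, spider=None)
    print(producer.sent)
    print(urls_from_messages([b"http://example.com\n", b"  "]))
```

An empty batch of messages yields an empty list of URLs, which corresponds to the case where the Spider has nothing to schedule and stays idle. In Scrapy, keeping a spider open in that situation is commonly done with the spider_idle signal and the DontCloseSpider exception; whether scrapy-kafka does it that way is best checked against the code in the example directory.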

Please see the example directory for how to use this.

Contributors

Contributors to scrapy-kafka, listed alphabetically:
