Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/grab-site
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
https://github.com/ArchiveTeam/grab-site
Document --no-warc-compression
Dri0m opened this issue over 6 years ago
Failed DNS resolutions are retried forever with --wpull-args=--retry-dns-error
ethus3h opened this issue over 6 years ago
scrape a page more than once
notslang opened this issue almost 7 years ago
Added Centos7 Installation
raspher opened this pull request almost 7 years ago
Add support for Cloudflare DDoS protection screen
ivan opened this issue about 7 years ago
Nonsensical [Errno 8] Exec format error
ivan opened this issue about 7 years ago
ImportError: No module named 'dns.resolver'
ivan opened this issue about 8 years ago
Add Dockerfile to simplify installation
notslang opened this pull request over 8 years ago
Allow starting crawls directly from the dashboard
brandongalbraith opened this issue over 8 years ago
Allow serving the dashboard with https://
rwoodpecker opened this issue almost 9 years ago
Crawls sometimes hang forever
ivan opened this issue about 9 years ago
Enhancement idea: delay/concurrency by regex
ethus3h opened this issue about 9 years ago
Allow resuming a crawl
ivan opened this issue about 9 years ago