Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
https://github.com/ArchiveTeam/grab-site

Can Grab-site be used in W7 with pip?
Snippet24816 opened this issue about 1 month ago

Getting 502 Bad Gateway Errors
syberphunk opened this issue 3 months ago

Fix FB-RE2 build error in setup.py
dannypage opened this pull request 6 months ago

Failed building wheel for fb-re2
mariospicross opened this issue 10 months ago

Dashboard
drzo opened this pull request 10 months ago

xFormers Support?
Astra060 opened this issue 11 months ago

Fallback to re if re2 can't be imported
rebane2001 opened this pull request 11 months ago

Fix --which-wpull-command not working correctly with certain paths
rebane2001 opened this pull request 11 months ago

fb-re2 dependency clang compile error on macOS Sonoma
xor-gate opened this issue about 1 year ago

Is it possible to crawl only a domain and its subdomains?
ghost opened this issue about 1 year ago

Support python 3.9-3.12
fruzitent opened this pull request about 1 year ago

Add instructions for when using nix profiles
tripleo1 opened this pull request about 1 year ago

Add instructions for when using nix profiles
tripleo1 opened this pull request about 1 year ago

Grab site is not actually compatible with python 3.8
cenodis opened this issue over 1 year ago

is it possible to output regular files instead of warc?
ftc2 opened this issue over 1 year ago

grab-site not displaying any content on Port 29000, but installed and running
DominicBilke opened this issue almost 2 years ago

Add upload option
upintheairsheep opened this issue about 2 years ago

Debian/Ubuntu install instructions fail on Raspbian
Billybangleballs opened this issue about 2 years ago

Add a --no-global-igset option
ivan opened this issue over 2 years ago

Record grab-site version in WARC headers
JustAnotherArchivist opened this pull request over 2 years ago

Log settings changes and ignores
JustAnotherArchivist opened this pull request over 2 years ago

No messege on Dashboard
CircleCrop opened this issue over 2 years ago

install error in macOS Catalina
LeeBinder opened this issue over 2 years ago

Update macOS install script to reflect Python 3.8.x (rather than 3.7)
LeeBinder opened this issue over 2 years ago

Make it work again in Python 3.10
iacore opened this pull request almost 3 years ago

Syntax Error on run
trentwiles opened this issue almost 3 years ago

Should we add an anti-porn igset?
TheTechRobo opened this issue almost 3 years ago

Dubious quickmod2 SMF forum ignore
TheTechRobo opened this issue almost 3 years ago

README: remove outdated "non-SMF forums"
TheTechRobo opened this pull request almost 3 years ago

Resuming a WARC after hard "No space left on device" error message?
Preservation-Quest opened this issue about 3 years ago

Update README.md
Preservation-Quest opened this pull request about 3 years ago

multiple --wpull-args
TheTechRobo opened this pull request about 3 years ago

How do you add custom hooks now?
TheTechRobo opened this issue about 3 years ago

Pause gracefully if OSError (No space left on device)
TheTechRobo opened this issue about 3 years ago

Add some Tumblr ignores to global igset
TheTechRobo opened this issue about 3 years ago

Add SimpleMachineForum ignores to `forums` igset
TheTechRobo opened this pull request about 3 years ago

On the dashboard, make the background colour ACTUALLY a background colour
TheTechRobo opened this issue about 3 years ago

Add SimpleMachineForums igsets
TheTechRobo opened this issue about 3 years ago

No module named 'autobahn'
vitacell opened this issue about 3 years ago

Backslash to Forward slash correction
acrois opened this issue over 3 years ago

Fix ludios_wpull to support SQLAlchemy 1.4
ivan opened this issue over 3 years ago

Error while starting a crawl in docker container
Z2Up1UwcaYOyZq opened this issue over 3 years ago

Full Docker support
acrois opened this pull request over 3 years ago

infinite recursion on offsite links?
TheTechRobo opened this issue over 3 years ago

Ignore errors and keep crawling
TowardMyth opened this issue over 3 years ago

Project Evolution
acrois opened this issue over 3 years ago

What does the ID do?
TheTechRobo opened this issue over 3 years ago

Document `--wpull-args=--no-warc-compression`
TheTechRobo opened this pull request over 3 years ago

Change settings mid-crawl
TheTechRobo opened this issue over 3 years ago

Grab-site gets only a single page
mathuryash5 opened this issue over 3 years ago

Cookies not staying
TheTechRobo opened this issue over 3 years ago

clearer error when URL is invalid
TheTechRobo opened this pull request over 3 years ago

Ignore local/lan-only hosts (and invalid domains).
jtagcat opened this issue over 3 years ago

--no-offsite-links doesn't work
tripleo1 opened this issue over 3 years ago

Dockerfile?
818S opened this issue over 3 years ago

Can't evaluate Select
TheTechRobo opened this issue over 3 years ago

Update setup.py
PythonCoderAS opened this pull request over 3 years ago

Ignore set: XenForo 1/2 and PostNuke forum engines
nekto-nekto opened this pull request almost 4 years ago

del
nekto-nekto opened this issue almost 4 years ago

Issue-175: First pass at creating a Dockerfile for Nix that actually runs
bknowles opened this pull request about 4 years ago

Add a Dockerfile for running grab-site in a Nix-based container
bknowles opened this issue about 4 years ago

Can't build lxml.etree (on macOS)
bknowles opened this issue about 4 years ago

[wpull] 'cython_function_or_method' object has no attribute 'lower'
tempname1024 opened this issue about 4 years ago

[BUG] Twitter pages potentially not downloading correctly
Coloradohusky opened this issue about 4 years ago

Bash script for automatic upload
raspher opened this pull request about 4 years ago

pull args for http-auth (e.g. --user --password) are ignored
mep85 opened this issue over 4 years ago

Regexp exclusion problem
manueldeprada opened this issue over 4 years ago

Change wpull args during a crawl
Coloradohusky opened this issue over 4 years ago

ImportError: cannot import name 'SSLCertificateError'
dragonxtek opened this issue over 4 years ago

Make WARC files searchable
Svekla opened this issue over 4 years ago

Pip build missing required package?
cfcs opened this issue about 5 years ago

cannot import name 'SSLCertificateError'
mkrzmr opened this issue about 5 years ago

More intelligent protocol selection
masterX244 opened this pull request about 5 years ago

--finished-warc-dir= not working for me
BradCoffield opened this issue over 5 years ago

What does the error status in URL queue mean?
Phasip opened this issue over 5 years ago

Possible to run in the cloud?
BradCoffield opened this issue over 5 years ago

DNS operation timed out
nihelmasell opened this issue over 5 years ago

Best way to grab this page?
sardaukar opened this issue over 5 years ago

Errors on initial URLs are retried forever
JustAnotherArchivist opened this issue over 5 years ago

Continuing or updating a grab
nihelmasell opened this issue over 5 years ago

Crawl eventually becomes nothing but "Disconnected from ws:// server:"...
BradCoffield opened this issue over 5 years ago

Add simplistic Dockerfile
Fusl opened this pull request almost 6 years ago

wpull crash when http_proxy is set
yi opened this issue almost 6 years ago

Reference git repo in install_requires
Fusl opened this pull request almost 6 years ago

Seeking new maintainer / project owner
ivan opened this issue almost 6 years ago

dashboard: Home/PgUp/PgDn/End keys usually fail in Firefox
ivan opened this issue about 6 years ago