Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/wpull
Wget-compatible web downloader and crawler.
https://github.com/ArchiveTeam/wpull
Leading whitespace in sitemaps isn't stripped
JustAnotherArchivist opened this issue about 2 months ago
Farsi in URL wrongly encoded to "Double UTF-8"
manu-cyber opened this issue 3 months ago
WordPress plugin All-In-One Event Calendar outlinks cause crashes with OSError: [Errno 24] Too many open files
JustAnotherArchivist opened this issue 4 months ago
DNS lookup attempts for IP hostnames
JustAnotherArchivist opened this issue 5 months ago
Last response on a `--max-redirect` limit is not written to WARC
JustAnotherArchivist opened this issue 5 months ago
Error Running wpull
amryounis opened this issue 8 months ago
update inline condition for `link` element
iorgun opened this pull request 9 months ago
Corrupted gzipped sitemap causes OSError crash
JustAnotherArchivist opened this issue 10 months ago
Stop adding `data:` and `mailto:` URIs to the database
JustAnotherArchivist opened this issue 12 months ago
Abandoned?
Alseenrodelap opened this issue about 1 year ago
Error in python 3.9
Dazmed707 opened this issue about 1 year ago
Fix broken URLInfo.query_map attribute
chosak opened this pull request over 1 year ago
Remove Tornado dependency
JustAnotherArchivist opened this issue over 1 year ago
url containing & character getting split
kashortiexda opened this issue over 1 year ago
ValueError: IPv6 addresses are 16 bytes long
Sanqui opened this issue almost 2 years ago
CI test
Fusl opened this pull request almost 2 years ago
AttributeError: module 'collections' has no attribute 'Mapping'
allanlaal opened this issue about 2 years ago
Change references from EFNet to hackint in docs
TheTechRobo opened this pull request about 2 years ago
FTP processor may start retrieval before logging the `Fetching` message
JustAnotherArchivist opened this issue about 2 years ago
Always send the Host header first
Pokechu22 opened this pull request about 2 years ago
Remove namedtuple; fix bugs
iacore opened this pull request almost 3 years ago
Fix missing --reject argument type
masdeseiscaracteres opened this pull request almost 3 years ago
Equivalent but differently encoded URLs break no-parent recursion
JustAnotherArchivist opened this issue about 3 years ago
Always send the `Host` header first
JustAnotherArchivist opened this issue over 3 years ago
Migrate test suite to Drone
JustAnotherArchivist opened this issue over 3 years ago
Support for server-side image maps
JustAnotherArchivist opened this issue over 3 years ago
Handle `Refresh` header as a redirect
JustAnotherArchivist opened this issue almost 4 years ago
FTP login information in URLs is lost
JustAnotherArchivist opened this issue almost 4 years ago
SQLAlchemy 1.4 incompatibility: `sqlalchemy.orm.evaluator.UnevaluatableError: Cannot evaluate Select`
JustAnotherArchivist opened this issue almost 4 years ago
OSError when the cookie jar gets cleared
JustAnotherArchivist opened this issue almost 4 years ago
resolve_dns hook lacks information on IPv4/IPv6 preference and can't return more than one result
JustAnotherArchivist opened this issue almost 4 years ago
HTML in JavaScript leads to undecoded character references in URLs
JustAnotherArchivist opened this issue almost 4 years ago
NUL byte in <link> href confuses libxml2-lxml parser
JustAnotherArchivist opened this issue almost 4 years ago
LinkInfo/LinkContext's linked and inline fields for HTML-extracted URLs are not always bools
JustAnotherArchivist opened this issue almost 4 years ago
Writing output to stdout (--output-document -) crashes with a TypeError
JustAnotherArchivist opened this issue about 4 years ago
Abort downloads when reaching the --monitor-disk limit
JustAnotherArchivist opened this issue over 4 years ago
ValueError: Field missing colon.
JustAnotherArchivist opened this issue over 4 years ago
Providing an empty string to `--post-data` makes wpull send GET instead of POST requests
jodizzle opened this issue over 4 years ago
'ValueError: write to closed file' when using --output-document with redirects
JustAnotherArchivist opened this issue over 4 years ago
Website directory as folder with index.html file within
jaredxx1 opened this issue over 4 years ago
Dependency namedlist incompatible with Python 3.8
jayvdb opened this issue over 4 years ago
Multipart response whose connection gets killed is still written to WARC
JustAnotherArchivist opened this issue almost 5 years ago
[Errno 1] Operation not permitted
makew0rld opened this issue almost 5 years ago
Store cookies in database instead of in memory
JustAnotherArchivist opened this issue almost 5 years ago
Unexpected recursion due to parsing HTML on an expected script
JustAnotherArchivist opened this issue almost 5 years ago
Fix spaces in URL query string
alex73 opened this pull request almost 5 years ago
Spaces in URL parameters get encoded as + instead of %20
JustAnotherArchivist opened this issue almost 5 years ago
Fails to recurse on some sites when the homepage is a 404
JustAnotherArchivist opened this issue almost 5 years ago
Unable to resolve onion hostnames when using http proxy
codsane opened this issue about 5 years ago
Handle URL parsing errors
JustAnotherArchivist opened this issue about 5 years ago
Replace buggy urllib.parse
JustAnotherArchivist opened this issue about 5 years ago
Tildes in links in Shift-JIS pages are interpreted as %E2%80%BE when html-parser is set to libxml2-lxml
DoomTay opened this issue about 5 years ago
Update Travis-CI badge to use this repo's URI
machawk1 opened this pull request about 5 years ago
Travis-CI badge in README is for a different repository
machawk1 opened this issue about 5 years ago
doc: Fix link formatting
Timmy opened this pull request about 5 years ago
Invalid Syntax Error when doing anything
Czechball opened this issue over 5 years ago
IDNA crashes due to uncaught UnicodeErrors
JustAnotherArchivist opened this issue over 5 years ago
pip installs wrong version of tornado and html5lib
kelson42 opened this issue over 5 years ago
The DNS operation timed out
nihelmasell opened this issue over 5 years ago
Remove PhantomJS support
JustAnotherArchivist opened this issue over 5 years ago
Should wpull deduplicate redirect targets?
JustAnotherArchivist opened this issue over 5 years ago
Incorrect recursion after domain-changing redirects
JustAnotherArchivist opened this issue over 5 years ago
wpull seems to ignore the <base> tag for images
JustAnotherArchivist opened this issue over 5 years ago
Plugin deactivate method not called
JustAnotherArchivist opened this issue over 5 years ago
URL table performance woes
JustAnotherArchivist opened this issue over 5 years ago
python 3.7 compatibility
francisg-gc opened this pull request over 5 years ago
"Repeat redirects" (307 or 308) to a different host lead to an incorrect Host header
JustAnotherArchivist opened this issue over 5 years ago
Incorrect WARC-Payload-Digest values when transfer encoding is present
JustAnotherArchivist opened this issue over 5 years ago
First line of cookiejar must be non-empty and is ignored in parsing
JustAnotherArchivist opened this issue over 5 years ago
Fails to recurse on www.stevenholcomb.com with lxml
JustAnotherArchivist opened this issue almost 6 years ago
Upgrade html5lib to 1.0.1
PromyLOPh opened this pull request almost 6 years ago
Remove get_exception_message
PromyLOPh opened this pull request almost 6 years ago
Error: Operation not permitted (SSL)
PromyLOPh opened this issue almost 6 years ago
test_html_wrong_charset fails with lxml lxml==4.3.2
PromyLOPh opened this issue almost 6 years ago
Remove support for PhantomJS
PromyLOPh opened this pull request almost 6 years ago
Connections not released to HostPool
PromyLOPh opened this issue almost 6 years ago
dnspython 1.13.0+ breaks exception message extraction test
JustAnotherArchivist opened this issue almost 6 years ago
Travis: Test with Python 3.6, 3.7, as well as -dev
PromyLOPh opened this pull request almost 6 years ago
Move requirements.txt to setup.py
PromyLOPh opened this pull request almost 6 years ago
Update links after move to ArchiveTeam organization
PromyLOPh opened this pull request almost 6 years ago
Help me assemble a sample of wpull warcs
wumpus opened this issue almost 6 years ago
Outdated version on pypi
leahneukirchen opened this issue about 6 years ago
Non-deterministic logging order/different order in log file and meta WARC
JustAnotherArchivist opened this issue about 6 years ago
AttributeError: 'CloseTimer' object has no attribute 'response'
JustAnotherArchivist opened this issue about 6 years ago
Stuck HTTPS connections
JustAnotherArchivist opened this issue about 6 years ago
Frequent "HTTP session did not complete" warnings with wpull 2.0.x
JustAnotherArchivist opened this issue about 6 years ago
ValueError: Invalid IPv6 URL
JustAnotherArchivist opened this issue about 6 years ago
Python 3.7 compatibility
tscs37 opened this issue about 6 years ago
Same-page anchor links are treated as the "base" of a directory
DoomTay opened this issue about 6 years ago
WIP: fix test suite
anarcat opened this pull request about 6 years ago
Fix infinite loop in wpull.url.query_to_map
tsudoko opened this pull request about 6 years ago
DNS Module errors
Tsuser1 opened this issue about 6 years ago
Extract links from HTML5 media tags
tsudoko opened this pull request over 6 years ago
URL prioritisation, split meta WARCs, and miscellaneous bug fixes
JustAnotherArchivist opened this pull request over 6 years ago
No Module named Html5lib.tokenizer
buffer1900 opened this issue almost 7 years ago
ImportError with Tornado 5.0
m4ntic0r opened this issue almost 7 years ago
Minified CSS with "url()" results in grabbing of misparsed URLs
DoomTay opened this issue over 7 years ago
Make wpull work without a DNS server
inkuss opened this issue over 7 years ago
Crash while attempting to write a failed FTP request to WARC
JustAnotherArchivist opened this issue over 7 years ago
Crash while resolving a hostname
JustAnotherArchivist opened this issue over 7 years ago