Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/wpull

Wget-compatible web downloader and crawler.
https://github.com/ArchiveTeam/wpull

DNS lookup attempts for IP hostnames
JustAnotherArchivist opened this issue about 2 months ago

Last response on a `--max-redirect` limit is not written to WARC
JustAnotherArchivist opened this issue about 2 months ago

Error Running wpull
amryounis opened this issue 5 months ago

update inline condition for `link` element
iorgun opened this pull request 6 months ago

Corrupted gzipped sitemap causes OSError crash
JustAnotherArchivist opened this issue 7 months ago

Stop adding `data:` and `mailto:` URIs to the database
JustAnotherArchivist opened this issue 9 months ago

Abandoned?
Alseenrodelap opened this issue 10 months ago

Error in python 3.9
Dazmed707 opened this issue 11 months ago

Fix broken URLInfo.query_map attribute
chosak opened this pull request 12 months ago

Remove Tornado dependency
JustAnotherArchivist opened this issue about 1 year ago

url containing & character getting split
kashortiexda opened this issue over 1 year ago

ValueError: IPv6 addresses are 16 bytes long
Sanqui opened this issue over 1 year ago

CI test
Fusl opened this pull request over 1 year ago

AttributeError: module 'collections' has no attribute 'Mapping'
allanlaal opened this issue almost 2 years ago

Change references from EFNet to hackint in docs
TheTechRobo opened this pull request almost 2 years ago

FTP processor may start retrieval before logging the `Fetching` message
JustAnotherArchivist opened this issue almost 2 years ago

Always send the Host header first
Pokechu22 opened this pull request almost 2 years ago

Remove namedtuple; fix bugs
iacore opened this pull request over 2 years ago

Fix missing --reject argument type
masdeseiscaracteres opened this pull request over 2 years ago

Equivalent but differently encoded URLs break no-parent recursion
JustAnotherArchivist opened this issue almost 3 years ago

Always send the `Host` header first
JustAnotherArchivist opened this issue about 3 years ago

Migrate test suite to Drone
JustAnotherArchivist opened this issue over 3 years ago

Support for server-side image maps
JustAnotherArchivist opened this issue over 3 years ago

Handle `Refresh` header as a redirect
JustAnotherArchivist opened this issue over 3 years ago

FTP login information in URLs is lost
JustAnotherArchivist opened this issue over 3 years ago

OSError when the cookie jar gets cleared
JustAnotherArchivist opened this issue over 3 years ago

HTML in JavaScript leads to undecoded character references in URLs
JustAnotherArchivist opened this issue over 3 years ago

NUL byte in <link> href confuses libxml2-lxml parser
JustAnotherArchivist opened this issue over 3 years ago

Writing output to stdout (--output-document -) crashes with a TypeError
JustAnotherArchivist opened this issue almost 4 years ago

Abort downloads when reaching the --monitor-disk limit
JustAnotherArchivist opened this issue over 4 years ago

ValueError: Field missing colon.
JustAnotherArchivist opened this issue over 4 years ago

'ValueError: write to closed file' when using --output-document with redirects
JustAnotherArchivist opened this issue over 4 years ago

Website directory as folder with index.html file within
jaredxx1 opened this issue over 4 years ago

Dependency namedlist incompatible with Python 3.8
jayvdb opened this issue over 4 years ago

Multipart response whose connection gets killed is still written to WARC
JustAnotherArchivist opened this issue over 4 years ago

[Errno 1] Operation not permitted
makew0rld opened this issue over 4 years ago

Store cookies in database instead of in memory
JustAnotherArchivist opened this issue over 4 years ago

Unexpected recursion due to parsing HTML on an expected script
JustAnotherArchivist opened this issue over 4 years ago

Fix spaces in URL query string
alex73 opened this pull request over 4 years ago

Spaces in URL parameters get encoded as + instead of %20
JustAnotherArchivist opened this issue over 4 years ago

Fails to recurse on some sites when the homepage is a 404
JustAnotherArchivist opened this issue over 4 years ago

Unable to resolve onion hostnames when using http proxy
codsane opened this issue almost 5 years ago

Handle URL parsing errors
JustAnotherArchivist opened this issue almost 5 years ago

Replace buggy urllib.parse
JustAnotherArchivist opened this issue almost 5 years ago

Update Travis-CI badge to use this repo's URI
machawk1 opened this pull request almost 5 years ago

Travis-CI badge in README is for a different repository
machawk1 opened this issue almost 5 years ago

doc: Fix link formatting
Timmy opened this pull request almost 5 years ago

Invalid Syntax Error when doing anything
Czechball opened this issue about 5 years ago

IDNA crashes due to uncaught UnicodeErrors
JustAnotherArchivist opened this issue about 5 years ago

pip installs wrong version of tornado and html5lib
kelson42 opened this issue about 5 years ago

The DNS operation timed out
nihelmasell opened this issue about 5 years ago

Remove PhantomJS support
JustAnotherArchivist opened this issue about 5 years ago

Should wpull deduplicate redirect targets?
JustAnotherArchivist opened this issue over 5 years ago

Incorrect recursion after domain-changing redirects
JustAnotherArchivist opened this issue over 5 years ago

wpull seems to ignore the <base> tag for images
JustAnotherArchivist opened this issue over 5 years ago

Plugin deactivate method not called
JustAnotherArchivist opened this issue over 5 years ago

URL table performance woes
JustAnotherArchivist opened this issue over 5 years ago

python 3.7 compatibility
francisg-gc opened this pull request over 5 years ago

"Repeat redirects" (307 or 308) to a different host lead to an incorrect Host header
JustAnotherArchivist opened this issue over 5 years ago

Incorrect WARC-Payload-Digest values when transfer encoding is present
JustAnotherArchivist opened this issue over 5 years ago

First line of cookiejar must be non-empty and is ignored in parsing
JustAnotherArchivist opened this issue over 5 years ago

Fails to recurse on www.stevenholcomb.com with lxml
JustAnotherArchivist opened this issue over 5 years ago

Upgrade html5lib to 1.0.1
PromyLOPh opened this pull request over 5 years ago

Remove get_exception_message
PromyLOPh opened this pull request over 5 years ago

Error: Operation not permitted (SSL)
PromyLOPh opened this issue over 5 years ago

test_html_wrong_charset fails with lxml lxml==4.3.2
PromyLOPh opened this issue over 5 years ago

Remove support for PhantomJS
PromyLOPh opened this pull request over 5 years ago

Connections not released to HostPool
PromyLOPh opened this issue over 5 years ago

dnspython 1.13.0+ breaks exception message extraction test
JustAnotherArchivist opened this issue over 5 years ago

Travis: Test with Python 3.6, 3.7, as well as -dev
PromyLOPh opened this pull request over 5 years ago

Move requirements.txt to setup.py
PromyLOPh opened this pull request over 5 years ago

Update links after move to ArchiveTeam organization
PromyLOPh opened this pull request over 5 years ago

Help me assemble a sample of wpull warcs
wumpus opened this issue over 5 years ago

Outdated version on pypi
leahneukirchen opened this issue over 5 years ago

Non-deterministic logging order/different order in log file and meta WARC
JustAnotherArchivist opened this issue almost 6 years ago

AttributeError: 'CloseTimer' object has no attribute 'response'
JustAnotherArchivist opened this issue almost 6 years ago

Stuck HTTPS connections
JustAnotherArchivist opened this issue almost 6 years ago

Frequent "HTTP session did not complete" warnings with wpull 2.0.x
JustAnotherArchivist opened this issue almost 6 years ago

ValueError: Invalid IPv6 URL
JustAnotherArchivist opened this issue almost 6 years ago

Python 3.7 compatibility
tscs37 opened this issue almost 6 years ago

Same-page anchor links are treated as the "base" of a directory
DoomTay opened this issue almost 6 years ago

WIP: fix test suite
anarcat opened this pull request almost 6 years ago

Fix infinite loop in wpull.url.query_to_map
tsudoko opened this pull request almost 6 years ago

DNS Module errors
Tsuser1 opened this issue almost 6 years ago

Extract links from HTML5 media tags
tsudoko opened this pull request about 6 years ago

URL prioritisation, split meta WARCs, and miscellaneous bug fixes
JustAnotherArchivist opened this pull request about 6 years ago

No Module named Html5lib.tokenizer
buffer1900 opened this issue over 6 years ago

ImportError with Tornado 5.0
m4ntic0r opened this issue over 6 years ago

Minified CSS with "url()" results in grabbing of misparsed URLs
DoomTay opened this issue about 7 years ago

Make wpull work without a DNS server
inkuss opened this issue about 7 years ago

Crash while attempting to write a failed FTP request to WARC
JustAnotherArchivist opened this issue over 7 years ago

Crash while resolving a hostname
JustAnotherArchivist opened this issue over 7 years ago

Consider switching to html5-parser
Sanqui opened this issue over 7 years ago

Update requirements.txt link
prayashm opened this pull request over 7 years ago