Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/wpull
Wget-compatible web downloader and crawler.
https://github.com/ArchiveTeam/wpull
Fixes WARCRecorder.
279a1cf7a3f84322dad4afeebf3bdf2d1a9d865b authored almost 11 years ago by Christopher Foo <[email protected]>
Implements --max-redirect.
a61d24542f1d0a3c0272723e1afef9171b43c021 authored almost 11 years ago by Christopher Foo <[email protected]>
Removes saving both http and content files.
Saving the http file is sort of redundant and may use up valuable disk
space.
Refactor WARCRecorder to save files in --warc-tempfile.
cc1f6343cbc485a0de1e8c51da665e181b165668 authored almost 11 years ago by Christopher Foo <[email protected]>
Sets default for --tries to 20.
9d7c4eb6b55113dcd92b4bcc780b3ddf6ce398f8 authored almost 11 years ago by Christopher Foo <[email protected]>
Implements --retry-connrefused, --retry-dns-error.
d449d643dcd992db2bcd38254d609268c01484ad authored almost 11 years ago by Christopher Foo <[email protected]>
Refactor some Engine methods.
ba01c86f573f69d21a515af4fac56803b4ab3caf authored almost 11 years ago by Christopher Foo <[email protected]>
Moves Recorder concern out of Engine.
183117f32c70dfdbb88ccf8ba6d8f4b70fbca2a0 authored almost 11 years ago by Christopher Foo <[email protected]>
Moves AppArgumentParser into options module.
175e9d5f38911b2a99b0919da2bfe74a06a33480 authored almost 11 years ago by Christopher Foo <[email protected]>
Adds test for DNS resolver failure.
ca869bc05e6f080b1a961259b8feaadd93fd1803 authored almost 11 years ago by Christopher Foo <[email protected]>
Suppresses debugging tracebacks. Adds more fetching logs.
cb0c917e74824e98302aff2a5e860a8bfad5fdf6 authored almost 11 years ago by Christopher Foo <[email protected]>
Add explicit DNS resolution lookup failure error.
c1b9a05bddc6a71d5ccbb867744f9f8e591ae436 authored almost 11 years ago by Christopher Foo <[email protected]>
Implements explicit ConnectionRefused error.
9740ccfb334c2d9e3cd2308bc37a9f919708e4ee authored almost 11 years ago by Christopher Foo <[email protected]>
Implements --referer, --user-agent.
8e6dfe919c3efdb6c201349b4135f214f5116c04 authored almost 11 years ago by Christopher Foo <[email protected]>
Sets default --level to 5.
e1e45a1d9016b354a0d5ec57a78f6e3552856224 authored almost 11 years ago by Christopher Foo <[email protected]>
Replaces builder functions with Builder. Hooks up --no-parent, --span-hosts.
f3acf6ecee09db297cc5a998d732054805d1510e authored almost 11 years ago by Christopher Foo <[email protected]>
Adds SpanHostsFilter.
ef43abca0bf063479f0f07a2eaaa727e1687ec5f authored almost 11 years ago by Christopher Foo <[email protected]>
Add progress recorder.
201f1c27b3d4e611a9646cad6ca13e2d79701e90 authored almost 11 years ago by Christopher Foo <[email protected]>
Allow document scraper to return None.
9c37d04e70eb4286c6f8a990f01e3ba41ef5783c authored almost 11 years ago by Christopher Foo <[email protected]>
recorder.py: Remove abstractmethod decorators
The methods aren't actually required.
b04e4e8898ee6c2d2932fc1926f05ec1de9e526b authored almost 11 years ago by Christopher Foo <[email protected]>
Adds Statistics.increment()
88ab38819d8c6cb0ea573e806cd1ae3e32643b1e authored almost 11 years ago by Christopher Foo <[email protected]>
Implements html tag following options.
8217e7d41becd86b3c56269dd0ca2e8198e66918 authored almost 11 years ago by Christopher Foo <[email protected]>
Adds user agent to requests.
4a1a9cd71f35da5ae9d38d24a03a3ab653e4d897 authored almost 11 years ago by Christopher Foo <[email protected]>
Sets the warcinfo id when writing warc.
9918a629bf2d7f8e0154aae4077848ee3f9ce73d authored almost 11 years ago by Christopher Foo <[email protected]>
Comments unsupported options for now.
daea16d9747ebe6313cad627f0e4544709bae656 authored almost 11 years ago by Christopher Foo <[email protected]>
Implements setting file timestamps from server.
cd86c2274063df916cfe5069f315f3940f650ba4 authored almost 11 years ago by Christopher Foo <[email protected]>
Adds test for NameValueRecord str.
707c2301bab12ccb8c4e601867b2acb30a79caff authored almost 11 years ago by Christopher Foo <[email protected]>
http.py: Improves handling of chunked transfer errors.
fb71f155245bcbc6c43944bd25024c089fb75e0b authored almost 11 years ago by Christopher Foo <[email protected]>
app.py: Hook up timeout options.
91ff169223759ef1543a44b7ef9e60c5e0ddced7 authored almost 11 years ago by Christopher Foo <[email protected]>
Finish implementation of connection timeouts.
f5fd6db60bc5b513bad81a55488b65cd9565fffb authored about 11 years ago by Christopher Foo <[email protected]>
Refactor timeout logic into iostream.
503923476f36dea205e467f4bf966600f31b05d3 authored about 11 years ago by Christopher Foo <[email protected]>
Add iostream connect timeout.
2590997d9ea77f275fa285cb096f260d01af62a5 authored about 11 years ago by Christopher Foo <[email protected]>
wait_future(): allow seconds to be None for cleaner calls
b23ebf289466804989742d61fe5a5663d0f6fb6d authored about 11 years ago by Christopher Foo <[email protected]>
Add timeout to Resolver
124c6c1ff84249c0ecf63fb8356cd373a4aca878 authored about 11 years ago by Christopher Foo <[email protected]>
Add wait_future()
c526cbb4f020c952ea528770c9061215097be177 authored about 11 years ago by Christopher Foo <[email protected]>
Add OrderedDefaultDict and tests.
ea623cf6623b93cb93e9197e410cf8f10fd4bc32 authored about 11 years ago by Christopher Foo <[email protected]>
Fix member access errors in WARCRecorder.
798c4d3c958f76c66c17e97ac14db9584796e2b7 authored about 11 years ago by Christopher Foo <[email protected]>
Add WIP WARC Recorder.
7c61681a76a3c52300af46f25096ee19f449f07f authored about 11 years ago by Christopher Foo <[email protected]>
Fixes setup.py to include all source files.
77225de02215d4460e09a72fc3c55b10bf767f38 authored about 11 years ago by Christopher Foo <[email protected]>
Support getting body file size.
89915380863c1e732c22b69083710534ad90428f authored about 11 years ago by Christopher Foo <[email protected]>
Implement logging options.
9cf042feabdc282ae683e8202f4b088d4f71505c authored about 11 years ago by Christopher Foo <[email protected]>
Add stub converter module.
e7fc0b907469da9187c419ad3f325a0a754043a5 authored about 11 years ago by Christopher Foo <[email protected]>
Implements IPv4/6 preference option.
ff17bf778bb1efa3da65a5920c89e4e23e4a7b6e authored about 11 years ago by Christopher Foo <[email protected]>
Don't use non-default LIMIT clause with UPDATE feature.
b0ee3a8b5a1a8a989a57e56335255eae1b83ca93 authored about 11 years ago by Christopher Foo <[email protected]>
Fix requirements.txt file.
1ba2110838a9fc6944b76d45e3679d784bb710ea authored about 11 years ago by Christopher Foo <[email protected]>
Adds travis file.
08296b9267cdb6d70e2a4b4801eae111f6253e0b authored about 11 years ago by Christopher Foo <[email protected]>
Initial import.
4bed134a56c3ef0c6dde7890520ca1412a40f207 authored about 11 years ago by Christopher Foo <[email protected]>