Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/wpull

Wget-compatible web downloader and crawler.
https://github.com/ArchiveTeam/wpull

testing.ftp: Add get_url(). Fix unsafe-thread writer.close()

99d0d076963198ded919c7bfb00ab5cc89d25344 authored over 10 years ago by Christopher Foo <[email protected]>
WIP Add FTP client

ef1e0bf015541d8943cdc8df6da19a2bcf730325 authored over 10 years ago by Christopher Foo <[email protected]>
fixup! Abstract client and client session.

b291bdb14ec1c5f34430d756de024be28e2d1ce5 authored over 10 years ago by Christopher Foo <[email protected]>
connection.ConnectionPool.session: Update docstring.

[ci skip]

0bd2577ab43a1c9d4898d35182a2f4fc4d54bb50 authored over 10 years ago by Christopher Foo <[email protected]>
Abstract lines from http.Request to URLPropertyMixin

7e97f7bfb25fd1147fef2f5158b6c2a410cc74a2 authored over 10 years ago by Christopher Foo <[email protected]>
Abstract client and client session.

771c50ed08d31b8588f07004cb825839bd41591e authored over 10 years ago by Christopher Foo <[email protected]>
add WIP ftp.command.Commander

1ee745151221661a4f02842f7c28308f8c059fd7 authored over 10 years ago by Christopher Foo <[email protected]>
ftp: Add util.reply_code_tuple() Reply.code_tuple()

575909615d87085d770321c5b53aedf336923155 authored over 10 years ago by Christopher Foo <[email protected]>
http.stream: Update docstring

984eeccbcbb2133e4d58dab3d0c46b656632ca34 authored over 10 years ago by Christopher Foo <[email protected]>
Add WIP FTP stream support classes

RE: chfoo/wpull#23

dfbacf0dd11d17d7fb68cff41183c1faa60e16b0 authored over 10 years ago by Christopher Foo <[email protected]>
proxy: Don't log ConnectionError exceptions as errors

re: chfoo/wpull#177

bb3d01b3505e9ca390c102034d0217701f223739 authored over 10 years ago by Christopher Foo <[email protected]>
http.client: Close session connection on exception.

Re: chfoo/wpull#177

d6b7e85fe0eb15e0d312542290a40dd6942345c0 authored over 10 years ago by Christopher Foo <[email protected]>
connection.py: Use states and require reset() to avoid accidental reuse.

f96e874d625a4b2a9cc693fd23447ac11c3b938e authored over 10 years ago by Christopher Foo <[email protected]>
util.close_on_error: Catch all Exception and close except on StopIteration

8985e5b90fe21f83149d91d574c4bc5282573ca8 authored over 10 years ago by Christopher Foo <[email protected]>
string.normalize_codec_name: Handle case where name contains \x00.

Bug found during fuzz testing.

9dde43efbda7be20afc289caba237389cf2cfa05 authored over 10 years ago by Christopher Foo <[email protected]>
connection: Close idle connections if limit reached.

Re chfoo/wpull#167

a8c5fc883a138e12dbdcb6889df03a9b4f2ff76c authored over 10 years ago by Christopher Foo <[email protected]>
fuzz_fusil: Bump threads limit

[ci skip]

90f047cc9ccb16fb067da96ed457e8d74199e8a9 authored over 10 years ago by Christopher Foo <[email protected]>
Drop beautifulsoup4 requirement

Closes chfoo/wpull#170

2721ab4deb939a4343dda48dab4967bfab79ff28 authored over 10 years ago by Christopher Foo <[email protected]>
string: Use our own bundled dammit module

re chfoo/wpull#170

ad0daf8033810e52f35bb6825d4faa7310cbece2 authored over 10 years ago by Christopher Foo <[email protected]>
Fix up thirdparty.dammit for Py3 and add copyright notice

re chfoo/wpull#170

10edb591679a2875477375de470fba67edd58da0 authored over 10 years ago by Christopher Foo <[email protected]>
Bundle bs4.dammit revno 342 into thirdparty

Re: chfoo/wpull#170

d1dbd76b6136abef5ed2d54e2148f4a21467192d authored over 10 years ago by Christopher Foo <[email protected]>
Always use Latin-1 for HTTP headers instead of a mix of UTF-8 & Latin-1.

Closes chfoo/wpull#174

605bdece0083bcad01afe80ad63a81cb789104b3 authored over 10 years ago by Christopher Foo <[email protected]>
fuzz_fusil_2: Increase timeout. Use objgraph and lsof debugs.

[ci skip]

d61140627e6f63bf8abee3604d6a4a84cb18ea78 authored over 10 years ago by Christopher Foo <[email protected]>
fuzz_fusil_2: Bump max threads to 50.

[ci skip]

380fdeba4bdb80bd06afe66005406c01ede13414 authored over 10 years ago by Christopher Foo <[email protected]>
processor.web: Always assume UTF-8 IRIs regardless of source encoding.

Closes chfoo/wpull#172

993cabad5b8b6a7fb9a4c7f1dac2157b9d73a609 authored over 10 years ago by Christopher Foo <[email protected]>
changelog: Add entry about fixed encoding detection w/ truncated encoding

b563076e97be3e14dbdfe180e1fb036876558dcf authored over 10 years ago by Christopher Foo <[email protected]>
string.detect_encoding: Don't select ASCII.

2533e85286b6da05ac5f8499a7ee1de0dc97ad69 authored over 10 years ago by Christopher Foo <[email protected]>
string.try_decoding: Support truncated byte strings.

Closes chfoo/wpull#171

f8d326b6d303531b9ce2f25137127fe94c76c8b4 authored over 10 years ago by Christopher Foo <[email protected]>
fuzz_fusil: Update runner ignores to match fuzz_fusil_2

[ci skip]

5e091cd27ba121e57f98f0473a5816454a823db6 authored over 10 years ago by Christopher Foo <[email protected]>
http.chunked: Comment out debug logs for performance.

cf3fa4184b03d44fa9830371d05bf10c75ca1c36 authored over 10 years ago by Christopher Foo <[email protected]>
http.chunked: Fix unused string format args

867e9ffd92b786974cccf0720b8f4d7ff0e65ce8 authored over 10 years ago by Christopher Foo <[email protected]>
decompression.DeflateDecompressor: Check for None before flush

Closes chfoo/wpull#157

cf07e9a78898614d44113491267b620e2e3a5e8b authored over 10 years ago by Christopher Foo <[email protected]>
changelog: Add queued_url/dequeued_url entry

6607c0b4f6087f2222380ea96d6786f275767467 authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'temp1' into develop

Conflicts:
    wpull/engine.py
    wpull/version.py

f97635d5a444f35da5c053b1fffa248e5c2015ce authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'mback2k-topic/queue_hooks' into temp1

42f3e6c4bdcd8c8a88e8c85fe86489bca250f7bb authored over 10 years ago by Christopher Foo <[email protected]>
fixup! fuzz_fusil_2: Update runner to match new conditions.

5f5ce18e76ded8e04efa5150ea75037494922f11 authored over 10 years ago by Christopher Foo <[email protected]>
fuzz_fusil_2: Update runner to match new conditions.

996acefa965bcb3d9311ac07ddfd3fc3d719bc21 authored over 10 years ago by Christopher Foo <[email protected]>
http.stream: Close the connection on errors to avoid bad state by reuse.

ac358659c4ecc54551cede5c6316bc3cefd251ac authored over 10 years ago by Christopher Foo <[email protected]>
http.request: Add status line parsing content to err msg.

ac9dc52f8deed09a4ead0ae05827ca74ec024c3a authored over 10 years ago by Christopher Foo <[email protected]>
connection.py: Explicitly close the connection on errors.

9424b23741d07cb4ed320ce1470f91fcda8142bf authored over 10 years ago by Christopher Foo <[email protected]>
http.stream: Handle header line too long

fb68042fd272b1d67432224b895d8f46e5b8dbb8 authored over 10 years ago by Christopher Foo <[email protected]>
http.chunked: Handle line too long

b6f04896ca0d1671ab439e0e361d913dba738582 authored over 10 years ago by Christopher Foo <[email protected]>
fuzz_fusil_2: Ignore "ERROR Fetching "

85e71f662bc8a7988e37c1a52e1335177c0a6bbb authored over 10 years ago by Christopher Foo <[email protected]>
Add Fusil fuzz test runner to interact with huhhttp.

0bcefed5f78d76820a9c8afde29056057a019bfc authored over 10 years ago by Christopher Foo <[email protected]>
Added final queue counter assertion to test scripts

9bab4558e9d2a56f19700c0410f68ee341ac695d authored over 10 years ago by Marc Hoersken <[email protected]>
Call queued_url-hook from HookEnvironment._add_hooked_url

This makes sure it includes scraped or custom URLs.

8b81db0d2f36665fea403716dcdf16fc64fad947 authored over 10 years ago by Marc Hoersken <[email protected]>
Ignore unused callback result

74a35a8d49bf3fa5f460d587bb8897dd1e57706f authored over 10 years ago by Marc Hoersken <[email protected]>
fixup! Make use of new_temp_file hint parameter.

2a595381d38e67eb01a61c49857706ad1f26c72a authored over 10 years ago by Christopher Foo <[email protected]>
engine.py: Add FILE_LEAK_DEBUG env var for debugging.

79cdf2417eb6667d2354fdaf91dd4685319be0b5 authored over 10 years ago by Christopher Foo <[email protected]>
Make use of new_temp_file hint parameter.

cce7b9a035b4eb1b2caf8b8e850f5b555e088d3f authored over 10 years ago by Christopher Foo <[email protected]>
body.new_temp_file: Add hint parameter.

47d88aab67601c7a82b666ea0bf00cc0a4c88a84 authored over 10 years ago by Christopher Foo <[email protected]>
recorder_test.py: Add test for DemuxRecorder.

0f6b504b6959a100a4b1c38311ec9ab4da3d9f2d authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'topic/asyncio' into develop

69d34de29a8c6e082c99f7bce05a766363b9fe8d authored over 10 years ago by Christopher Foo <[email protected]>
http.stream_test: Test all cases under SSL too.

4a10ec0f311ef9aa21d19145fa8c86308f7f159d authored over 10 years ago by Christopher Foo <[email protected]>
Fixed test hooks in py_hook_script2.py and lua_hook_script2.lua

f6366e90038f47d4f1deb2bb816d824219e6c4b5 authored over 10 years ago by Marc Hoersken <[email protected]>
Added queue hook tests to lua_hook_script2.lua

6ebfaece309d9f36994ee9fdfca643e0ac58cd27 authored over 10 years ago by Marc Hoersken <[email protected]>
Moved hook tests from boring_script.py into py_hook_script2.py

08ac5a42a5d0f8895564a15b8efdd7d7f5bff4c2 authored over 10 years ago by Marc Hoersken <[email protected]>
fixup! connection.py: Use periodic connection closer for timeouts

Closes chfoo/wpull#166

725f5052033246c4ebece88f420fa96dbd621016 authored over 10 years ago by Christopher Foo <[email protected]>
engine.py: Add OBJGRAPH_DEBUG env var for debugging feature.

70752009c6ac6d8c47ec00e52d20cd9636755590 authored over 10 years ago by Christopher Foo <[email protected]>
connection.py: Use periodic connection closer for timeouts

Re: chfoo/wpull#166

ae6927eab7dccbb197ea837351e3831ea66a83d4 authored over 10 years ago by Christopher Foo <[email protected]>
Allow debug console with port 0. Keep prev command in textbox.

8681cdc1bae2aa6f31f49c00a7d242093b7b11dc authored over 10 years ago by Christopher Foo <[email protected]>
__main__: Use Trollius event loop policy as default. Fix debug console hang.

f09b0f0b90a4dd06f51d5f43ad83c86600703ed3 authored over 10 years ago by Christopher Foo <[email protected]>
processor.web: Fix UnboundLocalError on PhantomJS fails

Closes chfoo/wpull#137

6fed488a1db734d3ac96af6603c790845ec0ca5a authored over 10 years ago by Christopher Foo <[email protected]>
engine.py: Fix use of semaphore.

re: chfoo/wpull#168

3f4077fe6c578765ae5e8eec019f0ed7cddb398c authored over 10 years ago by Christopher Foo <[email protected]>
Removed debug output from default hooks

560d79821e5f4f9b256e306896529b260d8ac0fd authored over 10 years ago by Marc Hoersken <[email protected]>
WIP: Only call queued_url-hook for newly added URLs

fbd49560d568d65d734754e8462d0b9fb4a21d7c authored over 10 years ago by Marc Hoersken <[email protected]>
WIP: Extend URLItem.add_*_url_infos to return newly added URLs

8dcc78f4603d1f3c6beb2696131abf0fecbbab07 authored over 10 years ago by Marc Hoersken <[email protected]>
WIP: Extend BaseSQLURLTable.add to return newly added URLs

86033fa09a51572ccd2cc53727430fd8d23b63d7 authored over 10 years ago by Marc Hoersken <[email protected]>
engine.py: Rewrite producer-consumer to not use .wait() and Task.cancel().

Closes chfoo/wpull#168

28493a2ca4ba07688a62e36f5d88855c60338f7c authored over 10 years ago by Christopher Foo <[email protected]>
WIP: Added prototype of queue hooks

a73092dd531dbcba342f5587a28ae9100e391196 authored over 10 years ago by Marc Hoersken <[email protected]>
http.client: Don't leak client sessions out of recorder sessions.

Re: chfoo/wpull#167

ece736424eac6676c8f04e6a52fab9b79c363c04 authored over 10 years ago by Christopher Foo <[email protected]>
processor.web,recorder: Explicit temp file clean up on exit context.

Re: chfoo/wpull#167

61e92526b322f3eef48b290382f9d98940a96eaa authored over 10 years ago by Christopher Foo <[email protected]>
http.request: Account for empty reasons and the 0 status code.

Closes chfoo/wpull#165

78bff2f742a0c281bf9a21dafa95236347184823 authored over 10 years ago by Christopher Foo <[email protected]>
builder,proxy.py: Set rewrite=True.

Closes chfoo/wpull#161

21af2cbd8e77011ae2a6d5074edd6d5bf81a6e88 authored over 10 years ago by Christopher Foo <[email protected]>
http.web: Fix stuck in loop on 307/308 redirects.

f6b722d9de84c9190027c9aa0e1a936170cf2e8d authored over 10 years ago by Christopher Foo <[email protected]>
collections.OrderedDefaultDict: Fix recursion error during copy.

Don't copy and paste from the internet without testing them thoroughly.

Closes chfoo/wpull#160

ae604f96703daf53a36f21a7668bff96592404f9 authored over 10 years ago by Christopher Foo <[email protected]>
connection.py: Close ourselves on timeout to avoid FD leak.

6caf517c7f4de049dfc692e4b757c2a37d1f8932 authored over 10 years ago by Christopher Foo <[email protected]>
phantomjs: Don't use wait_for to avoid losing items from queue.

5d52a144ec6a00e66b9409ec1524d4f87eeda4bb authored over 10 years ago by Christopher Foo <[email protected]>
http.stream: Don't drain on final write to avoid "Connection lost".

284265b5953f5a4922f4ad39580e6ae9923f8192 authored over 10 years ago by Christopher Foo <[email protected]>
setup.py: Add missing wpull.backport to packages.

Closes chfoo/wpull#159

db1d5f5ac483839f8d9ede485ace65c4f9d16f77 authored over 10 years ago by Christopher Foo <[email protected]>
connection.py: Properly catch all network errors.

8bc4448ade56d92054ef653e72dc0074fb920958 authored over 10 years ago by Christopher Foo <[email protected]>
badapp.py: Reduce /big size to avoid timeouts on travis ci.

bc6ed0f7016bfe30a14a7a1f91f6526c98d4881f authored over 10 years ago by Christopher Foo <[email protected]>
connection.py: Catch trollius.ConnectionRefusedError for Py 3.2

09370f29071fea20f1e5d2c449f9f1d5d9edd65b authored over 10 years ago by Christopher Foo <[email protected]>
stream_test.py: Increase timeout for travis ci.

b938c8cecf39adfc3ef0baced85b92ba00b293e8 authored over 10 years ago by Christopher Foo <[email protected]>
engine.py: Fix set_concurrent poison pill logic.

d255199506d69816fcd39f3eb950bdc106e510ae authored over 10 years ago by Christopher Foo <[email protected]>
http.chunked: Check for negative chunk size.

232b8e3e274e20ced92530995ed0d8a7c99e5cee authored over 10 years ago by Christopher Foo <[email protected]>
dns,hook: Change resolve_dns to mean override host to match wget+lua.

c2e6c2af3024ed3e7acd4457006fd88f245dc4c0 authored over 10 years ago by Christopher Foo <[email protected]>
connection: Handle case where errno is None

4b2a18397f4b7facdaa3f06edfcfb31e50f4bbcd authored over 10 years ago by Christopher Foo <[email protected]>
http.stream_test: Use explicit 127.0.0.1 as host.

0d49b650988bd87b2cf18b44d6d4fa0691f91609 authored over 10 years ago by Christopher Foo <[email protected]>
Drop Python 2 support.

45262a580f7654c2f27bb7ee68a24a7ae3a59d9a authored over 10 years ago by Christopher Foo <[email protected]>
body.Body.size: Check for seekable attribute first.

4b1efbf94c694ba5385351616704d10a09163951 authored over 10 years ago by Christopher Foo <[email protected]>
Change return syntax for Py <3.4 support.

96070f4a2497e914289dcf575ed471b142496017 authored over 10 years ago by Christopher Foo <[email protected]>
Bump version to 0.1000a1

daf2dc2dd66ab195f2e11e309526c72868948942 authored over 10 years ago by Christopher Foo <[email protected]>
travis.yml: Remove topic/asyncio from blacklist.

98666d6f781aac57379095c10543a7246a0491dc authored over 10 years ago by Christopher Foo <[email protected]>
Add inline doc and update docs about trollius internal rewrite.

94fd0328b5d2692e5884d26a2dd77d0ba02f8072 authored over 10 years ago by Christopher Foo <[email protected]>
Update requirements.txt and setup.py

274d6962af0952a7f1bb950093fdf4b9166c48eb authored over 10 years ago by Christopher Foo <[email protected]>
Delete unused conversation module.

33d9fbab61adfdb0342e24f4b5183467a12e5f08 authored over 10 years ago by Christopher Foo <[email protected]>
phantomjs: Add some debug messages.

06b1a6db564952f361c9f794266ac5d67f5e94d3 authored over 10 years ago by Christopher Foo <[email protected]>
phantomjs: Support long RPC messages.

0c2cb3bdb282ddbce8c9f6cee8e44b668b507271 authored over 10 years ago by Christopher Foo <[email protected]>
engine.py: Refactor into better producer-consumer pattern.

6be167b84c077bb2f9d82aa2a72b82fee6931b71 authored over 10 years ago by Christopher Foo <[email protected]>