Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/ludios_wpull

wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved
https://github.com/ArchiveTeam/ludios_wpull

Abstract lines from http.Request to URLPropertyMixin

7e97f7bfb25fd1147fef2f5158b6c2a410cc74a2 authored over 10 years ago
Abstract client and client session.

771c50ed08d31b8588f07004cb825839bd41591e authored over 10 years ago
add WIP ftp.command.Commander

1ee745151221661a4f02842f7c28308f8c059fd7 authored over 10 years ago
ftp: Add util.reply_code_tuple() and Reply.code_tuple()

575909615d87085d770321c5b53aedf336923155 authored over 10 years ago
http.stream: Update docstring

984eeccbcbb2133e4d58dab3d0c46b656632ca34 authored over 10 years ago
Add WIP FTP stream support classes

RE: chfoo/wpull#23

dfbacf0dd11d17d7fb68cff41183c1faa60e16b0 authored over 10 years ago
proxy: Don't log ConnectionError exceptions as errors

re: chfoo/wpull#177

bb3d01b3505e9ca390c102034d0217701f223739 authored over 10 years ago
http.client: Close session connection on exception.

Re: chfoo/wpull#177

d6b7e85fe0eb15e0d312542290a40dd6942345c0 authored over 10 years ago
connection.py: Use states and require reset() to avoid accidental reuse.

f96e874d625a4b2a9cc693fd23447ac11c3b938e authored over 10 years ago
util.close_on_error: Catch all Exception and close except on StopIteration

8985e5b90fe21f83149d91d574c4bc5282573ca8 authored over 10 years ago
string.normalize_codec_name: Handle case where name contains \x00.

Bug found during fuzz testing.

9dde43efbda7be20afc289caba237389cf2cfa05 authored over 10 years ago
connection: Close idle connections if limit reached.

Re chfoo/wpull#167

a8c5fc883a138e12dbdcb6889df03a9b4f2ff76c authored over 10 years ago
fuzz_fusil: Bump threads limit

[ci skip]

90f047cc9ccb16fb067da96ed457e8d74199e8a9 authored over 10 years ago
Drop beautifulsoup4 requirement

Closes chfoo/wpull#170

2721ab4deb939a4343dda48dab4967bfab79ff28 authored over 10 years ago
string: Use our own bundled dammit module

re chfoo/wpull#170

ad0daf8033810e52f35bb6825d4faa7310cbece2 authored over 10 years ago
Fix up thirdparty.dammit for Py3 and add copyright notice

re chfoo/wpull#170

10edb591679a2875477375de470fba67edd58da0 authored over 10 years ago
Bundle bs4.dammit revno 342 into thirdparty

Re: chfoo/wpull#170

d1dbd76b6136abef5ed2d54e2148f4a21467192d authored over 10 years ago
Always use Latin-1 for HTTP headers instead of a mix of UTF-8 & Latin-1.

Closes chfoo/wpull#174

605bdece0083bcad01afe80ad63a81cb789104b3 authored over 10 years ago
fuzz_fusil_2: Increase timeout. Use objgraph and lsof debugs.

[ci skip]

d61140627e6f63bf8abee3604d6a4a84cb18ea78 authored over 10 years ago
fuzz_fusil_2: Bump max threads to 50.

[ci skip]

380fdeba4bdb80bd06afe66005406c01ede13414 authored over 10 years ago
processor.web: Always assume UTF-8 IRIs regardless of source encoding.

Closes chfoo/wpull#172

993cabad5b8b6a7fb9a4c7f1dac2157b9d73a609 authored over 10 years ago
changelog: Add entry about fixed encoding detection w/ truncated encoding

b563076e97be3e14dbdfe180e1fb036876558dcf authored over 10 years ago
string.detect_encoding: Don't select ASCII.

2533e85286b6da05ac5f8499a7ee1de0dc97ad69 authored over 10 years ago
string.try_decoding: Support truncated byte strings.

Closes chfoo/wpull#171

f8d326b6d303531b9ce2f25137127fe94c76c8b4 authored over 10 years ago
fuzz_fusil: Update runner ignores to match fuzz_fusil_2

[ci skip]

5e091cd27ba121e57f98f0473a5816454a823db6 authored over 10 years ago
http.chunked: Comment out debug logs for performance.

cf3fa4184b03d44fa9830371d05bf10c75ca1c36 authored over 10 years ago
http.chunked: Fix unused string format args

867e9ffd92b786974cccf0720b8f4d7ff0e65ce8 authored over 10 years ago
decompression.DeflateDecompressor: Check for None before flush

Closes chfoo/wpull#157

cf07e9a78898614d44113491267b620e2e3a5e8b authored over 10 years ago
changelog: Add queued_url/dequeued_url entry

6607c0b4f6087f2222380ea96d6786f275767467 authored over 10 years ago
Merge branch 'temp1' into develop

Conflicts:
wpull/engine.py
wpull/version.py

f97635d5a444f35da5c053b1fffa248e5c2015ce authored over 10 years ago
Merge branch 'mback2k-topic/queue_hooks' into temp1

42f3e6c4bdcd8c8a88e8c85fe86489bca250f7bb authored over 10 years ago
fixup! fuzz_fusil_2: Update runner to match new conditions.

5f5ce18e76ded8e04efa5150ea75037494922f11 authored over 10 years ago
fuzz_fusil_2: Update runner to match new conditions.

996acefa965bcb3d9311ac07ddfd3fc3d719bc21 authored over 10 years ago
http.stream: Close the connection on errors to avoid bad state by reuse.

ac358659c4ecc54551cede5c6316bc3cefd251ac authored over 10 years ago
http.request: Add status line parsing content to err msg.

ac9dc52f8deed09a4ead0ae05827ca74ec024c3a authored over 10 years ago
connection.py: Explicitly close the connection on errors.

9424b23741d07cb4ed320ce1470f91fcda8142bf authored over 10 years ago
http.stream: Handle header line too long

fb68042fd272b1d67432224b895d8f46e5b8dbb8 authored over 10 years ago
http.chunked: Handle line too long

b6f04896ca0d1671ab439e0e361d913dba738582 authored over 10 years ago
fuzz_fusil_2: Ignore "ERROR Fetching "

85e71f662bc8a7988e37c1a52e1335177c0a6bbb authored over 10 years ago
Add Fusil fuzz test runner to interact with huhhttp.

0bcefed5f78d76820a9c8afde29056057a019bfc authored over 10 years ago
Added final queue counter assertion to test scripts

9bab4558e9d2a56f19700c0410f68ee341ac695d authored over 10 years ago
Call queued_url-hook from HookEnvironment._add_hooked_url

This makes sure it includes scraped or custom URLs.

8b81db0d2f36665fea403716dcdf16fc64fad947 authored over 10 years ago
Ignore unused callback result

74a35a8d49bf3fa5f460d587bb8897dd1e57706f authored over 10 years ago
fixup! Make use of new_temp_file hint parameter.

2a595381d38e67eb01a61c49857706ad1f26c72a authored over 10 years ago
engine.py: Add FILE_LEAK_DEBUG env var for debugging.

79cdf2417eb6667d2354fdaf91dd4685319be0b5 authored over 10 years ago
Make use of new_temp_file hint parameter.

cce7b9a035b4eb1b2caf8b8e850f5b555e088d3f authored over 10 years ago
body.new_temp_file: Add hint parameter.

47d88aab67601c7a82b666ea0bf00cc0a4c88a84 authored over 10 years ago
recorder_test.py: Add test for DemuxRecorder.

0f6b504b6959a100a4b1c38311ec9ab4da3d9f2d authored over 10 years ago
Merge branch 'topic/asyncio' into develop

69d34de29a8c6e082c99f7bce05a766363b9fe8d authored over 10 years ago
http.stream_test: Test all cases under SSL too.

4a10ec0f311ef9aa21d19145fa8c86308f7f159d authored over 10 years ago
Fixed test hooks in py_hook_script2.py and lua_hook_script2.lua

f6366e90038f47d4f1deb2bb816d824219e6c4b5 authored over 10 years ago
Added queue hook tests to lua_hook_script2.lua

6ebfaece309d9f36994ee9fdfca643e0ac58cd27 authored over 10 years ago
Moved hook tests from boring_script.py into py_hook_script2.py

08ac5a42a5d0f8895564a15b8efdd7d7f5bff4c2 authored over 10 years ago
fixup! connection.py: Use periodic connection closer for timeouts

Closes chfoo/wpull#166

725f5052033246c4ebece88f420fa96dbd621016 authored over 10 years ago
engine.py: Add OBJGRAPH_DEBUG env var for debugging feature.

70752009c6ac6d8c47ec00e52d20cd9636755590 authored over 10 years ago
connection.py: Use periodic connection closer for timeouts

Re: chfoo/wpull#166

ae6927eab7dccbb197ea837351e3831ea66a83d4 authored over 10 years ago
Allow debug console with port 0. Keep prev command in textbox.

8681cdc1bae2aa6f31f49c00a7d242093b7b11dc authored over 10 years ago
__main__: Use Trollius event loop policy as default. Fix debug console hang.

f09b0f0b90a4dd06f51d5f43ad83c86600703ed3 authored over 10 years ago
processor.web: Fix UnboundLocalError on PhantomJS fails

Closes chfoo/wpull#137

6fed488a1db734d3ac96af6603c790845ec0ca5a authored over 10 years ago
engine.py: Fix use of semaphore.

re: chfoo/wpull#168

3f4077fe6c578765ae5e8eec019f0ed7cddb398c authored over 10 years ago
Removed debug output from default hooks

560d79821e5f4f9b256e306896529b260d8ac0fd authored over 10 years ago
WIP: Only call queued_url-hook for newly added URLs

fbd49560d568d65d734754e8462d0b9fb4a21d7c authored over 10 years ago
WIP: Extend URLItem.add_*_url_infos to return newly added URLs

8dcc78f4603d1f3c6beb2696131abf0fecbbab07 authored over 10 years ago
WIP: Extend BaseSQLURLTable.add to return newly added URLs

86033fa09a51572ccd2cc53727430fd8d23b63d7 authored over 10 years ago
engine.py: Rewrite producer-consumer to not use .wait() and Task.cancel().

Closes chfoo/wpull#168

28493a2ca4ba07688a62e36f5d88855c60338f7c authored over 10 years ago
WIP: Added prototype of queue hooks

a73092dd531dbcba342f5587a28ae9100e391196 authored over 10 years ago
http.client: Don't leak client sessions out of recorder sessions.

Re: chfoo/wpull#167

ece736424eac6676c8f04e6a52fab9b79c363c04 authored over 10 years ago
processor.web,recorder: Explicit temp file clean up on exit context.

Re: chfoo/wpull#167

61e92526b322f3eef48b290382f9d98940a96eaa authored over 10 years ago
http.request: Account for empty reasons and the 0 status code.

Closes chfoo/wpull#165

78bff2f742a0c281bf9a21dafa95236347184823 authored over 10 years ago
builder,proxy.py: Set rewrite=True.

Closes chfoo/wpull#161

21af2cbd8e77011ae2a6d5074edd6d5bf81a6e88 authored over 10 years ago
http.web: Fix stuck in loop on 307/308 redirects.

f6b722d9de84c9190027c9aa0e1a936170cf2e8d authored over 10 years ago
collections.OrderedDefaultDict: Fix recursion error during copy.

Don't copy and paste from the internet without testing them thoroughly.

Closes chfoo/wpull#160

ae604f96703daf53a36f21a7668bff96592404f9 authored over 10 years ago
connection.py: Close ourselves on timeout to avoid FD leak.

6caf517c7f4de049dfc692e4b757c2a37d1f8932 authored over 10 years ago
phantomjs: Don't use wait_for to avoid losing items from queue.

5d52a144ec6a00e66b9409ec1524d4f87eeda4bb authored over 10 years ago
http.stream: Don't drain on final write to avoid "Connection lost".

284265b5953f5a4922f4ad39580e6ae9923f8192 authored over 10 years ago
setup.py: Add missing wpull.backport to packages.

Closes chfoo/wpull#159

db1d5f5ac483839f8d9ede485ace65c4f9d16f77 authored over 10 years ago
connection.py: Properly catch all network errors.

8bc4448ade56d92054ef653e72dc0074fb920958 authored over 10 years ago
badapp.py: Reduce /big size to avoid timeouts on travis ci.

bc6ed0f7016bfe30a14a7a1f91f6526c98d4881f authored over 10 years ago
connection.py: Catch trollius.ConnectionRefusedError for Py 3.2

09370f29071fea20f1e5d2c449f9f1d5d9edd65b authored over 10 years ago
stream_test.py: Increase timeout for travis ci.

b938c8cecf39adfc3ef0baced85b92ba00b293e8 authored over 10 years ago
engine.py: Fix set_concurrent poison pill logic.

d255199506d69816fcd39f3eb950bdc106e510ae authored over 10 years ago
http.chunked: Check for negative chunk size.

232b8e3e274e20ced92530995ed0d8a7c99e5cee authored over 10 years ago
dns,hook: Change resolve_dns to mean override host to match wget+lua.

c2e6c2af3024ed3e7acd4457006fd88f245dc4c0 authored over 10 years ago
connection: Handle case where errno is None

4b2a18397f4b7facdaa3f06edfcfb31e50f4bbcd authored over 10 years ago
http.stream_test: Use explicit 127.0.0.1 as host.

0d49b650988bd87b2cf18b44d6d4fa0691f91609 authored over 10 years ago
Drop Python 2 support.

45262a580f7654c2f27bb7ee68a24a7ae3a59d9a authored over 10 years ago
body.Body.size: Check for seekable attribute first.

4b1efbf94c694ba5385351616704d10a09163951 authored over 10 years ago
Change return syntax for Py <3.4 support.

96070f4a2497e914289dcf575ed471b142496017 authored over 10 years ago
Bump version to 0.1000a1

daf2dc2dd66ab195f2e11e309526c72868948942 authored over 10 years ago
travis.yml: Remove topic/asyncio from blacklist.

98666d6f781aac57379095c10543a7246a0491dc authored over 10 years ago
Add inline doc and update docs about trollius internal rewrite.

94fd0328b5d2692e5884d26a2dd77d0ba02f8072 authored over 10 years ago
Update requirements.txt and setup.py

274d6962af0952a7f1bb950093fdf4b9166c48eb authored over 10 years ago
Delete unused conversation module.

33d9fbab61adfdb0342e24f4b5183467a12e5f08 authored over 10 years ago
phantomjs: Add some debug messages.

06b1a6db564952f361c9f794266ac5d67f5e94d3 authored over 10 years ago
phantomjs: Support long RPC messages.

0c2cb3bdb282ddbce8c9f6cee8e44b668b507271 authored over 10 years ago
engine.py: Refactor into better producer-consumer pattern.

6be167b84c077bb2f9d82aa2a72b82fee6931b71 authored over 10 years ago
stream_test.py: Log less on repeated calls.

b7f7dd956b429796b752487b032bad97cd0e3add authored over 10 years ago
app_test.TestAppHTTPS: Fix wrong class.

3308a7df0c3bf179c4a427a44f558bf8e6c0ee46 authored over 10 years ago
http.web: Check for None before adding cookies.

0d9eb421303e1912e1911f58d295e01bd975789f authored over 10 years ago
app_test.py: Fix TestAppHTTPS with asyncio support.

c870960533dd4d95d8a09cf3a997e9a481901606 authored over 10 years ago