Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/ludios_wpull
wpull fork with fixes and faster parsing using html5-parser; used by grab-site; should go away when wpull is similarly improved
https://github.com/ArchiveTeam/ludios_wpull
[ci skip]
977a29c048a0e55b21dabea5dfb101bb84e57fc9 authored over 10 years ago
a9b35babf42caef9b23d76ac47497f40c288c27d authored over 10 years ago
bc5f704f8252c4d25ed8d0a48f1ea681d20ed55c authored over 10 years ago
Conflicts:
doc/changelog.rst
wpull/builder.py
Conflicts:
doc/changelog.rst
[ci skip]
add60b66677305656bed6f90a6ba37e57c7cac74 authored over 10 years ago
[ci skip]
7ccab71fc43e073ea20ae615f80564b6a5e4f57c authored over 10 years ago
95fe12162683e37740582e9b1afc5f07c07b28e7 authored over 10 years ago
38d90880c45e05c2a1013463e35cbca6858b027f authored over 10 years ago
Renames to --warc-move for consistency. Implements moving all WARCs and
the CDX file to given di...
ae2d7e75a17f96a0ec983b4126823993b5040d3f authored over 10 years ago
8601b4b79e2786e8688a4ccb193aaa6426112928 authored over 10 years ago
4e36ee365bb82408b49032c00a6d7eafcd6a7524 authored over 10 years ago
eaf827289f6274ff10c59f5982bef99a034bc997 authored over 10 years ago
63e5847f9c5a0e66265a3a4474d29017573acf27 authored over 10 years ago
fba68f90381ddfc1cce81c1beab42a65824a710f authored over 10 years ago
fe3cd52bfd4fcf2fc6cedfd502117aed1bcea796 authored over 10 years ago
597e56490b318f84fafc6192e54777401cd1ff97 authored over 10 years ago
d4ab47243eafd896f6f431ca4d4bbe5e9f4cd7f9 authored over 10 years ago
321bc3f85874279b7fc97e65b427f717f32b4cf5 authored over 10 years ago
Use of AF_INET | AF_INET6 may fail for things such as localhost.
[ci skip]
6255c39d84d3a56915221741aa3b1e286beb88c6 authored over 10 years ago
7011afa0d15a963025b0e7eb4cd98fbcf7a0b307 authored over 10 years ago
WIP adds explicit callback hooks into classes instead of subclassing.
Allows attaching callbacks...
d46503c2140ad13869d2ce884d5d1a371271069f authored over 10 years ago
095349ef02f32d25ed37ec6711e46ea31be8486f authored over 10 years ago
920f20adba87d7367d650c4057e6dcfc37c56410 authored over 10 years ago
f23ba4796427b9a2cbb9970a71e59380a8238cc6 authored over 10 years ago
631d8c4122bd1e0fee054e6b6521fc0a83328c47 authored over 10 years ago
241a7ed4b0293e9221f3f6679747e58942ba0e1f authored over 10 years ago
d36f4fc24f1364f25795e4feab83d78c4fde1f14 authored over 10 years ago
SIGINT/SIGTERM signal handlers moved into
Application.setup_signal_handlers()
b0b78d4f722c9fe15561f100abed8e599dc817e0 authored over 10 years ago
d2db7d354e621ed2fb08d6357b82c9fd9c02bee3 authored over 10 years ago
Merge branch 'develop'
Conflicts:
wpull/version.py
22c7a58774755dcf2c14e54f9c23f2063327f148 authored over 10 years ago
re: chfoo/wpull#132
966bf57ed184c7bddb0728078768231d6bdddcc4 authored over 10 years ago
No longer supports IPv6 URLs with extra/misplaced brackets.
e9282a5c90256cd4cdb630e4b3a5b5218a466da0 authored over 10 years ago
7a6c4d39fdf49f6a85ca1376c66080f6a847b0e6 authored over 10 years ago
0956137ad2d0d33b0693976419622294a35d157e authored over 10 years ago
[ci skip]
d4897b7b75f82dc1ace0ea01f4005955e9960d6f authored over 10 years ago
[ci skip]
4ecfaa3e3ca98ca236462446e58ffa693ff0b76f authored over 10 years ago
Closes chfoo/wpull#129
408593d22fb3857e436b69591268977d4640ddda authored almost 11 years ago
Closes chfoo/wpull#127
2596548d1599fdbab3e8ca26c73dd888eb9f58bd authored almost 11 years ago
548366eca340c2d97fa964dadf8d67589f1cf6a9 authored almost 11 years ago
Removes --no-strong-robots option.
urlfilter.DemuxURLFilter: Adds result a mapping of names to v...
Closes chfoo/wpull#128
ec8c9be7e8c50f72801167e4a955aa7a3623e335 authored almost 11 years ago
Closes chfoo/wpull#48
9aee023a2b097cfedfe6bc7294ca199ba3dd52b1 authored almost 11 years ago
This reverts commit 5e32a2f737f7a2109b7d3663c9f26f9375379990.
Re: chfoo/wpull#125
7f80481be587a95fcef5f7392bc0da0b15eebfab authored almost 11 years ago
5e32a2f737f7a2109b7d3663c9f26f9375379990 authored almost 11 years ago
b1ac4f6ea2094b26111b1ce3b61226f0ecc87462 authored almost 11 years ago
7009383e28d045df1925e7ccce25e3d4b356ecc8 authored almost 11 years ago
[ci skip]
93041a1a64ac965ea77134672574073f0e774d0e authored almost 11 years ago
instead of munging the hostname.
Munging the hostname may have unseen effects such as cookies l...
dc3d2e99eba9cde1b0294845f14cb062b727fda6 authored almost 11 years ago
7967dfe0a7165405b19007ccfd69121f23340bf8 authored almost 11 years ago
Closes chfoo/wpull#122
66b1cbd5031b860ca3677f7eb45381aa00ea6df9 authored almost 11 years ago
Merge branch 'develop'
Conflicts:
wpull/version.py
26d87fe5d5aa322cff1613854aa22ac01e73dfcc authored almost 11 years ago
cef58153a5a55104dd9d3e4d5a186c36fa49130a authored almost 11 years ago
49e3f0a8074ac25b520e048904fc450c2aac110d authored almost 11 years ago
5b79985ccc345e12c8b9bba02c282059ec47cc60 authored almost 11 years ago
Removes brackets from hostname as part of normalization.
Closes chfoo/wpull#121
201259db72b2d0934a81394a29c68005476c7fca authored almost 11 years ago
956fc28fb6b2f035f7ab468a71cde8094e403d9b authored almost 11 years ago
37ead5babdf1649825098f54144e379b9cb4917f authored almost 11 years ago
Closes chfoo/wpull#120
fa31346e15230e54eebcffc75cbaa0a897a3c538 authored almost 11 years ago
5a21c7521f3960283acd8758aca6b208c6d35099 authored almost 11 years ago
c4027a5a61214b2fcbf9e1050d4aef7ff521820f authored almost 11 years ago
48cab4cd99757989e37cf0aa8b38fdf7de2e35a9 authored almost 11 years ago
a0f5e0496f223f670a47f8e7b7a587d5ba1b35a5 authored almost 11 years ago
d57a3fa3c8c7e305b8e17f5be63e7d7721e17c99 authored almost 11 years ago
c498478f589d9cf7c664275ef4e4c7267c4620c8 authored almost 11 years ago
121942c38c410eb9797b251e507bd07f9019af4c authored almost 11 years ago
f2f20e263bef2aa9e19ead88823ad3e28294fbbc authored almost 11 years ago
5b7f66348f65982417f9e006ed33168b94e59948 authored almost 11 years ago
4f2febba5fbcf8344b05aa7dbdc515f7936d8a37 authored almost 11 years ago
53afeb5378ea5d42c1e3caffcc7390583390a4e5 authored almost 11 years ago
0ca91eb3c0f29f09a87a799baccf2ef1788321fb authored almost 11 years ago
The XHTML parser does not handle soup well.
97dbe491adc325d1c77a6a288e2800e430029265 authored almost 11 years ago
processor: Don't catch UnicodeError.
fc80999bdefebb3962af294db7d060e19b31dd9b authored almost 11 years ago
Closes chfoo/wpull#118.
7f7ff184181ce5f6c747c8062cf341907c404996 authored almost 11 years ago
0c965bcee06fb76e20134b0ac0b589f417e87eaf authored almost 11 years ago
Conflicts:
wpull/version.py
d5b4dbb423c12b52898dfa9d35ea4447464afcde authored almost 11 years ago
7db216a5a004628b939c3c7d2803ef8d181b4d5d authored almost 11 years ago
4cf5e63505c672bc4a16e33d2cc2debf4e6495ed authored almost 11 years ago
c25363546cd9d2393122bc40e0c6a4f9c6728ff5 authored almost 11 years ago
Adds JavaScriptReader, JavaScriptScraper.
Re: chfoo/wpull#74.
930e49c0155e76e3e27ffb1b4dd1d852157e793e authored almost 11 years ago
2d786ad2b7bc4453a3f28f33e359e1e2b266eb81 authored almost 11 years ago
ef91868b62ff3aa463511709a89266bbffd54f6b authored almost 11 years ago
2d0c165eda23b08ba7a99b2f0bc9ad388802574b authored almost 11 years ago
Moves to_bytes, to_str, normalized_codec_name, detect_encoding,
try_decoding, format_size, print...
f7c86a6dd99d32a917c4b9d21d122a6490ba70db authored almost 11 years ago
75d71754ad8f4c0a32dc25b9d70be07eead8a684 authored almost 11 years ago
474678738604f5664d2b7d7932f208e36a0bbfa6 authored almost 11 years ago
Closes chfoo/wpull#114.
ce1495d60bfbf6d08ea7cdb80d2d1c40099ed559 authored almost 11 years ago
dfdd194a9b9df0a428aac8f8168b6295a460cce5 authored almost 11 years ago
Adds util.GzipDecompressor().
Closes chfoo/wpull#115.
8e14bf27b2fd41f5a98b8db1d65a3eb49bfc6d0d authored almost 11 years ago
4316736ef33cf0ba97c92288a437774b1a6bf3df authored almost 11 years ago
Merge commit 'd07172c0be2f910d9d95fddb487a4917aab73c70'
Conflicts:
wpull/version.py