Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/wpull

Wget-compatible web downloader and crawler.
https://github.com/ArchiveTeam/wpull

Implement --follow-ftp option.

70492a75e1c1a5080c312fa46a29bf83d35850cf authored about 10 years ago by Christopher Foo <[email protected]>
urlfilter.FollowFTPFilter: Handle case where referrer is None

7580b858304445df6917f91b0b46c5aa88821369 authored about 10 years ago by Christopher Foo <[email protected]>
recorder.warc: Append newline to log as needed.

df6b09864cb5f83a0de40667a372cab440f51b52 authored about 10 years ago by Christopher Foo <[email protected]>
ftp.client: Implement recorder on data stream read

206cbfcf82220acc376375c2772832f9b320e0a8 authored about 10 years ago by Christopher Foo <[email protected]>
recorder.warc: WIP FTP WARC recording.

[ci skip]

d1e5247d70f4acfa40c7a9f95b263a54ad1e585a authored about 10 years ago by Christopher Foo <[email protected]>
Add driver.resource stub.

9bd2f593e511378ed75900364cde4627405ceaf5 authored about 10 years ago by Christopher Foo <[email protected]>
Move phantomjs concerns from processor.web to coprocessor.phantomjs

712265f831cc3ea2200c2a63d412adf11d772930 authored about 10 years ago by Christopher Foo <[email protected]>
Move phantomjs into driver package.

dcd50fca7564f6ddcdb5c3a8a3daaec709bd8776 authored about 10 years ago by Christopher Foo <[email protected]>
travis: Blacklist topic/rewrite_phantomjs

030341326e3d221b5ea2ff059722febb54f21303 authored about 10 years ago by Christopher Foo <[email protected]>
pragma: no cover on fail() on unit tests

0f5cae28d474336673f61d44e4db0076f4a44bc5 authored about 10 years ago by Christopher Foo <[email protected]>
url: Forbid spaces in hostname.

Closes chfoo/wpull#158

3650c7e14edcee54ff96adf53a4e5b74eac1d0f4 authored about 10 years ago by Christopher Foo <[email protected]>
cookie: Limit cookie length per domain.

Closes chfoo/wpull#200

38a41521f6dcd658165313a83abe95490a6dc169 authored about 10 years ago by Christopher Foo <[email protected]>
Escape strings from untrusted sources.

Closes chfoo/wpull#173

ccea266390f758225618992c6f7bf939825eee13 authored about 10 years ago by Christopher Foo <[email protected]>
string: Add printable_str()

7b9035f0c79d5887d1851b337b58f79d87bb5eba authored about 10 years ago by Christopher Foo <[email protected]>
Merge commit '51e0c1441fe2f2c7d058f2f098474a55b8e12b1f' into develop

Bring in web and rule changes

20dec3f28ee7411197baa6e00293f50d6bad303e authored about 10 years ago by Christopher Foo <[email protected]>
recorder.warc_test: WIP Stub in FTP recorder test

[ci skip]

8ddd6dbf364f66c9f8115dadf5489dc55e6ae950 authored about 10 years ago by Christopher Foo <[email protected]>
WIP Implement FTP progress recorder.

[ci skip]

51e0c1441fe2f2c7d058f2f098474a55b8e12b1f authored about 10 years ago by Christopher Foo <[email protected]>
recorder.printing: Support print FTP responses

[ci skip]

ce92b00f823d048f694e9c3a0884408d7e3df20d authored about 10 years ago by Christopher Foo <[email protected]>
recorder: Add begin_control/end_control callbacks.

[ci skip]

41708b03a780113b24e7f1a51547aff554acbafc authored about 10 years ago by Christopher Foo <[email protected]>
fixup! processor: Integrate web/rule robots handling.

Fix the return syntax for Py 3.2 compatibility.

acd60dc7441c6bf00acaa3f73320cf17ef8b7277 authored about 10 years ago by Christopher Foo <[email protected]>
processor: Integrate web/rule robots handling.

f8f27ca9a4834fa5f51f55550026c6b5c3dcdc84 authored about 10 years ago by Christopher Foo <[email protected]>
Merge branch 'develop' into issue/23_ftp_2

Conflicts:
wpull/processor/web.py

d2e3dc3f5fb76eadaecffc8333c4b9c2243a087d authored about 10 years ago by Christopher Foo <[email protected]>
Bump version 0.1001.2

4da31a33e50650de2d5a6fe708c7fc7d7bb1d638 authored about 10 years ago by Christopher Foo <[email protected]>
Merge branch 'develop'

2dffdef4394845473d990452f4cb2472adf59904 authored about 10 years ago by Christopher Foo <[email protected]>
changelog: Update latest to 0.1001.2

[ci skip]

2abfe5355992092edcacc5063d6300a6115aa9b7 authored about 10 years ago by Christopher Foo <[email protected]>
gitignore: Ignore Intellij and .orig files

37a0dda5a6a6b02176d8b800a0c1318afc9c365e authored about 10 years ago by Christopher Foo <[email protected]>
processor.web: Catch errors during fetching robots.txt

Closes chfoo/wpull#185

fe5c64611b98652483081fbea1a87948a30b5c3b authored about 10 years ago by Christopher Foo <[email protected]>
url.urljoin: Handle case of non-absolute base URLs.

Closes chfoo/wpull#196

d742a51bb471d62a700026537b4521fcf6bf77c8 authored about 10 years ago by Christopher Foo <[email protected]>
changelog: Update entry with prev commit ValueError fix

cec28189082db013c2510521e381a7c0c843bd3d authored about 10 years ago by Christopher Foo <[email protected]>
http.web: Catch ValueError getting next link location.

Closes chfoo/wpull#197

d47000e017f152462002fbad74055e4d777a1297 authored about 10 years ago by Christopher Foo <[email protected]>
contributing.md: Include info about reporting bugs

[ci skip]

2f43d2a0f2f63a2cd3551db43fc17f9deb101fd2 authored about 10 years ago by Christopher Foo <[email protected]>
ftp.client: WIP add control stream recorder support

c82becf28901e66743871bfc2b4f6aea23307d5e authored about 10 years ago by Christopher Foo <[email protected]>
ftp.command: Split up get_file to setup_data_stream.

8d3fca83a97cd8d6a886b0bc5397e6e34999a99c authored about 10 years ago by Christopher Foo <[email protected]>
*.request: Add DictableMixin, Serializable, ProtocolResponseMixin

1166fc5013d19131cd058af3d6aaf21be8fc9af7 authored about 10 years ago by Christopher Foo <[email protected]>
doc/terse_options.rst: Update to match description and manhole option.

2da86844d93b9459f3d700fbbfa833dab8eee196 authored over 10 years ago by Christopher Foo <[email protected]>
setup.py: Don't suffix versions to cx_freeze exe name.

Closes chfoo/wpull#195

a70dd781b8cdbeb1d03b1dd190429767f96bdc56 authored over 10 years ago by Christopher Foo <[email protected]>
Bump version 0.1001.1

f2f5f61e214e997c1de93efff1549b6544e89c79 authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'develop'

Conflicts:
wpull/version.py

c27248810d7b91c30a8668afa477f1bb2bc74346 authored over 10 years ago by Christopher Foo <[email protected]>
changelog: Update latest to 0.1001.1

[ci skip]

ed69d124785df3a1ad6634fd441807e547346855 authored over 10 years ago by Christopher Foo <[email protected]>
Fix docstring syntax errors.

4b026ef8b2ec649fb5da3857f686cf2ad8df7f9c authored over 10 years ago by Christopher Foo <[email protected]>
app: Add missing return

61436a5596ab24c190a848ece8546effbab5cd3c authored over 10 years ago by Christopher Foo <[email protected]>
writer_test: Use new-style trollius yield From.

d84435ed42257180a276ad6989afb620d2b1a05e authored over 10 years ago by Christopher Foo <[email protected]>
app_test,writer_test: Use Builder unit_test=True

430a4b883837f342f26f0ea011adb015b9292f57 authored over 10 years ago by Christopher Foo <[email protected]>
builder: Add unit_test param to output to stdout instead of syserr

977233723fba637e1915093e4d129b4586eecfee authored over 10 years ago by Christopher Foo <[email protected]>
recorder.warc: Use StreamHandler log to avoid double open for Windows

c9f468bd181902ec21ac15737dd03e525ba0a381 authored over 10 years ago by Christopher Foo <[email protected]>
Implement WIP FTP file writing.

bf8b1b9edd8eab193c122b31017ac89f0413175c authored over 10 years ago by Christopher Foo <[email protected]>
ftp.client: Add some callbacks needed by file writer.

aaeded1a40c3e51c3f4dfa229a65c9b6ea1831bf authored over 10 years ago by Christopher Foo <[email protected]>
errors,stats: Add Statistics.increment_error

0440a8cd7fad9ce0f87209dc25895412a2c49bde authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'develop' into issue/23_ftp_2

Need the test_dammit fix to fix the travis ci build

dba1cae0c6ac5b2e3ca072bce06c601d212f89c3 authored over 10 years ago by Christopher Foo <[email protected]>
thirdparty:test_dammit: Fix failure caused by chardet 2.3.0

75bbfe5dcf2ec68adcdc2c74017041cf282cadb0 authored over 10 years ago by Christopher Foo <[email protected]>
database.sqltable: Fix update to use O(1) instead of full table scan.

SQLAlchemy generated an UPDATE WHERE EXISTS (SELECT 1 FROM WHERE) instead
of using a JOIN or a UP...

7543fe0b98ea520c11314b25ce674e3f52df8305 authored over 10 years ago by Christopher Foo <[email protected]>
builder,options: Add WIP FTP processing.

664420db3b52a07e6d14d7e677ef872dbfad220a authored over 10 years ago by Christopher Foo <[email protected]>
processor.ftp: Add WIP implementation

c6e62cbb8c826e3e0133da13b2bda7cc95ef5dc2 authored over 10 years ago by Christopher Foo <[email protected]>
ftp: Add Response.reply attribute

67066c7fb431b1308c3aa4d9ba14d1b994471675 authored over 10 years ago by Christopher Foo <[email protected]>
processor.web: Clean up imports. Add back discard document logic.

6a5538ed892a37360f11ea938efed0c2cdcf96d4 authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'issue/193_cx_freeze_improve' into develop

82668ff3c0e3604c3f0f0cef0d32924eef261fa9 authored over 10 years ago by Christopher Foo <[email protected]>
doc: Add link to cx freeze downloads.

ea48c3184af5f658d6a55c46f359c6c0f44c91dd authored over 10 years ago by Christopher Foo <[email protected]>
Support cx freeze under Windows.

Closes chfoo/wpull#193

7e68e528fe90514d410655f72c34bdbfdb49c5d8 authored over 10 years ago by Christopher Foo <[email protected]>
Add makefile for cx_freezing on Debian/Ubuntu

re: chfoo/wpull#193

6fad6cf330e67286fd8bd5b1190bba50926de346 authored over 10 years ago by Christopher Foo <[email protected]>
options,__init__: Match prog description from readme.

ae8fde2fd52ca068f71fe36ee20c9ae8454fc5f0 authored over 10 years ago by Christopher Foo <[email protected]>
processor.rule: Fix wrong connection refused error type

23b24807592722a85532c5ab86fcb22c5a89180e authored over 10 years ago by Christopher Foo <[email protected]>
travis.yml: Install into and run tests in ./thematrix/

9ddd62f3a6ec4b58349a4ca2e983c7f74359ab5b authored over 10 years ago by Christopher Foo <[email protected]>
setup.py: Add missing recorder package and test files

905288083954f7ae80913ccafde6cecdbe972705 authored over 10 years ago by Christopher Foo <[email protected]>
processor.rule: Fix up statistics counter on handle_document()

7cdc6edda5f0f291412cf51a5d55a614f4e2dfaf authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'develop' into issue/23_ftp_2

Conflicts:
wpull/testing/py_hook_script2.py

5df1dd20d7c0d3b0346a186e48ad7d97449ce9ab authored over 10 years ago by Christopher Foo <[email protected]>
URLTableHookWrapper: Account requeued error URLs. Fixes negative counter.

f39ed246b0070b3c93329ef38a7e2d288ce263cf authored over 10 years ago by Christopher Foo <[email protected]>
Abstract response handling from WebProcessorSession to ResultRule

e2279a8d7f403b2e76b8c9cd485f6465c4201cca authored over 10 years ago by Christopher Foo <[email protected]>
body: Add directory and hint params to Body.__init__()

7c7146535b74495088c1d3ee56ce2b09503387bc authored over 10 years ago by Christopher Foo <[email protected]>
fixup! builder: Stub in FTP functions.

062cfc7a87f9ce8fc3ae19421f62ec84888f2c12 authored over 10 years ago by Christopher Foo <[email protected]>
builder: Stub in FTP functions.

42e7f7e7875204c620222bfc56dc0e1086e559e6 authored over 10 years ago by Christopher Foo <[email protected]>
urlfilter: Rename HTTPFilter→SchemeFilter, add FollowFTPFilter

572d19f418e950df409e353ed0fe008042de7e95 authored over 10 years ago by Christopher Foo <[email protected]>
phantomjs_test: Handle case where mock crash has no reply.

206d63c26d9ae94cdd05744f4d047dcae9c5a5ae authored over 10 years ago by Christopher Foo <[email protected]>
proxy: Support add and extract cookies for PhantomJS.

Closes chfoo/wpull#95

1c7377973205f68adddc8cd8d7af1be0fd0971f5 authored over 10 years ago by Christopher Foo <[email protected]>
requirements: Require trollius>=1.0.2

3a7164e5ba2d3acfef6649f34be3e14bc6fae0be authored over 10 years ago by Christopher Foo <[email protected]>
Add phantomjs.PhantomJSClient.close().

b632c414cb15a3d29533bdac13373bd31977a6d1 authored over 10 years ago by Christopher Foo <[email protected]>
phantomjs: Handle cases of PhantomJS crash during remote check-in/out

Closes chfoo/wpull#194

bc31e5045cafa511f77f9a8386a13161b9723ce4 authored over 10 years ago by Christopher Foo <[email protected]>
Add WIP FTP/Delegate Processor stubs.

38b86d2bcae14c0f50fa2c66e89ff5a9f5a43e46 authored over 10 years ago by Christopher Foo <[email protected]>
Add ResultRule. Move retry connrefused/dnserror params.

e2d42fa9068c8f96ff10ff2dbd15808d7ee56fbb authored over 10 years ago by Christopher Foo <[email protected]>
app: Don't print crash message on HookStop.

53dae8fb6904ec3c8500686f92a518361bc093bb authored over 10 years ago by Christopher Foo <[email protected]>
Move converter concern out of WebProcessor

fb9a3063ca9514ab89ff5eb506e1868e1f3ccee4 authored over 10 years ago by Christopher Foo <[email protected]>
url: Include brackets in IPv6 addresses on hostname_with_port()

b05aa7cd904ee4f84f3525446cde30cf2bbca7b7 authored over 10 years ago by Christopher Foo <[email protected]>
Update docs for recorder package.

a8379123b2d878c04d10e98583d696704e72069d authored over 10 years ago by Christopher Foo <[email protected]>
Fix recorder package import references

a5f540ab4116d1e2846689df4f9116d2f27f15a5 authored over 10 years ago by Christopher Foo <[email protected]>
Split recorder module into package.

3195f7fcace8bb76317362d5abfd4467c74f25a6 authored over 10 years ago by Christopher Foo <[email protected]>
Bump version to 0.1002a1

[ci skip]

f145fd56938e04217decf0f84f4dc146fe2ace4d authored over 10 years ago by Christopher Foo <[email protected]>
Bump version to 0.1001

9c681627095c29b34847cfeab393dbc424f6d4cc authored over 10 years ago by Christopher Foo <[email protected]>
Merge branch 'develop'

Conflicts:
wpull/version.py

03c899346fb6e8f31fc71b3c47729a20d9eefeb8 authored over 10 years ago by Christopher Foo <[email protected]>
changelog: Update latest to 0.1001.

[ci skip]

c16189b119e7ec4b4b93f516d7e653544fcc734d authored over 10 years ago by Christopher Foo <[email protected]>
connection: Rewrite check_out() to fix race condition causing hang.

clean() may have cleared out all connections in the host pool but the
original caller is still w...

6eeb4744c44a814d2a380412dd2e92689277ae1c authored over 10 years ago by Christopher Foo <[email protected]>
travis.yml: Don't run coverage under PyPy which causes timeouts.

e9e5eeff3f898765e33aa19a39f82d62256f4d43 authored over 10 years ago by Christopher Foo <[email protected]>
connection: Handle SSLErrors with "unknown ca".

551d099f4fb73f61bf4604b5c4b8f6de3ce5eb2b authored over 10 years ago by Christopher Foo <[email protected]>
app_test: Handle Travis CI slowness on test scripting hooks

212f3e81809b798fa1e31edd5f00a05de06748a9 authored over 10 years ago by Christopher Foo <[email protected]>
travis.yml: Add coveralls config. readme: Add coveralls badge.

2c7061abd79db312f6f2a8ac55bbce719463e8aa authored over 10 years ago by Christopher Foo <[email protected]>
Add unit test for observer module (coverage)

d5ab94da7a5007800b654f940ee817cddaa1b95d authored over 10 years ago by Christopher Foo <[email protected]>
document.util: Use function instead of duplicate code (coverage)

f2f9a6d9e0b3837f85cf95da3f3cb2e478142b07 authored over 10 years ago by Christopher Foo <[email protected]>
Add unit test decompression.gzip_uncompress() for coverage.

e552705726fcc1bc01ce3e3e3ed0f994c654224b authored over 10 years ago by Christopher Foo <[email protected]>
string.format_size: Fix format error when sizes are terabytes.

Increase coverage.

06b85edd063073dc873a1230c70031e2ec4e9777 authored over 10 years ago by Christopher Foo <[email protected]>
scraper.javascript: Fix nonsensical link scraping code

Regression introduced in 342acf5368d0b2fc484b4cf3e5e44fb390b9ab4a and
made worse by 342acf5368d0...

cc4255ef5d79eb9fc8c9f580d1593d3ca9657a59 authored over 10 years ago by Christopher Foo <[email protected]>
scraper.util: Use warning string consistent with url module.

[ci skip]

905e0e57a848bca2f137acf18598c8153ae8de94 authored over 10 years ago by Christopher Foo <[email protected]>
Add --debug-manhole option for installing Manhole socket

f0f30bfef40cb0df1d2ce6125aaf817a142b461a authored over 10 years ago by Christopher Foo <[email protected]>