Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/ArchiveBot

ArchiveBot, an IRC bot for archiving websites
https://github.com/ArchiveTeam/ArchiveBot

ignore irrelevant languages and .pl spam sites

efeae454e2688a0e2f740a8f18c441055274c6ac authored almost 10 years ago by Sanky Sanqui <[email protected]>
db: ic.cz: ignore prev/next links on web boards

8b03d1c2663c8a98c05c7e809fffbbbda5a934ad authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: ignore web poll thing

13df06674443619ac34a3e72569e880779372652 authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: ignore all site statistics.

Normally I'd be interested, but we just don't have enough time for
these.

3366153e847ee74e230a3884b51908cf7bf50bb9 authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: ignore targetx&targety= pairs that come from clicking maps

2c6ca1d281169842d216f3215bc7bad4e80873bb authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: ignore more reply/UI-state-change actions.

aba14afe0f0edfd08e37b7e97c3bbafd1131aebf authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: ignore negative indices for image galleries.

These don't yield anything useful.

ba242eef2ae3fdc4b4747e7052ef1639ef8072d4 authored almost 10 years ago by David Yip <[email protected]>
db: Also ignore album sort on coppermine thumbnail pages.

df1c290a40fe22bb868d0905444ddec6137d9b16 authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: even more calendars.

28d7cf1c8dca13fe2df47d21921c4e9f834195aa authored almost 10 years ago by David Yip <[email protected]>
Copy in latest dupespotter

7caf12bec527de49ca3af0fae72878b9400cdcf1 authored almost 10 years ago by Ivan Kozik <[email protected]>
db: ic.cz: Ignore sorts on shops, write-product-review pages

7608d3ba31d0e17a6be43562f9338a574e8c2077 authored almost 10 years ago by David Yip <[email protected]>
db: How many ways can _you_ write "calendar"?

a32c4fd28c36b4f8eae0a87be6c6984a77a3826c authored almost 10 years ago by David Yip <[email protected]>
db: Remove incorrect mode= string from ic.cz Phorum ignores.

6ef64fc83ced13cf76990193b4c539c982f1d1c7 authored almost 10 years ago by David Yip <[email protected]>
db: Ignore sort-order-in-query-string thing on Phorum boards

bcc34b9d7610ba626b320dee03050e924c6f42db authored almost 10 years ago by David Yip <[email protected]>
db: ignore more infinite-calendar-things on ic.cz.

e36e82961c14b76bfe6ae09ec07af2233c45dfd7 authored almost 10 years ago by David Yip <[email protected]>
travis.yml: Use quiet opts and travis_retry instead of for loop.

ac5e3a95eef6f6458d185a0773806502af42c1bf authored almost 10 years ago by Christopher Foo <[email protected]>
doc: Change Tumblr example to example.com to avoid naive copy & paste.

That ignore pattern doesn't always work
(4aec43731d1b4962a54794ea301927c31c24959e) so use generi...

2e9e0d585395ec223cea29e168edaf32aa35e9a9 authored almost 10 years ago by Christopher Foo <[email protected]>
travis.yml: Retry pip install

d601039e5ef92633005822ec7d336906ea8b2bf0 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Add stackoverflow link doc.

[ci skip]

e7af963f921a7797b9d4749ad9943d7538294a75 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Bump version.

a97fd361e7d84df36c2d1a5764f422f83c929417 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Flush after dupes print.

Closes ArchiveTeam/ArchiveBot#153

1bb22f181d6dc7bcfb7b2fa4c61fc9cf9c9fbdb1 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Fail early when sparse file not supported

cdb3a71ff82298f2249974384a5020c0283c7a8c authored almost 10 years ago by Christopher Foo <[email protected]>
ignore_patterns.singletumblr: Allow a.tumblr.com

Allow things like https://a.tumblr.com/tumblr_njvn2jIkir1unm52po1.mp3
served from http://dmcasaf...

4aec43731d1b4962a54794ea301927c31c24959e authored almost 10 years ago by Christopher Foo <[email protected]>
db: More ic.cz patterns.

In particular:

- harizzzma.com and nahraj.net no longer resolve, so don't waste time
trying
-...

ee36f6b3f6b596ad7e1e7551b428d521a8a22142 authored almost 10 years ago by David Yip <[email protected]>
Bump pipeline version. #137.

d47bf871d3eb499f29758f67da869ff2ce72f5cc authored almost 10 years ago by David Yip <[email protected]>
doc: Update other instances of --pipeline docs. #137.

3cac02f1d8edfe550ee387f3384a59d14c8ea82d authored almost 10 years ago by David Yip <[email protected]>
bot: Match verbiage for --pipeline with docs. #137.

887966cb6cf0d4aa2d1e53bae13a7f5c9333ade4 authored almost 10 years ago by David Yip <[email protected]>
Use pipeline_nick to control which custom queues we look at. #137.

a8f6c9f526b5f982d314f747da62b371ceada4a0 authored almost 10 years ago by David Yip <[email protected]>
Document new behavior of --pipeline option. #137.

8373500b6aff70a7d79553b0e114ebcb7b243308 authored almost 10 years ago by David Yip <[email protected]>
Revert "pipeline: Use session-timeout w/ wpull==0.1007. Bump version."

This reverts commit f454fd79eee78d35b6aacc5d72fe38581bca17bb.

Wpull stability problems.

4c91ae284e5514480219f9b20b2da3742212fd22 authored almost 10 years ago by Christopher Foo <[email protected]>
db: ic.cz ignore set - further refinements.

In particular:

- ignore more guestbook links
- remove viewtopic.php.*start= from set, because a...

a6b82c5ab5a4f672eff5768e44cab671a4ac7e92 authored almost 10 years ago by David Yip <[email protected]>
db: ic.cz: Also ignore &start=\d+ on forums.

This appears to be a pagination thing that we don't need.

4386320432e05745abad24e5cb54c670a52ba9b5 authored almost 10 years ago by David Yip <[email protected]>
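
As a rough illustration of the pattern named in the commit above, the Python sketch below tests URLs against `&start=\d+`. The helper name `is_ignored` and the example URLs are hypothetical, and the actual ignore-set entry in the ArchiveBot database may be anchored or scoped differently.

```python
import re

# Pattern text taken from the commit title; the real ignore set may scope it further.
START_PARAM = re.compile(r"&start=\d+")


def is_ignored(url: str) -> bool:
    """Return True if the URL carries a numeric &start= pagination parameter."""
    return START_PARAM.search(url) is not None


if __name__ == "__main__":
    # Hypothetical forum URLs, for illustration only.
    print(is_ignored("http://forum.example.ic.cz/viewforum.php?f=3&start=50"))
    print(is_ignored("http://forum.example.ic.cz/viewforum.php?f=3"))
```
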
db: More troublesome infinite-calendar loops on ic.cz.

6bd8eb3713ec97c614ff3b9317530c4472ec2b8e authored almost 10 years ago by David Yip <[email protected]>
db: An ignore set for unwanted URLs on ic.cz.

This could be broken up later, but this is much more convenient for now.

9f5a30adadbb8cec1676331349a917c633371ec3 authored almost 10 years ago by David Yip <[email protected]>
pipeline: Use session-timeout w/ wpull==0.1007. Bump version.

--session-timeout 21600 will kill a stream download longer than 6 hours.

f454fd79eee78d35b6aacc5d72fe38581bca17bb authored almost 10 years ago by Christopher Foo <[email protected]>
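
The 6-hour figure in the commit above follows from reading the `--session-timeout` value as seconds. A minimal check of that arithmetic, with a hypothetical helper name:

```python
SECONDS_PER_HOUR = 3600


def timeout_hours(seconds: int) -> float:
    """Convert a timeout given in seconds to hours."""
    return seconds / SECONDS_PER_HOUR


if __name__ == "__main__":
    # 21600 is the value quoted in the commit message.
    print(timeout_hours(21600))  # 6.0
```
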
pipeline: Bump version.

e22473b10001a2b19f946ada8abec353eb6a6a37 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Add preflight module and check wpull args

18ebab788d88a35253abb5b8fac395d3aa775fdc authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: seesaw.wpull: Fail early w/ KeyError instead of passing None

ab043867d50ac96ef80f95f4736e118b1cebd731 authored almost 10 years ago by Christopher Foo <[email protected]>
db: coppermine: also ignore last-commented-by order.

fda5f583dc00cf0e4543dc02972f4d7e338bac5c authored almost 10 years ago by David Yip <[email protected]>
Merge remote-tracking branch 'origin/master'

cc90a7778548cda44faee9e9e244d407ac676b9a authored almost 10 years ago by David Yip <[email protected]>
db: Restrict Coppermine album selector to displayimage.php.

cb0f115d408fbdcac357d9602b91777a37346b5d authored almost 10 years ago by David Yip <[email protected]>
db: Also ignore Coppermine's lastupby pseudo-album.

774b931ad73a497b78d4f69518ee0b48618e464b authored almost 10 years ago by David Yip <[email protected]>
db: Also ignore addfav.php for Coppermine.

a95123a3116ba6e66db2315af75b051a99ca9b76 authored almost 10 years ago by David Yip <[email protected]>
db: Add an ignore set for Coppermine Photo Gallery.

ic.cz has TONS of these things.

3e62d5ca1d3f2dc5f7b9f05c24004f878c8378ae authored almost 10 years ago by David Yip <[email protected]>
Copy in latest dupespotter

76ab86c9c180e042ae877f076ee2ca2fe24e6cc8 authored almost 10 years ago by Ivan Kozik <[email protected]>
Merge remote-tracking branch 'origin/reddit-over18-4'

707b014cbe94b7f772b4fd9647a2da7caae71c09 authored almost 10 years ago by David Yip <[email protected]>
Ignore more twitter share links

f9e2d8ee41aac885eb60320f224da3bcec533e86 authored almost 10 years ago by Ivan Kozik <[email protected]>
Merge pull request #152 from ArchiveTeam/topic/viewer_json_api

viewer: Add JSON interface to search endpoint

2f644962e7fe41c4f33827b8cdb647827dc110b8 authored almost 10 years ago by yipdw <[email protected]>
viewer: Add JSON interface to searcher.

Example usage:

curl -s 'http://HOST/api/v1/search.json?q=tumblr' | jq '[.results[].domain]'

0a0258b62d69365eeaf8152a2792a51f3bd4257b authored almost 10 years ago by David Yip <[email protected]>
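
The curl and jq example in the commit above could be mirrored in Python roughly as follows. This is a sketch under assumptions: the response shape (a top-level `results` list whose items carry a `domain` key) is inferred only from the jq filter, `HOST` is the same placeholder the commit uses, and the function names are hypothetical.

```python
import json
from typing import List
from urllib.parse import urlencode


def search_url(host: str, query: str) -> str:
    """Build the search endpoint URL shown in the commit message."""
    return "http://{}/api/v1/search.json?{}".format(host, urlencode({"q": query}))


def result_domains(payload: str) -> List[str]:
    """Collect the 'domain' field of each result, like jq '[.results[].domain]'."""
    data = json.loads(payload)
    return [item["domain"] for item in data["results"]]


if __name__ == "__main__":
    print(search_url("HOST", "tumblr"))
    # Hypothetical response body, shaped only to match the jq filter above.
    sample = '{"results": [{"domain": "a.example"}, {"domain": "b.example"}]}'
    print(result_domains(sample))
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so the sketch runs without a live viewer instance.
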
Bump pipeline version.

d79ed98792ce6dcafed7767b4b85120d62edd24d authored almost 10 years ago by David Yip <[email protected]>
pipeline: Use wpull 0.1006.1.

Version 0.1006.1 fixes chfoo/wpull#236, contains dupespotter plugin
support, and will force me t...

f447f5578006c38361fb5234e40471e0a545b793 authored almost 10 years ago by David Yip <[email protected]>
fixup! viewer: Handle percent-encode filenames with spaces

[ci skip]

15042610208704422c2617fe3446777bce650ac9 authored almost 10 years ago by Christopher Foo <[email protected]>
viewer: Return 404 when domain/job is not found

[ci skip]

60a66e7de4310b8e1c409f11df739fcb15a87d94 authored almost 10 years ago by Christopher Foo <[email protected]>
viewer: Handle percent-encode filenames with spaces

[ci skip]

7a1dbf62b4ce44c106f9286c324e6a32452acda6 authored almost 10 years ago by Christopher Foo <[email protected]>
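
For context on the commit above, a filename containing spaces has to be percent-encoded before it can appear in a URL path and decoded again on the way in. The sketch below shows the round trip with Python's standard library. It does not reproduce the viewer's actual code, and the filename is hypothetical.

```python
from urllib.parse import quote, unquote


def encode_filename(name: str) -> str:
    """Percent-encode a filename for use in a URL path (space becomes %20)."""
    return quote(name)


def decode_filename(encoded: str) -> str:
    """Reverse the percent-encoding to recover the original filename."""
    return unquote(encoded)


if __name__ == "__main__":
    encoded = encode_filename("my file.warc.gz")  # hypothetical filename
    print(encoded)
    print(decode_filename(encoded))
```
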
viewer: Lowercase nicks on leaderboard query

[ci skip]

411d46da37b1ebc231040f5d0b676ccbc1b4d322 authored almost 10 years ago by Christopher Foo <[email protected]>
bot: Limit DELAY_SPEC to 7 digits max.

Prevent accidentally sabotaging a job, for example with a delay that
would keep it from running for a week.

ae692d809015881f7473487b6b52061009377f0f authored almost 10 years ago by Christopher Foo <[email protected]>
bot: Change SET_CONCURRENCY to only match 2 digits of workers.

Closes ArchiveTeam/ArchiveBot#144

c91df5d8a654ec9ebf8c6bfafacefe774c903789 authored almost 10 years ago by Christopher Foo <[email protected]>
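
The two bot commits above bound numeric arguments by digit count. A minimal Python sketch of that idea follows. The pattern and function names are hypothetical stand-ins, not the bot's real `DELAY_SPEC` or `SET_CONCURRENCY` expressions, which may match more than a bare number.

```python
import re

# Hypothetical stand-ins for the digit limits named in the commit titles.
DELAY_DIGITS = re.compile(r"\d{1,7}")   # at most 7 digits
WORKER_DIGITS = re.compile(r"\d{1,2}")  # at most 2 digits


def valid_delay(text: str) -> bool:
    """Accept a delay value only if it is 1 to 7 digits long."""
    return DELAY_DIGITS.fullmatch(text) is not None


def valid_concurrency(text: str) -> bool:
    """Accept a worker count only if it is 1 or 2 digits long."""
    return WORKER_DIGITS.fullmatch(text) is not None


if __name__ == "__main__":
    print(valid_delay("9999999"), valid_delay("12345678"))
    print(valid_concurrency("12"), valid_concurrency("100"))
```
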
pipeline: Bump version. Require chfoo/wpull@277ea4bf as a hotfix.

84e0c1cf9c0a9559fa1370f3570a0837f9a7641f authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Check for body is None before accessing content_size

Closes ArchiveTeam/ArchiveBot#151

065b0cc2549224f72a16cd3611fffb2050962c74 authored almost 10 years ago by Christopher Foo <[email protected]>
Work around https://github.com/chfoo/wpull/issues/234

1dd3df225ae4fc3a0e8c478ac4bdfbe2e64acf93 authored almost 10 years ago by Ivan Kozik <[email protected]>
Merge pull request #149 from ArchiveTeam/topic/wpull_dupespotter_plugin

pipeline: Wpull with dupespotter plugin, escaped fragment, strip session ID

d0b38707c2b628c0ad81af9ea7edeb18cfe82c87 authored almost 10 years ago by Ivan Kozik <[email protected]>
pipeline: Bump version 20150201.01

0816de1076b2ea643f85fccd08079ab709d95099 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Add --strip-session-id & --escaped-fragment

d54eb7804b57cd59893770a2ca575815f7aaf89b authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Require seesaw>=0.7,<0.9

48211ef55bc3b1d670f6dafea41b2fd57b8b980d authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Update requirements for wpull dupespotter plugin

6c04cd2a7d84fea44d0f40f04c031827e3a2d793 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Copy dupespotter code from wpull hax branch

Copied from ludios/wpull@fd9d7ad97eca345984175cf71a4e8c38c54bb400

1dca49aaeb6803eb71cb346457cae0b6b909f92a authored almost 10 years ago by Christopher Foo <[email protected]>
Don't crash when another uploader grabs the file

319436267d78f46d9ae5e57ebdac02af08a4a9da authored almost 10 years ago by Ivan Kozik <[email protected]>
doc: Describe trailing slash, merge double status heading

Describe trailing slash and parents. Format options as code. Fix the
double "status" heading. Fi...

faaba51dd1c1a21702dad5515b3f572989e2d77e authored almost 10 years ago by Christopher Foo <[email protected]>
Document !whereis.

a3adc0a682fab06ba34abd971c405a9888cf4012 authored almost 10 years ago by David Yip <[email protected]>
Implement !whereis.

26b6e7b881042f9fb3f3add51de34e40b90efcfd authored almost 10 years ago by David Yip <[email protected]>
Merge pull request #141 from ArchiveTeam/topic/dashboard_3.0_rebase

Add dashboard 3.0 as a beta

5986635570c5cfdf888990a64170d336adac9a15 authored almost 10 years ago by yipdw <[email protected]>
Ignore non-Icecast mp3 streaming sites

fd8004a3dfcf586040fbc8bee9efe34ba8cf16d3 authored almost 10 years ago by Ivan Kozik <[email protected]>
viewer: Format cost leaderboard numbers

[ci skip]

976a5d098312f99b1aa39594689abbaf96ca9b32 authored almost 10 years ago by Christopher Foo <[email protected]>
fixup! viewer: Add simple cost leaderboard

Add missing template

9e8a5dbbd8edb1ccd89aea9f84df2b77849d5d8f authored almost 10 years ago by Christopher Foo <[email protected]>
viewer: Add simple cost leaderboard

Re: ArchiveTeam/ArchiveBot#108

e96bab3d0b692c21dda30228657dd175b0d82435 authored almost 10 years ago by Christopher Foo <[email protected]>
viewer: Show more details in search. Strip paths from search query.

1ac67c1428d10cc1f6d4ec80b040072aab3232a1 authored almost 10 years ago by Christopher Foo <[email protected]>
pipeline: Add CheckIP task. Bump version.

Closes ArchiveTeam/ArchiveBot#135

79f11f496fcb829e8c2d25ca829e83db3fdc77cf authored almost 10 years ago by Christopher Foo <[email protected]>
bot: Point histories link to the viewer link

afe63e8230456637cd4995ee69f29b8c5f7eb176 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Add "try 3.0 beta" link

c2d46d8b76c14cb553128ee5a68a97a7d0d7fe46 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Implement speed calculation

050c62779a898af969c405bc4b88cd92e3bc9315 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Remove get scrollTop call. Fix up log trim logic.

bc3b85f99b680b230b24dc805e601d889cc9c6d3 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Point JS path to server one

e28f37a2c7e675e9efb15a3a286256424e43822e authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Implement reconnect, full detail toggle. Use one-time binding.

458d7020757ee089722c2980d6bde6e5d6d7bb53 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Use fewer DOM elements.

a6b8add16d0217bae8ea6ab78fa9dbe69ef70d01 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Add antiscroll logic, clear button, button to view one job.

8d39d591ed71f2597102ee6a53a27dfb232e0ad0 authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Fix queue calculation. Add progress table. Manual log DOM usage.

f55fb031d2b1c4812b5b7fa1fad767171d92052c authored almost 10 years ago by Christopher Foo <[email protected]>
dashboard: Add WIP 3.0 version

What's new? The controller is replaced with AngularJS and the JavaScript
has been rewritten and ...

18b6c83496d2eea025d2389248f6eedac18c1439 authored almost 10 years ago by Christopher Foo <[email protected]>
Ignore more dokuwiki nonsense

35c47694163623a38f52742872acb83a2a160d89 authored about 10 years ago by Ivan Kozik <[email protected]>
Use wpull hax-26

f7140275ed078313386eda3dff1930c91d7b144c authored about 10 years ago by Ivan Kozik <[email protected]>
Fix erroneous rendering of border after filtering

1f5ec68bd41f9271aa15895b2cdcb687291d3689 authored about 10 years ago by Ivan Kozik <[email protected]>
Make log window animation a little less glitchy

5d0867304bb0188e6a988df7438d4d43ec30d9b9 authored about 10 years ago by Ivan Kozik <[email protected]>
Animate showing/hiding of log windows

29619c1d5f4f23beb6cecfb71332ffc150b8d6ad authored about 10 years ago by Ivan Kozik <[email protected]>
Show taller log window when there is only one visible

a9e230a88a92737f2d413c1e3d14a21edd1a107d authored about 10 years ago by Ivan Kozik <[email protected]>
Use wider dashboard margins on big screens

4d6de3225620f72390428f133ba96265d37b7966 authored about 10 years ago by Ivan Kozik <[email protected]>
Allow running multiple uploader processes on the same FINISHED_WARCS_DIR

d0fca000708fd2d4c26137ac44220bc1e77c5ce5 authored about 10 years ago by Ivan Kozik <[email protected]>
Merge branch 'master' into reddit-over18-4

7aa6fd55a25f1fc037ee4ed38a2d4a1af3b34529 authored about 10 years ago by Ivan Kozik <[email protected]>
Convert the dashboard filter box from substring search to regular expression.

This fixes a bug where clicking on stats information for a job would show
multiple jobs instead ...

990b70d2d62b6528e1762b91e8baf6fbc0066807 authored about 10 years ago by Ivan Kozik <[email protected]>
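
The difference between the two filter modes in the commit above can be sketched as follows. The dashboard itself is JavaScript, so this Python version only illustrates the idea. The job identifiers are hypothetical, and the assumption that the bug arose when one identifier was contained in another is a guess, since the commit message is truncated here.

```python
import re
from typing import List


def filter_substring(jobs: List[str], needle: str) -> List[str]:
    """Keep every job whose identifier contains the needle anywhere."""
    return [job for job in jobs if needle in job]


def filter_regex(jobs: List[str], pattern: str) -> List[str]:
    """Keep every job whose identifier matches the regular expression."""
    rx = re.compile(pattern)
    return [job for job in jobs if rx.search(job)]


if __name__ == "__main__":
    jobs = ["abc1", "abc12"]  # hypothetical job identifiers
    print(filter_substring(jobs, "abc1"))  # substring search matches both
    print(filter_regex(jobs, r"^abc1$"))   # anchored regex matches exactly one
```
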
Ignore some junk wordpress URLs

22a4506b6c854d8e0a7d889576269862eaba6653 authored about 10 years ago by Ivan Kozik <[email protected]>
Ignore another share link

64dc17af3432a8f313df2cd8593c22929f1fa6b2 authored about 10 years ago by Ivan Kozik <[email protected]>
Use non-fancy string.replace for ignore placeholders. #138.

d128fdf091766d755a6afaae673da274e8f856b1 authored about 10 years ago by David Yip <[email protected]>
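
One reason to prefer plain string replacement for placeholders, shown here in Python as a sketch: a template engine such as `str.format` also interprets other braces in the pattern, which regular expressions use for quantifiers. The placeholder name `{primary_url}` and the pattern below are assumptions for illustration, and the motivation recorded in issue #138 may differ.

```python
def fill_placeholder(pattern: str, name: str, value: str) -> str:
    """Substitute a literal {name} placeholder using plain string replacement."""
    return pattern.replace("{" + name + "}", value)


def fill_with_format(pattern: str, **values: str) -> str:
    """Substitute using str.format, which also interprets every other brace pair."""
    return pattern.format(**values)


if __name__ == "__main__":
    # Hypothetical ignore pattern containing a regex quantifier, \d{4}.
    pattern = r"^{primary_url}/\d{4}/"
    print(fill_placeholder(pattern, "primary_url", "http://example.com"))
    try:
        fill_with_format(pattern, primary_url="http://example.com")
    except IndexError as exc:
        # str.format reads {4} as a positional argument index and fails.
        print("str.format failed:", exc)
```
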