Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/ArchiveBot

ArchiveBot, an IRC bot for archiving websites
https://github.com/ArchiveTeam/ArchiveBot

Get the pipeline version automatically from git

This requires at least git 2.6.0 due to --date=format:x

aecfb01d0b5d2b39fd3ba3188d631996a3cb4fea authored over 5 years ago by JustAnotherArchivist <[email protected]>
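
The commit above derives the pipeline version from git and notes the git 2.6.0 requirement for `--date=format:`. A minimal sketch of how such a lookup might work follows. The function names, the `%Y%m%d` date format, and the `date.hash` version layout are assumptions for illustration, not taken from the ArchiveBot source.

```python
import re
import subprocess


def parse_git_version(version_output):
    """Extract (major, minor, patch) from the output of `git --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognised git version string: %r" % version_output)
    return tuple(int(part) for part in match.groups())


def supports_date_format(version_output):
    """--date=format:... needs git 2.6.0 or newer, per the commit message."""
    return parse_git_version(version_output) >= (2, 6, 0)


def build_version_command(date_format="%Y%m%d"):
    """Build a git command that prints '<commit date>.<short hash>'.

    The version layout is a hypothetical choice for this sketch.
    """
    return ["git", "log", "-1", "--date=format:" + date_format, "--format=%cd.%h"]


def pipeline_version_from_git(repo_dir="."):
    """Return a version string for the checkout in repo_dir."""
    version_output = subprocess.check_output(
        ["git", "--version"], universal_newlines=True
    )
    if not supports_date_format(version_output):
        raise RuntimeError("git 2.6.0 or newer is required")
    output = subprocess.check_output(
        build_version_command(), cwd=repo_dir, universal_newlines=True
    )
    return output.strip()
```

Checking the git version up front turns an obscure formatting failure on older installations into an explicit error.
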
Merge pull request #407 from JustAnotherArchivist/force-utc

Require that the time zone is set to UTC on pipelines

d258af0a76ddbce2e2f531926262159f4ed772b4 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Require that the time zone is set to UTC on pipelines

fc7d9de79ebf5faab95c1827b43db47f8f97651f authored over 5 years ago by JustAnotherArchivist <[email protected]>
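
One way a pipeline could enforce the UTC requirement at startup is sketched below. This is an illustration under assumptions: the helper names are hypothetical, and the actual check in ArchiveBot may be implemented differently.

```python
import time


def is_utc(std_offset, dst_offset, has_dst):
    """Return True if the given local-zone offsets describe plain UTC.

    std_offset and dst_offset are seconds west of UTC, matching the
    meaning of time.timezone and time.altzone.
    """
    if std_offset != 0:
        return False
    # A zone such as Europe/London has a zero standard offset but a
    # non-zero DST offset, so it must be rejected as well.
    return not has_dst or dst_offset == 0


def require_utc():
    """Raise if the process's local time zone is not UTC (hypothetical helper)."""
    if not is_utc(time.timezone, time.altzone, time.daylight):
        raise RuntimeError("The time zone must be set to UTC")
```
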
Merge pull request #406 from JustAnotherArchivist/pipeline-notifications

Notify IRC channel on pipeline changes

5cd1e38a11c62a4d65c16e2c340cf67e777273ba authored over 5 years ago by JustAnotherArchivist <[email protected]>
Notify IRC channel on pipeline changes

d26dedaa5dc50d20efb734fdc5ab2d427a367e82 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #404 from JustAnotherArchivist/register-pipeline-immediately

Register pipeline immediately on startup

3a2c8d266eed661a710d903162687563cf86d315 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Handle the data directory not existing

The data directory is created by seesaw when it starts running the Pipeline (or specifically, th...

b39fe701b98a98dde8e7aca475ec2409de9c465a authored over 5 years ago by JustAnotherArchivist <[email protected]>
Register pipeline immediately on startup instead of after having started all GetItemFromQueue tasks

1585458ad778036b32203422fbd76f628fa2b9cb authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #403 from JustAnotherArchivist/log-shipper-crash

Fix log shipper crash

48092a88c2a4b9251d7cd954dd767ed93eaad491 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Catch all Redis errors in the log shipper

Previously, only connection errors were caught. But if the disk on the control node was full, fo...

01a4e9620d290086a7d1f43e9e86978e9b4d121f authored over 5 years ago by JustAnotherArchivist <[email protected]>
Limit the log queue size and drop entries if they cannot be shipped fast enough

63810f21e60335736d32e23f9aa4e425c1563ebd authored over 5 years ago by JustAnotherArchivist <[email protected]>
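
The idea of a size-limited log queue that drops entries under pressure can be sketched as follows. The class name and the choice to drop the newest entry (rather than the oldest) are assumptions for this example; the commit message does not say which entries are discarded.

```python
import queue


class BoundedLogQueue:
    """A log buffer that refuses new entries once it is full (hypothetical)."""

    def __init__(self, maxsize):
        self._queue = queue.Queue(maxsize=maxsize)
        self.dropped = 0  # number of entries discarded so far

    def put(self, entry):
        """Enqueue entry without blocking; return False if it was dropped."""
        try:
            self._queue.put_nowait(entry)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Remove and return all currently queued entries, oldest first."""
        entries = []
        while True:
            try:
                entries.append(self._queue.get_nowait())
            except queue.Empty:
                return entries
```

Dropping instead of blocking keeps the producer (the crawl) running when the shipper falls behind, at the cost of losing some log lines.
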
Merge pull request #383 from JustAnotherArchivist/cogs-without-broadcast

cogs: Run the Broadcaster only if necessary, i.e. if tweeting is enabled

bd2abc6db7e50ffbd0cad4f3ac16fe4ffd40dba0 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #382 from Ghostofapacket/patch-5

Block uploading to an rsync URL that doesn't end in a slash

c7c53ee09f4840af3634e625f1209b61565202f0 authored over 5 years ago by JustAnotherArchivist <[email protected]>
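
A check of the kind described above might look like this minimal sketch. The function name and error message are hypothetical, and the example URLs in the usage note are placeholders.

```python
def validate_rsync_url(url):
    """Reject rsync upload targets that do not end in a slash.

    Hypothetical helper: returns the URL unchanged if it is acceptable
    and raises ValueError otherwise.
    """
    if not url.endswith("/"):
        raise ValueError("rsync URL must end in a slash: %r" % url)
    return url
```

For example, `validate_rsync_url("rsync://example.org/module/dir/")` passes, while the same URL without the trailing slash raises `ValueError`.
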
Merge pull request #386 from JustAnotherArchivist/ignore-sinkhole-2

Ignore a CBL sinkhole domain

2161a6e6533bc2124fa5740098aee6ed718fef3c authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #402 from JustAnotherArchivist/check-webservers

Check for local webservers

86cb4257a7a7673ef5bbc13985489c0f4dff9bf3 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Check for local webservers

ad0f1616e2a3699f0c8c2eb858cf185969000f41 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #401 from JustAnotherArchivist/check-nxdomain

Verify that inexistent domains do not resolve

a7ac103fbf1badead45768ad1a5736b2ef7075a9 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #396 from JustAnotherArchivist/keep-failed-logs

Compress the log file if wpull didn't finish cleanly (and didn't write a meta WARC)

38f81b5795d37a2930b7cbdfabc48a72ddc6cc92 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Verify that inexistent domains do not resolve

9e5e39ed3d33396a199de3e3ddbfcede77c0e285 authored over 5 years ago by JustAnotherArchivist <[email protected]>
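
A check that nonexistent domains do not resolve can be sketched as below, for instance to detect a resolver that rewrites NXDOMAIN answers. The function names, the random 32-letter label, and the `.com` suffix are assumptions for illustration; the resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import random
import socket
import string


def random_domain(rng=random):
    """Return a random name that is extremely unlikely to be registered."""
    label = "".join(rng.choice(string.ascii_lowercase) for _ in range(32))
    return label + ".com"


def nonexistent_domain_resolves(resolve=socket.gethostbyname, domain=None):
    """Return True if a (presumably) nonexistent domain resolves anyway.

    resolve is any callable that takes a host name and raises
    socket.gaierror when the name cannot be resolved.
    """
    if domain is None:
        domain = random_domain()
    try:
        resolve(domain)
    except socket.gaierror:
        return False  # expected: the name does not exist
    return True  # suspicious: the resolver invented an answer
```
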
Compress the log file if wpull didn't finish cleanly (and didn't write a meta WARC)

63f60fa243b54948a73a77eb070bbfe7d0a760b1 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #399 from JustAnotherArchivist/update-tests-201908

Update tests (August 2019)

686b8951d373d4e5fb3f2b46e63377f9af4ba068 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Remove quiet flags on pip installs

6f7579d644c7410b76b8ad24e78dbc0490a66e40 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Explicitly run /usr/bin/$PYTHON instead of $PYTHON because pyenv interferes with the $PATH, breaking Ruby: https://github.com/pyenv/pyenv/issues/789 and https://github.com/travis-ci/travis-ci/issues/8486

26eeec1e5d2cb076fdc16d7515e00f455174bc9b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Replace outdated deadsnakes PPA that hasn't been updated since 2017

7a2a7693414e5314852f2625ea75ebd9abdb6029 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Fix broken ignore pattern in meetupeverywhere

5c858b080dc3a2448d05cb4323b0989b4519cf65 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Use PYTHON env variable instead of hardcoded python3.4 executable for pipeline test runner

35af69bcde3e5aae25bba78ac67f1666b456cfc9 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Run tests on Python 3.5 and 3.6

308a5fde2b2b57aaf2f1a4cecef2577353697c14 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #390 from JustAnotherArchivist/phantomjs-removal

Remove PhantomJS support

14c4a8423fc1df54c9558e42990923894ccc64d1 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Remove PhantomJS support

* PhantomJS is no longer maintained. The last stable release was in early 2016, little developme...

abcb77fac8ecd849b0aae9d7b78e632505430b33 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #388 from JustAnotherArchivist/rsync

Check that rsync exists in pipeline and add it to the installation command

7aad15ae7817a66cd098b7d82423a225b6fa9e17 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Check that rsync exists in pipeline and add it to the installation command

323e1a959f5a43f36f3e40275d74c1200f9a9820 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Ignore a CBL sinkhole domain

www<dot>cgzxb<dot>com is associated with some malware, and accessing it causes a listing on the ...

5482b4f585414c379b9b8173f04aa431025c3e97 authored over 5 years ago by JustAnotherArchivist <[email protected]>
cogs: Run the Broadcaster only if necessary, i.e. if tweeting is enabled

The Broadcaster uses a ridiculous amount of CPU, and if tweeting's disabled, that's just a waste...

9ce5206d33d6af16a9b60e026d5574290305a55b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Update loader.py

https://github.com/ArchiveTeam/ArchiveBot/issues/305

Fixes issue 305.

d9df1e5b1af8ee9c2bc2c7cc9a68f261d0c66400 authored over 5 years ago by Matt Iggo <[email protected]>
Merge pull request #381 from JustAnotherArchivist/ignore-more-share-links

Ignore more WhatsApp, Facebook, LinkedIn, and LINE share links

7d7fd51aedd07b9aef6a2a1f0e35b2fae61bc42c authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #380 from JustAnotherArchivist/ignore-blogs-wpfc-minified

Ignore WordPress Fastest Cache's minified JS recursion

717cae6969f048006e889c83798cfe42fab22fa1 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Ignore more WhatsApp, Facebook, LinkedIn, and LINE share links

fbc7ab25b7d6cb54b1a1ff71189b1fd357550258 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Ignore WordPress Fastest Cache's minified JS recursion

bf157e5738d35786094526ca4c71373c87d6d169 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #377 from JustAnotherArchivist/dashboard-include-js-dependencies

Self-host the dashboard's JS dependencies and update them

6d8f2996d90646e8b7bbd47997c051553c80c76b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Self-host the JS dependencies and update them

jQuery from 2.1.1 to 3.4.1, DataTables from 1.10.2 to 1.10.19.

ef37770938b3adda75b3110ce80f5a682e75334f authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #376 from JustAnotherArchivist/useragent-update

Update user agents to current versions

ce1115cf9cc3a4f6412967fc421118d5b1c3366d authored over 5 years ago by JustAnotherArchivist <[email protected]>
Update user agents to current versions

Mostly based on https://techblog.willshouse.com/2012/01/03/most-common-user-agents/

Couldn't fi...

9e94db32fb9232c5c293780e1a6baa8541a6bf40 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #375 from JustAnotherArchivist/ignore-blogs-revpagination-custdomain

Ignore reverse pagination on custom Blogspot domains as well

3ee8f321f41c1f8d35a623406abcbe96e8f7aaa9 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Ignore reverse pagination on custom Blogspot domains as well

2327e205ca09c0c917f3a1d4e9811af415b049ed authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #374 from JustAnotherArchivist/ignore-mediawiki-languages

Add separate igsets for MediaWiki locales and a script to generate them from MW message files

b2917de5a35d0564336860634f951be530733f00 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #373 from JustAnotherArchivist/ignore-github

Add igsets for GitHub

cb73a624eaa6bb81aad5c43f46b0abde0a924005 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #372 from JustAnotherArchivist/ignore-mastodon

Basic Mastodon igset

040aadd86f0cbc20fd036d4af3f5eb92e1f2ce13 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #371 from JustAnotherArchivist/ignore-blogs-update

A few blogs igset improvements

c08e4de42390a28f344fcd3beab23371485dc4ec authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #370 from JustAnotherArchivist/ignore-squarespace-js-hell

Ignore Squarespace JS hell

6a5cb653ef4d18fb147c66356f466c1ec1b0426d authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #369 from JustAnotherArchivist/ignore-mediawiki-various

Various improvements for MediaWiki ignores

c842c1f7f298783959ee57b97b28f746281ac054 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add macOS Chrome UA (#366)

8737d8a6ca8cd2739a36a4a8769755ea0c753ef5 authored over 5 years ago by Flashfire42 <[email protected]>
Merge pull request #365 from Flashfire42/patch-1

Update Safari UA

bc1d9b96f6add654a590bf28862ba0150386e46b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add separate igsets for MediaWiki locales and a script to generate them from MW message files

1514d162e035c4cb1d304b41ced32a9f971638f8 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add igsets for GitHub

"github" is the general igset that should be used with GitHub grabs. "nogithubcode" ignores the ...

5235b191e1b525805992f51f2f2a72b1e6b9629a authored over 5 years ago by JustAnotherArchivist <[email protected]>
Basic Mastodon igset

edc80952495f154c987f83eeafd6a80710968775 authored over 5 years ago by JustAnotherArchivist <[email protected]>
A few blogs igset improvements

- Fix CSI ignore on Blogspot, and remove search and label ignore since those pages are actually ...

3b90d44286e79aadcbc0c24670b0308a35b8e1d0 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Ignore Squarespace JS hell

bbd5807ec3569ac55386e39a60c96fc88cdb8ef1 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Various improvements for MediaWiki ignores

- I've seen a couple jobs recurse through the entire log of a user with limit=1. Nothing good ca...

900586fe05912a2c0bfb86b3b7760ecdb3bdb49c authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #367 from JustAnotherArchivist/new-dashboard-websocket

New dashboard WebSocket server

467cd28b02857520007b23217db6f3a0ea1f034b authored over 5 years ago by JustAnotherArchivist <[email protected]>
New dashboard WebSocket server

The old server was unable to keep up with the messages (#333), ate all the CPU and RAM it could ...

286970b33c89782703a795c30b690256ca2ffd6c authored over 5 years ago by JustAnotherArchivist <[email protected]>
Update safari.json

Updated Safari UA for Mac OS X with the string for High Sierra

79eb529f827a2b11b08cdeccda14898222293d3a authored over 5 years ago by Flashfire42 <[email protected]>
Add Edge UA (#364)

5013966112419ac6c88eabee6a030cc07be80032 authored over 5 years ago by Flashfire42 <[email protected]>
Merge pull request #363 from JustAnotherArchivist/dashboard-pending

Add pending job list to the dashboard and disable !pending when the queue is long

336cc412bc6ca4d5b7ec982e88f04935e7562fb3 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Disable !pending if the queue is too long

9b4ee44257d0feaf7228a439b4614eeb997f818f authored over 5 years ago by JustAnotherArchivist <[email protected]>
Change format to match that of the !pending output

176cb710fc99c5b88ab072668fef01c6daa61d21 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add page listing pending jobs to dashboard

522bcc6594c2a84ae8cb32556c1e9b787a894678 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #357 from JustAnotherArchivist/bot-own-explain

Allow unauthorised users to add an explanation to their own jobs (fixes #223)

7f8ef3202c6ed976dba2cc0929d27fc9f328358f authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #356 from JustAnotherArchivist/bot-concurrency-alias

Add alias --concurrent for !a/!ao command and !concurrent for !concurrency

eeb9e2e109280a7843ea364ebcb0f324631b8f8b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #355 from JustAnotherArchivist/bot-status-pending-ao

Add pending-ao counter to !status reply

8c045f0547d38105ed67321b25ce2c67f70ca8b7 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #360 from JustAnotherArchivist/error-handling

Add some error handling

a50778c51453ff12f9690d37b22ae1b63512bee6 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add some error handling

6421c07ec88c739e8a2dea3faf72b2b3388cab91 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #354 from PromyLOPh/monitor

Add WPULL_MONITOR_DISK/MEMORY env vars

2b8ef1f6cc89d34ea2c70e2cfab4b4df9fba219b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Bump pipeline version

008495b45c38a629bde316ab28cffb67f34148de authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add !concurrent as an alias for !concurrency

77b0b86def8171fcdb18ca4ca77b1e2af8bea4e8 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Allow unauthorised users to add an explanation to their own jobs (fixes #223)

90622b50b8b1d3df1cd6e8f9f17fd9223b56f4d4 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add alias --concurrent for !a/!ao command

wpull's option is called --concurrent, so it makes sense to support that on ArchiveBot as well.

35af01a175d5fd6f4b7d5085a237db68114d95a8 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add pending-ao counter to !status reply

b0977ce5b573824899624dc9dcf6dfc034468068 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add WPULL_MONITOR_DISK/MEMORY env vars

The defaults might not fit every pipeline and are a bit low when it comes to large file download...

3a037042eb56c3b10b319f2378537686defac36c authored almost 6 years ago by Lars-Dominik Braun <[email protected]>
Merge pull request #350 from hook321/patch-2

Updated grab-site repo URL in README

47b0f4371a95c169c19381125743ec99199f9583 authored almost 6 years ago by JustAnotherArchivist <[email protected]>
Updated grab-site repo URL in README

8d0427f65af4b1eb0ecc5e31d311789f58bc13d1 authored almost 6 years ago by hook <[email protected]>
Merge pull request #337 from JustAnotherArchivist/pending-ops

Allow only opped users to add jobs when there is a queue of 5 or more jobs

0a82925930358ef3be23b30f94fadee410231efa authored about 6 years ago by JustAnotherArchivist <[email protected]>
Merge pull request #330 from anarcat/patch-1

document the archive --large flag

0cbc710b815eb8c9f430833cfb28faa6cbcbc4e1 authored about 6 years ago by JustAnotherArchivist <[email protected]>
Allow only opped users to add jobs when there is a queue of 5 or more jobs

dd5bae2f26761accf8fc316edf8ae9f74525e683 authored about 6 years ago by JustAnotherArchivist <[email protected]>
document the archive --large flag

It's in the source but not in the docs.

4143ca59eda0ed8eeb50744a2d0589ad6136e149 authored over 6 years ago by anarcat <[email protected]>
Merge pull request #327 from JustAnotherArchivist/disable-ytdl

Disable youtube-dl until #291 is fixed

9f194a4b6df5f8e90b10af2f1755350d81efca2c authored over 6 years ago by David Yip <[email protected]>
Disable youtube-dl until #291 is fixed

196fd335fb22e1734483d901035ab66d803465dd authored over 6 years ago by JustAnotherArchivist <[email protected]>
pipeline: Bump pipeline version

This reflects the ignoracle speedups.

958583bb0c380568640cabd96eba00591af9d5c3 authored over 6 years ago by David Yip <[email protected]>
Merge pull request #326 from anarcat/terse

make job status less chatty

90bf00bcd7665c1278551bb8fa3bc89b193b6aa5 authored over 6 years ago by David Yip <[email protected]>
use MiB to be consistent with dashboard...

... and the rest of the universe

a05779825d86ac2294a7609379655d1d695fd062 authored over 6 years ago by Antoine Beaupré <[email protected]>
remove dashboard reference in job status

The reference is noisy for nothing - it's already in the channel topic and it's not used in othe...

78abe5b242b20a4a2f46296c6ffaac835e695c58 authored over 6 years ago by Antoine Beaupré <[email protected]>
make job status less chatty

The job status is currently too verbose and spams the channel with four lines of log.

So instea...

69b525ff21f3ca32d1f1f18701fdfb8408742ec7 authored over 6 years ago by Antoine Beaupré <[email protected]>
Merge pull request #325 from ivan/dashboard-cherry-picks

Import grab-site's dashboard fixes

8df159a3048926cb77181d36e1b6d9fbd6cbf0b1 authored over 6 years ago by David Yip <[email protected]>
Merge pull request #262 from JustAnotherArchivist/ignoracle-regex-cache

Cache the ignoracle patterns while primary_url and primary_netloc stay constant

fc99354db4010bcf649ccd63116ba643975ab234 authored over 6 years ago by David Yip <[email protected]>
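
The caching idea in the pull request above, recompiling patterns only when `primary_url` or `primary_netloc` changes, can be sketched as follows. The class name, the `{primary_url}` and `{primary_netloc}` placeholder syntax, and the `compile_count` counter are assumptions for this example rather than the ignoracle's actual interface.

```python
import re


class CachedPatternSet:
    """Compile ignore patterns once per (primary_url, primary_netloc) pair."""

    def __init__(self, patterns):
        self._patterns = list(patterns)
        self._cache_key = None
        self._compiled = []
        self.compile_count = 0  # how often the patterns were recompiled

    def _compiled_for(self, primary_url, primary_netloc):
        key = (primary_url, primary_netloc)
        if key != self._cache_key:
            # Substitute the placeholders, then compile; this is the
            # expensive step that the cache avoids repeating.
            self._compiled = [
                re.compile(
                    pattern.replace("{primary_url}", re.escape(primary_url))
                    .replace("{primary_netloc}", re.escape(primary_netloc))
                )
                for pattern in self._patterns
            ]
            self._cache_key = key
            self.compile_count += 1
        return self._compiled

    def ignores(self, url, primary_url, primary_netloc):
        """Return True if any pattern matches url."""
        compiled = self._compiled_for(primary_url, primary_netloc)
        return any(pattern.search(url) for pattern in compiled)
```

Because consecutive URLs in a crawl usually share the same primary URL, a single-entry cache like this is enough to skip most recompilation.
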
dashboard: use content="no-referrer" instead of the obsolete content="never"

132d56a68ef827c3ec7821054833b36559b1aa7a authored over 6 years ago by Ivan Kozik <[email protected]>
dashboard: keep table aligned when a crawl has > 9 connections

773904767c49fbd8ccf581d9f17b0d7fdb4023a2 authored over 6 years ago by Ivan Kozik <[email protected]>
dashboard: adjust code formatting

5c29a4860a4ebe68199424320e8980ab0880a61f authored over 6 years ago by Ivan Kozik <[email protected]>
dashboard: make the background a little less saturated

171221982d4044d3e57fc630a760eeb65baa7e8f authored over 6 years ago by Ivan Kozik <[email protected]>
dashboard: add a subtle box-shadow to the log windows

b2c2bc5a59fbc8c9ee65399cccf371bb3f561729 authored over 6 years ago by Ivan Kozik <[email protected]>
dashboard: adjust color to make it more obvious that stats line is a click target

fdb29ab7ba5e11e95c545f44bb89b91a3f55b416 authored over 6 years ago by Ivan Kozik <[email protected]>
dashboard: for Chrome 63+, use the faster `overscroll-behavior: contain` instead of attaching an onwheel event.

c6ae77ee696e921e82a9510b34bc0ff342dbb99c authored over 6 years ago by Ivan Kozik <[email protected]>