Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/ArchiveBot
ArchiveBot, an IRC bot for archiving websites
https://github.com/ArchiveTeam/ArchiveBot
9b6fe920cd0910febc4f03f77b7d8b6e71e8d481 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Fix syntax warnings in uploader
4d68ae48113720b45051706fb381fa700c6e92d2 authored about 4 years ago by JustAnotherArchivist <[email protected]>
79e487ff16bce765d18d260c47f50d9ba90f4d42 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Add a status endpoint to the dashboard
f00443eacfb345a5e6c2e8bb09009cb0f4a23315 authored about 4 years ago by JustAnotherArchivist <[email protected]>
6f29a27a9243ebd87d18c5147b2625e42af21685 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Support ipv6, fixes issue #315. For IPv4 forcing, create a pipeline that does not have an IPv6 ...
c4a2a5656ec391843fea881d07ee3af9fce2d330 authored about 4 years ago by Falcon Darkstar Momot <[email protected]>
196dc7a8ff2c56c4cdbf545826f644650343a416 authored about 4 years ago by Falcon Momot <[email protected]>
option to allow it
409722ad0296a47bc0a162964daac07e3131135f authored about 4 years ago by Falcon Momot <[email protected]>
Add AnyClip to badvideos
b6c6aeaaf15d2033f6c67d548437eb044e4d2397 authored about 4 years ago by JustAnotherArchivist <[email protected]>
16dcbd460e6d619ee72fe58c663b6c06f15c61cc authored about 4 years ago by JustAnotherArchivist <[email protected]>
Add (temporary) badvideos igset
164f990406b87ace1e4343f4aac365cb5dd8ed54 authored about 4 years ago by JustAnotherArchivist <[email protected]>
This igset is only intended as a temporary workaround until #443 is implemented properly.
Does n...
Adjust IRC rate limits
751450dc94bf49e70dff86845da710cd1605b032 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Cinch's standard limits of 10 queue size and 0.5 per second are a bit odd. 1 per second is gener...
6486005c41b48e00f39a3475acca80e20db862a8 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore Google login
602296d62d1127d73cd85a078000949596d5248d authored about 4 years ago by JustAnotherArchivist <[email protected]>
b24b0df22ed122ae7e319e04c9bbb8d25767b2f3 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore Wikipedia's Special:CentralAuth correctly
367bd0bb60758443d6d0af578ff35fd4c1a24f9a authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore more unnecessary GitHub URLs
38a55821cb5f50eb0c9ddaf519a2a933906c7656 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore Iterate/iteratehq's emoji mess
e208aae5b5b1393411be5a0ff961e8de504feb47 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore IA URLs more thoroughly
2a0cf961ef87a8f1c0a614d2f1bebeddb9a654bc authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore another CBL sinkhole domain
9f0a0800c22eb70ed8fd5e8895ed48e0b8319367 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore another 'clap' URL and the admin panel on FC2 blogs
57f5334740009d81e7191ca3ab6c07548ac9234d authored about 4 years ago by JustAnotherArchivist <[email protected]>
Ignore wpfc-minified cache globally
effd77706030d93f2d5c40ff1b4cfa1bc6c37622 authored about 4 years ago by JustAnotherArchivist <[email protected]>
Follow-up for #431
10d4582fe16ca45f357acb0291485285127daef7 authored about 4 years ago by JustAnotherArchivist <[email protected]>
- All searches on 'Actions'; there's heavy rate-limiting, and they're hardly useful anyway.
- us...
35d80fb70b7cdb906dcec97fd248564ee2a7ae21 authored about 4 years ago by JustAnotherArchivist <[email protected]>
d4b5eaba26c58c2dfff4f5d179eb2d2dfc3c78df authored about 4 years ago by JustAnotherArchivist <[email protected]>
Links directly to specific servers (ia######.us.archive.org) should not be used but are present ...
8b532a9568bfca99a6e45532a4d3ecfabac3ad96 authored about 4 years ago by JustAnotherArchivist <[email protected]>
12745dcafa08c1768a3d5240a14d814f02f6785b authored about 4 years ago by JustAnotherArchivist <[email protected]>
Blog outlinks are common throughout the web, not only on jobs for an actual blog.
7d09c2eec98c48afac63d37e66dc17b9cf607c8e authored about 4 years ago by JustAnotherArchivist <[email protected]>
Enable IRC SSL verification
6901c018631be5c88137923a8ded4a777b194659 authored over 4 years ago by JustAnotherArchivist <[email protected]>
aeb1ddcacc7d1233328b06dc7abd11a923c8c7eb authored over 4 years ago by JustAnotherArchivist <[email protected]>
Add SASL support
c71a3b0087f999e9d09a23ba24fe7507b3d2bd90 authored over 4 years ago by JustAnotherArchivist <[email protected]>
e4a730a76162dc8588f429ebb95351df0ca8fcbf authored over 4 years ago by JustAnotherArchivist <[email protected]>
Handle alternative FC2 blog domain, comment edit links, and claps
0ee607bace08a57b7b7c7116c273629a2b625996 authored over 4 years ago by JustAnotherArchivist <[email protected]>
abc55c239a7e07314507fefc1b7219a6798387d4 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Handle FC2 blogs with numberless domains
584cfc6b0611e5cba546f7929556ce5ec8058231 authored over 4 years ago by JustAnotherArchivist <[email protected]>
0a5b780162d2d1461cfbf1ea2ef3e1ec4de759a8 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Add ignore sets for FC2 blogs and wikis
58199942a4c79bbb2b0edeb87cfeee92675bf0cc authored over 4 years ago by JustAnotherArchivist <[email protected]>
1bf9b0319e6782d6a478a4b9efbfa43b70e4a93b authored over 4 years ago by JustAnotherArchivist <[email protected]>
Ignore Littlstar video files
7bcef706590555279560c9111401b814ea931c29 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Some of these files are gigantic, especially the ones in the S3 bucket (4K VR etc., the largest ...
fe58cef13a1d8f745b6666059bdd8f5bc0a0d33e authored over 4 years ago by JustAnotherArchivist <[email protected]>
Ignore another version of Tistory login URLs
965488bdc03ce7ece785989d933af991a7dcc2dc authored over 4 years ago by JustAnotherArchivist <[email protected]>
84aa3aa3a5f689e9b144fc02212b19edbd0c6393 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Ignore various unnecessary or broken URLs on GitHub
3585ed999010665a7b367e37fd6f325f30a23983 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Sync other MediaWiki locales to current mediawiki igset and improve scripts for doing so
0c43c97027d800a2e546d19bef4ed387a5789d29 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Fix webmachine-sprockets dependency
736e7477a3029cd07eb3df364d020e9b58f10aab authored over 4 years ago by JustAnotherArchivist <[email protected]>
https://github.com/lgierth/webmachine-sprockets was deleted or made private sometime between 202...
1264c409c15bcd8f11704034e60a3c6dedeb73f0 authored over 4 years ago by JustAnotherArchivist <[email protected]>
- First three additions are 404s (data-url, possible available with special headers or something...
4ca6f4a6b48540c1d72de0a86461a8deeaef9f71 authored over 4 years ago by JustAnotherArchivist <[email protected]>
ee93927d03196fbc501bfcd3cf14a0c994ab4ae2 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Exempt Tumblr's video hosting domains
d3db653e85f42ae3b7d4bcf0986b18e4e1893eb4 authored over 4 years ago by JustAnotherArchivist <[email protected]>
93174123ffd2a0519b0340994a146e8fe7e3ee74 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Support custom domains on Tistory
d04441de2a6140bd171565523fe524554726c59b authored over 4 years ago by JustAnotherArchivist <[email protected]>
E.g. https://www.bycpp.net/ == https://fastaping.tistory.com/
8eac8359ba1ad4302be9c3e20765c90d2791e60d authored over 4 years ago by JustAnotherArchivist <[email protected]>
Ignore infinite calendar pagination on Tistory
733056fe65278d097e43f52408120455e1845f48 authored over 4 years ago by JustAnotherArchivist <[email protected]>
49495ca3a3a7ba795ccf607554d29feeddb26389 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Add Tistory igset
111f397e18fbe4595a73648f9552e02d93d815cb authored over 4 years ago by JustAnotherArchivist <[email protected]>
3fd05d597b51a87b065cfbbabd102775523590bd authored over 4 years ago by JustAnotherArchivist <[email protected]>
Fix other pending counter
91b5531cb8ef95b84c432db15d34da48ad2df376 authored over 4 years ago by JustAnotherArchivist <[email protected]>
ffc324d500a353aa690df0837596e75d47e1bacc authored over 4 years ago by JustAnotherArchivist <[email protected]>
Report total number of jobs pending in other queues on !status
f8d903f344b99b1d3a509d03bb3f24f15e52da1e authored over 4 years ago by JustAnotherArchivist <[email protected]>
218cd6d1644a5767432ca147aebfc5bcf3ecf203 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Extend and update Dreamwidth/LiveJournal igset
2e71c9a14556cc0f2a826def74c46f13a5194a9e authored over 4 years ago by JustAnotherArchivist <[email protected]>
Updated most of the existing ignores to be more precise, and added a large number of ignores app...
4ff6499b93e52849eb3dd97ebeb45c8c32f55e08 authored over 4 years ago by JustAnotherArchivist <[email protected]>
Ignore pages on Gamepedia wikis that include things from the whole network
255e02631136d0b27942ae22ff766e6521d5ea36 authored almost 5 years ago by JustAnotherArchivist <[email protected]>
* Special:WikiPoints/global lists all users that have ever made an edit on any wiki in the netwo...
838783ff0fe7ede05fd32f7ae7b602f970b79d89 authored almost 5 years ago by JustAnotherArchivist <[email protected]>
Add compact ignore list view which collapses based on the existing igsets
76a8b7ff4ca717abe4d732fd082a41e6292108ae authored almost 5 years ago by JustAnotherArchivist <[email protected]>
Based on zino's implementation in Pike: https://gist.githubusercontent.com/PeterBortas/f0799a495...
f124086e709f4cd3f7b9ac0965949411477f45ea authored almost 5 years ago by JustAnotherArchivist <[email protected]>
Add more ignores for Reddit user pages
16b3f7ca93fb153cd92d46718a23a077f14a820d authored almost 5 years ago by JustAnotherArchivist <[email protected]>
Ignore excessive "Actions" filters on GitHub
66435d3f4eea6e7b1983db975ffcf6d83077ade4 authored almost 5 years ago by JustAnotherArchivist <[email protected]>
a89717c8e946536e846418caee89521fc0f2cb5e authored almost 5 years ago by JustAnotherArchivist <[email protected]>
27c94a2cd8ac4be7fe4e12d1561df39670f668f8 authored almost 5 years ago by JustAnotherArchivist <[email protected]>
Include CentralAuth to mediawiki ignore rule
842415b66a552235335314359ef17eadd150fa2c authored almost 5 years ago by JustAnotherArchivist <[email protected]>
On wikimedia wikis, CentralAuth is the pandora's box responsible for
crazy amount of other WMF-w...
Support weaker SSL/TLS connections for a broader compatibility with outdated web servers
38f77fffd7d52dd0b1a09c2c29bbe4fd55669211 authored about 5 years ago by JustAnotherArchivist <[email protected]>
wpull's `--no-strong-crypto` allows for SSLv2 and SSLv3 connections (if the OpenSSL library is b...
e6b4c2a1081ca3886ccfb4767fba7dafabdb1cb4 authored about 5 years ago by JustAnotherArchivist <[email protected]>
Ignore "BlogArchive" widget on Blogger
7691ba8c191721d32198f1e6adf65e704fcec4af authored over 5 years ago by JustAnotherArchivist <[email protected]>
Adjust Squarespace JS hell ignore
c39186335613c858242ea6cb000fc9fa2a9122e3 authored over 5 years ago by JustAnotherArchivist <[email protected]>
This still lets it grab the entire tree on the homepage. The problem is when it starts doing tha...
eb8d17c6f0e5a2fa072845616ecd41da55a8b781 authored over 5 years ago by JustAnotherArchivist <[email protected]>
2bf5b15fa8e049f2f45bd190f238e834a53f7bc6 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Add basic Instagram igset
70c4326e94b56f4d4e3cbf1fb1142f17155ee745 authored over 5 years ago by JustAnotherArchivist <[email protected]>
?hl= is the language switcher. Captions started appearing in URLs in early August; legitimate ca...
cb008bb8c07eefff0fbadbefef6e5434bc77a6bf authored over 5 years ago by JustAnotherArchivist <[email protected]>
Prevent pipeline spam on bot startup
9d2521e295a22a8fabffc2474f3b4c2de5626e70 authored over 5 years ago by JustAnotherArchivist <[email protected]>
5c67c78f65d7d37dffa6b4e01d23b0ad63a1473e authored over 5 years ago by JustAnotherArchivist <[email protected]>
Bump the default disk size limit from 500 MiB to 5 GiB
07afd368805bf053641bd8fd6cb597330fb93345 authored over 5 years ago by JustAnotherArchivist <[email protected]>
89d435f7f4b176243e505c7b584aa38c28f7ee17 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Handle log files in the uploader
88920f48f42f77e37f38fc5b7deae37b4b431854 authored over 5 years ago by JustAnotherArchivist <[email protected]>
c6bd5792af0223dc3368c392f90b4721fb2748d3 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Remove all uploads from within the pipeline
3716a58131f9980ca3070ecbf9124491eecc9383 authored over 5 years ago by JustAnotherArchivist <[email protected]>
e179725059e7752f6d97a5e72f4556256e78fa12 authored over 5 years ago by JustAnotherArchivist <[email protected]>
ee224a0ac29530c4cb205a6fae592a1c348265c9 authored over 5 years ago by JustAnotherArchivist <[email protected]>
3a6dc0cf57ed19250778f36ba452643d970cb5c6 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Fix connection to WebSocket when accessing the dashboard over a non-standard port
fe454dba78684c3e1076752de6b3c884d0b0ad8e authored over 5 years ago by JustAnotherArchivist <[email protected]>
Check that an absolute URL with a host was provided
bfee50517a30023d2dfb1f6ac23f57e87212d47b authored over 5 years ago by JustAnotherArchivist <[email protected]>
Fixes #394
32b3354f5c145246ae33674365a6caf3fdaccebf authored over 5 years ago by JustAnotherArchivist <[email protected]>
f87a6390d585ee5d7fe3271803b208d78c1b6318 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Verify that the version number hasn't changed since the pipeline was started
58b8b22007842baccb92c33917b912a8cb8c069f authored over 5 years ago by JustAnotherArchivist <[email protected]>
Since the pipeline might loop inside GetItemFromQueue for a while, this is more reliable. It che...
2be4a29c4f162b991664ad570f14d4a1646b188a authored over 5 years ago by JustAnotherArchivist <[email protected]>
9102db8a5e19efb95676e4b86c9bf38a1d2085f4 authored over 5 years ago by JustAnotherArchivist <[email protected]>
Get the pipeline version automatically from git
e68fed742101e0efbf44b918555f59d80e54fb61 authored over 5 years ago by JustAnotherArchivist <[email protected]>