Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/ArchiveBot

ArchiveBot, an IRC bot for archiving websites
https://github.com/ArchiveTeam/ArchiveBot

Ignore another share link

6ea4324d73689ccc335214bcd09242b4ab5f54cc authored about 10 years ago by Ivan Kozik <[email protected]>
Work around https://github.com/ArchiveTeam/ArchiveBot/issues/138#issuecomment-68352100

c01ece45d000f87cd239523ce207889e6b89bd5d authored about 10 years ago by Ivan Kozik <[email protected]>
Work around https://github.com/ArchiveTeam/ArchiveBot/issues/138#issuecomment-68352100

14246c5b1cedd611901cf9df8bc802877d7691ba authored about 10 years ago by Ivan Kozik <[email protected]>
Work around https://github.com/ArchiveTeam/ArchiveBot/issues/138#issuecomment-68352100

d53e708d1d62722a75b4bf16a564b64c2eaba7cd authored about 10 years ago by Ivan Kozik <[email protected]>
Merge branch 'master' into reddit-over18-4

4b35a23f35eb74f5e47bdd99a1a529387f488961 authored about 10 years ago by Ivan Kozik <[email protected]>
Use wpull hax-24

71eccceb2a2d42b9ee5a178a77a667eaa62ecf1f authored about 10 years ago by Ivan Kozik <[email protected]>
Merge branch 'master' into reddit-over18-4

c559358b7d29ab0375f54e98f3bf5ecbaa4df10f authored about 10 years ago by Ivan Kozik <[email protected]>
Update user agents

8304eae328a1b840882d4e56a83baf3ea1772f49 authored about 10 years ago by Ivan Kozik <[email protected]>
bot: Don't recognize non-integral delays.

It is fun to tell a job to delay for 42*pi seconds or whatever, but the
pipeline's settings list...

f5a9cca16d9694b65fa84b79da665a654b06aae5 authored about 10 years ago by David Yip <[email protected]>
plumbing: Introduce status-report command.

This generates a list of objects that directly correlate job ident with
pipeline information. T...

e2df4d14e9090e456e55a82d34b27c6790f6ffab authored about 10 years ago by David Yip <[email protected]>
db: Use correct delimiter for {primary_netloc} in singletumblr. #104.

0cbad2ded1282171b1327d6fc82be3ff00f8c4cd authored about 10 years ago by David Yip <[email protected]>
doc: Actually preformat example URL list in ignore docs.

[ci skip]

7e54545469e990e2bf5222c8091321b5a811c220 authored about 10 years ago by David Yip <[email protected]>
doc: More consistent network location for ignore example.

[ci skip]

da47638de835112a29f4258adccf050ebad22d5a authored about 10 years ago by David Yip <[email protected]>
doc: Present example file list as preformatted text.

[ci skip]

44c22140f6558e5f2adf22b3f8a80add770b1915 authored about 10 years ago by David Yip <[email protected]>
Bump pipeline version. Closes #104.

1abc26a5db9ed323d0f81fe11135627dfab1d22e authored about 10 years ago by David Yip <[email protected]>
pipeline: Top URL and level is in record_info, not url_info. #104.

f2356becbfc77c23f1d949944c09407b3a30c141 authored about 10 years ago by David Yip <[email protected]>
Document {primary_url} and {primary_netloc}. #104.

a10bcf6a367b7d6181b7b9d18e74f0e185cdb5d0 authored about 10 years ago by David Yip <[email protected]>
Switch back to {primary_netloc} and {primary_url}. #104.

6744c21079c13d2f1615d77df91a79620d072bde authored about 10 years ago by David Yip <[email protected]>
db: Remove trailing space in singletumblr ignore set. #104.

75a0bad39ea2419aab4aea649b5a3ce1b3e45a3b authored about 10 years ago by David Yip <[email protected]>
db: Add an ignore set to restrict !a *.tumblr.com to the target. #104.

(This is the sort of thing that #104 is useful for.)

9d86e8b696838f44e4ba0b4e257534394aa4e862 authored about 10 years ago by David Yip <[email protected]>
pipeline: Use Ignoracle in wpull hooks. #104.

b8bf55ac19c2c70a8cf41402e3e7e8d1efea299e authored about 10 years ago by David Yip <[email protected]>
pipeline: Regex-escape URL and network location. #104.

9c325acd752ddc0544ff595f4df46c7ca757ce89 authored about 10 years ago by David Yip <[email protected]>
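
As an illustration of why a URL and network location need regex-escaping before they are spliced into an ignore pattern, here is a minimal Python sketch. The `expand_placeholders` helper and the example pattern are hypothetical and are not taken from the ArchiveBot source; only the `{primary_url}` and `{primary_netloc}` placeholder names come from the commit messages in this list.

```python
import re


def expand_placeholders(pattern: str, primary_url: str, primary_netloc: str) -> str:
    """Hypothetical helper: substitute the two placeholders with escaped values.

    Only the two known tokens are replaced, so regex syntax that also uses
    braces (for example the quantifier in \\d{2}) is left untouched.
    """
    return (
        pattern
        .replace("{primary_url}", re.escape(primary_url))
        .replace("{primary_netloc}", re.escape(primary_netloc))
    )


# Assumed example values, chosen only for illustration.
expanded = expand_placeholders(
    r"^https?://{primary_netloc}/page/\d{2}$",
    primary_url="http://example.tumblr.com/",
    primary_netloc="example.tumblr.com",
)
regex = re.compile(expanded)

# The dots in the network location are literal after escaping,
# so a look-alike host no longer matches.
print(bool(regex.search("http://example.tumblr.com/page/12")))   # True
print(bool(regex.search("http://exampleXtumblr.com/page/12")))   # False
```

Without `re.escape`, each `.` in the host would match any character, and the second URL would match as well.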
pipeline: Introduce Ignoracle, a home for ignore logic. #104.

Ignoracle implements the placeholders discussed in #104 and gives us a
way to add correctness an...

417fe45c0bfac008d648c3f212b07a59eda5afc3 authored about 10 years ago by David Yip <[email protected]>
pipeline: Switch to templates for placeholders. #104.

string.format() substitutes all occurrences of {token} with a token in
the formatting map. Unfo...

eb87647f13c7594039637af0e42c549dde0ff97b authored about 10 years ago by David Yip <[email protected]>
pipeline: primary_host -> primary_netloc. #104.

Python's urllib.parse.urlparse returns the network location (e.g. auth,
host, and port), not jus...

5cbc8bd6dd049bcaed8a66c03c7f3408a8da2fe3 authored about 10 years ago by David Yip <[email protected]>
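
The difference between a network location and a bare host can be seen with the standard library alone. This is a small sketch; the URL below is made up for illustration and does not come from the repository.

```python
from urllib.parse import urlparse

# Assumed example URL with credentials and an explicit port.
parts = urlparse("http://user:secret@example.com:8080/path?q=1")

print(parts.netloc)    # user:secret@example.com:8080
print(parts.hostname)  # example.com
print(parts.port)      # 8080
```

`netloc` carries the auth and port components along with the host, which is why a name such as `primary_netloc` describes the value more accurately than `primary_host`.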
pipeline: Add a helper to feed url_info to Ignoracles. #104.

b0e48e497c10334a975254e1e199846a1cb84bb1 authored about 10 years ago by David Yip <[email protected]>
viewer: Handle case where spaces are in the filename

02c3d78e0a2e84c7c7f50efb15ee4c1b4c3fc0d5 authored about 10 years ago by Christopher Foo <[email protected]>
viewer: Handle case where JSON is empty/corrupt file.

4b78d2a0b7cc48b0ae0020d478e765215c2b7c76 authored about 10 years ago by Christopher Foo <[email protected]>
viewer: Prevent populate() running multiple times.

c937144fc0051ebc28a08f9140f3cf72b93af338 authored about 10 years ago by Christopher Foo <[email protected]>
viewer: Fetch the JSON files.

Closes ArchiveTeam/ArchiveBot#128

3b29b755026d5a31eb3d0e1513af1fdb07bb0091 authored about 10 years ago by Christopher Foo <[email protected]>
viewer: Add audit for absolutely no WARC files.

Re: ArchiveTeam/ArchiveBot#129

e062fb987efbd8e9009f18ae1d669aaa40e1361a authored about 10 years ago by Christopher Foo <[email protected]>
viewer: Switch over to SQLAlchemy

33e5d2e91c01aa72d43d75f688a4f8427f551bb3 authored about 10 years ago by Christopher Foo <[email protected]>
plumbing: Remove unused ia-list utility.

The viewer code does this job much better.

e936db0f7efdf13a88a3d29f333f4679560a2982 authored about 10 years ago by David Yip <[email protected]>
viewer: Add basic CSS.

713304488bd266521aee5f5ff5a6693e313dff1a authored about 10 years ago by Christopher Foo <[email protected]>
viewer: Fix no search results message.

8fddbe972c766004c6c198c994c3e37710dd7e8c authored about 10 years ago by Christopher Foo <[email protected]>
Revert "bot: Make !a operate in no-offsite-links by default."

This reverts commit be4450870472dde7ed966a279d18067f83b7b299.

Disabling offsite links by defaul...

3503c21df9ff4cf6bc7ddccc2d0c1f0bc98db403 authored about 10 years ago by David Yip <[email protected]>
bot: Make !a operate in no-offsite-links by default.

A major job bloating factor is offsite links and their associated page
requisites. This extra f...

be4450870472dde7ed966a279d18067f83b7b299 authored about 10 years ago by David Yip <[email protected]>
Fix typo in /js/chartbeat.js

1e1566842c175505726b448af6a9b6a72d9bdff3 authored about 10 years ago by Ivan Kozik <[email protected]>
test: Add firehose to test harness; update dashboard runner.

c01237c306ae5ea7793d79b6e70bdb9ebeea68e8 authored about 10 years ago by David Yip <[email protected]>
doc: Add ZeroMQ 4.0.5 to backend requirements.

I'm not actually sure of the best way to get ZeroMQ, so I've left it
out. Ubuntu Utopic seems t...

3de55e2f15fd68fe130236b49686dbe3e43e5267 authored about 10 years ago by David Yip <[email protected]>
plumbing: Turn log-firehose into a pubsub server.

There are at least two sinks for log messages:

1. The dashboard.
2. Longer-term storage of Arch...

808d9830dac4bd291edd32fb68d0f9e18ab4bfda authored about 10 years ago by David Yip <[email protected]>
Merge branch 'master' into reddit-over18-4

ea0b73d52a8d22d4ed5c40143bc082ef16e567ef authored about 10 years ago by Ivan Kozik <[email protected]>
Fix UnboundLocalError when no http_info

099041b3dca7cbf9b88c15f5931acb6639cad480 authored about 10 years ago by Ivan Kozik <[email protected]>
Merge branch 'master' into reddit-over18-4

e2dcb47e41e182c50883bef026027bea68192f9d authored about 10 years ago by Ivan Kozik <[email protected]>
Use wpull hax-22 and bump pipeline version

28555b69a4a49698f2a422950b3b0a77116efde5 authored about 10 years ago by Ivan Kozik <[email protected]>
Get response code for FTP instead of crashing on KeyError

59ace00a6df2b64ddade64a5ceb880fdbb55d7fb authored about 10 years ago by Ivan Kozik <[email protected]>
Merge branch 'master' into reddit-over18-4

6eea04eb051e58c7f4bd1382ceea4f76016c4445 authored about 10 years ago by Ivan Kozik <[email protected]>
Fix Icecast hook for ftp:// sites

[...]
File "/home/archivebot/.local/lib/python3.4/site-packages/wpull/hook.py", line 433, in _...

1e82223829e2f908ec6f24d660e82c35fe4f215b authored about 10 years ago by Ivan Kozik <[email protected]>
Ignore Special:Log/

4b45cda1b9d781ce9d03e8be62d1e03980bc2ab6 authored about 10 years ago by Ivan Kozik <[email protected]>
Add missing test/run_cogs.sh

474ec26c990ffb3dc33516b1d5c687b6c8eb8f05 authored about 10 years ago by Christopher Foo <[email protected]>
integration_runner: Run cogs too. Check for warc and json files.

c067b4667846cb7c6884d63105ed136b90631f09 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Enable public(ish) shaming.

26a3ddf641a2ac8d0ae9220a1ec70afcb4600c5d authored about 10 years ago by David Yip <[email protected]>
Remove pointless features.

cd7818f3ad91eee26674864ff9b4b7d34aab02c5 authored about 10 years ago by David Yip <[email protected]>
.travis.yml: Enable rsync

b4b8030c9061e43186ed2c443359007c4e27dbc5 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Strip _rev from design docs

5ec99036ec24b49e7517a5482bd95c56f8e06d2a authored about 10 years ago by Christopher Foo <[email protected]>
test/integration_runner: Check web proc return code too

63d9e66542d56c96c861105b6e53df23ca7640c6 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Add couchdb curls

182a84e6df2ace9eb4b55953e7c0a6af48391b93 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml,test: Add in rsync and basic IRC client

0a0e146bc8a0e235434004c8ac7c8a9fb3274871 authored about 10 years ago by Christopher Foo <[email protected]>
test/integration_runner: Check if processes are alive

76a7ba9b87574b0b3066ee904162fb35dc0b1c5a authored about 10 years ago by Christopher Foo <[email protected]>
test/run_pipeline: Override python3 to python3.4

2db5c6fae3cb49fae7094d73dbefff766cffca1c authored about 10 years ago by Christopher Foo <[email protected]>
integration_runner: setpgrp for each subprocess

755590d5372beab60843e328d39c4f625d3b7a13 authored about 10 years ago by Christopher Foo <[email protected]>
Add WIP integration test scripts.

c7b28fa92a2a4dfeccf201c5ac19c3d06eb42d31 authored about 10 years ago by Christopher Foo <[email protected]>
pipeline wpullargs_test: Add missing finished_warcs_dir arg

c2b1bdd7505f095db2d83dbfc085740d71d813d5 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Add back in overridden ruby commands

b581db74647b26d96e4fe984f2bbe161aa367543 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Install python3.4-dev

0bdf746ad02f0ffaf800f4d47e0ee5b1fc96d47b authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Use --yes to add-apt-repository

a59212e43ce925dfb203c7bb873c343d4b0ed043 authored about 10 years ago by Christopher Foo <[email protected]>
.travis.yml: Add commands for Python test

f344afb107c17b215d7eac638180ae1f22cf1a73 authored about 10 years ago by Christopher Foo <[email protected]>
Ignore another streaming site

3322b0bf479094587e00e0b23a298cf2ee339adf authored about 10 years ago by Ivan Kozik <[email protected]>
Ignore more of streamtheworld.com

Sample URL:
http://7579.live.streamtheworld.com/977_90?type=.flv

d477ec54ed41fc4b078c8ff3ea13af8ea16c20ae authored about 10 years ago by Ivan Kozik <[email protected]>
Send over18 cookie for reddit

15ae3ca6a6831f2b1ae366a58d5620474f5b3d2c authored about 10 years ago by Ivan Kozik <[email protected]>
Fix KeyError: 'suppress_ignore_reports'

[...]
File "/home/archivebot/.local/lib/python3.4/site-packages/wpull/hook.py", line 433, in _...

162e5cbb256913f4f8ad9c61174b2fa288741418 authored about 10 years ago by Ivan Kozik <[email protected]>
pipeline: Bump version 20141213.02

19ae74b8a9fc120c2233f8dcd1b906eb05b20a78 authored about 10 years ago by Christopher Foo <[email protected]>
Merge branch 'next'

8bfb29d42b1f49777c77f493aa0f1b4ccdc6f03a authored about 10 years ago by Christopher Foo <[email protected]>
pipeline/wpull_hooks: Use logger & print. Add maybe_log_ignore()

fc6ae36fe8c9832dfa66a7ac79db27978f18071b authored about 10 years ago by Christopher Foo <[email protected]>
pipeline/wpull_hooks: Log ignores on icy stuff

ca67d7ff48c20d13a05c0e250659c4154e949bcc authored about 10 years ago by Christopher Foo <[email protected]>
Use wpull hax-21

fa99d4e77a7ec2ea8c3f4e1b4fbf76b992334301 authored about 10 years ago by Ivan Kozik <[email protected]>
Merge branch '12As-master' into next

8ccc883899677201238be90847bc7477a171539d authored about 10 years ago by Christopher Foo <[email protected]>
Merge branch 'master' of https://github.com/12As/ArchiveBot into 12As-master

bb98b9018903cd9305aaba0d1ae949c3ccdd9d2b authored about 10 years ago by Christopher Foo <[email protected]>
pipeline extensions: Accept bytes to tee_to_control

Closes ArchiveTeam/ArchiveBot#134

46831e50c7b0d2c485c65de0b2de4ecd4f581fee authored about 10 years ago by Christopher Foo <[email protected]>
pipeline/wpull: Remove references to Python 2

830be869884ae9b4fe67c9ff572c086f1fbac3cc authored about 10 years ago by Christopher Foo <[email protected]>
pipeline: Temporarily lock to seesaw 0.7.

When we run under seesaw 0.8.2, jobs quickly fail like this:

Starting GetItemFromQueue for It...

429a0699612c0da51230f3eb7671a5d3f5a40546 authored about 10 years ago by David Yip <[email protected]>
Ignore imageshack.com/lost

fe386a733cece507d69c00d2d06e2bd006e5d52e authored about 10 years ago by Ivan Kozik <[email protected]>
plumbing: Also set timestamp of last update in job.

This should simplify job reports.

c804936edca2869c8373aec198dca9c833bdbe62 authored about 10 years ago by David Yip <[email protected]>
Add additional block, remove case sensitivity

I added an additional icecast block and removed case sensitivity from header fields (but not val...

f1be926993d4d6c56edc78105ba776bfe76ca2c9 authored about 10 years ago by 12As <[email protected]>
Revert "dashboard: Apply dark style to archivebot.com."

This reverts commit a65415a11b0292c64030a228f4a40a5299c698b5.

Stylish doesn't seem to like the ...

84975805c6991e934ffe40dace9a20ca7e13558c authored about 10 years ago by David Yip <[email protected]>
dashboard: Try different monospace fonts for log window.

I like the way Essential Pragmata Pro and Inconsolata look on my screen.
Pragmata Pro would prob...

aff3c9ff8d60c4a7942e1356936226f133b72a33 authored about 10 years ago by David Yip <[email protected]>
dashboard: Ensure dark style applies to headers.

cea412b418d5dc5a93e7fc01dcfe1b02eb2d9bb0 authored about 10 years ago by David Yip <[email protected]>
dashboard: Apply dark style to archivebot.com.

dashboard.at.ninjawedding.org is an internal name.

a65415a11b0292c64030a228f4a40a5299c698b5 authored about 10 years ago by David Yip <[email protected]>
dashboard: Add a dark bluish theme.

For use with Stylish and similar extensions.

cc438273f60eadb91f9e25c808d4be175bb37da9 authored about 10 years ago by David Yip <[email protected]>
Change HTTP version to string & style changes

bf2a11c9abaa4aa5b13f1ae0a4c14e9d1dc84c1d authored about 10 years ago by 12As <[email protected]>
Don't attempt to download from Icecast streams

Update wpull_hooks.py to add an ignore for icecast servers using handle_pre_response hook

9762369408891c9db502a20865448b9fe96390c3 authored about 10 years ago by 12As <[email protected]>
bot: Adapt some specs for changes in RSpec 3.

83c823e4069b08106857c31c4412b1263393e51d authored about 10 years ago by David Yip <[email protected]>
bot: Strip excess spaces on !a, !ao, !a <, !ao <, !ig, !unig.

16245f85dfcbee2e102c8d9a9a28b7752a30258e authored about 10 years ago by David Yip <[email protected]>
Dashboard: also supervise stdin -> broadcast process.

Previously, we didn't run the code that fed stdin to the broadcaster
with supervision. It was t...

d8d379c000e7dec7d13546844e78b1a9abad3abc authored about 10 years ago by David Yip <[email protected]>
plumbing: Use job key to distinguish trimmed log messages.

Previously, we used the log key, which isn't quite as useful for sending
logs to meaningful repo...

995d6c9fb44eff865aa669f5874db76d5ca10d72 authored about 10 years ago by David Yip <[email protected]>
Update to Celluloid ~> 0.16.0.

This commit also uses a prerelease build of analysand, because
analysand's current release is lo...

ed062cc5da3a6a03ed13b0ad05548db1ee6315f6 authored about 10 years ago by David Yip <[email protected]>
Ignore sets: fix JSON errors.

724518feaafbb909a6b66f52667f97d3a77c181e authored about 10 years ago by David Yip <[email protected]>
bot: Reintroduce Brain#op?, which is used by !a < FILE.

b15c444f6ee95afb81eb3a0c017490e427d3a829 authored about 10 years ago by David Yip <[email protected]>
Revert "bot: Experiment with finer !con permissions."

This reverts commit 8d4630541e9246594f0e0e874cd653096b97fdfb.

Not necessary after all.

15ab7b39ad98ddf0168839f39c9018dd54a2ef72 authored about 10 years ago by David Yip <[email protected]>
User agents: fix JSON errors.

ff9bd4fb38f1d6cdae7882fe6ed19836844fa5a7 authored about 10 years ago by David Yip <[email protected]>