Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/patch-grab

Grabbing patch.com sites
https://github.com/ArchiveTeam/patch-grab

Bump version.

ad1e4722031ad44c7b980ef2bc4eeaace8ae445c authored over 11 years ago by David Yip <[email protected]>
Send version when requesting an item.

b1aa3d88099d368f3e520b6ae6beeaad8f06488c authored over 11 years ago by David Yip <[email protected]>
Bump version.

79eea514d71fe0530ce3bc87835a73cd13974003 authored over 11 years ago by David Yip <[email protected]>
url is a table, not a string.

b947f46ff1ed422e4e9ceae82f6b0f59a8601734 authored over 11 years ago by David Yip <[email protected]>
On HTTP 420, wait one hour and retry the failed URL.

This should help reduce the number of open claims when we get 420'd.

d209654bb20afb98fbf5ac88fed94ba46177465d authored over 11 years ago by David Yip <[email protected]>
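
The retry behaviour described in the commit above can be sketched roughly as follows. This is an illustration only, not the project's actual code (which is not shown here): `fetch`, `sleep`, and `max_retries` are hypothetical names, and the retry cap is an assumption the commit message does not state.

```python
import time

RETRY_DELAY_SECONDS = 60 * 60  # one hour, as stated in the commit message


def fetch_with_420_retry(url, fetch, sleep=time.sleep, max_retries=3):
    """Call fetch(url) again after a delay while it keeps returning HTTP 420.

    fetch is a hypothetical callable that returns an integer status code.
    max_retries is an assumption; the commit does not mention a limit.
    """
    status = fetch(url)
    retries = 0
    while status == 420 and retries < max_retries:
        sleep(RETRY_DELAY_SECONDS)  # wait before retrying the same URL
        status = fetch(url)
        retries += 1
    return status
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting an hour.
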
Bump version.

85b4e0207d9129f3589567bf8978860972c6f20b authored over 11 years ago by David Yip <[email protected]>
Update project deadline.

I'm not sure when the deadline is, but dragging this out into October
seems excessive.

83015c72f18a8131d4db36b8a89390e591c99341 authored over 11 years ago by David Yip <[email protected]>
Depth-first search on a patch.com URL graph.

(Mostly to pick up stuff that the existing crawler isn't picking up.)

b9162d93d011a695fe2627210887d6c734718b94 authored over 11 years ago by David Yip <[email protected]>
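
A depth-first walk over a URL graph, as named in the commit above, might look like the sketch below. The helper `links_for` is hypothetical (it stands in for whatever fetches a page and returns its links); the repository's real traversal is not shown in this listing.

```python
def depth_first_urls(start, links_for):
    """Yield URLs reachable from start, in depth-first order.

    links_for is a hypothetical callable mapping a URL to the list of
    URLs that page links to.
    """
    seen = set()
    stack = [start]
    while stack:
        url = stack.pop()
        if url in seen:
            continue  # already visited through another path
        seen.add(url)
        yield url
        # Push links in reverse so the first link on the page is explored first.
        stack.extend(reversed(links_for(url)))
```

An explicit stack is used instead of recursion so that a deep link chain cannot hit the interpreter's recursion limit.
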
PhantomJS retrieval tools; some documentation. #5.

9ecc5abee15b28d83dfa11093b245e535131d266 authored over 11 years ago by David Yip <[email protected]>
A crazy hacky paginator. #5.

c73760a96485edb24c670fd29d12484f2d746431 authored over 11 years ago by David Yip <[email protected]>
Ignore PhantomJS downloads. #5.

b8bea5a468de5203223c548bc0bde2c2053637f1 authored over 11 years ago by David Yip <[email protected]>
Use a public repository URL for WarcProxy. #5.

3e92b88a5b59c5fca7ab67b20b04e7b2ff4a7ebb authored over 11 years ago by David Yip <[email protected]>
Submodule WarcProxy to deal with more annoying captures. #5.

d53d128952d6b4ca0bbaa858a666828e339fec95 authored over 11 years ago by David Yip <[email protected]>
Bump script version.

a7d533c8d3a63a900492a8f7701cda2efafa1db4 authored over 11 years ago by David Yip <[email protected]>
Add curl to installation instructions. #3.

(Boy, I sure hope it's called curl on all platforms.)

b4a5838426b22f6eb1fc96c8212e94faf4dc5bc9 authored over 11 years ago by David Yip <[email protected]>
Require wget-lua and curl. #3, #4.

027c77a63d179ab75c6e5e18b4338fcf857e828b authored over 11 years ago by David Yip <[email protected]>
Bump version.

80daedf8ebec371f2d731cb997466601bd6c698d authored over 11 years ago by David Yip <[email protected]>
Make filetype check case-insensitive.

Some file implementations (e.g. OpenIndiana's) return "html" in
lowercase.

c96b8c104d88ed729c520e424180638972c6617d authored over 11 years ago by David Yip <[email protected]>
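
A case-insensitive check of this kind can be as small as the sketch below. The function name and the exact substring test are assumptions for illustration; the commit only says that some `file` implementations print "html" in lowercase.

```python
def looks_like_html(file_output):
    """Return True if the text printed by file(1) mentions HTML in any letter case."""
    return "html" in file_output.lower()
```
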
Bump version.

e8e3e568ec1a6901150b105e54967ea37f9de310 authored over 11 years ago by David Yip <[email protected]>
Remove lies.

cbfc1c3580cd88a5b2dd5240c596bd03217b34aa authored over 11 years ago by David Yip <[email protected]>
We're good to go outside the AT Warrior.

e327e16df81be3797ff6f6704cd8e29115bc9661 authored over 11 years ago by David Yip <[email protected]>
Remove cookie jar.

d833b85a7d02a0251b00388e6ae782b1a601216c authored over 11 years ago by David Yip <[email protected]>
Hashing isn't necessary.

Work items are generally small enough that the extra string content
doesn't matter.

ccb4b627d5b5b71b15639c55b5c7bc778d699cda authored over 11 years ago by David Yip <[email protected]>
Old Python versions need an external HTML parser.

We choose lxml.

This might make things not work on the Warrior; going to find that one
out.

19edbf243351d290102ddebd38e7358e4704025d authored over 11 years ago by David Yip <[email protected]>
The Ruby-based spider isn't needed anymore.

That said, if you want it, take a look at the spider branch.

929cb3b18e166f41169d24760c160c767075587f authored over 11 years ago by David Yip <[email protected]>
Fix indentation.

97bd0f7c2af736825b0449219707098c26bbd4f4 authored over 11 years ago by David Yip <[email protected]>
Play with fire.

e41a35da146fc0d35bbda376557803d2be31a7d0 authored over 11 years ago by David Yip <[email protected]>
Upload and stuff.

7505d433e5a4a3e913403e70bc778f2c07af64bb authored over 11 years ago by David Yip <[email protected]>
Remove Requests.

See previous commit for justification.

1a6b9dc4b37245d9f70aea969c69a4adf49ae58c authored over 11 years ago by David Yip <[email protected]>
Switch to Tornado's synchronous HTTP client.

Nothing against Requests, but fewer dependencies is better.

67ff3720bfcc5cd32871aff5217282bf50f9f919 authored over 11 years ago by David Yip <[email protected]>
Be less verbose.

26ccc7be6047d6781949074acc7cff26165b3e83 authored over 11 years ago by David Yip <[email protected]>
Keep track of URLs we've already downloaded.

Yes, wget does this, but seems to only do so for each URL in a URL list.
Because our URLs are gu...

02196814bdd38a30cf56c0450d1aa02c226d835a authored over 11 years ago by David Yip <[email protected]>
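
One simple way to keep track of already-downloaded URLs is a set consulted before each fetch. The sketch below assumes exact string matching on the URL; the class and method names are hypothetical, and the truncated commit message above does not say how the project stores this state.

```python
class SeenUrls:
    """Remember which URLs have been handled so each is fetched only once."""

    def __init__(self):
        self._seen = set()

    def first_visit(self, url):
        """Return True the first time a URL is offered, False on repeats."""
        if url in self._seen:
            return False
        self._seen.add(url)
        return True
```
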
Make temporary variables local.

4d909a31cade05e493ac9c6bf84d82f367120cf3 authored over 11 years ago by David Yip <[email protected]>
Only wait after fetching *.patch.com URLs.

c8f9a4c4d159c3b4403ffedbfa89811b36e90beb authored over 11 years ago by David Yip <[email protected]>
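
The hostname test implied by the commit above could be sketched like this. Whether the bare `patch.com` host should also trigger a wait is an assumption here, and `should_wait` is a hypothetical name.

```python
from urllib.parse import urlsplit


def should_wait(url):
    """Return True only for URLs whose host is patch.com or a subdomain of it."""
    host = urlsplit(url).hostname or ""
    # Assumption: the bare domain is treated the same as *.patch.com.
    return host == "patch.com" or host.endswith(".patch.com")
```

Comparing against `".patch.com"` with the leading dot avoids matching unrelated hosts such as `notpatch.com`.
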
Fix typo.

93ea4c3e08880106eb399517e6c64b904165e8f6 authored over 11 years ago by David Yip <[email protected]>
Omit file(1) output.

It's not really necessary: the scraper log gives us all the information
we need.

e6e3a18e09922c74f390bc56ee29ccf39212e51a authored over 11 years ago by David Yip <[email protected]>
Walk Patch's asset servers, too.

a77b81bc2e6103438c313fdc1e79393d721ee1a8 authored over 11 years ago by David Yip <[email protected]>
Add Requests 1.2.3.

(The updated pipeline script needs this.)

d84299617c7149ac27a42e698592c1f9d69e6738 authored over 11 years ago by David Yip <[email protected]>
Bump version.

f3841c9bd41abeec0531284b2b0097f4f902bd6f authored over 11 years ago by David Yip <[email protected]>
Expand items into URLs.

f8398fe19fa591ef33332a542ae97b8f778fe33c authored over 11 years ago by David Yip <[email protected]>
Log in to the aggregator.

Yes, I'm well aware that the credentials are in cleartext in this repo.

A stupid attacker (or s...

c721c84ce7ac7d7ed68f258f357236820def34a3 authored over 11 years ago by David Yip <[email protected]>
Scrape and catalog additional links when downloading.

49d9a65bca0739243bc9183053c882880e807991 authored over 11 years ago by David Yip <[email protected]>
A script to extract patch.com URLs from wget-retrieved HTML.

984f8f4c6497b80322936bd3db636ccc842b9086 authored over 11 years ago by David Yip <[email protected]>
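
Extracting patch.com links from a saved HTML page can be sketched with the standard library alone. Note that this swaps in `html.parser` for illustration; neighbouring commits mention BeautifulSoup 4 and lxml, and the project's actual script is not shown here. `PatchLinkParser` and `extract_patch_urls` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class PatchLinkParser(HTMLParser):
    """Collect absolute patch.com URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page's own URL.
        absolute = urljoin(self.base_url, href)
        host = urlsplit(absolute).hostname or ""
        if host == "patch.com" or host.endswith(".patch.com"):
            self.urls.append(absolute)


def extract_patch_urls(html, base_url):
    """Return the patch.com URLs linked from the given HTML text, in page order."""
    parser = PatchLinkParser(base_url)
    parser.feed(html)
    return parser.urls
```
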
BeautifulSoup 4.

1ca7e7e5314d5a65ad085d3ead5612709c90d4c6 authored over 11 years ago by David Yip <[email protected]>
Per-hostname evaluation queues.

b3c3dd3372523e39cbc643b1854479e98d9d43e5 authored over 11 years ago by David Yip <[email protected]>
Handle all HTTP response codes.

fdde43f7f76bfc4d6e1fc30697bf572cead28cef authored over 11 years ago by David Yip <[email protected]>
Holy disorganized hacks, Batman.

2fe270072df2f94ecf5bbde8fdc54bb05a773450 authored over 11 years ago by David Yip <[email protected]>
A program to spider Patch.com sites for links.

Not perfect, but it's a way to start.

094074e9bffee6e2990474a563f0e850b772d0aa authored over 11 years ago by David Yip <[email protected]>
I didn't copy and paste this. No sir.

f011dbcb65de6061e2d9b3d0a21ac73c8506298b authored over 11 years ago by David Yip <[email protected]>
Fix tracker URL.

091bccedf07436b0de5201b086f9f862e97b8f12 authored over 11 years ago by David Yip <[email protected]>
First cut at a project intro.

e75fe3e8fba269ec32eb18ed5ba5bdd403e2d220 authored over 11 years ago by David Yip <[email protected]>
Ignore shutdown sentinel.

afa0814c97dde40ba7dc8de2c4a19ad4a6711be8 authored over 11 years ago by David Yip <[email protected]>
Ignore wget-lua build products.

916097effc6dc61fe0f4fc769402bca5f2d77015 authored over 11 years ago by David Yip <[email protected]>
Just in case you don't already have it: get-wget-lua.sh.

3dc218878375033894098c51fbcad51959d3863f authored over 11 years ago by David Yip <[email protected]>
Hack out an archiving pipeline. No upload yet.

f361cad8a6ddd60f95ec7e4541f3e174f484628c authored over 11 years ago by David Yip <[email protected]>
Ignore temporary data storage.

946558277223476b5b8235639aa63c41cce51786 authored over 11 years ago by David Yip <[email protected]>
Remove example code.

faec6125ac824c66b66946df67aea021fa21bcc9 authored over 11 years ago by David Yip <[email protected]>
Version 0.15.

72262c8bb7a0a19c9610b3085e4241ad891dcd79 authored over 11 years ago by Alard <[email protected]>
Fix broadcasting of empty messages.

7aeaffa2977e6e11393151a2af078301b9e06a20 authored over 11 years ago by Alard <[email protected]>
test_executable accepts lists of version numbers.

2110b643c7178ee8a3b55d88f2487188ed1e7a88 authored over 11 years ago by Alard <[email protected]>
Version 0.14.

4384099aed888b665fe92e03f06fd6ab43ecd4b9 authored almost 12 years ago by Alard <[email protected]>
Add favicon.

4e15d79d04852f234cedf2f9cc7ccd34075c223c authored almost 12 years ago by Alard <[email protected]>
Remember collapse status when showing new items.

b813b135f47ca5fca532bf30d6838a03d22fe510 authored almost 12 years ago by Alard <[email protected]>
Try to ignore duplicate connections.

On reconnect Socket.IO sometimes creates duplicate connections, which
leads to duplicate events....

ef88516ae4e467486a8f4c6215deb64036f96753 authored almost 12 years ago by Alard <[email protected]>
Reload web interface if warrior restarts.

You can't 'reconnect' to a new run-pipeline or run-warrior instance.

55ad6875a7d5b6c24bdfbeb0322f26489da10fb6 authored almost 12 years ago by Alard <[email protected]>
Add an explanation to the rate limiting message.

1a7b54e81ef15939d6a4d339b329a5fcb5a7bf3b authored almost 12 years ago by Alard <[email protected]>
Add an immediate shutdown button.

22e059c4ebff7e79a045e9ee93655c8b5545cedd authored almost 12 years ago by Alard <[email protected]>
Run each item on a clean stack.

04505b6414f9df361d8aecac5e5c5a423fa86fe0 authored almost 12 years ago by Alard <[email protected]>
run-warrior and run-pipeline print a message.

6248f2c2f17078919475f5f3f2a0a61481406f1e authored almost 12 years ago by Alard <[email protected]>
Fix previous commit.

0f4eb1244154fa203b0bc31d29bcf6c7b7c40d96 authored almost 12 years ago by Alard <[email protected]>
Improved task exception handling.

If there is an exception in one of the tasks for an item,
the system should now fail the item an...

9a33784ec3a04e5e01816ad97b4268dfacbbc058 authored almost 12 years ago by Alard <[email protected]>
Check --projects-dir and --data-dir.

6cbe6856becd2ab8a7ad55aea93b21ad01a7cb90 authored almost 12 years ago by Alard <[email protected]>
Add a stdin connection to Wget.

d539549e39e78c7fb9c5fba3a1eea403173ceaa3 authored almost 12 years ago by Alard <[email protected]>
Report warrior VM build to the tracker.

e5cf47910a9de34556eb22a89aa6756d0f09b307 authored almost 12 years ago by Alard <[email protected]>
Accept extra task attributes in GetItemFromTracker.

1617ec425df3ace2bb9d372e8272d382adab7163 authored almost 12 years ago by Alard <[email protected]>
Add HTTP Basic authentication.

6c1caebf8fd78ecae6824ad6604578be44190f40 authored almost 12 years ago by Alard <[email protected]>
Separate advanced and basic settings.

92aaf6f0792cca05e8be13c2bb2129d1a04112b5 authored almost 12 years ago by Alard <[email protected]>
Add --address option to run-warrior.

e23676eca23ee91fef658123214abdbc0d98cec5 authored almost 12 years ago by Alard <[email protected]>
Merge branch 'master' into development

Conflicts:
seesaw/public/script.js

acb9d54a69380be1d27ee21275f72727c7ed6433 authored almost 12 years ago by Alard <[email protected]>
Merge pull request #18 from db48x/minimizable

merged

bc733a1dbd38d7635fc2c0696c785beb66702925 authored almost 12 years ago by Daniel Brooks <[email protected]>
Merge branch 'master' into development

ba52f5c73cfa7ae1c21cb2c785d5b476bc270ffd authored almost 12 years ago by Alard <[email protected]>
Reboot the warrior after 7 days. (Version 0.0.13.)

d1f884c6c46eb401a81260d1be56a3e920e6847e authored almost 12 years ago by Alard <[email protected]>
show rsync progress in the main log as well as the brief log

rsync puts the \r at the end of every line, rather than the beginning,
so processCarriageReturns...

86c8040ed08c055631e1438313ab2a2a7d06400c authored almost 12 years ago by Daniel Brooks <[email protected]>
items default to uncollapsed, with a checkbox to collapse them all

The user can still collapse and uncollapse them individually, and the
state of the checkbox is r...

27bde945302adb36810d1c2fc5913f0e6536e77d authored almost 12 years ago by Daniel Brooks <[email protected]>
Use keep-data option in run-pipeline.

925b9da2efe06901decaa5862aa341fbf2e6d26c authored almost 12 years ago by Alard <[email protected]>
don't trim first word of the brief log message

Decided to use the trimLeft() method rather than a regex in order to avoid
this sort of mistake ...

cfe2ec29ad9dfc241679a737955bfeba3301043c authored almost 12 years ago by Daniel Brooks <[email protected]>
Add debugging option --keep-data. (Closes #4.)

73baf9bc5640d3fd01578f7c47942cef3cdf51b1 authored almost 12 years ago by Alard <[email protected]>
Remove capital from help message.

92fd03a30596b68c4b9d42a02ee6dfdc50fa7afe authored almost 12 years ago by Alard <[email protected]>
Fix bug in task count.

5521d4f3403e4abad78ccfd64a84af8d42c36206 authored almost 12 years ago by Alard <[email protected]>
correct number of tasks on update

3867a684cda2d424b97a944408a3fa850d51e89f authored almost 12 years ago by Daniel Brooks <[email protected]>
Merge remote-tracking branch 'db48x/minimizable' into minimizable

093c7b64e15e43ff9dab0e4023c370452015cc93 authored almost 12 years ago by Alard <[email protected]>
factor out the code for updating the brief status/log

Also fixes a minor bug that causes the brief status and log to be lost whenever
we receive the ite...

2255deae3095130eaf259587900881aaaee4a659 authored almost 12 years ago by Daniel Brooks <[email protected]>
use some unicode triangles for the twisties instead of letters, and tweak the spacing so that the text doesn't move when you click them

95d9ed3434f00df80bb1354e3172f0381627d9dc authored almost 12 years ago by Daniel Brooks <[email protected]>
items start out minimized, with abbreviated status information

Everything fits on one line this way. Includes a twisty so that individual
items can be opened o...

9044e8c78427156bc4a4eb8061cecd30169690a2 authored almost 12 years ago by Daniel Brooks <[email protected]>
Forgotten to print executable path.

a39f4d241f01f1297fb668eb9c687cbcc0ac1558 authored almost 12 years ago by Alard <[email protected]>
Python 2.6 does not support subprocess.check_output.

b248fd4e42042cd7ea5148dcc79b44bff03179ed authored almost 12 years ago by Alard <[email protected]>
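
`subprocess.check_output` was added in Python 2.7, so code that must run on 2.6 needs a substitute. A common workaround is sketched below using `Popen` and `communicate`; this is an illustration of the general technique, not necessarily what the commit above did, and `check_output_compat` is a hypothetical name.

```python
import subprocess
import sys


def check_output_compat(args):
    """Run a command and return its stdout, raising on a non-zero exit code."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    output, _ = proc.communicate()
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, args)
    return output


# Usage: run the current interpreter and capture what it prints.
result = check_output_compat([sys.executable, "-c", "print('ok')"])
```
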
Version 0.0.12.

a1aa527d51efa14b491767a6e58360eb7c08e9a9 authored almost 12 years ago by Alard <[email protected]>
Add find_executable utility function.

87dfc25cff51218271c284facb19c712a1112c03 authored almost 12 years ago by Alard <[email protected]>
Add UploadWithTracker.

7d5debd5bfeef1027aadba99c9b027d46244da51 authored almost 12 years ago by Alard <[email protected]>
Add CurlUpload.

7303f0bc3d3751524ab0ce15b27085b2518b26ac authored almost 12 years ago by Alard <[email protected]>