Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/blingee-grab

Saving all images and content from Blingee.
https://github.com/ArchiveTeam/blingee-grab

Ignore some new urls, bump version.

74d6f8a1db83336b2d2259b0612a1c1029ca3695 authored about 9 years ago by Gary Herreman <[email protected]>
Don't accept exit code 4; get canonical url from url shortener; bump version

Exit code 4 also means network(?) failure of some sort.

c4b750c09356dbef479e9f84d2fc3bb11b08aab8 authored about 9 years ago by Gary Herreman <[email protected]>
Clean up httploop_result and profile scraper; retry on connrefused; bump version

07ddfcf114e339b963ca999598bfda94f049f95e authored about 9 years ago by Gary Herreman <[email protected]>
Make sleep time on site errors longer; check that read was successful; bump version.

f8d42b359f98f8531533f19bb8691c9e9fe2122b authored about 9 years ago by Gary Herreman <[email protected]>
Bug fix: Make sure tries = 0 at start of scripts; bump version.

bdf04276f8addeaa0f520de400e1291b0779bc31 authored about 9 years ago by Gary Herreman <[email protected]>
Check error code in parse_html before continuing; bump version.

Lua 5.1 has no way of checking error codes with popen, so we have
to use echo $? and split the o...

bc6416f7bb74156a7ff9d2a31f650803cbb870fc authored about 9 years ago by Gary Herreman <[email protected]>
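
The workaround described in this commit message (appending `echo $?` to the command so the exit status arrives in the output stream) can be illustrated with a short sketch. This is not the repository's Lua code: it is a Python illustration of the same shell trick, and the helper name `run_with_status` is hypothetical. It assumes a POSIX shell and that the command's own output ends with a newline.

```python
import subprocess


def run_with_status(command):
    """Run a shell command and recover its exit status from stdout alone.

    Hypothetical helper: it mimics a popen-style API that only returns
    the output stream, so the status has to travel inside that stream.
    """
    # Append `echo $?` so the exit status is printed as the last line.
    wrapped = f"{command}; echo $?"
    raw = subprocess.run(
        wrapped, shell=True, capture_output=True, text=True
    ).stdout
    # Separate the trailing status line from the command's real output.
    body, _, status_line = raw.rstrip("\n").rpartition("\n")
    return body, int(status_line)


if __name__ == "__main__":
    output, status = run_with_status("echo hello; (exit 3)")
    print(repr(output), status)
```

A caller can then treat any non-zero status as a failure before trying to use the captured output.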
(Possible) Bug fix: Handle errors in parse_html; bump version number

It might be possible for someone to run lots of instances/threads,
and hit the file descriptor...

ac8a9af93fc1b9536a10a336f612290d619e2015 authored about 9 years ago by Gary Herreman <[email protected]>
Add 1blingee option; bump version number.

844c660fcc3df50dc0ed3005464d4421e852e4a7 authored over 9 years ago by Gary Herreman <[email protected]>
Bump version number.

a3b1338ba5fc0a3255fd52efe10782f2f38ef686 authored over 9 years ago by Gary Herreman <[email protected]>
Make parse_html function a bit more robust.

1096e3360fefe84df375f7e5ed3d3571192bfac3 authored over 9 years ago by Gary Herreman <[email protected]>
Match to end of string in match_url; bump version number.

aba43e8414a041b9834d3531d3132f953cb65b95 authored over 9 years ago by Gary Herreman <[email protected]>
Add some more ignore patterns; bump version number.

c5e2119ffd515fd5c1a6fd5c5001b564666e295a authored over 9 years ago by Gary Herreman <[email protected]>
Modify pipeline.py and bump version number.

- Use a random user-agent for downloading. Thanks, Obama.
- Use gzip for profile ID scraping.

0c638630452a34f714712ccbe01cf261e6b5784c authored over 9 years ago by Gary Herreman <[email protected]>
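
The two changes listed in this commit (a randomly chosen user-agent and gzip-compressed transfers) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the contents of pipeline.py: the `USER_AGENTS` list and the names `build_request` and `decode_body` are assumptions made for the example.

```python
import gzip
import random
import urllib.request

# Hypothetical pool of user-agent strings; the real list is not shown here.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) ExampleBrowser/2.0",
]


def build_request(url):
    """Build a request with a random user-agent that asks for gzip."""
    headers = {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Encoding": "gzip",
    }
    return urllib.request.Request(url, headers=headers)


def decode_body(raw, content_encoding):
    """Decompress the body only if the server actually sent gzip."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(raw)
    return raw
```

The request object would be passed to `urllib.request.urlopen`, and the response's `Content-Encoding` header handed to `decode_body` along with the raw bytes.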
Make sure to close after popen; bump version number.

5e0c640f3d17c8e69fe8aa7dff52ce9486f389b8 authored over 9 years ago by Gary Herreman <[email protected]>
Bump version number

30c85751df30f75f15be91b956ecba826a16dca0 authored over 9 years ago by Gary Herreman <[email protected]>
Groups: Fix variable and url check; add another url to ignorelist

42544f93424f6f7d64b21b3fa2ea5a89cd1b051e authored over 9 years ago by Gary Herreman <[email protected]>
Use correct day in version number.

67d942deb9266257c82503c57168d980e9f14ba9 authored over 9 years ago by Gary Herreman <[email protected]>
Bump version number.

6aeeb79479abf382b5936aea7b995c60bf0bd36f authored over 9 years ago by Gary Herreman <[email protected]>
Make sure that we don't look in images for links; update README.md

790d732db06abc167572858c5ee2786a2891a7f7 authored over 9 years ago by Gary Herreman <[email protected]>
Bump version number.

de3e257baf30328650c38d0e01eb8c60bd7acd8c authored over 9 years ago by Gary Herreman <[email protected]>
Use pip search instead of pip show.

Pip on the warrior doesn't have a show function.

07c29dc0d02b32fce4203f2d9cb21e17907f7ecf authored over 9 years ago by Gary Herreman <[email protected]>
Bump version number.

d71c01bf1ad610a0fed101f6c9b350e6bf8aac6a authored over 9 years ago by Gary Herreman <[email protected]>
parse_html: Check if html exists first.

17c24e9e518aa157d66746968831fa669c19b538 authored over 9 years ago by Gary Herreman <[email protected]>
Bump version number

2cb42c6204f3689aab2e0dc98e7a68a6986bf3e1 authored over 9 years ago by Gary Herreman <[email protected]>
Update README and python2 -> python

fedc7100c3316232018a17ee136b1d615b089415 authored over 9 years ago by Gary Herreman <[email protected]>
Remove old dependencies.

001009a0b8a96884fb0b99518266d1a3fbb384c8 authored over 9 years ago by Gary Herreman <[email protected]>
Add README, update tracker, and use pip for python-requests.

8f8dfab538322b542058796eb0dbdbc92a7a656a authored over 9 years ago by Gary Herreman <[email protected]>
Accept exit code 4 when finished.

31f894bd31feb7808908bf1d38519c920a5022ea authored over 9 years ago by Gary Herreman <[email protected]>
Get canonical url as well.

It's not big and is linked to in many places.

8267ee2a44f6335b98de8d9a0cf7fab85bb76683 authored over 9 years ago by Gary Herreman <[email protected]>
Fix html parsing, get url shortener.

- htmlparser doesn't work with lua5.1 and setting up lua5.2 with
wget-lua is far too painful. ...

7f391f5077a19536faad743e2832a27ed72864b4 authored over 9 years ago by Gary Herreman <[email protected]>
Make sure it's luarocks-5.2

2af09267aa7e3674b8130be9e3daace2a82123c1 authored over 9 years ago by Gary Herreman <[email protected]>
Modify warrior-install.sh and get-wget-lua.sh for Lua 5.2

59e1c4dd99465e6e1317411436a3000fdd7460f0 authored over 9 years ago by Gary Herreman <[email protected]>
Set executable bit on warrior-install.sh

e3feb14f352c26d2d44a1fd974e695d9d938d40b authored over 9 years ago by Gary Herreman <[email protected]>
blingee.lua: ABORT after tries. check302.py

92dab53d46e26fc902e8be427145f79c41f78ebd authored over 9 years ago by Arkiver2 <[email protected]>
check302.py

dc6da70f4db9f5ec51749ccfca927f81942e34bc authored over 9 years ago by Arkiver2 <[email protected]>
Fix get-wget-lua.sh

d2d8f7dad879c979af25b210c8cc39e7ec2cdbb9 authored over 9 years ago by Gary Herreman <[email protected]>
Merge pull request #1 from garyrh/profiles

Profiles!

3f5e76d75154a8f222a56261621cc03c65575260 authored over 9 years ago by Gary Herreman <[email protected]>
Update warrior-install.sh

3662943239c3931a3f384a9a45c11f270b405367 authored over 9 years ago by Gary Herreman <[email protected]>
Grab profiles in batches.

46219fd602197a48ee3bd33d4a00b486946c939b authored over 9 years ago by Gary Herreman <[email protected]>
Grab profiles by ID instead of username.

See the ArchiveTeam wiki entry for Blingee for info.

844df0d63fbc16c6b6e9857db0494e885d4d458a authored over 9 years ago by Gary Herreman <[email protected]>
Add support for grabbing profiles; more ignores.

9603491a26fcad18d23ef0def7f56cd3031f3a91 authored over 9 years ago by Gary Herreman <[email protected]>
Oops.

07a6385e0ead0367c73e2da04b75a6209d9ed9ea authored over 9 years ago by Gary Herreman <[email protected]>
Make downloading/matching more liberal; organize ignore lists

The script is more careful now so that any new resources will be grabbed.

029a888df86e254b5dd0d8da057bd2cc15022099 authored over 9 years ago by Gary Herreman <[email protected]>
Check for status_code 0 and only ignore static urls w/o a timestamp.

5075c4bcf194dc70f3edb9b33416a7db44fd3b8b authored over 9 years ago by Gary Herreman <[email protected]>
Grab blingees in batches; fix page number matching bug.

30f972a2a476a5983754e1c4cf5ded5c6f2a0b60 authored over 9 years ago by Gary Herreman <[email protected]>
Add support for badges and fix some ignore patterns.

- Ignore add_post urls.

- Only ignore page=1 on group pages.

- Ignore group pages that don't c...

0ad40fcaf03d778bb8b622e228e28e716d91be28 authored over 9 years ago by Gary Herreman <[email protected]>
Add support for grabbing challenges.

16da28a93a05127591c24935bb43e50b5b09f2f3 authored over 9 years ago by Gary Herreman <[email protected]>
Add support for grabbing competitions.

5dc78aeb97b454a97f707e853fdab220417f255b authored over 9 years ago by Gary Herreman <[email protected]>
Grab the extra group frontpage link.

It's tiny and seems to be linked to a lot on the website.

41b2ac438405e4674ddc18326d98b6215ff52779 authored over 9 years ago by Gary Herreman <[email protected]>
Add support for grabbing groups; bug fixes and cleanup.

ae755b0e630c84851413d33b6caec31bf2bff5c7 authored over 9 years ago by Gary Herreman <[email protected]>
Add support for grabbing stamps and ignore another static url.

9c3f707a44aa08a473f1f39bfe6a0edd8c2ce3b0 authored over 9 years ago by Gary Herreman <[email protected]>
Add support for grabbing blingee comments.

7ae3455f9d7bd176ef416cdb0d4bbf19c3386106 authored over 9 years ago by Gary Herreman <[email protected]>
Remove non-working deleted/private page detection; cleanup

The majority of blingees seem to be non-deleted/private anyway.

8a460a56c2a5318c14701dcb8e62024778041e53 authored over 9 years ago by Gary Herreman <[email protected]>
Initial commit

0bb44196ae38e8f7aee806401165f8d48193143e authored over 9 years ago by Gary Herreman <[email protected]>