Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/NewsGrabber

Grabbing all news.
https://github.com/ArchiveTeam/NewsGrabber

Added missing services from the Alexa Top 40 News Sites

(http://web.archive.org/web/20160403002351/http://www.alexa.com/topsites/category;0/Top/News and...

33fa363241eceaddc51d962911d722b4c4a68bd1 authored over 8 years ago by BnA-Robin <[email protected]>
Merge pull request #70 from BnA-Robin/master

Add vice network & national geographic services

f4cab0384e1e848f1bef71a4f1a28812f671762a authored over 8 years ago by Arkiver2 <[email protected]>
Increase refresh for i-d.vice.com service

259fe8b32746cbf726799b6064cac71f3b43f24e authored over 8 years ago by BnA-Robin <[email protected]>
Added national geographic services

89de53c6ee4ba8d4a9cafdfd51adbfd6adfaf1fb authored over 8 years ago by BnA-Robin <[email protected]>
Added all vice services

224cbbba868ec9d8192c39c035208c61f1b1b287 authored over 8 years ago by BnA-Robin <[email protected]>
Fix bloomberg.com URL order.

3901aa22512cca0cf6874aaeb008728ed47dc0e1 authored over 8 years ago by Arkiver2 <[email protected]>
9 Andorra newssites and YouTube channels.

18b4d49b49bb4a9f466e0b61a4b3a1f69e73b3da authored over 8 years ago by Arkiver2 <[email protected]>
Create web__lanacion_com_ar.py

42530d8685b047e070006b8fc67a9348481f9380 authored over 8 years ago by HarryC145 <[email protected]>
Add sustg.com.

8cea13d71f414916ae48f725913584771249e401 authored over 8 years ago by Arkiver2 <[email protected]>
Update web__bloomberg_co_jp.py

ea16196a65d4a6b90b7038d83a2e843154e34e4a authored over 8 years ago by HarryC145 <[email protected]>
Create web__bloomberg_co_jp.py

7c215dbebc4b9f2546768061c101190196cdf14c authored over 8 years ago by HarryC145 <[email protected]>
Create web__bloomberg_com.py

a7f345a47e0374ad60562b59c89e03f351fa199b authored over 8 years ago by HarryC145 <[email protected]>
Fix name, take volksblatt.li out.

011d36b4d61488d7f5fbac9effeee03424bb8b80 authored over 8 years ago by Arkiver2 <[email protected]>
Merge pull request #69 from BnA-Robin/ThinkTanks

Think tanks: United Nations and countries starting with A

09eed5c7e74db63ec826d6153115740efd44e2df authored over 8 years ago by HarryC145 <[email protected]>
Update web__percapita_org_au.py

Typo

ed09aeb61ee766e1e2692e9d0c7aae4e96af89b6 authored over 8 years ago by BnA-Robin <[email protected]>
Think Tanks services: Azerbaijan services

dd820e0c4a9fb97c487d43f65f274c112c0b9c51 authored over 8 years ago by BnA-Robin <[email protected]>
Think Tanks services: Australia services

858d333a7e71b828ad09fd41c7a49df99aef5b0c authored over 8 years ago by BnA-Robin <[email protected]>
Think Tanks services: Argentina services

030a2524ef833ea78ea12b30a602428883abb76b authored over 8 years ago by BnA-Robin <[email protected]>
Think Tanks services: Albania services

469211dd711c5dd647b45acc6e840a2ed7aae8d8 authored over 8 years ago by BnA-Robin <[email protected]>
Think Tanks services: United Nations services

c47247dd89d88c8dc056d24152af3e29744d1465 authored over 8 years ago by BnA-Robin <[email protected]>
Merge pull request #67 from BnA-Robin/master

Fix for services: missing http:// prefix in web__video_bbc_com service

2e27e0921d7aef82b05b60ce7c475b72cfb589be authored over 8 years ago by HarryC145 <[email protected]>
Fix for services: missing http:// prefix in web__video_bbc_com service

d1aac197c65efe3bdee3eeb1ae7e7e9a4c9ced0a authored over 8 years ago by BnA-Robin <[email protected]>
main.py: Sleep some time after check_refresh.

Make sure standard regexes are not added twice to the regexes lists.
Fix HTML service file is wr...

f6db45fd7dcd81feba27ecdbe7dfe2c8f8537b88 authored over 8 years ago by Arkiver2 <[email protected]>
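
The "not added twice" fix in the commit above can be pictured with a minimal sketch. The function name `add_standard_regexes`, the constant `STANDARD_REGEXES`, and the patterns themselves are hypothetical; they are not taken from main.py.

```python
# Hypothetical standard patterns; the real ones live in the NewsGrabber source.
STANDARD_REGEXES = [r"/news/", r"/article/"]


def add_standard_regexes(regexes):
    """Append each standard regex only if the list does not already hold it."""
    for pattern in STANDARD_REGEXES:
        if pattern not in regexes:
            regexes.append(pattern)
    return regexes


service_regexes = [r"/world/"]
add_standard_regexes(service_regexes)
add_standard_regexes(service_regexes)  # a second refresh must not duplicate entries
print(service_regexes)
```

Calling the helper on every refresh is then safe, because a repeated call leaves the list unchanged.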
Merge pull request #65 from BnA-Robin/RSS_Refactor

Refactored services

08a591567ecc6025594b4edb6c17247f4a011a5b authored over 8 years ago by Arkiver2 <[email protected]>
Refactored services: move RSS to the back of the list, add user overviews for services that only have RSS feeds listed and fixed formatting of the services files

85f93934590c4d996200d72ba6795dc695cf4411 authored over 8 years ago by BnA-Robin <[email protected]>
worker_script.py: disable external links grabbing.

6657bc5d8a5bf69f2a15ff25226a51983034f815 authored over 8 years ago by Arkiver2 <[email protected]>
worker_script.py: fix in grabbing external links. Use useragent "ArchiveTeam; Googlebot/2.1" to prevent some cookie messages from appearing.

f244e1f350f61ee6958c6a5871b342143ccee705 authored over 8 years ago by Arkiver2 <[email protected]>
worker_script.py: grab external links from articles if the article is not a video article.

b3d47fc54bb67617a073d5480bc6a5ae8acc3890 authored over 8 years ago by Arkiver2 <[email protected]>
worker.py: refresh script every 300 seconds.

906fc3a4317ef10fd97ce98d81e90615f0cc25ee authored over 8 years ago by Arkiver2 <[email protected]>
worker.py: check if length of downloaded worker_script.py is not 0.

82843f3e0e6465fd475862c825f316dfce0705cd authored over 8 years ago by Arkiver2 <[email protected]>
worker.py: Updated to download and run worker_script.py every 5 minutes.

9eaab2cb1bfbe2776004e0dcfdbe9af0182ac5d2 authored over 8 years ago by Arkiver2 <[email protected]>
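
The three worker.py commits above describe a refresh loop: fetch worker_script.py every 300 seconds and reject an empty download. A rough sketch of that idea follows. The `fetch` callable and both function names are assumptions made for illustration, and the way the real worker.py downloads the script is not shown in this listing.

```python
import os
import tempfile

REFRESH_SECONDS = 300  # "every 5 minutes" / "every 300 seconds" per the commit messages


def refresh_script(fetch, path):
    """Replace the script at `path` with fetch() output, unless that output is empty.

    `fetch` is a hypothetical callable that returns the script as bytes.
    Returns True if the file was updated.
    """
    data = fetch()
    if len(data) == 0:
        return False  # keep the previous copy rather than run an empty script
    with open(path, "wb") as handle:
        handle.write(data)
    return True


def is_refresh_due(last_refresh, now):
    """True once REFRESH_SECONDS have passed since the last refresh."""
    return now - last_refresh >= REFRESH_SECONDS


script_path = os.path.join(tempfile.mkdtemp(), "worker_script.py")
updated = refresh_script(lambda: b"print('grabbing')\n", script_path)
skipped = refresh_script(lambda: b"", script_path)
print(updated, skipped)
```

The length check matters because a failed or truncated download would otherwise overwrite a working script with nothing.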
Merge branch 'master' of https://github.com/ArchiveTeam/NewsGrabber

0ec138a556555b243d66b8c13abd7b00a4cb8bf4 authored over 8 years ago by Arkiver2 <[email protected]>
worker_script.py to be downloaded and run by worker.py.

ae3713194125d60e8b39959ac0496fa00b99435e authored over 8 years ago by Arkiver2 <[email protected]>
Merge pull request #64 from BnA-Robin/Newsbuddy_Main

Grab source url as well for improved replay

55c07df402d6274859e0387ca3f576553700ae99 authored over 8 years ago by Arkiver2 <[email protected]>
Update web__bbc_com.py

43ec9b1a689c8730da1b13b087903033cac7628c authored over 8 years ago by HarryC145 <[email protected]>
# Add url itself if there are any urls found for this url and it hasn't been added to the grablistnormal array yet

# This will improve replay of the website because the original overviews will be captured once fo...

73fadca3267ba2f20eaf7e840a3be856a33ec540 authored over 8 years ago by BnA-Robin <[email protected]>
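
The comment in the commit above describes adding the overview URL itself to `grablistnormal` when it yielded article URLs. A small sketch of that rule, assuming `grablistnormal` is a plain list; the function name `add_source_url` and the example URLs are hypothetical.

```python
def add_source_url(source_url, found_urls, grablistnormal):
    """Queue the found URLs, then the source URL itself if anything was found."""
    for url in found_urls:
        if url not in grablistnormal:
            grablistnormal.append(url)
    # Only add the overview page when it produced URLs and is not queued yet.
    if found_urls and source_url not in grablistnormal:
        grablistnormal.append(source_url)
    return grablistnormal


grablistnormal = []
add_source_url("http://example.com/news", ["http://example.com/news/a"], grablistnormal)
add_source_url("http://example.com/empty", [], grablistnormal)  # nothing found, nothing added
print(grablistnormal)
```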
Merge pull request #63 from BnA-Robin/master

New services

b53eaa03789422f5a60919a55a3cb081a4407745 authored over 8 years ago by HarryC145 <[email protected]>
Added special Dutch news website for children

7d43d5b9aa475788a3d38a01272589c23a98bcc7 authored over 8 years ago by BnA-Robin <[email protected]>
Added elsevier.nl

06f2d623e084516c98dfb64285f78788db951b05 authored over 8 years ago by BnA-Robin <[email protected]>
Conventions: start list of urls on the first row

17a3de756c1a0df3576630afb81a0575451da853 authored over 8 years ago by BnA-Robin <[email protected]>
Added some more services that got referenced in recent news and/or have background articles

d75e8b4f9e6182e5d2cbc1ef7ae2711ec798f4cd authored over 8 years ago by BnA-Robin <[email protected]>
Added russian ria.ru

9f6d7d0c59919aacfef896de92948992f51e77c3 authored over 8 years ago by BnA-Robin <[email protected]>
Add 25 Russian newssites and 1 YouTube channel.

f2ba746c9adac0df73cc7e2c455502094f1e4a8d authored over 8 years ago by Arkiver2 <[email protected]>
add mackungfu.org

116a1d22a392ba56cfca1829a3de839671c89593 authored over 8 years ago by PressStartandSelect <[email protected]>
Tries on seed URLs to 10.

Bump version to 20160321.01.

7ed655915870f269fc7063fb6e931dc6ea2ffd83 authored over 8 years ago by Arkiver2 <[email protected]>
Standard regexes shouldn't match the domain.

Bump version to 20160320.02.

51dd41a49281a9f58d46195239abe905c83a14d1 authored over 8 years ago by Arkiver2 <[email protected]>
Refresh services on command !rs or !refresh-services.

Write files not to temp first.
Bump version to 20160320.01.

84f6e66f1bc5b6a66076109072b1fc4580e104d2 authored over 8 years ago by Arkiver2 <[email protected]>
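
The `!rs` and `!refresh-services` commands named above could be recognized with a pattern along these lines. This is only an illustration: the regex and the helper name `is_refresh_command` are assumptions, not the expression used in main.py.

```python
import re

# Hypothetical pattern: a leading "!" followed by either command spelling.
REFRESH_COMMAND = re.compile(r"^!(?:rs|refresh-services)$")


def is_refresh_command(message):
    """Return True if the IRC message is exactly one of the refresh commands."""
    return bool(REFRESH_COMMAND.match(message.strip()))


for line in ("!rs", "!refresh-services", "!rsync", "hello"):
    print(line, is_refresh_command(line))
```

Anchoring the pattern at both ends keeps a message such as `!rsync` from being treated as `!rs`.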
Add 9 Russian newssites

317d2f8991d3d839cd369fba6b770322bbf4b8df authored over 8 years ago by Arkiver2 <[email protected]>
Update web__reuters_com.py

b80703941bb5869338343123cc3a84478bbe7195 authored over 8 years ago by HarryC145 <[email protected]>
Add 17 Russian newssites and youtube channels.

28b0170650581f1e872ee7ea01ba86b0c47ef506 authored over 8 years ago by Arkiver2 <[email protected]>
update politifact

5b67597d58d622b5934614e21bc6b05193fdeb74 authored over 8 years ago by PressStartandSelect <[email protected]>
Create web__video_bbc_com.py

6594b620e926ef09d6d0bb41783f07a0f9ac48e2 authored over 8 years ago by HarryC145 <[email protected]>
Fix regex for recognizing IRC commands. Bump version to 20160314.01.

2246b958d087b7190d7c729991673df3a933174b authored over 8 years ago by Arkiver2 <[email protected]>
Merge pull request #62 from BnA-Robin/master

Added qq.com (ranked 8th in global Alexa)

f6ef5cde29f45f91c426408f41dd695184dab4e0 authored over 8 years ago by HarryC145 <[email protected]>
Fix alignment and filename for qq.com

ca2ddb8c12ad144a87e942843ec998f41d56f038 authored over 8 years ago by BnA-Robin <[email protected]>
Fix for formatting of qq.com

6e88745f7d168f43199d8952dc23d8a75d4e811b authored over 8 years ago by BnA-Robin <[email protected]>
Added qq.com (ranked 8th in global Alexa)

8f924ad4b3ea70e3ed2106533799fc5f9ea7ded6 authored over 8 years ago by BnA-Robin <[email protected]>
Add 12 UK newssites.

f40ee35f21ae5c8276ac25bb62bd5a7c4e242e3d authored over 8 years ago by Arkiver2 <[email protected]>
Add looopings.nl

323553aecca75ab5c3d34e5736ed5501a6a88559 authored over 8 years ago by Arkiver2 <[email protected]>
main.py: Fix writing of Internet Archive item numbering to disk.

Fix trying to remove file that's already removed by rsync.
Bump version to 20160311.01.

6a600278d0873ad16a98f602e2871b5679d22000 authored over 8 years ago by Arkiver2 <[email protected]>
worker.py: Use --no-o and --no-g with rsync (http://serverfault.com/questions/364709/how-to-keep-rsync-from-chowning-transfered-files).

458b76df15fc6979904c21594d059b5f6392d7c4 authored over 8 years ago by Arkiver2 <[email protected]>
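
As a sketch of the rsync change above: `--no-o` and `--no-g` switch off owner and group preservation, so the receiving side does not try to chown transferred files. The `-a` flag, the helper name `build_rsync_command`, and the example paths are assumptions; the listing does not show the full command worker.py runs.

```python
def build_rsync_command(source, destination):
    """Build an rsync argument list that skips owner and group preservation."""
    # "-a" is assumed here; "--no-o" and "--no-g" come from the commit message.
    return ["rsync", "-a", "--no-o", "--no-g", source, destination]


command = build_rsync_command("grabs/", "storage.example.org::newsgrabber/")
print(" ".join(command))
```

The list form can be passed to `subprocess.run` directly, which avoids shell quoting issues with file names.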
main.py: Set max number of concurrent uploads with !con-uploads or !concurrent-uploads.

Before an upload, check with Internet Archive if your accesskey is not
'over_limit' or 'rationi...

a9e3906d3f8d216176a74ef1a98ba4878b9aca3f authored over 8 years ago by Arkiver2 <[email protected]>
Split discovery and grabbing.

Main.py is the storage server. Worker.py is the grabber server. The
storage server discovers new...

2ac7957edffb6deb0f8291a4045fb038b1a80640 authored over 8 years ago by Arkiver2 <[email protected]>
add 8 news sites and fix 9 news sites

23c8eeef5cd6af487788e1a136196fd4e672ffec authored over 8 years ago by PressStartandSelect <[email protected]>
fix filename

002bb07c705a1cd48fbc228c9e7fe0f9dec82795 authored over 8 years ago by PressStartandSelect <[email protected]>
add 21 african news sites

bc755a433621d0e3ae2ca7e1e651c1361f85dc13 authored over 8 years ago by PressStartandSelect <[email protected]>
add nunatsiaqonline.ca

b9762e546f00bf633b71993d2c0d68ee5e4e0b96 authored over 8 years ago by PressStartandSelect <[email protected]>
add more postmedia sites

6cabb2125da6bda7b283bbf8f3c08f1311dbcf45 authored over 8 years ago by PressStartandSelect <[email protected]>
add metro news, politifact, politico

1025918d5b2f9a60c6334389ec6e6df54f8e1335 authored over 8 years ago by PressStartandSelect <[email protected]>
add 10 sites

44a896fde058464182490e6570822c1ad560b6cc authored over 8 years ago by PressStartandSelect <[email protected]>
Fix issue #60, #59, investigate on #21. Bump version to 20160219.01.

85486b18ecd027312225cbcf73e7e70e312f7c50 authored over 8 years ago by Arkiver2 <[email protected]>
Merge pull request #58 from ingenioustechie/master

Added dnaindia.com

7a6e14b611a271831d4258fe36ebd03e874e9385 authored over 8 years ago by HarryC145 <[email protected]>
Added Wikidata

4f203b5ddcf9d43ca6a44dc6025517a657a5e16e authored over 8 years ago by Mukesh <[email protected]>
did Indentation

a1cfc6bd64b7a4f997e60e165529a20dc872fd8a authored over 8 years ago by Mukesh <[email protected]>
Changed refresh time

4b3056592af66152d86a72893f4ae72f586b1df2 authored over 8 years ago by Mukesh <[email protected]>
Create web__liveleak_com.py

5f466cd1a106d72a6add8ae75aed2ab0a3ab7602 authored over 8 years ago by HarryC145 <[email protected]>
http? to https?

a295f4108f85fec140c137b0775aacd3ad867e53 authored over 8 years ago by Arkiver2 <[email protected]>
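
The "http? to https?" change above fixes a common regex slip: in `http?` the `?` makes the final `p` optional, while `https?` makes the `s` optional. The snippet below illustrates the difference with Python's `re` module; the two patterns are simplified stand-ins, since the exact expressions touched by the commit are not shown here.

```python
import re

OLD = re.compile(r"^http?://")   # "htt" plus an optional "p"
NEW = re.compile(r"^https?://")  # "http" plus an optional "s"

for url in ("http://example.com/", "https://example.com/"):
    print(url, bool(OLD.match(url)), bool(NEW.match(url)))
```

With the old form, `https://` URLs fail to match because the `s` is never accounted for; the new form accepts both schemes.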
Update web__udn_com.py

8b31336902e1ad8ed841034fdf26775470c755aa authored over 8 years ago by HarryC145 <[email protected]>
Fix reconnect problem. Bump version to 20160216.01.

7a45c053ca41e22e02b710602b9836b4edb697d2 authored over 8 years ago by Arkiver2 <[email protected]>
Added dnaindia.com

ad7ebce625b2568aca4115724d79b3e9050f556b authored over 8 years ago by Mukesh <[email protected]>
Merge pull request #55 from ersi/taiwan-top-news

Adding 5 Taiwan/ROC newspapers

008a845f0fb9bf1e33942cb7d490c71905e8a0b1 authored over 8 years ago by HarryC145 <[email protected]>
Changing refresh to 6 from 7 for taiwan news

87b15f73cd53721d6bded0cd5f18d2805db027fd authored over 8 years ago by Erik Simmesgård <[email protected]>
Create web__naenara_com_kp.py

ff7f62d43d5972e5ba6b34a06ac1647eec56f068 authored over 8 years ago by HarryC145 <[email protected]>
Fix issues #45, #54, #53, #57, #48. Bump version to 20160214.01.

478bd919165738d14dcbf53d9037bd2ded9b1a15 authored over 8 years ago by Arkiver2 <[email protected]>
Fix upload problem. Update version to 20160123.01.

bea83436cabb12a4079a9979386f3772b6407e5c authored over 8 years ago by Arkiver2 <[email protected]>
Adding 5 Taiwan/ROC newspapers:

* appledaily.com.tw
* chinatimes.com
* ltn.com.tw
* taipeitimes.com
* udn.com

f5976dbf31ce615b9bbdafa0cf2baaefdcf4cf27 authored over 8 years ago by Erik Simmesgård <[email protected]>
Update web__euronews_com.py

d1e4d11a30466158fb3242cb9c79b972f89c64dc authored over 8 years ago by HarryC145 <[email protected]>
Update web__euronews_com.py

73289725ada9aff28278a4312f9d31cb435ac745 authored over 8 years ago by HarryC145 <[email protected]>
Create web__euronews_com.py

3ce5689ebb9db90d61c10498fd804a7821825deb authored over 8 years ago by HarryC145 <[email protected]>
Create web__nytimes_com.py

0d53f0af258a007d24c8969b322f193d77c11617 authored over 8 years ago by HarryC145 <[email protected]>
Create web__ap_org.py

63f5d6423f908da29c3d841dcc3254a29c656cf0 authored over 8 years ago by HarryC145 <[email protected]>
Rename web_thanhnien_vn.py to web__thanhnien_vn.py

fdb38a14907ee0413ce24a6c50aab6e4bda0970c authored over 8 years ago by HarryC145 <[email protected]>
Create web_thanhnien_vn.py

5abd721f943890108c73021a35af98b79353cd85 authored over 8 years ago by HarryC145 <[email protected]>
added http://tuoitrenews.vn/ - vietnam

fa15d0995083b5d3829a216ef80ac02f21c7987a authored over 8 years ago by HarryC145 <[email protected]>
Merge pull request #36 from espes/master

add CCTV中文国际 youtube channel

2421bf2d481bce7a4fa91f49611621f39e3d5811 authored over 8 years ago by HarryC145 <[email protected]>
Merge pull request #37 from JesseWeinstein/wikidata_links

5 wikidata links (including two newly created items)

9ae2fd5f70e261d05869b657878b210962d2f7cb authored over 8 years ago by HarryC145 <[email protected]>
5 wikidata links (including two newly created items)

85d8bf2f85e74c713c9df6811a34f2978e623384 authored over 8 years ago by Jesse Weinstein <[email protected]>
add cctvch youtube channel

1c8668996e495855b6afcd6c0f1f94268a027493 authored over 8 years ago by espes <[email protected]>
Merge pull request #34 from espes/cankaoxiaoxi

add cankaoxiaoxi.com

2bc3cd5fad22488b1769af17a771e2be74651ec5 authored over 8 years ago by HarryC145 <[email protected]>
Merge pull request #33 from espes/master

add news24.com

fed68a71b8745d32c6bde0d2d68aa193c1c9e65f authored over 8 years ago by HarryC145 <[email protected]>