Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/blogger-grab
Archiving Blogger/Blogspot.
https://github.com/ArchiveTeam/blogger-grab
Version 20231216.01. Remove date check for checking and aborting blogs that had recent updates.
b721023201c15ee6348ab9b1de5a3db7295a79e3 authored about 1 year ago by arkiver <[email protected]>
Version 20231129.04. Support dynamicviews blogs.
d738bcfba7a5bdda578ae0d6e8710c5573716d8c authored about 1 year ago by arkiver <[email protected]>
Version 20231129.03. Do not abort on adult blogs anymore.
fc39f94d93e12d17a7a25c5b0effb37ea6d8d9da authored about 1 year ago by arkiver <[email protected]>
Version 20231129.02. Detect loop in path and ignore URLs at loop depth 4.
7bc7d6f98afdb911241e30888b6654903acfcb74 authored about 1 year ago by arkiver <[email protected]>
Version 20231129.01. Add resolv.conf file.
358db83b9397632a54fe376c4610dfa82ce42230 authored about 1 year ago by arkiver <[email protected]>
Version 20231127.02. Add clock and invalid domain tests.
14f2eddfdc2b6d78dbd3f5c282a6f607f0029426 authored about 1 year ago by arkiver <[email protected]>
Version 20231127.01. Do not get /search/label/ URLs containing ?.
97f217f9964a81dcd814dd7e7178785827649b76 authored about 1 year ago by arkiver <[email protected]>
Version 20231126.03. Disable processing of Wget-AT extracted URL due to problems filtering out sidebars and footers.
0eb9827cf68c33086210640cb4ac50d609942a64 authored about 1 year ago by arkiver <[email protected]>
Version 20231126.02. Enable default Wget-AT URL extraction.
4fc9549335390a730d634407c46d6a091312c7b7 authored about 1 year ago by arkiver <[email protected]>
Version 20231126.01. Relax pattern to extract URLs from href=.
2c554b7dd02e8d8a61496d6b8922cd8d1df9b4fb authored about 1 year ago by arkiver <[email protected]>
Version 20231125.01. Do not accept /search URL with too many slashes.
87b7053c36eb55bdf4891ef0b4e644a3cf7984ac authored about 1 year ago by arkiver <[email protected]>
Version 20231124.02. Ignore searchsearchsearch label URL.
e22fa1768e9f4b5e81948810a7b4215678aea3f1 authored about 1 year ago by arkiver <[email protected]>
Version 20231124.01. Ignore ?en-xx URLs. Handle /search/?q= URLs.
7edfd2aefaa78f8ef4547c778e801e39ca723e18 authored about 1 year ago by arkiver <[email protected]>
Version 20231123.04. Ignore blogger.com/feeds/*/blogs URLs.
ae954e013a98666f397bff7aaa64faaa4ecd465e authored about 1 year ago by arkiver <[email protected]>
Version 20231123.03. Only retry on problematic blog:* item URL.
9ca9b8a2b40d716a9e5d7b2f8cfec2b17f66a37b authored about 1 year ago by arkiver <[email protected]>
Version 20231123.02. Timeout of 4 seconds. Remove --no-http-keep-alive option.
3b14a3fde0293dad6d8a1529607006a5cc607a55 authored about 1 year ago by arkiver <[email protected]>
Version 20231123.01. Do not do retries on url:* item.
c06375d9859cde09ddcdfb1c8178421a12753750 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.19. Fix support for non-com TLDs for blogspot and blogger when extracting blog from URL.
1f1735a9f0efe37fde8219b06ab2701321e7a101 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.18. Support all TLDs. Prevent odd /search/label/ loop. Take out .htmlfeeds/posts URLs.
58d22bd566415973b2de0456037eaa83685c5c3e authored about 1 year ago by arkiver <[email protected]>
Version 20231121.17. Ignore share-post.g? URL.
66832a5851dfe148eebc0fe958eabcb50809abd7 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.16. Do not print message on finding beginning and end of footer/sidebar/menu.
734922b3a80de2929c68946658b1f3c1eb7db986 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.15. Ignore search?q= URLs without updates-max parameter. Ignore ?m%3D1?m%3D1 URLs.
ca7be6b96bdf5427eca04b8ed031bdd9d3c544c1 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.14. Ignore URLs with 5 slashes or more.
95455f66f64aebbafce506732fec788c41bfe6e8 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.13. Support discovering blog, article, and page items using old *.blogger.com domain.
0e154edf1064e8b082d0a0754e07a428a6b1d91a authored about 1 year ago by arkiver <[email protected]>
Version 20231121.12. Remove debug output.
300f089a327b5c074f61a5ea90268fd3f4a47aa9 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.11. Fix handling https:///.
108aa9b0ea3e9e2a6911c7f24d2dcd623fff621f authored about 1 year ago by arkiver <[email protected]>
Version 20231121.10. Do not extract items from menu, sidebar, and footer for article, page, and search items.
4cbaf15783c56b6a83f464f672c854def76e03a7 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.09. Ignore ?m=0 URLs.
f1e23fef7b9236d0de160ea9d6736015676826a4 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.08. Get followed.g list of followers.
60bbaba2c5be79ba1f13fd24ac6ff5f8a2cc1c81 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.07. Do not print every discovered item. Multi item size 1000 (limit on tracker side).
4cbf738a9ce8e7f34447c588cea18f6a0e038960 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.06. Support profile items.
584c4578c6fabc0b28cad15495d5f021b6a5f50a authored about 1 year ago by arkiver <[email protected]>
Version 20231121.05. Write outlinks to a special stash project.
61460697e7f8bef30a1325685f356f5caad0ca57 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.04. Take out sitemap.xml check in robots.txt.
2047dd5ebd3a294efad5764ecec2086263654c1a authored about 1 year ago by arkiver <[email protected]>
Version 20231121.03. Keep decoding items before matching for finding aborted items.
9ee77725d8dbc986450583cb911f9071bd580413 authored about 1 year ago by arkiver <[email protected]>
Version 20231121.02. Move item name check.
8f88ad54dd7b639ec4c6f7ed0f695ff6a3e1546e authored about 1 year ago by arkiver <[email protected]>
Version 20231121.01. Fix local where needed.
2c768d1c819058bfbbd1f1711512bed80fdea4b3 authored about 1 year ago by arkiver <[email protected]>
Version 20231120.02. Do 2 retries on a problematic URL. Ignore ?widgetType=BlogArchive URLs.
0193896f4df3cebe99d9e3699a16e0e1f845545b authored about 1 year ago by arkiver <[email protected]>
Version 20231120.01. Ignore ?showComment= URLs.
a8e86a8ed26fccc3d2e6aa597e834a145f0d26be authored about 1 year ago by arkiver <[email protected]>
Version 20231119.09. Increase backfeed size to 500 items.
57081be4557d578fa7a11d6bf5ed600c86acbcf3 authored about 1 year ago by arkiver <[email protected]>
Version 20231119.08. Skip URLs with space characters.
5549daf465e9977407db5e7ae69a4b846a8c6d62 authored about 1 year ago by arkiver <[email protected]>
Version 20231119.07. Multi item size 200 (limit on tracker side). Relax sitemap check.
89476a7d22ff3db747bad498a605dbc88ae89c0f authored about 1 year ago by arkiver <[email protected]>
Version 20231119.06. Various updates.
17a130b7a17279177a9b4ba3d8ea0b62b3617d89 authored about 1 year ago by arkiver <[email protected]>
Version 20231119.05. Get update-max pages again.
3d78441dcb12136502dbdd902bacee30bd39a2ea authored about 1 year ago by arkiver <[email protected]>
Version 20231119.04. Multi item size 10. Correctly use cjson. Accept status code 400. Ignore main pagination due to size. Better ignores.
672e4a6cc2b2f006f710e7688f1defa97ed44724 authored about 1 year ago by arkiver <[email protected]>
Version 20231119.03. Fix project name.
359e24b509475c601081b65445d278272a18dc89 authored about 1 year ago by arkiver <[email protected]>
Version 20231119.02. Move checks for skipping a blog up. Skip on adult overlay.
de5683e117c4d0e10c7d80c604cc013a176a053d authored about 1 year ago by arkiver <[email protected]>
Version 20231119.01. Initial.
e473f3c84b0b51f481c377a192ed03e3b5bb8d10 authored about 1 year ago by arkiver <[email protected]>
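Several of the commits above add URL ignore rules (space characters, ?showComment=, ?widgetType=BlogArchive, ?m=0, share-post.g?, /search/label/ URLs containing ?, and path loops). The following is a minimal Python sketch of how such rules might be combined into one filter. It is not the project's actual code: the function names, the substring matching, and the reading of "loop depth 4" as "the same path segment occurring 4 times" are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

# Substrings named in the commit messages; matching them as plain
# substrings is an assumption, not the project's confirmed logic.
IGNORED_SUBSTRINGS = (
    "?showComment=",
    "?widgetType=BlogArchive",
    "?m=0",
    "share-post.g?",
)

# Assumed interpretation of "loop depth 4": a path segment seen 4 times.
MAX_SEGMENT_REPEATS = 4


def has_path_loop(path, limit=MAX_SEGMENT_REPEATS):
    """Return True if any non-empty path segment occurs `limit` times or more."""
    segments = [s for s in path.split("/") if s]
    return any(n >= limit for n in Counter(segments).values())


def should_ignore(url):
    """Hypothetical filter combining a few ignore rules from the log."""
    if " " in url:  # skip URLs with space characters
        return True
    if any(marker in url for marker in IGNORED_SUBSTRINGS):
        return True
    parts = urlsplit(url)
    # /search/label/ URLs are skipped when they contain a "?"
    if parts.path.startswith("/search/label/") and "?" in url:
        return True
    return has_path_loop(parts.path)
```

A caller would run each discovered URL through should_ignore() before queueing it, for example should_ignore("https://example.blogspot.com/2020/01/post.html?m=0"), where example.blogspot.com is a placeholder host.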