Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/ArchiveTeam/urls-grab
Archiving URLs (outlinks) from a variety of sources.
https://github.com/ArchiveTeam/urls-grab
Version 20240301.01. Move filters from tracker to repo.
fd7672639ce5564d674c343a4398bf72ebacf8f3 authored 10 months ago by arkiver <[email protected]>
Version 20240116.01. Skip on URL with crumb parameter when parent URL has crumb parameter as well.
f511f3854c76257d11ee1215fd73833540b4f5df authored 12 months ago by arkiver <[email protected]>
Version 20240110.01. Introduce filtering of URLs when parent and new URL match the same pattern. Improve filtering out of '+xxx+' in URLs. Add several patterns to prevent some loops.
2068e64612d3b6d545d549f8048afcf3b9a70c9f authored 12 months ago by arkiver <[email protected]>
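To illustrate the parent/new URL filter described in version 20240110.01, here is a minimal Python sketch. It assumes the rule is "skip a discovered URL when it and its parent both match the same pattern". The function name `matches_same_pattern` and the sample pattern are hypothetical and are not taken from the repository.

```python
import re

# Hypothetical loop pattern, for illustration only; the repository's real
# filter list is not reproduced here.
LOOP_PATTERNS = [re.compile(r"/template/news/")]

def matches_same_pattern(parent_url, new_url, patterns=LOOP_PATTERNS):
    """Return True when a single pattern matches both the parent and the new URL."""
    return any(p.search(parent_url) and p.search(new_url) for p in patterns)
```

A URL for which this returns True would be dropped rather than queued, which stops a page from endlessly rediscovering variants of itself.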
Version 20231230.01. Prevent loop due to sitemapa.xml URLs pointing to a different domain.
756e327d5f79464a236757bd322a40bf818fa63e authored about 1 year ago by arkiver <[email protected]>
Version 20231229.01. Ignore certain font URLs discovered from other font URLs.
4e092207f9c0a83d17fb769622a184462f8dd9ee authored about 1 year ago by arkiver <[email protected]>
Version 20231212.01. Exit URL with ver URL parameter.
a730f5492d690b5f8d30fb00be9f394273f7aad3 authored about 1 year ago by arkiver <[email protected]>
Version 20231207.01. Exit the URL on finding the xoid URL parameter.
1bfe2ffb6e5e52c8dfe23ca555e5dcc7d5d88fb6 authored about 1 year ago by arkiver <[email protected]>
Version 20231112.04. Stop queuing URLs discovered on special interest pages.
394d1a56af634518ee05e7fbedde578c636e3b33 authored about 1 year ago by arkiver <[email protected]>
Version 20231112.03. Stop queuing to the Zippyshare project.
f6b1d712035d4d431cbecd28807c0c9b7aed50bf authored about 1 year ago by arkiver <[email protected]>
Version 20231112.02. Discover and queue Imgur items.
5a6d47040d20dd1aa181f5299815b81479dc3640 authored about 1 year ago by arkiver <[email protected]>
Version 20231112.01. Also support trailing / for 301 to other domain to handle spam.
dcdee944eec699a45abea4b47837bd361af81e02 authored about 1 year ago by arkiver <[email protected]>
Version 20231108.06. Do not download the URL that a 301 redirects to in the same session when it is not queued back.
24ec555db94a95afe261280978176a0b3b319fe6 authored about 1 year ago by arkiver <[email protected]>
Version 20231108.05. Queue again from ads.txt and app-ads.txt. Do not queue the URL that a 301 redirects to if it is a front page without trailing /.
c7c2718124f502d3ac0dc0c0a16f65c7345779a7 authored about 1 year ago by arkiver <[email protected]>
Version 20231108.04. Stop queuing from ads.txt and app-ads.txt. Set multi item size to 100, to limit at tracker side.
0d53e097f74dcf9db76c9f5540ba2b6b2a74c28b authored about 1 year ago by arkiver <[email protected]>
Version 20231108.03. Append / to URL when normalising for aborted URLs check if not enough / in URL.
51da137bece442f1ddecd3e1881602225c6bc91b authored about 1 year ago by arkiver <[email protected]>
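The normalisation in version 20231108.03 can be pictured with a small Python sketch. It assumes that "not enough /" means fewer than three slashes, i.e. a URL such as `https://example.com` with no path at all. That threshold and the function name `normalise_for_aborted_check` are assumptions for illustration, not details confirmed by the commit.

```python
def normalise_for_aborted_check(url: str) -> str:
    """Append a trailing slash when the URL has a scheme and host but no path."""
    # Assumption: "scheme://host" contains exactly two slashes, so fewer than
    # three slashes means the path component is missing.
    if url.count("/") < 3:
        url += "/"
    return url
```

With this, `https://example.com` and `https://example.com/` compare equal when looked up among the aborted URLs.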
Version 20231108.02. Exit on S?SID and _?s URL parameters.
10695bfba2c44fa216171132931d7be37fe84b03 authored about 1 year ago by arkiver <[email protected]>
Version 20231108.01. Take out new /template/news/{xzx,b1/} spam changes.
b1d6ce647b5e1d9afa8bc69ce396cde5eb1232a3 authored about 1 year ago by arkiver <[email protected]>
Version 20231107.02. Initial commented out code for using pandoc to convert file to PDF for further processing for URLs extraction.
7050f9ecb4023a0d8a7a11abec8d8b5e2d76a445 authored about 1 year ago by arkiver <[email protected]>
Version 20231107.01. Rewrite ˜ to ~ in PDF extracted URLs. Handle port in extracted URLs without protocol from PDF.
4abe1c23611b9b4f9fe5de5398ead04fc4832e37 authored about 1 year ago by arkiver <[email protected]>
Version 20231102.01. Check for minimum version of Wget-AT instead of specific version.
d99c6ba6958e974361f39f416d600b70ce49f2eb authored about 1 year ago by arkiver <[email protected]>
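A minimum-version check like the one named in version 20231102.01 might look like the following Python sketch. It assumes version strings of the form `1.21.3-at.20230623.01`, as quoted elsewhere in this log. The minimum value, the regular expression and the function names are hypothetical, not the project's actual code.

```python
import re

# Hypothetical minimum, chosen only because this version string appears in the log.
MINIMUM_VERSION = "1.21.3-at.20230623.01"

def parse_version(version: str) -> tuple:
    """Split 'X.Y.Z-at.YYYYMMDD.NN' into a tuple of integers for comparison."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-at\.(\d{8})\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised Wget-AT version: {version!r}")
    return tuple(int(part) for part in match.groups())

def meets_minimum(found: str, minimum: str = MINIMUM_VERSION) -> bool:
    """Return True when the found version is at least the required minimum."""
    return parse_version(found) >= parse_version(minimum)
```

Comparing integer tuples avoids the pitfalls of comparing the raw strings, and lets newer Wget-AT builds pass without a code change.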
Version 20231031.01. Take out new /template/ loops.
0c36bf8ee7c61a163e5d238f90ef8b89d1a3b3ee authored about 1 year ago by arkiver <[email protected]>
Version 20231024.03. Extract every candidate URL from set of strings to join.
b4d2520b4c7e863adc15458412606db58c119d62 authored about 1 year ago by arkiver <[email protected]>
Version 20231024.02. Filter bad extracted URL.
624906dd015c17fb78e6b31e8e49023e5464f8b4 authored about 1 year ago by arkiver <[email protected]>
Version 20231024.01. Remove some too wide filter patterns.
b549441f1ca1fe17ed5dc41b569443b707a7fff9 authored about 1 year ago by arkiver <[email protected]>
Version 20231020.02. Fix filter pattern to allow for - in URL.
9c7488c70f9780a0635ef6acb60b09cd8e3db2bb authored about 1 year ago by arkiver <[email protected]>
Version 20231020.01. Add more ads URLs to one-time patterns list.
82d36071cdb3de185ea7c72c6dcbf171c53e78d4 authored about 1 year ago by arkiver <[email protected]>
Version 20231019.02. Queue back all URLs found on special interest pages.
8b395ccab59bd3b5c9c3551d32539f4631c73430 authored about 1 year ago by arkiver <[email protected]>
Version 20231019.01. Support extracting URLs from PDF with obfuscated '.' as ' dot ' or ' (dot) ' or ' [dot] '. Handle extra white spaces after newline in PDF URL extraction.
f3e6c3caaf889eb4b460f2809f4b51778c0efd86 authored about 1 year ago by arkiver <[email protected]>
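The de-obfuscation described in version 20231019.01 can be sketched in Python as below. It assumes the input is a candidate string already suspected to be a URL, so joining wrapped lines is safe. Both regular expressions and the function name `deobfuscate_candidate` are hypothetical, not the repository's implementation.

```python
import re

# Matches ' dot ', ' (dot) ' or ' [dot] ' with surrounding whitespace.
DOT_PATTERN = re.compile(r"\s+(?:dot|\(dot\)|\[dot\])\s+", re.IGNORECASE)
# Matches a newline plus any indentation that follows it in the PDF text.
WRAP_PATTERN = re.compile(r"\n\s*")

def deobfuscate_candidate(text: str) -> str:
    """Join wrapped lines, then turn obfuscated dots back into '.'."""
    text = WRAP_PATTERN.sub("", text)
    return DOT_PATTERN.sub(".", text)
```

For example, `deobfuscate_candidate("www [dot] example [dot] org")` yields `www.example.org`.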
Version 20231017.01. Exit URL on _event_transid parameter.
32d22d21432810b1993561d1990701c87c8b19eb authored about 1 year ago by arkiver <[email protected]>
Version 20231016.02.
c2fdb4c2e3f987251e479e5cf8fb521caf4d38ee authored about 1 year ago by arkiver <[email protected]>
Filter out /read/ loop.
d982c15f7832cd3b00ade9a51dd06170df243c3b authored about 1 year ago by arkiver <[email protected]>
Prevent /pics/K888 loop.
1b86e04abc8ff2dcf8032e0657d2b0e04d75cbdc authored about 1 year ago by arkiver <[email protected]>
Handle newshtml loop.
c7c5a8e2cf1867ab894d8ce48f496ba995f03b56 authored about 1 year ago by arkiver <[email protected]>
Version 20231016.01. Ignore upluds yamaxun loop.
f4c4e39f7954014b779f5586541a1399912ae1bf authored about 1 year ago by arkiver <[email protected]>
Version 20231015.02. Ignore loops.
68977d2e3131cbda44392f9eaf8ba358b6d03d8b authored about 1 year ago by arkiver <[email protected]>
Version 20231015.01. googlesyndication.com and googletagmanager.com URLs are one-time URLs.
5bb9f7ccd9e3c14911659227e533740caf0ac624 authored about 1 year ago by arkiver <[email protected]>
Version 20231010.03. Actually stop opening user-agents.txt file as well.
3ac6baafa6d35ac9aed7b2f2ba0c22195e288abd authored about 1 year ago by arkiver <[email protected]>
Version 20231010.02. Load user agents list only once.
e3e5e74136e469ff07437f61a837a5f90a0e9cd0 authored about 1 year ago by arkiver <[email protected]>
Version 20231010.01. Disable check on http://on.quad9.net/.
1ebafa4f63cfcd9a922a2eccc5c3feb8e97395dc authored about 1 year ago by arkiver <[email protected]>
Version 20230811.01.
7a5d6cd66f641dec3fff0ed9d93c6c19f3262544 authored over 1 year ago by arkiver <[email protected]>
Revert "Version 20230810.01. Enable getting special interest URLs."
This reverts commit a61bf4cb0674c3e3410e69ebe8c14a40e09580e3.
28aa4aae5c6ac376f88225f2db70851e7a8a9618 authored over 1 year ago by arkiver <[email protected]>
Version 20230810.01. Enable getting special interest URLs.
a61bf4cb0674c3e3410e69ebe8c14a40e09580e3 authored over 1 year ago by arkiver <[email protected]>
Version 20230809.01. Get rid of --rotate-dns option.
0f8d2e87667bb7e7c1eda71b00636d4b03cf3a44 authored over 1 year ago by arkiver <[email protected]>
Version 20230807.03. More one time URLs.
913501289112fe48ee9889c6c617c1188c714667 authored over 1 year ago by arkiver <[email protected]>
Version 20230807.02. Update user-agents.
745ab956747a5d55067f69c86e031ffb51bcaebe authored over 1 year ago by arkiver <[email protected]>
Version 20230807.01. Treat doubleclick.net as one time URL. Do not queue back all URLs found on pages of interest.
20b98a866b62d35fd7ca27c10dd2c077f3857a58 authored over 1 year ago by arkiver <[email protected]>
Merge pull request #14 from imerr/master-1
Extra docker container params
364ff907c74bec1c1dc5e4aad7fa6f28b64e1306 authored over 1 year ago by arkiver <[email protected]>
Extra docker container params
watchtower: `--include-restarting` also update if the container is in a crash loop due to a ba...
7484928673406533146a9a990d3751764e6b41f4 authored over 1 year ago by Robin Rolf <[email protected]>
Version 20230803.01. Queue URLs matching certain patterns to 'onetime' shard instead of main filter.
a3776c6916c6750b19ca67cbeb7461ca1cbac309 authored over 1 year ago by arkiver <[email protected]>
Version 20230801.01. Queue URLs extracted from new sitemaps to separate tracker.
26c6ab229a7242ddab030ba3c2c969d0ded9dfe2 authored over 1 year ago by arkiver <[email protected]>
Version 20230727.01. Use GNU Wget 1.21.3-at.20230623.01. Use Wget-AT option --reject-reserved-subnets. Remove old Wget files. Update README to latest.
14a7eab17ba9b7829f7418782e729f12a0f0149f authored over 1 year ago by arkiver <[email protected]>
Version 20230725.02. Enable queuing URLs from special interest URLs.
047a74b99276328953523d31d0b6efbae11e6299 authored over 1 year ago by arkiver <[email protected]>
Version 20230725.01. Reformat checks in pipeline.
c77610711a4878da35f31030484808e5f61347af authored over 1 year ago by arkiver <[email protected]>
Version 20230722.03.
ff80c50c254e96955179edaed8521af5d6ca952a authored over 1 year ago by arkiver <[email protected]>
Revert "Version 20230722.02. Queue pages from special interest pages again."
This reverts commit 2d78071caa26421348929e5526277eb0246baabf.
85fd95763f744b669689b56df213234d71616160 authored over 1 year ago by arkiver <[email protected]>
Version 20230722.02. Queue pages from special interest pages again.
2d78071caa26421348929e5526277eb0246baabf authored over 1 year ago by arkiver <[email protected]>
Version 20230722.01. Extract more news sitemaps.
5d7bd3f1ebc5bfa72adf48d355572a082da13887 authored over 1 year ago by arkiver <[email protected]>
Version 20230721.01. Fix pattern match problem.
735b4fde33595843a078681913bd788ff12176d4 authored over 1 year ago by arkiver <[email protected]>
Version 20230719.03. Improve spam check.
3c6d283ba5a06aafa447c45d220cf446430fc27b authored over 1 year ago by arkiver <[email protected]>
Version 20230719.02. Queue default pages only for https URL.
280dbb1d95d39442db1c32e89b606ac9e446f116 authored over 1 year ago by arkiver <[email protected]>
Version 20230719.01. Only queue special URLs if main page is 200.
859f85668310d2f15b92503cf6bd66ea03dfe779 authored over 1 year ago by arkiver <[email protected]>
Version 20230718.02. Improve spam loop checks.
74dd2276d440f60c05edb08d405f4707e9ae9bcd authored over 1 year ago by arkiver <[email protected]>
Version 20230718.01. Improve spam loop checks.
a76de59a4b711f741844a3585f3285252252aeea authored over 1 year ago by arkiver <[email protected]>
Version 20230716.01. Change spam loop check.
a5689410b86fe3814157ffdfbccd108c6fcf7eac authored over 1 year ago by arkiver <[email protected]>
Version 20230711.02. Do not check 'res' in http_stat when deciding to write to WARC.
f77b4c4a4af1354e94f90b7a57cd2953e7acec67 authored over 1 year ago by arkiver <[email protected]>
Version 20230711.01. Queue news sitemaps to separate project.
83c2b3d7312f096e98166c9f73aead5b7583cbb9 authored over 1 year ago by arkiver <[email protected]>
Version 20230708.03. Check for correctly extracted URLs from robots.txt.
067de6151f9a8bca3e188b7157e44aafb3969008 authored over 1 year ago by arkiver <[email protected]>
Version 20230708.02. New attempt at preventing spam loop.
5a9f3981d7130a6817bac0c909b2b2bafebbf633 authored over 1 year ago by arkiver <[email protected]>
Version 20230708.01. Relax checks on spam domain URLs.
ca89daa9a67e47f7bd08326414af9717ea6cbcec authored over 1 year ago by arkiver <[email protected]>
Version 20230707.03. Attempt to fix spam loop detection.
70685f77585462be87b5c2ae1c2431ab26746976 authored over 1 year ago by arkiver <[email protected]>
Version 20230707.02. Attempt to block out several spam loops.
7c67e83f626f7e18306f452c189a86af3d138e2c authored over 1 year ago by arkiver <[email protected]>
Version 20230707.01. Exit URLs with various parameters. Exit URLs with path starting with various default paths.
37b31234ec574567fb12046c1fb310ea7e7da3f2 authored over 1 year ago by arkiver <[email protected]>
Version 20230706.02. Move around exit URL check.
840dc650011b8c5f6320d0bf726b01c0b6124f16 authored over 1 year ago by arkiver <[email protected]>
Version 20230706.01. Exit URL on state_uuid parameter.
890852df27c28428393c22b26fa7c124c613b1da authored over 1 year ago by arkiver <[email protected]>
Version 20230627.01. Randomize order of DNS servers for --dns-servers options.
2141d7593e31feb485747347823e3f828c311cde authored over 1 year ago by arkiver <[email protected]>
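Randomizing the server order for `--dns-servers`, as in version 20230627.01, could be done along these lines in Python. The sketch assumes the option takes a comma-separated list of addresses. The function name `dns_servers_option` and the sample addresses (from the documentation-only 192.0.2.0/24 range) are hypothetical.

```python
import random

def dns_servers_option(servers, rng=random):
    """Build a --dns-servers argument with the servers in random order."""
    shuffled = list(servers)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return "--dns-servers=" + ",".join(shuffled)

# Example with placeholder addresses; real resolvers would go here.
option = dns_servers_option(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
```

Shuffling per invocation spreads lookups across resolvers instead of always hitting the first one in the list.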
Version 20230626.02. Write percent encoded URLs to aborted URLs files.
efb4de13dad08486c2da2080b15eb56bb06d9de4 authored over 1 year ago by arkiver <[email protected]>
Version 20230626.01. Do not queue all URLs found on special interest pages.
9a7b50188e1adbc863c0d9ee84dae44823e1a13a authored over 1 year ago by arkiver <[email protected]>
Version 20230625.03. Fix filters.
909200b97553cc6a08755b2bf12af65082f1c955 authored over 1 year ago by arkiver <[email protected]>
Version 20230625.02. Move filters from server to repo.
bdcc9ce1f28072557bf0e86d966eba00362b7e4a authored over 1 year ago by arkiver <[email protected]>
Version 20230625.01. Exit on several parameters in URL.
352cc1ecb8e5377efe38f2c79373f25c377974cc authored over 1 year ago by arkiver <[email protected]>
Version 20230624.01. Get rid of another news/html spam loop.
abc6f6f437c330e25873d2a77b67f968514cf674 authored over 1 year ago by arkiver <[email protected]>
Version 20230616.05. Use custom resolv.conf with 'search .'.
957b3bcb8747085085450ba096bee0f36ce1f195 authored over 1 year ago by arkiver <[email protected]>
Version 20230616.04. Move check to test.
b30e76440d50d5d4a348ebbcc59e23cd558c303d authored over 1 year ago by arkiver <[email protected]>
Version 20230616.03. Relax max clock offset to 180 seconds.
dacd2e1c9b99ea2b31c291e92aade3214bdd69b3 authored over 1 year ago by arkiver <[email protected]>
Version 20230616.02. Run checks every 30 multi items.
f77165553638f0d1698aa3242c84e014622f3206 authored over 1 year ago by arkiver <[email protected]>
Version 20230616.01. Better debug output on connection checks.
ce31538a46d3a9d7922f597c3240dd9f4b809edd authored over 1 year ago by arkiver <[email protected]>
Version 20230615.05. Bring domain checks more in line with ArchiveBot checks.
844f7b852a64e55ffce8da7c16b5fe56b72084a6 authored over 1 year ago by arkiver <[email protected]>
Version 20230615.04. Replace example.business check by thissubdomaindoesnotexist.arpa.li.
0d8f03bc0d77787a8fe8b6dcd50f639727300594 authored over 1 year ago by arkiver <[email protected]>
Version 20230615.03. Fix syntax error.
44a0ca5bcf8492e453f772c64074790561c29923 authored over 1 year ago by arkiver <[email protected]>
Version 20230615.02.
3eaacadbcf688f3273f002f1f7bb30b231cc36a0 authored over 1 year ago by arkiver <[email protected]>
Checks on DNS, connection and time.
b73b5c5d8b1520465307b8842dedc979e61b3d77 authored over 1 year ago by arkiver <[email protected]>
Version 20230615.01. Enable queuing URLs found on special interest pages.
545b1027d6fe371e63fd0253912feb37f268411b authored over 1 year ago by arkiver <[email protected]>
Version 20230605.05. Ensure URL is always normalized when finding main URLs in Lua.
7acb158cc9d3474f30e65fa04e58ae4e7234b0db authored over 1 year ago by arkiver <[email protected]>
Version 20230605.04. Do not check for nonexistent host name. Do not check for http to https redirect.
a72957964a8c5b0e7bcfafa218f1f5a255c77687 authored over 1 year ago by arkiver <[email protected]>
Version 20230605.03.
e7d1ca1ad4506f8b8d199a8b905bae9d46b00478 authored over 1 year ago by arkiver <[email protected]>
Version GNU Wget 1.21.3-at.20230605.01. Use GNU Wget 1.21.3-at.20230605.01. Use --host-lookups, --hosts-file, and --resolvconf-file options.
e2b084a30061d08d8df296b96e980eb30a0df704 authored over 1 year ago by arkiver <[email protected]>
Version 20230605.02. Ensure failed URLs are properly registered as having been aborted.
14c376dcd2d673212faefe9c1d3fe2db7991ee41 authored over 1 year ago by arkiver <[email protected]>
Version 20230605.01. Stop queuing URLs from special interest pages.
acb48a92ffc5d78118985587394d564336cd172d authored over 1 year ago by arkiver <[email protected]>
Version 20230604.09. Simple check on http redirecting and odd resolving.
7a88dbd8805d9c4afa398c1a46055a148bb84c46 authored over 1 year ago by arkiver <[email protected]>
Version 20230604.08. Do not attempt to extract URLs from URL with stamp timestamp parameter.
d023faee61fc98d32c94def08ba0e03c275dc8a0 authored over 1 year ago by arkiver <[email protected]>