Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/pagespersoorange-grab

Archiving Pages Perso Orange
https://github.com/ArchiveTeam/pagespersoorange-grab

Version 20231003.01. Handle problematic items.
bc158f9d77b085d680d079410b12f0541752f9e9 authored about 1 year ago by arkiver

Version 20231002.01. Abort item if URL is seen more than 5 times.
49ca43f1fa635c88abc1bb6b7a756f9b39a9a863 authored about 1 year ago by arkiver

Version 20231001.03. Increase multi item size to 100 to limit on tracker side. 6 second sleep time for 404 redirect, else 1 second.
1b5a037d38d36b62eff5860daa4f58c6df4f6958 authored about 1 year ago by arkiver

Version 20231001.02. Only strip marie, ecole, assoc or pagespro-orange.
c779112eb897f3c2874d31a88ac9363e86ff9083 authored about 1 year ago by arkiver

Version 20231001.01. Strip .ecole, .assoc, .marie from 'site'.
c34768ce568217ec7c3e772ad6f2d2565ec8dcde authored about 1 year ago by arkiver

Version 20230929.02. Prevent subdomain loop.
903a2cb10f8c85baa391fe683bf5eb7cda220989 authored about 1 year ago by arkiver

Version 20230929.01. Skip several domains from queuing to.
1e60275fc0606a8ef0978b91951c4ccd39ebaaa0 authored about 1 year ago by arkiver

Version 20230928.04. Improvements.
48fc3b422ca345fb3fb14c855304efc0e96bfd69 authored about 1 year ago by arkiver

Version 20230928.03. Fix resetting some variables.
f76fc64a5dd90f06bbb5e2800ea27a2970b414e5 authored about 1 year ago by arkiver

Version 20230928.02. Also queue *.orange and monsite.wanadoo.
aaeb1a7b35f02c3d56d944651cc4dd7c655f9553 authored about 1 year ago by arkiver

Version 20230928.01. Do not queue some domains if another is found. Handle woopic.com. Use different sleep times. Discover outlinks.
814852c35f4be3c099667788e74607f0c1b96d83 authored about 1 year ago by arkiver

initial
20d2dde19c71d3167ac482dab9ac749e5f3f82b6 authored about 1 year ago by arkiver