Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/mwmbl/crawler-extension

A browser extension that can be installed by volunteers to participate in mwmbl distributed crawling.
https://github.com/mwmbl/crawler-extension

Sleep when no urls in batch
daoudclarke opened this pull request 6 months ago

Add Mirror to Codeberg workflow
diagonalo opened this pull request about 1 year ago

Bug fixes for curation
daoudclarke opened this pull request about 1 year ago

Query extra search engine
daoudclarke opened this pull request about 1 year ago

[Feature] Crawl page on demand
EchedeyLR opened this issue over 1 year ago

[Feature] Adjustable requests limit per website
EchedeyLR opened this issue over 1 year ago

Tag released versions on Git
omasanori opened this issue about 2 years ago

Replace the generic extension icon with official branding one on AMO
omasanori opened this issue about 2 years ago

Remove some logging
daoudclarke opened this pull request about 2 years ago

Improve crawler prioritisation
daoudclarke opened this issue about 2 years ago

Specify the Accept-Language field to fetch requests
omasanori opened this pull request about 2 years ago

add websites to the crawler
nobaraos12 opened this issue about 2 years ago

Add stat ticker for pages uploaded
nobaraos12 opened this issue about 2 years ago

add support for legacy versions of firefox
nobaraos12 opened this issue about 2 years ago

Specify the desired language in our requests
daoudclarke opened this issue over 2 years ago

Internationalization
adjagu opened this pull request over 2 years ago

Send all links
daoudclarke opened this pull request over 2 years ago

Update README.md
adjagu opened this pull request over 2 years ago

Build Failed
adjagu opened this issue over 2 years ago

Respect <meta name="robots" content="noindex">
daoudclarke opened this issue over 2 years ago

Add an option to pause crawling
fawaf opened this issue over 2 years ago

Doesn't crawl at all
g00g1 opened this issue over 2 years ago

Detect status correctly
daoudclarke opened this pull request over 2 years ago

Add timeout
daoudclarke opened this pull request over 2 years ago

Bump version to 0.4
daoudclarke opened this pull request over 2 years ago

Prevent loading big pages
daoudclarke opened this pull request almost 3 years ago

Don't try and crawl really large pages
daoudclarke opened this issue almost 3 years ago

Crawl one page at a time
daoudclarke opened this issue almost 3 years ago

Added popup to log crawler URLs
ColinEspinas opened this pull request almost 3 years ago

Don't send cookies
daoudclarke opened this pull request almost 3 years ago

Don't try and crawl if we're not online
daoudclarke opened this pull request almost 3 years ago

Remember visited links so we don't visit them multiple times
daoudclarke opened this pull request almost 3 years ago

Check the number of unique domains to prevent falling into loops
daoudclarke opened this pull request almost 3 years ago

Crawl more root domains
daoudclarke opened this pull request about 3 years ago

Adapt for firefox
daoudclarke opened this pull request about 3 years ago

Handle exceptions
daoudclarke opened this pull request about 3 years ago

Browser Compatibility
ColinEspinas opened this issue about 3 years ago

Implement crawl
daoudclarke opened this pull request about 3 years ago

Justext
daoudclarke opened this pull request about 3 years ago

Changed dev script to build with watch mode
ColinEspinas opened this pull request about 3 years ago

Add automated workflows to build and release
ColinEspinas opened this issue about 3 years ago

Respect robots.txt
daoudclarke opened this pull request about 3 years ago

Retrieve pages
daoudclarke opened this pull request about 3 years ago

Run a crawl iteration once every second
daoudclarke opened this pull request about 3 years ago

Add dev mode for panel and options
ColinEspinas opened this issue about 3 years ago