Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/mwmbl/crawler-extension
A browser extension that can be installed by volunteers to participate in mwmbl distributed crawling.
https://github.com/mwmbl/crawler-extension
Sleep when no urls in batch
daoudclarke opened this pull request 6 months ago
Adding a new URL to a query on mwmbl.org with "Search Google" enabled redirects back to the homepage
anijatsu opened this issue 11 months ago
Add Mirror to Codeberg workflow
diagonalo opened this pull request about 1 year ago
Bug fixes for curation
daoudclarke opened this pull request about 1 year ago
Query extra search engine
daoudclarke opened this pull request about 1 year ago
[Feature] Crawl page on demand
EchedeyLR opened this issue over 1 year ago
[Feature] Adjustable requests limit per website
EchedeyLR opened this issue over 1 year ago
Tag released versions on Git
omasanori opened this issue about 2 years ago
Replace the generic extension icon with official branding one on AMO
omasanori opened this issue about 2 years ago
Remove some logging
daoudclarke opened this pull request about 2 years ago
Improve crawler prioritisation
daoudclarke opened this issue about 2 years ago
Specify the Accept-Language field to fetch requests
omasanori opened this pull request about 2 years ago
add websites to the crawler
nobaraos12 opened this issue about 2 years ago
Add stat ticker for pages uploaded
nobaraos12 opened this issue about 2 years ago
add support for legacy versions of firefox
nobaraos12 opened this issue about 2 years ago
Specify the desired language in our requests
daoudclarke opened this issue over 2 years ago
Internationalization
adjagu opened this pull request over 2 years ago
Send all links
daoudclarke opened this pull request over 2 years ago
Update README.md
adjagu opened this pull request over 2 years ago
Build Failed
adjagu opened this issue over 2 years ago
Respect <meta name="robots" content="noindex">
daoudclarke opened this issue over 2 years ago
Add an option to pause crawling
fawaf opened this issue over 2 years ago
Doesn't crawl at all
g00g1 opened this issue over 2 years ago
Detect status correctly
daoudclarke opened this pull request over 2 years ago
Add timeout
daoudclarke opened this pull request over 2 years ago
Bump version to 0.4
daoudclarke opened this pull request over 2 years ago
Prevent loading big pages
daoudclarke opened this pull request almost 3 years ago
Don't try and crawl really large pages
daoudclarke opened this issue almost 3 years ago
Crawl one page at a time
daoudclarke opened this issue almost 3 years ago
Added popup to log crawler URLs
ColinEspinas opened this pull request almost 3 years ago
Don't send cookies
daoudclarke opened this pull request almost 3 years ago
Don't try and crawl if we're not online
daoudclarke opened this pull request almost 3 years ago
Remember visited links so we don't visit them multiple times
daoudclarke opened this pull request almost 3 years ago
Check the number of unique domains to prevent falling into loops
daoudclarke opened this pull request almost 3 years ago
Crawl more root domains
daoudclarke opened this pull request about 3 years ago
Adapt for firefox
daoudclarke opened this pull request about 3 years ago
Handle exceptions
daoudclarke opened this pull request about 3 years ago
Browser Compatibility
ColinEspinas opened this issue about 3 years ago
Implement crawl
daoudclarke opened this pull request about 3 years ago
Justext
daoudclarke opened this pull request about 3 years ago
Changed dev script to build with watch mode
ColinEspinas opened this pull request about 3 years ago
Add automated workflows to build and release
ColinEspinas opened this issue about 3 years ago
Respect robots.txt
daoudclarke opened this pull request about 3 years ago
Retrieve pages
daoudclarke opened this pull request about 3 years ago
Run a crawl iteration once every second
daoudclarke opened this pull request about 3 years ago
Add dev mode for panel and options
ColinEspinas opened this issue about 3 years ago