Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ooni/pipeline

OONI data processing pipeline
https://github.com/ooni/pipeline

Merge pull request #15 from TheTorProject/feature/domain_intelligence

Feature/domain intelligence

b92ff0e4eb622a0f02826dbe8522c2cd585b7318 authored almost 9 years ago
Skip normalisation when the data format version is 0.2.0

4a2d4d00b9342cbb495d3408d797bcd61a5066b3 authored almost 9 years ago
Add support for detecting blockpages in India

f8fa8a9b4fa9c0d463ad8d2f65dc9fe1a333f171 authored almost 9 years ago
Merge pull request #14 from TheTorProject/feature/fix-materialised-view-update

Move update views into separate tasks

ae67141cd50f47043c350b1e2a1ab3de149c0c9e authored almost 9 years ago
Set source_updated to 1970-01-01 when it's not set

19dc1b257f9895c7df1ab6e2b7281d4a78aeaf46 authored almost 9 years ago
Add documentation for domain_intelligence

* Add BeautifulSoup4 as a dependency

ae791da8516b5e36fdd3abc0980345c3c4970bd0 authored almost 9 years ago
Implement tasks for loading ASN Information into DB

* Remove unused tasks
* Use logging for writing log files

b577d08c6270c2e71d7c7ddfbcacaeb25d53c5b1 authored almost 9 years ago
Place the update views in a wrapper task

355ac1780f1f05eefb9ea3bf5b5c0148904e888c authored almost 9 years ago
Move update views into separate tasks

a4adc0dbfbcdb5d7c7a031e9138f97d120e36a6b authored almost 9 years ago
Merge pull request #12 from TheTorProject/fix/test_names

Add mappings for censorship circumvention tool tests

e77103b79d1a3a14b7909d03b0329837867196c8 authored almost 9 years ago
Add mappings for censorship circumvention tool tests

1ea5fd209073c529827b5b5970fe3963474fe5c9 authored almost 9 years ago
Implement tasks to add CitizenLab URLs to the database.

Provide Alexa ranking, but disable Google results (they block us after
very little time)

c1bc248d9863ac61b261d35989d8b9ee0808a9e9 authored almost 9 years ago
Start outlining task for gathering domain intelligence information

* https://github.com/TheTorProject/ooni-explorer/issues/32

99c2caedacf992147ffd0553deded2609bb657d5 authored almost 9 years ago
WIP on domain intelligence

5f85c4b4cbf43fcfa13827e3c7b476c88e78f5eb authored almost 9 years ago
Merge pull request #11 from TheTorProject/feature/refactor-luigi

Fix terrible bug in body normalisation

c6381160da69b40069e5d319e8fcc5449d3db49b authored almost 9 years ago
Fix terrible bug in body normalisation

ee455668f02de9a292464e2e9b4839905e7c02d7 authored almost 9 years ago
Merge pull request #10 from TheTorProject/feature/refactor-luigi

Feature/refactor luigi

66713bdd9552e9826420e303eb7cd2d664c18176 authored almost 9 years ago
Add metrics_table to where filters

7879c1d51e15ff3c9133848350de8ca20a804714 authored almost 9 years ago
Update_id is not callable

c0c5c5129fdc6987a2c079e9bfec6536e050dc43 authored almost 9 years ago
Define the update_id

6d9fe5d8afc19c7cde59b6ca99914017793925d3 authored almost 9 years ago
Compress should depend on Sanitise not Normalise only

b98e43e158c48bcde2f20e69e3deb7ecac5915a1 authored almost 9 years ago
Merge pull request #9 from TheTorProject/feature/refactor-luigi

Feature/refactor luigi

98ce526b9775473c2e643ea56ae6b005b3eb1b99 authored almost 9 years ago
Query object only exists in master branch of luigi

b3fd6a6a23e35d3a3ac68f0af7ea4877b2c08eb8 authored almost 9 years ago
Remove trailing s

79d92894b88d2bf1721a7ad80240dad0aa599702 authored almost 9 years ago
Add task to compress reports

20d72b74afb7b525533b1b3d3fd272cef98dff3c authored almost 9 years ago
Add tasks related to creating and updating the materialised views

2bac7a3e8b9c90387664a2abff23a646d219e7bd authored almost 9 years ago
Update requirements.txt

* Delete templates

846e9c6837f1b8d9ea9df7bcb421e5f26938ba12 authored almost 9 years ago
Update README with the most up to date version of the data pipeline.

* Delete unused code.

* Add logging configuration file.

af648a8317816ecbd2e549654ff9d7518069e6c9 authored almost 9 years ago
Delete more dead code

743681310f75e62486a604b0ba40d9037b8e243e authored almost 9 years ago
Delete all dead code

b95178b6aa668d0c19da81695097fbef27d07cc7 authored almost 9 years ago
Fix sanitisation of input key

af54a0a21c26caf41e54917e7ea46d661927e214 authored about 9 years ago
Also hash the fingerprints of bridges not in bridge_db

96bb024477181766156b68462987b7b0935dd28e authored about 9 years ago
Set the report_id when it is not set.

35886543d7ee2f838989859bd791020b5f66cfd5 authored about 9 years ago
Normalise issuer and subject

a90a28debed5c2f594c3e8a07be5255030087eff authored about 9 years ago
Make getting the serial number attribute lazy

13269664aa1347c97f7c02bcd25873617d8ac446 authored about 9 years ago
Encode serial number with hex

b6971761f01e32e6413f0d22e6f1aaf121b0e00a authored about 9 years ago
Encode serial number as hex

6c0c836c54f247a9d494b8b2a15f855149275dac authored about 9 years ago
Normalise double nested headers in legacy reports

193827a85a76cdecc62957b1d0a63ca45f241c4d authored about 9 years ago
Extra subargs to normalise the options of legacy tests

2210c544a86029ed5e3e53d209de4514781a9eca authored about 9 years ago
Make the handling of captive portal and tls_handshake tests more robust

2a418d64e232b7a0115a9538943ee22519c2d5d0 authored about 9 years ago
Sanitise tls_handshake test

c7bc01ee2830b50f88a11340bf6071bf8a55b6e8 authored about 9 years ago
Add tor_http_requests_test to httpt tests

58c27da009ba4840279da56887f6b8ec90c99df2 authored about 9 years ago
Add support for fixing tor_http_requests test

e3c2ee411593adefbaf4e3a748e9b4fa76073ff9 authored about 9 years ago
Encode unicode data as UTF-8

1a100678fea5d6ce68ec84b2a951b6c0f1369253 authored about 9 years ago
tabularize

7b32f5c5f8b32c689f63d815efc50b95a599d324 authored about 9 years ago
if regex not found, empty string

f66d120dfab8108c48c786095776662aa32e6d10 authored about 9 years ago
log when trying fallback yaml loader

4dd67bb8710b42521e166ade0f18eabcd3a1049d authored about 9 years ago
squash another runtime exception

3e55b7212ed7c723372d6ec03e33305b0605b3c2 authored about 9 years ago
generator excretes NoneTypes...

5b3539004e356c2e5f7c3d23dfabe73579ee7e32 authored about 9 years ago
r2's yaml hack

1503ef7fa90b3b0a2765058e3056d99551e36754 authored about 9 years ago
revert 2de23459c

63e4fcb67766b4e664c76a84993d949fd70cb3b0 authored about 9 years ago
Normalise the sets in the captive portal test

3cfdeb86d0eb5e914c446a9545b79d4ad9bd46bb authored about 9 years ago
catch exceptions while scanning yaml

2de23459c5a1e2ec1a8eac5a401917f137e5e977 authored about 9 years ago
Fix normalisation of dns_consistency tests

7aaa07d0c630d7b2b52effbebf27de7966cb8436 authored about 9 years ago
Remove start_time key

beb64d6db5fb0d4b46b4bd22d13e266b5022a520 authored about 9 years ago
Set exit_ip and exit_name to null by default

236d218f17366bf06923cdbfe0c79dc3859ac31e authored about 9 years ago
Add input_hashes to the base schema

d6d7fbb760e83fb0826be2af8871955b0079eaf3 authored about 9 years ago
Add probe_city and backend_version to the base schema

1d5f0f05107290af24f2c76ef195e6c4bcfa9540 authored about 9 years ago
Perform better normalisation of MX and SOA records

* Do not fail when the tampering key is not present

deb44e709e0cc427c4d974c80b00ba5a2f074aea authored about 9 years ago
re-arrange tor_{exit,name} finding, exception stuff

68ef01d1cecb5124f8e20d7e75edbea2f28079dd authored about 9 years ago
http test options

c036d03ef10dffc15826628c21b9b5bd5b341d35 authored about 9 years ago
daily_workflow tweaks

27ac98c959a6f7d08abaa4ddcfc93a356bb87e2c authored about 9 years ago
multi_protocol_traceroute normalisation tweaks

aa44b6012dcdb42116af345855496a18fde55af0 authored about 9 years ago
raise exception when encountering unknown extension

151d6067f65f12335e6d65ac70d3ed4a5176b3a2 authored about 9 years ago
dns_consistency normalisation tweaks

2f1766922d60a7dd993e4316fea7fdeab71f31fb authored about 9 years ago
fix regex in dns_consistency normalization

445d51f3fee326f2ff24cbf6a0b97fea9985f9bd authored about 9 years ago
Do normalisation of the Headers as well

fbf247209a9d6200ae934413c225e0fad789e944 authored about 9 years ago
Don't swallow exceptions but re-raise them

087b1d2d25b9886f4959c0defe6aa9d82328f858 authored about 9 years ago
Fix bug in setting is_tor variable

b21c43535158b9bc1fda04b62b3be24531488086 authored about 9 years ago
Headers_diff can also be none

4a12fcb304c7aeb73eef06287eeb7fbafd0ea30a authored about 9 years ago
Convert set to list and fix tor entry

77d7b17aad7c804d819024fa8f19b32435994be5 authored about 9 years ago
Remember to write newlines

7af3f0ad92f50aaa070488d070473292896f354a authored about 9 years ago
Only cast to unicode when needed

6af36f4bf0973ec07cba039213a8b798bb2f56fc authored about 9 years ago
Integrate @TylerJFisher's DNS normalisation code into the daily workflow

72aa0e00f8ac5bfb9f92cb7db7f94c978b35d011 authored about 9 years ago
Improvements to body sanitisation function

* Fix bug in bridge_reachability sanitisation

79e67dcf650a1f6e6ce8033a623cde0696a89486 authored about 9 years ago
Add example client.cfg file

* Progress in implementing luigi based daily workflow

63aea9c31261ee6e8a3515cf6241f612e69e46b9 authored about 9 years ago
Delete all dead code

78052bd210f378be745951ba2151779a2317bfaa authored about 9 years ago
Implement the daily workflow all in one luigi workflow

32037c8b76f87d29d16bff7b33e812bf0eb9ed9f authored about 9 years ago
Don't base64 encode the url

f5bfa10151bdfd6142cc7937c0fa29a3d52061a1 authored about 9 years ago
Also print out the failed entry

6758b1581c7d123bf9f26b7514896cd9aff8cb33 authored about 9 years ago
Skip over invalid reports

d90f263846216f7a38a5ba2e853d127de9d223fe authored about 9 years ago
Skip over entries that are not dictionaries

07e51da3a86375110ec7e65a9566c4eb6050da03 authored about 9 years ago
Implement performance boost by using CLoader

40c484bb7ed445d0151df9d108602b82ddaa2fec authored about 9 years ago
Fix bug in sanitise of scapy tests

c09abb8c56540b4d6afb8c1c3274c0a17e7fed90 authored about 9 years ago
Use the pgcrypto function since Postgres 9.4 is a requirement anyways (JSONB)

1cd35bc226bac318482702060466d2e5201f91b2 authored about 9 years ago
Move the unicode decode catch down

9f484117a17dbd15ee0f1e24e81060ef1e2f5e68 authored about 9 years ago
Implement base64 encoding of body

* Add bucket date to DB

1f2bbdb9638164ab9305045ee5467236b348177c authored about 9 years ago
Only strip null characters from response bodies.

* Properly remove the body of the response when failed to insert

b04bc6815ba299221267627c437631cfca6b90da authored about 9 years ago
Also create indexes for test_start_time and refresh the new materialised views

af70f30d7b35f4a7c50a969f245e29b4256f2473 authored about 9 years ago
Move the creation of indexes out of the Streams class

a5e0ef2fa7d912f794de0095092f2a90b8b94506 authored about 9 years ago
Implement hack to avoid decoding/encoding errors in URLs

6e5997c128e2bb50d35cec47e50fc9bc4b62f6a1 authored about 9 years ago
Fix typo

abeac9e37574cbca05eeb5ca5944d607c1f868b3 authored about 9 years ago
Add an empty response when it's missing

a462dcffb8ee5207a83617c5c9919ded898d6995 authored about 9 years ago
Fixing of body and headers

be80ba44fc13dea16cb1d3d73df5e88c737b131a authored about 9 years ago
Fix tor detection logic

59c00efba943ac4e015e3587bd101035f4fd0b09 authored about 9 years ago
Fix request key nesting

* Specify default values for bins_to_sanitised_streams task

da13382a7bf23f32c2fc3241e71f6d5ce01c9df3 authored about 9 years ago
Fix logic of detection if the request is done over Tor or not

1fa21a548c4baa035bc2c53dd75c91b4e0b19086 authored about 9 years ago
Headers should be inside of a dictionary instead of a list

78110213f8ebc795fdb3f6bd162a9ebff6cf905a authored about 9 years ago
Un-ravel the headers to not be a list within a list

96f2d106d62ed436539b4a2e8785c0aebd27afb1 authored about 9 years ago
Fix some bugs in the stream to DB logic

* Fix bugs in the sanitisation procedure

e866cb356a1b8abd320a2f85082dc3a17087d9cd authored about 9 years ago