Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ooni/pipeline

OONI data processing pipeline
https://github.com/ooni/pipeline

shovel:0.0.12 - skip LZ4 frames while skipping duplicates

4b6762871cc2e4aa2126cf9987b31ecd10b448e5 authored over 7 years ago
shovel:0.0.11 - skip cross-bucket duplicates

830ab169dd20f8b0725893b069bad96c21cc2477 authored over 7 years ago
shovel:0.0.10 - capture one more postgres exception during COPY

6e8462fdeb1543c958b5e245f4d99b3d61a25e1a authored over 7 years ago
shovel:0.0.9 - fix missing encoding header for check_sanitised.py

bc2936db3a7f61da95cab1190bd04ada60fe5185 authored over 7 years ago
shovel:0.0.8 - bad data rows are not fatal anymore during COPY, HTTP headers follow unicode "taste" of postgres

0602fe11640947b3892997c78139792cd2a38da3 authored over 7 years ago
centrifugation: pass HTTP headers through pg_strquote to strip zero bytes

ab977d1956bddfc826c28ae0c0971a9ffed018d3 authored over 7 years ago
check_sanitised: fix sanitised_check pipeline task broken since 45d12b1a, thanks pylint

fd66ecec9e80cde43fd5f280b7aa16ae1c391d0b authored over 7 years ago
centrifugation: make bad rows during COPY non-fatal

abac7a37b8618b26c82228c8915f548ccea2e02d authored over 7 years ago
centrifugation: use ' instead of " where possible

c53699f14443f22ea12154013a42b7a4e02f6db4 authored over 7 years ago
Add indexes for TheTorProject/ooni-measurements#22

64aa029151daa74b0aedbc8dd8a807a735b26e0d authored over 7 years ago
fix typo

fec53deb5f6b14a5dc6ca02875ab3034e07cb87a authored over 7 years ago
shovel:0.0.7 - extract known fingerprints, fix HTTP code, DNS TTL

3706643ce29be6c9d18dfdbe52d17fb57154530f authored over 7 years ago
centrifugation: add `msm_no` indexes to support partial updates, fix NOT NULLs

8088298607ad5150171615958f8b5c56733bbf64 authored over 7 years ago
Merge branch 'master' into pipeline-16.10

Merging to get new blockpage signatures into the branch.

72d2e2aef14a258f12aff8222f31003e0773a555 authored over 7 years ago
Merge pull request #58 from TheTorProject/fingerprint/gf

Add French Guiana block page fingerprint

8c1a372126a350677f98ae1e3ad09e6ea2f22146 authored over 7 years ago
Merge branch 'master' into fingerprint/gf

74739f0dd3207ddd6684f06637b7fe3b1b6fee0c authored over 7 years ago
Merge pull request #67 from TheTorProject/fingerprint/ro

Add blockpage body fingerprint for Romania

9df2b1001e12c181a2a89bebd80c8f3c45c4807c authored over 7 years ago
Merge branch 'master' into fingerprint/ro

0950d24b5f4d9e47b1d4bf05b721ea46759adf58 authored over 7 years ago
Merge pull request #68 from TheTorProject/fingerprint/my

Add blockpage body fingerprint for Malaysia

cc886aef29ec028898c3f4e16a25d34ddca22e33 authored over 7 years ago
Merge pull request #69 from TheTorProject/fingerprint/ae

Add blockpage header fingerprint for United Arab Emirates

43db476b23afee6cce7352fa19bc4afec851834d authored over 7 years ago
shovel:0.0.6 - fix centrifugation postgres COPY wbuf

4a8471e45d132b3c7fa170198314f86dc69362d1 authored over 7 years ago
shovel:0.0.5 - refactor centrifugation to extract more features and be more robust against bad data

45d12b1a524874658f7a0a41cb135eb2c6a72571 authored over 7 years ago
centrifugation: add unique constraint to `input` table & add comments

85ffd20091102a74fc5b3a886c8ecfd841c956e8 authored over 7 years ago
Add blockpage header fingerprint for United Arab Emirates

9a72b16c5c3f9049ca56a99c8b3c3ce2a1d75f2f authored over 7 years ago
Add blockpage body fingerprint for Malaysia

48a10f7b9a383ae203ee30c0b9bb7ff0d7e5854e authored over 7 years ago
Add blockpage body fingerprint for Romania

fdc056adaf231fb94e1b4cdcc1f5679841499aca authored over 7 years ago
Add French Guiana block page fingerprint

0ed504f71f283a5b18a29f7120ba4f546244877d authored almost 8 years ago
shovel:0.0.4 - log okayish diff between sanitised and autoclaved for future reference

dcd5b9e37031fddd6d85a134fd3ac78b32c92041 authored almost 8 years ago
shovel:0.0.3 - checked cleanup of sanitised and reports-raw

881dd7c9c5177ffb4a7178e84fab54e748b6fe04 authored almost 8 years ago
Mark openobservatory/shovel:0.0.2

fd8cc29ebc5644b1c04b82ceda9f1bb4ee970f27 authored almost 8 years ago
airflow-dags and Dockerfile for airflow, airflow-worker migrated to TheTorProject/ooni-sysadmin#105

4728cb2c84957a3dcc2ebdceadeb753223b88af0 authored almost 8 years ago
centrifugation: batching does NOT reduce number of Queue.get lines in pyflame

ae3d4557aa98955dc6af907da433f3c9d95728e4 authored almost 8 years ago
pipeline-16.10 doc: some highlights

82ad892e00b18d42f9d4e79ccba69276b4d59496 authored almost 8 years ago
Pin depesz/Versioning that is used to bootstrap schema versioning

2a71fdaccb28b9730c5fbdf31c46af6767e5c4db authored almost 8 years ago
originas2pg: tool to load snapshot of originas.bz2 to postgres

The table does not store whole BGP stream of migrations of IP subnets
between ASes across time, ...

a702f728d45a03aeb374badf0522dc1d077fc98b authored almost 8 years ago
centrifugation: extract more metadata from autoclaved reports

- TCP failures
- A records for DNS
- verdicts for HTTP requests

9e37fa81f632e57baea746913cabc3f779e6e731 authored almost 8 years ago
canning: comment on ARG_MAX

8176dc3df71819ac115a9662169df78857a6780a authored almost 8 years ago
centrifugation: load indexing metadata to postgres

fe345a6383f80f175084bdb221dff2cd013290c0 authored almost 8 years ago
pipeline-16.10 doc: some gotchas

2a3da7c0ed4f3f09cc990bfb176f6a8c93626fca authored almost 8 years ago
centrifugation: converting http bodies to simhash (text dump for a while)

80e4602aaacb3406b9f46465521b681c792d8c41 authored almost 8 years ago
autoclaving: fix bridge_db loading to autoclave `tcp_connect` & `bridge_reachability`

a484ee834414228130bba2a72d31b5116c12fcd5 authored almost 8 years ago
autoclaving: slicing files into independent reports

4ddc40b6ab5eafc5759eeaf4f23dcf05bdbf6e65 authored almost 8 years ago
Support Reykjavík(sic!), great source of privacy geeks and TZ=UTC datetimes

8490691e8eb3e3fdda5fa17e48b0dcd6480da741 authored almost 8 years ago
Script to cleanup private/reports-raw (checking against S3 backup) after canning

9b488b513242732dbfe1c5fe2a36973e613f90a4 authored almost 8 years ago
shovel: strip python eggs

ce95c4f6198fb5252ff52f1db30cb609085eceb7 authored almost 8 years ago
Canning, pt.2

3222d571095efd0472119d896dac639215eb7919 authored almost 8 years ago
Some mu-benchmarks to justify questionable lines of code

cc10f7ef6de3f92ccc7280961963ee440af207c1 authored almost 8 years ago
Merge pull request #52 from TheTorProject/rebuild-notes

Add pipeline rebuild notes, updated client.cfg, add requirement

c6faeeaa8fede9fb2c075d40fe5002998797f348 authored almost 8 years ago
Add pipeline rebuild notes, updated client.cfg, add requirement

bd9adf3b3a3828bc55c018f5e47be549635e964f authored almost 8 years ago
Merge pull request #43 from TheTorProject/update-concurrently

Refresh materialized views concurrently

0a62bd4f2a17dc470a3974ec8b500877f63f684d authored almost 8 years ago
Merge pull request #51 from TheTorProject/fix/id-generation

Always set the id in the pipeline

51fdd878a7b3ad18dc54e43a5fe807820ae6899f authored almost 8 years ago
Always set the id in the pipeline

6e3bd1df1275f84f2d5048d0cba9890f50ad9610 authored almost 8 years ago
airflow: step #1 (raw -> canned)

Compression ratio ~21% & some hard-coded stuff.

064e9a053977f7ef2aafb4cba11f302de386692b authored almost 8 years ago
Merge pull request #44 from VEinteligente/temp-pull-req

Add psycopg2 to requirements.txt #1

e57c5c2d137ef00b6772e167d49a5f93dbc7fd26 authored about 8 years ago
Update pipeline-16.10 doc

ae941acf37da075ed587d5741fd5664dfdf1de43 authored about 8 years ago
Add psycopg2 to requirements.txt #1

cc5b8116dab4bfb6c48d506c77590290a4e5a49f authored over 8 years ago
Refresh materialized views concurrently

Prior to this materialized view updates lead to the database tables
related to these views will ...

32ae1b346b6bc485418a9b1857bffdb427df2e82 authored over 8 years ago
Merge pull request #42 from TheTorProject/optimize/indexes

Create a multicolumn index on test_start_time and probe_cc

5ec56d9d6960c906833842100d32be6667e12750 authored over 8 years ago
Merge pull request #39 from TheTorProject/fingerprint/ru

Add Russia blockpage header fingerprint

76135154f3979b58949f5201bb21fc509426541f authored over 8 years ago
Create a multicolumn index on test_start_time and probe_cc

This leads to a significant performance boost to queries on countries
with few measurements (que...

f926cc873823639037ec4d3be8fc8f6a490f44c3 authored over 8 years ago
Merge pull request #41 from TheTorProject/feature/pipeline-16.10-ideas

Add some comments on next generation of ooni-pipeline-ng

391c62fe7b933e75b9af0b50fe2897fb1899f324 authored over 8 years ago
Add some comments on next generation of ooni-pipeline-ng

See also TheTorProject/ooni-pipeline#32

6797749c20f6a77f60b6c8edf7aab61c277a761e authored over 8 years ago
Merge pull request #38 from TheTorProject/feature/pipeline-architecture

Update ooni-pipeline-architecture

cb384b2ff1e38d92f9884713c8fc440546d6828a authored over 8 years ago
Add Russia blockpage header fingerprint

48dd0f5a18722955ffb8f1bc9fae29face3993b1 authored over 8 years ago
Add architecture document explaining the pipeline architecture from a high level

f6b9b163ef633405ea3c760e9bdcf677bd2ca337 authored over 8 years ago
Update ooni-pipeline-architecture chart

21c2923c12fd1db79dfec000fbf2d05130bafc9f authored over 8 years ago
Merge pull request #37 from TheTorProject/new-korea-fp

Add new blockpage for Korea

421414d29b4e862e23657b269083eb92b0d5aa01 authored over 8 years ago
Add new blockpage for Korea

1333c5c2b2c503aa732342a71f4c970a9f1f90c8 authored over 8 years ago
Merge pull request #24 from TheTorProject/feature/blockpage-fingerprints

Add fingerprints for Italy, Cyprus, Denmark, Portugal, Norway, UK and France

f603499a46ec2afae1b917d677f04857a19abedc authored over 8 years ago
Add French block page fingerprint

3ae8f7b3d548c3665646d56d43f4ad3afc3a8b7f authored over 8 years ago
Merge branch 'fix/ignore-asn'

* fix/ignore-asn:
  Add support for ignoring ASN

7ba9ce8376db353d6325f8035a1533ecf4410f19 authored over 8 years ago
Fix parsing of time stamps

5a9e8bbb043e05ee8a5bc91174956bbf00b15b79 authored over 8 years ago
Add a more lax regexp for Turkish blockpage detection

3506bc474ef7dc77290384c3cf2f618a1cc15fba authored over 8 years ago
Merge pull request #30 from TheTorProject/feature/web-connectivity

Add support for processing web_connectivity measurements

f909e72576226c3471bffc5c104766337f60a38c authored over 8 years ago
Add support for processing web_connectivity measurements

this is very ghetto...

505f7cf2f970bb96c20c71bba039ab4c42633030 authored over 8 years ago
Add UK block page fingerprint

9eefdc20a7e339328b37d8e3eb073dfa421a8709 authored over 8 years ago
Add support for ignoring ASN

3ad2bb4551070fcfceed60d4b8ea06a93ed99d9c authored over 8 years ago
Add fingerprint for Norway

065ed445847250d3a552d1809846e5391badbefc authored over 8 years ago
Add fingerprints for Italy, Cyprus, Denmark and Portugal

8fa0e0b93298db269c6fb9008d4aa783107d4122 authored over 8 years ago
Merge pull request #23 from TheTorProject/fingerprint/be

Fingerprint/be

fdf659c7f294f2f2e9f60cd9a5bcb25e64fd02b1 authored over 8 years ago
Add new WrapperTask to update the fingerprints for the materialised views

* Force all RunQuery tasks to update every time they are run by setting
task_id to contain a...

85e9ff0bc6fbd7853d67b05b3ccf189a7258d122 authored over 8 years ago
Add fingerprint for Belgium

ee2001ff8e0402c41612efcf97fa58ea9fe22420 authored over 8 years ago
[hotfix] handle new and old formats for test_start_time and measurement_start_time

e16aa7c9ee35d36dbec1ed218ab16b36efdaf7cb authored over 8 years ago
Merge pull request #22 from TheTorProject/fingerprint/qa-kr

Fingerprint/qa kr

cf3f765086016ed8b6f9e92c2e8dab566f226343 authored over 8 years ago
Merge pull request #19 from TheTorProject/feature/df-0.2.0-yaml

Handle also the case where a report is submitted using YAML, but is o…

d5224a645da4ca0450248af05c626f4403a4ecce authored over 8 years ago
Add fingerprint for location redirect in Korea

5093ef32ce2f57c3fae7d7da7f2eda787bfe4549 authored over 8 years ago
Add blockpage fingerprint for QA

26b7368f1bfe69991edc903683e9f36d83edefc0 authored over 8 years ago
Also handle the case where a report is submitted using YAML but is of the new data format.

2f0f561e578c5a1d1a5c41e592e1b2ee29e6b3f1 authored almost 9 years ago
[hotfix] remove addition of sent and received keys

f9b3360161988c1eb55d71d1e50cd934021a4936 authored almost 9 years ago
Merge pull request #18 from TheTorProject/feature/ignore_cc

Add support for ignoring certain country codes when processing reports

dfad674bfb9e3009b5d7e460e5bf4073baa6778e authored almost 9 years ago
Fix bug in setting test_start_time

a669c538a8407e9a4585a5483a94bbca72e5ac73 authored almost 9 years ago
Add support for ignoring certain country codes when processing reports

4b4316941d4be94c24bb65d83cc4719e6bbea29f authored almost 9 years ago
Fix nesting of measurement_start_time

4674b0ecc946b33600608fd57a13ea501bf1678d authored almost 9 years ago
Cast probe_city to string

a08539cd5a9a53ca398b9a1c99637c9449415de5 authored almost 9 years ago
Deterministically generate the report_id

28746fe0e09ff11d0e53313457ac43e41e3d371f authored almost 9 years ago
Fall back to using test_start_time

34fdeacf297c17c8111a94d537af4856e360cdac authored almost 9 years ago
Add measurement_start_time to represent the old value of test_start_time

4a67c63cc4afbad2b485dca9cbb67683369659d1 authored almost 9 years ago
Add heuristic for detecting blockpage in Sudan

160f65d2acd382be95f2e1a96df0db1c73d127c6 authored almost 9 years ago
Merge pull request #17 from TheTorProject/feature/df-0.2

Skip normalisation when the data format version is 0.2.0

1ec4bb7aa7e4b798910646589a5c34b1c30e90c1 authored almost 9 years ago
Merge pull request #16 from TheTorProject/feature/india-blockpage

Add support for detecting blockpages in India

696f40a3ee76906c3b33c5d17e9b5e6bbdeb2cbb authored almost 9 years ago