Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ooni/2022-04-websteps-illustrated

websteps: winter 2022 edition
https://github.com/ooni/2022-04-websteps-illustrated

feat(websteps): introduce a steps cache

This cache is only valid within a set of steps. For now, we're using
it mainly to ensure we don'...

d02ab70cfa50d4fe64d591861ac311408cd33415 authored almost 3 years ago by Simone Basso <[email protected]>
refactor: improve URLAddress algorithms

1. introduce a Clone operation and use it rather than rolling
out a manual copy of the structure...

9e22321873741f2e730246c5f4a39e93039d244a authored almost 3 years ago by Simone Basso <[email protected]>
feat: use non-connected socket for UDP resolver

This diff modifies the UDP resolver inside netxlite to use a
non-connected socket. We are doing ...

c2f7ccab0ec971d5c084ea4c571b76f7530b28ee authored almost 3 years ago by Simone Basso <[email protected]>
fix: recognize dns_servfail_error

This error occurred for example when querying kazemjalali.com
in websteps measurements run from ...

4269e82fbda40a7c35c1ebdc212d12f4c5053bd9 authored almost 3 years ago by Simone Basso <[email protected]>
fix(dbsteps): print the number of loaded measurements

86a89a3ef93e84cd708e8b2156428ecdb380868c authored almost 3 years ago by Simone Basso <[email protected]>
chore: commit vscode settings

7c062f1e246e7299ee0fd697b57ca23679e07fe1 authored almost 3 years ago by Simone Basso <[email protected]>
fix(netxlite): HTTPSSvc: better no_answer checks

I've seen some measurements returning some IP addresses for HTTPSSvc
queries but not returning a...

57a023bcf4ebb1dd9dbdac83c18dc53a165011f8 authored almost 3 years ago by Simone Basso <[email protected]>
fix(dbsteps): improve decoding of HTTP round trips

668e1acd5e03cb6400efbd660a9fa51962066574 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): decode HTTP round trips

We can possibly do better in this department in the future by
implementing a full decode of any ...

0455b6534fbfe5248d3dba665fab7bcfbc36d1b4 authored almost 3 years ago by Simone Basso <[email protected]>
fix(measurex): use singular for archival request

There is just one, therefore using the plural here is grossly
misleading and a source of big confu...

5a77b1a5cb5c6e145389a7be8de9344afe311056 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): allow sampling tag output

This is becoming increasingly necessary given that we're now
basically dealing sometimes with hu...

ce00c241582fb105297d4e41f4050380d61fa983 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): decode DNS round trips

b648c95212918f540f5e659cb99f52e7200bb0e2 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): split `c` in `ci` and `cu`

0306e60458ffa76e4a7431a7d1f6987ec20b862a authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): correctly handle dns_no_answer

de7046284192e2440dacd379beb6728cb1881359 authored almost 3 years ago by Simone Basso <[email protected]>
fix(dnsping): use different socket for each ping

We've experimentally determined that, in some networks, after you issue
a blocked query, the soc...

17fe74efcfbda603964e3c9eca221c58269a6f21 authored almost 3 years ago by Simone Basso <[email protected]>
fix(dbsteps): avoid division by zero

This condition occurs when you're analysing a measurement that
doesn't produce any blocking tag.

61125f0557d3fa8146f1d2dd4d63d883100e0221 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): fetch small body snapshot in fast mode

181dfde135d0a2e57574bd476db1e96cc46092dc authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): truly enforce max number of addrs per family

The key to doing that is always returning a stable list with the
addresses initially discovered by...

e7a3a2d3fb9e6c73b4b98761ca627c1e50b8ed60 authored almost 3 years ago by Simone Basso <[email protected]>
refactor: consolidate code for checking for IPv6

966e7f7cdde534dca8beaf54dda08746660cc324 authored almost 3 years ago by Simone Basso <[email protected]>
fix(measurex): off by one in max-crawler depth

65831b0627c6e06b4e7b321fb4bbf90ec9918074 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): force Mozilla CA pool

According to gorilla/websocket, and also according to reality, the
TLSClientConfig field is igno...

d1b22d9e7bce8aab53e2f81caa159c63c781112e authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): we must aggregate flags after reprocessing

83323bb090476e68df7a7b3d58c7ba326c8c01d3 authored almost 3 years ago by Simone Basso <[email protected]>
doc(dbsteps): document j to print whole JSON inside s environment

ac99bebfacb32c54225330eaaa5ec1a4830db4e1 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement search by tag name

d65befe20affe0a46d9940457abaef6d7bfe6cc2 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): show stats about the observed tags

dc2c3cde89ef344e556edf5af15711c616f76628 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): detect and handle gzipped jsonl files

490ed8450c408a547efe1583dac98e5cdbcd0210 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): add flag to perform a deep scan

0b4cbb1352c30d8b59927da1f8e4cf0f67fa034f authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): use the OONI backend by default

34daacc54b7e13b41c479e4740deaea625059006 authored almost 3 years ago by Simone Basso <[email protected]>
feat(thd): run as nobody if executed as root

Allows running on-the-fly experiments without running the
daemon with root privileges, which is r...

164af7e5c0488a976619e2720f52650b0a7205bd authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): strengthen HTTP heuristics

1. improve the documentation of the reprocessing of HTTP replies
with status code diff, explaini...

2a0e2136d0e729b87d628b4da8907f628f30a8bd authored almost 3 years ago by Simone Basso <[email protected]>
doc(README): document how to run on {0,1}.th.ooni.org

f54584e4b3ef3a172dec19b57effb1c7f65a2204 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement the / (= search) command

e38b97d1960bf1d18c8310352cb7bf57ada96b6f authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): load measurements dump from the TH

e2a41240405f752c7b047d203850b0f0a928fd49 authored almost 3 years ago by Simone Basso <[email protected]>
fix(thd): ensure we generate a JSONL file

07d4dfe395f8df71b41548e6db57a298a24bc787 authored almost 3 years ago by Simone Basso <[email protected]>
fix(dbsteps): print number of imported measurements

de951e975dab17ede12c664dbe1ae03454693d9a authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): save TH measurements

This is definitely going to help with hunting differences between
the client and the server in t...

c3f1cfca8b5d75f9c5f4b352619af0930ccd1969 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): th must forward user-agent

3933b2a2227256003120a700b10234cbb844f26d authored almost 3 years ago by Simone Basso <[email protected]>
refactor(measurex): log the operation ID

This makes reading websteps' output much easier.

While there, also make some websteps messages ...

b1e7a32a67d6decce75efa7af9a58c4f08f36779 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): avoid panic in hashing

I should probably submit a fix to the library author.

4b37cc34eda4be12373d1ea275a2319ee6225292 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): don't misdiagnose what seem to be transparent proxies

We're still able to get the final response, hence we're not going to
emit a flashing anomaly but...

b3df643332f2a66d0fe3fb5b5e8e7fe7b0869c66 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): emit bodies aggregated by their hash

b7df42a5d00e18528067562df0ae92ce509d9d8a authored almost 3 years ago by Simone Basso <[email protected]>
fix(archival): keep MaybeBinaryData as []byte as long as possible

This change avoids creating duplicate copies of bodies since we can
easily share the same byte a...

ac5b1f0be3238eb348ac49e650bff5a14a94eb96 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): serialize the body TLSH rather than the body

Here the idea is that we're going anyway to need to go through the
bodies and compare their TLSH...

0a75c2fd217187a22e3aea6c9229d7fb18396598 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): implement TLSH based body comparison

c2e973dd8f7486ef0a8644965f73b370e70f208c authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): introduce history courtesy of readline

d199ccb3e8542b5be07b506bfbe551475b51166d authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement ld<id> and le<id>

37d0a44457821dde1b3d44678a4a5ca9c33ad2b3 authored almost 3 years ago by Simone Basso <[email protected]>
fix(dbsteps): find_entry throws KeyError not IndexError

19bdb5bf16d8340d3b7834326aea8b9798a1544d authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): more easily print the whole measurement JSON

d7a96a287bf2fa097f5027c2afdf9d1021279fb0 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): move non blocking flags in reserved space

The idea is that we could check for blocking by using a simple
bitmask and all the flags with po...

444949bd1a133ef491442fc6b666509ab5775cd3 authored almost 3 years ago by Simone Basso <[email protected]>
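
A bitmask check of the kind mentioned above might look like the sketch below. The commit message is truncated, so the layout here is an assumption: the flag names, the bit positions, and the choice of the upper 32 bits as the reserved space are all hypothetical and are not taken from websteps.

```go
package main

import "fmt"

const (
	// Hypothetical blocking flags, kept in the low 32 bits.
	FlagDNSBlocking int64 = 1 << 0
	FlagTCPBlocking int64 = 1 << 1
	FlagTLSBlocking int64 = 1 << 2

	// Hypothetical non-blocking flag, kept in the reserved high bits.
	FlagInconclusive int64 = 1 << 32

	// BlockingMask selects only the (assumed) blocking region.
	BlockingMask int64 = (1 << 32) - 1
)

// isBlocking reports whether any flag in the blocking region is set.
func isBlocking(flags int64) bool {
	return flags&BlockingMask != 0
}

func main() {
	fmt.Println(isBlocking(FlagInconclusive))
	fmt.Println(isBlocking(FlagDNSBlocking | FlagInconclusive))
}
```

With such a layout, a consumer needs a single AND operation to tell whether a measurement carries any blocking flag, regardless of how many informational flags are also set.
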
feat(dbsteps): decode query and reply

1b3ed3e10bbd88b895f781368a32f0a40b33e3a2 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): import dnsping results

8525fad158843614d3bb9e4e468f427e4f458882 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement li and lia

The `li` command operates in the `main` environment and only
shows the "interesting" measurement...

406306bfd636ca91ea76686f9665754ce25a61a3 authored almost 3 years ago by Simone Basso <[email protected]>
feat(dnsping): cancel timeout flag using dnsping

This change allows us to remove a potential source of false positives
by checking for the result...

07fb98fcae52bab516a66a23b9e7d1104c557e7c authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): learn from dnsping results

7d70edbf3cce69788c5df53103844ecb140b2292 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): run dnsping as a follow-up experiment

We need this experiment to disambiguate cases in which the main
lookup process (which is synchro...

c690bc7025d84dc4944243bc13d45e6dea550bd5 authored almost 3 years ago by Simone Basso <[email protected]>
feat: add dnsping implementation

The implementation is flexible enough that we can use it as a
standalone implementation or insid...

8834e6deb24ab114e873d256d239bd0a54588dd6 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): simple heuristic to detect legitimate redirects

If the HTTP response is a redirect and the location and original URL
are equal (plus or minus "w...

ede0e7e4b4d9f0521751f18f6df940700b706505 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): more robust heuristic for IPv6 not being available

I have only seen net_unreach/host_unreach being used for censoring
once. We have said multiple t...

3b0ea1814c7ee7bf6692005af1eb630af46773a4 authored almost 3 years ago by Simone Basso <[email protected]>
fix(measurex): significantly increase default DNS timeout

8e8cab8c4690880f7bd52c981389b6ef65593ead authored almost 3 years ago by Simone Basso <[email protected]>
feat: rewrite measurement viewer from scratch

f56268f1f8059ff5723ee83368b8887242415341 authored almost 3 years ago by Simone Basso <[email protected]>
refactor(websteps): avoid using misleading name

ae100a76a25c511856e9ac57c379ad76feae8db9 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): simplify SingleStep flags

There's no point in hiding them when they can safely be toplevel.

3e182546c95482cd2beb4b5dbf9d8f19f459fd3f authored almost 3 years ago by Simone Basso <[email protected]>
fix(measurex): always include cookies

In general, we'd like to always include fields to simplify
processing them in Python.

9c2a323307fba2a52abe894bb94e2d790278fd66 authored almost 3 years ago by Simone Basso <[email protected]>
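
One way to always emit a field from Go is sketched below. This is not the measurex code: the `roundTrip` type, its fields, and the `encode` helper are hypothetical. The sketch relies on two documented `encoding/json` behaviors: a field without `omitempty` is always serialized, and a nil slice encodes as `null` while an empty slice encodes as `[]`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip is a hypothetical archival record. The Cookies field has no
// "omitempty" option, so the key is present in every serialized record.
type roundTrip struct {
	URL     string   `json:"url"`
	Cookies []string `json:"cookies"`
}

// encode serializes rt, replacing a nil Cookies slice with an empty one
// so that consumers see [] rather than null.
func encode(rt roundTrip) (string, error) {
	if rt.Cookies == nil {
		rt.Cookies = []string{}
	}
	data, err := json.Marshal(rt)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	out, err := encode(roundTrip{URL: "https://example.com/"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out) // {"url":"https://example.com/","cookies":[]}
}
```

A Python consumer can then read `record["cookies"]` directly, without first checking whether the key exists or whether its value is `None`.
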
feat(websteps): start aggregating at the TestKeys level

5957b1e539693ed8426cf3531b8ee4a1c4b9fd82 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): track whether the TH failed

We need to know about this to mark a measurement as failed and
allow a user to run the measureme...

90ee938711091d36e9abbb980f75dbd4bd267b13 authored almost 3 years ago by Simone Basso <[email protected]>
doc(websteps): clarify some flags definition

7964ddb888329734cef28ff1d8c6219fc56af3d0 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): keep more free flags and join TH and GL

7ca13ca9b242a93fee19a4d09b6ec8c6f111b5ec authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): reserve bits for TH and URL

d39aefe64d400ecac15af20e3a7a0b0f23925fa8 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): define accessibility of an HTTP endpoint

bfb7cc8ae8f5ade6aa475408fc274203291df528 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): attribute failure to the observable protocol

We need to think about what the adversary can observe. This is the
protocol against which there'...

7be5f7ece4b3ec015c202a2fee8bd82cbfa0d3fa authored almost 3 years ago by Simone Basso <[email protected]>
refactor(websteps): rationalize flags

841507930bc94751502545c1baafc99e8c3ddac7 authored almost 3 years ago by Simone Basso <[email protected]>
feat: sync up measurement viewer with flags

While there, recognize that more flags are just implementation
internal and should actually be p...

4f17075bc454a345f1be3a6b7a03adb1043657df authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): rework flags to inform about possible censorship

Based on feedback from @hellais.

10ec49d191f96f14feb254d44795bc0dc92b3338 authored almost 3 years ago by Simone Basso <[email protected]>
doc(websteps): document a TODO for myself

d3bdf03bbddc6c827d84da4f88606ad242e66668 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): better request logging

Execute all actions in the context of a THRequestHandler such that
we get logging per request an...

55fa48fc1ecab67b1c41e999116270e322fa139d authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): gracefully close websocket conn

While there, stop reading early when we've got a response. From then
on, whatever happens on the...

bbffcc6e008dd48664c80fca4f2b3011acf5dfd0 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): use websocket to communicate with the TH

dc2e09d518299a1a4e09b7f22a743f8c05e8cc5e authored almost 3 years ago by Simone Basso <[email protected]>
refactor(thd): ensure each request is logged independently

6a24bc82a352f68a0e8e9dbac45f707ea20d109b authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): randomize the A and AAAA resolvers we use

1ab2fe774a3e19371f18a68d663b564da433a2e0 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): move the endpoint bogon check earlier

a6297da0418840e4bd06f230488f4c46903658d7 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): remove obsolete TODO

60b226c8e816ad9e12765d7118b97bee5aece7e9 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): ensure algorithm is Web Connectivity compatible

9c6e34d7caac7bcd4861076753c07e8f73f42816 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): backout DNS blocking when probe and TH agree

2f1be8800e126a47173049340d9bd180ffc8354f authored almost 3 years ago by Simone Basso <[email protected]>
feat(dbsteps): show URLMeasurement analysis

1103ee673a9516c94cf19f00ae42dae24454d085 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): compute URLMeasurement flags

c0277af5fe0ab8c77faecd7302e8bb8e302f97f9 authored almost 3 years ago by Simone Basso <[email protected]>
feat: add tool to import and navigate measurements

66a7eb78525575efdb3c44b06e29980e8eed7643 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): ensure the TH sees all the probe's resolved endpoints

f9f89ed97436d35f2a92986f12c05875a46fcd41 authored almost 3 years ago by Simone Basso <[email protected]>
refactor(websteps): minor cosmetic changes

3f4d6d58ea61fffb9a866ed97f8c41f8e0878b8c authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): less strict and more correct check for DNS validated by HTTPS

07c400cc0129713b5202b463d82b1d240c4907ec authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): second round of planning should only measure HTTP3

The second round is to address Alt-Svc, hence we just need to check for
HTTP3 and it would be wr...

8d2033a833696ffb134533a46aa74bc3819f4ccc authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): don't mark successful HTTPS as inconclusive

e46e97217d2b104b10226321b734a9e5cc4e0635 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): link back analysis to its URLMeasurementID

d8441356a58ec998fd11c92ba6accb0671a9fff2 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): TH measurement now refers to orig URL measurement

That is actually the right thing to do in terms of ensuring that we're
later able to reconstruct...

30fa0a3d0cb480cda9d1088e32eaf690c50daeff authored almost 3 years ago by Simone Basso <[email protected]>
feat(measurex): expose more DNSMeasurement fields

197625c447d9e2df23e1f6f0261d266f1ba3594e authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): include root URL in the test keys

e356b12356410876d7c1ad3fcf7589a806eea5f9 authored almost 3 years ago by Simone Basso <[email protected]>
refactor(websteps): make output like real OONI measurement

d8ad336abb74f7c0872aa603774018a7cea35358 authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): emit measurements like OONI would do

7c587df0957c4586615e2c364891fec30c9a320b authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): consider TH when following redirects

7ad0c859caa7317b7b78ea56b7136889f9573ac2 authored almost 3 years ago by Simone Basso <[email protected]>
feat(websteps): save URLMeasurement that caused TH measurement

0565c1bfebbd0ca0af7af7693911a8a467884b7e authored almost 3 years ago by Simone Basso <[email protected]>
fix(websteps): assign an ID to every analysis

f642db4d4a701ee1423da5617f57d27b1e7b1931 authored almost 3 years ago by Simone Basso <[email protected]>