Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ooni/2022-04-websteps-illustrated

websteps: winter 2022 edition
https://github.com/ooni/2022-04-websteps-illustrated

chore: adjust the import path

41ba115289da0cca3fb633b55066f35622d56091 authored over 1 year ago by Simone Basso <[email protected]>
doc(README.md): mention the required golang version

7e0c6a2e91b6cd0514a4ad2f0eff182e1f92c2be authored over 1 year ago by Simone Basso <[email protected]>
Python modules and ignore cache (#3)

* Ignore CACHE folder

* Required Python modules

b3bf45c44f92bc6d0473f57b27eb73c4bb67e3c0 authored over 2 years ago by Arky <[email protected]>
fix: remove test case that is uninteresting

I misinterpreted it when reading the results. We can safely zap it.

836d2d7f9185b4269acfd751bdb90df71ab9d463 authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): emit greedy mode log using scrutinize emoji

9f9e08b6b5b0d0f4d968ac161fb05a9dbf8940bc authored over 2 years ago by Simone Basso <[email protected]>
fix(dnsping): rewrite replies IDs when loading from cache

Otherwise, the IDs will hide other legitimate IDs in the measurement
and we would be very confus...

47f287375f11d99c4ebf1572da5e4c4c30066d91 authored over 2 years ago by Simone Basso <[email protected]>
doc: document newly added test case

be679ba929c927dfe294ef4938b10f3551a9956f authored over 2 years ago by Simone Basso <[email protected]>
chore: add interesting test case we don't correctly detect

260d754205e641e90c0335bd0299750d04d1c542 authored over 2 years ago by Simone Basso <[email protected]>
fix(flatdecode.py): print binary bodies as well

1fc2d3d728cc23afc30c47e4d60d7efc51f9df17 authored over 2 years ago by Simone Basso <[email protected]>
fix(measurex): test HTTPS for every HTTP-only redirect

This strategy ensures that we always have HTTPS oracles for the IP
addresses, which helps a lot ...

f6eb564a4cc9c23ff857d0515fb806207e721d42 authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): print redirect chains length

e3c8f88d378a2c7f8b8a3f23c6d60f8849e6d705 authored over 2 years ago by Simone Basso <[email protected]>
fix(measurex): canonicalize empty URL paths

b6a14fb9f55c6257440cadc1bd564c6f122b5d7e authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): correct log message wording

7b68f87cb3660fdf5384ea670f1cc2fb80fd9442 authored over 2 years ago by Simone Basso <[email protected]>
doc: add sample JSON and sample analysis

77db77f56e5a0d7bfb4522026d13b086daf7c076 authored over 2 years ago by Simone Basso <[email protected]>
fix(design): mention we try to avoid redirects

e4c65317959748d2a521018051d8247131e1ca6a authored over 2 years ago by Simone Basso <[email protected]>
fix(p/t/shell): pass correct TH URL

88887286cae81d6cb74fa30a162feb3971af7229 authored over 2 years ago by Simone Basso <[email protected]>
fix(design): use comparatives not absolutes

We're comparing A and B so we cannot say "hard" in general.

355632975ee82f0edc3c64d6177c41882edcef0a authored over 2 years ago by Simone Basso <[email protected]>
fix(spec): clarify how do_not_follow... works

eff5a91053cc244bd2e98339ede1a4b38b2d7d5b authored over 2 years ago by Simone Basso <[email protected]>
fix(spec): clarify the meaning of options

Closes #1

3899997259d4193a480a0808a09895c66867f88d authored over 2 years ago by Simone Basso <[email protected]>
fix: clarify the terms of comparison

Since it's a 1:1 comparison, it does not make much sense to
say "medium" because I do not have o...

72fd1421d5125a16089a054ba8bddfee12fbe854 authored over 2 years ago by Simone Basso <[email protected]>
doc: start adding draft of websteps design doc

52161f0d7ef26ddcfd04b33523758afeebf5b852 authored over 2 years ago by Simone Basso <[email protected]>
doc(README.md): document more enhancements

a6769ec3676f4405a945bcf5e21267ce55c8a834 authored over 2 years ago by Simone Basso <[email protected]>
doc: add more toplevel directory READMEs

d800f6ae19d7b9fc13ea8c18187d6c28766309ba authored over 2 years ago by Simone Basso <[email protected]>
cleanup: mark dbsteps legacy as clearly legacy

b9d340e47116c32d6279137b38d752e9c74687fb authored over 2 years ago by Simone Basso <[email protected]>
cleanup: remove partially incomplete command

Turns out I committed this script by mistake :facepalm:

156c484056cb1233f7ce4fc811f54cc6f858bef8 authored over 2 years ago by Simone Basso <[email protected]>
cleanup: remove now unnecessary directory

3f45df97f4dc980b398cae347dad7c809f9d91db authored over 2 years ago by Simone Basso <[email protected]>
doc: continue improving the README

4f3a92946ee65792da565f1f5e5f096050abdcce authored over 2 years ago by Simone Basso <[email protected]>
doc: document more enhancements

b75c3ffb7884af0eb3bd04fe17f2ce4b9142f8fe authored over 2 years ago by Simone Basso <[email protected]>
doc: revamp readme and explain what changed

50c1aa49b96c4b637fa4b9697142c6e964eeba7d authored over 2 years ago by Simone Basso <[email protected]>
chore: add license to the spec

While the code should be GPL I think it's nice and fine
for the spec to be BSD-3-Clause.

a000f9d46ad0f75b2c9854f64d874e33101cfbc4 authored over 2 years ago by Simone Basso <[email protected]>
chore(spec): minor enhancements

2ab7ecf9bf5b2b4bb35b91f778f489c7ee02565f authored over 2 years ago by Simone Basso <[email protected]>
doc: first draft of a nearly complete spec

5906441d08444515b66bc85a6916f6beba27585f authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): mark as bug two cases that are bugs

6dea142554f3025b49164534d21895959cf5d5ad authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps.py): use correct dictionary for unmarshalling

ef7a88b15ac5d13fb95525551e9cb078853cfa7b authored over 2 years ago by Simone Basso <[email protected]>
feat: add websteps implementation in Python

This implementation is meant to be simple and aid to reason
about websteps algorithms and in wri...

37a04c5c4ff2f617af72739867c8695487908ee6 authored over 2 years ago by Simone Basso <[email protected]>
cleanup(websteps): avoid passing unused argument

1e276ad58795fd778596a353e0c66066327730c3 authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps/client.go): avoid possible nil dereference

While this is extremely unlikely, it's better to write the
correct implementation nonetheless.

862efee1b8efdb2ee4f9e9955b8a75dbf9c2be4d authored over 2 years ago by Simone Basso <[email protected]>
fix(singlestep.go): explicitly initialize DNSPing to nil

7c51f29f8ab47b4426034ff662f440065fd99be3 authored over 2 years ago by Simone Basso <[email protected]>
feat(websteps): accept TH clients both on ws and http

Having the possibility of using HTTP helps to write simplified
clients, which in turn are great ...

7dcf73213cbb3ecb31f30d152773660cb46cb271 authored over 2 years ago by Simone Basso <[email protected]>
fix(pkg_archival.py): save peer certs as []string

e3403ed424314bab5b889d3a0a20b12b67a283c9 authored over 2 years ago by Simone Basso <[email protected]>
fix(dblike.py): include cross TH analysis in result

Without this piece of information, we're missing extra data
about the probe comparing different ...

f5d772e43a70f2b4a1e29bfb075fde257c99a60c authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): indicate measurement ID in webpage

While we have context in the textual interface, we can easily
lose context when we have multiple...

3b55a5cf0cfdcaf74bf007c91679363b063afb86 authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement b in the s environment

ae02ec37c721f0a811169b85ec749d9e1b107153 authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement interactive ci or cu

This command adds an interactive prompt allowing us to cycle
through each row returned by ci or ...

26d714fae4952b696494a238062f2feb91c728bf authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): ci/cu: include measurements samples

This diff automates part of my workflow for going through
measurements and checking whether the ...

1546a3781b8fe9558f3d385c5a402f64bf246b40 authored over 2 years ago by Simone Basso <[email protected]>
fix(python): avoid pylance type errors

Yesterday, pylance was able to infer that a List[DBLikeEntry] is
a List[Tabulable] because DBLik...

6b8e925312da01fb71562ad77e7ba22f50463538 authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): shows details in the webpage

ca4e0ca70db50e6d444d2cff0c46adae650f1e08 authored over 2 years ago by Simone Basso <[email protected]>
refactor: entry decoding now returns a string

06be25b7d632446a9210f7397dcf1382654b8d30 authored over 2 years ago by Simone Basso <[email protected]>
refactor: factor code to decode a DBLikeEntry

d5812995a589235da88381386b1f5dd745bfe84d authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): start visualizing websteps measurements

5811afcac91c2eb287cd16f5ecd2d9ea18724d0a authored over 2 years ago by Simone Basso <[email protected]>
feat: create scaffolding for showing websteps results in the browser

So, now we have a way to show a measurement in the browser except
that for now we just produce a...

b294bdab1e920001d34dec059d18b65539aad798 authored over 2 years ago by Simone Basso <[email protected]>
fix(dbsteps): main_command_hashtag should return on parse error

d4e1800bc29c6a6a5ed6b5f4768147d19dac82b7 authored over 2 years ago by Simone Basso <[email protected]>
refactor: share code to create tempfiles and tempdirs

a131c94829b8000b7c4bcffcd5e5730e3a2d3efc authored over 2 years ago by Simone Basso <[email protected]>
doc: improve documentation of jsonl.py

70497ae5e9327d6f265a1d360604259794e70215 authored over 2 years ago by Simone Basso <[email protected]>
refactor: move load functionality in dblike

cdc18cee6889e3ea9317f959fd62085f19ac0606 authored over 2 years ago by Simone Basso <[email protected]>
refactor: tabulatex could be a simple py file

dbfb204aa246eda5b0a8a7db3c81b583e8729f54 authored over 2 years ago by Simone Basso <[email protected]>
refactor: also jsonl.py belongs to dataformat

e2022f8399377cbdff72827c62e63b88b42f6a10 authored over 2 years ago by Simone Basso <[email protected]>
refactor: move testcase.py into the dataformat

It is also an overlay on top of functionality in dataformat, so it
makes sense to move this code...

6ad497a266e8430a1c01caa700a9ef82790cc106 authored over 2 years ago by Simone Basso <[email protected]>
refactor: move decode.py into the dataformat

It's a set of extra routines useful when managing the dataformat
so it makes sense for it to liv...

7ecc28dd3640d49a621b63949eee2d42cf5d16f2 authored over 2 years ago by Simone Basso <[email protected]>
refactor: move dblike into p/ooni/dataformat

a4bcfaee9f796d066ed6f0f983b242f14abdedc5 authored over 2 years ago by Simone Basso <[email protected]>
refactor(dblike): change names to be more specific

The previous names survived the refactoring after I ported over
previous dblike code to use ./py...

9b098a7fc841c355b9cb97156fe7fda68b21f325 authored over 2 years ago by Simone Basso <[email protected]>
refactor(htmlx): start making space for more htmlx users

We're moving in testcase.py the code for HTMLifying test cases.

The __init__.py now just export...

2a129a492763e350c8fd3fecd287fe144d4d3f2f authored over 2 years ago by Simone Basso <[email protected]>
refactor(htmlx): rename function for clarity

8f1797781bce5851ba2d0a0559e3444d850690b6 authored over 2 years ago by Simone Basso <[email protected]>
feat(dbsteps): implement superset `l` command that lists all

Still, I think I'd be better served by a web version of this command
because looking at tables i...

7930b7fb6f96e56cd238d84ec1e0ff8536388f60 authored over 2 years ago by Simone Basso <[email protected]>
fix(dblike.py): add more entries to DNS lookup results

85fa5ec65ec5fcf88e5827e3efd94080aad3de17 authored over 2 years ago by Simone Basso <[email protected]>
chore(dblike.py): minimum cleanup and columns reordering

7abb53216c65ae39b89e88022f52f145f41ef3a1 authored over 2 years ago by Simone Basso <[email protected]>
feat: update dbsteps to post 2022-03-23 data format

This happens in a fork of the original code because we still need
to keep the original code arou...

1f8bf2fa70e121b7e4fab9fc8edbcda0237d4cc7 authored over 2 years ago by Simone Basso <[email protected]>
fix(dbsteps): avoid possibly unbound variable

While there, clarify that this command is deprecated because we
have changed the format and it m...

f569468eff952457cad1c8967d6b6d13893a3323 authored over 2 years ago by Simone Basso <[email protected]>
chore: add more test cases

These ones are from Italy and we have plenty of DNS lies and
block pages all around the boot :-)

5b2124122a111fa86f89b98ba35001e567f88863 authored over 2 years ago by Simone Basso <[email protected]>
chore: add one more test case

This test case does not involve censorship but rather it contains
peculiar conditions where the ...

087e4f9afc8c71e53ba0c7ad69c73e4d5a526c43 authored over 2 years ago by Simone Basso <[email protected]>
fix(measurex): return cached results sorted by completion time

I've seen cases where a resolver returns A, B, C and another one
returns C, B, A where the first...

0dc8721d72a6c371cf715291bbeafda3ab10c79b authored over 2 years ago by Simone Basso <[email protected]>
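
As a minimal sketch of the idea in the commit above (returning cached results in order of completion time, so that the order no longer depends on how each resolver happened to list its answers), one could sort on a completion timestamp. The `CachedLookup` type and its `finished` field are hypothetical names used only for illustration; they are not taken from the repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedLookup:
    """Hypothetical cached DNS lookup result (illustrative only)."""
    resolver: str
    addresses: List[str]
    finished: float  # assumed: completion time in seconds


def sorted_by_completion(entries: List[CachedLookup]) -> List[CachedLookup]:
    """Return a new list ordered by completion time, earliest first.

    sorted() is stable, so entries with equal timestamps keep
    their original relative order.
    """
    return sorted(entries, key=lambda entry: entry.finished)


if __name__ == "__main__":
    cache = [
        CachedLookup("resolver-b", ["C", "B", "A"], finished=2.0),
        CachedLookup("resolver-a", ["A", "B", "C"], finished=1.0),
    ]
    for entry in sorted_by_completion(cache):
        print(entry.resolver, entry.addresses)
```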
chore(websteps): intercept and log more bugs

When the cache fails, it returns a measurement with an invalid
ID of zero. We don't want to anal...

d63c9b30ee14e26f837dc550b5122a56ef7828a7 authored over 2 years ago by Simone Basso <[email protected]>
chore: add more test cases

These tests have been collected in Iran.

More test cases to come soon.

1f09952c9a8fefb8f3b0ef81065b27f3c1059ecc authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): better legitimate redirect heuristics

One of the two endpoints must be a redirect and the other must be 200.

We cannot say we have a ...

c0dfbaa0a41db01e792e6b014b4f3fd7f5f46ab3 authored over 2 years ago by Simone Basso <[email protected]>
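
The rule stated in the commit above (one of the two endpoints must be a redirect and the other must be 200) can be sketched as a small predicate. This is only an illustration under assumptions: the function name and the set of status codes treated as redirects are hypothetical, and the actual heuristic in the repository may apply further conditions that the truncated message does not show.

```python
# Assumed set of HTTP status codes that count as redirects.
REDIRECT_STATUSES = frozenset({301, 302, 303, 307, 308})


def redirect_vs_ok(status_a: int, status_b: int) -> bool:
    """Return True when exactly one endpoint redirected and the other
    answered 200, in either order (hypothetical helper)."""
    a_redirects = status_a in REDIRECT_STATUSES
    b_redirects = status_b in REDIRECT_STATUSES
    return (a_redirects and status_b == 200) or (b_redirects and status_a == 200)


if __name__ == "__main__":
    print(redirect_vs_ok(302, 200))  # one redirect, one 200
    print(redirect_vs_ok(302, 301))  # both redirect
    print(redirect_vs_ok(200, 200))  # neither redirects
```

Two redirects, two 200 responses, or a redirect paired with an error status all fail the check in this sketch.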
fix(analysisredirect.go): emit message at NOTICE level

7744275858e2422b0a2ae5fc8192bfedc76c6bc3 authored over 2 years ago by Simone Basso <[email protected]>
fix(bogon.go): avoid emitting confusing log message

a65f3e8579b59565789a1f38c468b0a9646783cd authored over 2 years ago by Simone Basso <[email protected]>
fix: ensure we write all the logs

When we're using a slow mechanism to transfer logs (e.g., ssh) we may
end the main program much ...

c899318a2ee343d01de84a3203de08547a11e077 authored over 2 years ago by Simone Basso <[email protected]>
fix(dnsping): emit notice that we're importing from cache

Not a big deal but it's useful to see that log line.

1b24d20cc39642ef9cc441792a04acff334d0380 authored over 2 years ago by Simone Basso <[email protected]>
feat: add three test cases from China

These three test cases consist of basically probe and TH caches to
replicate measurements and te...

1401a1e3dedcf713dc86b6332cc1253f60c4342b authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): ensure the dnscache represents itself correctly

The address should be empty and the network should be `dnscache` so
we produce the `dnscache://...

844501c5009bfb386ea9476435d1b713f5f65b65 authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): more pragmatic approach to flags

Keep the low 32 bits for anomalies. Keep the high 32 bits for things
the probe noted and may be u...

4263ab94122119dff876bd9f58b326a21c4cf74b authored over 2 years ago by Simone Basso <[email protected]>
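
The split described in the commit above (low 32 bits for anomalies, high 32 bits for things the probe noted) can be sketched with two masks over a 64-bit integer. The individual flag names and bit positions below are hypothetical and only illustrate the layout; the real flag definitions live in the repository.

```python
# Masks for the two halves of a 64-bit flags value.
ANOMALY_MASK = 0x0000_0000_FFFF_FFFF     # low 32 bits: anomalies
PROBE_NOTE_MASK = 0xFFFF_FFFF_0000_0000  # high 32 bits: probe notes

# Hypothetical flags, chosen only to show one bit in each half.
ANOMALY_EXAMPLE = 1 << 0
PROBE_NOTE_EXAMPLE = 1 << 32


def anomalies(flags: int) -> int:
    """Return only the anomaly bits (low half)."""
    return flags & ANOMALY_MASK


def probe_notes(flags: int) -> int:
    """Return only the probe-note bits (high half)."""
    return flags & PROBE_NOTE_MASK


def has_anomaly(flags: int) -> bool:
    """True when any bit in the low 32 bits is set."""
    return anomalies(flags) != 0


if __name__ == "__main__":
    flags = ANOMALY_EXAMPLE | PROBE_NOTE_EXAMPLE
    print(hex(anomalies(flags)), hex(probe_notes(flags)), has_anomaly(flags))
```

With this layout, a consumer interested only in anomalies can mask off the high half and ignore whatever notes the probe attached.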
fix(websteps): also check for legitimate TH redirections

We're going to flag what look like legitimate redirections on
both sides specially, because we b...

2f40d957fd589ba97a6a89d48484eed58d7d10f6 authored over 2 years ago by Simone Basso <[email protected]>
fix(netxlite): avoid saying every IPv4 is a bogon

It seems the way in which I extended bogons code was completely
broken and we should instead spl...

f313a86988e62f2e3cc884c47e80a944afd2a63e authored over 2 years ago by Simone Basso <[email protected]>
fix(measurex): don't reverse lookup bogons

There's really no point in trying that.

14b0f0f9011781f9c55da071573d3c35402b5116 authored over 2 years ago by Simone Basso <[email protected]>
fix(netxlite): significantly improve our bogons list

I have used the information provided by ipinfo.io to collect this
list of bogons IP addrs. Hopef...

68fba07e450f52bf9d12f78e7ceb80a7b3cc7a24 authored over 2 years ago by Simone Basso <[email protected]>
feat: add web interface to explore the cache

It took some time to develop a web interface that seems easy to
use for my purposes, but now at ...

fa9ec0d29c914c770c6679fdc76ccdab2d6243f8 authored over 2 years ago by Simone Basso <[email protected]>
refactor(shell): do not print inside measurex

Let's move all printing to the shell and let's just decode
to string inside of p/o/measurex.

Th...

146daf5cc6f3ee3ed14cb5f451531d027c57c88a authored over 2 years ago by Simone Basso <[email protected]>
refactor(p/o/measurex): decode without printing

We should separate decoding from printing. This change is
quite useful to allow us to not immedi...

f7f2154457b5f1c3332a0eff8a52fbcf2f8d610a authored over 2 years ago by Simone Basso <[email protected]>
fix(p/t/shell): make rerun work again

My previous rewrite broke the way in which we set content into
the cache. We need to serialize a...

fa4db397f37938e169e0b52f61be969ed45adea6 authored over 2 years ago by Simone Basso <[email protected]>
feat(tabulatex): implement dumping to JSON

This diff adds to tabulatex support for dumping to JSON. We construct
an ordered dict out of eve...

99a517b78c7b40db340b35a7bdbca45c5b823b4a authored over 2 years ago by Simone Basso <[email protected]>
feat: improve how we emit tables

1. factor Tabular into the tabulatex package

2. teach Tabular to represent itself using tabulat...

0fa429b4e50fa6d00cce248fff17ea0bdbfcf27d authored over 2 years ago by Simone Basso <[email protected]>
feat: start sketching out better python library

1. I have expanded upon previous versions of the library for
parsing OONI data formats, which no...

9f08b4982626f3af3ff69e6dea941442fab2eedd authored over 2 years ago by Simone Basso <[email protected]>
chore(.gitignore): ignore more files and dirs

93ee224fa5116d3da59e08b277365047d6a244da authored over 2 years ago by Simone Basso <[email protected]>
doc(internal): improve README.md

dd1d94aff0471c2b5f7ebfdd8ca97c8937d30291 authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): tweak previous commit wrt extracting title

I forgot to commit these files that contain changes implemented in the
previous commit that alwa...

45b373eba08a814631171427e8445c6afcc2fc45 authored over 2 years ago by Simone Basso <[email protected]>
fix(measurex): we always want to extract the title

While other options may be changed, I don't see any reason to change
this one from "always (try ...

407fa3fe81dd2a9fb11b8da45e41ef6a12ce9ad1 authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): make cross-TH comparison right

We need to pit the set of only-probe IP addresses measurements
performed by the TH against the T...

632f27443ab9d94fb05efcf5e0b0c1ce190221e2 authored over 2 years ago by Simone Basso <[email protected]>
fix(websteps): slightly better logging

1a04dcd60fc39414d3fa0c173a8ff86a86fca4eb authored over 2 years ago by Simone Basso <[email protected]>
fix(logcat): more pleasant formatting of logs

8d0400aaa02f4ef19344be997f7957ca73b6404b authored over 2 years ago by Simone Basso <[email protected]>
feat(.gitignore): ignore temporary directories

615618b770d55359c77e66e7b47f7139a55ae2a6 authored over 2 years ago by Simone Basso <[email protected]>