Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/HTTPArchive/bigquery

BigQuery import and processing pipelines
https://github.com/HTTPArchive/bigquery

Merge pull request #88 from HTTPArchive/zero-queries

Make sure Fugu queries show zeros and not blank charts

ce0212a4825e9df50a4f6dd3eb78e99fbddafbaf authored about 4 years ago by Thomas Steiner
Rename nativeFileSystem.sql to fileSystemAccess.sql

64d22356f4fa317fb7aa9d0f64c954a8438cb8bd authored about 4 years ago by Thomas Steiner
Rewrite all queries following @rviscomi's sample

Fixes #221

6acc9487fe270d4b93ad4905563315f6e5a67483 authored about 4 years ago by tomayac
add lazy image metric (#87)

aba1f9c6e76289ff62e053e22721ef1eb8260759 authored over 4 years ago by Rick Viscomi
response body requestId (#86)

e2394a24cf663a5b614ea3f90c89d6883c2df174 authored over 4 years ago by Rick Viscomi
bail early if BQ tables exist

773f32c8bfdd6a66901b35623e9692da2181c7df authored over 4 years ago by Ilya Grigorik
Merge branch 'master' of github.com:rviscomi/bigquery

f4885a13242baea3637a65020f5a5c79fe663997 authored over 4 years ago by Ilya Grigorik
skip page if too big

6eebb33278f02e98b687b1f532356c66fff47a01 authored over 4 years ago by Rick Viscomi
increase row limit to 100 MB

9a77fbb621e35aac887946e1c6c3e7f48498bf7b authored over 4 years ago by Rick Viscomi
rm csv if bq succeeded

03cc422a4fe35f3d84370b3ea10c1ec1d3b055be authored over 4 years ago by Ilya Grigorik
Exit on bq load error

1787de8e9242deb6096ef4ff4c37dd1162b2fb1d authored over 4 years ago by Rick Viscomi
Delete CSV artifacts (#85)

8f91f7922bea09c3405b6e5d4e498f4eac666fe8 authored over 4 years ago by Rick Viscomi
Add Project Fugu report (#84)

* Add a couple of reports

* Add more reports

* Add more reports

* Add even moar reports...

f0eb80de58a04f3e58d7c1d80161048e53b956e4 authored over 4 years ago by Thomas Steiner
Trivial typo fix

93b3cbd5fa248822a33894d1918c422d6e0d3518 authored over 4 years ago by Thomas Steiner
download CSVs from GCP storage

5a5f45ae9bbc6eb39bcd8c5df32119d1475ac2e2 authored over 4 years ago by Ilya Grigorik
Merge branch 'master' of https://github.com/HTTPArchive/bigquery

be1507faa4657766c275671b16b5601e83b49cf1 authored almost 5 years ago by Ilya Grigorik
tolerate 10 bad records

539b01009c5b7868f9be4b1f8605029e2f58e28e authored almost 5 years ago by Ilya Grigorik
fix SW feature query (#80)

fixes https://github.com/HTTPArchive/httparchive.org/issues/200

65955f5a3ce61d9f4b3169efdb87c53e2f2c46a5 authored almost 5 years ago by Rick Viscomi
Avoid regenerating existing reports (#75)

* avoid regenerating existing reports

* negate :)

7cb9ad1429437e245543faf32e8da07333412bad authored over 5 years ago by Rick Viscomi
avoid FID division by 0 (#73)

* use materialized crux timeseries

* fix division by 0 bug in FID query

97988d5e1d0e3059a3c8304eccf02eda432773f2 authored over 5 years ago by Rick Viscomi
use materialized crux timeseries (#72)

da5c65bbf63b0c5dcd4810a23700eea16e9f045e authored over 5 years ago by Rick Viscomi
Merge pull request #71 from rviscomi/corpus

adjust post-processing for monthly crawls

867f4d7a0c38928abdf15a5bdeeef5198be7f9f3 authored almost 6 years ago by Ilya Grigorik
adjust post-processing for monthly crawls

a92e548e9a46dca649e37f52b1de8137bbdf44db authored almost 6 years ago by Rick Viscomi
LH a11y timeseries (#68)

107c3ef371f2662a79979b26e998881e9e58d871 authored about 6 years ago by Rick Viscomi
Update new_metric.sh

3b760bcd0631318fc6b061ae7ede602170146272 authored about 6 years ago by Rick Viscomi
lightboard histogram metrics (#66)

- bootupJs
- offscreenImages
- optimizedImages
- responsiveImages (deleted)

- a new script...

0691ea5ef318f0589172f3dcb8ed612be3f3fe97 authored about 6 years ago by Rick Viscomi
slow FCP and FID timeseries (#65)

f6a880f010655e12dce752e7936f7e0468ecbf4e authored about 6 years ago by Rick Viscomi
Update generate_report.sh

fix strange `CommandException: Invalid command "application/json".` error

ebb54bf4673d17c195a5cc0983a861efb098bc10 authored about 6 years ago by Rick Viscomi
Update to latest Lighthouse report format (#64)

8192a27784f66dc2167568838b5ef45a8a3444a5 authored about 6 years ago by Thomas Steiner
Add service worker controlled pages query (#62)

* Add Lighthouse PWA scores report

* Add service worker controlled pages query

* Remove er...

7d65ab701508897f8221119ebc81f1ac69c9dc83 authored over 6 years ago by Thomas Steiner
Add Lighthouse PWA scores query (#61)

* Add Lighthouse PWA scores report

* Remove nested query, reformat, improve consistency

38c3a390990a52708430ff0f2fae49fcbb0db9ad authored over 6 years ago by Thomas Steiner
numUrls query (#60)

57db33e29314c504717b9f59c5e3e6381ea6a5d3 authored over 6 years ago by Rick Viscomi
One more lens join fix (#59)

* Add Drupal and Magento lens queries

* whitelist lenses on GS CORS

* fix lens join

2fda19da2ce130d702f5b8261f3e994313e3152c authored over 6 years ago by Rick Viscomi
Whitelist lenses on GS CORS (#58)

* Add Drupal and Magento lens queries

* whitelist lenses on GS CORS

bb0b94f46b56f0686712837c3ae3d079c9cd159c authored over 6 years ago by Rick Viscomi
Add Drupal and Magento lens queries (#57)

344294bdb1e3357df124db63fc9394931e8f2eee authored over 6 years ago by Rick Viscomi
lightboard metrics (#56)

bc7303a527df477be36a127798bcd3463c7c3054 authored over 6 years ago by Rick Viscomi
fix lens join replacement (#55)

640ee29058f58aabc82406a8334730b6ebb11620 authored over 6 years ago by Rick Viscomi
Lens fixes (#54)

* whitelist WordPress lens subdomain

* request queries should alias page to url

make joins...

0c31641707e87b9146f90279e97dae6816a34b38 authored over 6 years ago by Rick Viscomi
CrUX timeseries (#53)

3025fc1d72662edcf057036f19f78570da109eae authored over 6 years ago by Rick Viscomi
Add FID to CrUX report (#52)

9fd0c4c5e2f0de6d7de9dd66852cb79c8e87e8a0 authored over 6 years ago by Rick Viscomi
Fix lens joins for requests queries (#51)

* whitelist WordPress lens subdomain

* request queries should alias page to url

make joins...

a493f6cef4e586d939ed2559496d89df2586a467 authored over 6 years ago by Rick Viscomi
Added H2 requests SQL care of @rviscomi (#50)

548b136c0272698b9557298b10539a8a9bcfe17c authored over 6 years ago by Jeremy Wagner
Add localhost to CORS whitelist

649f517f6131e35229bb867f7078f5227fc8615b authored over 6 years ago by Rick Viscomi
Lens reporting (#48)

* add wordpress lens

* increase imgSavings bin granularity

* add gzip savings

* add spe...

ec5913e268c49b815bd721bd1c57c729f171f752 authored over 6 years ago by Rick Viscomi
Merge pull request #45 from HTTPArchive/lighthouse

Update queries to handle LH v3.0

99aa05372098ebe0d61a8f2bfb99671d93746a71 authored over 6 years ago by Rick Viscomi
update to handle LH v3.0

cab82c9b59da247b8c6f3d409abd6f1ebeae71cc authored over 6 years ago by Rick Viscomi
Merge pull request #44 from HTTPArchive/lighthouse

updated LHR schema

3dc8e29baeb67e4d8bd42ddd4851724121d430d9 authored over 6 years ago by Rick Viscomi
updated LHR schema

3ac84a1ce8636b5d2f2e2e08c7201a7f25fdd623 authored over 6 years ago by Rick Viscomi
Merge pull request #40 from rviscomi/apps

write wappalyzer detections to dedicated `technologies` table

148eb3103773e2b3f3fedcda14e6d4a16a94ca34 authored over 6 years ago by Rick Viscomi
write apps to `technologies` dataset

8b993b7b9d8a08a32f6536145cea2e021cb367c8 authored over 6 years ago by Rick Viscomi
fix date bug

ad0e0714a3c720464c8a35a52295fc54ee594507 authored over 6 years ago by Rick Viscomi
fix compileJs query

2262c0defb3e11041a3c1a5ee07e13f890fb14ab authored over 6 years ago by Rick Viscomi
add www subdomain to cdn cors whitelist

3a749bbb7cc23bfe4f55547d809e41f71105a1c2 authored over 6 years ago by Rick Viscomi
defend against unusual input

- detection objects may be empty arrays if nothing detected
- apps may have multiple comma-separ...

fc48166627f4beda69e9fd5da7d6e1326d61421c authored over 6 years ago by Rick Viscomi
write wappalyzer detections to dedicated `apps` table

28d388d3184933b9269f8068ba695183e54500e8 authored over 6 years ago by Rick Viscomi
Merge pull request #38 from rviscomi/37

use legacy server for csv downloads

979d611f1579dd8362a370c515c145e51d576797 authored over 6 years ago by Rick Viscomi
use legacy server for csv downloads

f9d28a0f38080ef0145d7f5cda6e6c826ce92d4f authored over 6 years ago by Rick Viscomi
Merge pull request #33 from rviscomi/lh

handle _lighthouse empty array

9828532a5415904b8899df7ba118789a6345315d authored over 6 years ago by Rick Viscomi
Merge pull request #35 from rviscomi/34

no spaces in gsutil headers

87d0e435b7d6b013211ecf6ef024bf9e6cfe7374 authored over 6 years ago by Rick Viscomi
no spaces in gsutil headers

b353ad004df54bd02679e7039a671160e10392c1 authored over 6 years ago by Rick Viscomi
handle _lighthouse empty array

9737b147bd3cbb2fecddfb390a64203c0edb5476 authored over 6 years ago by Rick Viscomi
Auto generate reports (#30)

234a4af967fadb651d128b713a2a59ec9469e504 authored almost 7 years ago by Rick Viscomi
SEO sql and fixcsv script (#29)

4335095126347035c83e71ddf22a43686da5b622 authored almost 7 years ago by Rick Viscomi
SEO metrics (#26)

a1c34083e6493c8e824e25ba99ecbb5fff782398 authored almost 7 years ago by Rick Viscomi
Move beta report SQL across repos (#25)

2bfa041ca4a28adfeeb7746ff8d7cb929e9d78e9 authored almost 7 years ago by Rick Viscomi
Merge pull request #24 from rviscomi/dataflow

fix string equality

a639be4f2513daf4ebe57c4827688539e87f24d8 authored almost 7 years ago by Rick Viscomi
fix string equality

9b6c1928e50f25fe8b97c31fb72cb007034e2022 authored almost 7 years ago by Rick Viscomi
redirect BQ output to datasubsets (#23)

har:
- pages
- requests
- response_bodies
- lighthouse

runs:
- summary_pages
- su...

3ae5d0bc903140e1c1e89aa859a1440f8cdc898f authored almost 7 years ago by Rick Viscomi
Merge pull request #22 from rviscomi/master

Overwrite tables on sync CSV

deebf20a178770153727bf486d0c3dbd1c20e6ed authored over 7 years ago by Ilya Grigorik
overwrite on sync csv

960cae27a024c3d3d0d1457cc647ffdfc689efc8 authored over 7 years ago by Rick Viscomi
Merge pull request #18 from rviscomi/dataflow

Update to dataflow 1.9

0489d8e96a7b733e475af5eebfd937f92a20c2f1 authored over 7 years ago by Ilya Grigorik
Merge branch 'master' of github.com:HTTPArchive/bigquery into dataflow

f44223a29984fc18ec2aa5dd545fcfd2e55504ea authored over 7 years ago by Rick Viscomi
update to dataflow 1.9

c0e88f754b8003e32bb6bd83729d5c7097945925 authored over 7 years ago by Rick Viscomi
prepare for dataflow 1.9 and beyond (#9)

* prepare for dataflow 1.9 and beyond

* dataflow

* type annotations and transforms for GCS...

60baae6577f6df4355999fcf0d08cb9b324ca9bf authored over 7 years ago by Rick Viscomi
whitespace cleanup

0f0a73824847473e1a159bd2c7541cb4c4454e83 authored over 7 years ago by Rick Viscomi
Merge branch 'master' of github.com:HTTPArchive/bigquery into dataflow

ead5fb6e4d43858362eafa6e0240c2ad1d6531ee authored over 7 years ago by Rick Viscomi
Write Lighthouse reports to a new *_lighthouse table (#16)

* add lighthouse field to _pages tables

- defaults to empty string if LH property not in HAR ...

1b11d6a40c77fa79b6995d75891700065418b5d4 authored over 7 years ago by Rick Viscomi
Merge branch 'master' of github.com:HTTPArchive/bigquery into dataflow

27cb2dc5f9d86d8e464ad322d23bd1d22fd69bee authored over 7 years ago by Rick Viscomi
fix gunzip

9c6009af2d3263bca788f8b465445e6da24f065e authored over 7 years ago by Rick Viscomi
unzip har

3ac6f8b94dd89914a98c004094d449500edcbf69 authored over 7 years ago by Rick Viscomi
GcsPathCoder

f8f888af98330b37bf7ec0b77c04239374cc9101 authored over 7 years ago by Rick Viscomi
decodeHar, unzipGcsFile

6755be17126a687e84e8e8116b148d911ae10ef4 authored over 7 years ago by Rick Viscomi
Lighthouse support in HAR Dataflow (#14)

* add lighthouse field to _pages tables

- defaults to empty string if LH property not in HAR ...

5f106f94b0e7a0b683b7130dcd41b78b7ed64c09 authored over 7 years ago by Rick Viscomi
fix har table check (#12)

423aabe5560d0ec2c3fffa22d4386cd9e8ab1d7c authored over 7 years ago by Rick Viscomi
type annotations and transforms for GCS pipeline

7c347b7be882032b53b073a24090e4dcf27bf475 authored over 7 years ago by Rick Viscomi
add lighthouse field to _pages tables (#10)

- defaults to empty string if LH property not in HAR file

86469c66d87e3c8bc33e36796e8afc175fe81413 authored over 7 years ago by Rick Viscomi
Skip redundant steps in CSV sync (#8)

* skip redundant steps
* fix bq show

d47945eec8252a2be19805331a778b17a8dabd60 authored over 7 years ago by Rick Viscomi
dataflow

64ece6608f8b7d9a49f1097b3859fc1d48801241 authored over 7 years ago by Rick Viscomi
prepare for dataflow 1.9 and beyond

3085e10b65f48cef464ab3f9af626a5e0deb1967 authored over 7 years ago by Rick Viscomi
fix crontab

81da79028be344fa9789b05173325b913f31dc47 authored about 8 years ago by Ilya Grigorik
ignore OSX misc

f881c210f1d98e991a72b422b1a03a5959336138 authored about 8 years ago by Ilya Grigorik
check presence of pageUrl in HAR pipeline

39d7c549f19cdb20e09d36d8d9e430c3aa24c8d7 authored about 8 years ago by Ilya Grigorik
version Alexa ranks & join against DMOZ

Closes https://github.com/HTTPArchive/bigquery/issues/3.

250cb5cb69d716cb5e88f91dc16eb52945214659 authored about 8 years ago by Ilya Grigorik
remove old scripts

16b6ef0e6007f3fdb2a35e0830bdee4473fd01c7 authored about 8 years ago by Ilya Grigorik
account for JSON escaping when truncating body content

b021c9a5bf29a5ab205d61220a335719778f7114 authored over 8 years ago by Ilya Grigorik
update project defaults to follow new table convention

388c365c76d1d2c61e0f2ad45a46062d7e984371 authored over 8 years ago by Ilya Grigorik
add slf4j dependencies

f43cc5c287e129192fbe7a46d5610ae074d658e6 authored over 8 years ago by Ilya Grigorik
write bodies to a separate table

7f4b3dc92f58a4857a31d4991ca2d5ff177e0ce9 authored over 8 years ago by Ilya Grigorik
switch to yyyy_mm_dd_type format for table names

8c9539afc0f366f5eff02147164d54e3388c2908 authored over 8 years ago by Ilya Grigorik
fix get_rules scripts

e6689ca4a7fbd7141a5e583d3b53bf3d7c5dd2a7 authored over 8 years ago by Ilya Grigorik