Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/openaddresses/batch-machine

Python scripts to download and process a single source
https://github.com/openaddresses/batch-machine

Ditch whitespace I guess?

d927604369307485f2eed8b0c63ff840e571c740 authored almost 10 years ago by Ian Dees <[email protected]>
Work around Shapefiles with occasional missing geometry

This failure showed up in us-ga-muscogee, maybe others.
Unfortunately I don't know how to create...

36287b2c1fa5c12ad4fcfb361f633d2ba78617a4 authored almost 10 years ago by Nelson Minar <[email protected]>
More test cases for rounding

aedfd049f0e0569c4ff71c41f96a0b0fb15ae0ca authored almost 10 years ago by Nelson Minar <[email protected]>
Round lat/lon to 7 places, remove .0 from street numbers.

Also fix a harmless misuse of rstrip() in EPSG code conversion.
See issue #31

c369a20ba8417eb935585636e51a4f4e27db613e authored almost 10 years ago by Nelson Minar <[email protected]>
Made a few small tweaks

f122dd7708cf0c69b1b6c8c84c94ee857420c38b authored almost 10 years ago by Michal Migurski <[email protected]>
Reorganized and tested file extension guessing in openaddr.cache

2e5f10a56a16d294d75b671452d5c862f989d6f7 authored almost 10 years ago by Michal Migurski <[email protected]>
oops

ecf0004f85e2165e205df720ae788b2ab3ca4121 authored almost 10 years ago by Michal Migurski <[email protected]>
Added handling for file:// URLs when testing

e19f78245cc23f5d2e3006d786948028a13d53da authored almost 10 years ago by Michal Migurski <[email protected]>
Added file magic check for downloaded files

cc5612ee9da2d49ce38e0faed6a974a10db8a184 authored almost 10 years ago by Michal Migurski <[email protected]>
Tweaked logger name setup

3b9df6e5d1eb459a0b477ca6668c148eddf8ea1e authored almost 10 years ago by Michal Migurski <[email protected]>
Handling Content-Disposition responses from new Socrata hotness

4c1c5b97b153e604e3356a12098440bd08b2e08c authored almost 10 years ago by Michal Migurski <[email protected]>
Tweaked error logging for command line output

1f6d0e899a897c15dd5f59793bebfd729bf5758d authored almost 10 years ago by Michal Migurski <[email protected]>
Removed thread_work() from cache() to match conform()

2f9b90e93e3a8f13bd1dead62c77167e4a356b63 authored almost 10 years ago by Michal Migurski <[email protected]>
Fixed S3 bucket creation

8a040719a17a1af96996fddfa3e932f4d867860e authored almost 10 years ago by Michal Migurski <[email protected]>
Oops, make sample logging less spammy.

25092e8da39f35d5b81266ed472ea729c2257e29 authored almost 10 years ago by Nelson Minar <[email protected]>
Make excerpt failures non-fatal.

Helps with issue #35, doesn't solve it.
Also be a bit more verbose in logging progress.

97f0968f9ea1fbd8a139e7120ae568cc64bdbe71 authored almost 10 years ago by Nelson Minar <[email protected]>
Special case expansion so "3RD" becomes "3rd".

Fixes issue #37

9f1151fd57bca46eaba1f68b0cfb9c667357dc0d authored almost 10 years ago by Nelson Minar <[email protected]>
Add support for GML files (conform type: xml)

Just uses OGR to do the parsing. Note that OGR has this habit
of creating .gfs files next to the ...

bbd215935f36adf81971c9409531c94a50e4c514 authored almost 10 years ago by Nelson Minar <[email protected]>
Handle CSV with variable cols. Smash advanced_merge case.

Fixes issue #36, improperly quoted Korean sources that have
some rows with commas where they don...

26f24fa69300bd8c83db1512b0421ac1e099c7e7 authored almost 10 years ago by Nelson Minar <[email protected]>
CSV SRS reprojection, also case mismatches in CSV headers.

CSV files are now reprojected when extracted, to deal with a
few unusual sources that have the c...

e70342d910222afe958208aaa6d2d3e8c7ce83bf authored almost 10 years ago by Nelson Minar <[email protected]>
Support and tests for SRS tag, for shapefiles without .prj files.

884b6ad84b04eefcb3fada2844912a3ac611aea2 authored almost 10 years ago by Nelson Minar <[email protected]>
Implement skiplines and headers CSV tags

It's still a partial implementation, but handles the only case we
have now for CSV files that hav...

b2250acc47d1889e1ec0d4d8520dff195e3da2b6 authored almost 10 years ago by Nelson Minar <[email protected]>
Implement CSV headers = -1

0d9542968a28e46d8c0bf186777b09e56f6c9764 authored almost 10 years ago by Nelson Minar <[email protected]>
Support advanced_merge. Also add a thorough test for a Japanese CSV source.

Replaced the lake-man-jp.csv test data with a sample from live data.

15387dc3a8850316d05a89d37f279510925d2ef9 authored almost 10 years ago by Nelson Minar <[email protected]>
Add FTP support. See issue #30.

dc6adf590cd95fb14679f77bacc84b08267851f4 authored almost 10 years ago by Nelson Minar <[email protected]>
Support CSV files in encodings other than UTF-8

d3a545e1d3396258f91bad354ddb1295811d916b authored almost 10 years ago by Nelson Minar <[email protected]>
Factor out test strings for reuse.

More ambitious Unicode CSV test.

9c6de7a857fc5d7e916b4cf4f51fa4fd40a1d0c1 authored almost 10 years ago by Nelson Minar <[email protected]>
Bumped to version 1.3.1 with expanded setup.py description

904b3a6ec92894b0dcd3143533e1cf9b47201c9a authored almost 10 years ago by Michal Migurski <[email protected]>
Update setup.py

Adding `description` to `setup.py`

42adbdd745bb6d8ac65af04f355b6e5780098711 authored almost 10 years ago by Low Kian Seong <[email protected]>
Bumped to version 1.3.0 with support for ISO 3166 "code" tag

775bd6fdbd56565dd13d84c09b6ea4fc58583587 authored almost 10 years ago by Michal Migurski <[email protected]>
Support for csvsplit.

Also a new test suite for source CSV conversion.

82e99aea4488824398fd7c98bd85f73eabd322b5 authored almost 10 years ago by Nelson Minar <[email protected]>
Fix small bug in CSV unimplemented feature detection.

82afde10516319f6d92a15a48f01839d5e1da7a0 authored almost 10 years ago by Nelson Minar <[email protected]>
Fix bug in state writing if the file failed to download.

I introduced this bug in e7fc483ccf9b; this code path is not tested.

ff81f99ce29ab314f2cc3061a83c633f343337a9 authored almost 10 years ago by Nelson Minar <[email protected]>
Basic CSV source support; doesn't implement all CSV processing tags.

Also change how the extraction code works. The extract schema now
requires the lon/lat be written...

5d77eba6426217cf199944681c8f764bddbf5663 authored almost 10 years ago by Nelson Minar <[email protected]>
Support GeoJSON sources. Reuses existing OGR code for shapefiles.

0857b6a9b3c0d211626342bcdd844be238feeb02 authored almost 10 years ago by Nelson Minar <[email protected]>
Implement "file" tag for sources with multiple shapefiles.

Remove last vestiges of the conform code gamely trying to convert
everything it could find. This...

ce6d9a28b0273a769b8fe3f1d48c78a9a176389d authored almost 10 years ago by Nelson Minar <[email protected]>
Change the log level on download notification

a49163b589c1aa7fc926e38270ad730e0ce5e585 authored almost 10 years ago by Nelson Minar <[email protected]>
Remove last vestiges of invoking Node conform in tests.

Also reorganize the conform tests a little bit.

4329af89c5e6e59a7ab910b79cfdee46d17597e0 authored almost 10 years ago by Nelson Minar <[email protected]>
Change default log level to VERBOSE. It's quiet in here!

Add a --verbose flag to command line tools to show DEBUG.
See Issue #29 for details.

dcbafda8d9e59ccc3b6587951d87dc9aa5dc3dab authored almost 10 years ago by Nelson Minar <[email protected]>
Switch all code to _L style of logging. See Issue #29 for details.

a357428639c83444fc5715a9c6721d6a4bf24961 authored almost 10 years ago by Nelson Minar <[email protected]>
Turn off logging when running tests.

It can still be enabled with "-l" or a config file.

71f219f8f50c2b992c60be089f774950ce58b8cd authored almost 10 years ago by Nelson Minar <[email protected]>
Refactor test boilerplate

e1a84a5661c4ca0ab2ab7c472100fb1a4f86933e authored almost 10 years ago by Nelson Minar <[email protected]>
Add test for a non-ASCII shapefile.

d8929d60870a35f572673f5f8999948c9db0c244 authored almost 10 years ago by Nelson Minar <[email protected]>
Bumped to version 1.2.0 with addition of ISO-3166-2 support

0116ad16268bab92b07d79fbb3ddb77493acdd65 authored about 10 years ago by Michal Migurski <[email protected]>
Removed unprojected US shapefiles

5d49d1ed515ee23f1299f94bc6d06dd28495170b authored about 10 years ago by Michal Migurski <[email protected]>
Added rendering support for Admin 1 subdivisions

cbd7e807ab567c579ffd310da6206a6836c75847 authored about 10 years ago by Michal Migurski <[email protected]>
Added geodata for Admin 1 subdivisions with ISO-3166-2 codes

Based on Natural Earth v3.0, Generated data using fuzzy matching
script from https://gist.github...

3166dabe80a10f0a45749a4e70ceff9d6df43998 authored about 10 years ago by Michal Migurski <[email protected]>
Added sample test files based on OpenAddresses sources/jp-fukushima.json

2bcf640331362629b68216bbfb4edc3c3ba0c499 authored about 10 years ago by Michal Migurski <[email protected]>
Added sample test files based on OpenAddresses sources/es-25829.json

5cacf6389bd313adcc65ad5f589f2007f4d172fd authored about 10 years ago by Michal Migurski <[email protected]>
Configure with ~/.openaddr-logging.json. Revert file/lineno change.

190c044008c030b0b9c13f6d50c73a9eb74590bb authored about 10 years ago by Nelson Minar <[email protected]>
Refactor setup_logger. Add file and line number to log format.

8e80c1c00bfbce067abaabe6efadb957ac709b05 authored about 10 years ago by Nelson Minar <[email protected]>
Convert to _L style of module-level logger object.

26714deeeee36857173a7c8d7345e766c8aa2778 authored about 10 years ago by Nelson Minar <[email protected]>
Cleaned up a few post-thread_work() holdovers from conform()

609f2ae8686d297413ff42c12561d81db6608421 authored about 10 years ago by Michal Migurski <[email protected]>
Expand abbreviations in street names, alter case.

Code patterned after the Node code.

e49772ce896e731df3312deb05fa05562d23b467 authored about 10 years ago by Nelson Minar <[email protected]>
Don't crash when splitting a column that only has one token in it.

bddc46a9e15038fb89bfd6b954366d2eaf2359b3 authored about 10 years ago by Nelson Minar <[email protected]>
New approach to mismatching field name case.

Now we downcase all fields named in conform spec.
This change reimplements the work in 6ca127f1cee

caad468e5c333774dc66db8bdaedca1c1f3ded6c authored about 10 years ago by Nelson Minar <[email protected]>
Accept shapefile-polygon sources along with shapefile

0af6ea3f604b2baff3d663a684d044a9e56dcdeb authored about 10 years ago by Nelson Minar <[email protected]>
Handle conform specs that don't match cases of fieldnames.

6ca127f1cee5a593ee7454f0b514e1c1d6fa72e3 authored about 10 years ago by Nelson Minar <[email protected]>
Switch machine to use new Python conform code.

This change got complicated because the old Python conform code always
created a single CSV file,...

e7fc483ccf9bdaa09dfaf716f1af40a486659350 authored about 10 years ago by Nelson Minar <[email protected]>
Remove forking of subprocess for conform.

No longer needed with new task architecture.
Note: most of this diff is just reindenting existing...

0895e0655448de2afee9fca944a1ae862465a348 authored about 10 years ago by Nelson Minar <[email protected]>
Rename transform functions to avoid abbreviations.

a4cf70872a30678bfc7c5bcc2d10298307bd5f8e authored about 10 years ago by Nelson Minar <[email protected]>
Working merge, split with tests.

Migrated tests of synthetic sources from test.py into conform.py
Wrote some minimal unit tests fo...

33d95bb091fec762f0e5bdb04fedf8a70f31c465 authored about 10 years ago by Nelson Minar <[email protected]>
Refactor code for row-level transformations of data. Add tests.

9ad344a304fedece275f0a1c5672400ba25a8430 authored about 10 years ago by Nelson Minar <[email protected]>
Port the first Node conform test to my new Python code

Create a new test suite inside openaddr/conform.py
Remove old node & python test for lake-man, m...

ade92967fc08c3abcae8f8e9ce0b173ab48eec49 authored about 10 years ago by Nelson Minar <[email protected]>
Street and number attribute tags for conform command line tool.

Change the old conform CSV schema a bit; writes X and Y instead of centroid.
With this commit the...

8699f761f422816ace89bd604020e9b32221b7be authored about 10 years ago by Nelson Minar <[email protected]>
Require unicodecsv module and switch conform to use it.

The stock Python 2 CSV module doesn't do well with Unicode, this
shim helps. The Python 3 CSV mo...

b6db84b0d402983da40b6ab75228dd8588d99a1f authored about 10 years ago by Nelson Minar <[email protected]>
Fixed README

abb3cf2619398408a33d8a02418859ea10718926 authored about 10 years ago by Michal Migurski <[email protected]>
Renamed openaddr.process to openaddr.process_all

a3da841b8ab43986571996b62cbafdd8b10dc3af authored about 10 years ago by Michal Migurski <[email protected]>
Renamed openaddr.process2 to openaddr.process_one

880856b6c712b113faad2cc901209835aea0bb19 authored about 10 years ago by Michal Migurski <[email protected]>
Refactor ConvertToCsvTask, extracting a shp_to_csv method.

d07e1f8ccb3fc08c29c9ed26ccacd285c63e286b authored about 10 years ago by Nelson Minar <[email protected]>
Use underscores, not camel case. Also add a required command line argument.

a1c8f3b510e00d9c601d4195f72538f00a493e19 authored about 10 years ago by Nelson Minar <[email protected]>
Refactor a bit where the filenames are created.

8a3d4e5710013f8b96908cb996591fba668862aa authored about 10 years ago by Nelson Minar <[email protected]>
skeleton openaddr-pyconform command line tool

f49023c90a4d86a0b3923bad84805eb95ca5cdab authored about 10 years ago by Nelson Minar <[email protected]>
Consolidated S3 upload code to upload_file()

3d283608158b9962fef93ef37f5bfae45f5ba627 authored about 10 years ago by Michal Migurski <[email protected]>
Changed S3 key names to match existing behavior

6fd41d630e5388b3f0d6d8e46ee17a57ad59f49d authored about 10 years ago by Michal Migurski <[email protected]>
Moved definition of 'out' directory to process.process()

6f139bec3e187c6d52eee71a43dcce722770c994 authored about 10 years ago by Michal Migurski <[email protected]>
Added temp dir cleanup to test tear-downs

239a3c63322df67d76971b969d0fae009b0e433d authored about 10 years ago by Michal Migurski <[email protected]>
Located process2() temp dir inside destination dir

fa08a79e4d5d3a5184d7256ee0c73bf08743bcf1 authored about 10 years ago by Michal Migurski <[email protected]>
Located cache() and conform() temp dirs inside destination dir

f241f15e239cedc45d970d1f3dd30e3801c90563 authored about 10 years ago by Michal Migurski <[email protected]>
Trying to track down the mystery Travis error

b436413c5e7b95e72f86d79aded3da369136aafe authored about 10 years ago by Michal Migurski <[email protected]>
Replaced StringIO with BytesIO for Python 3

49f0e06fd4bab2ca215e45521bfac9dac3f78d72 authored about 10 years ago by Michal Migurski <[email protected]>
Replaced six with future after discussion in issue #17

c6d2d78ab9fd1fc792674ea5422fbb249153c75e authored about 10 years ago by Michal Migurski <[email protected]>
Removed more unused code

6f3fbbb547d0499a66483f8020979befc7b33830 authored about 10 years ago by Michal Migurski <[email protected]>
Switched print statement in process2

bfaf44e606e50afd71dbb8bd5fb76363a3aa2f45 authored about 10 years ago by Michal Migurski <[email protected]>
minor fixes for encoding, one exception

a6a5f27fc4c69e20b5b990ec7804cf258ea55864 authored about 10 years ago by Nelson Minar <[email protected]>
open pickled files in binary mode

Conflicts:
test.py

342bd3a79b9627c3d9f148bdc6ba53a1736f2de7 authored about 10 years ago by Nelson Minar <[email protected]>
python3 can now parse the code from test.py

Conflicts:
openaddr/__init__.py

b1855df6567aa9fdafd99890d008874ba373a376 authored about 10 years ago by Nelson Minar <[email protected]>
Removed some unneeded imports

d5d4089108cd23ae4c688b75fb6b40eb74104d46 authored about 10 years ago by Michal Migurski <[email protected]>
Simplify logging config; all messages now logged to a 'openaddr' logger

2cd120d5669ca884dc01c2ce2881e206c158e511 authored about 10 years ago by Nelson Minar <[email protected]>
Removed excerpt() and ExcerptResult, issue #23

d102e1e5b64deffe67cdc534dfaed3128971ec2a authored about 10 years ago by Michal Migurski <[email protected]>
Removed references to ExcerptResult from process2

25ae09b5994ab06e48692458d80ba8720424ce7c authored about 10 years ago by Michal Migurski <[email protected]>
Added geometry_type to ConformResult

65bb6d54a55a09c3b4bb955db7ac443d52b7036b authored about 10 years ago by Michal Migurski <[email protected]>
Awkwardly removing references to excerpt() for issue #23

3eb3e97642bb59e1508440761059018a6de2c9ba authored about 10 years ago by Michal Migurski <[email protected]>
Moved sample_geojson() use to ExcerptDataTask.excerpt()

33ec0cdef8530f6f90c0adf1256bdc1191170620 authored about 10 years ago by Michal Migurski <[email protected]>
Excised S3 entirely from cache() and conform(), closes #22

2defbac82938aabd01e6c4c9c0041ad3da628328 authored about 10 years ago by Michal Migurski <[email protected]>
Removed S3 dependency from conform() for issue #22

94caf1fbea714533d6f1f235f8a6a515bde3b5a5 authored about 10 years ago by Michal Migurski <[email protected]>
Removed S3 dependency from cache() for issue #22

ddace43b000dd92a216dd7d0d860da61a6028dd8 authored about 10 years ago by Michal Migurski <[email protected]>
Cleaned up piles of unneeded code

fc2fff8eb87f5ab492189b0fdf9c481bd143c50d authored about 10 years ago by Michal Migurski <[email protected]>
Added post-process upload and passed all tests

f1191b875c77820877e4c966dddba80a68bf703c authored about 10 years ago by Michal Migurski <[email protected]>
Collected individual output states into one list

09655c9999b4aca269a2f5d2be43310774b051c0 authored about 10 years ago by Michal Migurski <[email protected]>