Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

libpostal

libpostal is a C library for parsing/normalizing street addresses around the world
Collective - Host: opensource - https://opencollective.com/libpostal - Code: https://github.com/openvenues/libpostal

[fix] key checks for Quattroshapes cities, removing city in non-local language case

github.com/openvenues/libpostal - 2e0f35b13a58b96258cd74c829ef981d1089d30f authored about 9 years ago by Al <[email protected]>
[fix] argument order

github.com/openvenues/libpostal - 3eea35535274ad912ca4c9fd430c5850d3a06fc5 authored about 9 years ago by Al <[email protected]>
[fix] import again

github.com/openvenues/libpostal - 51f6a827275e27dec7276062c999aa86fc485369 authored about 9 years ago by Al <[email protected]>
[osm/formatting] Adding in cities from Quattroshapes/GeoNames in the case of non-local languages or in general with a small random probability

github.com/openvenues/libpostal - 283098607331510e62f2184877d498a5369ace77 authored about 9 years ago by Al <[email protected]>
[fix] only care about levels in Quattroshapes index, not Zetashapes

github.com/openvenues/libpostal - b0667d003241eea1a59f1025a1e2969c2735e14a authored about 9 years ago by Al <[email protected]>
[fix] Same in neighborhoods reverse geocoder lookups

github.com/openvenues/libpostal - 0eb0042826d22a3add44381cee362942afcedaf0 authored about 9 years ago by Al <[email protected]>
[fix] same options for geohash-based index

github.com/openvenues/libpostal - 4170f6e9e3d8b10c6d5f3f3aa7e359ec0df6188c authored about 9 years ago by Al <[email protected]>
[fix] Quattroshapes neighborhoods index uses geohashes for slightly better coverage

github.com/openvenues/libpostal - 4cff1f8a9d62a0e442281e22a2cdc2260d90b360 authored about 9 years ago by Al <[email protected]>
[polygons/quattroshapes] Converting Quattroshapes lookups to an R-tree index

github.com/openvenues/libpostal - 98d8054a2b147ab3c263d4c5c01d4ad0bcb6d10a authored about 9 years ago by Al <[email protected]>
[polygons/quattroshapes] Removing local admin and neighborhoods from the Quattroshapes reverse geocoder since they're covered in neighborhoods

github.com/openvenues/libpostal - bd88628a98fccf9e23d624ed88173a69028fd3ac authored about 9 years ago by Al <[email protected]>
[polygons/osm] Switching back to buffer(0). Still destroys many polygons, may need to look into another solution

github.com/openvenues/libpostal - 40d18aa7f67eeb980dd87be532d86d6c6d364309 authored about 9 years ago by Al <[email protected]>
[polygons/osm] Omitting last node in every way of a connected component since that node is equal to the start node of its neighbor

github.com/openvenues/libpostal - a50c971732de0316bc677b8b81a2196b6e9d913a authored about 9 years ago by Al <[email protected]>
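
The node-dropping idea in the commit above can be sketched as follows. This is a minimal illustration rather than the code from the commit: the function name `stitch_ways` is hypothetical, and it assumes the ways are already ordered and oriented head to tail.

```python
def stitch_ways(ways):
    """Join ordered, head-to-tail ways into one closed ring of node IDs.

    Hypothetical helper: assumes each way's last node equals the first
    node of the next way, and the last way ends where the first begins.
    """
    ring = []
    for way in ways:
        # Skip the final node; the neighboring way starts with it.
        ring.extend(way[:-1])
    if ring:
        # Close the ring explicitly by repeating the first node.
        ring.append(ring[0])
    return ring


ways = [[1, 2, 3], [3, 4, 5], [5, 6, 1]]
print(stitch_ways(ways))
```

Dropping the shared node before concatenating avoids duplicate consecutive points, which polygon libraries may treat as degenerate segments.
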
[geonames] Adding ability to look up GeoNames alternate names (may obtain IDs from Quattroshapes). Not great for local-language primary names (OSM remains the best) but decent for extracting foreign toponyms

github.com/openvenues/libpostal - d6d5eab9890664f80ffc1303d02aa40ed3c4accc authored about 9 years ago by Al <[email protected]>
[fix] add country randomly in the formatted language training data in cases where country is not present

github.com/openvenues/libpostal - 3217fa39cd4312833f47003c721e7274d43909c8 authored about 9 years ago by Al <[email protected]>
[fix] Python float precision doesn't appear to be the problem

github.com/openvenues/libpostal - 1a6618957bceffa6dafd96edcaeac7908a43ccf0 authored about 9 years ago by Al <[email protected]>
[fix] For countries like Denmark, removing country with a smaller probability

github.com/openvenues/libpostal - 5781813cbd48ee32ae21336ee572f7a786dc989a authored about 9 years ago by Al <[email protected]>
[fix] sparsity of country tags should be enough for language address training data

github.com/openvenues/libpostal - e4b8349d98086f394f5c72eb9201c7ee49bcac57 authored about 9 years ago by Al <[email protected]>
[fix] Cutting down training repeatedly on country names

github.com/openvenues/libpostal - 824c779107ab31b27c2bfd7be543c96910bdd803 authored about 9 years ago by Al <[email protected]>
[fix] country formatting in language address training data

github.com/openvenues/libpostal - 88529d28e20a59ed6af4b2d47a37a4099493c951 authored about 9 years ago by Al <[email protected]>
[fix] not requiring minimal keys in format language data

github.com/openvenues/libpostal - cd74fcda3c7c581e28d2d9ae1bc2a1bb5f4ed130 authored about 9 years ago by Al <[email protected]>
[osm] Adding new localized country names in language training data for formatted addresses

github.com/openvenues/libpostal - 8c422a6e611b13cfe4c4a7d8e71ad5c5ead1b64b authored about 9 years ago by Al <[email protected]>
[fix] Removing house numbers from formatted address language training data, using a simple whitespace splitter

github.com/openvenues/libpostal - e40ca0bb8933ab229c47d91795d5548c3bd54931 authored about 9 years ago by Al <[email protected]>
[osm] Trying fixed-point precision in converting OSM coordinates to avoid issues with polygon self-intersection when the lines are very close together (e.g. parts of Berlin, UK country polygon)

github.com/openvenues/libpostal - a92cbb80037594d7670fb3f99fcd4689ad2e54aa authored about 9 years ago by Al <[email protected]>
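
The commit above does not say how the fixed-point conversion was done. One possible approach, sketched here under assumptions, is to parse each coordinate string exactly and store it as an integer count of 1e-7 degrees. The names `SCALE`, `to_fixed` and `from_fixed` are hypothetical, and the scale of 7 decimal places is an assumption.

```python
from decimal import Decimal, ROUND_HALF_EVEN

SCALE = 10 ** 7  # assumed: 7 decimal places, roughly centimeter resolution


def to_fixed(coord):
    """Parse a coordinate string into an integer number of 1e-7 degrees."""
    scaled = Decimal(coord) * SCALE
    return int(scaled.to_integral_value(rounding=ROUND_HALF_EVEN))


def from_fixed(value):
    """Convert a fixed-point integer back to a Decimal number of degrees."""
    return Decimal(value) / SCALE


print(to_fixed("52.5200066"), from_fixed(525200066))
```

Working on integers means two nearly coincident vertices either compare equal or differ by at least one unit, which avoids the tiny float differences that can make close lines appear to cross.
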
[fix] limited addresses

github.com/openvenues/libpostal - e75c1ce8603d63c777cf401f11b2a626dab8930b authored about 9 years ago by Al <[email protected]>
[fix] argument validation in OSM training data script

github.com/openvenues/libpostal - 94039f98ad9fcbd46ed6183e9e9a87a1316fae25 authored about 9 years ago by Al <[email protected]>
[polygons] Trying a slightly higher value for buffer() as suggested by this issue https://github.com/Toblerity/Shapely/issues/277

github.com/openvenues/libpostal - de9f3120c8e573a3a1f7fbcedb2c3bf136fceb09 authored about 9 years ago by Al <[email protected]>
[osm] Using OSM namespaced tags from polygons in the case of non-local languages

github.com/openvenues/libpostal - 6d20d7348f67ebbf2359217869a3ea2d64a71d8e authored about 9 years ago by Al <[email protected]>
[fix] ISO code and simple/international name checks should be on the polygons

github.com/openvenues/libpostal - e46e1a93a0119dad30462ce792c97975a97b30dc authored about 9 years ago by Al <[email protected]>
[fix] Making country replacement probability independent of the probability used for local vs non-local languages

github.com/openvenues/libpostal - eb7488ab556a4dcb4145b0ce4d4bf77dfba5cda8 authored about 9 years ago by Al <[email protected]>
[fix] var, non-local languages

github.com/openvenues/libpostal - f4f7cceba28cdcd6c43cd0c512be167fe54f8fdb authored about 9 years ago by Al <[email protected]>
[fix] Moving is_in:country to lower priority

github.com/openvenues/libpostal - 6aa640b5f007b0af7ab3190f054ea59c8864cc72 authored about 9 years ago by Al <[email protected]>
[osm] Using name:simple and int_name to capture more variations for US addresses, adding ISO codes occasionally instead of names

github.com/openvenues/libpostal - 2b1c346fde2fcccc159670b6382c1e7ea087df1c authored about 9 years ago by Al <[email protected]>
[osm/formatting] replacing keys with the highest priority so addr:* tags take precedence over is_in:* tags

github.com/openvenues/libpostal - f1b6620369bcf7546e08efb2a832d73f7dcb0a33 authored about 9 years ago by Al <[email protected]>
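
A rough sketch of the precedence rule described in the commit above: when several OSM keys map to the same formatter component, keep the value from the key with the highest priority. The alias table, its priority numbers and the function name `resolve_components` are illustrative assumptions, not the project's actual configuration.

```python
# Hypothetical alias table: OSM key -> (formatter component, priority).
# Lower numbers win; the priorities here are illustrative only.
ALIASES = {
    "addr:city": ("city", 0),
    "is_in:city": ("city", 1),
    "addr:country": ("country", 0),
    "is_in:country": ("country", 2),
}


def resolve_components(tags):
    """Pick one value per component, preferring higher-priority keys."""
    best = {}
    for key, value in tags.items():
        if key not in ALIASES:
            continue  # not an address-related key in this sketch
        component, priority = ALIASES[key]
        if component not in best or priority < best[component][0]:
            best[component] = (priority, value)
    return {component: value for component, (_, value) in best.items()}


tags = {"is_in:city": "Berlin (Mitte)", "addr:city": "Berlin", "name": "Cafe"}
print(resolve_components(tags))
```
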
[osm] Shortening state names obtained from reverse geocoding for relevant countries

github.com/openvenues/libpostal - 2695b5dd2641fd27f13b82b7b205a1cf9c8b47e9 authored about 9 years ago by Al <[email protected]>
[osm] Change probabilities for country names

github.com/openvenues/libpostal - 8b035814c7e926aa02266f281af79adc5537469a authored about 9 years ago by Al <[email protected]>
[fix] non-integer admin levels

github.com/openvenues/libpostal - 04183c672e1fb34b68445b6c74c4143025098bd4 authored about 9 years ago by Al <[email protected]>
[fix] another issue with tokenize API

github.com/openvenues/libpostal - efa0e38e4525a4e0ea0a6c78ad77aa429edcc1ce authored about 9 years ago by Al <[email protected]>
[fix] using new pypostal tokenize API

github.com/openvenues/libpostal - ce065bb9ecb09e076c6998105b30862905841d2e authored about 9 years ago by Al <[email protected]>
[fix] reverting to old Rtree index filename

github.com/openvenues/libpostal - f77ddc71e7644ea0c4c86e6c43b6d227400a70e2 authored about 9 years ago by Al <[email protected]>
[fix] default arg again

github.com/openvenues/libpostal - 4f0d6fbf796fdb4e6b966edafcfde643c3f13948 authored about 9 years ago by Al <[email protected]>
[fix] doc and default arg

github.com/openvenues/libpostal - 4cc275e3132fc56e8b04dbb45804983c6b59eb4c authored about 9 years ago by Al <[email protected]>
[osm/formatting] Adding OSM polygon lookups and neighborhood polygon lookups to the training data in order to provide more variations for the model to work with

github.com/openvenues/libpostal - c8f47b38a2938c62c5042a488525d1b5cb3cd02e authored about 9 years ago by Al <[email protected]>
[fix] OSM reverse geocoder polygon ordering

github.com/openvenues/libpostal - 9fc60600dd502ea371876f383f35012b8133488b authored about 9 years ago by Al <[email protected]>
[polygons] OSM reverse geocoder sort levels

github.com/openvenues/libpostal - 130518fe586627deed3cf2f8a60c924e4ddd4944 authored about 9 years ago by Al <[email protected]>
[osm] Adding global keys which map to OSM address components

github.com/openvenues/libpostal - b948a8ebd84495251762b28831b85ab79f3cbad1 authored about 9 years ago by Al <[email protected]>
[formatting] Adding city_district and state_district tags to address formatting templates where it makes sense. These will not be in all addresses, tags can be added and removed from the training data with certain probabilities

github.com/openvenues/libpostal - 85667997cd56c4bccd0e43a579fabc193d25f4f2 authored about 9 years ago by Al <[email protected]>
[formatting] Adding configs for a few dozen countries mapping OSM admin level to an address formatter field

github.com/openvenues/libpostal - 470bd17c074490b203f93cbe23dfae6a43d9006a authored about 9 years ago by Al <[email protected]>
[osm] Adding a few more boundary types to planet admin borders

github.com/openvenues/libpostal - 946bce1cb9bb5061fd79641cf6a7136a82f90781 authored about 9 years ago by Al <[email protected]>
[formatting] Adding OSM address components lookup by country

github.com/openvenues/libpostal - b3ef8ded1210ec6b77659d180846763d797fee99 authored about 9 years ago by Al <[email protected]>
[formatting] Adding city_district as a separate format tag

github.com/openvenues/libpostal - 0b74039a6ab12103d1649ee8f7a62dc0ce66c1c1 authored about 9 years ago by Al <[email protected]>
[fix] Reverting last two changes, have to fix on the OSM side

github.com/openvenues/libpostal - 48a305c8c4672119f1e241b44db7757d0ff68009 authored about 9 years ago by Al <[email protected]>
[polygons] Only fixing polygons in cases with inner rings

github.com/openvenues/libpostal - 90773294b985440836a85407c992b246c218db45 authored about 9 years ago by Al <[email protected]>
[polygons] Eliminating fix_polygon

github.com/openvenues/libpostal - 477300c06146ba195fac22a2a95cee08ce2935f2 authored about 9 years ago by Al <[email protected]>
[polygons/neighborhoods] Not counting local admin polys unless they match OSM, fix for Paris arrondissements

github.com/openvenues/libpostal - 1dbfc6a87bacaccc49a650459ebfe507f3eae0a3 authored about 9 years ago by Al <[email protected]>
[fix] Don't need Quattroshapes dir for OSM Rtree

github.com/openvenues/libpostal - 4fdaef26384e5953596c3a3b2e89a4817ef2b8bd authored about 9 years ago by Al <[email protected]>
[fix] default index path

github.com/openvenues/libpostal - e9a6ea1d7210bf545f12f4de553bb4d276930538 authored about 9 years ago by Al <[email protected]>
[fix] Removing some debug code

github.com/openvenues/libpostal - 0227d9335f7d01c43267fb10d3306c7978648a0e authored about 9 years ago by Al <[email protected]>
[fix] command-line arg II

github.com/openvenues/libpostal - 5882c2d64b944a5f5d40cd44758739148703598d authored about 9 years ago by Al <[email protected]>
[fix] command-line arg

github.com/openvenues/libpostal - 66f8a2dc9e8e1e28681119c970acf9e3884d079d authored about 9 years ago by Al <[email protected]>
[fix] argument default

github.com/openvenues/libpostal - e5d8812504574cb1aedd6382366930f463ceaca4 authored about 9 years ago by Al <[email protected]>
[fix] encoding yet again

github.com/openvenues/libpostal - 8166cd66c8fda49aa829890a6347abe4cf387a68 authored about 9 years ago by Al <[email protected]>
[fix] encoding, different file

github.com/openvenues/libpostal - f473ff0dad356601c90b0f35663d9561e1829c71 authored about 9 years ago by Al <[email protected]>
[fix] file encoding

github.com/openvenues/libpostal - a2eb40109cc6029367c0c0b63586af11e57e48f5 authored about 9 years ago by Al <[email protected]>
[polygons/osm] Adding a unified neighborhood reverse geocoder incorporating Zetashapes, OSM and Quattroshapes. Uses the new Soft TFIDF implementation to approximately match OSM names to Quattroshapes/Zetashapes names and geohash indices for more coarse point-in-polygon tests (OSM neighborhoods are stored as points not polygons, so need to match with a geometry from the other sources)

github.com/openvenues/libpostal - 3e43ac725524d2839cac238423a0fe3f6669bf43 authored about 9 years ago by Al <[email protected]>
[similarity] Adding NameDeduper base class for deduping geographic names using the new Soft TFIDF similarity

github.com/openvenues/libpostal - a38624ba59ae69ca63fa6e063cfda364fb2cc172 authored about 9 years ago by Al <[email protected]>
[similarity] Adding Jaccard similarity with word frequencies instead of simple sets, better for ideographic scripts (Han, Hangul, etc.) in the absence of word segmentation since there may be many high frequency characters

github.com/openvenues/libpostal - a5c12960449562b16a6c2e2c3fdbdd694770e205 authored about 9 years ago by Al <[email protected]>
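
One reading of "Jaccard similarity with word frequencies" in the commit above is a multiset Jaccard: sum the per-token minimum counts and divide by the sum of the per-token maximum counts. The sketch below shows that reading only; the function name `jaccard_counts` is hypothetical and the commit's actual formula may differ.

```python
from collections import Counter


def jaccard_counts(tokens1, tokens2):
    """Multiset Jaccard: sum of min counts over sum of max counts."""
    c1, c2 = Counter(tokens1), Counter(tokens2)
    intersection = sum((c1 & c2).values())  # per-token minimum
    union = sum((c1 | c2).values())         # per-token maximum
    return intersection / union if union else 0.0


# With characters as tokens, repeated characters now affect the score.
print(jaccard_counts("aab", "ab"))
```

With plain sets, `"aab"` and `"ab"` would score 1.0; counting occurrences separates them, which matters when names are compared character by character and a few characters occur very often.
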
[python/normalize] importing options from the C module

github.com/openvenues/libpostal - cbeb08f1d16c978efb9197ea41324cf65589fb16 authored about 9 years ago by Al <[email protected]>
[similarity] Using Soft-TFIDF for approximate name matching. Soft-TFIDF is a hybrid string distance metric which balances local token similarities (using Jaro-Winkler similarity by default) allowing for slight spelling errors with global TFIDF statistics so that very frequent words don't affect the score as much

github.com/openvenues/libpostal - cccc3e9cf5bd35d6e5048a4e25125c7982d5be42 authored about 9 years ago by Al <[email protected]>
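
The hybrid metric described in the commit above can be sketched in a few lines. This is not the project's implementation: it substitutes `difflib.SequenceMatcher.ratio()` for Jaro-Winkler to stay within the standard library, the threshold of 0.8 is arbitrary, tokens missing from the `idf` mapping get an assumed weight of 1.0, and all function names are hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def token_sim(a, b):
    # Stand-in for Jaro-Winkler: difflib's similarity ratio in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def tfidf_vector(tokens, idf):
    """L2-normalized TF-IDF weights; unknown tokens get IDF 1.0 (assumed)."""
    counts = Counter(tokens)
    weights = {t: math.log(1 + c) * idf.get(t, 1.0) for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def soft_tfidf(tokens1, tokens2, idf, theta=0.8):
    """Sum weight products over tokens whose best match clears theta."""
    v1 = tfidf_vector(tokens1, idf)
    v2 = tfidf_vector(tokens2, idf)
    score = 0.0
    for w, weight1 in v1.items():
        best, best_sim = None, 0.0
        for v in v2:
            s = token_sim(w, v)
            if s > best_sim:
                best, best_sim = v, s
        if best is not None and best_sim >= theta:
            score += weight1 * v2[best] * best_sim
    return score


print(soft_tfidf(["main", "stret"], ["main", "street"], idf={}))
```

Identical token lists score 1.0, a small misspelling lowers the score slightly, and tokens with a low IDF contribute less to the total.
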
[python/normalize] Adding remove parentheses options in Python normalize (would require compiling with the scanner to do it from C, but could switch)

github.com/openvenues/libpostal - e7f783477f5da1c5ce7c0ac0b49ef8b71f830ef0 authored about 9 years ago by Al <[email protected]>
[similarity] Adding an in-memory IDF index for weighted similarities

github.com/openvenues/libpostal - 5076c0409b32e3d646b9390beb6eca3cf5fb7007 authored about 9 years ago by Al <[email protected]>
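
An in-memory IDF index of the kind named in the commit above might look like the sketch below. The class name `IDFIndex`, its methods and the smoothing formula are assumptions made for illustration.

```python
import math
from collections import Counter


class IDFIndex:
    """Hypothetical in-memory document-frequency index."""

    def __init__(self):
        self.doc_freq = Counter()
        self.num_docs = 0

    def add(self, tokens):
        # Count each token at most once per document (name).
        self.doc_freq.update(set(tokens))
        self.num_docs += 1

    def idf(self, token):
        # Smoothed IDF (assumed formula) so unseen tokens stay finite.
        return math.log((1 + self.num_docs) / (1 + self.doc_freq[token])) + 1.0


index = IDFIndex()
for name in (["rue", "de", "paris"], ["rue", "de", "lyon"], ["avenue", "de", "nice"]):
    index.add(name)
print(index.idf("de"), index.idf("paris"))
```

A token that appears in every name, such as `de` here, receives the lowest weight, so it contributes little to a weighted similarity score.
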
[osm/formatting] Adding is_in tags to the address formatter as they're common in OSM, aliasing addr:district to state_district instead of suburb

github.com/openvenues/libpostal - 1c543a5271b96653d2c27e4ab0d58605ee5d1168 authored about 9 years ago by Al <[email protected]>
[osm] Adding a list of various OSM name tags obtained from Nominatim

github.com/openvenues/libpostal - c7df3fcb3aba329fcba2b5893532ba8d63b177fd authored about 9 years ago by Al <[email protected]>
[fix] using tokenize_raw API

github.com/openvenues/libpostal - cee9da05d61b0472a04d9a319557a4e41057d908 authored about 9 years ago by Al <[email protected]>
[polygons] Polygon area calculations

github.com/openvenues/libpostal - 110451d6d60a39ecdf6027f13a26fb5449a6f179 authored about 9 years ago by Al <[email protected]>
[polygons] Changing language polygon index to use new index_polygon method

github.com/openvenues/libpostal - e946e63222689dd8ad021358b6d92329ad83031f authored about 9 years ago by Al <[email protected]>
[polygons] Adding a geohash polygon index which selects a prefix size based on the area of the polygon's bounding box

github.com/openvenues/libpostal - 5fdbb7e832ff9ba21dd8b1fe5309236d402dccda authored about 9 years ago by Al <[email protected]>
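
To illustrate choosing a geohash prefix length from a bounding box, the sketch below picks the longest prefix whose cell is still at least as large (in square degrees) as the box. The selection rule, the limits of 1 to 8 characters and the function names are assumptions; only the geohash encoding itself follows the standard algorithm.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision):
    """Standard geohash: interleave longitude and latitude bits."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        bit_count += 1
        use_lon = not use_lon
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def cell_size_degrees(precision):
    """Width and height in degrees of a geohash cell at this precision."""
    lon_bits = (5 * precision + 1) // 2
    lat_bits = (5 * precision) // 2
    return 360.0 / 2 ** lon_bits, 180.0 / 2 ** lat_bits


def choose_precision(bbox, min_precision=1, max_precision=8):
    """Longest prefix whose cell area still covers the bbox area (assumed rule)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    area = (max_lon - min_lon) * (max_lat - min_lat)
    for precision in range(max_precision, min_precision - 1, -1):
        width, height = cell_size_degrees(precision)
        if width * height >= area:
            return precision
    return min_precision


bbox = (13.0, 52.0, 14.0, 53.0)  # (min_lon, min_lat, max_lon, max_lat)
precision = choose_precision(bbox)
print(precision, geohash_encode(52.5, 13.4, precision))
```

Large polygons end up under short prefixes and small ones under long prefixes, so a point lookup only needs to check a handful of prefix lengths before running exact point-in-polygon tests.
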
[dictionaries] adding Jnr and Snr forms for generational suffixes

github.com/openvenues/libpostal - 094a5bf5f46a765ddb86c595b8611a65c7402bca authored about 9 years ago by Al <[email protected]>