Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

libpostal

libpostal is a C library for parsing/normalizing street addresses around the world
Collective - Host: opensource - https://opencollective.com/libpostal - Code: https://github.com/openvenues/libpostal

[build] setup.py include/library dirs

github.com/openvenues/libpostal - d8f731b6727f208cbf98f98f584e2f1754d0ed47 authored about 9 years ago by Al <[email protected]>
[python] libpostal includes

github.com/openvenues/libpostal - faf8b005967bb9a504459a84ed248708c93a1e27 authored about 9 years ago by Al <[email protected]>
[build] Adding include and library dirs based on autoconf prefix

github.com/openvenues/libpostal - cb648b63da91f5df2ff7fea417f1af09c841a20e authored about 9 years ago by Al <[email protected]>
[fix] standard headers in new extensions

github.com/openvenues/libpostal - 7cf48acd2049c96a1061f4ec41f7abc5fedb8f80 authored about 9 years ago by Al <[email protected]>
[build] bumping Python version

github.com/openvenues/libpostal - bec43750d5fcbaa549c5a177fa8fb933bc291c05 authored about 9 years ago by Al <[email protected]>
[build] setup.py changes for parser extension

github.com/openvenues/libpostal - 33fdb912b6db220db4ce7f691cb573a222fd1f51 authored about 9 years ago by Al <[email protected]>
[python] Forgot expand.py

github.com/openvenues/libpostal - c40ab06dd6b1eddca9429fc0f1cdcc9ec5f80b8b authored about 9 years ago by Al <[email protected]>
[python] Adding address parser Python API

github.com/openvenues/libpostal - 842ef4526bb079357a7fae1975bab0cda3e8ca54 authored about 9 years ago by Al <[email protected]>
[fix] Moving address_parser_response_destroy into libpostal so caller can free

github.com/openvenues/libpostal - b9bf5c629e03e40d4b85d3f0a081cb41b143055f authored about 9 years ago by Al <[email protected]>
[python/build] Modified install command for setup.py allowing --datadir and --prefix to be passed in. If there's a virtualenv active and nothing else is specified, install libpostal and its data files there by default

github.com/openvenues/libpostal - ab3ba249d7d215e193edf2195913e9070c874e32 authored about 9 years ago by Al <[email protected]>
[python] Adding Python bindings to the expand API

github.com/openvenues/libpostal - 7af0e2d967e801525aff19ff5a290e09a98f4fa4 authored about 9 years ago by Al <[email protected]>
[fix] warning about size_t

github.com/openvenues/libpostal - b59c830ba6851a7bd2c9ee86b864d29d5b57ea0b authored about 9 years ago by Al <[email protected]>
[api] Separating parser setup/teardown into two separate methods

github.com/openvenues/libpostal - 406f9c533d2315f05fab8f9b487c197018b9db17 authored about 9 years ago by Al <[email protected]>
[fix] Python 3 version of tokenize/normalize

github.com/openvenues/libpostal - 0f52f976218ac55a1064f3e627684426eccdd485 authored about 9 years ago by Al <[email protected]>
[fix] changing labels in Python normalize, adding a NULL check

github.com/openvenues/libpostal - 3401045b4f2a4866c64a0472f1cd144d5bfaf482 authored about 9 years ago by Al <[email protected]>
[fix] size_t in benchmark script

github.com/openvenues/libpostal - 43b212a09b8f28fdb7bd7e287c851686da025fef authored about 9 years ago by Al <[email protected]>
[math] Adding an aligned memory allocator for vectors to help with vectorization/SIMD

github.com/openvenues/libpostal - dc03c83bb25860ae54826e8879566840f887a56b authored about 9 years ago by Al <[email protected]>
[fix] default address parser dir

github.com/openvenues/libpostal - bd1e8ecaf880f2cde9b523fc7ad9359dd7e7c876 authored about 9 years ago by Al <[email protected]>
[build] address_parser client now links to libpostal, adding address_parser to download script with an "all" option

github.com/openvenues/libpostal - 2950358697bde2db5b9ad8c7d612a54b0bdccea8 authored about 9 years ago by Al <[email protected]>
[api] Adding parse_address implementation to the libpostal API. GeoDB and address parser are now required. Stripping punctuation from the normalized output

github.com/openvenues/libpostal - 88836e56e1aacb354e3a5c5bdbc216ba4c61e3ff authored about 9 years ago by Al <[email protected]>
[api] Moving parse_address definition into libpostal.h

github.com/openvenues/libpostal - a8d6cc4053917d3b6048ff97da1aa1c29f69e301 authored about 9 years ago by Al <[email protected]>
[parser] Using different char_array for each of the potential phrases as token i

github.com/openvenues/libpostal - fe4c528f26325009aa3fac2c943c9286618fd2f0 authored about 9 years ago by Al <[email protected]>
[fix] removing printf

github.com/openvenues/libpostal - e6303f70f3b65c3be6741e34c2d60c8e0113413d authored about 9 years ago by Al <[email protected]>
[parser] Fixing possible invalid writes in training for values beginning with a separator

github.com/openvenues/libpostal - 671dd4a5d2a76726c672eba3eda9845b929f18c5 authored about 9 years ago by Al <[email protected]>
[parser] Simplifying args in address_parser_data_set_tokenize_line

github.com/openvenues/libpostal - 743b74aea5b337074e68c9e139b74ac49c611112 authored about 9 years ago by Al <[email protected]>
[osm] Fixing an issue in the training data with house numbers in OSM (seen mostly in Uruguay) where a comma-separated list of house numbers is entered.

github.com/openvenues/libpostal - 1d288954d78b619a2e1a118e45099f7adbae5847 authored about 9 years ago by Al <[email protected]>
[fix] Bug in address parser feature extraction, can hold onto the wrong pointer

github.com/openvenues/libpostal - 88b8023ac8b4dc5c47e2787f23365446b18790da authored about 9 years ago by Al <[email protected]>
[parser] Internal separators for parsing purposes include open/close parens, at sign, semicolon, etc. Ignore stray colons not internal to a word (as in Swedish abbreviations)

github.com/openvenues/libpostal - 3de59506ae5e8dc18dedb85224b3f4541133a34f authored about 9 years ago by Al <[email protected]>
[utils] Removing kvec and using similar implementation with pointers that can be passed around

github.com/openvenues/libpostal - 71d6d3c5e1bd74277c0f6e007f45d95aa4645bfd authored about 9 years ago by Al <[email protected]>
[utils] Adding a default small size to all arrays based on a look at malloc/realloc usage

github.com/openvenues/libpostal - ab205eff96c70c338a368dc1e2fe0528d720e25b authored about 9 years ago by Al <[email protected]>
[osm] In cases with more than one official language and where the address language can be determined, use it for looking up language-specific OSM polygons

github.com/openvenues/libpostal - 779298360cb6a592fdbcf4bec2d0ecd904c5d128 authored about 9 years ago by Al <[email protected]>
[osm] Randomly select up to n components for state_district OSM boundaries. For all other fields select one name at random

github.com/openvenues/libpostal - aeb72d7d26c45c6a9fa3edc330a9033b8f0cf8ef authored about 9 years ago by Al <[email protected]>
[fix] Belgium cities again

github.com/openvenues/libpostal - 2c254ebc5eab65f479393d8a2f8487781478d702 authored about 9 years ago by Al <[email protected]>
[dictionaries] adding ste to English dictionaries

github.com/openvenues/libpostal - f252869671856777321e53eb03da2db80b5729aa authored about 9 years ago by Al <[email protected]>
[osm] Choosing a language at random in countries with multilingual addresses for the parser training data so we get some monolingual examples

github.com/openvenues/libpostal - 69a469d9d35b49d5e3a53282ea0821fc135d0765 authored about 9 years ago by Al <[email protected]>
[fix] Fixes to matrix methods

github.com/openvenues/libpostal - fe37286bcf6367dba1f2d945a391a6910c416392 authored about 9 years ago by Al <[email protected]>
[math] Matrix method updates

github.com/openvenues/libpostal - d9d53ce17ed8528d7959fa798c02b5e2a25a5693 authored about 9 years ago by Al <[email protected]>
[scripts] Benchmark script using default options

github.com/openvenues/libpostal - 48ee665e71e6f56a8606bb573f014b5a108596bc authored about 9 years ago by Al <[email protected]>
[fix] multitoken canonical strings

github.com/openvenues/libpostal - 2fcc72ae07dafdc1b262a8d48a46954ab25c9dee authored about 9 years ago by Al <[email protected]>
[api] Adding place name expansions by default

github.com/openvenues/libpostal - a857138d955904f1ff6187778b1f84b339230c21 authored about 9 years ago by Al <[email protected]>
[expansion] regenerating expansion data

github.com/openvenues/libpostal - beec43fe151a4c0b82efc7fe762447cb75fa5589 authored about 9 years ago by Al <[email protected]>
[fix] canonical index in address expansion data, should be -1 for all canonical phrases

github.com/openvenues/libpostal - 35db855819c8d59bb3ec952ecbd1e553d8123b9c authored about 9 years ago by Al <[email protected]>
[expansion] Toponym dictionaries can apply to street names and place names

github.com/openvenues/libpostal - e1ea2ac70470af220348d1191067b4d025b3c069 authored about 9 years ago by Al <[email protected]>
[fix] Belgium districts

github.com/openvenues/libpostal - bfc517ae42b33812b6cdd0b57f0dd8143369d12b authored about 9 years ago by Al <[email protected]>
[expansion] The ambiguous expansions dictionary shouldn't add to the component bitset

github.com/openvenues/libpostal - cbe5cd742916db4c0e42bd6c223fa8101b7965d2 authored about 9 years ago by Al <[email protected]>
[expansion] Fixing case where non-ideographic tokens like # can potentially be concatenated with surrounding tokens and should be normalized with whitespace in between

github.com/openvenues/libpostal - d35f5196292cba644e860333a002f01c22d29ac7 authored about 9 years ago by Al <[email protected]>
[math] Signatures for array_exp and array_log

github.com/openvenues/libpostal - f5739dd42be4c45e90565c1ff5f1115bc43d4f02 authored about 9 years ago by Al <[email protected]>
[expansion] Fixing cases like ML King where a global (all languages) expansion subsumes the specific language expansion (like English)

github.com/openvenues/libpostal - 0d8d3961084cc9f770e66cc527ef65bf952589f1 authored about 9 years ago by Al <[email protected]>
[numex] Always adding a version of the string without Roman numeral expansion since many times those tokens can be ambiguous

github.com/openvenues/libpostal - 9bab70909dda7af8bf82643a33a9f41e7b08eb95 authored about 9 years ago by Al <[email protected]>
[fix] city name in OSM formatting

github.com/openvenues/libpostal - f8a3081d0fe585d25833d64e63972ed824d00ae2 authored about 9 years ago by Al <[email protected]>
[math] Only reallocate on matrix_resize if needed

github.com/openvenues/libpostal - a066ee9aad4f3913789059fe62b1a4b06ab76950 authored about 9 years ago by Al <[email protected]>
[parsing] Using the entire phrase as the ith word

github.com/openvenues/libpostal - cfd0dc69f2931a117d4851b6a040162c87c3e72c authored about 9 years ago by Al <[email protected]>
[dictionaries] Regenerating address expansion data file

github.com/openvenues/libpostal - 8186e2606e798bd48ebe2fe0a1cbbb99661fe5fb authored about 9 years ago by Al <[email protected]>
[dictionaries] Adding state abbreviations for US, CA and AU into dictionaries

github.com/openvenues/libpostal - 4dba0c54e4bfe2c99a320cdd8652c2bdc7069f28 authored about 9 years ago by Al <[email protected]>
[osm] Doing more deduping in the OSM training data to avoid confusing the parser when city, state, district all have the same name

github.com/openvenues/libpostal - b25a7380003849214100bb8a54f22a18e67e62f5 authored about 9 years ago by Al <[email protected]>
[math] Matrix resize

github.com/openvenues/libpostal - 44f7fd0844d07cc7e182489400e24d45e55e07fe authored about 9 years ago by Al <[email protected]>
[fix] prefix/suffix regexes

github.com/openvenues/libpostal - dd8f8b4d7bbc22492c08fafaae8e56a343d5fb58 authored about 9 years ago by Al <[email protected]>
[osm] Stripping standard city prefixes/suffixes e.g. Township of

github.com/openvenues/libpostal - 2a4210f93f1d6e729221b089f8e41cd520beb858 authored about 9 years ago by Al <[email protected]>
[fix] Tokenized trie search

github.com/openvenues/libpostal - 596c5ffdd323e8ad1c286c1279663cb3840329e3 authored about 9 years ago by Al <[email protected]>
[parsing] Adding a training data derived index of complete phrases from suburb up to country. Only adding bias and word features for non phrases, using UNKNOWN_WORD and UNKNOWN_NUMERIC for infrequent tokens (not meeting minimum vocab count threshold).

github.com/openvenues/libpostal - 24208c209f1f72a8d18945ce694521a207844503 authored about 9 years ago by Al <[email protected]>
[osm] Avoid using the alternate name (e.g. Brooklyn instead of Kings County) when it is the same as city

github.com/openvenues/libpostal - f41158b8b36218abfbb61f1b39327c8950c578cc authored about 9 years ago by Al <[email protected]>
[fix] osm components

github.com/openvenues/libpostal - 7c26317903cdaa70e7b5411fb5941379903c1041 authored about 9 years ago by Al <[email protected]>
[osm] Only removing local language city if there are prior components from OSM

github.com/openvenues/libpostal - 42a8890652132ed8e54b8a2489830129bcf952b2 authored about 9 years ago by Al <[email protected]>
[formatting] Switching back over to OpenCageData

github.com/openvenues/libpostal - ab0a4e622d31d109fab227e390ffb029212ba5ad authored about 9 years ago by Al <[email protected]>
[osm] Adding GeoNames abbreviated city names in a small percentage of cases to get variations like NYC, BK, SF, etc. in the training data

github.com/openvenues/libpostal - 5af95ee613a2b3c193e4b8ec1f44d08e4b65777e authored about 9 years ago by Al <[email protected]>
[fix] tokenized trie search edge case where tail is stored on the space node

github.com/openvenues/libpostal - 25e89bcc412a6eef8cee1c247e8ef8ed7097d490 authored about 9 years ago by Al <[email protected]>
[osm] Removing multilinestring boundaries from OSM polygon index (often partial boundaries e.g. France-Germany)

github.com/openvenues/libpostal - 218361f43f31b60554c475beea867e1e58b98a75 authored about 9 years ago by Al <[email protected]>
[normalization/phrases] Fixing a bug which occurs with an already-separated elision

github.com/openvenues/libpostal - 43287db90adb9a772eb9b19227195fb739eec53b authored about 9 years ago by Al <[email protected]>
[fix] path in setup.py

github.com/openvenues/libpostal - 87c04b4d37dd0a29209c4c6409416e9d923dce89 authored about 9 years ago by Al <[email protected]>
[fix] pip install command

github.com/openvenues/libpostal - 09a3e2ab6442411cb69d18d88d1279fd0cb07fba authored about 9 years ago by Al <[email protected]>
[fix] transliterate using string_equals

github.com/openvenues/libpostal - 746b5d0f3477026477e59ff19bdb1bd3f7f54b10 authored about 9 years ago by Al <[email protected]>
[utils] string_equals with NULL check

github.com/openvenues/libpostal - d0aaff1482688f4b5084fae71b4d15b3b1a86b97 authored about 9 years ago by Al <[email protected]>
[build] adding shuffle.c to Makefile rule

github.com/openvenues/libpostal - f322ae0a1c93e8c5566cf961b06144cbd225ea54 authored about 9 years ago by Al <[email protected]>
[parser] Forgot to add shuffle.h/.c

github.com/openvenues/libpostal - b94264b7459c2585425919d6db1d933c5a0a8ab3 authored about 9 years ago by Al <[email protected]>
[parser] gshuf (Mac equivalent of shuf) is quite a bit slower than shuf, so removing it. Need to train on Linux unless a better alternative is found for shuffling large files on Mac

github.com/openvenues/libpostal - 116fe857db798519b51ff9558b46319ae109319b authored about 9 years ago by Al <[email protected]>
[fix] venue names should be removed probabilistically in the training data, giving neighborhoods a slightly better chance of being included

github.com/openvenues/libpostal - 8484d4fffdafdececd5cf32cf09901bf5fd65539 authored about 9 years ago by Al <[email protected]>
[fix] dupe checking

github.com/openvenues/libpostal - 6ef40c17691efa73e9123f97e7ee68c660e7eacb authored about 9 years ago by Al <[email protected]>
[fix] Smaller probabilities on adding neighborhoods and admin polygons, eliminating duplicates on the row level

github.com/openvenues/libpostal - af170de0195d9eac2b9235084d1c1c2ebf6bffe3 authored about 9 years ago by Al <[email protected]>
[osm/formatting] Adding pick random name logic to neighborhoods as well, getting rid of drop probabilities as they're covered elsewhere, adding several forms of venue names to the training data

github.com/openvenues/libpostal - b430fb7657165b35f2d62a3bfd5d2bf684099987 authored about 9 years ago by Al <[email protected]>
[formatting] Not applying template replacements from address formatting by default

github.com/openvenues/libpostal - d4b6450f1958a6b475d9ced0326156e6199fefda authored about 9 years ago by Al <[email protected]>
[osm/formatting] Changing drop probabilities and doing it in random order

github.com/openvenues/libpostal - 839a12b2124947a40f1c84dc95efa0a0f91f1645 authored about 9 years ago by Al <[email protected]>
[parsing/build] Makefile changes for address parser

github.com/openvenues/libpostal - 5f13041140e5ab2c817dcb173222fb74a7b06e2b authored about 9 years ago by Al <[email protected]>
[parsing] Adding a command-line client (with history) to test address parsing

github.com/openvenues/libpostal - 4ca911baf8c62e7d22f67c166943c83743f3370f authored about 9 years ago by Al <[email protected]>
[parsing] Initial commit of the address parser, training/testing, feature function, I/O

github.com/openvenues/libpostal - 89677d94a374ca48a3c7a4a5a74bf0bb70133ed4 authored about 9 years ago by Al <[email protected]>
[math] Matrix file I/O

github.com/openvenues/libpostal - e62eb1e697fcbdab77553e291ea39e113f3f5311 authored about 9 years ago by Al <[email protected]>
[fix] close file handle

github.com/openvenues/libpostal - 5682c347acaa099b926f9762023e514db2d9dee5 authored about 9 years ago by Al <[email protected]>
[osm/formatting] Adding per-field drop probabilities to OSM training data to make some fields more likely to be dropped, although it might create more training data

github.com/openvenues/libpostal - 9a8ba148876af176fa5c925c511a0cacb56e4db1 authored about 9 years ago by Al <[email protected]>
[fix] Neighborhoods reverse geocoder discriminates between OSM matched with Zetashapes and OSM matched with Quattroshapes

github.com/openvenues/libpostal - c8e4602d4c25baf92516ac079d17841e91f12514 authored about 9 years ago by Al <[email protected]>
[cli] Adding antirez's linenoise for command-line interfaces

github.com/openvenues/libpostal - feab77970b68f7f746c7dcfc95159078af74a0f1 authored about 9 years ago by Al <[email protected]>
[osm/formatting] Adding in more ISO alpha-3 codes for countries in the training data

github.com/openvenues/libpostal - 15d9e0012144bd5f1b429c87d23814e394c71cc1 authored about 9 years ago by Al <[email protected]>
[fix] moving separator definitions

github.com/openvenues/libpostal - d3040036ec89ff0baa54befe4ff97560223a5643 authored about 9 years ago by Al <[email protected]>
[fix] non-local language states

github.com/openvenues/libpostal - 66778737ff1aeaad7f61ee6068add96b61914e1e authored about 9 years ago by Al <[email protected]>
[docs] updating params in OSM training data docs

github.com/openvenues/libpostal - 69ba631dc919c6b7dab1912e18f9600d2c339936 authored about 9 years ago by Al <[email protected]>