Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/openvenues/libpostal

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
https://github.com/openvenues/libpostal

[fix] all_names, use values instead of name keys

d02a18a5a8d2474f83376d5d5e16aeb33ba3ec46 authored about 8 years ago
[fix] check fixed list of keys in all_names as well

e9c7bc43e37c98ae5c2a8fb217db14b77120a65b authored about 8 years ago
[addresses] using the name key distribution in AddressComponents.all_names. Returning names and valid components from the new function instead of the full gazetteer (can be built later)

2727572822499027015f78bf9dc4404e9e502d96 authored about 8 years ago
[names] adding name_key_dist method to boundary names to account for certain boundaries like e.g. Kings County that have name exceptions

954b6548bf0c494ae0102d919e1069a6195e340f authored about 8 years ago
[addresses] separating boundary phrase gazetteer construction into its own method

d308473686eed05565e8de5b9856b5a3552ca730 authored about 8 years ago
[fix] /props/attrs/

585b203a4f69e23b4c8fd9cd0d5518b289829dfd authored about 8 years ago
[fix] name comparison in neighborhoods index

82b26117aa731c8c944e931b5a38a481825493c2 authored about 8 years ago
[utils] renaming char_array_append_vjoined to char_array_add_vjoined to follow the convention that add_* calls NUL-terminate while append_* calls do not

3ac2c93e1cfdc1639decd5f541cafae76423e86e authored about 8 years ago
[fix] var name II

8322e98ad3abdb182a438748e768a8665340cef0 authored about 8 years ago
[fix] var name

0c55bc3bb856967f9081c42aae5af492bafdfc3f authored about 8 years ago
[fix] putting the neighborhoods check after the dupe threshold check, as it's not really needed until then anyway

e5657c56125681e1209e02fea92500e19809f0ad authored about 8 years ago
[fix] don't need to do two checks for OSM boundaries

4314a6822dbbd78ed9b50e511e67850c89f7343f authored about 8 years ago
[fix] move OSM check to after ClickThatHood/Quattroshapes checks as we don't need to check the point if it doesn't match a neighborhood geometry. Should speed up neighborhood index construction

590246748f13f8ade2b9b892ca1361f9edf1d09c authored about 8 years ago
[fix] yaml config

0a1e69ee9be8ca67ad9fea2758f9f208a6a92a65 authored about 8 years ago
[openaddresses] adding new config option to OA config for aliasing fields based on a regex

86a8315b9d9ef3c5559a6883a73ed01632fec7c0 authored about 8 years ago
[neighborhoods] check polygon boundaries in OSM neighborhood points for a name match at the city level or below

d357f0f37cfbe282b3d461cddb2950bffbd9d300 authored about 8 years ago
[openaddresses] aliasing Paris/Marseilles/Lyon arrondissements to city_district in OpenAddresses

a2cf1a35dff8e63259e383d776d4563a4da5b692 authored about 8 years ago
[boundaries] adding exceptions for Arrondissements in Paris, Marseilles and Lyon

fc57c437cb85c3680f6ecded7b7bedafda0c94f9 authored about 8 years ago
[openaddresses] 5-digit postcodes for Spain, some are stored as integers stripping the initial zeros

154a227285d2512e1bae5511b793f1471a11a09b authored about 8 years ago
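
The postcode problem described in this commit (integer storage dropping leading zeros) can be illustrated with a minimal Python sketch. The function name `pad_postcode` and its `width` parameter are hypothetical and are not taken from the libpostal source; the sketch only shows the general zero-padding idea.

```python
def pad_postcode(value, width=5):
    """Left-pad an all-digit postcode with zeros up to `width` characters.

    Hypothetical helper for illustration; non-numeric values pass through.
    """
    digits = str(value).strip()
    if digits.isdigit() and len(digits) < width:
        return digits.zfill(width)
    return digits


# A Barcelona postcode stored as the integer 8001 regains its leading zero.
print(pad_postcode(8001))
print(pad_postcode("28013"))
```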
[openaddresses] fixing state abbreviations for Mexico

726ee2a299572d4228ec51de105f600bbc61a151 authored about 8 years ago
[ngrams] adding function to extract an array of ngrams from a string, with optional special prefixes/suffixes for the edges

3ed95a175eb68ff8a258b6110fc86a99e11e495f authored about 8 years ago
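
As a rough illustration of the n-gram extraction this commit describes, here is a Python sketch. It assumes character n-grams and assumes that the optional edge markers are attached to the first and last n-gram; the name `ngrams` and the `prefix`/`suffix` parameters are placeholders, and the actual C function in libpostal may behave differently.

```python
def ngrams(s, n, prefix=None, suffix=None):
    """Return the character n-grams of `s` as a list.

    `prefix` is prepended to the first n-gram and `suffix` appended to the
    last one, so that word edges can be told apart from interior n-grams.
    Both markers are assumptions made for this sketch.
    """
    if n <= 0 or len(s) < n:
        return []
    grams = [s[i:i + n] for i in range(len(s) - n + 1)]
    if prefix is not None:
        grams[0] = prefix + grams[0]
    if suffix is not None:
        grams[-1] = grams[-1] + suffix
    return grams


print(ngrams("street", 3, prefix="^", suffix="$"))
```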
[openaddresses] adding regex replacement to remove "*" from any field

3c6ed7489c701cf5c463c096871f5516887d28c5 authored about 8 years ago
[openaddresses] adding state abbreviations for OA Switzerland

f1a460b87488498785cbfa855e255dc32049db87 authored about 8 years ago
[states] adding Canton abbreviations for Switzerland

10d4979f21089a56f8130a03caa1aa0151597be0 authored about 8 years ago
[places] higher probability of adding Canton (state) for smaller cities in Switzerland

e99d76e750b5b83090d0ace3c8bc330a5b8f5657 authored about 8 years ago
[places] add state_district (province) and state (region) in Italy more often

05adbaca01d80e4a07209af30c7365ef06c77f23 authored about 8 years ago
[fix] openaddresses formatter

ba96f68b62a83783d54490718fbfe0f22bbaf27f authored about 8 years ago
[openaddresses] adding a value map for Italian province abbreviations in the countrywide file (they're commonly used in addresses and this may be a better place to handle that since the province names are given). Updating OpenAddresses config to use new dictionary field maps.

d08e8d8dd3ef8289aef7ac82ae18cfa08cef1fb9 authored about 8 years ago
[openaddresses] making field maps in OpenAddresses config a dictionary rather than a list to make inheritance easier

da3240d5f66503fe0534cad7a27aca76b1264a6f authored about 8 years ago
[openaddresses] adding option to map values for a particular field

83aab5a46a165d97456fb32b09d66f43a4cec9d6 authored about 8 years ago
[openaddresses] add city and state to Mexico City

ae32645e0d90b226130eb5dd3b33be58c23be3f7 authored about 8 years ago
[boundaries] adding a few more US non-city_districts as exceptions.

558cd2af2d0154628745e2039449e2b6b70bfb21 authored about 8 years ago
[addresses] let the place config take care of adding/removing neighborhoods rather than doing it as part of the add_neighborhoods method

846b88cde5938c5cbd24a04c74ff03599e66b0cb authored about 8 years ago
[addresses] using the defined component from the neighborhoods index for city_district (they're fairly rare, just NYC boroughs basically)

5946ead37fc8fe69bae9cd854b7dda009de569e9 authored about 8 years ago
[neighborhoods] adding component to neighborhoods index at construction time

026737cd3bf34471705db83086fae51fa24aadc4 authored about 8 years ago
[addresses] removing place_type override requirement from the neighborhoods index (NYC boroughs, etc.)

5846943b70ede1b28894e19405043105ca62bcf6 authored about 8 years ago
[geoplanet] only add short postal codes to GeoPlanet data set if they match the Google regexes

09f808ca47a5f33f2ea01b85d2d843259b039b6e authored about 8 years ago
[openaddresses] Mendocino County, CA

34db27b80c540c6943f7efb577a3235226008847 authored about 8 years ago
[neighborhoods] adjust cache size when building neighborhoods index

6b04711195b9eeab7232bb2a4e50a70b6706139d authored about 8 years ago
[addresses] only add city replacement if a city is not found first

40cd86c3be191f2e94a1a21567cf4da03cf469bc authored about 8 years ago
[openaddresses] Pierce County, WA

7e6566188452a5d02b97d643dfe8f98f5fd26ba3 authored about 8 years ago
[neighborhoods] fix neighborhoods index checks to include the borough points while still not letting something like Santa Monica pass as a neighborhood when it's a proper city

cd91068f0f1fab47072709d256c3841489691eb8 authored about 8 years ago
[openaddresses] adding Sunshine Coast, BC and Sardegna, Italy

cb475d8245dc6e0adc868bcba60830763c7bfcf6 authored about 8 years ago
Merge pull request #137 from openvenues/fix_address_parser_train

Fix address_parser_train

bcf6b3cc683d58d22e73adee00256bbf8c7641d1 authored about 8 years ago
[fix] loading transliteration module in address_parser_test.c as well

8f1e69960fd1985b2ff570cb2427989e2f3c4ee5 authored about 8 years ago
[fix] tokenized_string_t should copy its source string

5e07f5e8c50b220b5d802ccd7b22390842a03559 authored about 8 years ago
[fix] Need to load transliteration module for Latin-ASCII normalization

521a094a472d52f414e7c3603e62ef8029222324 authored about 8 years ago
[fix] calls and NULL checks

6baa7087fe490b6153896883128403d159ef0935 authored about 8 years ago
[fix] cstring_array_split calls

3939dd0ca67cdc3b9006dfcc114a7296688648fe authored about 8 years ago
[fix] brace

a42d0e917a9ad3e808b70f96fa44dce660d34320 authored about 8 years ago
[parser] Ignore multiple spaces in parser input post-normalization. If normalizing the string creates several distinct tokens (namely in Vulgar fractions e.g. ½ => 1/2), add all the sub-tokens with the same label as the parent

ced8f9ae2729972860a4142d82d36d8ce8a1550e authored about 8 years ago
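
The label propagation described in this commit can be sketched in Python as follows. This is not the parser's C implementation: it uses Unicode NFKC normalization as a stand-in for libpostal's own normalizer, a simple regular expression as a stand-in for its tokenizer, and the function names are hypothetical.

```python
import re
import unicodedata


def normalize_token(token):
    # NFKC decomposes "½" into "1", U+2044 FRACTION SLASH, "2";
    # the fraction slash is then mapped to an ordinary "/".
    return unicodedata.normalize("NFKC", token).replace("\u2044", "/")


def expand_labeled(tokens_with_labels):
    """Give every sub-token produced by normalization its parent's label."""
    expanded = []
    for token, label in tokens_with_labels:
        # Stand-in tokenizer: runs of word characters, or single symbols.
        subtokens = re.findall(r"\w+|[^\w\s]", normalize_token(token))
        expanded.extend((sub, label) for sub in subtokens)
    return expanded


print(expand_labeled([("½", "house_number"), ("Main", "road")]))
```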
[utils] Adding cstring_array_split_ignore_consecutive

b1816e9b70057480907f72f4d8de11f7de410bd4 authored about 8 years ago
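
The idea behind a split that ignores consecutive separators can be shown with a short Python sketch. It only mirrors the behavior suggested by the name `cstring_array_split_ignore_consecutive`; the C function's exact signature and its treatment of leading or trailing separators are not shown here, so those details are assumptions.

```python
def split_ignore_consecutive(s, sep):
    """Split `s` on `sep`, treating a run of separators as a single one.

    Empty fields (including any produced by leading or trailing
    separators) are dropped in this sketch.
    """
    return [part for part in s.split(sep) if part]


print(split_ignore_consecutive("123  Main   St", " "))
```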
[addresses] same rules for state_district apply to state, no alt_names etc. unless a city is present

d158751d9234aa8768edaa27950a7f0e63b46499 authored about 8 years ago
[osm] during place formatting, add point-based cities for any places/polygons that are smaller than cities e.g. suburb or city_district, use admin_center as the point for reverse geocoding if available (instead of representative_point() which can be expensive or centroid which can be inaccurate)

bf3e9749ca6c48bd3e270b6f1239b6e3fc1d6210 authored about 8 years ago
[places] allowing state_district to depend on state in the US

33dd9223dc8ddbaa2f0bcfe2605d2557811f6f0c authored about 8 years ago
[boundaries] adding two exceptions for admin_level=9 in US

5d98f3115c51ff6b32488ab353bc110a93c672d6 authored about 8 years ago
[addresses] option to add city points, no random keys for state_district if city or replacement is not present

da4fe37fb48f30ac27282457caf2a010cc74132f authored about 8 years ago
[fix] typo

dfc88a47b2ed28231e50a6932b00c8e60c32bab1 authored about 8 years ago
[neighborhoods] check if there's no defined place-type before classifying a polygon as city_district

e8abf44c16af97c14941c86017cd01a5f4ca45f6 authored about 8 years ago
[fix] "District of" is only a valid prefix in the non-US Anglophone world

01d6bc27b6f4d8e0477fe504a86e307880a2148e authored about 8 years ago
[states] adding abbreviations with internal periods for multi-word US states

9b95601e42210a6458f04803391ee1dd7cb27bf5 authored about 8 years ago
[fix] default value

fffc81a17a174a92eb5428f6dd74570095716248 authored about 8 years ago
[fix] typo

371198da3c3ad224ad1ae14e3447484b2d12ffe2 authored about 8 years ago
[fix] normalize place names after adding admin boundaries as well

91982528c6579a95edddd16dcb738d496aac8737 authored about 8 years ago
[addresses] fixing normalized_place_name so it deals with things like Washington DC where Washington DC may actually be one of the OSM names

34d3ae7e9e8e8c225d56a4396160532b4d1bc00c authored about 8 years ago
[text] adding normalization with whitespace

80ee34cc3a5762ca6aa290afd3e84636ad5de32a authored about 8 years ago
[fix] var name

4550f00f03887f2f9ebe297142fb1e230116c7e8 authored about 8 years ago
[fix] order

72771741c38d949b4786d08688b3589cf0e6b7c7 authored about 8 years ago
[addresses] don't add components to the trie that have the same normalized name as the given component

8595d8da05fc08812c71ed7fbad4d12dcd840755 authored about 8 years ago
[fix] options/docs in osm address training

bb12d0940e3c89f2eaf0889d2754b374caeaecfd authored about 8 years ago
[states] adding all forms of the state abbreviation to the trie when doing place name normalization to handle the D.C./DC case

ffc584f679a792cd31a2185de58634ecd7cdb457 authored about 8 years ago
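
To make the D.C./DC case concrete, the sketch below generates both the plain and the dotted form of an abbreviation, which could then be added to a lookup structure. The function name `abbreviation_forms` is hypothetical, and a Python set stands in for the trie mentioned in the commit.

```python
def abbreviation_forms(abbrev):
    """Return the undotted and dotted spellings of a letter abbreviation."""
    letters = abbrev.replace(".", "")
    forms = {letters}
    if len(letters) > 1 and letters.isalpha():
        # "DC" -> "D.C." (a period after every letter)
        forms.add(".".join(letters) + ".")
    return forms


print(sorted(abbreviation_forms("D.C.")))
```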
[addresses] remove Quattroshapes/GeoNames cities as they may have problematic names, and in any case we have point-based cities from OSM now

5098599ed6d977a049652cf07b81939d15a366cd authored about 8 years ago
[fix] check for non-None city

18c5fd08555be224a1ed1eac7516ab6fc6610160 authored about 8 years ago
[osm] adding normalized_place_name to Quattroshapes city

dc022f86524cd6a5bd55c86855c0f999e6db5f76 authored about 8 years ago
[openaddresses] adding D.C. with periods as the state for the DC data set

7edb98356635adaaf35747b7a88cb6838dbfe774 authored about 8 years ago
[fix] imports

c7b1818695e2e5e66c7dbe32140fe9f5dbdb937d authored about 8 years ago
[states] adding multiple state abbreviations for states that can have periods in the name like D.C., D.F. in Mexico and Brasil, etc.

973466bb1370fc20f54b6acb8209c46b2994b8a3 authored about 8 years ago
[data] using UTC for libpostal data files on the Mac version of the download script as well

d575caba8ae6f74cdbd3cd7e98def766f2313365 authored about 8 years ago
[fix] update test for date function in data download script

c3f3896b484f5e26180b19c90eef423cfc321556 authored about 8 years ago
[addresses] using normalized tokens when stripping off compound place names for things like D.C.

675552d2543fb338d5631c21d2a7afcb16d62d54 authored about 8 years ago
[normalization] adding a normalize_token function and some token options for deleting periods

c0a468d7e8cc81cf7c80d724e87d76099f42a6ac authored about 8 years ago
[parser] header changes for the data set struct

318773ffe7bebda6021e450b81e0381fbf0bb43b authored about 8 years ago
[openaddresses] adding units to Olympia training data

69ca4a85cec60d924e23cb8b7ba4e04023f8101f authored about 8 years ago
[fix] checking if building is a rail station

8f30987bdfd83c85d4fb3c10f2dfbd3daece4f8d authored about 8 years ago
[openaddresses] adding new counties from OpenAddresses, strip commas option for thousands separators

e92963de50fb3f5d1f85e55ddf61c225f5c092f8 authored about 8 years ago
[geoplanet] adding an index of state_districts, states, etc. that contain a city with an identical name. Alias to the city if it's the only contained place, otherwise don't allow the admin name without the city.

b60b7c9009bcbfb98a9e5b317fd6d33946a7be61 authored about 8 years ago
[geoplanet] all_places table, specified dirs

640f70c05d135629bfdb45f1d0080efeee7bffb4 authored about 8 years ago
[addresses] if suburb/city_district is already listed, and we're finding the closest city by point rather than by boundary, use the closest actual city, not something smaller like a village/hamlet

f9945103ba14973651ec8eaa6f6fcfa0f3754eef authored about 8 years ago
[geoplanet] fixing geoplanet aliases insert warning

28d9ef12c062cdaeb4d68bc5c99c3d3ec1b188ff authored about 8 years ago
[geoplanet] add County to the names of US counties outside of Louisiana and Alaska, add Parish in Louisiana

763c86dcd4ab084f2459e9004bef8e6a8adf70c8 authored about 8 years ago
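
The county naming rule from this commit can be written as a small Python sketch. The function name `county_name` and the use of two-letter state codes are assumptions for illustration; the commit itself does not show how GeoPlanet records identify the state.

```python
def county_name(name, state):
    """Append "County" (or "Parish" in Louisiana) to a US county name.

    Alaska names are returned unchanged, as are names that already end
    with the suffix.
    """
    if state == "AK":
        return name
    suffix = "Parish" if state == "LA" else "County"
    if name.endswith(suffix):
        return name
    return f"{name} {suffix}"


print(county_name("Kings", "NY"))
print(county_name("Orleans", "LA"))
```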
[openaddresses] adding Douglas County and Paulding County in GA. Jackson County and Rankin County in MS

7d0c402a31fcc6a64746d78edc5c4ba43d1c7adf authored about 8 years ago
[openaddresses] adding today's changes from OpenAddresses

c2c282293640ad2edf59cc1d5e666a1bcfd8c65f authored about 8 years ago
[dictionaries] adding US highway and US route expansions

55c2f1889682732f55692ac545cb35f3eebb19e9 authored about 8 years ago
[names] adding New Zealand to places that normalize City as a suffix (not Australia though as it has some cities that actually do end in City)

42861aa38c2227df555298ae90ccbc97b67fd881 authored about 8 years ago
[names] adding new name_affixes call to replace both prefixes/suffixes in one call, using in GeoPlanet training and the generic AddressComponents normalizations

7436d9693a608e4f603eeb3dbbbf8b0096a985cc authored about 8 years ago
[names] adding country-specific affixes and only normalizing the word City as a suffix in UK/Ireland

9386a999f67371d22b48a3795260d56c10e00864 authored about 8 years ago
[openaddresses] adding Kenton County, KY

a9209fae37f3f4d6d4dadfd2d38fab1681ce1a8d authored about 8 years ago
[openaddresses] adding Kansas City, MO

b69914ff187843367e99e799ebad65a9bffaec95 authored about 8 years ago
[openaddresses] fixing house numbers with multiple consecutive hyphens

3ff472c8cfa79ccac1ebac19635a5c1bfb40c58b authored about 8 years ago
[fix] indentation

ae527ef5b1e60088dfcd78f81c2821b8d998a3dc authored about 8 years ago