Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/openvenues/libpostal
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
https://github.com/openvenues/libpostal
[dedupe] adding the core pairwise deduping module which ties together most of the work on this branch. Includes simple phrase-aware exact deduping methods, with per-component variations as to whether e.g. a root expansion match counts as an exact duplicate or not (in a secondary unit, "No. 2" and "Apt 2" can be considered an exact match in English whereas we wouldn't want to make that kind of assumption for street e.g. "Park Ave" and "Park Pl"). The API is fairly low-level at present, and may require a few calls. Notably, we leave the TFIDF scores or other weighting schemes to the client. Since each component gets its own dupe classification, it leaves the door open for doing more specific checks around e.g. compound house numbers/ranges in the future.
098babfdee01b2e738a290910b79a7fa2acdf818 authored almost 7 years ago by Al <[email protected]>
[api] adding libpostal_place_languages method to public API for classifying languages consistently from components (may need to make several calls using the same languages and don't necessarily want the language classifier to be run on house numbers when we already know the languages from e.g. the street name - this provides a simple window into the language classifier focused on the entire address/record)
1f1412c1205844268a2352880fd04f7d8949155e authored almost 7 years ago by Al <[email protected]>
[similarity] adding a string array version of Jaccard similarity that creates the string sets internally for convenience
1d1ce10fadcbd9a9a38d94e353925dfdf2781985 authored almost 7 years ago by Al <[email protected]>
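As a rough illustration of what a string-array Jaccard helper does, here is a minimal Python sketch. The function name and the choice to score two empty inputs as 0 are assumptions for this example; they are not taken from libpostal's C implementation.

```python
def jaccard_similarity(tokens_a, tokens_b):
    """Jaccard similarity of two token lists; the sets are built internally."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        # Assumption: two empty inputs score 0.0 rather than 1.0.
        return 0.0
    # |A intersect B| / |A union B|
    return len(set_a & set_b) / len(set_a | set_b)
```

Callers pass plain token lists, so duplicates within one list do not affect the score.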
[similarity] moving stopword tokens array to a separate function in acronym token alignments
c5ad080fb0ed3c85f83d65e64965330375a28dd3 authored almost 7 years ago by Al <[email protected]>
[similarity/dedupe] adding options for acronym alignments and address phrase matches in Soft-TFIDF. Acronym alignments will give higher similarity to NYU vs. "New York University" whereas phrase matches would match known phrases that share the same canonical like "Cty Rd" vs. "C.R." vs. "County Road" within the Soft-TFIDF similarity calculation.
f1e68865366ed1287f3f1d573d0f05e88416d248 authored almost 7 years ago by Al <[email protected]>
[fix] another valgrind error in counting transposes in our counting affine gap implementation (mixed indices)
24a77ea03f192a0cc5e65ae93a6b203b256a8490 authored almost 7 years ago by Al <[email protected]>
[fix] using same order in root expansions
cabdbfccd2e977bb2749a96fc904e720a6aac137 authored almost 7 years ago by Al <[email protected]>
[fix] bug in Jaro distance
8fd4242eb8be2649171df6668c7c0d5e3fb47c94 authored almost 7 years ago by Al <[email protected]>
[similarity/dedupe] adding Soft-TFIDF implementation with several different fallback qualifiers for the max-sim function (Damerau-Levenshtein and libpostal's new bucketed affine gap method for detecting abbreviations), but keeping Jaro-Winkler as the secondary similarity function in the final distance metric. Overall this should result in higher similarity values when one of the tokens may not quite match the pure secondary threshold in terms of Jaro-Winkler but may match on one of the other criteria.
b90c3dab4bbf73ba53cd93eac818c2b955ee99bc authored almost 7 years ago by Al <[email protected]>
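The general shape of Soft-TFIDF can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `difflib.SequenceMatcher` stands in for Jaro-Winkler as the secondary similarity, the Damerau-Levenshtein and affine gap fallbacks from the commit are omitted, the 0.9 threshold is arbitrary, and the token weights are supplied by the caller (as the dedupe commit notes, weighting is left to the client).

```python
import math
from difflib import SequenceMatcher


def secondary_similarity(a, b):
    # Stand-in for Jaro-Winkler (an assumption for this sketch).
    return SequenceMatcher(None, a, b).ratio()


def l2_normalize(weights):
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights] if norm else list(weights)


def soft_tfidf(tokens_a, weights_a, tokens_b, weights_b, threshold=0.9):
    """Sum w_a * w_b * sim over tokens of A whose best match in B clears the threshold."""
    if not tokens_a or not tokens_b:
        return 0.0
    norm_a, norm_b = l2_normalize(weights_a), l2_normalize(weights_b)
    total = 0.0
    for tok_a, w_a in zip(tokens_a, norm_a):
        best_sim, best_weight = 0.0, 0.0
        for tok_b, w_b in zip(tokens_b, norm_b):
            sim = secondary_similarity(tok_a, tok_b)
            if sim > best_sim:
                best_sim, best_weight = sim, w_b
        if best_sim >= threshold:
            total += w_a * best_weight * best_sim
    return total
```

With uniform weights, identical token lists score approximately 1.0, and a near-miss token such as "avenu" vs. "avenue" still contributes as long as it clears the threshold.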
[utils] adding place.h header, which converts parser-like output into an object that can be used for comparisons. Currently single-value, but could use cstring_arrays for fields instead.
33bb90d94b44d95c778eed4983235553137485ba authored almost 7 years ago by Al <[email protected]>
[expand] fixing case where too many permutations were getting added for longer strings due to the new-ish ordinal suffix handling, using string_tree_num_tokens instead of string_tree_num_strings throughout to check for previously added words, using new is_likely_roman_numeral API
d731339811c80221a112ef47792ef69f7488e861 authored almost 7 years ago by Al <[email protected]>
[numex] changing is_roman_numeral to is_likely_roman_numeral to get rid of most of the false positives like "La" in Spanish, which could be L(=50) + the ordinal suffix "a", but in practice it never means that. Roman numerals of two characters or fewer (whether on their own like "DC" or "MD", or attached to a potential ordinal suffix like "Ce" in French) will be ignored unless they're composed of the more likely, smaller Roman numerals I, V, and X, so VI, IX, etc. are expanded as Roman numerals but LI is not.
b4fdc51bf952eb9eece330f8799ac032519522b1 authored almost 7 years ago by Al <[email protected]>
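The heuristic described above can be approximated as follows. This Python sketch is an assumption-laden illustration, not libpostal's code: the function name `looks_like_roman_numeral` is hypothetical, validity is checked with a standard Roman numeral regex, and ordinal suffix handling (the "La" and "Ce" cases) is left out.

```python
import re

# Standard pattern for well-formed Roman numerals up to 3999.
ROMAN_RE = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
COMMON_SMALL_NUMERALS = set("IVX")


def looks_like_roman_numeral(token):
    """Hypothetical helper: accept short numerals only if built from I, V, X."""
    upper = token.upper()
    if not upper or not ROMAN_RE.fullmatch(upper):
        return False
    if len(upper) <= 2:
        # Short strings like "DC", "MD", "LI" are far more often abbreviations.
        return set(upper) <= COMMON_SMALL_NUMERALS
    return True
```

Under this rule "VI" and "IX" are treated as numerals, while "LI", "DC", and "MD" are not.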
[dictionaries] adding hill/hills to synonyms lists in English. In general any ambiguous street types that can also be part of a core street name can also be stored in synonyms
b17b2bdcc4b65cf22bce32da9941bbc9ff14136c authored almost 7 years ago by Al <[email protected]>
Merge branch 'master' into lieu_api
bb9f6a4c6bf120f713057a05d9251c479ed5541c authored almost 7 years ago by Al <[email protected]>
Merge pull request #292 from oschwald/greg/fix-leak
Fix leak of normalized value in early return
1e7cc23b8100aff9d357d795c9b8b0156e30951e authored almost 7 years ago by Al Barrentine <[email protected]>
Fix leak of normalized value in early return
1bb62784466623eeb8b281205a1108c0b3bfc00c authored almost 7 years ago by Gregory Oschwald <[email protected]>
[test] adding E Ctr St tests
2afcd747797a8494d66e441222479381ad487808 authored almost 7 years ago by Al <[email protected]>
[expand] adding improvements to root expansions (using possible phrase roots even if they're abbreviated e.g. "E Ctr St", adding special valid components check for root expansions beyond what's stored in the build address dictionaries), removing spaces before checking unique strings, only splitting numeric from alpha in the case of non-ordinals, using cstring_array internally and char ** in the public API
152761fcbccc0c977d6c2f6643a97f33a5f2739c authored almost 7 years ago by Al <[email protected]>
[dictionaries] removing ave/avens/aves from ambiguous
b4ce042f80bc5e3b8e4108108ef7546119568151 authored almost 7 years ago by Al <[email protected]>
[fix] reverting gazetteer changes as it would affect the parser features as well and require retraining
a3f39be0d47763bb44108af8855c29b8e3c441b3 authored almost 7 years ago by Al <[email protected]>
[build] adding new source files for near dupe hashing and the command-line program to the Makefile
acbebc9ecfd388efbfd554f39a191d9f90a7c172 authored almost 7 years ago by Al <[email protected]>
[api] adding API functions for near dupe hashes to the public header
f3a626463a77f5f506790c08584af7b350c753b3 authored almost 7 years ago by Al <[email protected]>
[dedupe] adding a test program for near dupe hashing that simply prints out the results. Automated tests in the works
8b75c44026aa8eeb7a72b9ebb21961b86469fec2 authored almost 7 years ago by Al <[email protected]>
[dedupe] adding near-dupe hashing function, which can be thought of as the blocking function in record linkage or as a form of locality-sensitive hashing in general document deduping. The goal is, if two addresses/names are the same, they should share at least one hash. These hashes can also be used as an inverted index (DB, ES, hashtable, etc.). Uses the double metaphone for name words in Latin script (otherwise each individual token, and sequences of two tokens in the case of ideograms for e.g. Chinese, Japanese, Korean, etc.)
acfdb50d7ce2b900a094edd79eec60cfaecc71a6 authored almost 7 years ago by Al <[email protected]>
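To make the blocking idea concrete, here is a small Python sketch of using hash keys as an inverted index to find candidate pairs. It is a simplification under stated assumptions: `blocking_keys` is a hypothetical stand-in that uses lowercased tokens instead of double metaphone codes, and it ignores the geographic and per-component keys a real address hash would include.

```python
from collections import defaultdict
from itertools import combinations


def blocking_keys(name):
    # Stand-in keys (assumption): lowercased tokens rather than double metaphone.
    return {tok.strip(".,").lower() for tok in name.split() if tok.strip(".,")}


def candidate_pairs(records):
    """records: dict of id -> name. Returns id pairs sharing at least one key."""
    index = defaultdict(set)  # inverted index: key -> record ids
    for rec_id, name in records.items():
        for key in blocking_keys(name):
            index[key].add(rec_id)
    pairs = set()
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs
```

Only the pairs returned here would need the more expensive pairwise comparison; records that share no key are never compared.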
[gazetteers] removing stopwords, etc. from numeric type components, adding street type expansions to name components
6c6e5062e0e600b10bbd776f7348263637806ca2 authored almost 7 years ago by Al <[email protected]>
[utils] adding cstring_array_extend and string_tree_clear
c78566c2410d7499fb39442a0b93e5d620c335f1 authored almost 7 years ago by Al <[email protected]>
[parser] adding label constants to address_parser header
4e3d868bd02709038c85a45e01a2e002b7be5aa5 authored almost 7 years ago by Al <[email protected]>
[dictionaries] adding ambiguous expansions in English
3e554b8033755ff2e9a34579373bacc9282d097d authored almost 7 years ago by Al <[email protected]>
[dictionaries] adding "a" to English stopwords, "service" and "services" to English place names
03c89bcf3cf8570ec612fb95ebed542d5dfe4fde authored almost 7 years ago by Al <[email protected]>
[dictionaries] adding associates/association to company types
1fd5433bc5e605ec4bec5fcc0e2af36bb679e6e9 authored almost 7 years ago by Al <[email protected]>
[dictionaries] adding "for" to English stopword dictionaries
7d42c94b199df114f527a832d15b0902738ae34a authored almost 7 years ago by Al <[email protected]>
[dictionaries] adding Stores to place names dictionary
dfc9064b0f8291c894c6b04ed4f723c7162444e1 authored almost 7 years ago by Al <[email protected]>
[dictionaries] adding "7 11" as a name for 7-eleven, even though it's completely numeric. Only affects the house/name component in deduping, so should be fine
e432243256e74d581db2eaa689ae5f7bca204a5d authored about 7 years ago by Al <[email protected]>
[expand] remove blank expansions and strip spaces
d03ce4e058a73f42da8c57136d6e61b3d3349783 authored about 7 years ago by Al <[email protected]>
Merge pull request #289 from AeroXuk/FixWindowsLog
Removing console colors for Windows builds
66b32ee377f247d326831eeb988216a316bd4aac authored about 7 years ago by Al Barrentine <[email protected]>
Removing console colors for Windows builds.
f6157224ed21399936027510a0fc87c558a38ee2 authored about 7 years ago by AeroXuk <[email protected]>
[test] PO box expansion tests
ff3c7ab3b6fc1ec349f20fc6dfaf0ee8785c2437 authored about 7 years ago by Al <[email protected]>
[expand] adding number phrases as ignorable in PO boxes
f63a9cc579b184e31aeb0825bcfc26740444a5fc authored about 7 years ago by Al <[email protected]>
[test] unit expansion tests
27f4eb27214a950619167c7dc259ca83d2174ca9 authored about 7 years ago by Al <[email protected]>
[test] level expansion tests
f7326e52f6ea80249e8cefa56e55cdea5dd61b4e authored about 7 years ago by Al <[email protected]>
[expand] no longer delete phrases in cases like "PH 1" for units, where there's a phrase that can accompany numbered units and thus be ignored similar to "Apt 1" but that phrase may also be a qualifier (i.e. Apt 1 and Penthouse 1 are not the same)
727469b7367de0a7092c48a4ff7e16824b11a25c authored about 7 years ago by Al <[email protected]>
[test] house number expansion tests
1d22da603f5970ba796b041f4f76dcfd5e98f31c authored about 7 years ago by Al <[email protected]>
[test] adding header to fix warning
bfdb6b8f87cc1cae9ba47870ff23deae0bb8ba51 authored about 7 years ago by Al <[email protected]>
[test] adding tests for root-only expansions. Mostly English tests for the moment to deal with the various edge cases, but is also important for Spanish where "Calle" is so common that it's often omitted, same with French and "rue", etc.
26a6d9684d83a9689d63d4e493e7d454d87c9af7 authored about 7 years ago by Al <[email protected]>
[expand/normalize] the split_alpha_from_numeric option now applies to both e.g. A1 and 1A since we now strip out ordinal suffixes prior to normalization
a1db4d773466470617b2daa0ed56c0ac4ae3648c authored about 7 years ago by Al <[email protected]>
[api] adding libpostal_expand_address_root to the public API. This will attempt to delete tokens that can be safely ignored. It's deterministic and rule-based, but is informed by libpostal's fairly comprehensive dictionaries, and should work relatively well across languages for deduping purposes.
8b2a4d1ecf78a998e9dfc29f9c369d8cfeff721c authored about 7 years ago by Al <[email protected]>
[expand] in cases like "Avenue D" where there are two phrases, one is ambiguous (and canonical) but not necessarily edge-ignorable (pre/post-directional), allow deletion of the other token (so "Avenue" in this case). Also allows skipping in cases where the language classifier may predict a second language with some small probability, such as French for a short string like "Avenue D" (in addition to English). If the token was ignorable in the highest probability language, ignore it in both.
9eef46adeece81547e945fa057e1babe82018b99 authored about 7 years ago by Al <[email protected]>
[expand] adding a method that allows hash/equality comparisons of addresses like "100 Main" with "100 S Main St." or units like "Apt 101" vs. "#101". Instead of expanding the phrase abbreviations, this version tries its best to delete all but the root words in a string for a specific component. It's probably not perfect, but does handle a number of edge cases related to pre/post directionals in English e.g. "E St" will have a root word of simply "E", "Avenue E" => "E", etc. Also handles a variety of cases where the phrase could be a thoroughfare type but is really a root word such as "Park Pl" or the famous "Avenue Rd". This can be used for near dupe hashing to catch possible dupes for later analysis. Note that it will normalize "St Marks Pl" and "St Marks Ave" to the same thing, which is sometimes warranted (if the user typed the wrong thoroughfare), but can also be reconciled at deduping time.
3f7abd5b24f965ebeed7eec143f3ccacd58a525b authored about 7 years ago by Al <[email protected]>
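A toy version of the root-expansion idea, deleting everything but the root words, might look like the Python below. Everything here is hypothetical: the two tiny dictionaries, the function name, and the fallback rule (prefer a directional, else the first token, when every token is ignorable) are assumptions chosen to reproduce the examples in the commit message, whereas libpostal relies on its own multilingual dictionaries and more careful rules.

```python
# Tiny illustrative dictionaries (assumptions, not libpostal's data).
DIRECTIONALS = {"n", "s", "e", "w", "north", "south", "east", "west"}
STREET_TYPES = {"st", "street", "ave", "avenue", "rd", "road", "pl", "place"}


def root_expansion(street):
    """Hypothetical sketch: keep only root words of a street string."""
    tokens = [t.strip(".,").lower() for t in street.split()]
    tokens = [t for t in tokens if t]
    roots = [t for t in tokens if t not in DIRECTIONALS and t not in STREET_TYPES]
    if roots:
        return " ".join(roots)
    # Every token was ignorable: fall back to a directional ("E St" -> "e"),
    # otherwise to the first token ("Avenue Rd" -> "avenue").
    directionals = [t for t in tokens if t in DIRECTIONALS]
    if directionals:
        return " ".join(directionals)
    return tokens[0] if tokens else ""
```

In this sketch "100 S Main St." and "100 Main" reduce to the same string, as do "St Marks Pl" and "St Marks Ave", which matches the trade-off noted in the commit: such collisions are acceptable for hashing and get reconciled at deduping time.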
[expand] adding method for checking phrase is in multiple dictionaries, and a helper method for determining whether an address phrase has a canonical interpretation
d0364ab6fbe81573749016aa60acb41c1e0e740b authored about 7 years ago by Al <[email protected]>
[dictionaries] adding a few more ambiguous expansions in English
272ee3b965255d63799443cecdfb6011300b0019 authored about 7 years ago by Al <[email protected]>
[auto][ci skip] Adding data files from Travis build #323
e1d89b5d626ac8ab181092945bf5bb5b6cf9a699 authored about 7 years ago by Travis <[email protected]>
Merge pull request #285 from dmvianna/master
melbourne city toponym
22ee778c8c95a3d507cd4cd4cf1f102a69bf590b authored about 7 years ago by Al Barrentine <[email protected]>
melbourne city toponym
daa9cb2896cb050e71eb368c5d925b7d52e86a54 authored about 7 years ago by Daniel Vianna <[email protected]>
[expand] moving expand to its own module so the internal methods can be exposed, calling from libpostal.c
8968a6c9667722e3a3f2b49a5b9d7f69a8703e28 authored about 7 years ago by Al <[email protected]>
[utils] adding unicode_common_prefix/unicode_common_suffix, string_hyphen_prefix_len and string_hyphen_suffix_len to string_utils
e4e84f0147a1ebd588eb29959af7773527857912 authored about 7 years ago by Al <[email protected]>
[similarity] needed to add utf8proc_category and invert the indices for counting transposes in affine gap
55ba627c3cf6dc32a108b03d5b5f2c6880bfe8f8 authored about 7 years ago by Al <[email protected]>
[similarity] adding a stopword-aware acronym alignment method for matching U.N. with United Nations, Museum of Modern Art with MoMA, as well as things like University of California - Los Angeles with UCLA. All of these should work across languages, including non-Latin character sets like Cyrillic (but not ideograms as the concept doesn't make as much sense there). Skipping tokens like "of" or "the" depends only on the stopwords dictionary being defined for a given language.
cfa5b1ce42ff908a26aecab35b7c22335857d3b6 authored about 7 years ago by Al <[email protected]>
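One way to picture stopword-aware acronym alignment is the short Python sketch below. It is an illustration only: the function name, the tiny stopword set, and the backtracking strategy are assumptions, and libpostal's C implementation works on its own token and dictionary structures.

```python
import re

# Tiny illustrative stopword list (assumption); real lists are per language.
STOPWORDS = {"of", "the", "and", "for"}


def acronym_matches(acronym, phrase, stopwords=STOPWORDS):
    """True if the acronym's letters align with the initials of the phrase's
    tokens, where stopwords may either contribute a letter or be skipped."""
    letters = [c.lower() for c in acronym if c.isalnum()]
    tokens = [t.lower() for t in re.findall(r"\w+", phrase)]

    def align(i, j):
        if i == len(letters):
            # All letters consumed: any leftover tokens must be stopwords.
            return all(t in stopwords for t in tokens[j:])
        if j == len(tokens):
            return False
        if tokens[j][0] == letters[i] and align(i + 1, j + 1):
            return True
        # Otherwise a stopword may be skipped without consuming a letter.
        return tokens[j] in stopwords and align(i, j + 1)

    return bool(letters) and align(0, 0)
```

Letting a stopword optionally contribute a letter is what allows "MoMA" (where "of" supplies the "o") and "UCLA" (where "of" is skipped) to both align.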
Merge pull request #283 from AeroXuk/AppVeyor_32bit
Setup AppVeyor Build Matrix for 32 & 64 bit builds.
825408feff9b5c31254a32e3d8ebd5aebceed49e authored about 7 years ago by Al Barrentine <[email protected]>
Merging upstream changes from openvenues/libpostal. Tweaked AppVeyor to package artifacts after setting /LARGEADDRESSAWARE on 32-bit build.
e740970d5a5cdf1543470896331219abac08e531 authored about 7 years ago by AeroXuk <[email protected]>
Merge branch 'master' into lieu_api
252d5a0f37ac5a0ac260bf7ee460b4ec4abff0b1 authored about 7 years ago by Al <[email protected]>
Adding make -j4 and setting appveyor to only build master branch.
363e13263e0ac67d1a7198927d49472f2890906e authored about 7 years ago by AeroXuk <[email protected]>
Merge pull request #282 from openvenues/faster_windows_build
adding make -j4 to Windows build
8b2f91477ea225778bb71e4345f089459bb80691 authored about 7 years ago by Al Barrentine <[email protected]>
[build] only build master on Appveyor so PRs don't trigger multiple builds
cf56da98f7df0a1cf955c91056bb4ccbd84a046c authored about 7 years ago by Al <[email protected]>
[build] also trying make -j4 in the Windows build. Set an option on the Appveyor side that will hopefully not build twice on pull requests
02d049b8d999ccfdd5b09074f08933a4c4330c21 authored about 7 years ago by Al <[email protected]>
Merge pull request #280 from openvenues/faster_builds
make -j4 for all builds
133dce6f2c80f893f0ffedc9084a71ac3f7bc794 authored about 7 years ago by Al Barrentine <[email protected]>
[docs/build] adding make -j4 as the default for make, including in the Travis/Appveyor builds, should make build times faster
f207a4680d64ce2dc3ff8945438d9a136249317e authored about 7 years ago by Al <[email protected]>
[api] adding LIBPOSTAL_EXPORT to some of the new public API functions in this branch
e27f5f1d70cf852f7e01bab36f0a1d8b9a730792 authored about 7 years ago by Al <[email protected]>
Setup AppVeyor Build Matrix for 32 & 64 bit builds.
e35d6bf39259e4327d881ba2997a0ce71605ad3b authored about 7 years ago by AeroXuk <[email protected]>
Merge branch 'master' into lieu_api
ec4d683d1bcdf648fd238d73d0f5c8656b103d62 authored about 7 years ago by Al <[email protected]>
[docs][ci skip] adding a section for Windows installation, shoutouts to @BenK10 and @AeroXuk
2cb0d146e5e4644290d42288a63052e41324b4dd authored about 7 years ago by Al <[email protected]>
Merge pull request #279 from openvenues/drand48_fix
fix Mac build/standardize conditional compilation of strndup and drand48 for Windows
1c42df5ca957d79c441ac06c0ffcd0e7d091f0ef authored about 7 years ago by Al Barrentine <[email protected]>
[fix] deleting comment, this is not a header-only implementation
9e837f72092441a5085aa91d1ffc489cdf9da6fa authored about 7 years ago by Al <[email protected]>
[fix] conditional compilation for strndup and drand48 for Windows, using config.h
26e4ef08bca2398f978c2510e39821bb36d60f41 authored about 7 years ago by Al <[email protected]>
[merge] merging in the Ohio expansion numex changes from master
1a64ad682b6b44e803e167b7789b363eeefc94de authored about 7 years ago by Al <[email protected]>
Merge pull request #272 from AeroXuk/master
Windows support via AppVeyor
18eb5ef9ee98890658034ad8a7e6a2cb709dc5be authored about 7 years ago by Al Barrentine <[email protected]>
Adding include config.h to strndup.c so that the function is not compiled and doesn't cause errors when the system has its own implementation.
19ae97d52792b56353c52deecab145aa5ccb71bc authored about 7 years ago by AeroXuk <[email protected]>
Modified the libpostal API to add an extra function libpostal_parser_print_features to toggle debugging info. Updated address_parser app to use the new function.
90908118269e3c5c39e707f7bc371aa2e6f669ea authored about 7 years ago by AeroXuk <[email protected]>
90908118269e3c5c39e707f7bc371aa2e6f669ea authored about 7 years ago by AeroXuk <[email protected]>
Updated linenoise to be MSys2/MinGW compatible. Updated address_parser app to use the defined libpostal api and not include internal components directly. Removed windows src Makefile as it is now the same as the standard one.
69e0d5d963213fd930fc1449767ff515c1bae605 authored about 7 years ago by AeroXuk <[email protected]>
Adding libpostal.h to the AppVeyor package.
bb5535602ab0975a418dcccecce11551ef23daa2 authored about 7 years ago by AeroXuk <[email protected]>
Removing EXPORT statements from all source files and most header files, leaving only the exports for the main API in libpostal.h. Modified Makefiles so that all the test apps build without having extra functions exported from libpostal.
26ac9ab5c2a89c9b0e2ce5625e1249c5d3a3c722 authored about 7 years ago by AeroXuk <[email protected]>
[auto][ci skip] Adding data files from Travis build #284
15b3758be89099c348b54bbc3ec87c7d468ac937 authored about 7 years ago by Travis <[email protected]>
Merge pull request #274 from openvenues/fix_oh_expansion
Context-sensitive expansion of words like "oh" inside vs. outside numeric expressions
7d001489efc426fe1b970c28c48ebf6a39c04e7c authored about 7 years ago by Al Barrentine <[email protected]>
[test] missing paren in Columbus, OH test. Adding test for "oh" as part of a number in Nineteen oh one W El Segundo Blvd
ebe7fc9be9af246392e70b695ba8447a8a360fd3 authored about 7 years ago by Al <[email protected]>
[test] adding an expansion test for the Columbus, OH case
d7f22544b4610fb1e534ec47501da85b2c5524ba authored about 7 years ago by Al <[email protected]>
[numex] implementing the numex concat_only_if_number left context, which helps in the case of e.g. Columbus, OH in #271
ef098fd2e79c1f915c0094dc2b9b7f379abc85bd authored about 7 years ago by Al <[email protected]>
[numex] adding a new type of left context for numeric expressions called concat_only_if_number (for something like "oh" which can be "Columbus, OH" or something like "Twenty-One Oh One")
c276cf15291881fdaa1a3bb6f939ea7bdf237ccd authored about 7 years ago by Al <[email protected]>
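The effect of a "concatenate only if the left context is a number" rule can be sketched in Python as below. This is a hypothetical illustration: the names `is_number` and `expand_oh` and the small number-word set are assumptions, and the sketch only rewrites "oh" to a zero digit rather than performing the full numeric concatenation that numex does.

```python
# Small illustrative set of English number words (assumption).
NUMBER_WORDS = {
    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
    "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
    "seventeen", "eighteen", "nineteen", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety",
}


def is_number(token):
    t = token.lower().strip(",.")
    return t.isdigit() or all(part in NUMBER_WORDS for part in t.split("-"))


def expand_oh(tokens):
    """Treat "oh" as the digit 0 only when the token to its left is a number."""
    out = []
    for i, tok in enumerate(tokens):
        if tok.lower() == "oh" and i > 0 and is_number(tokens[i - 1]):
            out.append("0")
        else:
            out.append(tok)
    return out
```

With this rule "Nineteen oh one" has its "oh" read as a zero, while the state abbreviation in "Columbus, OH" is left untouched because "Columbus," is not a number.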
Fix bug in strndup fix for windows. Move all includes out of headers and into code for strndup.h and move it to be the last include.
f0246e7333c379c3f0adb74a7e6928cc83aaf554 authored about 7 years ago by AeroXuk <[email protected]>
Adding artifacts to AppVeyor config.
d205f4d2bb91136778c3305b64ef9ffd54c9e13c authored about 7 years ago by AeroXuk <[email protected]>
Adding the export marker to all functions used in tests.
f07ab765cbc345c959a18436749af9ca1e6ce9d2 authored about 7 years ago by AeroXuk <[email protected]>
Altered Makefile to include strndup.c on the other programs which require it. For the windows version of the Makefile, commented out address_parser lines as it has dependencies on includes we don't have.
ad682b75925f201abaab30d5a4c1a4c204094732 authored about 7 years ago by AeroXuk <[email protected]>
Fix bugs in AppVeyor config and build script. Added call to test script.
dbf232b8f890e49b466146441f052b78c1a8a7bc authored about 7 years ago by AeroXuk <[email protected]>
Merging changes from AeroXuk/libpostal_windows.
2d3b420d352e32a09f044160d444428c3b4fa6f0 authored about 7 years ago by AeroXuk <[email protected]>
[auto][ci skip] Adding data files from Travis build #271
7d6e648fc3fb537d4e7a66a904f59fca4c1e47e7 authored about 7 years ago by Travis <[email protected]>
Merge pull request #269 from Jeffrey04/ms-dictionary-expansion-1.0
Ms dictionary expansion for 1.0
27b3e99515344399c252f89cbfdfbaf9b6c3296b authored about 7 years ago by Al Barrentine <[email protected]>
new names with alternate spelling
86c3105d4499fc63baa53b6b53b5db04f508b3ee authored about 7 years ago by jeffrey04 <[email protected]>
reordered list of synonyms
e9d2ab640099ea6c0aca8cac8e5ce42f72868714 authored about 7 years ago by jeffrey04 <[email protected]>
new synonyms
b3d306456feaf1b7dacdd54c18b41b5550601709 authored about 7 years ago by jeffrey04 <[email protected]>
updated street types
0d76d190e17d26be448fc6c6acb004d9d2790f88 authored about 7 years ago by jeffrey04 <[email protected]>
updated qualifiers
f726970d2b70bf4c73a5f6bd24c1c37cacbe837e authored about 7 years ago by jeffrey04 <[email protected]>
list of titles update
39fd7f0cb1c1393332b36d4a7a531af12ef3048d authored about 7 years ago by jeffrey04 <[email protected]>