- Support multiple routable points.
- Add support for up to 256 indexes, instead of a max of 128.
- Overrides proximity scaleRadius based on type, for more local types `neighborhood` and `locality`.
- Add support for `carmen:routable_points` override to allow routable point corrections.
- Add support for `geocoder_stack_bounds` parameter to limit results to a stack's bounds during coalesce.
- Move mapnik to devDependencies, and remove other unused dependencies.
- Fix a bug where a string is passed to GridStore instead of a number when indexing.
- Fix a bug introduced in 34.0.0 that caused some features to occasionally be dropped from MBTiles files during indexing.
- Add support for Node 14, and drop support for all previous Node versions.
- Fix an issue with overly aggressive deduplication of ghost features.
- Modify treatment of autocomplete queries ending in whitespace or other word-boundary characters, such that they no longer allow prefix matching where the last token before the boundary is treated as a partial word; prefix matches in this case must now have a word boundary between the last word and any subsequent words.
- Make comparison operators available in all Carmen handlebars templates (e.g., `eq`)
- Add a new helpers file with generic reusable homegrown template helpers, with one so far: a helper to rearrange US-order street addresses into EU order (move numbers to end)
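
The address-reordering helper described above can be pictured with a small sketch. The function name `usToEuOrder` and its regular expression are assumptions made for illustration; they are not taken from Carmen's helpers file.

```javascript
// Hypothetical helper (not Carmen's actual implementation):
// move a leading house number to the end of a street address.
function usToEuOrder(address) {
    const trimmed = address.trim();
    // Capture a leading run of digits (with an optional letter suffix), then the rest.
    const match = /^(\d+[a-zA-Z]?)\s+(.+)$/.exec(trimmed);
    // Leave the address untouched when it does not start with a number.
    if (!match) return trimmed;
    return `${match[2]} ${match[1]}`;
}

console.log(usToEuOrder('123 Main St')); // "Main St 123"
console.log(usToEuOrder('Main St'));     // "Main St"
```
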
- Add a mechanism for correcting whitespace errors in addresses in limited circumstances.
- Fix a regression introduced in 33.1.1 that occasionally allowed worse address number matches to fill up all the slots before better ones could be considered.
- Add a new strategy for matching house numbers in address clusters in cases where they contain non-numeric characters, in which only the initial numeric portion is compared.
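
Comparing only the initial numeric portion of a house number can be sketched as follows. Both function names are hypothetical, and the real matching logic in Carmen may handle more cases than this.

```javascript
// Hypothetical helper: extract the leading digits of a house number, or null if none.
function leadingNumber(housenum) {
    const match = /^\d+/.exec(String(housenum).trim());
    return match ? parseInt(match[0], 10) : null;
}

// Compare two house numbers on their initial numeric portion only.
function sameLeadingNumber(a, b) {
    const na = leadingNumber(a);
    return na !== null && na === leadingNumber(b);
}

console.log(sameLeadingNumber('12B', '12'));  // true
console.log(sameLeadingNumber('12B', '120')); // false
console.log(sameLeadingNumber('B12', '12'));  // false
```
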
- Add a new scoring bonus for disambiguating features that have more than one element in their context with the same name
- Fix a bug in deduplication of 0-scored features
- Add the ability to only expose features in reverse geocodes
- Replace the `stackable`, `coalesce`, and `rebalance` algorithms with new ones that differently explore the possible combinations of index components to generate results.
- Fix a bug in selecting templates for per-feature template override
- Add "worldview" concept, where multiple versions of some indexes can be loaded for different views of the world, sharing other indexes that don't differ across worldviews
- Fix issue where exact matches and partial matches of address numbers were treated as equally good
- Allow carmen:proximity_radius to be stored in a feature doc
- Use carmen:proximity_radius to override the zoom-based proximity radius in scoredist
- Switch backing storage engine from carmen-cache to carmen-core, a new Rust datastore
- Update to a newer version and format of fuzzy-phrase that corresponds with carmen-core
- Update geocoder_version to 10 -- this is an index format breaking change
- Indexes features with ordinals dropped with lower relevance to handle missing ordinals in queries
- Performance improvements from removing dead code
- Give a slight boost to scorefactors of 1 when adjusting scoredist in spatialmatch
- Upgrade VTQuery to 0.5.0
- Require polygons to be direct hits on reverse geocodes
- Adjust the penalty applied to longer stacks to penalize stacks of length 2 less
- Use a list of frequent words to drop common words while indexing phrases
- Add logic for giving partial relevance credit to close-but-misaligned context features
- Added modular address style matching
- Added Queens address style
- Add and update various limiting constants for verifymatch
- Add backfill process for loading more features as needed in verifymatch, up to 50
- Add named callbacks for most verifymatch functions
- Tweaked result sort order for address results outside of interpolation ranges in proximity queries
- Added missing JSDocs
- Added missing changelog entries
- Improve outlier detection for interpolation ranges
- Enable selective filtering in VTQuery
- Improve VTQuery performance
- Revert VTQuery Filtering
- Changes templating engine to handlebars
- Enabled per feature templating overwrites
- Enable filtering in VTQuery
- Decrease VERIFYMATCH_STACK_LIMIT to 50
- Decrease MAX_CONTEXTS_LIMIT to 20
- Increase VERIFYMATCH_STACK_LIMIT to 100
- Add backfill for loading more contexts if relevance doesn't match expected relevance from spatialmatch
- Improve handling of ID collisions that sometimes previously resulted in failing to return any results
- Adjust relevance calculations to change how different administrative components of a query are weighed relative to one another
- Replaced mapnik with VTQuery
- Adjusted sorting in reverse geocoding with scoring to use distance as a tie breaker
- Tuned minimum distance for distscore from 50 m down to 25 m
- Don't de-duplicate address results based on matched text and context if the matched text is a numerical autocomplete or other short query.
- Convert numeric `override:*` values to strings before indexing.
- Add `&` support for intersection search queries, which allows users to search for an intersection using `&` between street names. For example, "Street A & Street B"
- Ensure `carmen:text` is stringified before entering `closest-lang`
- No longer generate or read the frequency.rocksdb file
- Demand that the `maxscore` is in index metadata
- Remove out-of-date `bin/tokenize.js` script
- Correctly return matching_text for address numbers longer than 2 digits
- Do not include categories as part of the matching_text
- Pass a proximity radius specified on an index through to coalesce
- Slightly increase the penalty for features where an address number isn't found (e.g. street fallbacks), since more of these make it into final results when address indexes have a higher proximity radius
- Use carmen-cache version 0.27.0 that specifies a distance floor
- Allow properties with an `override:<index name>` property to replace the calculated value that would be returned in the context array otherwise
- De-duplicate address results based on matched text and context
- Add support for Russian-style address numbers, including Korpus and Stroenie
- Fix error condition when attempting to resolve ID collisions involving intersection features
- Output `feature.geometry.intersection: true` on intersection features
- Add intersection search support which allows users to search for "Street A and Street B"
- Add a mechanism for address indexes to express their expected housenumber order (before or after the street name) and use it to rank otherwise-tied results.
- Add support for individual address properties using the `carmen:addressprops` tag
- Allow source replacements to change the ‘cardinality’ of a query, including splitting, combining and removing tokens.
- Global replacements are no longer enumerated at index time when generating index-able variants.
- Improved address number parsing, now recognizes formats like `2/3-4`
- Improved CJK numeric tokenization doesn’t stop parsing text after the first number.
- Fixed a bug where we could detect an address number early in processing but fail to locate it again before presenting results
- Sort non-interpolated address results over interpolated address results given the same relevance.
- Added support for performing autocomplete queries on partial housenumbers (e.g., querying for "51" and getting "510 Main St.") if proximity is enabled
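
A minimal sketch of the partial-housenumber behavior: an exact number always matches, while a prefix such as "51" only matches "510" when proximity is enabled. The function and its arguments are hypothetical and stand in for the real index lookup.

```javascript
// Hypothetical sketch: treat the query number as a prefix only when proximity is enabled.
function matchHouseNumbers(queryNum, candidates, proximityEnabled) {
    return candidates.filter((num) => {
        if (num === queryNum) return true;                    // exact match always counts
        return proximityEnabled && num.startsWith(queryNum);  // prefix match needs proximity
    });
}

console.log(matchHouseNumbers('51', ['51', '510', '151'], true));  // [ '51', '510' ]
console.log(matchHouseNumbers('51', ['51', '510', '151'], false)); // [ '51' ]
```
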
- Moved to a new carmen-cache release (0.26.0) that generates more compact indexes in some circumstances, and supports new fast-path operations to make partial-housenumber search more performant
- Added support for indexes with index bounds that cross the antimeridian
- Refactored tokenization logic to facilitate future feature work around cardinality-changing token replacements
- Removed name regex support in token replacement via XRegExp. Native support in node 10+ is still functional.
- Fix address vs unit number position detection when address and unit numbers exist in different clusters.
- Add fallback behavior if tilecover fails to calculate covers for geometry
- Update to carmen-cache to 0.25.0
- Tightened the width of the Gaussian curve used in proximity scoring by a factor of 3.
- Rescaled distance and weight during proximity scoring. Previously both were put on a 1-11 scale. Now distance is 1-10 and score is 1-500.
- Fixed a bug where the composite relevance + scoredist value was ignored during a sort if the b value was greater than a.
- Lessen relevance penalty on street features.
- Rewrite carmen token replacement logic to more heavily leverage fuzzy-phrase
- Fix a bug where category matches could have a relevance greater than 1.
- Updated carmen-cache to 0.24.0 which adds support for word-boundary aware autocomplete
- Added support for passing carmen-cache an integer value indicating the type of autocomplete to perform for simple token replacements
- CLI options `--autocomplete` and `--fuzzyMatch` now default to `true` when not explicitly set
- CLI option `--routing` now defaults to `false` when not explicitly set
- Updated carmen-cache to 0.23.0 which includes internal refactor and RocksDB update.
- Fix a bug in language auto-population rules including language tags containing hyphens.
- Fix the behavior of the geocoder_universal_text index flag so it actually skips language penalties for universal indexes.
- Added support for the TileJSON property `geocoder_ignore_order`. This can be used to exempt specific indexes from incurring the "backy" penalty in verifymatch, and is useful for address components whose typical written position is contra the hierarchical ordering (e.g. US postcodes are typically written at the end of an address, despite not being the highest level of the US address hierarchy).
- Removed query-time de-duplication based on street address & distance.
- Sort final results based on a composite scoredist and relevance score with penalties for features with `carmen:address` of `null`, omitted geometries, or `carmen:score`s of `-1` (ghost features)
- Update all deps to latest versions
- Add a mechanism for auto-populating language bitfields based on the assumed language of a given country
- Improve efficiency of forward queries that use type filters
- Calculate `scoredist` in verifymatch.js as a product of score normalized by max score of all features and distance normalized by proximity radius and scaled along a gaussian curve
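
The shape of that calculation can be sketched as below. This is an illustration under stated assumptions only: the changelog does not give the constants or the exact curve, so the unit-width Gaussian, the function signature, and the parameter names here are hypothetical and do not reproduce the code in verifymatch.js.

```javascript
// Sketch only: the constants and exact curve used by Carmen are not given in this changelog.
function scoredist(score, maxScore, distance, radius) {
    // Score normalized by the max score of all features (guard against a zero max).
    const normalizedScore = maxScore > 0 ? score / maxScore : 0;
    // Distance normalized by the proximity radius.
    const normalizedDistance = distance / radius;
    // Gaussian falloff: 1 at distance 0, decreasing as distance grows.
    const distanceWeight = Math.exp(-0.5 * normalizedDistance * normalizedDistance);
    return normalizedScore * distanceWeight;
}

console.log(scoredist(500, 1000, 0, 40)); // 0.5
```

With this form, a feature at the proximity point keeps its full normalized score, and features further away are discounted smoothly rather than cut off.
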
- Consider nmask earlier in stackable to improve performance
- Allow streets to be returned as a fallback if no address number match is possible
- Use proximity point (when provided) to bias sort order before spatial match cutoff
- Added `geocoder_categories` to TileJSON input to allow a small score bump for category queries
- Added `autocomplete` and `fuzzyMatch` boolean CLI options to `bin/carmen.js`
- Added some missing tests for checking other CLI options like `proximity`, `bbox` and `reverseMode`
- Expose some internal cutoffs as configuration options
- Refactor proximity.scoredist tests
- Fix proximity.scoredist tests
- Fix geocoder_format template parsing to handle presence of Arabic comma
- update to carmen-cache@0.21.5, which reduces the cross-language relevance penalty
- Ensure all `.address` properties are String values
- Fix a bug in text processing that crashed indexing when the `text` contains reserved words.
- Constrain the circumstances under which fuzzy matching is used to improve speed
- Fix a bug in language parameter parsing that can cause crashes on malformed language tags
- Replace the dawg-cache text backend with the new node-fuzzy-phrase library
- Add support for fuzzy text matching
- Significantly refactor the phrasematch operation
- Move to a new, faster carmen-cache release
- Add support for node 8 and node 10
- Update yarn lockfiles
- Update mapnik & sqlite3 deps to their latest versions
- Fix text indexing issue that caused indexing to fail upon encountering certain kinds of malformed unicode input
- Update context.js to only calculate routable points if `routing` option is enabled
- Add basic routable point functionality for forward and reverse geocoding of features from sources that are routable
- improve diacritical mark removal to support combining diacritics
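
One common way to strip combining diacritics in JavaScript is to decompose the string with `String.prototype.normalize('NFD')` and then remove the combining marks. The sketch below shows that general technique; it is not necessarily how Carmen implements it, and `removeDiacritics` is a hypothetical name.

```javascript
// General technique, not Carmen's actual code:
// decompose characters, then drop marks in the Combining Diacritical Marks block.
function removeDiacritics(text) {
    return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Precomposed "a with acute" (U+00E1) and "a" + combining acute (U+0301) fold the same way.
console.log(removeDiacritics('M\u00e1laga'));  // "Malaga"
console.log(removeDiacritics('Ma\u0301laga')); // "Malaga"
```
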
- update to carmen-cache@0.21.1, which removes unnecessary dependencies
- major reorganization of modules, but no API changes
- major reorganization of tests
- remove carmen-copy, along with associated lib and tests
- add support for dashes in format string templates
- changed the fallback for `hr` from `sr` to `sr-Latn`
- Add support for Oconomowoc, WI style addresses
- JSDoc comments added
- documentation.js for auto-generating docs/api.md
- warn when code is missing JSDoc
- switch to codecov for coverage reporting
- massive linting/cleaning
- overhauls readme
- adds example project
- adds separate docs folder for in-depth topics
- Fix context builder bug
- Reduces index size by dropping interpolation for address clusters that have a wide address number range
- Add support for matching_place_name output on address features
- Improve tokenize script
- Fix bug where capture groups would be incorrectly numbered when using a token with a diacritic and a capture group
- Fix a bug with validating `--bbox` flag in scripts/carmen.js
- Add `geocoder_stack` filter to phrasematch, to skip unneeded indexes earlier when using the `country` query param
- Hard fail when indexing features with more than 10 synonyms
- Fix a bug with validating type filters on geocodes.
- Use es6-native-set and es6-native-map instead of builtin Set and Map, avoid hitting Node's memory limit for large indexing jobs
- Update to turf@5.x.x
- Fix memory issue introduced in 24.1.1 by limiting the number of duplicate address numbers considered per address cluster to 10.
- Fix typo in carmen-merge bin script from 24.1.7
- Add carmen-merge script to `bin` in package.json
- add carmen-merge to package.json.s
- Indexing performance improvements via optimizations to token replacement.
- Switch to `yarn` for tests and migrate package-lock to yarn.lock
- Fix indexing of `universal_text` when the value is shared with another language
- Disable autocomplete on an address's numerical token when the token is moved to the beginning of the query string.
- Support correct forward geocoding over address clusters that contain multiple entries for the same address number.
- Allow for greater flexibility in the token replacement representation introduced in 24.0.0
- Optionally support a greater number of token-replacement permutations efficiently
- Fix proximity issue via upstream fix in carmen-cache
- Tune proximity settings to weight local results more heavily
- Fix indexing of address text without house numbers to be weighted consistently
- Make maskAddress a bit smarter by looking at both the coverText and query to determine if it's about to reuse a housenum that was really originally interpreted as a street. (https://github.com/mapbox/carmen/pull/648)
- Fix a bug that could let indexer-only token replacers leak into runtime replacers (https://github.com/mapbox/carmen/pull/649)
- Remove stacky bonus and gappy penalty (https://github.com/mapbox/carmen/pull/647)
- Packaging fix for carmen-cache.
- Improves handling of cross-language queries against data with partial translation coverage.
- Update handling of default text to no longer have preferential fallback treatment.
- Split display and query fallback language definitions.
- Create `[W,S,E,N]` bboxes when feature geometry straddles the antimeridian
- Optionally clip these `[W,S,E,N]` bboxes at +/-179.9 degrees, to preserve backwards compatibility
- Small improvements to language fallback behavior for Latvian, Lithuanian, Azerbaijani, and Estonian
- Fix a problem with carmen-index.js in cases where the passed-in tokens file contains a function
- 🚀 Allow function to be used as tokens file in scripts/carmen.js
- Update Deps to @mapbox prefix where possible
- `trim()` abbr after each tokenize.replaceToken call
- Index multiple variations on token replacement to better support autocomplete of token-replaced text, defaulting to indexing unambiguously reversible replacements
- Add `custom_inverse_tokens` mechanism to allow specifying behavior in ambiguous cases
- Fix a bunch of token-replacement-related bugs
- Add `text_universal`, for text which can apply to all languages
- Update carmen-cache
- Collapsing variants of ARABIC LETTER YEH for uniform indexing
- Update tests to arrow functions, `let`, & `const`
- Change sort behavior for tied addresses so first number is given slight boost
- Update to mapnik `~3.6.0`
- Handle situations in which an ID shard contains multiple features from the same tile
- Fix a few cases where `matching_text` and `matching_place_name` properties were not set as expected.
- Add support for multiple languages to be specified in the `language` option and multiple language output formatting.
- Drop support for node `4.x.x`
- Support centered around `6.10.2`
- Update dependencies to support 6.x.
- Fix a bug where `indexes` weren't returned for an idGeocode
- Fix a bug in string sorting affecting some strings with mixed complex scripts after the unidecode removal.
- Update carmen-cache to v0.18.0, which stores per-language metadata in the grid cache, and adapt carmen accordingly, to allow proper support of multilingual autocomplete and language-weighted results.
- Drop unidecode altogether, and replace it with a much slimmer diacritical mark folder, such that most non-ASCII text is now indexed as-is, improving multilingual accuracy.
- Fix a bug that in certain situations allows features with a null value in their `carmen:center` property to pass validation
- PT addresses are now returned over ITP addresses only if they fall within a set distance
- Fix a bug in package.json
- Fix an issue introduced by the switch to RocksDB, in which numeric tokens would match address numbers before features with numeric text (such as postcodes)
- add English as a fallback language for Arabic and tests to confirm this behaviour.
- add `reverseMode` parameter. When set to `score`, a feature's score will be considered when sorting the results of a reverse query. Defaults to `distance`.
- Update to carmen-cache@0.17.0, a major revision which eliminates cache sharding and moves the underlying storage mechanism to one backed by RocksDB
- Adapt carmen to this new cache layer by eliminating logic around on-the-fly loading and storing of grid and frequency data, which is now delegated to RocksDB
- Change phrase IDs to strings, allowing elimination of degen indexing in favor of ID prefix scans in carmen-cache
- Add a `digraphic` array of languages known to use multiple scripts, for more rigorous filtering in `languageMode: strict`
- Add additional Serbian fallbacks
- Add an `equivalent.json` mapping of allowed equivalent languages
- Allow equivalent languages to pass the `languageMode: strict` filter
- Add `sr_Latn` fallback for `sr_BA`, `sr_CS`, `sr_ME`, and `sr_RS` language codes
- Remove code/support for version 0 legacy features
- Adds index-level option `geocoder_universal_text` for allowing features in an index to be considered language-agnostic/compatible with any requested language when using `languageMode=strict`
- Improve proximity distance calculation for polygon features.
- Update to carmen-cache@0.16.5.
- Update to carmen-cache@0.16.4.
- Add support for IL style addresses: `43N134 Woodward Ave.`
- Revert spatialmatch stack truncation from 18.1.2
- Update to carmen-cache@0.16.3 with additional `coalesce()` performance optimizations
- Spatialmatch the top 4 most specific features of each subquery stack as a performance optimization/safeguard against massive `coalesce()` jobs
- Optimizations to runtime query and indexing operations
- Adds new querytime option languageMode which can be set to `strict` to limit returned features to only those that fully match the language specified in the language option
- Breaking change: a log scale distribution is now used for the 3-bit grid cache simplified score
- Move project to `@mapbox` namespace on npm
- Fix the timing calculation reported with the `--stats` flag
- Update outdated dependencies. In particular, use namespaced `@turf` modules
- Use `Number` instead of `parseFloat` to detect reverse queries, as `parseFloat` will silently drop non-numeric parts of a string, leading to `9a,10b` being interpreted as a reverse query.
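
The difference between the two parsers is standard JavaScript behavior and can be seen directly. The `looksLikeCoordinates` function below is a hypothetical stand-in for Carmen's reverse-query detection, written only to show why `Number` is the stricter check.

```javascript
// Hypothetical detector: a query looks like "lon,lat" when both parts parse to finite numbers.
function looksLikeCoordinates(query, parse) {
    const parts = query.split(',');
    return parts.length === 2 && parts.every((part) => Number.isFinite(parse(part)));
}

// parseFloat stops at the first non-numeric character, so '9a' becomes 9.
console.log(looksLikeCoordinates('9a,10b', parseFloat));  // true
// Number rejects the whole string, so '9a' becomes NaN.
console.log(looksLikeCoordinates('9a,10b', Number));      // false
console.log(looksLikeCoordinates('-77.03,38.9', Number)); // true
```
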
- Update to `@mapbox/carmen-cache` package namespace and use latest release (`0.16.2`) that addresses several performance and stability issues.
- Fix a spatialmatch bug where low relevance partial text matches would displace higher-relevance full text matches
- Refine multitype behavior and add `place_type` key to explicitly enumerate the types a feature can be.
- Fix indexer behavior for indexes where the max score is specified as 0
- Change penalty from 0.006 => 0.01 to put it on the same 10% scale as other penalties
- Change indexing behavior: don't generate degens (for autocomplete) for feature synonyms
- Filter results disallowed by the `types` filter before sorting and limiting potential matches
- In spatialmatch, sort stacks by index from lowest to highest when zoom level is the same
- Add alternate unicode apostrophes for punctuation normalization
- Use fallback language when the specified language key exists, but has a null value.
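
A sketch of that fallback rule: a language key that exists but holds `null` is treated the same as a missing key. The `getText` function, the sample properties, and the fallback list are hypothetical; only the `carmen:text` and `carmen:text_{ISO language code}` property names come from this changelog.

```javascript
// Hypothetical lookup: try the requested language, then each fallback, then the default text.
function getText(properties, language, fallbacks) {
    for (const lang of [language, ...fallbacks]) {
        const value = properties[`carmen:text_${lang}`];
        // A key that exists with a null value is treated the same as a missing key.
        if (value !== null && value !== undefined) return value;
    }
    return properties['carmen:text'];
}

const properties = {
    'carmen:text': 'Deutschland',
    'carmen:text_es': null,      // key exists, but has no value
    'carmen:text_en': 'Germany'
};

console.log(getText(properties, 'es', ['en'])); // "Germany"
console.log(getText(properties, 'fr', []));     // "Deutschland"
```
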
- Update to `carmen-cache@0.15.0`.
- Automatically lowercase all `stacks` values for a given query
- Move eslint to dev dependencies.
- Update to `carmen-cache@0.14.1`.
- Trim whitespace from text values when outputting feature values.
- Update to `carmen-cache@0.14.0`.
- Robustify language fallback behavior for unmatched language suffixes.
- Modified language fallback behavior to reflect feedback collected from human translators.
- Fix multitype corner case where a feature promoted across levels would not always be properly promoted.
- Update several dependencies to `@mapbox` namespaced versions.
- Performance optimizations for `phrasematch()` when dealing with tokens that resolve to empty strings/whitespace when unidecoded.
- Fixes bug where unencodable text like emojis wasn't being ignored.
- Adds index-level option `geocoder_inherit_score` for promoting features that nest within other similarly named parent features (e.g. New York (city) is promoted above New York (state)).
- Add stopgap measure to indexer to partially handle features with > 10k zxy covers. (https://github.com/mapbox/carmen/pull/545)
- More consistent behavior for nested feature promotion when used with the `language` option.
- Code and style improvements.
- Modifies verifyContext to better handle identically-named nested features e.g. "New York, New York". Preferentially returns the smaller feature in such cases.
- Introduce mechanisms for approximate guessing of requested language, both using heuristics and hard-coded fallbacks.
- Include private `carmen:` properties in feature output when in debug mode.
- Switch `carmen:dbidx` to `carmen:index` to track feature to index relationship more easily.
- Performance improvements to `spatialmatch.stackable()`
- Fix bug where type filters would not always work correctly with forward geocodes and multitype indexes.
- Fix bug around feature loading in verifymatch.
- Adds support for individual multitype features in indexes determined by the `carmen:types` attribute. See README for more details.
- Fix typo in `lib/verifymatch.js`
- Performance optimizations for `spatialmatch.stackable()`.
- Fix for several calls that could lead to max call stack exceeded errors.
- During indexing, ensure all work in `process.stdout` finishes before exiting the process
- Fixes formatting of error message when an invalid `types` value is specified.
- Allows for filtering by subtypes (e.g. `poi.landmark`) which are defined by score range.
- Allow more flexible regexes in global tokens and refactor how they are applied.
- types + limit reverse query mode is now only a concept handled by reverseGeocode().
- context() always returns a single context.
- Adds context.nearest() for playing the role that the proximity context mode played before -- returns a flat array of [ lon, lat ] points that can then be context() queried for full features.
- Adds additional unit test to demonstrate that in types/limits mode reverse geocodes do indeed load full features/derive address points properly.
- More verbosity in --debug output
- Optimize vector geojson output at indexing time for lighter vector tiles.
- Bump carmen-cache for better error handling on index merges.
- Use stricter eslint rules.
- Add support for addresses that are ordered from largest feature to smallest
- Fix a bug in ID queries when `geocoder_name` != `geocoder_type`.
- Fix an issue with too-strict filtering of indexes that use a combined stack range
- All addresses are now standardized to GeometryCollections internally
- Allows for mixed type (pt/itp) features as well as reducing complexity at runtime (at the cost of index time)
- Bump due to npm strangeness
- Fix global token bug that prevented global tokens being used by indexer
- Added ability for carmen cli to specify global token file
- Moves limit constants into `lib/constants.js` for easier tracking and updates.
- Set the relevance score to 1 when a feature is queried by ID
- Ensures that tokens which contain whitespaces are a part of the global tokens
- Fix bug where dedup could put less relevant results in front of higher ones
- Fix broken phrasematch bench
- Use normalized ranges ITP instead of default feature - fixes bug where null lf/lt/rf/rt would hard error if null instead of empty array
- Ensure address clusters are all lowercase to ensure no case disparity between input query and cluster
- Dedup identical addresses with different cases ie MAIN ST = Main St
- Remove unnecessary check for carmen:center at indexing time
- Fix bug where non-clustered address ranges (LineString) of a numeric type would fail
- Fix bug where copy, merge streams would be considered done prematurely
- Moved merge operations to cpp threadpool for better performance
- carmen-cache@0.13.0
- carmen-cache@0.12.1
- Add `bbox` query option
- save memory in addresscluster by calculating minimum without unnecessary array
- 30% more efficient string traversal in getPhraseDegens
- Removes parallel process capability in carmen-indexer
- Disables generation of autocomplete degens in the grid cache at indexing time for translated text
- Upgrades mapnik to version 3.5
- Add infrastructure for merging multiple indexes together, to facilitate parallel indexing.
- Improve query fallback logic by scoring queries per number of matching indexes as well, instead of just per number of matching tokens.
- Segment exclusively Chinese/Japanese/Korean (CJK) terms from everything else in the index in order to avoid collisions introduced by unidecoding (e.g. 'Aruba' / 'Arubatazhou').
- Add a flag to disable autocomplete in forward geocoding
- Remove deprecated tilelive from index parameter and update to only use streaming interface
- Expand index.update to use object options instead of just zoom
- Expose source._commit where it exists
- Cleanly exit after obtaining results with scripts/carmen.js when using --config flag
- Better handling of empty strings in DAWG index
- Streaming indexer should utilize geocoder_resolution in tile cover
- Fix context properties bug
- Add id=> output for all features
- Enforce GeoJSON compliance on indexing
- Ensure addressitp parity exists
- Allow up to 10 forward geocodes (default 5) specified by the limit param
- Allow up to 5 reverse geocodes (default 1) when only querying for a single type
- Update indexer to transform ITP & Clusters for vectorization and output to stream
- Add new limit tests
- Upgraded to use latest node-mapnik API for decoding and encoding Vector Tiles
- Enforce max query length of 256 chars
- Enforce max query length of 20 tokens
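
The two query limits above can be pictured as a simple up-front check. This is a sketch under assumptions: the constant names, the whitespace tokenization, and the error messages are hypothetical, and only the limits themselves (256 characters, 20 tokens) come from this changelog.

```javascript
const MAX_QUERY_LENGTH = 256; // characters
const MAX_QUERY_TOKENS = 20;  // tokens; whitespace splitting is an assumption

// Hypothetical validator: reject over-long queries before any matching work is done.
function validateQuery(query) {
    if (query.length > MAX_QUERY_LENGTH) {
        throw new Error(`Query too long: ${query.length}/${MAX_QUERY_LENGTH} characters`);
    }
    const tokens = query.trim().split(/\s+/).filter(Boolean);
    if (tokens.length > MAX_QUERY_TOKENS) {
        throw new Error(`Query too long: ${tokens.length}/${MAX_QUERY_TOKENS} tokens`);
    }
    return tokens;
}

console.log(validateQuery('100 main st springfield')); // [ '100', 'main', 'st', 'springfield' ]
```
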
- Fix bug where a feature with a stack name could be discarded, giving the next feature an incorrect carmen:idx
- Update addFeature.js to index and then vectorize using output from stream.
- Upgrade to Node 4, dropping 0.10
- Update deps in anticipation of deprecating Node 0.10 in favour of 4.0
- Migrate all unit tests to GeoJSON
- Internal addFeature function now only accepts GeoJSON
- Cleanup unused code as well as add additional JSDoc comments
- Add streaming interface for indexing
- Output transformed GeoJSON features for vector tiles as stream
- Optimize/reduce I/O when types filter is used.
- Fix bad reference in verifymatch leading to crashing error.
- Fix wasteful duplicate I/O when loading grid cache shards.
- Stop addFeature() in unit tests from overwriting VT. Instead decode and append to it.
- Add support for feature/index level `geocoder_stack` parameter. This parameter allows for stack based filtering (as opposed to type filtering)
- Also uses stack for building stackable phrase list instead of bounds
- Drop `mmap` dependency.
- Reintroduce `XRegExp` dependency for limited circumstances where named capture groups are necessary.
- Carmen's dict cache now uses directed acyclic word graphs instead of the bit array cache introduced in carmen 9.0.0. As before, they are generated at index time and stored and can be dumped and loaded in a single contiguous-memory chunk, so fast start times should be preserved as compared to bit cache, but with more memory compactness and lower collision rates.
- CJK characters in an indexable word or phrase are now indexed individually to support the practice of addresses being written from largest -> smallest geographical entity and without delimiters.
- Bugfixes to multiconf approach.
- Refactored dict cache using bit arrays and `mmap` for lower runtime memory profile.
- Refactored index loading to cleanly handle multiple configurations from the same source instances.
- Supported `geocoder_version` is now 5.
- Catch more unhandled error cases for debugging.
- Expose custom feature properties in `context` entries.
- Clean up context handling of various feature encoding methods from VTs.
- Catch unhandled error case for debugging purposes.
- Update to locking@2.0.2.
- Perform type filtering for reverse geocodes at the context.js level instead of after the context stack has been generated.
- Add `geocoder_type` flag which allows non-similar indexes to compete for the lowest result in a reverse geocode.
- Set better text templating failovers for localization support.
- Fail index builds with bad language codes.
- Add a `language` option that will return the values of `carmen:text_{ISO language code}` in the format of `geocoder_format_{ISO language code}` if available in the index.
- Synonymize `carmen:text_{ISO language code}` field in indices with `carmen:text` field to support queries in multiple languages.
- Change `geocoder_address` field to `geocoder_format` to retain ability to differentiate between address and non-address indexes. `geocoder_address` is not a binary `0` or `1` value.
- Update geocoder to accept templates for place name formatting
- Update indexer & feature objects to use fully compliant GeoJSON
- Update carmen-cache to 0.9.0, introduce `geocoder_cachesize` option.
- Update mapnik to version 3.4.3
- Use a singleton VT cache to limit memory usage across indexes.
- Switch to murmur hash and 52-bit phrase IDs.
- Triage unknown conditions that can cause an unexpected error.
- Improve `scoredist` calculation by using geometric mean of features for scaling scoredist, not max score.
- Add `types` query option to filter results by feature type.
- Fix feature id query mode.
- Smart dedupe of ghost features out of result sets when they match text of other non-ghost features.
- NumTokens V3 for more efficient feature verification. See https://github.com/mapbox/carmen/pull/310.
- Sieve indexing mode for broader indexing of feature text.
- Bumps geocoder_version to 3 (version 2 continues to be supported at runtime).
- Fix for feature verification bug where a non-optimal relevance could sometimes be assigned.
- Fix for proximity bug where extreme values would push into negative xy integers.
- Fix for decollide bug.
- Improved performance for feature verification in verifymatch.
- More efficient spatialmatch by introducing a bounds mask per index and loading grids at spatialmatch time.
- Update index format for delta encoding in carmen-cache@0.7.x. This is a breaking change that requires reindexing. See https://github.com/mapbox/carmen/pull/301 and mapbox/carmen-cache#37 for details.
- More sophisticated tokenization behavior around punctuation -- apostrophe and period characters collapse while most others split terms.
- Doubletap - more conservative vt caching settings
- Updates to carmen-cache @ 0.6.0 for more conservative memory use
- Reduces vtile LRU cache size for high zoom sources
- Robustification fix for how token replacement handles unidecode at indexing + query time.
- Fixes to proximity mode to account for both score and distance.
- Bug fix for lone housenum subquery permutations and upgrade to carmen-cache@0.5.1.
- Large refactor of carmen index structure and indexing/runtime processes. See https://github.com/mapbox/carmen/pull/287
- Rollback XRegExp use.
- Move `addfeature.js` from test directory to lib for external testing use.
- Fix for max call stack errors when using grid indexes with high cardinality.
- Extend geocoder_tokens to use XRegExp.
- Pin to node-mapnik 3.2.x until mapnik 3.3.x is ready.
- Allow geocoder_tokens to be expressed as explicit regex patterns.
- Include ghost features if queried for explicitly.
- Proximity fixes.
- Proximity fixes.
- Prioritizes layer type + score consistently across proximity/non-proximity mode.
- Additional sort stabilization at verifymatch stage.
- Improvements to result stability in proximity mode.
- Use single closest degen for non-terminal terms.
- Update to carmen-cache@0.4.1 with support for swapped order setRelevance.
- Snap dataterm min/max values to nearest thousand. Reduces cardinality of phrase index with minimal effect on dataterm accuracy at querytime.
- Use cache#loadall when indexing.
- Added cache#loadall for loading shards without retrieving results.
- Improved suggestion/autocomplete support for partial queries.
- Breaking change: Introduces `dataterm` term ID type. Any carmen indexes generated previously that contained numeric text (e.g. US zipcodes or addresses with housenumbers) need to be reindexed using carmen@3.x.