summaryrefslogtreecommitdiff
path: root/src/tutorial
diff options
context:
space:
mode:
authorJeff Davis2025-03-15 20:00:50 +0000
committerJeff Davis2025-03-15 20:00:50 +0000
commit27bdec06841d1bb004ca7627eac97808b08a7ac7 (patch)
tree4657fade845bd4a6bc791f439e462aa0c9705a5d /src/tutorial
parentc3953226a07527a1e2f7f410b83e1a7021f42888 (diff)
Optimization for lower(), upper(), casefold() functions.
Improve performance and reduce table sizes for case mapping. The main case mapping table stores only 16-bit offsets, which can be used to look up the mapped code point in any of the case tables (fold, lower, upper, or title case). Simple case pairs point to the same offsets. Generate a function in generate-unicode_case_table.pl that consists of a nested branches to test for specific codepoint ranges that determine the offset in the main table. Other approaches were considered, such as representing these ranges as another structure (rather than branches in a generated function), or a different approach such as a radix tree, or perfect hashing. The author implemented and tested these alternatives and settled on the generated branches. Author: Alexander Borisov <lex.borisov@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/7cac7e66-9a3b-4e3f-a997-42aa0c401f80%40gmail.com
Diffstat (limited to 'src/tutorial')
0 files changed, 0 insertions, 0 deletions