author | nyamatongwe <devnull@localhost> | 2012-05-26 13:26:11 +1000 |
---|---|---|
committer | nyamatongwe <devnull@localhost> | 2012-05-26 13:26:11 +1000 |
commit | e1370f834348a12fea75ad883a0d801dfd1b9d8d (patch) | |
tree | d5bbf357b84a4794326f3cf3572b2b9d39f96f7e /src/UniConversion.h | |
parent | 44241ccc28b561efcdbda77350bb5435b11b3d47 (diff) | |
For case-insensitive UTF-8 searching, use UTF8Classify to find the valid
character width so that the search is compatible with other similar code.
Optimize the treatment of single-byte ASCII characters and the loop
conditions; these changes mostly make up for the performance decrease from
calling UTF8Classify.
Add the supporting definitions UTF8MaxBytes and UTF8IsAscii to UniConversion.
Remove ExtractChar, which is no longer needed.
Diffstat (limited to 'src/UniConversion.h')
-rw-r--r-- | src/UniConversion.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/src/UniConversion.h b/src/UniConversion.h
index 87cc43f77..704f16239 100644
--- a/src/UniConversion.h
+++ b/src/UniConversion.h
@@ -5,6 +5,8 @@
 // Copyright 1998-2001 by Neil Hodgson <neilh@scintilla.org>
 // The License.txt file describes the conditions under which this software may be distributed.
 
+const int UTF8MaxBytes = 4;
+
 unsigned int UTF8Length(const wchar_t *uptr, unsigned int tlen);
 void UTF8FromUTF16(const wchar_t *uptr, unsigned int tlen, char *putf, unsigned int len);
 unsigned int UTF8CharLength(unsigned char ch);
@@ -18,5 +20,9 @@ inline bool UTF8IsTrailByte(int ch) {
 	return (ch >= 0x80) && (ch < 0xc0);
 }
 
+inline bool UTF8IsAscii(int ch) {
+	return ch < 0x80;
+}
+
 enum { UTF8MaskWidth=0x7, UTF8MaskInvalid=0x8 };
 int UTF8Classify(const unsigned char *us, int len);
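The search code that uses these definitions is not part of this file's diff, so the following is only a sketch of how an ASCII fast path and a classify call might be combined when stepping through UTF-8 text. `SketchClassify` and `CountCharacters` are hypothetical names introduced for illustration. `SketchClassify` is a simplified stand-in that assumes UTF8Classify returns the character width in the `UTF8MaskWidth` bits and sets `UTF8MaskInvalid` for bad input; it does not reproduce Scintilla's full validity checks (for example, overlong forms and surrogates are not rejected).

```cpp
#include <cstddef>

// Definitions copied from src/UniConversion.h as shown in the diff.
const int UTF8MaxBytes = 4;
inline bool UTF8IsTrailByte(int ch) { return (ch >= 0x80) && (ch < 0xc0); }
inline bool UTF8IsAscii(int ch) { return ch < 0x80; }
enum { UTF8MaskWidth=0x7, UTF8MaskInvalid=0x8 };

// Hypothetical, simplified stand-in for UTF8Classify (assumption: width in
// the low bits, UTF8MaskInvalid set and width 1 reported for invalid input).
int SketchClassify(const unsigned char *us, int len) {
	if (len <= 0)
		return UTF8MaskInvalid | 1;
	const unsigned char lead = us[0];
	int width = 0;
	if (lead < 0x80)
		return 1;
	else if (lead >= 0xc2 && lead < 0xe0)
		width = 2;
	else if (lead >= 0xe0 && lead < 0xf0)
		width = 3;
	else if (lead >= 0xf0 && lead < 0xf5)
		width = 4;
	else
		return UTF8MaskInvalid | 1;	// stray trail byte or illegal lead byte
	if (width > len)
		return UTF8MaskInvalid | 1;	// sequence is cut off
	for (int i = 1; i < width; i++) {
		if (!UTF8IsTrailByte(us[i]))
			return UTF8MaskInvalid | 1;
	}
	return width;
}

// Hypothetical helper: counts characters, treating each invalid byte as
// one character so the loop always advances.
size_t CountCharacters(const char *s, size_t len) {
	const unsigned char *us = reinterpret_cast<const unsigned char *>(s);
	size_t count = 0;
	size_t i = 0;
	while (i < len) {
		if (UTF8IsAscii(us[i])) {
			// Fast path: a single-byte character needs no classification.
			i++;
		} else {
			// Never offer the classifier more than UTF8MaxBytes bytes.
			const size_t remaining = len - i;
			const int avail = remaining < static_cast<size_t>(UTF8MaxBytes) ?
				static_cast<int>(remaining) : UTF8MaxBytes;
			i += SketchClassify(us + i, avail) & UTF8MaskWidth;
		}
		count++;
	}
	return count;
}
```

Checking `UTF8IsAscii` first keeps the common single-byte case free of a function call, which is the kind of saving the commit message describes; masking the result with `UTF8MaskWidth` means an invalid byte still advances the position by one.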