author Robin Haberkorn <robin.haberkorn@googlemail.com> 2024-09-04 12:49:29 +0200
committer Robin Haberkorn <robin.haberkorn@googlemail.com> 2024-09-09 18:22:21 +0200
commit b31b88717172e22b49c0493185f603b8f84989ec (patch)
tree 43850d7d04e721987b89c37c68f24e657b5cb9c6 /doc
parent b85edaa0021c06d63fee6d8904fc822815e8b933 (diff)
the ^EUq string building escape now respects the encoding (can insert bytes or codepoints) (refs #5)
* This is trickier than it sounds because there isn't one single place to consult. It depends on the context. If the string argument relates to buffer contents - as in <I>, <S>, <FR> etc. - the buffer's encoding is consulted. If it goes into a register (EU), the register's encoding is consulted. Everything else (O, EN, EC, ES...) expects only Unicode codepoints.
* This is communicated through a new field teco_machine_stringbuilding_t::codepage which must be set in the states' initial callback.
* Seems overkill just for ^EUq, but it can be used for context-sensitive processing of all the other string building constructs as well.
* ^V and ^W cannot be supported for Unicode characters for the time being without a Unicode-aware parser
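
The context dependence described above can be illustrated with a small standalone sketch. This is not SciTECO's actual code: the function name expand_code and the CP_* constants are hypothetical and exist only for this example (65001 is assumed here because it is the value Scintilla uses for SC_CP_UTF8). It shows how a single numeric code could expand to one raw byte under a single-byte encoding, or to a UTF-8 sequence when the target expects Unicode codepoints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical codepage identifiers for this sketch only.
 * 0 stands for any single-byte encoding, 65001 for UTF-8. */
enum { CP_SINGLE_BYTE = 0, CP_UTF8 = 65001 };

/*
 * Write the character for `code` into `out`, which must have room
 * for at least 4 bytes. Returns the number of bytes written, or 0
 * if `code` is not valid for the given codepage.
 */
size_t expand_code(uint32_t code, int codepage, unsigned char *out)
{
    assert(out != NULL);

    if (codepage != CP_UTF8) {
        /* Single-byte encoding: the code is a raw byte value. */
        if (code > 0xFF)
            return 0;
        out[0] = (unsigned char)code;
        return 1;
    }

    /* UTF-8: the code is a Unicode codepoint.
     * Surrogates and values above U+10FFFF are rejected. */
    if (code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF))
        return 0;

    if (code < 0x80) {
        out[0] = (unsigned char)code;
        return 1;
    }
    if (code < 0x800) {
        out[0] = (unsigned char)(0xC0 | (code >> 6));
        out[1] = (unsigned char)(0x80 | (code & 0x3F));
        return 2;
    }
    if (code < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (code >> 12));
        out[1] = (unsigned char)(0x80 | ((code >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (code & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (code >> 18));
    out[1] = (unsigned char)(0x80 | ((code >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((code >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (code & 0x3F));
    return 4;
}
```

In this sketch the code 66 yields the single byte "B" under either codepage, while the code 228 yields one byte under the single-byte codepage and two bytes under UTF-8. That difference is why the caller has to state which encoding applies.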
Diffstat (limited to 'doc')
-rw-r--r-- doc/sciteco.7.template 6
1 files changed, 6 insertions, 0 deletions
diff --git a/doc/sciteco.7.template b/doc/sciteco.7.template
index a6cca40..ca23c93 100644
--- a/doc/sciteco.7.template
+++ b/doc/sciteco.7.template
@@ -1647,6 +1647,12 @@ Expands to the character whose code is stored in the numeric
part of Q-Register \fIq\fP.
For instance if register \(lqA\(rq contains the code 66,
\(lq^EUa\(rq expands to the character \(lqB\(rq.
+The interpretation of this code depends on the context.
+Within inserts and searches (\fBI\fP, \fBS\fP, etc.) bytes or Unicode codepoints
+are expected depending on the buffer's encoding.
+Operations on registers (\fBEU\fP) similarly consult the
+register's encoding.
+Everything else expects Unicode codepoints.
.TP
.SCITECO_TOPIC ^EQ ^EQq
.BI ^EQ q