|
* The new official homepage is https://sciteco.fmsbw.de/
* My new contact address is rhaberkorn AT fmsbw.de.
* The scintilla-mirror is now also on https://git.fmsbw.de/scintilla-mirror/
* Added CI script for my server on fmsbw.de that builds
the website.
It's run in a FreeBSD container, but does not currently
distribute FreeBSD binaries.
|
|
empty labels are ignored
* This has long been a TECO-11 incompatibility.
* The first label in a list has index 0, i.e. `1Ofoo,bar$` jumps to label `!bar!`.
Consequently 0 is also implied, so `Olabel$` continues to do what you expect.
* `0Ofoo$` was previously also jumping to `!foo!`, which was inconsistent:
all invalid indexes should do nothing, i.e. execution continues after the go-to command.
* Fixed handling of empty labels as in `1Ofoo,,bar$` - execution should also continue
after the command.
This eases writing "default" clauses immediately after the go-to (see the sketch below).
* The ED hook values now also begin at 0, so most existing ED hook macros should
continue to work.
* Similarly, the mouse events returned by -EJ also begin at 0 now,
so fnkeys.tes continues to work as expected.
* It's still very possible of course that this breaks existing code.
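A minimal sketch of the new 0-based computed goto with a "default" clause (the register name, label names and clause bodies are made up):
```
Qi Ored,green$    !* jumps to !red! if Qi is 0, to !green! if it is 1 *!
!* any other value falls through to this "default" clause *!
...
!red! ...
!green! ...
```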
|
|
TECO compatibility
|
|
special Q-registers ^Ax
* The unnamed buffer is also used for reading from --stdin, so you couldn't practically combine
--stdin with passing command-line arguments to macros.
* The old approach of passing command-line arguments via lines in the
unnamed buffer was flawed anyway, as it wouldn't work with filenames containing LF.
This was simply a very old feature, written before SciTECO even had long Q-register names.
* You can now pipe into SciTECO and interactively edit what was read, e.g. `dmesg | sciteco -i`.
You can practically use SciTECO as a pager.
* htbl.tes is now a command-line filter (uses -qio).
* grosciteco.tes reads Troff intermediate code from stdin, so we no longer need
"*.intermediate" temporary files.
* Added a getopt.tes test case to the test suite.
* This change unfortunately breaks most macros accepting command-line arguments,
even if they used getopt.tes.
It also requires updating ~/.teco_ini - see fallback.teco_ini.
|
|
commands with multiple string arguments
* When `@`-modifying a command with several string arguments and choosing `{` as the alternative
string termination character, the parser would get totally confused.
Any sequence of `{` would be ignored and only the first non-`{` would become the termination character.
Consequently you also couldn't choose a new terminator after the closing `}`.
So even a documented code example from sciteco(7) wouldn't work.
The same was true when using $ (escape) or ^A as the alternative termination character.
* We can now correctly parse e.g. `@FR{foo}{bar}` or `@FR$foo$bar$` (even though the
latter one is quite pointless).
* This has probably been broken forever (even before v2.0).
* Whitespace is now ignored in front of alternative termination characters as in TECO-64, so
we can also write `@S /foo/` or even
```
@^Um
{
!* blabla *!
}
```
I wanted to disallow whitespace as termination characters, so the alternative would have been
to throw an error.
The new implementation at least adds some functionality.
* Avoid redundancies when parsing no-op characters via teco_is_noop().
I assume that this is inlined and drawn into any jump table that might be
generated for the switch statement in teco_state_start_input().
* Alternative termination characters are still case-folded, even if they are Unicode glyphs,
so `@IЖfooж` would work and insert `foo`.
This should perhaps be restricted to ANSI characters?
|
|
and also the CTRL+L immediate editing command
* ^W can be added to loops in order to view progress in interactive mode
(see the sketch below).
It also sleeps for a given number of milliseconds (10ms by default).
* In batch mode it is therefore the sleep command.
* Since CTRL+W is an immediate editing command, you will usually type it as Caret+W.
ASCII 23, however, will also be accepted.
* While ^W only updates the screen, you can force a complete redraw by pressing CTRL+L.
This is what most terminal applications use for redrawing.
It will make it harder to insert ASCII 12, but this is seldom necessary since it
is a form feed.
^L (ASCII 12 and the upcaret variant) is still a whitespace character and therefore treated as a NOP.
* DEC TECO had CTRL+W as the refresh immediate editing command.
Video TECO uses <ET> as a regular command for refreshing in loops.
I'd rather keep ET reserved as a potential terminal configuration command
as in DEC TECO, though.
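A minimal sketch of watching a lengthy loop in interactive mode, assuming the optional argument of ^W is the sleep time in milliseconds as described above (the iteration count and the 50ms value are made up):
```
1000<
  ...       !* some lengthy processing per iteration *!
  50^W      !* refresh the screen and sleep for 50ms *!
>
```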
|
|
support the append operation (:Xq, :^Uq...)
This works via a default implementation in the "external" Q-register "class",
which first queries the string, appends to it and sets it again.
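A minimal sketch of copy vs. append into a register (the file name is made up):
```
HXq             !* copy the entire current buffer into register q *!
EBother.txt$    !* switch to another file *!
H:Xq            !* append its contents to register q *!
```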
|
|
* `[q]~` was broken and resulted in crashes since it reset the clipboard character to 0.
In fact, if we don't want to break the `[a]b` idiom, we cannot use the numeric cell
of register `~`.
* Therefore we no longer use the numeric part of register `~`.
Once the clipboard registers are initialized, they completely replace
any existing register with the same name that may have been
set in the profile.
So we still don't leak any memory.
(But perhaps it would now be better to fail with an error
if one of the clipboard registers already exists?)
* Instead, bit 10 (1024) of ED is now used to change the default
clipboard to the primary selection (see the profile sketch below).
The alternative might have been an EJ flag or even a special register containing
the name of the default clipboard register.
* Partially reverses 8c6de6cc718debf44f6056a4c34c4fbb13bc5020.
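A minimal profile sketch, assuming the new bit is set just like the other ED flags mentioned in this log:
```
0,1024ED    !* use the "primary" selection as the default clipboard *!
```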
|
|
* It continues to default to 67 (C), which is the system clipboard.
But you can now override it, e.g. by adding `^^PU~` to the profile.
* This fixes a minor memory leak:
If you set one of the clipboard registers in the profile (initializing
them as plain registers), the clipboard register would be leaked.
The clipboard registers now replace any existing register,
while at the same time preserving the numeric part.
* All remaining Q-Reg table insertions use a new function
teco_qreg_table_insert_unique(), which adds an assertion, so that
we will notice possible memory leaks in the future.
|
|
codepoints in a strtoul()-like manner
|
|
* Now, `I^P` can replace `EI`.
EI is therefore now free to be repurposed as the new "mung file" command for improved TECO-11 compatibility.
* On the downside, when inserting large blocks of TECO code, you will have to write something like
`@I{^P !...! }`.
* The construct is also useful when searching for carets, as in `S^P^Q^`.
|
|
So you can look up `?bool$`, for instance.
|
|
`a-b"=` idiom
* There might theoretically be problems with the uncommon one's complement or sign-magnitude
representations of negative integers, but it's practically impossible to encounter those in
the wild.
* Still, we do some checks now, so we will at least notice any exotic architectures.
* Also documented the `a^#b"=` idiom for checking equality.
It's longer to type, but faster, and it will also work for floats.
For floats it will be the only permissible idiom for checking bitwise equality,
as `a-b` can be 0 even if a != b (if the difference is very small).
Changing the `-` semantics is out of the question.
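A minimal sketch of both idioms (registers a and b are arbitrary):
```
Qa-Qb"=  !* values are equal (classic idiom) *! '
Qa^#Qb"= !* values are bitwise equal (XOR idiom, also float-safe) *! '
```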
|
|
* Instead of separate stand-alone commands, they are now allowed only immediately
in front of the commands that accept them.
* The order is still insignificant if both `@` and `:` are accepted.
* The number of colon modifiers is now also checked.
We basically get this for free.
* `@` has syntactic significance, so it could not be set conditionally anyway.
Still, it was possible to provoke bugs where `@` was interpreted conditionally,
as in `@ 2<I/foo/$>`.
* Even when not causing bugs, a mistyped `@` would often influence the
__next__ command, causing unexpected behavior, for instance when
typing `@(233C)W`.
* While it was theoretically possible to set `:` conditionally, it could also
be "passed through" accidentally to some command where it wasn't expected, as in
`:Ifoo$ C`.
I do not know of any really useful application or idiom of a conditionally set `:`.
If there should turn out to be some useful application, `:'` and `:|` could
easily be re-allowed, though.
* I was considering introducing a common parser state for modified commands,
but that would have been tricky and would have introduced a lot of redundant command lists.
So instead, we now simply check everywhere for excess modifiers.
To simplify this task, teco_machine_main_transition_t now contains flags
signaling whether the transition is allowed with `@` or `:` modifiers set.
It currently only has to be checked in the start state, after `E` and `F`.
|
|
* This was actually broken if the command was preceded by `@` and `:` characters which
are __not__ modifiers.
E.g. `Q:@I/foo^W` would have rubbed out the `:` register as well.
* Also, since it was all done in teco_state_process_edit_cmd(),
it would rub out modifier characters from within string arguments as well,
e.g. `@I/::^EQ^W`.
* Real commands now have their own ^W rubout implementation, while the generic
fallback just rubs out until the start state is re-established.
This fails to rub out modifiers as in `@I/^W`, though.
* Real command characters now use the common TECO_DEFINE_STATE_COMMAND().
* Added test cases for CTRL+W rub out.
A few control characters are now portably available to tests
via environment variables `$ESCAPE`, `$RUBOUT` and `$RUBOUT_WORD`.
|
|
* The old heuristics - scroll if dot changes after a key press -
turned out to be too simplistic.
They broke the clang-format macro (M#cf), which left the view at the
top of the document, since the entire document is temporarily erased.
Other simplified examples of this bug would be
`@^Um{[: HECcat$ ]:} Mm` or even `@^Um{[: H@X.aG.a ]:} Mm`.
* Actually, the heuristics could be tricked even without deleting any
significant amount of text from the buffer.
The following test case replaces the previous character with a linefeed
in a single key press: `@^Um{-DI^J$} Mm`.
If executed on the last visible line, dot wouldn't be scrolled into view
since it did not change.
* At the same time, we'd like to keep the existing mouse scroll behavior from
fnkeys.tes, which is allowed to scroll dot outside of the visible area.
Therefore, dot is now always scrolled into view, except after mouse events.
You may have to call SCI_SCROLLCARET manually in the ^KMOUSE macro,
which is arguably not always straightforward.
* Some macros like M#cf may still leave the vertical scrolling position
in unexpected positions. This could either be fixed by eradicating all
remaining automatic scrolling from Scintilla or by explicitly restoring
the vertical position from the macro (FIXME).
* This has been broken since the introduction of mouse support,
so the bug was not yet in v2.3.0.
|
|
* It makes little sense to e.g. rub out until `I` in `@I/foo/`, but
leave the `@` modifier.
Modifiers have to be considered part of the command,
even though the state machine is not currently modelled like that.
|
|
This was changed ages ago for some old version of Groff.
These workarounds should no longer be necessary.
|
|
out no-op commands (whitespace)
* In string arguments, ^W first rubs out non-word chars (usually whitespace),
so it makes sense for ^W to work analogously at the command level.
A non-command would be one of the no-ops.
|
|
the beginning of words now
* All commands and their documentation were inconsistent.
* ^W rubbed out to the beginning of words.
* Shift+Right (fnkeys.tes) moved to the beginning of the next word if
invoked at the beginning of a word and to the end of the next word otherwise.
* <W> (and <V> and <Y> by extension) moved to the end of the next word.
* The cheat sheet would claim that <W> moves to the beginning of the next word.
* Video TECO's <W> command would differ again from everything else.
With positive arguments, it moved to the beginning of words, while
with negative arguments it moved to the end of words.
I decided not to copy this behavior.
* It has been decided to adopt a consistent beginning-of-words policy
(see the sketch below).
-W therefore differs from Video TECO in moving to the beginning of the
current or previous word.
* teco_find_words() is now based on parsing the document pointer, instead
of relying on SCI_WORDENDPOSITION, since the latter cannot actually be
used to skip strictly non-word characters.
This requires a constant number of Scintilla messages, but it will result in fewer
messages only when moving by more than 3 words.
* The semantics of <W> are therefore now consistent with Vim and Emacs as well.
* Shift+Right/Left is still based on SCI_WORDENDPOSITION, so its behavior
differs slightly from <W>, for instance at the end of lines, as it will
stop at line breaks.
* Unfortunately, these changes will break lots of macros, among others
the M#rf, M#sp and git.blame macros ("Useful macros" from the wiki).
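A minimal sketch of the new beginning-of-words semantics (the repetition counts are arbitrary):
```
W     !* move to the beginning of the next word *!
2W    !* move forward by two words *!
-W    !* move to the beginning of the current or previous word *!
```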
|
|
* As an alternative to OSC-52, which is rarely supported by terminal emulators.
* Makes the new mouse support much more useful, since you now rely on good built-in
clipboard support: you can no longer just double-click a word to copy it into
the "primary" selection, as terminal emulators do by default.
* Set $SCITECO_CLIPBOARD_SET/GET e.g. to xclip, wl-copy, pbcopy or some wrapper script
(see the sketch below).
* This currently uses the POSIX-specific popen() API, so it behaves a bit differently
from command execution via EC/EG.
I am not sure if it's worth rewriting with the GSpawn API, since it will be used
only on POSIX anyway and a GSpawn-based implementation is likely to be a bit larger.
* Should there be some small command-line utility for interacting (esp. pasting) via OSC-52,
built-in OSC-52 support could well be removed from SciTECO.
Currently, I know only of https://github.com/theimpostor/osc/ and it requires
very recent Go compilers. (I still haven't tested it. Quite possibly, pasting when run as
a piped command is impossible.)
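A hypothetical profile sketch, assuming the variables can be set via the corresponding $-prefixed environment registers and that xclip is installed (the exact flags are only illustrative):
```
@^U[$SCITECO_CLIPBOARD_SET]{xclip -i -selection clipboard}
@^U[$SCITECO_CLIPBOARD_GET]{xclip -o -selection clipboard}
```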
|
|
* Curses now allows scrolling with the scroll wheel, at least
if mouse support is enabled via the ED flags.
Gtk has always supported that.
* Allow clicking on popup entries to fully autocomplete them.
Since this behavior - just like auto completions - is parser state-dependent,
I introduced a new state method (insert_completion_cb).
All the implementations are currently in cmdline.c since there is some overlap
with the process_edit_cmd_cb implementations.
* Fixed pressing undefined function keys while showing the popup.
The popup area is no longer redrawn/replaced with the Scintilla view.
Instead, the popup continues to be shown.
|
|
* You need to set 0,64ED to enable mouse processing in Curses.
It is always enabled in Gtk as it should never make the experience worse.
sample.teco_ini enables mouse support, since this should be the new default.
`sciteco --no-profile` won't have it enabled, though.
* On curses, it requires the ncurses mouse protocol version 2, which will
also be supported by PDCurses.
* Similar to the Curses API, a special key macro ^KMOUSE is inserted if any of the supported
mouse events has been detected.
* You can then use -EJ to get the type of mouse event, which can be used
with a computed goto in the command-line editing macro.
Alternatively, this could have been solved with separate ^KMOUSE:PRESSED,
^KMOUSE:RELEASED etc. pseudo-key macros.
* The default ^KMOUSE implementation in fnkeys.tes supports the following:
* Left click: edit the command line to jump to the position.
* Ctrl+left click: jump to the beginning of the line.
* Right click: insert the position or position range (when dragging).
* Double right click: insert the range of the word under the cursor.
* Ctrl+right click: insert the beginning of the line.
* Scroll wheel: scroll (faster with Shift).
* Ctrl+scroll wheel: zoom (Gtk only).
* Currently, there is no visual feedback when "selecting" ranges
via right-click+drag.
This would be tricky to do and most terminal emulators do not appear
to support continuous mouse updates.
|
|
* Allowing label redefinitions might have been useful when they are used as comments,
since you will want to be able to define arbitrary comments.
However, as a flow control construct, this introduced a certain ambiguity, since
gotos might jump to different locations depending on the progression
of the parser.
* On the other hand, making label redefinition an error might disqualify labels as
comments when writing or porting classic TECO code.
Therefore, it has been made a warning as a compromise.
* Added a test case.
|
|
* The previous convention of !* ... *! now denotes true block comments,
i.e. they are parsed faster, don't spam the goto table and allow
embedding of exclamation marks - only "*!" terminates the comment
(see the sketch below).
* It is therefore now forbidden to have goto labels beginning with "*".
* Also support "!!" to introduce EOL comments (like C++'s //).
This disallows empty labels, but they weren't useful anyway.
This is the shortest way to begin a comment.
* All comment labels have been converted to true comments, to ensure
that syntax highlighting works correctly.
EOL comments are used for single line commented-out code, since it's
easiest to uncomment - you don't have to jump to the line end.
This is a pure convention / coding style.
Other people might do it differently.
* It's of course still possible to abuse goto labels as comments
as TECO did for ages.
* In lexing / syntax highlighting, labels and comments are highlighted differently.
* When syntax highlighting, a single "!" will first be highlighted as a label
since it's not yet unambiguous. Once you type the second character (* or !),
the first character is retroactively styled as a comment as well.
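A minimal sketch of the two new comment forms:
```
!* a block comment: it may span several lines
   and even contain exclamation marks! *!
!! an EOL comment extends to the end of the line
```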
|
|
* @ES/SCI_SETILEXER/lib^@name/ now opens the lexer <name> in library <lib>.
* You need to define the environment variable $SCITECO_SCINTILLUA_LEXERS to point
to the lexers/ subdirectory (containing the *.lua files).
Perhaps this should default to the dirname of <lib>?
* The semantics of SCI_NAMEOFSTYLE have been changed:
It now returns style ids when given style names, so you can actually write
Scintillua lexer *.tes files.
This would be superfluous if we had a way to return strings from Scintilla messages into
Q-Registers, e.g. 23@EPq/SCI_NAMEOFSTYLE/.
* We now depend on gmodule as well, but it should always be part of glib.
It does not change the library dependencies of any package.
It might result in gmodule shared libraries being bundled in the Win32 and Mac OS
packages if they weren't already.
|
|
::FS as well)
* The colon modifier can now occur twice.
Specifying `@` more than once or `:` more than twice is now an error.
* Commands do not check for excess colon modifiers - almost every command would have
to check it. Instead, a double colon will simply behave like a single colon on most
commands.
* All search commands inherit the anchored semantics, but it's not very useful in some combinations
like -::S, ::N or ::FK.
That's why the `::` variants are not documented everywhere.
* The lexer.checkheader macro could be simplified and should also be faster now,
speeding up startup.
Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
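A minimal sketch of an anchored comparison at dot (the search string is arbitrary):
```
::Sfoo$"S !* "foo" matches directly at dot *! | !* no match at dot *! '
```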
|
|
* Can be freely combined with the colon modifier as well:
:@Xq cut-appends to register q (see the sketch below).
* This simply deletes the given buffer range after the copy or append operation,
as if followed by another <K> command.
* This has indeed been a very annoying missing feature, as you often have to retype the
range for a K or D command.
At the same time, this cannot reasonably be solved with a macro, since macros
do not accept Q-Register arguments -- so we would have to restrict ourselves to one or a few
selected registers.
I was also considering solving this with a special stack operation that duplicates the
top values, so that Xq leaves arguments for K, but this couldn't work for cutting lines
and would also be longer to type.
* It's the first non-string command that accepts @.
Others may follow in the future.
We're approaching ITS TECO madness levels.
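A minimal sketch of the new cut variants (register q and the line counts are arbitrary):
```
2@Xq     !* cut the next two lines into register q, like 2Xq 2K *!
2:@Xq    !* cut-append another two lines to register q *!
```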
|
|
* We cannot call it "." since that introduces a local register
and we don't want to add an unnecessary syntactic exception.
* Allows the idiom [: ... ]: to temporarily move around.
Also, you can now write ^E\: without having to store dot in a register first.
* In the future we might add an ^E register as well for byte offsets.
However, there are far fewer useful applications for it.
* Of course, you can now also write nU: instead of nJ, Q: instead of "." and
n%: instead of "nC.". However, none of this is really useful.
|
|
^R now (refs #17)
* This way the search mode and radix are local to the current macro frame,
unless the macro was invoked with :Mq.
If colon-modified, you can reproduce the same effect by calling
[.^X 0^X ... ].^X
* The radix register is cached in the Q-Reg table as an optimization.
This could be done with the other "special" registers as well, but at the
cost of larger stack frames.
* In order to allow constructs like [.^X typed with upcarets,
the Q-Register specification syntax has been extended:
^c is the corresponding control code instead of the register "^".
|
|
* also explicitly mention -%q
|
|
* Makes it possible, albeit cumbersome, to escape pattern match characters.
* For instance, to search for ^Q, you now have to type
S^Q^Q^Q^Q$.
To search for ^E, you have to type
S^Q^Q^Q^E$.
But the last character cannot currently be typed with carets (FIXME?).
For pattern-only characters, two ^Q should be sufficient, as in
S^Q^Q^X$.
* Perhaps it would be more elegant to abolish the difference between string building
and pattern matching characters to avoid double quoting.
But then all string building constructs like ^EQq should operate at the pattern level
as well (i.e. match the contents of register q verbatim instead of interpreting them as a pattern).
TECOC and TECO-64 don't do that either.
If we leave everything as it is, at least a new string building construct should be added for
auto-quoting patterns (analogous to ^EN and ^E@).
|
|
* This allows you to type ^Q^U (which would otherwise rub out the entire argument)
and ^Q^W (which would otherwise rub out the ^Q).
* ^Q^U coincidentally worked previously, since the teco_state_stringbuilding_escaped
state would default to teco_state_process_edit_cmd().
But it's better to make this feature explicit.
* This finally makes it possible to insert the ^W (23) char into a buffer.
In interactive mode, you can still only type Caret+W as a string building construct.
* ^G could also be inhibited after ^Q, but the control char is not used anywhere yet,
so there is no point in doing that.
|
|
* The XTerm version is still checked if we detect that we are running under XTerm.
* Actually, the XTerm implementation is broken for Unicode clipboard contents.
* Kitty supports OSC-52, but you __must__ enable read-clipboard.
With read-clipboard-ask, there will be a timeout.
But we cannot read without a timeout since otherwise we would hang indefinitely
if the escape sequence turns out to not work.
* For urxvt, I have hacked an existing extension:
https://gist.github.com/rhaberkorn/d7406420b69841ebbcab97548e38b37d
* st currently supports only setting the clipboard, but not querying it.
|
|
all expansions of ^EQq, ^EUq and so on
* Previously, there was no way to enter upper-case mode in interactive commands since
the Ctrl+W immediate editing command is interpreted everywhere.
* Without the case folding of ^EQq/^EUq results, the upper and lower case modes are actually pretty useless
considering that modern keyboards have caps lock.
So it was clear we need this, regardless of what the classic TECOs did.
The TECO-11 manual is not very clear on this.
tecoc apparently does not case-fold ^EQq results.
* This opens up new idioms, for instance
`EUq^W^W^EQq$` in order to upper-case register q.
It's also the only way you can currently upper-case Unicode codepoints.
|
|
* Ctrl+^ (30) and Caret+caret (^^) were both translated to a single caret.
While there might be some reason to keep this behavior for double-caret,
it is certainly pointless for Ctrl+^.
* That gives you an easy way to insert Ctrl+^ (code 30) into documents with <I>.
Previously, you either had to insert a double-caret, typing 4 carets in a row,
or you had to use <EI> or 30I$.
* The special handling of double-caret could perhaps be abolished altogether,
as we also have ^Q^ to escape plain carets.
The double-caret syntax is very archaic, from the time when there was no proper
^Q, if I recall correctly.
|
|
* Unfortunately, the list in sciteco(7) does not format with FreeBSD's man or
within SciTECO.
* Removed references to the old sciteco.sf.net.
We don't have a proper "homepage" for the time being.
|
|
* Practically requires one of the "Nerd Font" fonts,
so it's disabled by default.
Add 0,512ED to the profile to enable them.
* The new ED flag could be used to control Gtk icons as well,
but they are left always-enabled for the time being.
Is there any reason anybody would like to disable icons in Gtk?
* The list of icons has been adapted and extended from exa:
https://github.com/ogham/exa/blob/master/src/output/icons.rs
* The icons are hardcoded as presorted lists,
so we can binary search them.
This could change in the future. If there is any demand,
they could be made configurable via Q-Registers as well.
|
|
* ALL keypresses (the UTF-8 sequences resulting from key presses) can now be remapped.
* This is especially useful with Unicode support, as you might want to alias
international characters to their corresponding Latin form in the start state,
so you don't have to change keyboard layouts so often.
This is done automatically in Gtk, where we have hardware key press information,
but it has to be done with key macros in Curses.
There is now a new key mask 4 (bit 3) for that purpose.
* Also, you might want to define non-ANSI letters to perform special functions in
the start state, where they won't be accepted by the parser anyway.
Suppose you have a macro M→; you could define
@^U[^K→]{m→} 1^_U[^K→]
This effectively "extends" the parser and allows you to call macro "→" with a single
key press. See also #5.
* The register prefix has been changed from ^F (for function) to ^K (for key).
This is the only thing you have to change in order to migrate existing
function key macros.
* Key macros are enabled by default. There is no longer any way to disable
function key handling in curses, as I never found any reason or need to disable it.
Theoretically, the default ESCDELAY could turn out to be too small, so that function
keys don't get through. I doubt that's possible unless on extremely slow serial lines.
Even then, you'd rather increase ESCDELAY and, instead of disabling function keys,
simply define an escape surrogate.
* The ED flag has been removed and its place is reserved for a future mouse support flag
(which does make sense to disable in curses sometimes).
fnkeys.tes is consequently also enabled by default in sample.teco_ini.
* Key macros are handled as a unit. If one character results in an error,
the entire string is rubbed out.
This fixes the "CLOSE" key on Gtk.
It also makes sure that the original error message is preserved and not overwritten
by some subsequent syntax error.
It was never useful that we kept inserting characters after the first error.
|
|
The following rules apply:
* All SciTECO macros __must__ be in valid UTF-8, regardless of the
register's configured encoding.
This is checked before execution, so we can use glib's non-validating
UTF-8 API afterwards.
* Things will inevitably get slower, as we have to validate all macros first
and convert each and every character passed into the parser to gunichar.
As an optimization, it may make sense to have our own inlineable version of
g_utf8_get_char() (TODO).
Also, Unicode glyphs in syntactically significant positions may be case-folded -
just like ASCII chars were. This is of course slower than case folding
ASCII. The impact of this should be measured and perhaps we should restrict
case folding to a-z via teco_ascii_toupper().
* The language itself does not use any non-ANSI characters, so you don't have to
use UTF-8 characters.
* Wherever the parser expects a single character, it will now accept an arbitrary
Unicode/UTF-8 glyph as well.
In other words, you can call macros like M§ instead of having to write M[§].
You can also get the codepoint of any Unicode character with ^^x.
Pressing a Unicode character in the start state or in Ex and Fx will now
give a sane error message.
* When pressing a key which produces a multi-byte UTF-8 sequence, the character
gets translated back and forth multiple times:
1. It's converted to a UTF-8 string, either buffered or by IME methods (Gtk).
On Curses we could directly get a wide char using wget_wch(), but it's
not currently used, so we don't depend on widechar curses.
2. Parsed into gunichar for passing into the edit command callbacks.
This also validates the codepoint - everything later on can assume valid
codepoints and valid UTF-8 strings.
3. Once the edit command handling decides to insert the key into the command line,
it is serialized back into a UTF-8 string, as the command line macro has
to be in UTF-8 (like all other macros).
4. The parser reads back gunichars without validation for passing into
the parser callbacks.
* Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely
inserted and displayed UTF-8 sequences, are now fixed.
|
|
* When enabled with bit 2 in the ED flags (0,4ED),
all registers and buffers will get the raw ANSI encoding (as if 0EE had been
called on them).
You can still manually change the encoding, e.g. by calling 65001EE afterwards.
* The ANSI mode also sets up character representations for all bytes >= 0x80.
This is currently done only depending on the ED flag, not when setting 0EE.
* Since setting 16,4ED for 8-bit clean editing in a macro can be tricky -
the default unnamed buffer will still be in UTF-8, and at least a bunch
of environment registers as well - we added the command-line option
`--8bit` (short `-8`), which configures the ED flags very early on.
As another advantage, you can mung the profile in 8-bit mode as well
when using SciTECO as a sort of interactive hex editor.
* Disable UTF-8 checks in 8-bit clean mode (sample.teco_ini).
|
|
or codepoints) (refs #5)
* This is trickier than it sounds because there isn't one single place to consult.
It depends on the context.
If the string argument relates to buffer contents - as in <I>, <S>, <FR> etc. -
the buffer's encoding is consulted.
If it goes into a register (EU), the register's encoding is consulted.
Everything else (O, EN, EC, ES...) expects only Unicode codepoints.
* This is communicated through a new field teco_machine_stringbuilding_t::codepage
which must be set in the states' initial callback.
* It seems like overkill just for ^EUq, but it can be used for context-sensitive
processing of all the other string building constructs as well.
* ^V and ^W cannot be supported for Unicode characters for the time being without a Unicode-aware parser.
|
|
* This will naturally work with both ASCII characters and various
non-English scripts.
* Unfortunately, it cannot work with the other non-ANSI single-byte codepages.
* If we'd like to support scripts working with all sorts of codepoints,
we'd have to introduce a new command for translating individual codepoints
from the current codepage (as reported by EE) to Unicode.
|
|
* It's generally a bad idea to pass backslashes as glyphs in macro arguments, even as `\\`,
since this could easily be interpreted as an escape.
* Instead, we now always use `\[rs]`.
|
|
* was introduced in e7867fb0
|