|
* The new official homepage is https://sciteco.fmsbw.de/
* My new contact address is rhaberkorn AT fmsbw.de.
* The scintilla-mirror is now also on https://git.fmsbw.de/scintilla-mirror/
* Added CI script for my server on fmsbw.de that builds
the website.
It's run in a FreeBSD container, but does not currently
distribute FreeBSD binaries.
|
|
empty labels are ignored
* This has long been a TECO-11 incompatibility.
* The first label in a list has index 0, i.e. `1Ofoo,bar$` jumps to label `!bar!`.
Consequently 0 is also implied, so `Olabel$` continues to do what you expect.
* `0Ofoo$` was previously also jumping to `!foo!`, which was inconsistent:
all invalid indexes should do nothing, i.e. execution continues after the go-to command.
* Fixed handling of empty labels as in `1Ofoo,,bar$` - execution should also continue
after the command.
This eases writing "default" clauses immediately after the go-to (see the sketch below).
* The ED hook values now also begin at 0, so most existing ED hook macros should
continue to work.
* Similarly, the mouse events returned by -EJ also begin at 0 now,
so fnkeys.tes continues to work as expected.
* It's still very possible of course that this breaks existing code.
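A minimal sketch of the new 0-based computed goto with a "default" clause (the register name, label names and clause bodies are made up):
```
Qi Ored,green$    !* jumps to !red! if Qi is 0, to !green! if it is 1 *!
!* any other value falls through to this "default" clause *!
...
!red! ...
!green! ...
```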
|
|
TECO compatibility
|
|
special Q-registers ^Ax
* The unnamed buffer is also used for reading from --stdin, so you couldn't practically combine
--stdin with passing command-line arguments to macros.
* The old approach of passing command-line arguments via lines in the
unnamed buffer was flawed anyway, as it wouldn't work with filenames containing LF.
This was simply a very old feature, written before SciTECO even had long Q-register names.
* You can now pipe into SciTECO and interactively edit what was read, e.g. `dmesg | sciteco -i`.
You can practically use SciTECO as a pager.
* htbl.tes is now a command-line filter (uses -qio).
* grosciteco.tes reads Troff intermediate code from stdin, so we no longer need
"*.intermediate" temporary files.
* Added a getopt.tes test case to the test suite.
* This change unfortunately breaks most macros accepting command-line arguments,
even if they used getopt.tes.
It also requires updating ~/.teco_ini - see fallback.teco_ini.
|
|
commands with multiple string arguments
* When `@`-modifying a command with several string arguments and choosing `{` as the alternative
string termination character, the parser would get totally confused.
Any sequence of `{` would be ignored and only the first non-`{` would become the termination character.
Consequently you also couldn't choose a new terminator after the closing `}`.
So even a documented code example from sciteco(7) wouldn't work.
The same was true when using $ (escape) or ^A as the alternative termination character.
* We can now correctly parse e.g. `@FR{foo}{bar}` or `@FR$foo$bar$` (even though the
latter one is quite pointless).
* This has probably been broken forever (even before v2.0).
* Whitespace is now ignored in front of alternative termination characters as in TECO-64, so
we can also write `@S /foo/` or even
```
@^Um
{
!* blabla *!
}
```
I wanted to disallow whitespace as termination characters, so the alternative would have been
to throw an error.
The new implementation at least adds some functionality.
* Avoid redundancies when parsing no-op characters via teco_is_noop().
I assume that this is inlined and drawn into any jump table that might be
generated for the switch statement in teco_state_start_input().
* Alternative termination characters are still case-folded, even if they are Unicode glyphs,
so `@IЖfooж` would work and insert `foo`.
This should perhaps be restricted to ANSI characters?
|
|
and also the CTRL+L immediate editing command
* ^W can be added to loops in order to view progress in interactive mode
(see the sketch below).
It also sleeps for a given number of milliseconds (10ms by default).
* In batch mode it is therefore the sleep command.
* Since CTRL+W is an immediate editing command, you will usually type it as Caret+W.
ASCII 23, however, will also be accepted.
* While ^W only updates the screen, you can force a complete redraw by pressing CTRL+L.
This is what most terminal applications use for redrawing.
It will make it harder to insert ASCII 12, but this is seldom necessary since it
is a form feed.
^L (ASCII 12 and the upcaret variant) is still a whitespace character and therefore treated as a NOP.
* DEC TECO had CTRL+W as the refresh immediate editing command.
Video TECO uses <ET> as a regular command for refreshing in loops.
I'd rather keep ET reserved as a potential terminal configuration command
as in DEC TECO, though.
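A minimal sketch of watching a lengthy loop in interactive mode, assuming the optional argument of ^W is the sleep time in milliseconds as described above (the iteration count and the 50ms value are made up):
```
1000<
  ...       !* some lengthy processing per iteration *!
  50^W      !* refresh the screen and sleep for 50ms *!
>
```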
|
|
support the append operation (:Xq, :^Uq...)
This works via a default implementation in the "external" Q-register "class",
which first queries the string, appends to it and sets it again.
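A minimal sketch of copy vs. append into a register (the file name is made up):
```
HXq             !* copy the entire current buffer into register q *!
EBother.txt$    !* switch to another file *!
H:Xq            !* append its contents to register q *!
```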
|
|
* `[q]~` was broken and resulted in crashes since it reset the clipboard character to 0.
In fact, if we don't want to break the `[a]b` idiom, we cannot use the numeric cell
of register `~`.
* Therefore we no longer use the numeric part of register `~`.
Once the clipboard registers are initialized, they completely replace
any existing register with the same name that may have been
set in the profile.
So we still don't leak any memory.
(But perhaps it would now be better to fail with an error
if one of the clipboard registers already exists?)
* Instead, bit 10 (1024) of ED is now used to change the default
clipboard to the primary selection (see the profile sketch below).
The alternative might have been an EJ flag or even a special register containing
the name of the default clipboard register.
* Partially reverses 8c6de6cc718debf44f6056a4c34c4fbb13bc5020.
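A minimal profile sketch, assuming the new bit is set just like the other ED flags mentioned in this log:
```
0,1024ED    !* use the "primary" selection as the default clipboard *!
```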
|
|
* It continues to default to 67 (C), which is the system clipboard.
But you can now override it, e.g. by adding `^^PU~` to the profile.
* This fixes a minor memory leak:
If you set one of the clipboard registers in the profile (initializing
them as plain registers), the clipboard register would be leaked.
The clipboard registers now replace any existing register,
while at the same time preserving the numeric part.
* All remaining Q-Reg table insertions use a new function
teco_qreg_table_insert_unique(), which adds an assertion, so that
we will notice possible memory leaks in the future.
|
|
codepoints in a strtoul()-like manner
|
|
* Now, `I^P` can replace `EI`.
EI is therefore now free to be repurposed as the new "mung file" command for improved TECO-11 compatibility.
* On the downside, when inserting large blocks of TECO code, you will have to write something like
`@I{^P !...! }`.
* The construct is also useful when searching for carets, as in `S^P^Q^`.
|
|
So you can look up `?bool$`, for instance.
|
|
`a-b"=` idiom
* There might theoretically be problems with the uncommon one's complement or sign-magnitude
representations of negative integers, but it's practically impossible to encounter those in
the wild.
* Still, we do some checks now, so we will at least notice any exotic architectures.
* Also documented the `a^#b"=` idiom for checking equality.
It's longer to type, but faster, and it will also work for floats.
For floats it will be the only permissible idiom for checking bitwise equality,
as `a-b` can be 0 even if a != b (if the difference is very small).
Changing the `-` semantics is out of the question.
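A minimal sketch of both idioms (registers a and b are arbitrary):
```
Qa-Qb"=  !* values are equal (classic idiom) *! '
Qa^#Qb"= !* values are bitwise equal (XOR idiom, also float-safe) *! '
```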
|
|
* Instead of separate stand-alone commands, they are now allowed only immediately
in front of the commands that accept them.
* The order is still insignificant if both `@` and `:` are accepted.
* The number of colon modifiers is now also checked.
We basically get this for free.
* `@` has syntactic significance, so it could not be set conditionally anyway.
Still, it was possible to provoke bugs where `@` was interpreted conditionally,
as in `@ 2<I/foo/$>`.
* Even when not causing bugs, a mistyped `@` would often influence the
__next__ command, causing unexpected behavior, for instance when
typing `@(233C)W`.
* While it was theoretically possible to set `:` conditionally, it could also
be "passed through" accidentally to some command where it wasn't expected, as in
`:Ifoo$ C`.
I do not know of any really useful application or idiom of a conditionally set `:`.
If there should turn out to be some useful application, `:'` and `:|` could
easily be re-allowed, though.
* I was considering introducing a common parser state for modified commands,
but that would have been tricky and would have introduced a lot of redundant command lists.
So instead, we now simply check everywhere for excess modifiers.
To simplify this task, teco_machine_main_transition_t now contains flags
signaling whether the transition is allowed with `@` or `:` modifiers set.
It currently only has to be checked in the start state, after `E` and `F`.
|
|
* This was actually broken if the command was preceded by `@` and `:` characters which
are __not__ modifiers.
E.g. `Q:@I/foo^W` would have rubbed out the `:` register as well.
* Also, since it was all done in teco_state_process_edit_cmd(),
it would rub out modifier characters from within string arguments as well,
e.g. `@I/::^EQ^W`.
* Real commands now have their own ^W rubout implementation, while the generic
fallback just rubs out until the start state is re-established.
This fails to rub out modifiers as in `@I/^W`, though.
* Real command characters now use the common TECO_DEFINE_STATE_COMMAND().
* Added test cases for CTRL+W rub out.
A few control characters are now portably available to tests
via environment variables `$ESCAPE`, `$RUBOUT` and `$RUBOUT_WORD`.
|
|
* The old heuristics - scroll if dot changes after a key press -
turned out to be too simplistic.
They broke the clang-format macro (M#cf), which left the view at the
top of the document, since the entire document is temporarily erased.
Other simplified examples of this bug would be
`@^Um{[: HECcat$ ]:} Mm` or even `@^Um{[: H@X.aG.a ]:} Mm`.
* Actually, the heuristics could be tricked even without deleting any
significant amount of text from the buffer.
The following test case replaces the previous character with a linefeed
in a single key press: `@^Um{-DI^J$} Mm`.
If executed on the last visible line, dot wouldn't be scrolled into view
since it did not change.
* At the same time, we'd like to keep the existing mouse scroll behavior from
fnkeys.tes, which is allowed to scroll dot outside of the visible area.
Therefore, dot is now always scrolled into view, except after mouse events.
You may have to call SCI_SCROLLCARET manually in the ^KMOUSE macro,
which is arguably not always straightforward.
* Some macros like M#cf may still leave the vertical scrolling position
in unexpected positions. This could either be fixed by eradicating all
remaining automatic scrolling from Scintilla or by explicitly restoring
the vertical position from the macro (FIXME).
* This has been broken since the introduction of mouse support,
so the bug was not yet in v2.3.0.
|
|
* It makes little sense to e.g. rub out until `I` in `@I/foo/`, but
leave the `@` modifier.
Modifiers have to be considered part of the command,
even though the state machine is not currently modelled like that.
|
|
This was changed ages ago for some old version of Groff.
These workarounds should no longer be necessary.
|
|
out no-op commands (whitespace)
* In string arguments, ^W first rubs out non-word chars (usually whitespace),
so it makes sense for ^W to work analogously at the command level.
A non-command would be one of the no-ops.
|
|
the beginning of words now
* All commands and their documentation were inconsistent.
* ^W rubbed out to the beginning of words.
* Shift+Right (fnkeys.tes) moved to the beginning of the next word if
invoked at the beginning of a word and to the end of the next word otherwise.
* <W> (and <V> and <Y> by extension) moved to the end of the next word.
* The cheat sheet would claim that <W> moves to the beginning of the next word.
* Video TECO's <W> command would differ again from everything else.
With positive arguments, it moved to the beginning of words, while
with negative arguments it moved to the end of words.
I decided not to copy this behavior.
* It has been decided to adopt a consistent beginning-of-words policy
(see the sketch below).
-W therefore differs from Video TECO in moving to the beginning of the
current or previous word.
* teco_find_words() is now based on parsing the document pointer, instead
of relying on SCI_WORDENDPOSITION, since the latter cannot actually be
used to skip strictly non-word characters.
This requires a constant number of Scintilla messages, but it will result in fewer
messages only when moving by more than 3 words.
* The semantics of <W> are therefore now consistent with Vim and Emacs as well.
* Shift+Right/Left is still based on SCI_WORDENDPOSITION, so its behavior
differs slightly from <W>, for instance at the end of lines, as it will
stop at line breaks.
* Unfortunately, these changes will break lots of macros, among others
the M#rf, M#sp and git.blame macros ("Useful macros" from the wiki).
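A minimal sketch of the new beginning-of-words semantics (the repetition counts are arbitrary):
```
W     !* move to the beginning of the next word *!
2W    !* move forward by two words *!
-W    !* move to the beginning of the current or previous word *!
```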
|
|
* As an alternative to OSC-52, which is rarely supported by terminal emulators.
* Makes the new mouse support much more useful, since you now rely on good built-in
clipboard support: you can no longer just double-click a word to copy it into
the "primary" selection, as terminal emulators do by default.
* Set $SCITECO_CLIPBOARD_SET/GET e.g. to xclip, wl-copy, pbcopy or some wrapper script
(see the sketch below).
* This currently uses the POSIX-specific popen() API, so it behaves a bit differently
from command execution via EC/EG.
I am not sure if it's worth rewriting with the GSpawn API, since it will be used
only on POSIX anyway and a GSpawn-based implementation is likely to be a bit larger.
* Should there be some small command-line utility for interacting (esp. pasting) via OSC-52,
built-in OSC-52 support could well be removed from SciTECO.
Currently, I know only of https://github.com/theimpostor/osc/ and it requires
very recent Go compilers. (I still haven't tested it. Quite possibly, pasting when run as
a piped command is impossible.)
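A hypothetical profile sketch, assuming the variables can be set via the corresponding $-prefixed environment registers and that xclip is installed (the exact flags are only illustrative):
```
@^U[$SCITECO_CLIPBOARD_SET]{xclip -i -selection clipboard}
@^U[$SCITECO_CLIPBOARD_GET]{xclip -o -selection clipboard}
```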
|
|
* Curses now allows scrolling with the scroll wheel, at least
if mouse support is enabled via the ED flags.
Gtk has always supported that.
* Allow clicking on popup entries to fully autocomplete them.
Since this behavior - just like auto completions - is parser state-dependent,
I introduced a new state method (insert_completion_cb).
All the implementations are currently in cmdline.c since there is some overlap
with the process_edit_cmd_cb implementations.
* Fixed pressing undefined function keys while showing the popup.
The popup area is no longer redrawn/replaced with the Scintilla view.
Instead, the popup continues to be shown.
|
|
* You need to set 0,64ED to enable mouse processing in Curses.
It is always enabled in Gtk as it should never make the experience worse.
sample.teco_ini enables mouse support, since this should be the new default.
`sciteco --no-profile` won't have it enabled, though.
* On curses, it requires the ncurses mouse protocol version 2, which will
also be supported by PDCurses.
* Similar to the Curses API, a special key macro ^KMOUSE is inserted if any of the supported
mouse events has been detected.
* You can then use -EJ to get the type of mouse event, which can be used
with a computed goto in the command-line editing macro.
Alternatively, this could have been solved with separate ^KMOUSE:PRESSED,
^KMOUSE:RELEASED etc. pseudo-key macros.
* The default ^KMOUSE implementation in fnkeys.tes supports the following:
* Left click: edit the command line to jump to the position.
* Ctrl+left click: jump to the beginning of the line.
* Right click: insert the position or position range (when dragging).
* Double right click: insert the range of the word under the cursor.
* Ctrl+right click: insert the beginning of the line.
* Scroll wheel: scroll (faster with Shift).
* Ctrl+scroll wheel: zoom (Gtk only).
* Currently, there is no visual feedback when "selecting" ranges
via right-click+drag.
This would be tricky to do and most terminal emulators do not appear
to support continuous mouse updates.
|
|
* Allowing label redefinitions might have been useful when they are used as comments,
since you will want to be able to define arbitrary comments.
However, as a flow control construct, this introduced a certain ambiguity, since
gotos might jump to different locations depending on the progression
of the parser.
* On the other hand, making label redefinition an error might disqualify labels as
comments when writing or porting classic TECO code.
Therefore, it has been made a warning as a compromise.
* Added a test case.
|
|
* The previous convention of !* ... *! now denotes true block comments,
i.e. they are parsed faster, don't spam the goto table and allow
embedding of exclamation marks - only "*!" terminates the comment
(see the sketch below).
* It is therefore now forbidden to have goto labels beginning with "*".
* Also support "!!" to introduce EOL comments (like C++'s //).
This disallows empty labels, but they weren't useful anyway.
This is the shortest way to begin a comment.
* All comment labels have been converted to true comments, to ensure
that syntax highlighting works correctly.
EOL comments are used for single line commented-out code, since it's
easiest to uncomment - you don't have to jump to the line end.
This is a pure convention / coding style.
Other people might do it differently.
* It's of course still possible to abuse goto labels as comments
as TECO did for ages.
* In lexing / syntax highlighting, labels and comments are highlighted differently.
* When syntax highlighting, a single "!" will first be highlighted as a label
since it's not yet unambiguous. Once you type the second character (* or !),
the first character is retroactively styled as a comment as well.
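A minimal sketch of the two new comment forms:
```
!* a block comment: it may span several lines
   and even contain exclamation marks! *!
!! an EOL comment extends to the end of the line
```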
|
|
* @ES/SCI_SETILEXER/lib^@name/ now opens the lexer <name> in library <lib>.
* You need to define the environment variable $SCITECO_SCINTILLUA_LEXERS to point
to the lexers/ subdirectory (containing the *.lua files).
Perhaps this should default to the dirname of <lib>?
* The semantics of SCI_NAMEOFSTYLE have been changed:
It now returns style ids when given style names, so you can actually write
Scintillua lexer *.tes files.
This would be superfluous if we had a way to return strings from Scintilla messages into
Q-Registers, e.g. 23@EPq/SCI_NAMEOFSTYLE/.
* We now depend on gmodule as well, but it should always be part of glib.
It does not change the library dependencies of any package.
It might result in gmodule shared libraries being bundled in the Win32 and Mac OS
packages if they weren't already.
|
|
::FS as well)
* The colon modifier can now occur twice.
Specifying `@` more than once or `:` more than twice is now an error.
* Commands do not check for excess colon modifiers - almost every command would have
to check it. Instead, a double colon will simply behave like a single colon on most
commands.
* All search commands inherit the anchored semantics, but it's not very useful in some combinations
like -::S, ::N or ::FK.
That's why the `::` variants are not documented everywhere.
* The lexer.checkheader macro could be simplified and should also be faster now,
speeding up startup.
Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
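A minimal sketch of an anchored comparison at dot (the search string is arbitrary):
```
::Sfoo$"S !* "foo" matches directly at dot *! | !* no match at dot *! '
```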
|
|
* Can be freely combined with the colon modifier as well:
:@Xq cut-appends to register q (see the sketch below).
* This simply deletes the given buffer range after the copy or append operation,
as if followed by another <K> command.
* This has indeed been a very annoying missing feature, as you often have to retype the
range for a K or D command.
At the same time, this cannot reasonably be solved with a macro, since macros
do not accept Q-Register arguments -- so we would have to restrict ourselves to one or a few
selected registers.
I was also considering solving this with a special stack operation that duplicates the
top values, so that Xq leaves arguments for K, but this couldn't work for cutting lines
and would also be longer to type.
* It's the first non-string command that accepts @.
Others may follow in the future.
We're approaching ITS TECO madness levels.
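A minimal sketch of the new cut variants (register q and the line counts are arbitrary):
```
2@Xq     !* cut the next two lines into register q, like 2Xq 2K *!
2:@Xq    !* cut-append another two lines to register q *!
```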
|
|
* We cannot call it "." since that introduces a local register
and we don't want to add an unnecessary syntactic exception.
* Allows the idiom [: ... ]: to temporarily move around.
Also, you can now write ^E\: without having to store dot in a register first.
* In the future we might add an ^E register as well for byte offsets.
However, there are far fewer useful applications for it.
* Of course, you can now also write nU: instead of nJ, Q: instead of "." and
n%: instead of "nC.". However, none of this is really useful.
|
|
^R now (refs #17)
* This way the search mode and radix are local to the current macro frame,
unless the macro was invoked with :Mq.
If colon-modified, you can reproduce the same effect by calling
[.^X 0^X ... ].^X
* The radix register is cached in the Q-Reg table as an optimization.
This could be done with the other "special" registers as well, but at the
cost of larger stack frames.
* In order to allow constructs like [.^X typed with upcarets,
the Q-Register specification syntax has been extended:
^c is the corresponding control code instead of the register "^".
|
|
* also explicitly mention -%q
|
|
* Makes it possible, albeit cumbersome, to escape pattern match characters.
* For instance, to search for ^Q, you now have to type
S^Q^Q^Q^Q$.
To search for ^E, you have to type
S^Q^Q^Q^E$.
But the last character cannot currently be typed with carets (FIXME?).
For pattern-only characters, two ^Q should be sufficient, as in
S^Q^Q^X$.
* Perhaps it would be more elegant to abolish the difference between string building
and pattern matching characters to avoid double quoting.
But then all string building constructs like ^EQq should operate at the pattern level
as well (i.e. match the contents of register q verbatim instead of interpreting them as a pattern).
TECOC and TECO-64 don't do that either.
If we leave everything as it is, at least a new string building construct should be added for
auto-quoting patterns (analogous to ^EN and ^E@).
|
|
* This allows you to type ^Q^U (which would otherwise rub out the entire argument)
and ^Q^W (which would otherwise rub out the ^Q).
* ^Q^U coincidentally worked previously, since the teco_state_stringbuilding_escaped
state would default to teco_state_process_edit_cmd().
But it's better to make this feature explicit.
* This finally makes it possible to insert the ^W (23) char into a buffer.
In interactive mode, you can still only type Caret+W as a string building construct.
* ^G could also be inhibited after ^Q, but the control char is not used anywhere yet,
so there is no point in doing that.
|
|
* The XTerm version is still checked if we detect that we are running under XTerm.
* Actually, the XTerm implementation is broken for Unicode clipboard contents.
* Kitty supports OSC-52, but you __must__ enable read-clipboard.
With read-clipboard-ask, there will be a timeout.
But we cannot read without a timeout since otherwise we would hang indefinitely
if the escape sequence turns out to not work.
* For urxvt, I have hacked an existing extension:
https://gist.github.com/rhaberkorn/d7406420b69841ebbcab97548e38b37d
* st currently supports only setting the clipboard, but not querying it.
|
|
all expansions of ^EQq, ^EUq and so on
* Previously, there was no way to enter upper-case mode in interactive commands since
the Ctrl+W immediate editing command is interpreted everywhere.
* Without the case folding of ^EQq/^EUq results, the upper and lower case modes are actually pretty useless
considering that modern keyboards have caps lock.
So it was clear we need this, regardless of what the classic TECOs did.
The TECO-11 manual is not very clear on this.
tecoc apparently does not case-fold ^EQq results.
* This opens up new idioms, for instance
`EUq^W^W^EQq$` in order to upper-case register q.
It's also the only way you can currently upper-case Unicode codepoints.
|
|
* Ctrl+^ (30) and Caret+caret (^^) were both translated to a single caret.
While there might be some reason to keep this behavior for double-caret,
it is certainly pointless for Ctrl+^.
* That gives you an easy way to insert Ctrl+^ (code 30) into documents with <I>.
Previously, you either had to insert a double-caret, typing 4 carets in a row,
or you had to use <EI> or 30I$.
* The special handling of double-caret could perhaps be abolished altogether,
as we also have ^Q^ to escape plain carets.
The double-caret syntax is very archaic, from the time when there was no proper
^Q, if I recall correctly.
|
|
* Unfortunately, the list in sciteco(7) does not format with FreeBSD's man or
within SciTECO.
* Removed references to the old sciteco.sf.net.
We don't have a proper "homepage" for the time being.
|
|
* Practically requires one of the "Nerd Font" fonts,
so it's disabled by default.
Add 0,512ED to the profile to enable them.
* The new ED flag could be used to control Gtk icons as well,
but they are left always-enabled for the time being.
Is there any reason anybody would like to disable icons in Gtk?
* The list of icons has been adapted and extended from exa:
https://github.com/ogham/exa/blob/master/src/output/icons.rs
* The icons are hardcoded as presorted lists,
so we can binary search them.
This could change in the future. If there is any demand,
they could be made configurable via Q-Registers as well.
|
|
* ALL keypresses (the UTF-8 sequences resulting from key presses) can now be remapped.
* This is especially useful with Unicode support, as you might want to alias
international characters to their corresponding Latin form in the start state,
so you don't have to change keyboard layouts so often.
This is done automatically in Gtk, where we have hardware key press information,
but it has to be done with key macros in Curses.
There is now a new key mask 4 (bit 3) for that purpose.
* Also, you might want to define non-ANSI letters to perform special functions in
the start state, where they won't be accepted by the parser anyway.
Suppose you have a macro M→; you could define
@^U[^K→]{m→} 1^_U[^K→]
This effectively "extends" the parser and allows you to call macro "→" with a single
key press. See also #5.
* The register prefix has been changed from ^F (for function) to ^K (for key).
This is the only thing you have to change in order to migrate existing
function key macros.
* Key macros are enabled by default. There is no longer any way to disable
function key handling in curses, as I never found any reason or need to disable it.
Theoretically, the default ESCDELAY could turn out to be too small, so that function
keys don't get through. I doubt that's possible unless on extremely slow serial lines.
Even then, you'd rather increase ESCDELAY and, instead of disabling function keys,
simply define an escape surrogate.
* The ED flag has been removed and its place is reserved for a future mouse support flag
(which does make sense to disable in curses sometimes).
fnkeys.tes is consequently also enabled by default in sample.teco_ini.
* Key macros are handled as a unit. If one character results in an error,
the entire string is rubbed out.
This fixes the "CLOSE" key on Gtk.
It also makes sure that the original error message is preserved and not overwritten
by some subsequent syntax error.
It was never useful that we kept inserting characters after the first error.
|
|
The following rules apply:
* All SciTECO macros __must__ be in valid UTF-8, regardless of the
register's configured encoding.
This is checked before execution, so we can use glib's non-validating
UTF-8 API afterwards.
* Things will inevitably get slower, as we have to validate all macros first
and convert each and every character passed into the parser to gunichar.
As an optimization, it may make sense to have our own inlineable version of
g_utf8_get_char() (TODO).
Also, Unicode glyphs in syntactically significant positions may be case-folded -
just like ASCII chars were. This is of course slower than case folding
ASCII. The impact of this should be measured and perhaps we should restrict
case folding to a-z via teco_ascii_toupper().
* The language itself does not use any non-ANSI characters, so you don't have to
use UTF-8 characters.
* Wherever the parser expects a single character, it will now accept an arbitrary
Unicode/UTF-8 glyph as well.
In other words, you can call macros like M§ instead of having to write M[§].
You can also get the codepoint of any Unicode character with ^^x.
Pressing a Unicode character in the start state or in Ex and Fx will now
give a sane error message.
* When pressing a key which produces a multi-byte UTF-8 sequence, the character
gets translated back and forth multiple times:
1. It's converted to a UTF-8 string, either buffered or by IME methods (Gtk).
On Curses we could directly get a wide char using wget_wch(), but it's
not currently used, so we don't depend on widechar curses.
2. Parsed into gunichar for passing into the edit command callbacks.
This also validates the codepoint - everything later on can assume valid
codepoints and valid UTF-8 strings.
3. Once the edit command handling decides to insert the key into the command line,
it is serialized back into a UTF-8 string, as the command line macro has
to be in UTF-8 (like all other macros).
4. The parser reads back gunichars without validation for passing into
the parser callbacks.
* Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely
inserted and displayed UTF-8 sequences, are now fixed.
|
|
* When enabled with bit 2 in the ED flags (0,4ED),
all registers and buffers will get the raw ANSI encoding (as if 0EE had been
called on them).
You can still manually change the encoding, e.g. by calling 65001EE afterwards.
* The ANSI mode also sets up character representations for all bytes >= 0x80.
This is currently done only depending on the ED flag, not when setting 0EE.
* Since setting 16,4ED for 8-bit clean editing in a macro can be tricky -
the default unnamed buffer will still be in UTF-8, and at least a bunch
of environment registers as well - we added the command-line option
`--8bit` (short `-8`), which configures the ED flags very early on.
As another advantage, you can mung the profile in 8-bit mode as well
when using SciTECO as a sort of interactive hex editor.
* Disable UTF-8 checks in 8-bit clean mode (sample.teco_ini).
|
|
or codepoints) (refs #5)
* This is trickier than it sounds because there isn't one single place to consult.
It depends on the context.
If the string argument relates to buffer contents - as in <I>, <S>, <FR> etc. -
the buffer's encoding is consulted.
If it goes into a register (EU), the register's encoding is consulted.
Everything else (O, EN, EC, ES...) expects only Unicode codepoints.
* This is communicated through a new field teco_machine_stringbuilding_t::codepage
which must be set in the states' initial callback.
* It seems like overkill just for ^EUq, but it can be used for context-sensitive
processing of all the other string building constructs as well.
* ^V and ^W cannot be supported for Unicode characters for the time being without a Unicode-aware parser.
|
|
* This will naturally work with both ASCII characters and various
non-English scripts.
* Unfortunately, it cannot work with the other non-ANSI single-byte codepages.
* If we'd like to support scripts working with all sorts of codepoints,
we'd have to introduce a new command for translating individual codepoints
from the current codepage (as reported by EE) to Unicode.
|
|
* It's generally a bad idea to pass backslashes as glyphs in macro arguments, even as `\\`,
since this could easily be interpreted as an escape.
* Instead, we now always use `\[rs]`.
|
|
* was introduced in e7867fb0
|