sciteco - Scintilla-based Text Editor and COrrector

Age	Commit message	Author	Files	Lines
2024-09-25	inhibit some immediate editing commands after ^Q/^R string building constructs	Robin Haberkorn	3	-1/+32
	* This allows you to type ^Q^U (which would otherwise rub out the entire argument) and ^Q^W (which would otherwise rub out the ^Q). * ^Q^U coincidentally worked previously since the teco_state_stringbuilding_escaped state would default to teco_state_process_edit_cmd(). But it's better to make this feature explicit. * This finally makes it possible to insert the ^W (23) char into a buffer. In interactive mode, you can still only type Caret+W as a string building construct. * ^G could also be inhibited after ^Q, but the control char is not used anywhere yet, so there is no point in doing that.
2024-09-23	allow OSC-52 clipboards on all terminal emulators	Robin Haberkorn	5	-27/+50
	* The XTerm version is still checked if we detect running under XTerm. * Actually, the XTerm implementation is broken for Unicode clipboard contents. * Kitty supports OSC-52, but you __must__ enable read-clipboard. With read-clipboard-ask, there will be a timeout. But we cannot read without a timeout since otherwise we would hang indefinitely if the escape sequence turns out to not work. * For urxvt, I have hacked an existing extension: https://gist.github.com/rhaberkorn/d7406420b69841ebbcab97548e38b37d * st currently supports only setting the clipboard, but not querying it.
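	For reference, the "set clipboard" half of OSC-52 is just an escape sequence carrying a Base64 payload. The following is a minimal sketch, not the code from this commit: osc52_format_set() is a hypothetical helper, and it assumes the "c" (clipboard) selection and a BEL terminator.

```c
#include <stddef.h>
#include <stdio.h>

/* Standard Base64 alphabet (RFC 4648). */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper (not SciTECO's API): formats the OSC-52 sequence
 * that asks the terminal to set the "c" (clipboard) selection to `data`.
 * Returns the number of bytes written, excluding the trailing NUL,
 * or 0 if `out` is too small.
 */
size_t osc52_format_set(const unsigned char *data, size_t len,
                        char *out, size_t out_cap)
{
    /* "\x1b]52;c;" (7 bytes) + Base64 payload + BEL (1) + NUL (1) */
    size_t need = 7 + 4 * ((len + 2) / 3) + 2;
    if (out_cap < need)
        return 0;

    size_t n = (size_t)sprintf(out, "\x1b]52;c;");
    for (size_t i = 0; i < len; i += 3) {
        /* pack up to 3 input bytes into 24 bits */
        unsigned long v = (unsigned long)data[i] << 16;
        if (i + 1 < len) v |= (unsigned long)data[i + 1] << 8;
        if (i + 2 < len) v |= data[i + 2];
        /* emit 4 output characters, padding with '=' */
        out[n++] = b64_alphabet[(v >> 18) & 63];
        out[n++] = b64_alphabet[(v >> 12) & 63];
        out[n++] = i + 1 < len ? b64_alphabet[(v >> 6) & 63] : '=';
        out[n++] = i + 2 < len ? b64_alphabet[v & 63] : '=';
    }
    out[n++] = '\a'; /* BEL terminates the OSC sequence */
    out[n] = '\0';
    return n;
}
```

	Writing the resulting buffer to the terminal sets the clipboard. Querying sends "?" in place of the payload and then has to wait for the terminal's reply, which is where the timeout mentioned above comes in.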
2024-09-23	updated the Lexilla submodule URL	Robin Haberkorn	1	-1/+1
	I wonder why it even built on CI without that change. It should fix Yocto builds, though.
2024-09-22	Curses: always wgetch() on a dummy pad, avoiding unnecessary wrefresh()	Robin Haberkorn	1	-35/+45
	* This is especially important on platforms requiring the wgetch() poll workaround to detect CTRL+C (PDCurses/WinGUI). wgetch(cmdline_window) would implicitly wrefresh(cmdline_window), which resulted in additional flickering when pressing function keys. This is no longer so important since key macros are processed as a unit and the cmdline will be updated only after processing all of the characters contained in them, ie. only once after the key press. Still, there could have been unwanted side effects. At the very least, wgetch(input_pad) should be faster. * The XTerm clipboard implementation was getch()ing on stdscr, so potentially suffered from the same problem. It should be tested again. * keypad() is now always enabled, even on netbsd-curses. I assume that the function key processing bug in netbsd-curses has been fixed by now. We are not building any releases with netbsd-curses. But it should be retested. * It does not resolve all flickering issues on PDCurses/WinGUI. Both the command line and the Scintilla view still flicker near the cursor. See https://github.com/Bill-Gray/PDCursesMod/issues/322
2024-09-21	disable shared libraries by default	Robin Haberkorn	4	-15/+16
	* This is necessary to fix the Unicode test suite on Win32, so I was always passing in --disable-shared manually. It's easy to forget though when building from scratch. * We don't currently install any (shared) library, so this is safe on all platforms. In fact on all other platforms, libtool detects that and doesn't generate wrapper binaries in any way. Only on win32 it's apparently buggy.
2024-09-21	PDCurses/WinGUI: fixed Unicode icons on win32	Robin Haberkorn	3	-8/+27
	* Turns out that "%C" in wprintw() does not work with non-ANSI chars. * We still don't want to introduce the Curses widechar API, so I added teco_curses_add_wc() as a replacement for wadd_wch().
2024-09-21	syntax errors are reported with "echoed" characters, ie. as purely printable characters	Robin Haberkorn	1	-1/+3
	* Some characters like LF wouldn't be displayed in the message line correctly. * In fact the Gtk UI cannot display any of the control characters correctly. * I was considering deferring all echoing/formatting to the UIs, so they can use TecoGtkLabel or teco_curses_format_str(). This is not possible since messages transmitted via GError must not contain null-bytes, so these need to be sorted out earlier anyway. * This should also fix syntax errors in PDCurses for Windows where "%C" apparently doesn't work with non-ANSI codepoints.
2024-09-21	screenshots.md: make one of the screenshots smaller	Robin Haberkorn	1	-1/+1

2024-09-20	^W^W and ^V^V can be typed completely with upcarets now and they case fold all expansions of ^EQq, ^EUq and so on	Robin Haberkorn	4	-31/+104
	* Previously, there was no way to enter upper-case mode in interactive commands since the Ctrl+W immediate editing command is interpreted everywhere. * Without the case folding of ^EQq/^EUq results, the upper and lower case modes are actually pretty useless considering that modern keyboards have caps lock. So it was clear we need this, regardless of what the classic TECOs did. The TECO-11 manual is not very clear on this. tecoc apparently does not case-fold ^EQq results. * This opens up new idioms, for instance `EUq^W^W^EQq$` in order to upper case register q. It's also the only way you can currently upper-case Unicode codepoints.
2024-09-19	Ctrl+^ is no longer translated to a single caret in string building (refs #20)	Robin Haberkorn	3	-7/+24
	* Ctrl+^ (30) and Caret+caret (^^) were both translated to a single caret. While there might be some reason to keep this behavior for double-caret, it is certainly pointless for Ctrl+^. * That gives you an easy way to insert Ctrl+^ (code 30) into documents with <I>. Previously, you either had to insert a double-caret, typing 4 carets in a row, or you had to use <EI> or 30I$. * The special handling of double-caret could perhaps be abolished altogether, as we also have ^Q^ to escape plain carets. The double-caret syntax is very archaic, from a time when there was no proper ^Q, if I recall correctly.
2024-09-19	fixed Load/Save Q-Reg tests on Mac OS and Win32	Robin Haberkorn	1	-8/+3

2024-09-19	"special" Q-Registers now support EQq/.../ (load) and E%q/.../ (save) commands	Robin Haberkorn	6	-68/+156
	* @EQ$/.../ sets the current directory from the contents of the given file. @E%$/.../ stores the current directory in the given file. * @EQ/.../ will fail, just like ^U...$. @E%/.../ stores the current buffer's name in the given file. It's especially useful with the clipboard registers. There could still be a minor bug in @E%~/.../ with regard to EOL normalization as teco_view_save() will use the EOL style of the current document, which may not be the style of the Q-Reg contents. Conversions can generally be avoided for these particular commands. But without teco_view_save() we'd have to care about save point creation.
2024-09-18	homepage: added favicon and description meta tag	Robin Haberkorn	1	-0/+2

2024-09-18	added NEWS section to homepage	Robin Haberkorn	4	-1/+9
	* This file is required by Autotools and will be distributed in source tarballs as well.
2024-09-18	check that local register is not edited at the end of macro calls	Robin Haberkorn	5	-3/+27
	* This was unsafe and could easily result in crashes, since teco_qreg_current would afterwards point to an already freed Q-Register. * Since automatically editing another register or buffer is not easy to do right, we throw an error instead.
2024-09-17	screenshots.md: added screenshots for the upcoming v2.1.0 release	Robin Haberkorn	1	-0/+6
	Naturally they show off the new Unicode support.
2024-09-17	sciteco(7): mentioned "[a]b" idiom	Robin Haberkorn	1	-1/+2

2024-09-17	updated cheat sheet	Robin Haberkorn	2	-8/+18
	* character-based model, avoid mentioning "ASCII code" * added "0EE" example * should be built with pdfmom, so it's built with gropdf
2024-09-17	screenshots.md: every screenshot is in its own paragraph now	Robin Haberkorn	1	-0/+2

2024-09-17	fixed titles on screenshots page (screenshots.md)	Robin Haberkorn	1	-5/+5

2024-09-17	fixed searches on completely new and empty documents	Robin Haberkorn	2	-1/+6
	This was throwing glib assertions.
2024-09-17	updated screenshots: screenshots.md now lists older screenshots as well	Robin Haberkorn	2	-2/+12

2024-09-17	Github pages are auto-generated from the Markdown files and HTML manuals now	Robin Haberkorn	4	-14/+156
	* This pushes to the gh-pages branch since we don't yet want to introduce a new workflow (that would have to rebuild SciTECO). * Built as part of the nightly MacOS builds. The Ubuntu builds directly build Debian packages which do not contain the HTML manuals. * I don't want to check in images into the master branch. The gh-pages branch is cleaned with every build. Therefore I still cross-link to Sourceforge for any additional images and documents. * We could automatically build the cheat-sheet.pdf (TODO?). For the time being, we are still linking to Sourceforge.
2024-09-17	improved HTML lexer (html.tes)	Robin Haberkorn	1	-3/+48
	This previously highlighted little more than embedded Javascripts.
2024-09-16	updated TODO	Robin Haberkorn	1	-28/+55

2024-09-16	updated lists of external links in sciteco(1) and sciteco(7)	Robin Haberkorn	5	-25/+11
	* Unfortunately, the list in sciteco(7) does not format with FreeBSD's man or within SciTECO. * Removed references to the old sciteco.sf.net. We don't have a proper "homepage" for the time being.
2024-09-16	Curses: added support for cool Unicode icons (refs #5)	Robin Haberkorn	9	-14/+473
	* Practically requires one of the "Nerd Font" fonts, so it's disabled by default. Add 0,512ED to the profile to enable them. * The new ED flag could be used to control Gtk icons as well, but they are left always-enabled for the time being. Is there any reason anybody would like to disable icons in Gtk? * The list of icons has been adapted and extended from exa: https://github.com/ogham/exa/blob/master/src/output/icons.rs * The icons are hardcoded as presorted lists, so we can binary search them. This could change in the future. If there is any demand, they could be made configurable via Q-Registers as well.
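	The "presorted list plus binary search" approach can be sketched with the standard bsearch(). This is an illustration only: the table contents, icon_entry_t and icon_lookup() are hypothetical and use plain ASCII placeholders instead of Nerd Font glyphs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry; names and glyphs are placeholders. */
typedef struct {
    const char *ext;  /* file extension, used as the sort key */
    const char *icon; /* UTF-8 encoded glyph */
} icon_entry_t;

/* Must stay sorted by `ext` in strcmp() order for bsearch() to work. */
static const icon_entry_t icons[] = {
    {"c",   "C"},
    {"md",  "M"},
    {"tes", "T"},
    {"txt", "t"},
};

/* Compares the search key (a string) with a table element. */
static int icon_cmp(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const icon_entry_t *)elem)->ext);
}

/* Returns the icon for `ext`, or `fallback` if there is no entry. */
const char *icon_lookup(const char *ext, const char *fallback)
{
    const icon_entry_t *e = bsearch(ext, icons,
                                    sizeof(icons) / sizeof(icons[0]),
                                    sizeof(icons[0]), icon_cmp);
    return e ? e->icon : fallback;
}
```

	The lookup is O(log n) without any initialization at startup. The price is that the table must be kept sorted by hand, which is one reason a Q-Register based configuration would need a different data structure.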
2024-09-16	fixed rubout of empty forward kill (FK)	Robin Haberkorn	1	-7/+12
	Test case: `IF$ J IX$ FKF$ ^W` Since the range to delete is empty, Scintilla would not generate an undo action, but SCI_UNDO would still be executed on rubout, which removes the "X" too early. * We should really get rid of Scintilla undo actions as they are a source of trouble and complexity. There could be a custom undo token to undo SCI_DELETERANGE that automatically fetches the text that's going to be deleted and stores it in the token's data. This could replace most uses of SCI_UNDO. The rest is to undo insertions, which can easily be replaced with undo__teco_interface_ssm(SCI_DELETERANGE...). * We should really allow rubout tests in the test suite...
2024-09-16	minor search optimization: use SCI_GETRANGEPOINTER	Robin Haberkorn	1	-4/+4
	* if the buffer gap does not fall into the searched area, the gap will no longer be removed. * If it does fall into the range, there is nothing I can do about it. Only Gnulib's re_search_2() allows searching over two buffers.
2024-09-16	test suite: enable the recursion overflow test case everywhere	Robin Haberkorn	2	-3/+6
	* It wasn't failing on FreeBSD because there are different default stacksize limits. We now set it to 8MB everywhere.
2024-09-15	FreeBSD package: add the git.tes lexer config	Robin Haberkorn	1	-0/+1

2024-09-13	updated Scintilla to v5.5.2, Scinterm to v5.1 and Lexilla to HEAD	Robin Haberkorn	4	-1/+1
	* There are patches on top of Scintilla as before. * Scinterm has been switched back to the upstream repository and there are unreleased commits - especially for out-of-tree builds. * Lexilla hasn't been released since my troff lexer was merged.
2024-09-13	remaining types of program counters changed to gsize/gssize	Robin Haberkorn	7	-26/+30
	* This fixes F< to the beginning of the macro, which was broken in 73d574b71a10d4661ada20275cafde75aff6c1ba. teco_machine_main_t::macro_pc actually has to be signed as it is sometimes set to -1.
2024-09-13	fixup abb5d23eba21a2aafda0346c0c5dd845561b2aa2: commandline glitches after errors	Robin Haberkorn	1	-2/+2
	* teco_cmdline.pc is not correct after an error occurred. Therefore start_pc is initialized with teco_cmdline.effective_len.
2024-09-13	fixed up 68578072bfaf6054a96bb6bcedfccb6e56a508fe: negative numbers weren't parsed correctly	Robin Haberkorn	1	-1/+1

2024-09-12	updated README: mention new key macro feature	Robin Haberkorn	1	-1/+2

2024-09-12	update TODO	Robin Haberkorn	1	-23/+18

2024-09-12	function key macros have been reworked into a more generic key macro feature	Robin Haberkorn	10	-288/+380
	* ALL keypresses (the UTF-8 sequences resulting from key presses) can now be remapped. * This is especially useful with Unicode support, as you might want to alias international characters to their corresponding latin form in the start state, so you don't have to change keyboard layouts so often. This is done automatically in Gtk, where we have hardware key press information, but has to be done with key macros in Curses. There is a new key mask 4 (bit 3) for that purpose now. * Also, you might want to define non-ANSI letters to perform special functions in the start state where it won't be accepted by the parser anyway. Suppose you have a macro M→, you could define @^U[^K→]{m→} 1^_U[^K→] This effectively "extends" the parser and allows you to call macro "→" by a single key press. See also #5. * The register prefix has been changed from ^F (for function) to ^K (for key). This is the only thing you have to change in order to migrate existing function key macros. * Key macros are enabled by default. There is no longer any way to disable function key handling in curses, as I never found any reason or need to disable it. Theoretically, the default ESCDELAY could turn out to be too small and function keys don't get through. I doubt that's possible unless on extremely slow serial lines. Even then, you'd have to increase ESCDELAY and instead of disabling function keys simply define an escape surrogate. * The ED flag has been removed and its place is reserved for a future mouse support flag (which does make sense to disable in curses sometimes). fnkeys.tes is consequently also enabled by default in sample.teco_ini. * Key macros are handled as a unit. If one character results in an error, the entire string is rubbed out. This fixes the "CLOSE" key on Gtk. It also makes sure that the original error message is preserved and not overwritten by some subsequent syntax error. It was never useful that we kept inserting characters after the first error.
2024-09-12	teco_string_get_coord() returns character offsets now (refs #5)	Robin Haberkorn	8	-16/+21
	* This is used for error messages (TECO macro stackframes), so it's important to display columns in characters. * Program counters are in bytes and therefore everywhere gsize. This is by glib convention.
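	The difference between byte offsets and character columns can be illustrated as follows. This is a simplified sketch and not the actual teco_string_get_coord(): utf8_get_coord() is a hypothetical name, it uses size_t instead of gsize to stay independent of glib, and it assumes the input is already valid UTF-8.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the actual teco_string_get_coord():
 * translates a byte offset (program counter) into 1-based line and
 * column numbers, counting columns in characters instead of bytes.
 * Assumes that `str` is valid UTF-8.
 */
void utf8_get_coord(const char *str, size_t len, size_t byte_pos,
                    size_t *line, size_t *column)
{
    *line = 1;
    *column = 1;
    for (size_t i = 0; i < len && i < byte_pos; i++) {
        unsigned char c = (unsigned char)str[i];
        if (c == '\n') {
            (*line)++;
            *column = 1;
        } else if ((c & 0xC0) != 0x80) {
            /* count lead bytes only, skip continuation bytes (10xxxxxx) */
            (*column)++;
        }
    }
}
```

	For the macro "M§" followed by a line feed, the byte offset of the line feed is 3, but it is reported in column 3 rather than 4, since "§" occupies two bytes in UTF-8.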
2024-09-11	updated README: mention that the language itself is Unicode-aware	Robin Haberkorn	1	-0/+3

2024-09-11	updated TODO	Robin Haberkorn	1	-11/+19

2024-09-11	improved file name autocompletion	Robin Haberkorn	5	-7/+74
	* pressing ^W in FG now deletes the entire directory component as in EB * commands without glob patterns (eg. EW) can now autocomplete file names containing glob patterns * When the autocompletion contains a glob character in commands accepting glob patterns like EB or EN, we now escape the glob pattern. This already helps if the remaining file name can be autocompleted in one go. Unfortunately, this is still insufficient if we can only partially complete and the partial completion contains glob characters. For instance, if there are 2 files: `file?.txt` and `file?.foo`, completing after `f` will insert `ile[?].`. The second try to press Tab will already do nothing. To fully support these cases, we need a version of teco_file_auto_complete() accepting glob patterns. Perhaps we can simply append `*` to the given glob pattern.
2024-09-11	fixed searches in single-byte encoded documents	Robin Haberkorn	4	-36/+59
	* while code is guaranteed to be in valid UTF-8, this cannot be said about the result of string building. * The search pattern can end up with invalid Unicode bytes even when searching on UTF-8 buffers, e.g. if ^EQq inserts garbage. There are currently no checks. * When searching on a raw buffer, it must be possible to search for arbitrary bytes (^EUq). Since teco_pattern2regexp() was always expecting clean UTF-8 input, this would sometimes skip over too many bytes and could even crash. * Instead, teco_pattern2regexp() now takes the <S> target codepage into account.
2024-09-11	the SciTECO parser is Unicode-based now (refs #5)	Robin Haberkorn	29	-202/+325
	The following rules apply: * All SciTECO macros __must__ be in valid UTF-8, regardless of the register's configured encoding. This is checked before execution, so we can use glib's non-validating UTF-8 API afterwards. * Things will inevitably get slower as we have to validate all macros first and convert to gunichar for each and every character passed into the parser. As an optimization, it may make sense to have our own inlineable version of g_utf8_get_char() (TODO). Also, Unicode glyphs in syntactically significant positions may be case-folded - just like ASCII chars were. This is of course slower than case folding ASCII. The impact of this should be measured and perhaps we should restrict case folding to a-z via teco_ascii_toupper(). * The language itself does not use any non-ANSI characters, so you don't have to use UTF-8 characters. * Wherever the parser expects a single character, it will now accept an arbitrary Unicode/UTF-8 glyph as well. In other words, you can call macros like M§ instead of having to write M[§]. You can also get the codepoint of any Unicode character with ^^x. Pressing a Unicode character in the start state or in Ex and Fx will now give a sane error message. * When pressing a key which produces a multi-byte UTF-8 sequence, the character gets translated back and forth multiple times: 1. It's converted to a UTF-8 string, either buffered or by IME methods (Gtk). On Curses we could directly get a wide char using wget_wch(), but it's not currently used, so we don't depend on widechar curses. 2. Parsed into gunichar for passing into the edit command callbacks. This also validates the codepoint - everything later on can assume valid codepoints and valid UTF-8 strings. 3. Once the edit command handling decides to insert the key into the command line, it is serialized back into a UTF-8 string as the command line macro has to be in UTF-8 (like all other macros). 4. The parser reads back gunichars without validation for passing into the parser callbacks. * Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely inserted and displayed UTF-8 sequences, are now fixed.
2024-09-10	fixed win32 CI and nightly builds (refs #5)	Robin Haberkorn	3	-8/+19
	* The libtool wrapper binaries do not pass down UTF-8 strings correctly, so the Unicode tests failed under some circumstances. * As we aren't actually linking against any locally-built shared libraries, we are passing --disable-shared to libtool which inhibits wrapper generation on win32 and fixes the test suite. * Also use up to date autotools. This didn't fix anything, though. * test suite: try writing a Unicode filename as well * There have been problems doing that on Win32 where UTF-8 was not correctly passed down from the command line and some Windows API calls were only working with ANSI filenames etc.
2024-09-10	win32: fixed opening and saving UTF-8 filenames (refs #5)	Robin Haberkorn	1	-5/+15
	* The default ANSI versions of the Win32 API calls worked only as long as we used the ANSI subset of UTF-8 in filenames. * There is g_win32_locale_filename_from_utf8(), but it's not guaranteed to derive a unique filename. * Therefore we define UNICODE and convert between UTF-8 and UTF-16 (Windows' native Unicode encoding).
2024-09-10	win32: convert command line to UTF-8 (refs #5)	Robin Haberkorn	2	-17/+31
	* Should prevent data loss due to system locale conversions when parsing command line arguments. * Should also fix passing Unicode arguments to munged macros and therefore opening files via ~/.teco_ini. * The entire option parsing is based on GStrv (null-terminated string lists) now, also on UNIX.
2024-09-10	fixed Mac OS nightly builds by installing an up-to-date Groff	Robin Haberkorn	1	-1/+2
	The Mac OS 12 Groff apparently does not accept `-K` for preconv.
2024-09-09	try a different value for LC_ALL on Mac OS to accept UTF-8 command lines (refs #5)	Robin Haberkorn	1	-2/+1

2024-09-09	testsuite: try different locale on Mac OS (refs #5)	Robin Haberkorn	1	-1/+9
	hopefully fixes the Unicode test cases on Mac OS