sciteco - Scintilla-based Text Editor and COrrector

Age	Commit message (Collapse)	Author	Files	Lines
2024-11-18	fixed some common typos: "ie." and "eg.", "ocur" instead of "occur"	Robin Haberkorn	1	-1/+1

2024-09-11	the SciTECO parser is Unicode-based now (refs #5)	Robin Haberkorn	1	-1/+1
	The following rules apply: * All SciTECO macros __must__ be in valid UTF-8, regardless of the the register's configured encoding. This is checked against before execution, so we can use glib's non-validating UTF-8 API afterwards. * Things will inevitably get slower as we have to validate all macros first and convert to gunichar for each and every character passed into the parser. As an optimization, it may make sense to have our own inlineable version of g_utf8_get_char() (TODO). Also, Unicode glyphs in syntactically significant positions may be case-folded - just like ASCII chars were. This is is of course slower than case folding ASCII. The impact of this should be measured and perhaps we should restrict case folding to a-z via teco_ascii_toupper(). * The language itself does not use any non-ANSI characters, so you don't have to use UTF-8 characters. * Wherever the parser expects a single character, it will now accept an arbitrary Unicode/UTF-8 glyph as well. In other words, you can call macros like M§ instead of having to write M[§]. You can also get the codepoint of any Unicode character with ^^x. Pressing an Unicode character in the start state or in Ex and Fx will now give a sane error message. * When pressing a key which produces a multi-byte UTF-8 sequence, the character gets translated back and forth multiple times: 1. It's converted to an UTF-8 string, either buffered or by IME methods (Gtk). On Curses we could directly get a wide char using wget_wch(), but it's not currently used, so we don't depend on widechar curses. 2. Parsed into gunichar for passing into the edit command callbacks. This also validates the codepoint - everything later on can assume valid codepoints and valid UTF-8 strings. 3. Once the edit command handling decides to insert the key into the command line, it is serialized back into an UTF-8 string as the command line macro has to be in UTF-8 (like all other macros). 4. The parser reads back gunichars without validation for passing into the parser callbacks. * Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely inserted and displayed UTF-8 sequences, are now fixed.
2024-01-21	updated copyright to 2024	Robin Haberkorn	1	-1/+1

2023-04-14	allow disabling Lexilla (Lexer) support by specifying --without-lexilla	Robin Haberkorn	1	-1/+6
	* This does not make sense for most SciTECO builds, but only when you want to optimize for size as the lexers take up 50% of the compressed binary size. Without Lexilla, it should be possible get it compiled in about 500kb. * It can be useful for instance when building for embedded distributions. * When Lexilla is disabled, symbols-scilexer.c is also not generated (we assume that the Lexilla sources are not available and it also doesn't serve any purpose). * Consequently, most of the lexer configuration scripts are also not installed under --without-lexilla.
2023-04-05	updated copyright to 2023	Robin Haberkorn	1	-1/+1

2022-11-28	fixed a number of crashes due to empty string arguments or uninitialized ↵	Robin Haberkorn	1	-2/+3
	registers * An empty but valid teco_string_t can contain NULL pointers. More precisely, a state's done_cb() can be invoked with such empty strings in case of empty string arguments. Also a registers get_string() can return the NULL pointer for existing registers with uninitialized string parts. * In all of these cases, the language should treat "uninitialized" strings exactly like empty strings. * Not doing so, resulted in a number of vulnerabilities. * EN$$ crashed if "_" was uninitialized * The ^E@q and ^ENq string building constructs would crash for existing but uninitialized registers q. * ?$ would crash * ESSETILEXER$$ would crash * This is now fixed. Test cases have been added. * I cannot guarantee that I have found all such cases. Generally, it might be wise to change our definitions and make sure that every teco_string_t must have an associated heap object to be valid. All functions returning pointer+length pairs should consequently also never return NULL pointers.
2022-06-21	updated copyright to 2022 and updated TODO	Robin Haberkorn	1	-1/+1

2021-10-11	upgraded to Scintilla 5.1.3 and Scinterm 3.1	Robin Haberkorn	1	-40/+45
	* Previous Scintilla version was 3.6.4 and Scinterm was 1.7 (with lots of custom patches). All of the patches are now either irrelevant or have been merged upstream. * Since Scintilla 5 requires C++17, this increases the minimum GCC version at least to 5.0. We may actually require even newer versions. * I could not upgrade the scintilla-mirror (which was imported from Mercurial), so the old sciteco-dev branch was renamed to sciteco-dev-pre-v2.0.0, master was deleted and I reimported the entire Scintilla repo using git-remote-hg. This means that scintilla-mirror now contains two entirely separate trees. But it is still possible to clone old SciTECO repos. * The strategy/workflow of maintaining hotfix branches on scintilla-mirror has been changed. Instead of having one sciteco-dev branch that is rebased onto new Scintilla upstream releases and tagging SciTECO releases in scintilla-mirror (to keep the commits referenced), we now create a branch for every Scintilla version we are based on (eg. sciteco-rel-5-1-3). This branch is never rebased or deleted. Therefore, we are guaranteed to be able to clone arbitrary SciTECO repo commits - not only releases. Releases no longer have to be tagged in scintilla-mirror. On the downside, fixup commits may accumulate in these new branches. They can only be squashed once a new branch for a new Scintilla release is created (e.g. by cherry-picking followed by rebase). * Scinterm does no longer have to reside in the Scintilla subdirectory, so we added it as a regular submodule. There are no more recursive submodules. The Scinterm build system has not been improved at all, but we use a trick based on VPATH to build Scinterm in scintilla/bin/. * Scinterm is now in Git and we reference the upstream repo for the time being. We might mirror it and apply the same branching workflow as with Scintilla if necessary. The scinterm-mirror repository still exists but has not been touched. We will also have to rewrite its master branch as it was a non-reproducible Mercurial import. * Scinterm now also comes with patches for Scintilla which we simply applied on our sciteco-rel-5-1-3 branch. * Scintilla 5 outsourced its lexers into the Lexilla project. We added it as yet another submodule. * All submodules have been moved into contrib/. * The Scintilla API for setting lexers has consequently changed. We now have to call SCI_SETILEXER(0, CreateLexer(name)). As I did not want to introduce a separate command for setting lexers, <ES> has been extended to allow setting lexers by name with the SCI_SETILEXER message which effectively replaces SCI_SETLEXERLANGUAGE. * The lexer macros (SCLEX_...) no longer serve any purpose - they weren't used in the SciTECO standard library anyway - and have consequently been removed from symbols-scilexer.c. The style macros from SciLexer.h (SCE_...) are theoretically still useful - even though they are not used by our current color schemes - and have therefore been retained. They can be specified as wParam in <ES>. * <ES> no longer allows symbolic constants for lParam. This never made any sense since all supported symbols were always wParam. * Scinterm supports new native cursor modes. They are not used for the time being and the previous CARETSTYLE_BLOCK_AFTER caret style is configured by default. It makes no sense to enable native cursor modes now since the command line should have a native cursor but is not yet a Scintilla view. * The Scintilla upgrade performed much worse than before, so some optimizations will be necessary.
2021-06-02	renamed scintilla.[ch] to symbols.[ch]: fixes builds on case-insensitive ↵	Robin Haberkorn	1	-0/+349
	file systems * There is a "Scintilla.h" as well. * should fix macOS and builds on native Windows hosts * It wasn't practical to refer to the Scintilla includes using paths since the Scintilla location is configurable (--with-scintilla). So we'd have to write something like #include <include/Scintilla.h>. For Scinterm we cannot avoid collisions neither as its path is also configurable (--with-scinterm). Effectively, we must prevent name clashes across SciTECO and all of Scintilla and Scinterm.