sciteco - Scintilla-based Text Editor and COrrector

Age	Commit message	Author	Files	Lines
9 days	mention both mailing list and personal mail in `sciteco --help`	Robin Haberkorn	1	-1/+1

2025-08-09	Win32: avoid any automatic LF to CRLF conversions when writing to stdout	Robin Haberkorn	1	-0/+19
	* At least the MSVCRT does this by default, i.e. the translation mode of stdout is not _O_BINARY. * This broke piping through SciTECO with --stdin --stdout, as this relies on SciTECO's builtin EOL normalization. Instead, you would get DOS linebreaks on output even if the source stream contains only UNIX linebreaks. * It would also break binary filters. * It seems to be safe to print only LF also for regular stdio (help and error messages), so I simply disable the stdout (and stdin and stderr) EOL translation globally. * Also fixes Troff warnings due to the .in preprocessor writing output with DOS linebreaks. * Added a test case. All future platforms shouldn't perform any unexpected EOL translations on output.
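As a rough illustration of the fix described above, a program can switch its standard streams to binary mode with the MSVCRT `_setmode()` call. The helper name `disable_eol_translation()` is hypothetical; this is only a sketch under that assumption, not SciTECO's actual code. On non-Windows platforms it does nothing.

```c
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/*
 * Hypothetical helper: put stdin, stdout and stderr into binary mode, so
 * that the C runtime performs no LF <-> CRLF translation.
 * Returns 0 on success and -1 on failure.
 */
int disable_eol_translation(void)
{
#ifdef _WIN32
	if (_setmode(_fileno(stdin), _O_BINARY) == -1 ||
	    _setmode(_fileno(stdout), _O_BINARY) == -1 ||
	    _setmode(_fileno(stderr), _O_BINARY) == -1)
		return -1;
#endif
	/* Nothing to do elsewhere: no translation mode exists to switch off. */
	return 0;
}
```

Such a call would have to run early in `main()`, before anything is written to stdout.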
2025-08-06	command-line arguments are no longer passed via the unnamed buffer, but via special Q-registers ^Ax	Robin Haberkorn	1	-18/+18
	* The unnamed buffer is also used for reading from --stdin, so you couldn't practically combine --stdin with passing command-line arguments to macros. * The old approach of passing command-line arguments via lines in the unnamed buffer was flawed anyway as it wouldn't work with filenames containing LF. This is just a very ancient feature, written when there weren't even long Q-reg names in SciTECO. * You can now e.g. pipe into SciTECO and edit what was read interactively, e.g. `dmesg | sciteco -i`. You can practically use SciTECO as a pager. * htbl.tes is now a command-line filter (uses -qio). * grosciteco.tes reads Troff intermediate code from stdin, so we no longer need ".intermediate" temporary files. * Added a getopt.tes test case to the testsuite. * This change unfortunately breaks most macros accepting command-line arguments, even if they used getopt.tes. It also requires updating ~/.teco_ini - see fallback.teco_ini.
2025-08-03	added --quiet, --stdin and --stdout for easier integration into UNIX pipelines	Robin Haberkorn	1	-1/+40
	* In principle --stdin and --stdout could have been done in pure TECO code using the <^T> command. Having built-in command-line arguments however has several advantages: * Significantly faster than reading byte-wise with ^T. * Performs EOL normalization unless specifying --8bit of course. * Significantly shortens command-lines. `sciteco -qio` and `sciteco -qi` can be real replacements for sed and awk. * You can even place SciTECO into the middle of a pipeline while editing interactively: foo | sciteco -qio --no-profile | bar Unfortunately, this will not currently work when munging the profile as command-line parameters are also transmitted via the unnamed buffer. This should be changed to use special Q-registers (FIXME). * --quiet can help to improve the test suite (TODO). Should probably be the default in TE_CHECK(). * --stdin and --stdout allow simplifying many SciTECO scripts, avoiding temporary files, especially for womenpage generation (TODO). * For processing potentially infinite streams, you will still have to read using ^T.
2025-07-31	added -v/--version and <EO> command	Robin Haberkorn	1	-0/+8
	* DEC TECO had an <EO> command. In contrast to DEC TECO's implementation, the value reported by <EO> encodes a major.minor.micro semantic version.
2025-07-28	`ED&2` can be used to access the program termination flag now	Robin Haberkorn	1	-1/+2
	* `0,2ED` is roughly equivalent to `-EX` * `ED&2` can be used to query whether EX has been run. This is useful if macros can run EX. * `2,0ED` could be used to cancel the effect of EX. * But the real motivation is for implementing a REPL script.
2025-07-27	fixed using the command-line replacement register (ESC) in batch mode: was causing assertions when entering interactive mode	Robin Haberkorn	1	-2/+1
	Also added a regression test case.
2025-07-13	allow changing the default clipboard by setting the `~` integer	Robin Haberkorn	1	-6/+12
	* It continues to default to 67 (C), which is the system clipboard. But you can now overwrite it e.g. by adding `^^PU~` to the profile. * This fixes a minor memory leak: If you set one of the clipboard registers in the profile (initializing them as plain registers), the clipboard register had been leaked. The clipboard registers now replace any existing register, while at the same time preserving the numeric part. * All remaining Q-Reg table insertions use a new function teco_qreg_table_insert_unique() which adds an assertion, so that we notice any future possible memory leaks.
2025-05-18	allow process exit status to be determined by macros	Robin Haberkorn	1	-4/+10
	* Any value left on the numeric stack now determines the exit code. This ensures you can call n^C as the SciTECO version of exit(n). It will also work with n$$ in the top level macro. But you don't necessarily need any of these commands. * Could be useful in shell scripting as in `sciteco -e "@EB/file/ :@S/foo/\"F1'"` to fail if `foo` is not found.
2025-03-19	fixup cddc9bf83eb5cd2c69626b31ae7373342523b626: errors must be printed before cleaning up the interface	Robin Haberkorn	1	-6/+4
	This fixes crashes on Gtk.
2025-03-19	fixup cddc9bf83eb5cd2c69626b31ae7373342523b626: avoid g_prefix_error_literal()	Robin Haberkorn	1	-1/+1
	This function requires glib v2.70, which impacted portability.
2025-03-17	free some global objects even in the error case, in order to appease Valgrind	Robin Haberkorn	1	-14/+18
	* If building with --enable-debug, we should always free all heap objects, even if they would be freed on program termination anyway, so they won't appear as "possibly lost" in Valgrind. I had this missing if a munged or evaled macro failed, which resulted in lots of false positives when running the testsuite under Valgrind. * Also fixes possible crashes due to reusing already set GError variables. This could theoretically happen if a munged script terminates with ^C and its "quit" ED-hook would also throw any error.
2025-03-03	rename sample.teco_ini to fallback.teco_ini and mung it by default	Robin Haberkorn	1	-2/+24
	* After installation, SciTECO will therefore start into a more user-friendly mode even if the user does not create a custom ~/.teco_ini. It is hoped that this will scare away fewer new users who are not willing to read through all of the documentation. Still, users are warned in the absence of ~/.teco_ini. This warning, however, might not be immediately visible, especially not when running gsciteco without an attached console. (This will change once I redo the UI and allow a number of messages to be queued in the message area.) * Theoretically, you could also just extend fallback.teco_ini from ~/.teco_ini, but that would require installing it into $SCITECOPATH. * Since the fallback profile will now be munged automatically on a wide range of systems, we set up xclip only when detecting X11 ($DISPLAY is non-empty). E.g. when running under Wayland or the Linux console, you still won't get the clipboard registers, which is probably better than having the clipboard operations fail once you try to use them. * xclip is now "suggested" on Debian/Ubuntu. Unfortunately we cannot pull it in only in the presence of X11.
2025-01-13	updated copyright to 2025	Robin Haberkorn	1	-1/+1

2024-12-30	support +line[,column] and filename:line:column syntaxes when opening files	Robin Haberkorn	1	-12/+33
	* This is done via the new opener.tes in the standard library. * Some programs that use $EDITOR expect the +line syntax to work. * You can copy filename:line:column directly from GCC error messages and filename:line from grep output. * Since there may be safe file names beginning with "+" or containing colons, there needs to be a way to turn this off, especially for scripts that don't know anything about the filenames to open. This is done with "--". Unfortunately, the first "--", which stops parameter processing, is always removed from the command line and not passed down into TECO land. This is not a problem for stand-alone scripts, since the script filename is already stopping option processing, so "--" would get passed down. But when calling the profile via `sciteco -- ...`, you could prevent leading minus signs from causing problems, but since the `--` is removed, opener.tes cannot use it as a hint. Therefore, we introduced `-S` as a new alternative to `--`, which is always passed down as `--` (i.e. it is equivalent to "-- --"). In other words, `sciteco -S ` will always open exactly the specified files without any danger of misinterpreting certain file names. Should we ever switch to a custom option parsing algorithm, we might preserve "--" (unless after --mung) and thus get rid of "-S". This advanced behavior can be tweaked by the user relatively easily. In the easiest case, we could replace M[opener] with <:L;R 0X.f [* @EB/^EN.f/ ]* L> in ~/.teco_ini to completely disable the special syntax.
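The two syntaxes mentioned above can be illustrated with a small C sketch. Note that the real logic lives in opener.tes and is written in TECO; the function names `parse_plus_arg()` and `split_colon_spec()` are hypothetical, and the default column of 1 is an assumption made for this example only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a "+line[,column]" argument.
 * On success, stores both numbers (column defaults to 1) and returns true.
 * On failure, the outputs are left untouched.
 */
bool parse_plus_arg(const char *arg, long *line, long *column)
{
	char *end;
	long l, c = 1;

	assert(arg != NULL && line != NULL && column != NULL);
	if (arg[0] != '+' || !isdigit((unsigned char)arg[1]))
		return false;
	l = strtol(arg + 1, &end, 10);
	if (*end == ',') {
		if (!isdigit((unsigned char)end[1]))
			return false;
		c = strtol(end + 1, &end, 10);
	}
	if (*end != '\0')
		return false;
	*line = l;
	*column = c;
	return true;
}

/*
 * If s[0..len) ends in ":<digits>", store the number and return the index
 * of that colon. Otherwise return len.
 */
static size_t strip_number(const char *s, size_t len, long *value)
{
	size_t i = len;

	while (i > 0 && isdigit((unsigned char)s[i - 1]))
		i--;
	if (i == len || i == 0 || s[i - 1] != ':')
		return len;
	*value = strtol(s + i, NULL, 10);
	return i - 1;
}

/*
 * Split "filename:line[:column]" from the right.
 * Returns the length of the filename part, or 0 if there is no such suffix.
 */
size_t split_colon_spec(const char *arg, long *line, long *column)
{
	long last = 0, prev = 0;
	size_t len, cut, cut2;

	assert(arg != NULL && line != NULL && column != NULL);
	len = strlen(arg);
	cut = strip_number(arg, len, &last);
	if (cut == len || cut == 0)
		return 0; /* no ":number" suffix or empty filename */
	cut2 = strip_number(arg, cut, &prev);
	if (cut2 != cut && cut2 > 0) {
		/* filename:line:column */
		*line = prev;
		*column = last;
		return cut2;
	}
	/* filename:line */
	*line = last;
	*column = 1;
	return cut;
}
```

A file that is really named like `notes:2024` would be misparsed by such a scheme, which is exactly why the commit provides `--` and `-S` to switch the special syntax off.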
2024-11-24	added special Q-Register ":" for accessing dot	Robin Haberkorn	1	-0/+2
	* We cannot call it "." since that introduces a local register and we don't want to add an unnecessary syntactic exception. * Allows the idiom [: ... ]: to temporarily move around. Also, you can now write ^E\: without having to store dot in a register first. * In the future we might add an ^E register as well for byte offsets. However, there are much fewer useful applications. * Of course, you can now also write nU: instead of nJ, Q: instead of "." and n%: instead of "nC.". However it's all not really useful.
2024-11-23	the search mode and current radix are mapped to __local__ Q-Registers ^X and ^R now (refs #17)	Robin Haberkorn	1	-1/+6
	* This way the search mode and radix are local to the current macro frame, unless the macro was invoked with :Mq. If colon-modified, you can reproduce the same effect by calling [.^X 0^X ... ].^X * The radix register is cached in the Q-Reg table as an optimization. This could be done with the other "special" registers as well, but at the cost of larger stack frames. * In order to allow constructs like [.^X typed with upcarets, the Q-Register specification syntax has been extended: ^c is the corresponding control code instead of the register "^".
2024-11-10	Win32: fixed Unicode commandlines with newer MinGW runtimes	Robin Haberkorn	1	-0/+13
	* should also fix Win32 nightly builds * Even though we weren't using main's argv, but were using glib API for retrieving the command line in UTF-8, newer MinGW runtimes would fail when converting the Unicode command line into the system codepage would be lossy. * Most people seem to compile in a "manifest" to work around this issue. But this requires newer Windows versions and using some Microsoft tool which isn't even in $PATH. Instead, we now link with -municode and define wmain() instead, even though we still ignore argv. wmain() probably gets the command line in UTF-16 and we'd have to convert it anyway. * See https://github.com/msys2/MINGW-packages/issues/22462
2024-11-07	test suite: fixed failure detection in the commandline-editing test cases	Robin Haberkorn	1	-0/+4
	* The program exit code will usually not signal failures since they are caught earlier. * Therefore, we always have to capture and check stderr.
2024-11-06	fixed possible crashes during --fake-cmdline	Robin Haberkorn	1	-4/+2
	* A test case has been added, although it might have been accidental that it caused crashes.
2024-11-05	fully support relocatable binaries, improving AppImages	Robin Haberkorn	1	-13/+10
	* You can now specify `--with-scitecodatadir` as a relative path, which will be interpreted relative to the binary's location. * Win32 binaries already were relocatable, but this was a Windows-specific hack. Win32 binaries are now built with `--with-scitecodatadir=.` since everything is in a single directory. * Ubuntu packages are now also built `--with-scitecodatadir=../share/sciteco`. This is not crucial for ordinary installations, but is meant for AppImage creation. * Since AppImages are now built from relocatable packages, we no longer need the unionfs-workaround from pkg2appimage. This should fix the strange root contents when autocompleting in AppImage builds. * This might also fix the appimage.github.io CI issues. I assume that because I could reproduce the issue on FreeBSD's Linuxulator depending on pkg2appimage's "union"-setting. See https://github.com/AppImage/appimage.github.io/pull/3402 * Determining the binary location actually turned out to be hard and very platform-dependent. There are now implementations for Windows (which could also read argv[0]), Linux and generic UNIX (which works on FreeBSD, but I am not sure about the others). I believe this could also be useful on Mac OS to create app bundles, but this needs to be tested - currently the Mac OS binaries are installed into fixed locations and don't use relocation.
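The idea of interpreting a relative data directory against the binary's location can be sketched as follows. The function `resolve_datadir()` is hypothetical and handles only POSIX-style paths without canonicalization; how the executable path is obtained is left out. On Linux, one common source would be `readlink()` on `/proc/self/exe`, which is an assumption here and not something the commit message states.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: resolve `datadir` against the directory that
 * contains the executable `exe_path`. Absolute data directories are
 * copied unchanged.
 * Returns 0 on success and -1 if the result does not fit into `out`.
 */
int resolve_datadir(const char *exe_path, const char *datadir,
                    char *out, size_t out_size)
{
	const char *slash;
	int dir_len, n;

	assert(exe_path != NULL && datadir != NULL && out != NULL);
	if (datadir[0] == '/') {
		n = snprintf(out, out_size, "%s", datadir);
	} else {
		slash = strrchr(exe_path, '/');
		/* Length of the directory part, including the trailing slash. */
		dir_len = slash ? (int)(slash - exe_path) + 1 : 0;
		n = snprintf(out, out_size, "%.*s%s", dir_len, exe_path, datadir);
	}
	return n >= 0 && (size_t)n < out_size ? 0 : -1;
}
```

With an executable at `/opt/sciteco/bin/sciteco` and a data directory of `../share/sciteco`, this yields `/opt/sciteco/bin/../share/sciteco`, which the operating system resolves to `/opt/sciteco/share/sciteco` provided that `bin` is not a symbolic link.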
2024-11-03	Added "infinite monkey"-style test (refs #26)	Robin Haberkorn	1	-0/+18
	Supposing that any monkey hitting keys on a typewriter, serving as a hardcopy SciTECO terminal, will sooner or later trigger bugs and crash the application, the new monkey-test.apl script emulates such a monkey. In fact it's a bit more elaborate as the generated macro follows the frequency distribution extracted from the corpus of SciTECO macro files (via monkey-parse.apl). This, it is hoped, increases the chance to get into "interesting" parser states. This also adds a new hidden --sandbox argument, but it works only on FreeBSD (via Capsicum) so far. In sandbox mode, we cannot open any file or execute external commands. It is ensured that SciTECO cannot assert in sandbox mode for scripts that would run without --sandbox, since assertions are the kind of things we would like to detect. SciTECO must be sandboxed during "infinite monkey" tests, so it cannot accidentally do any harm on the system running the tests. All macros in sandbox mode must currently be passed via --eval. Alternatively, we could add a test compilation unit and generate the test data directly in memory via C code. The new scripts are written in GNU APL 1.9 and will probably work only under FreeBSD. These scripts are not meant to be run by everyone.
2024-10-28	added hidden --fake-cmdline parameter for testing command-line editing	Robin Haberkorn	1	-0/+13
	* Supports all immediate editing commands. Naturally it cannot emulate arbitrary key presses since there is no canonical ASCII-encoding of function keys. Key macros are consequently also not testable. The --fake-cmdline parameter is instead treated very similar to a key macro expansion. * Most importantly this allows adding test cases for rubout behavior and bugs that are quite common. * Added regression test cases for the last two rubout bugs. * It's not easy to pass control codes in command line arguments in a portable manner, so the test cases will often use { and }. Control codes could be used e.g. by defining variables like RUBOUT=`printf '\b'` and referencing them with ${RUBOUT}.
2024-09-21	disable shared libraries by default	Robin Haberkorn	1	-0/+5
	* This is necessary to fix the Unicode test suite on Win32, so I was always passing in --disable-shared manually. It's easy to forget though when building from scratch. * We don't currently install any (shared) library, so this is safe on all platforms. In fact on all other platforms, libtool detects that and doesn't generate wrapper binaries in any way. Only on win32 it's apparently buggy.
2024-09-16	updated lists of external links in sciteco(1) and sciteco(7)	Robin Haberkorn	1	-1/+1
	* Unfortunately, the list in sciteco(7) does not format with FreeBSD's man or within SciTECO. * Removed references to the old sciteco.sf.net. We don't have a proper "homepage" for the time being.
2024-09-10	win32: convert command line to UTF-8 (refs #5)	Robin Haberkorn	1	-17/+21
	* Should prevent data loss due to system locale conversions when parsing command line arguments. * Should also fix passing Unicode arguments to munged macros and therefore opening files via ~/.teco_ini. * The entire option parsing is based on GStrv (null-terminated string lists) now, also on UNIX.
2024-09-09	added raw ANSI mode to facilitate 8-bit clean editing (refs #5)	Robin Haberkorn	1	-0/+7
	* When enabled with bit 2 in the ED flags (0,4ED), all registers and buffers will get the raw ANSI encoding (as if 0EE had been called on them). You can still manually change the encoding, eg. by calling 65001EE afterwards. * Also the ANSI mode sets up character representations for all bytes >= 0x80. This is currently done only depending on the ED flag, not when setting 0EE. * Since setting 16,4ED for 8-bit clean editing in a macro can be tricky - the default unnamed buffer will still be at UTF-8 and at least a bunch of environment registers as well - we added the command line option `--8bit` (short `-8`) which configures the ED flags very early on. As another advantage you can mung the profile in 8-bit mode as well when using SciTECO as a sort of interactive hex editor. * Disable UTF-8 checks in 8-bit clean mode (sample.teco_ini).
2024-09-09	allow Unicode characters in command line arguments (refs #5)	Robin Haberkorn	1	-0/+8
	* the locale must be initialized very early before g_option_context_parse() * will allow UTF-8 characters in the test suite
2024-01-21	updated copyright to 2024	Robin Haberkorn	1	-1/+1

2023-05-09	fixed CTRL+C interruptions on Windows; optimized CTRL+C polling on Gtk+	Robin Haberkorn	1	-27/+8
	* teco_interrupt() turned out to be unsuitable to kill child processes (eg. when <EB> hangs). Instead, we have Win32-specific code now. * Since SIGINT can be ignored on UNIX, pressing CTRL+C was not guaranteed to kill the child process (eg. when <EB> hangs). At the same time, it makes sense to send SIGINT first, so programs can terminate gracefully. The behaviour has therefore been adapted: Interrupting with CTRL+C the first time will kill gracefully. The second time, a more aggressive signal is sent to kill the child process. Unfortunately, this would be relatively tricky and complicated to do on Windows, so CTRL+C will always "hard-kill" the child process. * Moreover, teco_interrupt() killed the entire process on Windows when called the second time. This resulted in any interruption terminating SciTECO unexpectedly when tried the second time on Gtk/Win32. * teco_sigint_occurred renamed to teco_interrupted: There may be several different sources for setting this flag. * Checking for CTRL+C on Gtk involves driving the main event loop repeatedly. This is a very expensive operation. We now do that only every 100ms. This is still sufficient since keyboard input comes from humans. This optimization saves 75% runtime on Windows and 90% on Linux. * The same optimization turned out to be counterproductive on PDCurses/WinGUI.
2023-04-05	updated copyright to 2023	Robin Haberkorn	1	-1/+1

2022-06-21	updated copyright to 2022 and updated TODO	Robin Haberkorn	1	-1/+1

2021-10-11	fixed crashes when the Q-Reg stack is non-empty at exit	Robin Haberkorn	1	-0/+1
	* Test case: sciteco -e '[a' [aEX$$ in interactive mode would also crash. * No longer use a destructor - it was executed after the Q-Reg view was destroyed. * Instead, we now explicitly call teco_qreg_stack_clear() in main(). * Added a regression test case.
2021-06-08	Windows: normalize $COMSPEC	Robin Haberkorn	1	-1/+10
	* Environment variables are case insensitive on Windows while SciTECO variables are case sensitive. We must therefore make sure that we first unset any $COMSPEC or $ComSpec from the environment before resetting it, thereby fixing its case. * Fixes command execution via <EC> on systems where the variable was not called $ComSpec.
2021-05-30	THE GREAT CEEIFICATION EVENT	Robin Haberkorn	1	-0/+459
	This is a total conversion of SciTECO to plain C (GNU C11). The chance was taken to improve a lot of internal datastructures, fix fundamental bugs and lay the foundations of future features. The GTK user interface is now in a usable state! All changes have been squashed together.
	The language itself has almost not changed at all, except for:
	* Detection of string terminators (usually Escape) now takes the string building characters into account. A string is only terminated outside of string building characters. In other words, you can now for instance write I^EQ[Hello$world]$ This removes one of the last bits of shellisms which is out of place in SciTECO where no tokenization/lexing is performed. Consequently, the current termination character can also be escaped using ^Q/^R. This is used by auto completions to make sure that strings are inserted verbatim and without unwanted side effects.
	* All strings can now safely contain null-characters (see also: 8-bit cleanliness). The null-character itself (^@) is not (yet) a valid SciTECO command, though.
	An incomplete list of changes:
	* We got rid of the BSD headers for RB trees and lists/queues. The problem with them was that they used a form of metaprogramming only to gain a bit of type safety. It also resulted in less readable code. This was a C++ disease. The new code avoids metaprogramming only to gain type safety. The BSD tree.h has been replaced by rb3ptr by Jens Stimpfle (https://github.com/jstimpfle/rb3ptr). This implementation is also more memory efficient than BSD's. The BSD list.h and queue.h have been replaced with a custom src/list.h.
	* Fixed crashes, performance issues and compatibility issues with the Gtk 3 User Interface. It is now more or less ready for general use. The GDK lock is no longer used to avoid using deprecated functions. On the downside, the new implementation (driving the Gtk event loop stepwise) is even slower than the old one. A few glitches remain (see TODO), but it is hoped that they will be resolved by the Scintilla update which will be performed soon.
	* A lot of program units have been split up, so they are shorter and easier to maintain: core-commands.c, qreg-commands.c, goto-commands.c, file-utils.h.
	* Parser states are simply structs of callbacks now. They still use a kind of polymorphism using a preprocessor trick. TECO_DEFINE_STATE() takes an initializer list that will be merged with the default list of field initializers. To "subclass" states, you can simply define new macros that add initializers to existing macros.
	* Parsers no longer have a "transitions" table but the input_cb() may use switch-case statements. There are also teco_machine_main_transition_t now which can be used to implement simple transitions. Additionally, you can specify functions to execute during transitions. This largely avoids long switch-case-statements.
	* Parsers are embeddable/reusable now, at least in parse-only mode. This does not currently bring any advantages but may later be used to write a Scintilla lexer for TECO syntax highlighting. Once parsers are fully embeddable, it will also be possible to run TECO macros in a kind of coroutine which would allow them to process string arguments in real time.
	* undo.[ch] still uses metaprogramming extensively but via the C preprocessor of course. On the downside, most undo token generators must be initiated explicitly (theoretically we could have used embedded functions / trampolines to instantiate automatically but this has turned out to be dangerous). There is a TECO_DEFINE_UNDO_CALL() to generate closures for arbitrary functions now (ie. to call an arbitrary function at undo-time). This simplified a lot of code and is much shorter than manually pushing undo tokens in many cases.
	* Instead of the ridiculous C++ Curiously Recurring Template Pattern to achieve static polymorphism for user interface implementations, we now simply declare all functions to implement in interface.h and link in the implementations. This is possible since we no longer have to define interface subclasses (all state is static variables in the interface's .c files). Headers are now significantly shorter than in C++ since we can often hide more of our "class" implementations.
	* Memory counting is based on dlmalloc for most platforms now. Unfortunately, there is no malloc implementation that provides an efficient constant-time memory counter that is guaranteed to decrease when freeing memory. But since we use a defined malloc implementation now, malloc_usable_size() can be used safely for tracking memory use. malloc() replacement is very tricky on Windows, so we use a poll thread on Windows. This can also be enabled on other supported platforms using --disable-malloc-replacement. All in all, I'm still not pleased with the state of memory limiting. It is a mess.
	* Error handling uses GError now. This has the advantage that the GError codes can be reused once we support error catching in the SciTECO language.
	* Added a few more test suite cases.
	* Haiku is no longer supported as builds are unstable and I did not manage to debug them - quite possibly Haiku bugs were responsible.
	* Glib v2.44 or later is now required. The GTK UI requires Gtk+ v3.12 or later now. The GtkFlowBox fallback and sciteco-wrapper workaround are no longer required.
	* We now extensively use the GCC/Clang-specific g_auto feature (automatic deallocations when leaving the current code block).
	* Updated copyright to 2021. SciTECO has been in continuous development, even though there have been no commits since 2018.
	* Since these changes are so significant, the target release has been set to v2.0. It is planned that beginning with v3.0, the language will be kept stable.
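The "structs of callbacks" approach with merged initializer lists can be illustrated by the following sketch. It is not the real TECO_DEFINE_STATE(): all type, macro and function names below are hypothetical. It relies on the C rule that a later designated initializer for a field overrides an earlier one, which some compilers report as a warning.

```c
#include <assert.h>
#include <stddef.h>

typedef struct state state_t;
struct state {
	const char *name;
	int (*input_cb)(state_t *state, char chr);
	int (*end_cb)(state_t *state);
};

/* Default callbacks shared by all states. */
int default_input(state_t *state, char chr) { (void)state; (void)chr; return 0; }
int default_end(state_t *state) { (void)state; return 0; }

/* Callbacks used by the "subclassed" state below. */
int digit_input(state_t *state, char chr)
{
	assert(state != NULL);
	return chr >= '0' && chr <= '9' ? 0 : -1;
}
int strict_end(state_t *state) { (void)state; return -1; }

/*
 * Defaults come first. Designated initializers passed by the caller are
 * appended and therefore override the defaults for the same field.
 */
#define DEFINE_STATE(NAME, ...) \
	state_t NAME = { \
		.name = #NAME, \
		.input_cb = default_input, \
		.end_cb = default_end, \
		__VA_ARGS__ \
	}

/* A "subclass" is just a macro adding initializers to DEFINE_STATE(). */
#define DEFINE_STATE_EXPECTDIGIT(NAME, ...) \
	DEFINE_STATE(NAME, .input_cb = digit_input, __VA_ARGS__)

DEFINE_STATE(state_plain, .name = "plain");
DEFINE_STATE_EXPECTDIGIT(state_digit, .end_cb = strict_end);
```

Here `state_plain` keeps both default callbacks, while `state_digit` inherits `digit_input` from the "subclass" macro and additionally overrides `end_cb`.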