Tasks: * Wiki page about creating and maintaining lexer configurations. Also mention how to use the "lexer.test..." macros in the "edit" hook. Known Bugs: * A local Q-Register can be left as the edited document even after invoking a macro via . This is probably not safe and we should throw an error instead. * E%$...$ broken. Similarily EQ$file$ won't do what you would expect it to. * <23(1;)> leaves values on the stack and the internal hidden brace open.. Apparently breaking from within expressions is not currently safe. We could save the brace level in the loop context and then use teco_expressions_brace_return(). * Cannot escape ^E in search strings. This makes it nearly impossible to search for ^E. S^Q^E$ should work. * Ctrl+U rubout and Ctrl+G,Tab file name completions do not work in ^U command. * Editing very large files, or at least files with very long lines, is painstakingly slow. Try for instance openrussian-custom-2023-10-09.sql. For some strange reason, this affects both Curses and GTK. * ?^F does not autocomplete if the control character is typed via control and not via caret. * Using fnkeys.tes still flickers on PDCurses/WinGUI. Apparently a PDCurses bug. * Win32: Interrupting will sometimes hang. Affects both PDCurses/WinGUI and Gtk. In this case you have to kill the subprocess using the task manager. Could this be a race condition when adding the process to the job object? Perhaps the child process should be created suspended before being added to the job object. Glib does not currently allow that. * dlmalloc's malloc_trim() does not seem to free any resident memory after hitting the OOM limit, eg. after <%a>. Apparently an effect of HAVE_MORECORE (sbrk()) - some allocation is always left at the end. * S^ES^N<$ does not find the first line that does not begin with "<" ^ES is apparently not greedy. * Colors are still wrong in Linux console even if TERM=linux-16color when using Solarized. Affects e.g. the message line which uses the reverse of STYLE_DEFAULT. Perhaps we must call init_color() before initializing color pairs (currently done by Scinterm). * Scinterm: The underline attribute is not applied properly even on Urxvt where it obviously works. * session.save should save and reset ^R. Perhaps ^R should be mapped to a Q-Reg to allow [^R. Currently, saving the buffer session fails if ^R != 10. On the other hand, given that almost any macro depends on the correct radix, __every__ portable macro would have to save the old radix. Perhaps it's better to make the radix a property of the current macro invocation frame and guarantee ^R == 10 at the beginning of macros. Since :M should probably inherit the radix, having a ^R register would still be useful. * Saving another user's file will only preserve the user when run as root. Generally, it is hard to ensure that a) save point files can be created and b) the file mode and ownership of re-created files can be preserved. We should fall back silently to an (inefficient) memory copy or temporary file strategy if this is detected. * crashes on large files: S^EM^X$ (regexp: .*) Happens because the Glib regex engine is based on a recursive Perl regex library. The Homebrew and MinGW versions of glib no longer suffer from this. This is apparently impossible to fix as long as we do not have control over the regex engine build. We should therefore switch the underlying Regex engine. Oniguruma looks promising and is also packed for Ubuntu (libonig2). It would also directly allow globbing by tweaking the syntax. TRE also looks promising and is smaller than Oniguruma. GRegEx (PCRE) could still be supported as a fallback. * It is still possible to crash SciTECO using recursive functions, since they map to the C program's call stack. It is perhaps best to use another stack of macro strings and implement our own function calling. * SciTECO crashes can leave orphaned savepoint files lying around. Unfortunately, both the Windows and Linux ways of deleting files on close cannot be used here since that would disallow cheap savepoint restoration. * On UNIX, it's possible to open() and unlink() a file, avoiding any savepoint links. On abnormal program termination, the inodes are reclaimed automatically. Unfortunately, neither Linux, nor FreeBSD allow the re-linking based on file descriptors. On Linux this fails because the file's link count will be 0; On BSD linkat(AT_EMPTY_PATH) requires root privileges and does not work with unlinked files anyway. * Windows NT has hard links as well, but they don't work with file handles either. However, it could be possible to call CreateFile(FILE_FLAG_DELETE_ON_CLOSE) on the savepoint file, ensuring cleanup even on abnormal program termination. There is also MoveFileEx(file, NULL, MOVEFILE_DELAY_UNTIL_REBOOT). * Windows has file system forks, but they can be orphaned just like ordinary files but are harder to locate and clean up manually. * Setting window title is broken on ncurses/XTerm. The necessary capabilities are usually not in the Terminfo database. Perhaps do some XTerm magic here. We can also restore window titles on exit using XTerm. * Glib (error) messages are not integrated with SciTECO's logging system. * Gtk on Unix: On ^Z, we do not suspend properly. The window is still shown. This would be a useful feature especially with --xembed on st. Perhaps we should try to catch SIGTSTP? This does not work with g_unix_signal_add(), though, so any workaround would be tricky. We could create a pipe via g_unix_open_pipe() which we write to using write() in a normal signal handler. We can then add a watcher using g_unix_fd_add() which will hide the main window. Unfortunately, it is also not trivial to get notified when the window has really been hidden/unrealized. Even if everything worked, it might well be annoying if you accidentally suspend your instance while not being connected to a terminal. Although this could be checked at runtime. Suspension from the command-line has therefore been disabled on Gtk for the time being. * Many Scintilla commands can easily crash the editor. A lot of the dangerous cases could be catched by parsing Scintilla.iface instead of the C header. This would also allow automatically mapping a part of the messages (property getters and setters) into the global Q-Reg space using special registers. This feature would especially be important in order to support e.g. the SCI_SETPROPERTY message with two strings in order to set lexer properties. * Mac OS: The colors are screwed up with the terminal.tes color scheme (and with --no-profile) under Mac OS terminal emulators. This does not happen under Linux with Darling. See https://github.com/rhaberkorn/sciteco/issues/12 * GTK: Scrolling via mouse is not reliably prevented in all situations. Features: * Auto-indention could be implemented via context-sensitive immediate editing commands similar to tab-expansion. Avoids having to make LF a magic character in insertion commands. * :$ and :$$ to pop/return only single values * allow top-level macros to influence the proces return code. This can be used in macros to call $$ or ^C akin to exit(1). * Special macro assignment command. It could use the SciTECO parser for finding the end of the macro definition which is more reliable than @^Uq{}. Also this opens up new possibilities for optimizations. Macros could be special QRegs that are not backed by a Scintilla document but a normal string. This would immensely speed up macro calls. Perhaps more generically, we should add a number of alternative balanced string terminators (<> [] () {}) and assign one to parse SciTECO code. "' is not really an option here, since we want to be able to write @I"..." etc. Since ] and } can occur as stand-alone commands, we would have to use <> and/or () for SciTECO code parsing. The advantage would be that we save introducing a special assignment command and can use the same escape with @I<> for editing macro files while still getting immediate interactive syntax feedback. Plus: Once we have a parser-based terminator, there would be no more real need for command variants with disabled string building (as string building will naturally always be disabled in parser-terminator-mode). Instead, a special string building character for disabling string building character processing can be introduced, and all the command variants like EI and EU can be repurposed. Q-Reg specs should support alternative balanced escapes as well for symmetry. * Numbers could be separate states instead of stack operating commands. The current behaviour has few benefits. If a number is a regular command that stops parsing at the first invalid character in the current radix, we could write hexadcimal constants like 16^R0BEEF^D (still clumsy...). (On the other hand, the radix is runtime state and parsing must not depend on runtime state in SciTECO to ensure parseability of the language.) * Furthermore, this opens the possibility of floating point numbers. The "." command does not take arguments, so it could be part of the number syntax. This disallows constructs like "23." to push 23 and Dot which have to be replaced by "23,.". * In the most simple case, all TECO numbers could be floats/doubles with division/modulo having integer semantics. A separate floating point division operator could be introduced (e.g. ^/ with modulo being remapped to ^%). * SciTECO could also be "dynamically" typed by using integer and floating point types internally. The operator decides how to interpret the arguments and the return type. * Having a separate number parser state will simplify number syntax highlighting. * Function key masking flag for the beginning of the command line. May be useful e.g. for solarized's F5 key (i.e. function key macros that need to terminate the command line as they cannot be rubbed out properly). * Function key macros should behave more like regular macros: If inserting a character results in an error, the entire macro should be rubbed out. This means it would be OK to let commands in function key macros fail and would fix, e.g. ^FCLOSE. * Function key macros could support special escape sequences that allow us to modify the parser state reliably. E.g. one construct could expand to the current string argument's termination character (which may not be Escape). In combination with a special function key macro state effective only in the start state of the string building state machine, perhaps only in insertion commands, this could be used to make the cursor movement keys work in insertion commands by automatically terminating the command. Even more simple, the function key flag could be effective only when the termination character is $. * Function key handling should always be enabled. This was configurable because of the way escape was handled in ncurses. Now that escape is always immediate, there is little benefit in having this still configurable. In fact if turned off, SciTECO would try to execute escape sequences. The ED flag could still exist and tell whether the function key macros are used at all (i.e. this is how Gtk behaves currently). * Support more function keys. Perhaps we can safely support more via define_key(3NCURSES). At the very least PDCurses and Gtk could support much more keys and Alt and Ctrl modifiers. See also https://gist.github.com/rkumar/1237091 * Mouse support. Not that hard to implement. Mouse events use a pseudo function key macro as in Curses. Using some special command, macros can query the current mouse state (this maps to an Interface method). * Support loading from stdin (--stdin) and writing to the current buffer to stdout on exit (--stdout). This will make it easy to write command line filters, We will need flags like --8-bit-clean and --quiet with single-letter forms to make it possible to write hash-bang lines like #!...sciteco -q8iom Command line arguments should then also be handled differently, passing them in an array or single string register, so they no longer affect the unnamed buffer. * Once we've got --stdout, it makes sense to ship a version of tecat written in SciTECO. This is useful as a git diff textconv filter. See https://gist.github.com/rhaberkorn/6534ecf1b05de6216d0a9c33f31ab5f8 * For third-party macro authors, it is useful to know the standard library path (e.g. to install new lexers). There could be a --print-path option, or with the --quiet and --stdout options, one could write: sciteco -qoe 'G[$SCITECOPATH]' * The C/C++ lexer supports preprocessor evaluation. This is currently always enabled but there are no defines. Could be added as a global reg to set up defines easily. NOTE: This requires the SCI_SETPROPERTY message which is currently unsupported by . * Now that we have redo/reinsertion: When ^G modifier is active, normal inserts could insert between effective and rubbed out command line - without resetting it. This would add another alternative to { and } for fixing up a command line. * Instead of discarding a rubbed out command line once the user presses a non-matching key, a redo-tree could be built instead. When you rub out to a character where the tree branches, the next character typed always determines whether and which existing redo branch will be activated (ie become the new rubbed out command line). * some missing useful VideoTECO/TECO-11 commands and unnecessary incompatibilities: * EF with buffer id * ER command: read file into current buffer at dot * nEW to save a buffer by id * EI is used to open an indirect macro file in classic TECO. EM should thus be renamed EI. EM (position magtape in classic TECO) would be free again, e.g. for execute macro with string argument or as a special version of EI that considers $SCITECOPATH. Current use of EI (insert without string building) will have to move, but it might vanish anyway once we can disable string building with a special character. * ::S for string "comparisons" (anchored search). This is supposed to be an alias for .,.:FB which would be .,.:S in SciTECO. Apparanetly, the bounded search is still incompatible in SciTECO, as it is allowed to match beyond the bounds. Either the semantics of m,n:S should be changed or an FB command with classic TECO semantics should be introduced.. * ^S (-(length) of last referenced string), ^Y as .+^S,. * ^Q convert line arg into character arg * ^A, T and stdio in general * Search for beginning of string; i.e. a version of S that leaves dot before the search string, similar to FK (request of N.M.). Could be called <_> (a global-search variant in classic TECO). * Shortcut for cutting into Q-Register. Typing 10Xq10K is very annoying to type. We could use the @ modifier 10@Xq or define a new command, like ^X (search-mode flag in classic TECO). On the other hand, a search mode setting would be useful in SciTECO as well! FX would be available as well, but is perhaps best reserved for some mmenonics. An elegant alternative might be to introduce single-character stack operating commands for duplicating the last AND the last two arguments. However, this will not help for cutting a number of lines. * For symmetry, there should be a command for -W, eg. P. Macros and modifiers are obviously not a solution here since they're too long. * Visual selections via `...'. Allows them to be used recursively (eg. as a tool in macros). Returns the buffer range. * Perhaps there should be a built-in command for joining lines as has been requested by users. ^J (caret+J) would still be free. * Buffer ids should be "circular", i.e. interpreted modulo the number of buffers in the ring. This allows "%*" to wrap at the end of the buffer list. * instead of 0EB to show the list of buffers, there should perhaps be a special TAB-completion (^G mode?) that completes only buffers in the ring. It should also display the numeric buffer ids. * properly support Unicode encodings and the character-based model * translate documents to Unicode strings * a position refers to a character/codepoint * Progress indication in commandline cursor: Perhaps blinking or invisible? * Command to free Q-Register (remove from table). e.g. FQ (free Q). :FQ could free by QRegister prefix name for the common use case of Q-Register subtables and lists. * TECO syntax highlighting. This should now be relatively easy to implement by reusing the parser. * multiline commandline * Perhaps use Scintilla view as mini buffer. This means patching Scintilla, so it does not break lines on new line characters and we can use character representations (extend SCI_SETLINEENDTYPESALLOWED?). cmdline.c can then directly operate on the Scintilla document. * A Scintilla view will allow syntax highlighting * command line could highlight dead branches (e.g. gray them out) * Add special Q-Register for dot: Would simplify inserting dot with string building and saving/restoring dot on the QReg stack. Since [. is currently not valid and [[.] is cumbersome, there should be a special syntactic workaround to allow [.. or perhaps we'll simply call it :, so you can write [: * EL command could also be used to convert all EOLs in the current buffer. * nEL should perhaps dirtify the buffer, at least when automatic EOL translation is active, or always (see above). The dirty flag exists to remind users to save their buffers and nEL changes the result of a Save. * exclusive access to all opened files/buffers (locking): SciTECO will never be able to notice when a file has been changed externally. Also reversing a file write will overwrite any changes another process could have done on the file. Therefore open buffers should be locked using the flock(), fcntl() or lockf() interfaces. On Windows we can even enforce mandatory locks. A generic fallback could use lock files -- this would guard against concurrent SciTECO instances at least. * Multi-window support is probably never going to realize. Perhaps we could add a Gtk-frontend option like -X for opening all filenames with the same options in separate processes. For the Curses version, you will need a shell wrapper, or we could add an environment variable like $SCITECO_LAUNCH that can be set to a command for launching a new terminal. For multi-window SciTECO to work properly, file locking is probably a must as it is otherwise too easy to confuse SciTECO if multiple instances open the same file. * Touch restored save point files - should perhaps be configurable. This is important when working with Makefiles, as make looks at the modification times of files. * There should really be a backup mechanism. It would be relatively easy to implement portably, by using timeout() on Curses. The Gtk version can simply use a glib timer. Backup files should NOT be hidden and the timeout should be configurable (EJ?). * Error handling in SciTECO macros: Allow throwing errors with e.g. [n]EE$ where n is an error code, defaulting to 0 and description is the error string - there could be code-specific defaults. All internal error classes use defined codes. Errors could be catched using a structured try-catch-like construct or by defining an error handling label. Macros may retrieve the code and string of the last error. * Once we have our own function call stack, it will be possible, although not trivial, to add support for user-definable macros that accept string arguments, eg. EMq$ This will have to switch back and forth between the macro and the invoking frame supplying the macro (similar to a coroutine). In the most simple case, a special command returns the next character produced by the callers string building machine including rubout and the user will have to implement rubout-support manually. For a lot of common cases, we could also allow a string building construct that symbolizes the string parameter supplied by the caller. This could activate interactive processing in the macro's current command, allowing you to easily "wrap" interactive commands in macros. The same construct would also be useful with non-interactive commands as a way to store the supplied parameter using EU for instance. * Adding a secret command line option to process immediate editing commands in batch mode with undo would allow us to add some test cases that otherwise only occur in interactive mode. * Emscripten nodejs port. This may be a viable way to run SciTECO "cross"-platform, at least for evaluation... on UNIX-like systems in absence of prebuilt binaries. I already got netbsd-curses to build against Emscripten/nodejs using some UNIX shell wrapper calls, so practically all of SciTECO can be run against nodejs as a runtime. I'm not aware of any (working) alternatives, like cross-compiling for the JVM. See also https://gist.github.com/VitoVan/92ba4f2b68fec31cda803119686295e5 * Windows supports virtual terminals now. See: https://docs.microsoft.com/en-us/windows/console/classic-vs-vt Perhaps we should transition from PDCurses to something using terminal escape sequences like ncurses for Windows. * Improve the message line so it can log multiple messages. Especially important on GUI platforms and Win32 so we can get rid of the attached console window. * Currently, you cannot pass UTF-8 parameters to SciTECO macros. This is not critical since we don't support Unicode anyway. Sooner or later however we should use g_win32_get_command_line(). * Some platforms like MinGW and some Linux flavours have native Scintilla packages. Perhaps it makes sense to be able to build against them using --with-scintilla. * There is an Urxvt extension 52-osc for implementing the xterm-like clipboard control sequences. Other emulators also support it. It is not always detectable at run time. It may therefore make sense to always enable it after manually setting the corrsponding flag. Apparently there is also a terminfo entry Ms, but it's probably not worth using it since it won't always be set. See also https://github.com/tmux/tmux/wiki/Clipboard * Add a save-ED-hook, which could be useful for spell checking. * A dirtify-hook would be useful and could be used for spell checking. Naturally it could only be exected at the end of executing interactive commands) and it should be triggered only at the end of command lines. * CSS lexer config * Add a dedicated JSON lexer. JSON files are currently handled by the Javascript lexer (js.tes). * <'> and <|> should result in an error outside of If-statements. This is not strictly necessary and complicates the parser. <'> is currently a no-op outside of dead If-branches. | ... ' is basically equivalent to 0< ... >. Furthermore, it may not be trivial to detect such dangling Else/End-If statements as the parser state does not tell us enough. There is nothing that makes code after a successful If-condition special. Even if we always incresed the nest_level, that variable does not discern Ifs and Whiles. * Possible Nightly Build improvements (and therefore releases): * Build 32-bit Ubuntu packages * Push nightly builds into the Ubuntu PPA. We should probably create a new PPA sciteco-nightly. A new private key has already been registered on Launchpad and Github. We just need to integrate with CI. See also https://github.com/marketplace/actions/import-gpg * 64-bit Windows builds * Mac OS Arm64 builds either separately or via universal binary. See https://codetinkering.com/switch-homebrew-arm-x86/ Target flag: `-target arm64-apple-macos11` * Get into AppImageHub. * Linux: Relocatable binaries instead of hardcoding the library path. This makes it possible to run builds installed via `make install DESTDIR=...` and will aid in creating AppImages. * sample.teco_ini: Support opening files on certain lines (filename:line). Theoretically, this could also be added to the syntax, although the colon character is allowed in filenames under Windows. In principe there is little need for that in interactive mode, but it would ease opening filenames copied from compiler errors or grep -n results. * <:EF> for saving and closing a buffer, similar to <:EX>. * Bash completions. * Case-sensitive search command (modifier or flag). * FreeBSD: rctl(8) theoretically allows setting up per-process actions when exceeding the memory limit. This however requires special system settings. * Auto-completions customization via external programs. This among other things could be used to integrate LSPs-driven autocompletions. * Whereever we take buffer positions (nJ; n,mD; nQ...), negative numbers could refer to the end of the buffer or Q-Register string. * Support extended operators like in TECO-64: https://github.com/fpjohnston/TECO-64/blob/master/doc/oper.md However, instead of introducing a separate parser state, better use operators like ~=, ~< etc. * It should be possible to disable auto-completions of one-character register names, so that we can map the idention macro to M. * fnkeys.tes: Rubin/Rubout via cursor keys? Optimizations: * Use SC_DOCUMENTOPTION_STYLES_NONE in batch mode. However, we do not actually know that we do not continue in interactive mode. Also, some scripts like grosciteco.tes make use of styling in batch mode. So this would have to be requested explicitly per ED flag or command line option. * teco_interface_cmdline_update() should be called only once after inserting an entire command line macro. * There should be a common error code for null-byte tests instead of TECO_ERROR_FAILED. * teco_string_append() could be optimized by ORing a padding into the realloc() size (e.g. 0xFF). However, this has not proven effective on Linux/glibc probably because it will already allocate in blocks of roughly the same size. Should be tested on Windows, though. * commonly used (special) Q-Registers could be cached, saving the q-reg table lookup * refactor search commands (create proper base class) * Add a configure-switch for LTO (--enable-lto). * undo__teco_interface_ssm() could always include the check for teco_current_doc_must_undo(). * Newer GCC and C23 features: * Perhaps teco_bool_t usage could be simplified using __attribute__((hardbool)). * Use `#elifdef` instead of `#elif defined`. * Use `[[gnu::foo]]` instead of `__attribute__((foo))`. * The TECO_FOR_EACH() hack could be simplified at least marginally using __VA_OPT__(). Documentation: * Code docs (Doxygen). It's slowly getting better... * The ? command could be extended to support looking up help terms at dot in the current document (e.g. if called ?$). Furthermore, womanpages could contain "hypertext" links to help topics using special Troff markup and grosciteco support. * The command reference should include an overview. * Write some tutorials for the Wiki, e.g. about paragraph reflowing... Object-oriented SciTECO ideoms etc. ;-) * What to do with `--xembed`: tabbed, st when used as the git editor, etc.