 * submit patch for libglib (initialization when linking statically with
   win32 threads - see glib/glib-init.c).
   Also gspawn helpers should probably link with -all-static when
   compiling a static glib. Why would we build a static glib but have
   the programs depend on other libraries?
 * Wiki page about creating and maintaining lexer configurations.
   Also mention how to use the "lexer.test..." macros in the
   "edit" hook.
 * OS X port (macports and/or homebrew)

Known Bugs:
 * Colors are still wrong in Linux console even if TERM=linux-16color
   when using Solarized. Affects e.g. the message line which uses the
   reverse of STYLE_DEFAULT.
   Perhaps we must call init_color() before initializing color pairs
   (currently done by Scinterm).
 * Scinterm: The underline attribute is not applied properly
   even on Urxvt where it obviously works.
 * session.save should save and reset ^R. Perhaps ^R should be
   mapped to a Q-Reg to allow [^R. Currently, saving the buffer
   session fails if ^R != 10.
   On the other hand, given that almost any macro depends on the
   correct radix, __every__ portable macro would have to save the
   old radix. Perhaps it's better to make the radix a property of the
   current macro invocation frame and guarantee ^R == 10 at the
   beginning of macros.
 * Null-byte in strings not always handled transparently
 * Saving another user's file will only preserve the user when run
   as root. Generally, it is hard to ensure that a) save point files
   can be created and b) the file mode and ownership of re-created
   files can be preserved. We should fall back silently to an
   (inefficient) memory copy or temporary file strategy if this is
   detected.
 * crashes on large files: S^EM^X$ (regexp: .*)
   Happens because the Glib regex engine is based on a recursive
   Perl regex library.
   This is apparently impossible to fix as long as we do not
   have control over the regex engine build.
   We should either use C++11 regex support, UNIX regex (unportable)
   or some other library.
   Perhaps allowing us to interpret the SciTECO matching language
   directly.
 * the glib allocators are fundamentally broken: throwing exceptions
   is unsafe from C-linkage callbacks. We should abandon the custom
   allocators and rely on SciTECO's memory limiting.
 * It is still possible to crash SciTECO using recursive functions,
   since they map to the C++ program's call stack.
   It is perhaps best to use another ValueStack as a stack of
   macro strings and implement our own function calling.
 * SciTECO crashes can leave orphaned savepoint files lying around.
   Unfortunately, both the Windows and Linux ways of deleting files
   on close cannot be used here since that would disallow cheap
   savepoint restoration.
   On Windows we could work around this using
   MoveFileEx(file, NULL, MOVEFILE_DELAY_UNTIL_REBOOT)
 * Clipboard registers are prone to race conditions if the
   contents change between get_size() and get_string() calls.
   Also it's a common idiom to query a string and its size,
   so the internal API must be changed.
 * Setting window title is broken on ncurses/XTerm.
   Perhaps do some XTerm magic here. We can also restore
   window titles on exit using XTerm.

Features:
 * Auto-indentation could be implemented via context-sensitive
   immediate editing commands similar to tab-expansion.
   Avoids having to make LF a magic character in insertion
   commands.
 * :$ and :$$ to pop/return only single values
 * allow top-level macros to influence the process return code.
   This can be used in macros to call $$ or ^C akin to exit(1).
 * Special macro assignment command.
   It could use the SciTECO parser for finding the end of the
   macro definition which is more reliable than @^Uq{}.
   Also this opens up new possibilities for optimizations.
   Macros could be special QRegs that are not backed by a
   Scintilla document but a normal string. This would immensely
   speed up macro calls.
 * Numbers could be separate states instead of stack operating
   commands. The current behaviour has few benefits.
   If a number is a regular command that stops parsing at the
   first invalid character in the current radix, we could write
   hexadecimal constants like 16^R0BEEF^D (still clumsy...).
   (On the other hand, the radix is runtime state and parsing
   must not depend on runtime state in SciTECO to ensure
   parseability of the language.)
   * Furthermore, this opens the possibility of floating point
     numbers. The "." command does not take arguments, so it
     could be part of the number syntax. This disallows constructs
     like "23." to push 23 and Dot which have to be replaced by
     "23,.".
   * In the most simple case, all TECO numbers could be
     floats/doubles with division/modulo having integer semantics.
     A separate floating point division operator could be
     introduced (e.g. ^/ with modulo being remapped to ^%).
   * SciTECO could also be "dynamically" typed by using integer
     and floating point types internally.
     The operator decides how to interpret the arguments
     and the return type.
 * Function key masking flag for the beginning of the command
   line. May be useful e.g. for solarized's F5 key (i.e. function
   key macros that need to terminate the command line as they
   cannot be rubbed out properly).
 * Function key macros should behave more like regular macros:
   If inserting a character results in an error, the entire
   macro should be rubbed out. This means it would be OK to
   let commands in function key macros fail and would fix, e.g.
   ^FCLOSE.
 * Function key macros could support special escape sequences
   that allow us to modify the parser state reliably.
   E.g. one construct could expand to the current string
   argument's termination character (which may not be Escape).
   In combination with a special function key macro state
   effective only in the start state of the string building
   state machine, perhaps only in insertion commands, this
   could be used to make the cursor movement keys work in
   insertion commands by automatically terminating the command.
 * Function key handling should always be enabled.
   This was configurable because of the way escape was handled
   in ncurses. Now that escape is always immediate, there is
   little benefit in having this still configurable. In fact
   if turned off, SciTECO would try to execute escape sequences.
   The ED flag could still exist and tell whether the function
   key macros are used at all (i.e. this is how Gtk behaves
   currently).
 * Mouse support. Not that hard to implement. Mouse events
   use a pseudo function key macro as in Curses.
   Using some special command, macros can query the current
   mouse state (this maps to an Interface method).
 * Support loading from stdin (--stdin) and writing to
   the current buffer to stdout on exit (--stdout).
   This will make it easy to write command line filters.
   This also means we need something like --ed to set the
   ED flags before everything else and --quiet.
   Command line arguments should then also be handled
   differently.
 * For third-party macro authors, it is useful to know
   the standard library path (e.g. to install new lexers).
   There could be a --print-path option, or with the --quiet
   and --stdout options, one could write:
   sciteco -qoe 'G[$SCITECOPATH]'
 * The C/C++ lexer supports preprocessor evaluation.
   This is currently always enabled but there are no defines.
   Could be added as a global reg to set up defines easily.
 * Now that we have redo/reinsertion:
   When ^G modifier is active, normal inserts could insert
   between effective and rubbed out command line - without
   resetting it. This would add another alternative to { and }
   for fixing up a command line.
 * some missing useful VideoTECO/TECO-11 commands:
   * EF with buffer id
   * ER command: read file into current buffer at dot
   * nEW to save a buffer by id
 * use CRTP for RBTrees to avoid unnecessary virtual method calls.
   This means that like the original BSD headers, implementations
   of the rbtree ops will be generated for every usage.
   Since currently, only QRegister tables and goto tables are
   RBTrees, the binary size overhead should be minimal.
   There's another possible optimization: RBTrees define an
   entry field for storing node color. This can be avoided on
   most platforms where G_MEM_ALIGN > 1 by encoding the color in
   the lowest bit of one of the pointers.
   The parent pointer is not required for RBTrees in general,
   but we do use the PREV/NEXT ops to iterate prefixes which
   requires the parent pointer to be maintained.
 * Buffer ids should be "circular", i.e. interpreted modulo the
   number of buffers in the ring. This allows "%*" to wrap at the
   end of the buffer list.
 * instead of 0EB to show the list of buffers, there should
   perhaps be a special TAB-completion (^G mode?) that completes
   only buffers in the ring. It should also display the numeric
   buffer ids.
 * properly support Unicode encodings and the character-based model
   * link against libncursesw if possible
   * translate documents to Unicode strings
   * a position refers to a character/codepoint
 * Progress indication in commandline cursor:
   Perhaps blinking or invisible?
 * Command to free Q-Register (remove from table).
   e.g. FQ (free Q). :FQ could free by QRegister prefix name for
   the common use case of Q-Register subtables and lists.
 * autocompletion of long Q-Reg names
 * TECO syntax highlighting
 * multiline commandline
   * perhaps use Scintilla view as mini buffer.
     This means patching Scintilla, so it does not break lines
     on new line characters.
   * A Scintilla view will allow syntax highlighting
 * command line could highlight dead branches (e.g. gray them out)
 * improve GTK interface
   * proper command-line widget (best would be a Scintilla view,
     s.a.)
   * speed improvements
 * backup files, or even better Journal files:
   could write a Macro file for each modified file containing
   only basic commands (no loops etc.). it is removed when the
   file is saved. in case of an abnormal program termination the
   journal file can be replayed. This could be done automatically
   in the profile.
 * Add special Q-Register for dot:
   Would simplify inserting dot with string building and
   saving/restoring dot on the QReg stack
 * :EL command could also be used to convert all EOLs in the
   current buffer.
 * exclusive access to all opened files/buffers (locking):
   SciTECO will never be able to notice when a file has been
   changed externally. Also reversing a file write will overwrite
   any changes another process could have done on the file.
   Therefore open buffers should be locked using the flock(),
   fcntl() or lockf() interfaces. On Windows we can even enforce
   mandatory locks.
 * Touch restored save point files - should perhaps be
   configurable. This is important when working with Makefiles,
   as make looks at the modification times of files.
 * At least on Windows we could implement file restoration via
   filesystem forks (called Alternate Data Streams) and fall back
   to the generic savepoint file solution if this is not possible
   (e.g. on a FAT32 drive).
   On the other hand, just like leftover savepoints, there could
   be leftover forks. And they wouldn't be as easy to find as the
   current files.
 * Instead of implementing split screens, it is better to leave
   tiling to programs dedicated to it (tmux, window manager).
   SciTECO could create pseudo-terminals (see pty(7)), set up
   one curses screen as the master of that PTY and spawn
   a process accessing it as a slave (e.g. urxvt -pty-fd).
   Each Scintilla view could then be associated with at most
   one curses screen.
   GTK+ would simply manage a list of windows.

Optimizations:
 * Add G_UNLIKELY to all error throws.
 * Instead of using RTTI to implement the immediate editing command
   behaviours in Cmdline::process_edit_cmd() depending on the
   current state, this could be modelled via virtual methods in
   State. This would almost eradicate Cmdline::process_edit_cmd()
   and the huge switch-case statement, would be more efficient
   (but who cares in this case?) and would allow us to -fno-rtti
   saving a few bytes.
   However, this would mean making some more Cmdline methods
   public. The implementations of the States' commandline editing
   handlers could all be concentrated in cmdline.cpp.
 * C++14 is supported by GCC 5 and supports new() and delete()
   operators with a size argument. Replacing these operators with
   versions using g_slice_alloc() and g_slice_free() should speed
   up things, especially Q-Register handling and the undo stack.
   This compiler capability should be checked by the build system.
   C++11 already allows sized allocators in a class.
   Testing shows that this does not speed up things on Linux
   and prevents freeing memory on command line termination
   (it would be glibc-specific).
   We should test whether it brings any benefit on Windows and
   implement with a build-time option.
 * String::append() could be optimized by ORing a padding
   into the realloc() size (e.g. 0xFF).
   However, this has not proven effective on Linux/glibc
   probably because it will already allocate in blocks of
   roughly the same size.
   Should be tested on Windows, though.
 * Scintilla: SETDOCPOINTER resets representations, so we
   have to set SciTECO representations up again often.
   This should be patched in Scintilla.
 * commonly used (special) Q-Registers could be cached,
   saving the q-reg table lookup
 * refactor search commands (create proper base class)
 * refactor match-char state machine using MicroStateMachine class
 * The current C-like programming style of SciTECO causes
   problems with C++'s RAII. Exceptions have to be caught
   always in every stack frame that owns a heap object
   (e.g. glib string). This is often hard to predict.
   There are two solutions: Wrap every such C pointer
   in a class that implements RAII, e.g. using C++11
   unique_ptr or by a custom class template.
   The downside is meta-programming madness and lots of
   overloading to make this convenient.
   Alternatively, we could avoid C++ exceptions largely
   and use a custom error reporting system similar to GError.
   This makes error handling and forwarding explicit as in
   plain C code. RTTI can be used to discern different
   exception types.

Documentation:
 * Code docs (Doxygen). It's slowly getting better...
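
Sketch for the RAII item under Optimizations (wrapping C pointers via
C++11 unique_ptr). This is only an illustration of the idea, not
SciTECO code: FreeDeleter, CString, dup_cstring() and checked_length()
are hypothetical names, and malloc()/free() stand in for the glib
allocation calls so that the example has no glib dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>
#include <stdexcept>

// Deleter for memory obtained from malloc().
// Assumption: for glib strings this would call g_free() instead.
struct FreeDeleter {
    void operator()(void *p) const noexcept { std::free(p); }
};

// Owning pointer to a heap-allocated C string (hypothetical alias).
using CString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical stand-in for g_strdup(): copy a C string to the heap
// and hand ownership to the returned CString.
inline CString dup_cstring(const char *src)
{
    assert(src != nullptr);
    std::size_t len = std::strlen(src) + 1;
    char *copy = static_cast<char *>(std::malloc(len));
    if (!copy)
        throw std::bad_alloc();
    std::memcpy(copy, src, len);
    return CString(copy);
}

// Hypothetical function that owns a heap string and may throw.
// No try/catch is needed in this frame: the CString destructor
// frees the copy during stack unwinding.
inline std::size_t checked_length(const char *src, std::size_t limit)
{
    CString copy = dup_cstring(src);
    std::size_t len = std::strlen(copy.get());
    if (len > limit)
        throw std::length_error("string too long");
    return len;
}
```

With such a wrapper, only the frame that actually handles the error
needs a catch block. The cost mentioned above remains: every C API
that takes or returns a raw pointer needs .get()/.release() calls or
overloads to stay convenient.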