| Age | Commit message | Author | Files | Lines | 
|---|
|  | * It makes little sense to e.g. rub out until `I` in `@I/foo/`, but
  leave the `@` modifier.
  Modifiers have to be considered part of the command,
  even though the state machine is not currently modelled like that. | 
|  | out no-op commands (whitespace)
* In string arguments, ^W first rubs out non-word chars (usually whitespace),
  so it makes sense if ^W would work analogously at the command level.
  A non-command would be one of the no-ops. | 
|  | the beginning of words now
* The commands and their documentation were inconsistent.
  * ^W rubbed out to the beginning of words.
  * Shift+Right (fnkeys.tes) moved to the beginning of the next word if
    invoked at the beginning of a word and to the end of the next word otherwise.
  * <W> (and <V> and <Y> by extension) moved to the end of the next word.
  * The cheat sheet would claim that <W> moves to the beginning of the next word.
* Video TECO's <W> command would differ again from everything else.
  With positive arguments, it moved to the beginning of words, while
  with negative it moved to end of words.
  I decided not to copy this behavior.
* It has been decided to adopt a consistent beginning-of-words policy.
  -W therefore differs from Video TECO in moving to the beginning of the
  current or previous word.
* teco_find_words() is now based on parsing the document pointer, instead
  of relying on SCI_WORDENDPOSITION, since the latter cannot actually be
  used to skip strictly non-word characters.
  This requires a constant number of Scintilla messages, but it needs fewer
  messages than before only when moving by more than 3 words.
* The semantics of <W> are therefore now consistent with Vim and Emacs as well.
* Shift+Right/Left is still based on SCI_WORDENDPOSITION, so its behavior
  differs slightly from <W> for instance at the end of lines, as it will
  stop at linebreaks.
* Unfortunately, these changes will break lots of macros, among others
  the M#rf, M#sp and git.blame macros ("Useful macros" from the wiki). | 
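
The beginning-of-words policy described above for teco_find_words() can be sketched over a plain byte buffer. This is only an illustration under stated assumptions: `next_word_start()` and `is_wordchar()` are hypothetical names, and the sketch scans an in-memory string rather than the Scintilla document pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is byte c one of the n configured word characters? */
static int is_wordchar(const char *wordchars, size_t n, char c)
{
    /* Guard n == 0 so memchr() is never called with a possibly NULL pointer. */
    return n > 0 && memchr(wordchars, (unsigned char)c, n) != NULL;
}

/*
 * Hypothetical sketch: return the position of the beginning of the next word
 * after pos, or len if there is none.
 */
size_t next_word_start(const char *buf, size_t len, size_t pos,
                       const char *wordchars, size_t n)
{
    assert(pos <= len);
    /* 1. Leave the word we are currently in (if any). */
    while (pos < len && is_wordchar(wordchars, n, buf[pos]))
        pos++;
    /* 2. Skip strictly non-word characters up to the next word. */
    while (pos < len && !is_wordchar(wordchars, n, buf[pos]))
        pos++;
    return pos;
}
```

Because both loops only look at the buffer, the cost in editor messages does not grow with the number of words skipped, which matches the trade-off described in the commit message.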
|  | * Curses allows scrolling with the scroll wheel at least
  if mouse support is enabled via ED flags.
  Gtk always supported that.
* Allow clicking on popup entries to fully autocomplete them.
  Since this behavior - just like auto completions - is parser state-dependent,
  I introduced a new state method (insert_completion_cb).
  All the implementations are currently in cmdline.c since there is some overlap
  with the process_edit_cmd_cb implementations.
* Fixed pressing undefined function keys while showing the popup.
  The popup area is no longer redrawn/replaced with the Scintilla view.
  Instead, continue to show the popup. | 
|  |  | 
|  | * this would also leak a few bytes on each of fnkeys.tes' movement commands | 
|  | command line argument
* For instance, you can now rub out ^Q^W at the beginning of a string argument.
  Otherwise, pressing Ctrl+W after ^Q^W would rub out only the ^W.
  The next Ctrl+W would then insert ^W, due to special immediate editing inhibition after ^Q.
* This still only works if the string building construct expanded to at least one byte.
  Suppose you have ^EQq expanding to nothing: pressing Ctrl+W would chain to the default
  teco_state_process_edit_cmd() and the entire command would be rubbed out.
  This is probably tolerable. | 
|  | * This allows you to type ^Q^U (which would otherwise rub out the entire argument)
  and ^Q^W (which would otherwise rub out the ^Q).
* ^Q^U coincidentally worked previously since the teco_state_stringbuilding_escaped
  state would default to teco_state_process_edit_cmd().
  But it's better to make this feature explicit.
* This finally makes it possible to insert the ^W (23) char into a buffer.
  In interactive mode, you can still only type Caret+W as a string building construct.
* ^G could also be inhibited after ^Q, but the control char is not used anywhere yet,
  so there is no point in doing that. | 
|  | * This fixes F< to the beginning of the macro, which was broken in 73d574b71a10d4661ada20275cafde75aff6c1ba.
  teco_machine_main_t::macro_pc actually has to be signed as it is sometimes set to -1. | 
|  | errors
* teco_cmdline.pc is not correct after an error occurred.
  Therefore start_pc is initialized with teco_cmdline.effective_len. | 
|  | * ALL keypresses (the UTF-8 sequences resulting from key presses) can now be remapped.
* This is especially useful with Unicode support, as you might want to alias
  international characters to their corresponding latin form in the start state,
  so you don't have to change keyboard layouts so often.
  This is done automatically in Gtk, where we have hardware key press information,
  but has to be done with key macros in Curses.
  There is a new key mask 4 (bit 3) for that purpose now.
* Also, you might want to define non-ANSI letters to perform special functions in
  the start state where it won't be accepted by the parser anyway.
  Suppose you have a macro M→, you could define
  @^U[^K→]{m→} 1^_U[^K→]
  This effectively "extends" the parser and allows you to call macro "→" by a single
  key press. See also #5.
* The register prefix has been changed from ^F (for function) to ^K (for key).
  This is the only thing you have to change in order to migrate existing
  function key macros.
* Key macros are enabled by default. There is no longer any way to disable
  function key handling in curses, as I never found any reason or need to disable it.
  Theoretically, the default ESCDELAY could turn out to be too small, so that function
  keys don't get through. I doubt that's possible except on extremely slow serial lines.
  Even then, you would increase ESCDELAY and, instead of disabling function keys,
  simply define an escape surrogate.
* The ED flag has been removed and its place is reserved for a future mouse support flag
  (which does make sense to disable in curses sometimes).
  fnkeys.tes is consequently also enabled by default in sample.teco_ini.
* Key macros are handled as a unit. If one character results in an error,
  the entire string is rubbed out.
  This fixes the "CLOSE" key on Gtk.
  It also makes sure that the original error message is preserved and not overwritten
  by some subsequent syntax error.
  It was never useful that we kept inserting characters after the first error. | 
|  | * pressing ^W in FG now deletes the entire directory component as in EB
* commands without glob patterns (eg. EW) can now autocomplete file names containing
  glob patterns
* When the autocompletion contains a glob character in commands accepting
  glob patterns like EB or EN, we now escape the glob pattern.
  This already helps if the remaining file name can be autocompleted in one go.
  Unfortunately, this is still insufficient if we can only partially complete
  and the partial completion contains glob characters.
  For instance, if there are 2 files: `file?.txt` and `file?.foo`,
  completing after `f` will insert `ile[?].`.
  Pressing Tab a second time will then do nothing.
  To fully support these cases, we need a version of teco_file_auto_complete()
  accepting glob patterns.
  Perhaps we can simply append `*` to the given glob pattern. | 
|  | The following rules apply:
 * All SciTECO macros __must__ be in valid UTF-8, regardless of
   the register's configured encoding.
   This is checked before execution, so we can use glib's non-validating
   UTF-8 API afterwards.
 * Things will inevitably get slower as we have to validate all macros first
   and convert to gunichar for each and every character passed into the parser.
   As an optimization, it may make sense to have our own inlineable version of
   g_utf8_get_char() (TODO).
   Also, Unicode glyphs in syntactically significant positions may be case-folded -
   just like ASCII chars were. This is of course slower than case folding
   ASCII. The impact of this should be measured and perhaps we should restrict
   case folding to a-z via teco_ascii_toupper().
 * The language itself does not use any non-ANSI characters, so you don't have to
   use UTF-8 characters.
 * Wherever the parser expects a single character, it will now accept an arbitrary
   Unicode/UTF-8 glyph as well.
   In other words, you can call macros like M§ instead of having to write M[§].
   You can also get the codepoint of any Unicode character with ^^x.
   Pressing a Unicode character in the start state or in Ex and Fx will now
   give a sane error message.
 * When pressing a key which produces a multi-byte UTF-8 sequence, the character
   gets translated back and forth multiple times:
   1. It's converted to a UTF-8 string, either buffered or by IME methods (Gtk).
      On Curses we could directly get a wide char using wget_wch(), but it's
      not currently used, so we don't depend on widechar curses.
   2. Parsed into gunichar for passing into the edit command callbacks.
      This also validates the codepoint - everything later on can assume valid
      codepoints and valid UTF-8 strings.
   3. Once the edit command handling decides to insert the key into the command line,
      it is serialized back into a UTF-8 string as the command line macro has
      to be in UTF-8 (like all other macros).
   4. The parser reads back gunichars without validation for passing into
      the parser callbacks.
 * Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely
   inserted and displayed UTF-8 sequences, are now fixed. | 
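
The "own inlineable version of g_utf8_get_char()" mentioned as a TODO above, together with the serialization back to UTF-8 in step 3, could look roughly like the following. It is a sketch, not code from SciTECO: `utf8_get_char()`, `utf8_put_char()` and `utf8_roundtrip()` are hypothetical names, and the decoder deliberately performs no validation because it assumes the input was validated earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one codepoint from UTF-8 that is assumed to be valid already. */
static inline uint32_t utf8_get_char(const unsigned char *p)
{
    assert(p != NULL);
    if (p[0] < 0x80)                    /* 1 byte: plain ASCII */
        return p[0];
    if ((p[0] & 0xE0) == 0xC0)          /* 2 bytes: 110xxxxx 10xxxxxx */
        return ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
    if ((p[0] & 0xF0) == 0xE0)          /* 3 bytes */
        return ((uint32_t)(p[0] & 0x0F) << 12) |
               ((uint32_t)(p[1] & 0x3F) << 6) | (uint32_t)(p[2] & 0x3F);
    /* 4 bytes */
    return ((uint32_t)(p[0] & 0x07) << 18) | ((uint32_t)(p[1] & 0x3F) << 12) |
           ((uint32_t)(p[2] & 0x3F) << 6) | (uint32_t)(p[3] & 0x3F);
}

/* Serialize a valid codepoint into out (at least 4 bytes); returns the length. */
static inline size_t utf8_put_char(uint32_t c, unsigned char *out)
{
    assert(out != NULL && c <= 0x10FFFF);
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (c >> 18));
    out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}

/* Steps 3 and 4 from the list above: serialize, then read back unvalidated. */
static inline uint32_t utf8_roundtrip(uint32_t c)
{
    unsigned char buf[4] = {0};
    utf8_put_char(c, buf);
    return utf8_get_char(buf);
}
```

Skipping validation in the decoder is only safe because of the up-front check; feeding it arbitrary bytes would read past short sequences.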
|  | * When enabled with bit 2 in the ED flags (0,4ED),
  all registers and buffers will get the raw ANSI encoding (as if 0EE had been
  called on them).
  You can still manually change the encoding, eg. by calling 65001EE afterwards.
* Also the ANSI mode sets up character representations for all bytes >= 0x80.
  This is currently done only depending on the ED flag, not when setting 0EE.
* Since setting 16,4ED for 8-bit clean editing in a macro can be tricky -
  the default unnamed buffer will still be at UTF-8 and at least a bunch
  of environment registers as well - we added the command line option
  `--8bit` (short `-8`) which configures the ED flags very early on.
  As another advantage you can mung the profile in 8-bit mode as well
  when using SciTECO as a sort of interactive hex editor.
* Disable UTF-8 checks in 8-bit clean mode (sample.teco_ini). | 
|  | * ^Uq however always sets a UTF-8 register as the source
  is supposed to be a SciTECO macro which is always UTF-8.
* :^Uq preserves the register's encoding
* teco_doc_set_string() now also sets the encoding
* instead of trying to restore the encoding in teco_doc_undo_set_string(),
  we now swap out the document in a teco_doc_t and pass it to an undo token.
* The get_codepage() Q-Reg method has been removed as the same
  can now be done with teco_doc_get_string() and the get_string() method. | 
|  | (refs #5)
certain test cases are still way too slow:
  10000<@I/X^J/> 20000<R>
or
  10000<@I/X^J/> 20000<%a-1J>
SCI_ALLOCATELINECHARACTERINDEX does not help much here.
It probably speeds up only SCI_LINEFROMINDEXPOSITION and SCI_INDEXPOSITIONFROMLINE. | 
|  |  | 
|  | for debug builds
* There is cleanup that is not strictly necessary, because it only frees memory
  which is freed on program termination anyway.
* However, it helps to explicitly free everything for debugging memory leaks via Valgrind.
* The new macro reduces the number of #ifdef statements.
* On NDEBUG, the code of these functions will still be eliminated.
* If functions are referenced only from the destructor, there will be no unused function
  warnings, even in NDEBUG. | 
|  | * This would sometimes rub out more than expected due to
  reading undefined memory.
  Actually even crashes were not impossible.
* This is because SCI_GETWORDCHARS does not null-terminate the buffer
  it writes but this was assumed.
  In effect, we could easily read beyond the allocated memory in wchars
  if there doesn't happen to be a null-char following the buffer.
* Consequently, null-chars in word chars were also not supported,
  although this would hardly trouble anybody.
* Instead, we now store the word chars in a teco_string_t which
  supports non-null-terminated strings natively.
  Still we null-terminate the string to keep teco_string_t's promises
  about degrading to null-terminated char *.
  This is currently not necessary.
* teco_is_wordchar() has been replaced by teco_string_contains(). | 
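
The difference between assuming a terminator and carrying an explicit length can be sketched as follows. `counted_str_t` and `counted_str_contains()` are hypothetical stand-ins for teco_string_t and teco_string_contains(); the real definitions may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted string type. */
typedef struct {
    const char *data;
    size_t len;
} counted_str_t;

/*
 * Membership test that never reads beyond s.len bytes, so the buffer
 * needs no terminator and may itself contain null-chars.
 */
static int counted_str_contains(counted_str_t s, char c)
{
    assert(s.data != NULL || s.len == 0);
    /* Guard len == 0 so memchr() is never called with a NULL pointer. */
    return s.len > 0 && memchr(s.data, (unsigned char)c, s.len) != NULL;
}
```

A `strchr()`-style lookup on the same buffer would run off the end whenever no null-char happens to follow it, which is the failure mode described above.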
|  | * We no longer need special NULL-values for teco_cmdline_insert(),
  as teco_cmdline_rubin() will simply take a character from the rubbed-out
  command line and is equivalent to typing a character from the rubbed-out
  command-line. | 
|  | * The rubbed out command line should not be discarded.
* This has been broken since 432ad24e382681f1c13b07e8486e91063dd96e2e
  (C conversion). | 
|  |  | 
|  |  | 
|  | * The C standard actually forbids this (undefined behaviour) even though
  it seems intuitive that something like `memcpy(foo, NULL, 0)` does no harm.
* It turned out, there were actual real bugs related to this.
  If memchr() was called with a variable that can be NULL,
  the compiler could assume that the variable is actually always non-NULL
  (since glibc declares memchr() with nonnull), consequently eliminating
  checks for NULL afterwards.
  The same could theoretically happen with memcpy().
  This manifested itself in the empty search crashing when building with -O3.
  Test case:
  sciteco -e '@S//'
* Consequently, the nightly builds (at least for Ubuntu) also had this bug.
* In some cases, the passed in pointers are passed down from the caller but
  should not be NULL, so I added runtime assertions to guard against it. | 
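
One way to guard against this class of bug is to wrap the calls so that the zero-length case never reaches the library function. The wrappers below are a sketch with hypothetical names (`copy_bytes()`, `find_byte()`), not the functions actually used in the code base.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n bytes; a zero-length copy is a no-op even if a pointer is NULL. */
static inline void *copy_bytes(void *dest, const void *src, size_t n)
{
    if (n == 0)
        return dest;
    /* With n > 0 both pointers must be valid: check at runtime. */
    assert(dest != NULL && src != NULL);
    return memcpy(dest, src, n);
}

/* Search n bytes; an empty range yields NULL without calling memchr(). */
static inline const void *find_byte(const void *s, int c, size_t n)
{
    if (n == 0)
        return NULL;
    assert(s != NULL);
    return memchr(s, c, n);
}
```

Since the library function is only reached with non-NULL arguments, the compiler has no basis for dropping later NULL checks in the caller.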
|  | * This has always been broken as Gtk will not hide the
  window before suspending.
* It has been deemed too complicated to implement at the moment.
  Even if we can catch SIGTSTP (not that trivial), it seems to be
  impossible - at least without some lower level Xlib interaction -
  to hide the program window before raising SIGTSTP.
* Even if everything worked, it is unclear whether it is actually
  desirable to suspend a GUI application - ^Z may be pressed accidentally
  and it will be inconvenient to resume the job.
  So we would additionally have to check for the existence of
  an attached console. | 
|  | * Previous Scintilla version was 3.6.4 and Scinterm was 1.7 (with lots of custom patches).
  All of the patches are now either irrelevant or have been merged upstream.
* Since Scintilla 5 requires C++17, this increases the minimum GCC version at least
  to 5.0. We may actually require even newer versions.
* I could not upgrade the scintilla-mirror (which was imported from Mercurial),
  so the old sciteco-dev branch was renamed to sciteco-dev-pre-v2.0.0,
  master was deleted and I reimported the entire Scintilla repo using
  git-remote-hg.
  This means that scintilla-mirror now contains two entirely separate trees.
  But it is still possible to clone old SciTECO repos.
* The strategy/workflow of maintaining hotfix branches on scintilla-mirror has been changed.
  Instead of having one sciteco-dev branch that is rebased onto new Scintilla upstream
  releases and tagging SciTECO releases in scintilla-mirror (to keep the commits referenced),
  we now create a branch for every Scintilla version we are based on (eg. sciteco-rel-5-1-3).
  This branch is never rebased or deleted. Therefore, we are guaranteed to be able to
  clone arbitrary SciTECO repo commits - not only releases.
  Releases no longer have to be tagged in scintilla-mirror.
  On the downside, fixup commits may accumulate in these new branches.
  They can only be squashed once a new branch for a new Scintilla release is created
  (e.g. by cherry-picking followed by rebase).
* Scinterm no longer has to reside in the Scintilla subdirectory,
  so we added it as a regular submodule.
  There are no more recursive submodules.
  The Scinterm build system has not been improved at all, but we use
  a trick based on VPATH to build Scinterm in scintilla/bin/.
* Scinterm is now in Git and we reference the upstream repo for the
  time being.
  We might mirror it and apply the same branching workflow as with Scintilla
  if necessary.
  The scinterm-mirror repository still exists but has not been touched.
  We will also have to rewrite its master branch as it was a non-reproducible
  Mercurial import.
* Scinterm now also comes with patches for Scintilla which we simply applied
  on our sciteco-rel-5-1-3 branch.
* Scintilla 5 outsourced its lexers into the Lexilla project.
  We added it as yet another submodule.
* All submodules have been moved into contrib/.
* The Scintilla API for setting lexers has consequently changed.
  We now have to call SCI_SETILEXER(0, CreateLexer(name)).
  As I did not want to introduce a separate command for setting lexers,
  <ES> has been extended to allow setting lexers by name with the SCI_SETILEXER
  message which effectively replaces SCI_SETLEXERLANGUAGE.
* The lexer macros (SCLEX_...) no longer serve any purpose - they weren't used
  in the SciTECO standard library anyway - and have consequently been removed
  from symbols-scilexer.c.
  The style macros from SciLexer.h (SCE_...) are theoretically still useful - even
  though they are not used by our current color schemes - and have therefore been
  retained. They can be specified as wParam in <ES>.
* <ES> no longer allows symbolic constants for lParam.
  This never made any sense since all supported symbols were always wParam.
* Scinterm supports new native cursor modes.
  They are not used for the time being and the previous CARETSTYLE_BLOCK_AFTER
  caret style is configured by default.
  It makes no sense to enable native cursor modes now since the
  command line should have a native cursor but is not yet a Scintilla view.
* Performance is much worse after the Scintilla upgrade than before,
  so some optimizations will be necessary. | 
|  | file systems
* There is a "Scintilla.h" as well.
* should fix macOS and builds on native Windows hosts
* It wasn't practical to refer to the Scintilla includes using paths since
  the Scintilla location is configurable (--with-scintilla).
  So we'd have to write something like #include <include/Scintilla.h>.
  For Scinterm we cannot avoid collisions either as its path is also
  configurable (--with-scinterm).
  Effectively, we must prevent name clashes across SciTECO and all
  of Scintilla and Scinterm. | 
|  | This is a total conversion of SciTECO to plain C (GNU C11).
The chance was taken to improve a lot of internal datastructures,
fix fundamental bugs and lay the foundations of future features.
The GTK user interface is now in a usable state!
All changes have been squashed together.
The language itself has almost not changed at all, except for:
* Detection of string terminators (usually Escape) now takes
  the string building characters into account.
  A string is only terminated outside of string building characters.
  In other words, you can now for instance write
  I^EQ[Hello$world]$
  This removes one of the last bits of shellisms which is out of
  place in SciTECO where no tokenization/lexing is performed.
  Consequently, the current termination character can also be
  escaped using ^Q/^R.
  This is used by auto completions to make sure that strings
  are inserted verbatim and without unwanted side effects.
* All strings can now safely contain null-characters
  (see also: 8-bit cleanliness).
  The null-character itself (^@) is not (yet) a valid SciTECO
  command, though.
An incomplete list of changes:
* We got rid of the BSD headers for RB trees and lists/queues.
  The problem with them was that they used a form of metaprogramming
  only to gain a bit of type safety. It also resulted in less
  readable code. This was a C++ disease.
  The new code avoids metaprogramming that serves only to gain type safety.
  The BSD tree.h has been replaced by rb3ptr by Jens Stimpfle
  (https://github.com/jstimpfle/rb3ptr).
  This implementation is also more memory efficient than BSD's.
  The BSD list.h and queue.h have been replaced with a custom
  src/list.h.
* Fixed crashes, performance issues and compatibility issues with
  the Gtk 3 User Interface.
  It is now more or less ready for general use.
  The GDK lock is no longer used to avoid using deprecated functions.
  On the downside, the new implementation (driving the Gtk event loop
  stepwise) is even slower than the old one.
  A few glitches remain (see TODO), but it is hoped that they will
  be resolved by the Scintilla update which will be performed soon.
* A lot of program units have been split up, so they are shorter
  and easier to maintain: core-commands.c, qreg-commands.c,
  goto-commands.c, file-utils.h.
* Parser states are simply structs of callbacks now.
  They still use a kind of polymorphism using a preprocessor trick.
  TECO_DEFINE_STATE() takes an initializer list that will be
  merged with the default list of field initializers.
  To "subclass" states, you can simply define new macros that add
  initializers to existing macros.
* Parsers no longer have a "transitions" table but the input_cb()
  may use switch-case statements.
  There are also teco_machine_main_transition_t now which can
  be used to implement simple transitions. Additionally, you
  can specify functions to execute during transitions.
  This largely avoids long switch-case-statements.
* Parsers are embeddable/reusable now, at least in parse-only mode.
  This does not currently bring any advantages but may later
  be used to write a Scintilla lexer for TECO syntax highlighting.
  Once parsers are fully embeddable, it will also be possible
  to run TECO macros in a kind of coroutine which would allow
  them to process string arguments in real time.
* undo.[ch] still uses metaprogramming extensively but via
  the C preprocessor of course. On the downside, most undo
  token generators must be instantiated explicitly (theoretically
  we could have used embedded functions / trampolines to
  instantiate automatically but this has turned out to be
  dangerous).
  There is a TECO_DEFINE_UNDO_CALL() to generate closures for
  arbitrary functions now (ie. to call an arbitrary function
  at undo-time). This simplified a lot of code and is much
  shorter than manually pushing undo tokens in many cases.
* Instead of the ridiculous C++ Curiously Recurring Template
  Pattern to achieve static polymorphism for user interface
  implementations, we now simply declare all functions to
  implement in interface.h and link in the implementations.
  This is possible since we no longer have to define
  interface subclasses (all state is static variables in
  the interface's *.c files).
* Headers are now significantly shorter than in C++ since
  we can often hide more of our "class" implementations.
* Memory counting is based on dlmalloc for most platforms now.
  Unfortunately, there is no malloc implementation that
  provides an efficient constant-time memory counter that
  is guaranteed to decrease when freeing memory.
  But since we use a defined malloc implementation now,
  malloc_usable_size() can be used safely for tracking memory use.
  malloc() replacement is very tricky on Windows, so we
  use a poll thread on Windows. This can also be enabled
  on other supported platforms using --disable-malloc-replacement.
  All in all, I'm still not pleased with the state of memory
  limiting. It is a mess.
* Error handling uses GError now. This has the advantage that
  the GError codes can be reused once we support error catching
  in the SciTECO language.
* Added a few more test suite cases.
* Haiku is no longer supported as builds are unstable and
  I did not manage to debug them - quite possibly Haiku bugs
  were responsible.
* GLib v2.44 or later is now required.
  The GTK UI requires Gtk+ v3.12 or later now.
  The GtkFlowBox fallback and sciteco-wrapper workaround are
  no longer required.
* We now extensively use the GCC/Clang-specific g_auto
  feature (automatic deallocations when leaving the current
  code block).
* Updated copyright to 2021.
  SciTECO has been in continuous development, even though there
  have been no commits since 2018.
* Since these changes are so significant, the target release has
  been set to v2.0.
  It is planned that beginning with v3.0, the language will be
  kept stable. |
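
The "structs of callbacks" with merged default initializers described for TECO_DEFINE_STATE() can be illustrated with designated initializers, where a later initializer for a field overrides an earlier one. Everything below is a hypothetical sketch: `state_t`, `DEFINE_STATE()`, `DEFINE_STATE_START()` and the callbacks are made-up names and do not reproduce the real macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parser state: a plain struct of callbacks and flags. */
typedef struct {
    const char *name;
    int (*input_cb)(int chr);
    int is_start;
} state_t;

/* Default callback used unless a state overrides it. */
static int default_input_cb(int chr)
{
    (void)chr;
    return 0;
}

/*
 * Defaults come first, caller-supplied initializers last. When a field is
 * initialized twice, the last designated initializer wins (compilers may
 * warn about this with -Woverride-init or similar).
 */
#define DEFINE_STATE(NAME, ...) \
    static const state_t NAME = { \
        .name = #NAME, \
        .input_cb = default_input_cb, \
        .is_start = 0, \
        __VA_ARGS__ \
    }

/* "Subclassing": a new macro that adds initializers to the existing one. */
#define DEFINE_STATE_START(NAME, ...) \
    DEFINE_STATE(NAME, .is_start = 1, __VA_ARGS__)

static int start_input_cb(int chr)
{
    return chr == 'X';
}

DEFINE_STATE(state_plain, .is_start = 0);
DEFINE_STATE_START(state_start, .input_cb = start_input_cb);

/* Dispatch a character to whatever callback the state ended up with. */
static int state_feed(const state_t *s, int chr)
{
    assert(s != NULL && s->input_cb != NULL);
    return s->input_cb(chr);
}
```

Here `state_plain` keeps the default callback, while `state_start` inherits the "start" flag from its macro and replaces only the input callback.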