<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sciteco/doc/sciteco.7.template, branch v2.1.1</title>
<subtitle>Scintilla-based Text Editor and COrrector</subtitle>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/'/>
<entry>
<title>pattern match characters support ^Q/^R now as well</title>
<updated>2024-10-04T19:41:16+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-10-04T19:41:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=b36ff2502ae3b0e18fa862a01fba9cc2c9067e31'/>
<id>b36ff2502ae3b0e18fa862a01fba9cc2c9067e31</id>
<content type='text'>
* makes it possible, albeit cumbersome, to escape pattern match characters
* For instance, to search for ^Q, you now have to type
  S^Q^Q^Q^Q$.
  To search for ^E you have to type
  S^Q^Q^Q^E$.
  But the last character cannot be typed with carets currently (FIXME?).
  For pattern-only characters, two ^Q should be sufficient as in
  S^Q^Q^X$.
* Perhaps it would be more elegant to abolish the difference between string building
  and pattern matching characters to avoid double quoting.
  But then all string building constructs like ^EQq should operate at the pattern level
  as well (i.e. match the contents of register q verbatim instead of being interpreted as a pattern).
  TECOC and TECO-64 don't do that either.
  If we leave everything as it is, at least a new string building construct should be added for
  auto-quoting patterns (analogous to ^EN and ^E@).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* makes it possible, albeit cumbersome, to escape pattern match characters
* For instance, to search for ^Q, you now have to type
  S^Q^Q^Q^Q$.
  To search for ^E you have to type
  S^Q^Q^Q^E$.
  But the last character cannot be typed with carets currently (FIXME?).
  For pattern-only characters, two ^Q should be sufficient as in
  S^Q^Q^X$.
* Perhaps it would be more elegant to abolish the difference between string building
  and pattern matching characters to avoid double quoting.
  But then all string building constructs like ^EQq should operate at the pattern level
  as well (i.e. match the contents of register q verbatim instead of being interpreted as a pattern).
  TECOC and TECO-64 don't do that either.
  If we leave everything as it is, at least a new string building construct should be added for
  auto-quoting patterns (analogous to ^EN and ^E@).
</pre>
</div>
</content>
</entry>
<entry>
<title>inhibit some immediate editing commands after ^Q/^R string building constructs</title>
<updated>2024-09-25T11:56:20+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-25T11:56:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=dcaeb77ef2c5fad1810d242d7a96669e33c4b082'/>
<id>dcaeb77ef2c5fad1810d242d7a96669e33c4b082</id>
<content type='text'>
* This allows you to type ^Q^U (which would otherwise rub out the entire argument)
  and ^Q^W (which would otherwise rub out the ^Q).
* ^Q^U coincidentally worked previously since the teco_state_stringbuilding_escaped
  state would default to teco_state_process_edit_cmd().
  But it's better to make this feature explicit.
* This finally makes it possible to insert the ^W (23) char into a buffer.
  In interactive mode, you can still only type Caret+W as a string building construct.
* ^G could also be inhibited after ^Q, but the control char is not used anywhere yet,
  so there is no point in doing that.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* This allows you to type ^Q^U (which would otherwise rub out the entire argument)
  and ^Q^W (which would otherwise rub out the ^Q).
* ^Q^U coincidentally worked previously since the teco_state_stringbuilding_escaped
  state would default to teco_state_process_edit_cmd().
  But it's better to make this feature explicit.
* This finally makes it possible to insert the ^W (23) char into a buffer.
  In interactive mode, you can still only type Caret+W as a string building construct.
* ^G could also be inhibited after ^Q, but the control char is not used anywhere yet,
  so there is no point in doing that.
</pre>
</div>
</content>
</entry>
<entry>
<title>allow OSC-52 clipboards on all terminal emulators</title>
<updated>2024-09-23T09:45:25+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-23T09:35:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c2887621a37f429e2e05b561631fff01da8bd462'/>
<id>c2887621a37f429e2e05b561631fff01da8bd462</id>
<content type='text'>
* The XTerm version is still checked if we detect running under XTerm.
* Actually, the XTerm implementation is broken for Unicode clipboard contents.
* Kitty supports OSC-52, but you __must__ enable read-clipboard.
  With read-clipboard-ask, there will be a timeout.
  But we cannot read without a timeout since otherwise we would hang indefinitely
  if the escape sequence turns out to not work.
* For urxvt, I have hacked an existing extension:
  https://gist.github.com/rhaberkorn/d7406420b69841ebbcab97548e38b37d
* st currently supports only setting the clipboard, but not querying it.
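
A minimal sketch of the escape sequences involved, in Python rather than
SciTECO's C. The helper names are hypothetical and not taken from the SciTECO
sources. It assumes the common form ESC ] 52 ; c ; payload BEL, where the
payload is either the Base64-encoded clipboard text or a question mark to
query the clipboard.

```python
import base64

ESC = "\x1b"
BEL = "\x07"

def osc52_set(text, selection="c"):
    """Build the sequence asking the terminal to set a clipboard (hypothetical helper)."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"{ESC}]52;{selection};{payload}{BEL}"

def osc52_query(selection="c"):
    """Build the sequence asking the terminal to report a clipboard (hypothetical helper)."""
    return f"{ESC}]52;{selection};?{BEL}"
```

The reply to the query has to be read with a timeout, since a terminal
without working OSC-52 support never answers.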
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* The XTerm version is still checked if we detect running under XTerm.
* Actually, the XTerm implementation is broken for Unicode clipboard contents.
* Kitty supports OSC-52, but you __must__ enable read-clipboard.
  With read-clipboard-ask, there will be a timeout.
  But we cannot read without a timeout since otherwise we would hang indefinitely
  if the escape sequence turns out to not work.
* For urxvt, I have hacked an existing extension:
  https://gist.github.com/rhaberkorn/d7406420b69841ebbcab97548e38b37d
* st currently supports only setting the clipboard, but not querying it.
</pre>
</div>
</content>
</entry>
<entry>
<title>^W^W and ^V^V can be typed completely with upcarets now and they case fold all expansions of ^EQq, ^EUq and so on</title>
<updated>2024-09-20T11:50:13+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-20T11:50:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=2b5b2a48f8db3d6b73a0f1a6e0aeab3a940b3b85'/>
<id>2b5b2a48f8db3d6b73a0f1a6e0aeab3a940b3b85</id>
<content type='text'>
* Previously, there was no way to enter upper-case mode in interactive commands since
  the Ctrl+W immediate editing command is interpreted everywhere.
* Without the case folding of ^EQq/^EUq results, the upper and lower case modes are actually pretty useless
  considering that modern keyboards have caps lock.
  So it was clear we need this, regardless of what the classic TECOs did.
  The TECO-11 manual is not very clear on this.
  tecoc apparently does not case-fold ^EQq results.
* This opens up new idioms, for instance
  `EUq^W^W^EQq$` in order to upper case register q.
  It's also the only way you can currently upper-case Unicode codepoints.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Previously, there was no way to enter upper-case mode in interactive commands since
  the Ctrl+W immediate editing command is interpreted everywhere.
* Without the case folding of ^EQq/^EUq results, the upper and lower case modes are actually pretty useless
  considering that modern keyboards have caps lock.
  So it was clear we need this, regardless of what the classic TECOs did.
  The TECO-11 manual is not very clear on this.
  tecoc apparently does not case-fold ^EQq results.
* This opens up new idioms, for instance
  `EUq^W^W^EQq$` in order to upper case register q.
  It's also the only way you can currently upper-case Unicode codepoints.
</pre>
</div>
</content>
</entry>
<entry>
<title>Ctrl+^ is no longer translated to a single caret in string building (refs #20)</title>
<updated>2024-09-19T10:53:14+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-19T10:53:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=fcf962edded2d6a7cb638909587167261e4f2bb0'/>
<id>fcf962edded2d6a7cb638909587167261e4f2bb0</id>
<content type='text'>
* Ctrl+^ (30) and Caret+caret (^^) were both translated to a single caret.
  While there might be some reason to keep this behavior for double-caret,
  it is certainly pointless for Ctrl+^.
* That gives you an easy way to insert Ctrl+^ (code 30) into documents with &lt;I&gt;.
  Previously, you either had to insert a double-caret, typing 4 carets in a row,
  or you had to use &lt;EI&gt; or 30I$.
* The special handling of double-caret could perhaps be abolished altogether,
  as we also have ^Q^ to escape plain carets.
  The double-caret syntax is very archaic, dating from a time when there was no proper
  ^Q, if I recall correctly.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Ctrl+^ (30) and Caret+caret (^^) were both translated to a single caret.
  While there might be some reason to keep this behavior for double-caret,
  it is certainly pointless for Ctrl+^.
* That gives you an easy way to insert Ctrl+^ (code 30) into documents with &lt;I&gt;.
  Previously, you either had to insert a double-caret, typing 4 carets in a row,
  or you had to use &lt;EI&gt; or 30I$.
* The special handling of double-caret could perhaps be abolished altogether,
  as we also have ^Q^ to escape plain carets.
  The double-caret syntax is very archaic, dating from a time when there was no proper
  ^Q, if I recall correctly.
</pre>
</div>
</content>
</entry>
<entry>
<title>sciteco(7): mentioned "[a]b" idiom</title>
<updated>2024-09-17T20:32:44+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-17T20:32:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=27fba8be43e431d9316ca27fcfdaf4c6c3f22626'/>
<id>27fba8be43e431d9316ca27fcfdaf4c6c3f22626</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>updated lists of external links in sciteco(1) and sciteco(7)</title>
<updated>2024-09-16T21:01:41+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-16T20:38:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c3e25ca55714d3a1338ccaceac9eaef04b804b1e'/>
<id>c3e25ca55714d3a1338ccaceac9eaef04b804b1e</id>
<content type='text'>
* Unfortunately, the list in sciteco(7) does not format with FreeBSD's man or
  within SciTECO.
* Removed references to the old sciteco.sf.net.
  We don't have a proper "homepage" for the time being.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Unfortunately, the list in sciteco(7) does not format with FreeBSD's man or
  within SciTECO.
* Removed references to the old sciteco.sf.net.
  We don't have a proper "homepage" for the time being.
</pre>
</div>
</content>
</entry>
<entry>
<title>Curses: added support for cool Unicode icons (refs #5)</title>
<updated>2024-09-16T20:30:35+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-16T20:30:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=8744502cbe42c98422a798c06f4c8ce033725412'/>
<id>8744502cbe42c98422a798c06f4c8ce033725412</id>
<content type='text'>
* Practically requires one of the "Nerd Font" fonts,
  so it's disabled by default.
  Add 0,512ED to the profile to enable them.
* The new ED flag could be used to control Gtk icons as well,
  but they are left always-enabled for the time being.
  Is there any reason anybody would like to disable icons in Gtk?
* The list of icons has been adapted and extended from exa:
  https://github.com/ogham/exa/blob/master/src/output/icons.rs
* The icons are hardcoded as presorted lists,
  so we can binary search them.
  This could change in the future. If there is any demand,
  they could be made configurable via Q-Registers as well.
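
A minimal sketch of the presorted-list lookup, in Python rather than
SciTECO's C. The table contents and names are hypothetical placeholders and
not the actual icon list. The table must be sorted by name in code point
order for the binary search to be valid.

```python
import bisect

# Hypothetical, presorted table of (file name, icon) pairs.
# Upper-case letters sort before lower-case ones in code point order.
ICONS = [
    ("Makefile", "M"),
    ("README", "R"),
    ("configure", "C"),
]
NAMES = [name for name, _ in ICONS]

def lookup_icon(name, default="?"):
    """Binary search the presorted table; fall back to a default icon."""
    i = bisect.bisect_left(NAMES, name)
    if i != len(NAMES) and NAMES[i] == name:
        return ICONS[i][1]
    return default
```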
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Practically requires one of the "Nerd Font" fonts,
  so it's disabled by default.
  Add 0,512ED to the profile to enable them.
* The new ED flag could be used to control Gtk icons as well,
  but they are left always-enabled for the time being.
  Is there any reason anybody would like to disable icons in Gtk?
* The list of icons has been adapted and extended from exa:
  https://github.com/ogham/exa/blob/master/src/output/icons.rs
* The icons are hardcoded as presorted lists,
  so we can binary search them.
  This could change in the future. If there is any demand,
  they could be made configurable via Q-Registers as well.
</pre>
</div>
</content>
</entry>
<entry>
<title>function key macros have been reworked into a more generic key macro feature</title>
<updated>2024-09-12T14:44:13+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-12T11:55:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=abb5d23eba21a2aafda0346c0c5dd845561b2aa2'/>
<id>abb5d23eba21a2aafda0346c0c5dd845561b2aa2</id>
<content type='text'>
* ALL keypresses (the UTF-8 sequences resulting from key presses) can now be remapped.
* This is especially useful with Unicode support, as you might want to alias
  international characters to their corresponding Latin form in the start state,
  so you don't have to change keyboard layouts so often.
  This is done automatically in Gtk, where we have hardware key press information,
  but has to be done with key macros in Curses.
  There is a new key mask 4 (bit 3) for that purpose now.
* Also, you might want to define non-ANSI letters to perform special functions in
  the start state where it won't be accepted by the parser anyway.
  Suppose you have a macro M→, you could define
  @^U[^K→]{m→} 1^_U[^K→]
  This effectively "extends" the parser and allows you to call macro "→" by a single
  key press. See also #5.
* The register prefix has been changed from ^F (for function) to ^K (for key).
  This is the only thing you have to change in order to migrate existing
  function key macros.
* Key macros are enabled by default. There is no longer any way to disable
  function key handling in curses, as I never found any reason or need to disable it.
  Theoretically, the default ESCDELAY could turn out to be too small and function
  keys don't get through. I doubt that's possible unless on extremely slow serial lines.
  Even then, you would increase ESCDELAY and, instead of disabling function keys,
  simply define an escape surrogate.
* The ED flag has been removed and its place is reserved for a future mouse support flag
  (which does make sense to disable in curses sometimes).
  fnkeys.tes is consequently also enabled by default in sample.teco_ini.
* Key macros are handled as a unit. If one character results in an error,
  the entire string is rubbed out.
  This fixes the "CLOSE" key on Gtk.
  It also makes sure that the original error message is preserved and not overwritten
  by some subsequent syntax error.
  It was never useful that we kept inserting characters after the first error.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* ALL keypresses (the UTF-8 sequences resulting from key presses) can now be remapped.
* This is especially useful with Unicode support, as you might want to alias
  international characters to their corresponding Latin form in the start state,
  so you don't have to change keyboard layouts so often.
  This is done automatically in Gtk, where we have hardware key press information,
  but has to be done with key macros in Curses.
  There is a new key mask 4 (bit 3) for that purpose now.
* Also, you might want to define non-ANSI letters to perform special functions in
  the start state where it won't be accepted by the parser anyway.
  Suppose you have a macro M→, you could define
  @^U[^K→]{m→} 1^_U[^K→]
  This effectively "extends" the parser and allows you to call macro "→" by a single
  key press. See also #5.
* The register prefix has been changed from ^F (for function) to ^K (for key).
  This is the only thing you have to change in order to migrate existing
  function key macros.
* Key macros are enabled by default. There is no longer any way to disable
  function key handling in curses, as I never found any reason or need to disable it.
  Theoretically, the default ESCDELAY could turn out to be too small and function
  keys don't get through. I doubt that's possible unless on extremely slow serial lines.
  Even then, you would increase ESCDELAY and, instead of disabling function keys,
  simply define an escape surrogate.
* The ED flag has been removed and its place is reserved for a future mouse support flag
  (which does make sense to disable in curses sometimes).
  fnkeys.tes is consequently also enabled by default in sample.teco_ini.
* Key macros are handled as a unit. If one character results in an error,
  the entire string is rubbed out.
  This fixes the "CLOSE" key on Gtk.
  It also makes sure that the original error message is preserved and not overwritten
  by some subsequent syntax error.
  It was never useful that we kept inserting characters after the first error.
</pre>
</div>
</content>
</entry>
<entry>
<title>the SciTECO parser is Unicode-based now (refs #5)</title>
<updated>2024-09-11T14:14:27+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-11T10:21:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=68578072bfaf6054a96bb6bcedfccb6e56a508fe'/>
<id>68578072bfaf6054a96bb6bcedfccb6e56a508fe</id>
<content type='text'>
The following rules apply:
 * All SciTECO macros __must__ be in valid UTF-8, regardless of the
   register's configured encoding.
   This is checked before execution, so we can use glib's non-validating
   UTF-8 API afterwards.
 * Things will inevitably get slower as we have to validate all macros first
   and convert to gunichar for each and every character passed into the parser.
   As an optimization, it may make sense to have our own inlineable version of
   g_utf8_get_char() (TODO).
   Also, Unicode glyphs in syntactically significant positions may be case-folded -
   just like ASCII chars were. This is of course slower than case folding
   ASCII. The impact of this should be measured and perhaps we should restrict
   case folding to a-z via teco_ascii_toupper().
 * The language itself does not use any non-ANSI characters, so you don't have to
   use UTF-8 characters.
 * Wherever the parser expects a single character, it will now accept an arbitrary
   Unicode/UTF-8 glyph as well.
   In other words, you can call macros like M§ instead of having to write M[§].
   You can also get the codepoint of any Unicode character with ^^x.
   Pressing a Unicode character in the start state or in Ex and Fx will now
   give a sane error message.
 * When pressing a key which produces a multi-byte UTF-8 sequence, the character
   gets translated back and forth multiple times:
   1. It's converted to a UTF-8 string, either buffered or by IME methods (Gtk).
      On Curses we could directly get a wide char using wget_wch(), but it's
      not currently used, so we don't depend on widechar curses.
   2. Parsed into gunichar for passing into the edit command callbacks.
      This also validates the codepoint - everything later on can assume valid
      codepoints and valid UTF-8 strings.
   3. Once the edit command handling decides to insert the key into the command line,
      it is serialized back into a UTF-8 string as the command line macro has
      to be in UTF-8 (like all other macros).
   4. The parser reads back gunichars without validation for passing into
      the parser callbacks.
 * Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely
   inserted and displayed UTF-8 sequences, are now fixed.
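
A minimal sketch of the validate-once, decode-afterwards rule, in Python
rather than SciTECO's C with glib. The function names are hypothetical.
Note that Python's decoder validates again on every call, so this only
illustrates the order of the steps, not the performance benefit of a
non-validating API.

```python
def validate_macro(raw):
    """Return True if the macro (bytes) is valid UTF-8; done once before execution."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def codepoints(raw):
    """Yield the code point of each character of an already validated macro."""
    for ch in raw.decode("utf-8"):
        yield ord(ch)

macro = "M§".encode("utf-8")
if validate_macro(macro):
    # Each value corresponds to one character handed to the parser.
    print(list(codepoints(macro)))
```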
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The following rules apply:
 * All SciTECO macros __must__ be in valid UTF-8, regardless of the
   register's configured encoding.
   This is checked before execution, so we can use glib's non-validating
   UTF-8 API afterwards.
 * Things will inevitably get slower as we have to validate all macros first
   and convert to gunichar for each and every character passed into the parser.
   As an optimization, it may make sense to have our own inlineable version of
   g_utf8_get_char() (TODO).
   Also, Unicode glyphs in syntactically significant positions may be case-folded -
   just like ASCII chars were. This is of course slower than case folding
   ASCII. The impact of this should be measured and perhaps we should restrict
   case folding to a-z via teco_ascii_toupper().
 * The language itself does not use any non-ANSI characters, so you don't have to
   use UTF-8 characters.
 * Wherever the parser expects a single character, it will now accept an arbitrary
   Unicode/UTF-8 glyph as well.
   In other words, you can call macros like M§ instead of having to write M[§].
   You can also get the codepoint of any Unicode character with ^^x.
   Pressing a Unicode character in the start state or in Ex and Fx will now
   give a sane error message.
 * When pressing a key which produces a multi-byte UTF-8 sequence, the character
   gets translated back and forth multiple times:
   1. It's converted to a UTF-8 string, either buffered or by IME methods (Gtk).
      On Curses we could directly get a wide char using wget_wch(), but it's
      not currently used, so we don't depend on widechar curses.
   2. Parsed into gunichar for passing into the edit command callbacks.
      This also validates the codepoint - everything later on can assume valid
      codepoints and valid UTF-8 strings.
   3. Once the edit command handling decides to insert the key into the command line,
      it is serialized back into a UTF-8 string as the command line macro has
      to be in UTF-8 (like all other macros).
   4. The parser reads back gunichars without validation for passing into
      the parser callbacks.
 * Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely
   inserted and displayed UTF-8 sequences, are now fixed.
</pre>
</div>
</content>
</entry>
</feed>
