<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sciteco/tests, branch v2.4.0</title>
<subtitle>Scintilla-based Text Editor and COrrector</subtitle>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/'/>
<entry>
<title>fixed undoing bitfields on Windows</title>
<updated>2025-04-12T22:33:43+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-04-12T21:40:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=628c73d984fd7663607cc3fd9368f809855906fd'/>
<id>628c73d984fd7663607cc3fd9368f809855906fd</id>
<content type='text'>
* It turns out that `bool` (_Bool) in bitfields may cause
  padding to the next 32-bit word.
  This was only observed on MinGW.
  I am not entirely sure why, although the C standard does
  not guarantee much with regard to bitfield memory layout
  and there are 64 bits available due to padding anyway.
  Actually, they could also be laid out in a different order.
* I am now consistently using guint instead of `bool` in bitfields
  to prevent any potential surprises.
* The way that guint was aliased with bitfield structs
  for undoing teco_machine_main_t and teco_machine_qregspec_t flags
  was therefore insecure.
  It was not guaranteed that the __flags field really "captures"
  all of the bit field.
  Even with `guint v : 1` fields, this was not guaranteed.
  We would have required a static assertion for robustness.
  Alternatively, we could have declared a `gsize __flags` variable
  as well. This __should__ be safe since gsize should always be
  pointer sized and correspond to the platform's alignment.
  However, it's also not 100% guaranteed.
  Using classic ANSI C enums with bit operations to encode multiple
  fields and flags into a single integer also doesn't look very
  attractive.
* Instead, we now define scalar types with their own teco_undo_push()
  shortcuts for the bitfield structs.
  This is in one way simpler and much more robust, but on the other
  hand complicates access to the flag variables.
* It's a good question whether a `struct __attribute__((packed))` bitfield
  with guint fields would be a reliable replacement for flag enums that
  are communicated with the "outside" (TECO) world.
  I am not going to risk it until GCC gives any guarantees, though.
  For the time being, bitfields are only used internally where
  the concrete memory layout (bit positions) is not crucial.
* This fixes the test suite and therefore probably CI and nightly
  builds on Windows.
* Test case: Rub out `@I//` or `@Xq` until before the `@`.
  The parser doesn't know that `@` is still set and allows
  all sorts of commands where `@` should be forbidden.
* It's unknown how long this has been broken on Windows - quite
  possibly since v2.0.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* It turns out that `bool` (_Bool) in bitfields may cause
  padding to the next 32-bit word.
  This was only observed on MinGW.
  I am not entirely sure why, although the C standard does
  not guarantee much with regard to bitfield memory layout
  and there are 64 bits available due to padding anyway.
  Actually, they could also be laid out in a different order.
* I am now consistently using guint instead of `bool` in bitfields
  to prevent any potential surprises.
* The way that guint was aliased with bitfield structs
  for undoing teco_machine_main_t and teco_machine_qregspec_t flags
  was therefore insecure.
  It was not guaranteed that the __flags field really "captures"
  all of the bit field.
  Even with `guint v : 1` fields, this was not guaranteed.
  We would have required a static assertion for robustness.
  Alternatively, we could have declared a `gsize __flags` variable
  as well. This __should__ be safe since gsize should always be
  pointer sized and correspond to the platform's alignment.
  However, it's also not 100% guaranteed.
  Using classic ANSI C enums with bit operations to encode multiple
  fields and flags into a single integer also doesn't look very
  attractive.
* Instead, we now define scalar types with their own teco_undo_push()
  shortcuts for the bitfield structs.
  This is in one way simpler and much more robust, but on the other
  hand complicates access to the flag variables.
* It's a good question whether a `struct __attribute__((packed))` bitfield
  with guint fields would be a reliable replacement for flag enums that
  are communicated with the "outside" (TECO) world.
  I am not going to risk it until GCC gives any guarantees, though.
  For the time being, bitfields are only used internally where
  the concrete memory layout (bit positions) is not crucial.
* This fixes the test suite and therefore probably CI and nightly
  builds on Windows.
* Test case: Rub out `@I//` or `@Xq` until before the `@`.
  The parser doesn't know that `@` is still set and allows
  all sorts of commands where `@` should be forbidden.
* It's unknown how long this has been broken on Windows - quite
  possibly since v2.0.
</pre>
</div>
</content>
</entry>
<entry>
<title>testsuite: check whether comparisons for equality really work with the `a-b"=` idiom</title>
<updated>2025-04-10T00:50:48+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-04-10T00:41:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=9e82d5ee56d258d33f59eb6fdcc363d8c0c47b4c'/>
<id>9e82d5ee56d258d33f59eb6fdcc363d8c0c47b4c</id>
<content type='text'>
* There might theoretically be problems with the uncommon one's complement or sign-magnitude
  representation of negative integers, but it's practically impossible to meet those in
  the wild.
* Still, we do some checks now, so we will at least notice any exotic architectures.
* Also, documented the `a^#b"=` idiom for checking for equality.
  It's longer to type, but faster and will also work for floats.
  For floats it will be the only permissible idiom for checking for bitwise equality
  as `a-b` can be 0 even if a!=b (if the difference is very small).
  Changing the `-` semantics is out of the question.
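
The two idioms can be contrasted outside of TECO. The following is only a
minimal Python sketch, not SciTECO code: all function names are hypothetical,
and signed zero is picked as one case where subtraction yields 0 although the
bit patterns differ.

```python
import struct


def equal_by_subtraction(a, b):
    """Model of the a-b"= idiom: equal if the difference is zero."""
    return a - b == 0


def equal_by_xor(a, b):
    """Model of the a^#b"= idiom for integers: equal if no bit differs."""
    return a ^ b == 0


def float_bits(x):
    """Return the IEEE 754 binary64 bit pattern of x as an unsigned integer."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]


def bitwise_equal_floats(a, b):
    """XOR comparison applied to the raw bit patterns of two floats."""
    return equal_by_xor(float_bits(a), float_bits(b))
```

For integers both checks agree. For 0.0 and -0.0, the subtraction check reports
equality while the bitwise check does not.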
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* There might theoretically be problems with the uncommon one's complement or sign-magnitude
  representation of negative integers, but it's practically impossible to meet those in
  the wild.
* Still, we do some checks now, so we will at least notice any exotic architectures.
* Also, documented the `a^#b"=` idiom for checking for equality.
  It's longer to type, but faster and will also work for floats.
  For floats it will be the only permissible idiom for checking for bitwise equality
  as `a-b` can be 0 even if a!=b (if the difference is very small).
  Changing the `-` semantics is out of the question.
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed formatting of the smallest possible integer</title>
<updated>2025-04-10T00:16:18+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-04-09T23:00:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c462509adfd68e8b849b8a6713360fb4f9026578'/>
<id>c462509adfd68e8b849b8a6713360fb4f9026578</id>
<content type='text'>
* In other words, fixed `-9223372036854775808\` on --with-teco-integer=64
  (which is the default).
* The reason is that ABS(G_MININT64) == G_MININT64 since -G_MININT64 == G_MININT64.
  It is therefore important not to call ABS() on arbitrary teco_int_t's.
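
The wraparound can be modelled in Python, whose integers do not overflow.
This is a sketch that assumes 64-bit two's complement wraparound.
wrap_int64(), abs_int64() and format_int64() are hypothetical helpers,
not the functions used in SciTECO.

```python
INT64_MIN = -2**63


def wrap_int64(value):
    """Wrap an arbitrary Python int into the signed 64-bit range."""
    return (value + 2**63) % 2**64 - 2**63


def abs_int64(value):
    """Mimic ABS() on a 64-bit integer, including the wraparound."""
    return value if value >= 0 else wrap_int64(-value)


def format_int64(value, radix=10):
    """Format value by taking the magnitude outside of the signed 64-bit range."""
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    # Unbounded here; in C this would have to be an unsigned 64-bit value.
    magnitude = value if value >= 0 else -value
    if magnitude == 0:
        return "0"
    out = ""
    while magnitude > 0:
        magnitude, digit = divmod(magnitude, radix)
        out = digits[digit] + out
    return out if value >= 0 else "-" + out
```

abs_int64(INT64_MIN) returns INT64_MIN again, which is why the sketch computes
the magnitude in a wider (in C: unsigned) domain before emitting digits.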
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* In other words, fixed `-9223372036854775808\` on --with-teco-integer=64
  (which is the default).
* The reason is that ABS(G_MININT64) == G_MININT64 since -G_MININT64 == G_MININT64.
  It is therefore important not to call ABS() on arbitrary teco_int_t's.
</pre>
</div>
</content>
</entry>
<entry>
<title>tightened rules for specifying modifiers</title>
<updated>2025-04-08T21:33:40+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-04-08T20:26:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=7c0e4fbb1d1f0d19d11c7417c55a305654ab1c83'/>
<id>7c0e4fbb1d1f0d19d11c7417c55a305654ab1c83</id>
<content type='text'>
* Instead of separate stand-alone commands, they are now allowed only immediately
  in front of the commands that accept them.
* The order is still insignificant if both `@` and `:` are accepted.
* The number of colon modifiers is now also checked.
  We basically get this for free.
* `@` has syntactic significance, so it could not be set conditionally anyway.
  Still, it was possible to provoke bugs where `@` was interpreted conditionally
  as in `@ 2&lt;I/foo/$&gt;`.
* Even when not causing bugs, a mistyped `@` would often influence the
  __next__ command, causing unexpected behavior, for instance when
  typing `@(233C)W`.
* While it was theoretically possible to set `:` conditionally, it could also
  be "passed through" accidentally to some command where it wasn't expected as in
  `:Ifoo$ C`.
  I do not know of any real useful application or idiom of a conditionally set `:`.
  Should some useful application turn up after all, `:'` and `:|` could
  be re-allowed easily, though.
* I was considering introducing a common parser state for modified commands,
  but that would have been tricky and would have introduced a lot of redundant command lists.
  So instead, we now simply check for excess modifiers everywhere.
  To simplify this task, teco_machine_main_transition_t now contains flags
  signaling whether the transition is allowed with `@` or `:` modifiers set.
  It currently only has to be checked in the start state, after `E` and `F`.
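
The per-transition check could look roughly like the following Python sketch.
It is not the actual teco_machine_main_transition_t: the field names, the table
contents and check_modifiers() are assumptions made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """Hypothetical stand-in for a transition table entry."""
    next_state: str
    modifier_at: bool = False  # transition allowed with `@` set
    max_colons: int = 0        # number of `:` modifiers accepted


# Illustrative entries only; they do not reflect the real command tables.
TRANSITIONS = {
    "I": Transition("insert", modifier_at=True),
    "C": Transition("start"),
    "X": Transition("expect_qreg", modifier_at=True, max_colons=1),
}


def check_modifiers(command, at_set, colons):
    """Return the transition or raise if excess modifiers were given."""
    transition = TRANSITIONS[command]
    if at_set and not transition.modifier_at:
        raise ValueError(f"Unexpected at-modifier on command {command}")
    if colons > transition.max_colons:
        raise ValueError(f"Too many colon modifiers on command {command}")
    return transition
```

Keeping the flags next to the transition means that the check is a table lookup
instead of a separate parser state per modified command.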
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Instead of separate stand-alone commands, they are now allowed only immediately
  in front of the commands that accept them.
* The order is still insignificant if both `@` and `:` are accepted.
* The number of colon modifiers is now also checked.
  We basically get this for free.
* `@` has syntactic significance, so it could not be set conditionally anyway.
  Still, it was possible to provoke bugs where `@` was interpreted conditionally
  as in `@ 2&lt;I/foo/$&gt;`.
* Even when not causing bugs, a mistyped `@` would often influence the
  __next__ command, causing unexpected behavior, for instance when
  typing `@(233C)W`.
* While it was theoretically possible to set `:` conditionally, it could also
  be "passed through" accidentally to some command where it wasn't expected as in
  `:Ifoo$ C`.
  I do not know of any real useful application or idiom of a conditionally set `:`.
  Should some useful application turn up after all, `:'` and `:|` could
  be re-allowed easily, though.
* I was considering introducing a common parser state for modified commands,
  but that would have been tricky and would have introduced a lot of redundant command lists.
  So instead, we now simply check for excess modifiers everywhere.
  To simplify this task, teco_machine_main_transition_t now contains flags
  signaling whether the transition is allowed with `@` or `:` modifiers set.
  It currently only has to be checked in the start state, after `E` and `F`.
</pre>
</div>
</content>
</entry>
<entry>
<title>improved rubbing out commands with modifiers</title>
<updated>2025-04-08T20:59:21+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-04-08T19:23:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=44307bd7998e5f1fc81d63d74edaf4756ddf5a47'/>
<id>44307bd7998e5f1fc81d63d74edaf4756ddf5a47</id>
<content type='text'>
* This was actually broken if the command is preceded by `@` and `:` characters, which
  are __not__ modifiers.
  E.g. `Q:@I/foo^W` would have rubbed out the `:` register as well.
* Also, since it was all done in teco_state_process_edit_cmd(),
  it would also rub out modifier characters from within string arguments,
  e.g. `@I/::^EQ^W`.
* Real commands now have their own ^W rubout implementation, while the generic
  fallback just rubs out until the start state is re-established.
  This fails to rub out modifiers as in `@I/^W`, though.
* Real command characters now use the common TECO_DEFINE_STATE_COMMAND().
* Added test cases for CTRL+W rub out.
  A few control characters are now portably available to tests
  via environment variables `$ESCAPE`, `$RUBOUT` and `$RUBOUT_WORD`.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* This was actually broken if the command is preceded by `@` and `:` characters, which
  are __not__ modifiers.
  E.g. `Q:@I/foo^W` would have rubbed out the `:` register as well.
* Also, since it was all done in teco_state_process_edit_cmd(),
  it would also rub out modifier characters from within string arguments,
  e.g. `@I/::^EQ^W`.
* Real commands now have their own ^W rubout implementation, while the generic
  fallback just rubs out until the start state is re-established.
  This fails to rub out modifiers as in `@I/^W`, though.
* Real command characters now use the common TECO_DEFINE_STATE_COMMAND().
* Added test cases for CTRL+W rub out.
  A few control characters are now portably available to tests
  via environment variables `$ESCAPE`, `$RUBOUT` and `$RUBOUT_WORD`.
</pre>
</div>
</content>
</entry>
<entry>
<title>bumped target release to v2.4.0 and updated README and TODO</title>
<updated>2025-03-29T14:25:05+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-03-29T13:53:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c30a8963a2778ce4e1bd73b5fa667a2fff3693f2'/>
<id>c30a8963a2778ce4e1bd73b5fa667a2fff3693f2</id>
<content type='text'>
* Added a test case for the known bug of out-of-place modifiers.
  Well, this is a syntactic shortcoming rather than a true bug.
  But I did run into it more than once.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Added a test case for the known bug of out-of-place modifiers.
  Well, this is a syntactic shortcoming rather than a true bug.
  But I did run into it more than once.
</pre>
</div>
</content>
</entry>
<entry>
<title>added `@W`, `@P`, `@V` and `@Y` command variants</title>
<updated>2025-03-29T12:29:34+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-03-29T12:15:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=ca0d7656b606703f1b5b52e59f0b46ca0038477e'/>
<id>ca0d7656b606703f1b5b52e59f0b46ca0038477e</id>
<content type='text'>
* They swap the default order of skipping characters.
  For positive arguments: first non-word chars, then word chars.
* This is especially useful after executing the non-at-modified versions.
  For instance, at the beginning of a word, `@W` jumps to its end.
  `@V` would delete the remainder of the word.
* Since they have to evaluate the at-modifier, which has syntactic
  significance, the command implementations can no longer use
  transition tables, so they are in the switch-statements instead.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* They swap the default order of skipping characters.
  For positive arguments: first non-word chars, then word chars.
* This is especially useful after executing the non-at-modified versions.
  For instance, at the beginning of a word, `@W` jumps to its end.
  `@V` would delete the remainder of the word.
* Since they have to evaluate the at-modifier, which has syntactic
  significance, the command implementations can no longer use
  transition tables, so they are in the switch-statements instead.
</pre>
</div>
</content>
</entry>
<entry>
<title>added `P` command as a reverse form of `W`</title>
<updated>2025-03-22T12:19:53+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-03-22T12:19:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=ab35f6618bc8beb4543cbc7c62332f82d7d5699c'/>
<id>ab35f6618bc8beb4543cbc7c62332f82d7d5699c</id>
<content type='text'>
* All the movement commands have shortcuts for their negative forms:
  `R` instead of `-C`, `B` instead of `-L`.
  Therefore there was always the need for a `-W` shortcut as well.
* `P` is a good choice because it is a file IO command in TECO-11
  that does not make sense to support.
  In Video TECO it toggles between display windows (i.e. split screens)
  and I do not plan to support multiple windows in SciTECO.
* Added to the cheat sheet.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* All the movement commands have shortcuts for their negative forms:
  `R` instead of `-C`, `B` instead of `-L`.
  Therefore there was always the need for a `-W` shortcut as well.
* `P` is a good choice because it is a file IO command in TECO-11
  that does not make sense to support.
  In Video TECO it toggles between display windows (i.e. split screens)
  and I do not plan to support multiple windows in SciTECO.
* Added to the cheat sheet.
</pre>
</div>
</content>
</entry>
<entry>
<title>harmonized all word-movement and deletion commands: they move/delete until the beginning of words now</title>
<updated>2025-03-22T11:13:53+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-03-22T10:45:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=867d22e419afe769f05ad26b61c6ea5ea1432c3c'/>
<id>867d22e419afe769f05ad26b61c6ea5ea1432c3c</id>
<content type='text'>
* All commands and their documentation were inconsistent.
  * ^W rubbed out to the beginning of words.
  * Shift+Right (fnkeys.tes) moved to the beginning of the next word if
    invoked at the beginning of a word and to the end of the next word otherwise.
  * &lt;W&gt; (and &lt;V&gt; and &lt;Y&gt; by extension) moved to the end of the next word.
  * The cheat sheet would claim that &lt;W&gt; moves to the beginning of the next word.
* Video TECO's &lt;W&gt; command would differ again from everything else.
  With positive arguments, it moved to the beginning of words, while
  with negative arguments it moved to the end of words.
  I decided not to copy this behavior.
* It has been decided to adopt a consistent beginning-of-words policy.
  -W therefore differs from Video TECO in moving to the beginning of the
  current or previous word.
* teco_find_words() is now based on parsing the document pointer, instead
  of relying on SCI_WORDENDPOSITION, since the latter cannot actually be
  used to skip strictly non-word characters.
  This requires a constant number of Scintilla messages, but that is fewer
  messages only when moving by more than 3 words.
* The semantics of &lt;W&gt; are therefore now consistent with Vim and Emacs as well.
* Shift+Right/Left is still based on SCI_WORDENDPOSITION, so its behavior
  differs slightly from &lt;W&gt; for instance at the end of lines, as it will
  stop at linebreaks.
* Unfortunately, these changes will break lots of macros, among others
  the M#rf, M#sp and git.blame macros ("Useful macros" from the wiki).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* All commands and their documentation were inconsistent.
  * ^W rubbed out to the beginning of words.
  * Shift+Right (fnkeys.tes) moved to the beginning of the next word if
    invoked at the beginning of a word and to the end of the next word otherwise.
  * &lt;W&gt; (and &lt;V&gt; and &lt;Y&gt; by extension) moved to the end of the next word.
  * The cheat sheet would claim that &lt;W&gt; moves to the beginning of the next word.
* Video TECO's &lt;W&gt; command would differ again from everything else.
  With positive arguments, it moved to the beginning of words, while
  with negative arguments it moved to the end of words.
  I decided not to copy this behavior.
* It has been decided to adopt a consistent beginning-of-words policy.
  -W therefore differs from Video TECO in moving to the beginning of the
  current or previous word.
* teco_find_words() is now based on parsing the document pointer, instead
  of relying on SCI_WORDENDPOSITION, since the latter cannot actually be
  used to skip strictly non-word characters.
  This requires a constant number of Scintilla messages, but that is fewer
  messages only when moving by more than 3 words.
* The semantics of &lt;W&gt; are therefore now consistent with Vim and Emacs as well.
* Shift+Right/Left is still based on SCI_WORDENDPOSITION, so its behavior
  differs slightly from &lt;W&gt; for instance at the end of lines, as it will
  stop at linebreaks.
* Unfortunately, these changes will break lots of macros, among others
  the M#rf, M#sp and git.blame macros ("Useful macros" from the wiki).
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed rubout of Q-Register specifications</title>
<updated>2025-03-21T10:40:52+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-03-21T10:26:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=33f71654136014bac094babaaa81d91245fdd24c'/>
<id>33f71654136014bac094babaaa81d91245fdd24c</id>
<content type='text'>
* This was a regression introduced by 257a0bf128e109442dce91c4aaa1d97fed17ad1a.
* The undo token that frees newly allocated teco_machine_qregspec_t must actually
  reset the pointer as well since any subsequent token, pushed by teco_undo_qregspec_own(),
  will expect a valid pointer.
* Could have been done via
  ctx-&gt;expectqreg = NULL;
  teco_undo_qregspec_own(ctx-&gt;expectqreg);
  but using a special clear function requires less memory and is easier to understand.
* Added test case. This wouldn't always crash, but should definitely show up in Valgrind.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* This was a regression introduced by 257a0bf128e109442dce91c4aaa1d97fed17ad1a.
* The undo token that frees newly allocated teco_machine_qregspec_t must actually
  reset the pointer as well since any subsequent token, pushed by teco_undo_qregspec_own(),
  will expect a valid pointer.
* Could have been done via
  ctx-&gt;expectqreg = NULL;
  teco_undo_qregspec_own(ctx-&gt;expectqreg);
  but using a special clear function requires less memory and is easier to understand.
* Added test case. This wouldn't always crash, but should definitely show up in Valgrind.
</pre>
</div>
</content>
</entry>
</feed>
