<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sciteco/src/search.c, branch v2.5.2</title>
<subtitle>Scintilla-based Text Editor and COrrector</subtitle>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/'/>
<entry>
<title>updated copyright to 2026</title>
<updated>2026-01-01T06:59:49+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-01-01T06:59:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c2feb2a6f71fc9adb20226fb3c2260c236e974e0'/>
<id>c2feb2a6f71fc9adb20226fb3c2260c236e974e0</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>teco_string_t is now passed by value like a scalar if the callee isn't expected to modify it</title>
<updated>2025-12-28T19:57:31+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2025-12-28T15:23:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=ea0a23645f03a42252ab1ce8df45ae4076ebae75'/>
<id>ea0a23645f03a42252ab1ce8df45ae4076ebae75</id>
<content type='text'>
* When passing a struct that should not be modified, I usually use a const pointer.
* Strings, however, are small 2-word objects, and they are often already passed via separate
  `gchar*` and gsize parameters. So it is consistent to pass teco_string_t by value as well.
  A teco_string_t will usually fit into registers just like a pointer.
* It's now obvious which function just _uses_ and which function _modifies_ a string.
  There is also no chance to pass a NULL pointer to those functions.
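<![CDATA[

A minimal sketch of this convention, not code from this commit: example_string_t is a
hypothetical stand-in for teco_string_t, and both functions are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for teco_string_t: a pointer plus a length (two words). */
typedef struct {
	char *data;
	size_t len;
} example_string_t;

/*
 * Read-only use: the struct is passed by value, so the callee cannot change
 * the caller's pointer or length, and there is no struct pointer that could be NULL.
 */
size_t
example_string_count(example_string_t str, char chr)
{
	size_t n = 0;

	for (size_t i = 0; i < str.len; i++)
		n += str.data[i] == chr;
	return n;
}

/* Modifying use: the caller's struct is changed, so it is passed by pointer. */
void
example_string_truncate(example_string_t *str, size_t len)
{
	assert(str != NULL);
	if (len < str->len)
		str->len = len;
}
```

A caller would write example_string_count(str, 'o') for the read-only case and
example_string_truncate(&str, 5) when the string is meant to change, so the
call site already shows which of the two is happening.
]]>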
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* When passing a struct that should not be modified, I usually use a const pointer.
* Strings, however, are small 2-word objects, and they are often already passed via separate
  `gchar*` and gsize parameters. So it is consistent to pass teco_string_t by value as well.
  A teco_string_t will usually fit into registers just like a pointer.
* It's now obvious which function just _uses_ and which function _modifies_ a string.
  There is also no chance to pass a NULL pointer to those functions.
</pre>
</div>
</content>
</entry>
<entry>
<title>TECO_DEFINE_STATE() no longer constructs callback names for mandatory callbacks, but tries to use static assertions</title>
<updated>2025-12-26T17:10:42+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2025-12-26T17:10:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c2114fa0af73b42bc1ef302f7511ef87690cc0b1'/>
<id>c2114fa0af73b42bc1ef302f7511ef87690cc0b1</id>
<content type='text'>
* Requiring state callbacks by generating their names (e.g. NAME##_input) has several disadvantages:
  * The callback is not explicitly referenced when the state is defined.
    So an uninitiated reader will see some static function, which is not referenced anywhere and still
    doesn't cause "unused" warnings.
  * You cannot freely choose the name of the function that implements the callback.
  * In "substates" you need to generate a callback function if you want to provide a default.
    You also need to provide dummy wrapper functions whenever you want to reuse some existing
    function as the implementation.
* Instead, we are now using static assertions to check whether certain callbacks have been
  implemented.
  Unfortunately, this does not work on all compilers. In particular GCC won't consider
  references to state objects fully constant (even though they are) and does not allow
  them in _Static_assert (G_STATIC_ASSERT). This could only be made to work in newer GCC
  with -std=c2x or -std=gnu23 in combination with constexpr.
  It does work on Clang, though.
  So I introduced TECO_ASSERT_SAFE() which also passes if the expression is *not* constant.
  These static assertions are not crucial - they do not check anything that can differ between
  systems. So we can always rely on the checks performed by FreeBSD CI for instance.
  Also, you will of course quickly notice missing callbacks at runtime - with and without
  additional runtime assertions.
* All mandatory callbacks must still be explicitly initialized in the TECO_DEFINE_STATE calls.
* After getting rid of generated callback implementations, the TECO_DEFINE_STATE macros
  can finally be qualified with `static`.
* The TECO_DECLARE_STATE() macro has been removed. It no longer abstracts anything
  and cannot be used to declare static teco_state_t anyway.
  TECO_DEFINE_UNDO_CALL() doesn't have a DECLARE counterpart either.
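<![CDATA[

A rough sketch of the difference between generated callback names and explicit
initialization. Everything here is hypothetical and heavily simplified: example_state_t
is not teco_state_t, the EXAMPLE_ macros are not the real TECO_DEFINE_STATE macros,
and the completeness check is done at runtime instead of with TECO_ASSERT_SAFE().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified state type. */
typedef struct {
	int (*input_cb)(int chr);	/* mandatory */
	void (*done_cb)(void);		/* optional */
} example_state_t;

/*
 * Old style: the callback name is derived from the state name by token pasting,
 * so the function is never mentioned where the state is defined.
 */
#define EXAMPLE_DEFINE_STATE_GENERATED(NAME) \
	example_state_t NAME = {.input_cb = NAME##_input}

/*
 * New style: callbacks are passed as explicit designated initializers,
 * so any existing function can be reused and the definition can be static.
 */
#define EXAMPLE_DEFINE_STATE(NAME, ...) \
	example_state_t NAME = {__VA_ARGS__}

/* Shared implementation, named independently of any state. */
int
example_count_input(int chr)
{
	return chr == '$';
}

/* Dummy wrapper needed only by the old style to satisfy the generated name. */
int
example_old_input(int chr)
{
	return example_count_input(chr);
}

EXAMPLE_DEFINE_STATE_GENERATED(example_old);
static EXAMPLE_DEFINE_STATE(example_new, .input_cb = example_count_input);

/* Runtime stand-in for what the static assertion verifies at compile time. */
int
example_state_is_complete(const example_state_t *state)
{
	return state->input_cb != NULL;
}
```

With the explicit form, example_count_input is visibly referenced in the state
definition and no wrapper function is required.
]]>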
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Requiring state callbacks by generating their names (e.g. NAME##_input) has several disadvantages:
  * The callback is not explicitly referenced when the state is defined.
    So an uninitiated reader will see some static function, which is not referenced anywhere and still
    doesn't cause "unused" warnings.
  * You cannot freely choose the name of the function that implements the callback.
  * In "substates" you need to generate a callback function if you want to provide a default.
    You also need to provide dummy wrapper functions whenever you want to reuse some existing
    function as the implementation.
* Instead, we are now using static assertions to check whether certain callbacks have been
  implemented.
  Unfortunately, this does not work on all compilers. In particular GCC won't consider
  references to state objects fully constant (even though they are) and does not allow
  them in _Static_assert (G_STATIC_ASSERT). This could only be made to work in newer GCC
  with -std=c2x or -std=gnu23 in combination with constexpr.
  It does work on Clang, though.
  So I introduced TECO_ASSERT_SAFE() which also passes if the expression is *not* constant.
  These static assertions are not crucial - they do not check anything that can differ between
  systems. So we can always rely on the checks performed by FreeBSD CI for instance.
  Also, you will of course quickly notice missing callbacks at runtime - with and without
  additional runtime assertions.
* All mandatory callbacks must still be explicitly initialized in the TECO_DEFINE_STATE calls.
* After getting rid of generated callback implementations, the TECO_DEFINE_STATE macros
  can finally be qualified with `static`.
* The TECO_DECLARE_STATE() macro has been removed. It no longer abstracts anything
  and cannot be used to declare static teco_state_t anyway.
  TECO_DEFINE_UNDO_CALL() doesn't have a DECLARE counterpart either.
</pre>
</div>
</content>
</entry>
<entry>
<title>TECO_DEFINE_STATE_INSERT() no longer generates a done_cb</title>
<updated>2025-12-26T00:04:11+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2025-12-26T00:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=d7330f252e6b0a1326eac6b5fc0b219a7b706eb7'/>
<id>d7330f252e6b0a1326eac6b5fc0b219a7b706eb7</id>
<content type='text'>
This caused problems in teco_state_replace_default_insert, where we had to
override the done_cb.

Perhaps we should avoid all generated callback names (i.e. mandatory
callback implementations)?
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This caused problems in teco_state_replace_default_insert, where we had to
override the done_cb.

Perhaps we should avoid all generated callback names (i.e. mandatory
callback implementations)?
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed ^S/^Y for search-replacement commands</title>
<updated>2025-12-25T21:55:32+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2025-12-25T21:55:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=44166f53d5923be4685a69b85166ada40dc1cc10'/>
<id>44166f53d5923be4685a69b85166ada40dc1cc10</id>
<content type='text'>
It was returning the range of the search, but not of the inserted text.
Since the searched text is deleted, the range of the insertion is more useful.
It's also what was documented and what DEC TECO does.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It was returning the range of the search, but not of the inserted text.
Since the searched text is deleted, the range of the insertion is more useful.
It's also what was documented and what DEC TECO does.
</pre>
</div>
</content>
</entry>
<entry>
<title>properly document some functions in expressions.c and simplified code</title>
<updated>2025-07-26T13:48:56+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-07-26T13:30:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=0ea082b74414696a7800455a437656fca2886f6d'/>
<id>0ea082b74414696a7800455a437656fca2886f6d</id>
<content type='text'>
* Practically all calls to teco_expressions_args() must be preceded by teco_expressions_eval().
* In code paths where we know that teco_expressions_args() &gt; 0, it is safe
  to call teco_expressions_pop_num(0) instead of teco_expressions_pop_num_calc().
  This is both easier and faster.
* teco_expressions_pop_num_calc() is for simple applications where you just want to get
  a command argument with default (implied) values.
  Since it includes teco_expressions_eval(), we can avoid superfluous calls.
* -EC...$ turned out to be broken and is fixed now.
  A test case has been added.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Practically all calls to teco_expressions_args() must be preceded by teco_expressions_eval().
* In code paths where we know that teco_expressions_args() &gt; 0, it is safe
  to call teco_expressions_pop_num(0) instead of teco_expressions_pop_num_calc().
  This is both easier and faster.
* teco_expressions_pop_num_calc() is for simple applications where you just want to get
  a command argument with default (implied) values.
  Since it includes teco_expressions_eval(), we can avoid superfluous calls.
* -EC...$ turned out to be broken and is fixed now.
  A test case has been added.
</pre>
</div>
</content>
</entry>
<entry>
<title>revised command topics</title>
<updated>2025-07-18T13:37:13+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-07-18T13:37:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=2c236869333dd20b77109fe7e9bb4ace30c0f774'/>
<id>2c236869333dd20b77109fe7e9bb4ace30c0f774</id>
<content type='text'>
* Added some keywords.
* Consistently added command variants with all modifiers.
  In principle including modifiers in the topics is unnecessary -
  you can always strip the modifiers and look up the raw command.
  However, looking up a command with modifiers can speed up the process
  (compare looking up ?S&lt;TAB&gt; vs ?::S&lt;TAB&gt;).
* The `@` modifier is listed only for commands without string arguments.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Added some keywords.
* Consistently added command variants with all modifiers.
  In principle including modifiers in the topics is unnecessary -
  you can always strip the modifiers and look up the raw command.
  However, looking up a command with modifiers can speed up the process
  (compare looking up ?S&lt;TAB&gt; vs ?::S&lt;TAB&gt;).
* The `@` modifier is listed only for commands without string arguments.
</pre>
</div>
</content>
</entry>
<entry>
<title>^S/^Y calculates the glyph offsets earlier</title>
<updated>2025-06-08T18:54:29+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-06-08T18:54:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=aaa1d51a4c85fcc627e88ef7cf5292d9c5f5f840'/>
<id>aaa1d51a4c85fcc627e88ef7cf5292d9c5f5f840</id>
<content type='text'>
* Previously, deleting text after a text match or insertion
  could result in wrong ^S/^Y results.
  In particular, the number of characters deleted by &lt;FD&gt; at the end of a buffer
  couldn't be queried.
* This also fixes the M#rf (reflow paragraph) macro.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Previously, deleting text after a text match or insertion
  could result in wrong ^S/^Y results.
  In particular, the number of characters deleted by &lt;FD&gt; at the end of a buffer
  couldn't be queried.
* This also fixes the M#rf (reflow paragraph) macro.
</pre>
</div>
</content>
</entry>
<entry>
<title>added &lt;FN&gt; as a search-and-replace variant of &lt;N&gt;</title>
<updated>2025-06-07T10:58:39+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-06-07T10:58:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=59d1188feb5c037eeffe6ba782ab362d2bb46a2d'/>
<id>59d1188feb5c037eeffe6ba782ab362d2bb46a2d</id>
<content type='text'>
* This is not in Video TECO, but TECO-11 has a search-and-replace variant of &lt;N&gt;.
  &lt;N&gt; however is a search-over-page-boundary command in TECO-11, which has been repurposed
  as search-over-buffer-boundary in Video TECO and SciTECO.
* &lt;N&gt; and &lt;FN&gt; no longer call the edit hook after *every* invocation, but only
  if the current buffer changes. This is not really relevant with the current
  default hook from fallback.teco_ini, but might be, depending on the use case.
* Added test cases for both &lt;N&gt; and &lt;FN&gt;.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* This is not in Video TECO, but TECO-11 has a search-and-replace variant of &lt;N&gt;.
  &lt;N&gt; however is a search-over-page-boundary command in TECO-11, which has been repurposed
  as search-over-buffer-boundary in Video TECO and SciTECO.
* &lt;N&gt; and &lt;FN&gt; no longer call the edit hook after *every* invocation, but only
  if the current buffer changes. This is not really relevant with the current
  default hook from fallback.teco_ini, but might be, depending on the use case.
* Added test cases for both &lt;N&gt; and &lt;FN&gt;.
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed undoing bitfields on Windows</title>
<updated>2025-04-12T22:33:43+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2025-04-12T21:40:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=628c73d984fd7663607cc3fd9368f809855906fd'/>
<id>628c73d984fd7663607cc3fd9368f809855906fd</id>
<content type='text'>
* It turns out that `bool` (_Bool) in bitfields may cause
  padding to the next 32-bit word.
  This was only observed on MinGW.
  I am not entirely sure why, although the C standard does
  not guarantee much with regard to bitfield memory layout
  and there are 64 bits available due to padding anyway.
  Actually, they could also be laid out in a different order.
* I am now consistently using guint instead of `bool` in bitfields
  to prevent any potential surprises.
* The way that guint was aliased with bitfield structs
  for undoing teco_machine_main_t and teco_machine_qregspec_t flags
  was therefore unsafe.
  It was not guaranteed that the __flags field really "captures"
  all of the bit field.
  Even with `guint v : 1` fields, this was not guaranteed.
  We would have required a static assertion for robustness.
  Alternatively, we could have declared a `gsize __flags` variable
  as well. This __should__ be safe since gsize should always be
  pointer sized and correspond to the platform's alignment.
  However, it's also not 100% guaranteed.
  Using classic ANSI C enums with bit operations to encode multiple
  fields and flags into a single integer also doesn't look very
  attractive.
* Instead, we now define scalar types with their own teco_undo_push()
  shortcuts for the bitfield structs.
  This is in one way simpler and much more robust, but on the other
  hand complicates access to the flag variables.
* It's a good question whether a `struct __attribute__((packed))` bitfield
  with guint fields would be a reliable replacement for flag enums that
  are communicated with the "outside" (TECO) world.
  I am not going to risk it until GCC gives any guarantees, though.
  For the time being, bitfields are only used internally where
  the concrete memory layout (bit positions) is not crucial.
* This fixes the test suite and therefore probably CI and nightly
  builds on Windows.
* Test case: Rub out `@I//` or `@Xq` until before the `@`.
  The parser doesn't know that `@` is still set and allows
  all sorts of commands where `@` should be forbidden.
* It's unknown how long this has been broken on Windows - quite
  possibly since v2.0.
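<![CDATA[

A small sketch of the two ideas above, under assumed names: example_flags_t and the
example_undo_ functions are hypothetical and much simpler than the real teco_undo_push()
machinery. The static assertion is the kind of check the integer-aliasing approach would
have needed. The undo record sidesteps the question by storing a copy of the whole struct.

```c
#include <assert.h>

/* Hypothetical flags; unsigned int fields instead of bool avoid mixed-type padding. */
typedef struct {
	unsigned int modifier_at : 1;
	unsigned int modifier_colon : 2;
} example_flags_t;

/* Aliasing the struct with a single integer is only sound if this holds. */
_Static_assert(sizeof(example_flags_t) <= sizeof(unsigned int),
               "integer alias would not capture all bitfields");

/* Layout-independent alternative: the undo record stores a copy of the struct. */
typedef struct {
	example_flags_t *target;
	example_flags_t saved;
} example_undo_t;

/* Remember the current flags so that they can be restored later. */
example_undo_t
example_undo_push(example_flags_t *flags)
{
	example_undo_t undo = {.target = flags, .saved = *flags};

	return undo;
}

/* Restore the flags to the values they had when the record was created. */
void
example_undo_run(const example_undo_t *undo)
{
	*undo->target = undo->saved;
}
```

Copying the struct never depends on bit positions or padding, which is why it stays
correct even if a compiler lays out the fields differently.
]]>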
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* It turns out that `bool` (_Bool) in bitfields may cause
  padding to the next 32-bit word.
  This was only observed on MinGW.
  I am not entirely sure why, although the C standard does
  not guarantee much with regard to bitfield memory layout
  and there are 64 bits available due to padding anyway.
  Actually, they could also be laid out in a different order.
* I am now consistently using guint instead of `bool` in bitfields
  to prevent any potential surprises.
* The way that guint was aliased with bitfield structs
  for undoing teco_machine_main_t and teco_machine_qregspec_t flags
  was therefore unsafe.
  It was not guaranteed that the __flags field really "captures"
  all of the bit field.
  Even with `guint v : 1` fields, this was not guaranteed.
  We would have required a static assertion for robustness.
  Alternatively, we could have declared a `gsize __flags` variable
  as well. This __should__ be safe since gsize should always be
  pointer sized and correspond to the platform's alignment.
  However, it's also not 100% guaranteed.
  Using classic ANSI C enums with bit operations to encode multiple
  fields and flags into a single integer also doesn't look very
  attractive.
* Instead, we now define scalar types with their own teco_undo_push()
  shortcuts for the bitfield structs.
  This is in one way simpler and much more robust, but on the other
  hand complicates access to the flag variables.
* It's a good question whether a `struct __attribute__((packed))` bitfield
  with guint fields would be a reliable replacement for flag enums that
  are communicated with the "outside" (TECO) world.
  I am not going to risk it until GCC gives any guarantees, though.
  For the time being, bitfields are only used internally where
  the concrete memory layout (bit positions) is not crucial.
* This fixes the test suite and therefore probably CI and nightly
  builds on Windows.
* Test case: Rub out `@I//` or `@Xq` until before the `@`.
  The parser doesn't know that `@` is still set and allows
  all sorts of commands where `@` should be forbidden.
* It's unknown how long this has been broken on Windows - quite
  possibly since v2.0.
</pre>
</div>
</content>
</entry>
</feed>
