<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sciteco/tests, branch master</title>
<subtitle>Scintilla-based Text Editor and COrrector</subtitle>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/'/>
<entry>
<title>implemented the ^~ pattern match construct: the rest of the pattern will be an Advanced Regular Expression</title>
<updated>2026-06-28T22:32:13+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-06-28T22:15:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=3522966d9584ec16e2f469acd0fe8727857a9d25'/>
<id>3522966d9584ec16e2f469acd0fe8727857a9d25</id>
<content type='text'>
* Allows searching by regular expressions.
  We will never support all ARE constructs in TECO patterns, so this is useful to have available.
* Can only be typed upcaret.
  This leaves ^E~q available as an escape-regexp string building construct.
* Once we replace the pattern2regexp converter with a custom terex lexer,
  we might want to restrict ^~ to the beginning of the pattern.
  Currently, however, it can be anywhere, so you can mix TECO patterns with regular expressions.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Allows searching by regular expressions.
  We will never support all ARE constructs in TECO patterns, so this is useful to have available.
* Can only be typed upcaret.
  This leaves ^E~q available as an escape-regexp string building construct.
* Once we replace the pattern2regexp converter with a custom terex lexer,
  we might want to restrict ^~ to the beginning of the pattern.
  Currently, however, it can be anywhere, so you can mix TECO patterns with regular expressions.
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed block-wise backwards searches</title>
<updated>2026-06-28T15:30:15+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-06-28T15:30:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=f08dea5fead2f9ef9e0fa114b2e09aa94908d629'/>
<id>f08dea5fead2f9ef9e0fa114b2e09aa94908d629</id>
<content type='text'>
The calculation of the block start was faulty and could cause underflows
resulting in unpredictable behavior.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The calculation of the block start was faulty and could cause underflows
resulting in unpredictable behavior.
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed ^EGq (character class) pattern construct for embedded null bytes and `-`</title>
<updated>2026-06-28T11:44:41+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-06-28T11:44:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=7bd7bdad687e5f790afda6f0f22444f3a169a6b1'/>
<id>7bd7bdad687e5f790afda6f0f22444f3a169a6b1</id>
<content type='text'>
This was using g_regex_escape_string() which always translates a null byte
to `\0`, which is ambiguous if followed by other digits, so a null byte followed
by a digit would result in a wrong regular expression.
Actually the same could happen outside of character classes, i.e. `@S/^@1/` was also broken.
Also it does not escape `-`, so the result cannot be used in character classes.
This is fixed now in a new custom implementation teco_regex_escape().
Once we move to a custom terex lexer, we won't need any of this, of course,
unless we want to provide a regex escaping string building construct.

We are now completely free of GRegex.
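
The ambiguity can be reproduced outside of SciTECO. The following is a minimal
sketch in Python, using Python's `re` module as a stand-in for the regexp engine
(it also reads `\0` followed by digits as a single octal escape).
The names naive_escape() and safe_escape() are hypothetical and this is not the
code of teco_regex_escape(); emitting the null byte as a fixed-width `\x00`
is just one possible fix and an assumption of this sketch.

```python
import re

# Characters escaped with a backslash. Including "-" makes the result usable
# inside a bracket expression. The exact set is an assumption of this sketch.
SPECIAL = set(".^$*+?()[]{}|\\-")
SPECIAL_WITHOUT_DASH = SPECIAL - {"-"}

def naive_escape(text):
    """Mimic the problematic behaviour: NUL becomes \\0 and "-" is kept as is."""
    out = []
    for ch in text:
        if ch == "\0":
            out.append("\\0")        # ambiguous if a digit follows
        elif ch in SPECIAL_WITHOUT_DASH:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)

def safe_escape(text):
    """Escape NUL with a fixed-width hex escape and also escape "-"."""
    out = []
    for ch in text:
        if ch == "\0":
            out.append("\\x00")      # exactly two hex digits, cannot absorb "1"
        elif ch in SPECIAL:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)

subject = "\0" + "1"                 # a null byte followed by a digit
print(re.fullmatch(naive_escape(subject), subject))   # no match
print(re.fullmatch(safe_escape(subject), subject))    # matches
```

The same helpers show the `-` problem: `"[" + naive_escape("a-c") + "]"` is a
range that also matches `b`, while the safe variant only matches the three
literal characters.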
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was using g_regex_escape_string() which always translates a null byte
to `\0`, which is ambiguous if followed by other digits, so a null byte followed
by a digit would result in a wrong regular expression.
Actually the same could happen outside of character classes, i.e. `@S/^@1/` was also broken.
Also it does not escape `-`, so the result cannot be used in character classes.
This is fixed now in a new custom implementation teco_regex_escape().
Once we move to a custom terex lexer, we won't need any of this, of course,
unless we want to provide a regex escaping string building construct.

We are now completely free of GRegex.
</pre>
</div>
</content>
</entry>
<entry>
<title>terex is the new regular expression engine now and replaces PCRE (GRegex)</title>
<updated>2026-06-27T22:39:51+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-06-27T22:39:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=4fe5bc6f3867096965270c90f2e1e5df77b8825f'/>
<id>4fe5bc6f3867096965270c90f2e1e5df77b8825f</id>
<content type='text'>
* terex is based on Henry Spencer's regular expression engine for Tcl.
  It is a hybrid NFA/DFA design which has better worst-case runtimes than
  the backtracking PCRE. Memory usage is also limited and can no longer
  increase catastrophically.
* It should no longer be possible to crash SciTECO with pathological
  searches.
* Since it reliably supports partial matches (REG_EXPECT) we can
  now enable the new backwards-search algorithm by default.
  This used to be broken because of a glib bug, which I already
  fixed. It would however take a long time until this ends up
  on the majority of glib installations.
* Regexp executions can still be quite slow if you are looking
  for a pattern at the end of a huge file, which can hang the editor,
  but this can now at least theoretically be solved by adding
  hooks into terex to poll for interruptions.
* We can now also get rid of a TECO-pattern to regexp translation
  step by directly generating terex tokens (TODO).
* Performance-wise, terex appears to be slower than PCRE for simple
  forward searches even when linking everything with optimizations (FIXME).
* Having a stand-alone regular expression engine is also a huge
  step in getting rid of glib.

See also: https://git.fmsbw.de/terex/about/
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* terex is based on Henry Spencer's regular expression engine for Tcl.
  It is a hybrid NFA/DFA design which has better worst-case runtimes than
  the backtracking PCRE. Memory usage is also limited and can no longer
  increase catastrophically.
* It should no longer be possible to crash SciTECO with pathological
  searches.
* Since it reliably supports partial matches (REG_EXPECT) we can
  now enable the new backwards-search algorithm by default.
  This used to be broken because of a glib bug, which I already
  fixed. It would however take a long time until this ends up
  on the majority of glib installations.
* Regexp executions can still be quite slow if you are looking
  for a pattern at the end of a huge file, which can hang the editor,
  but this can now at least theoretically be solved by adding
  hooks into terex to poll for interruptions.
* We can now also get rid of a TECO-pattern to regexp translation
  step by directly generating terex tokens (TODO).
* Performance-wise, terex appears to be slower than PCRE for simple
  forward searches even when linking everything with optimizations (FIXME).
* Having a stand-alone regular expression engine is also a huge
  step in getting rid of glib.

See also: https://git.fmsbw.de/terex/about/
</pre>
</div>
</content>
</entry>
<entry>
<title>monkey-test: use `I^P` instead of old `EI` (renamed in v2.5.0)</title>
<updated>2026-06-14T09:32:18+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-06-14T09:32:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c5cb45fab6d4a63a4fcff2cf7f6801dae2ac4db2'/>
<id>c5cb45fab6d4a63a4fcff2cf7f6801dae2ac4db2</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>implemented but disabled block-wise backwards search algorithm</title>
<updated>2026-05-31T19:19:24+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-05-31T19:19:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=aa7b0bb1445feeefafdcf47fd639b10500b45c03'/>
<id>aa7b0bb1445feeefafdcf47fd639b10500b45c03</id>
<content type='text'>
* The block-wise search algorithm allows for efficient backwards searches
  on large files.
* On the downside, the results are not entirely symmetric to forward searches.
  It therefore makes sense to still support the old correct but possibly slow
  algorithm.
  Since the old algorithm is just a special case of the new one (with a single
  block stretching the entire search range), you can configure the block size
  using `8EJ`.
* Unfortunately, the new block-wise algorithm won't work due to a bug in GRegex
  (only in the glib wrapper code).
  It is therefore disabled for the time being by default and will probably
  only be enabled once we switch to a new regexp engine.
  See https://gitlab.gnome.org/GNOME/glib/-/merge_requests/5199
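
The idea can be illustrated with a minimal sketch in Python. It is not the
SciTECO implementation (which relies on partial matches in the regexp engine);
the function name search_backwards() and its parameters are hypothetical.
Matches must start inside the current block but may extend up to the end of
the search range.

```python
import re

def search_backwards(pattern, text, end, block_size):
    """Return the last match starting before `end`, scanning block by block."""
    regex = re.compile(pattern)
    block_end = end
    while block_end > 0:
        # Clamp at 0 so the block start can never become negative.
        block_start = max(0, block_end - block_size)
        last = None
        for match in regex.finditer(text, block_start, end):
            if match.start() >= block_end:
                break                # starts in a block that was already scanned
            last = match
        if last is not None:
            return last
        block_end = block_start      # continue with the preceding block
    return None

text = "aaaa"
# A single block spanning the whole range behaves like the old algorithm.
print(search_backwards("a+", text, len(text), len(text)).start())
# Smaller blocks can report a different match start.
print(search_backwards("a+", text, len(text), 2).start())
```

With a single block the match starts at 0, with a block size of 2 it starts
at 2, which is the kind of asymmetry mentioned above.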
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* The block-wise search algorithm allows for efficient backwards searches
  on large files.
* On the downside, the results are not entirely symmetric to forward searches.
  It therefore makes sense to still support the old correct but possibly slow
  algorithm.
  Since the old algorithm is just a special case of the new one (with a single
  block stretching the entire search range), you can configure the block size
  using `8EJ`.
* Unfortunately, the new block-wise algorithm won't work due to a bug in GRegex
  (only in the glib wrapper code).
  It is therefore disabled for the time being by default and will probably
  only be enabled once we switch to a new regexp engine.
  See https://gitlab.gnome.org/GNOME/glib/-/merge_requests/5199
</pre>
</div>
</content>
</entry>
<entry>
<title>fixed test suite on OBS builds for Ubuntu 24.04</title>
<updated>2026-04-17T10:17:44+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-04-17T10:17:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=65118ebb971a5b82b3f5e20acdf60115416610c5'/>
<id>65118ebb971a5b82b3f5e20acdf60115416610c5</id>
<content type='text'>
* The GTK version logs additional warnings, so we cannot
  match verbatim against stderr.
  Instead we only look for a line beginning with `Warning:` or `Error:`.
* We now also test info messages (`1^A`).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* The GTK version logs additional warnings, so we cannot
  match verbatim against stderr.
  Instead we only look for a line beginning with `Warning:` or `Error:`.
* We now also test info messages (`1^A`).
</pre>
</div>
</content>
</entry>
<entry>
<title>`^A` now accepts an optional integer to specify the message severity</title>
<updated>2026-04-14T21:19:45+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-04-13T23:16:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=34af154e92383161666751ca69a288c98f5cca60'/>
<id>34af154e92383161666751ca69a288c98f5cca60</id>
<content type='text'>
* I.e. you can now log warnings and errors from SciTECO code as well.
* We do not need a version of ^A accepting code points, since this is
  supported by ^T already.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* I.e. you can now log warnings and errors from SciTECO code as well.
* We do not need a version of ^A accepting code points, since this is
  supported by ^T already.
</pre>
</div>
</content>
</entry>
<entry>
<title>testsuite: added ^ES test case</title>
<updated>2026-03-10T15:26:03+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-03-10T15:26:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=b8d8d5f87cbe9a4eec4ac410777c716e557b5466'/>
<id>b8d8d5f87cbe9a4eec4ac410777c716e557b5466</id>
<content type='text'>
In TECO-11, pattern matching constructs do not allow backtracking,
while the PCREs currently generated do allow backtracking.
This would be easy to fix, but there should also be constructs
to re-enable the backtracking semantics.
I left only a Known Bug test case for the time being.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In TECO-11, pattern matching constructs do not allow backtracking,
while the PCREs currently generated do allow backtracking.
This would be easy to fix, but there should also be constructs
to re-enable the backtracking semantics.
I left only a Known Bug test case for the time being.
</pre>
</div>
</content>
</entry>
<entry>
<title>`-$$` and `-^C` always return -1 now instead of passing down the prefix sign</title>
<updated>2026-02-22T21:50:53+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>rhaberkorn@fmsbw.de</email>
</author>
<published>2026-02-22T21:50:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=79c148f9779201c48b0e87d403499545f3ed9a3f'/>
<id>79c148f9779201c48b0e87d403499545f3ed9a3f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
