author Robin Haberkorn <rhaberkorn@fmsbw.de> 2026-05-31 22:02:18 +0200
committer Robin Haberkorn <rhaberkorn@fmsbw.de> 2026-05-31 22:02:18 +0200
commit d03667b609c91a18fe975686b8519a2599138dc3
tree 7c1d283085eaa8fb2bf7b277182ff9a6f3776ce3
parent aa7b0bb1445feeefafdcf47fd639b10500b45c03
-rw-r--r-- TODO 39
1 files changed, 34 insertions, 5 deletions
diff --git a/TODO b/TODO
index 9846c07..6588fe7 100644
--- a/TODO
+++ b/TODO
@@ -74,12 +74,23 @@ Known Bugs:
and b) the file mode and ownership of re-created files can be preserved.
We should fall back silently to an (inefficient) memory copy or temporary
file strategy if this is detected.
- * Crashes on large files: S^EM^X$ (regexp: (.)+)
+ * All backward searches from the end of excessively large files can be very
+ slow, especially in UTF mode, since you are always producing
+ all matches over the entire document.
+ Perhaps scan in 4 KiB blocks from dot upwards, but with partial matches.
+ A partial match means that the match falls on a block boundary, so
+ we can extend the scanned area downwards until dot.
+ This currently doesn't work with glib's regexp (PCRE) since
+ g_match_info_fetch_pos() handles partial matches like errors.
+ Here's an upstream merge request to fix that:
+ https://gitlab.gnome.org/GNOME/glib/-/merge_requests/5199
+ * Crashes on large files: S^EM^X$ (regexp: (?:.)+)
Happens because the Glib regex engine is based on a recursive (backtracking)
- Perl regex library.
+ Perl regex library and glib doesn't expose pcre_extra.
+ We could include `(*LIMIT_RECURSION=d)` in the pattern, though.
I can provoke the problem only on Ubuntu 20.04.
- This is apparently impossible to fix as long as we do not
- have control over the regex engine build.
+ We can try g_regex_match_all_full() which will use a DFA, but
+ it doesn't capture subexpressions.
We need something based on a non-backtracking Thompson's NFA with Unicode (UTF-8), see
https://swtch.com/~rsc/regexp/
Basically only RE2 would check all the boxes.
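
The `(*LIMIT_RECURSION=d)` idea from the hunk above could be sketched as follows.
This is only a sketch under assumptions: `pattern_with_recursion_limit()` is a
hypothetical helper, not an existing SciTECO function, and it assumes
NUL-terminated patterns. As far as I know, PCRE only honours such an item at the
very start of the pattern and only lets it lower a limit, never raise it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of SciTECO):
 * returns a newly allocated copy of `pattern` with a PCRE
 * recursion limit item prepended, or NULL on failure.
 * The caller releases the result with free().
 */
static char *
pattern_with_recursion_limit(const char *pattern, unsigned long limit)
{
	static const char fmt[] = "(*LIMIT_RECURSION=%lu)%s";
	int len;
	char *ret;

	assert(pattern != NULL);

	/* The first pass only measures the required length. */
	len = snprintf(NULL, 0, fmt, limit, pattern);
	if (len < 0)
		return NULL;

	ret = malloc((size_t)len + 1);
	if (ret == NULL)
		return NULL;

	/* The second pass writes the prefixed pattern. */
	snprintf(ret, (size_t)len + 1, fmt, limit, pattern);
	return ret;
}
```

The result would be handed to the regex compiler in place of the original
pattern. The intent is that an overly deep match fails with an error instead of
overflowing the C stack; a sensible limit value would have to be found by
experiment.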
@@ -88,7 +99,14 @@ Known Bugs:
re2 should be an optional dependency, so we can still build against the
glib APIs.
Optionally, I could build a PCRE-compatible wrapper for Rust's regex crate.
- It would also be possible to port hxrex to UTF-8 and add it as a submodule.
+ It would also be possible to port one of Henry Spencer's engines (hxrex or its
+ PostgreSQL derivation or the version from Vim) to UTF-8 and add it as a submodule.
+ * It is still possible to hang searches on huge files since a single match
+ could still scan too much memory - e.g. try searching for a word that
+ occurs only at the end of the huge file.
+ Can probably be avoided by including `(*LIMIT_MATCH=d)` in the pattern.
+ A new regexp engine should also allow interruptions within a single match,
+ so we don't have to invent limits like that.
* It is still possible to crash SciTECO using recursive functions,
since they map to the C program's call stack.
It is perhaps best to use another stack of
@@ -220,6 +238,10 @@ Known Bugs:
All blocking operations must be within an event loop and call into
teco_interface_is_interrupted() to potentially drive the UI and
detect CTRL+C presses.
+ * When adding the OBS repo to Ubuntu, Synaptic showed
+ sciteco-curses:s390x packages on an amd64 machine.
+ The package could be installed without problems, though.
+ Probably a Synaptic bug.
Features:
* Should we support *.sgml files with the HTML lexer?
@@ -241,6 +263,7 @@ Features:
* opener.tes should try to center the opened line
(SCI_SETFIRSTVISIBLELINE).
However, this would require a new ED hook, so we can query SCI_LINESONSCREEN.
+ * opener.tes: support *not* opening the .teco_session.
* Rubout of SCI_GOTOPOS could also restore the vertical
scrolling position (SCI_SETFIRSTVISIBLELINE).
So e.g. rubbing out ZJ restores the exact view.
@@ -823,6 +846,8 @@ Features:
* :^W should perhaps inhibit the caret scrolling.
When using ^W purely as a wait command, this could be undesirable.
Also, it is undesirable for some animations (e.g. shooting in tank.tes).
+ * New ED option: use fsync() when writing files.
+ Could be useful on systems that crash frequently.
Optimizations:
* Use SC_DOCUMENTOPTION_STYLES_NONE in batch mode.
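
The fsync() item from the hunk above could look roughly like this on POSIX
systems. It is a minimal sketch, not SciTECO's actual save path:
`save_with_fsync()`, the open flags and the file mode are assumptions made for
illustration. It also does not sync the containing directory, which may be
needed for newly created files to survive a crash.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of SciTECO):
 * writes `len` bytes to `path` and asks the kernel to flush them
 * to stable storage before reporting success.
 * Returns 0 on success and -1 on error (with errno set).
 */
static int
save_with_fsync(const char *path, const void *data, size_t len)
{
	const char *p = data;
	int fd, saved_errno;

	assert(path != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	/* write() may be interrupted or write less than requested. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto error;
		}
		p += n;
		len -= (size_t)n;
	}

	/* This is the step the proposed ED option would enable. */
	if (fsync(fd) < 0)
		goto error;

	return close(fd);

error:
	/* Preserve the original error across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

Since fsync() can be slow, keeping it behind an option rather than enabling it
unconditionally seems reasonable.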
@@ -946,6 +971,10 @@ Optimizations:
co take care of it.
However insertion commands would also have to take care of expanding
LF to the buffer's EOL sequence.
+ * Perhaps some operations can be sped up with mmap() instead of loading
+ files into the heap. glib has an appropriate function with fallback for
+ platforms without mmap().
+ Perhaps file munging and <EI> could make use of it.
Documentation:
* Doxygen docs could be deployed on Github pages
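
The mmap() item under Optimizations above could be sketched like this. The
sketch uses plain POSIX mmap() rather than the glib wrapper the TODO alludes to,
so it has no fallback for platforms without mmap().
`count_linefeeds_mmap()` is a hypothetical stand-in for whatever would actually
consume the file contents.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of SciTECO):
 * counts the LF bytes in a file by mapping it read-only
 * instead of copying it into the heap.
 * Returns the count, or -1 on error.
 */
static long
count_linefeeds_mmap(const char *path)
{
	struct stat st;
	const char *map;
	long count = 0;
	off_t i;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	/* mmap() rejects zero-length mappings, so handle empty files first. */
	if (st.st_size == 0) {
		close(fd);
		return 0;
	}

	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	/* The mapping stays valid after the descriptor is closed. */
	close(fd);
	if (map == MAP_FAILED)
		return -1;

	for (i = 0; i < st.st_size; i++)
		if (map[i] == '\n')
			count++;

	munmap((void *)map, (size_t)st.st_size);
	return count;
}
```

Whether this is faster than reading into the heap depends on the access pattern
and would have to be measured.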