aboutsummaryrefslogtreecommitdiffhomepage
path: root/TODO
blob: 390c6c53b28596974bf19834e0e1cb503d5ddab4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
 * submit patch for libglib (initialization when
   linking statically with win32 threads - see glib/glib-init.c).
   Also gspawn helpers should probably link with -all-static when compiling
   a static glib. Why would be build a static glib but have the programs
   depend on other libraries?
 * Wiki page about creating and maintaining lexer configurations.
   Also mention how to use the "lexer.test..." macros in the
   "edit" hook.
 * OS X port (macports and/or homebrew)

Known Bugs:
 * Colors are still wrong in Linux console even if TERM=linux-16color
   when using Solarized. Affects e.g. the message line which uses the
   reverse of STYLE_DEFAULT.
   Perhaps we must call init_color() before initializing color pairs
   (currently done by Scinterm).
 * Scinterm: The underline attribute is not applied properly
   even on Urxvt where it obviously works.
 * session.save should save and reset ^R. Perhaps ^R should be
   mapped to a Q-Reg to allow [^R. Currently, saving the buffer session
   fails if ^R != 10.
   On the other hand, given that almost any macro depends on the
   correct radix, __every__ portable macro would have to save the
   old radix. Perhaps it's better to make the radix a property of the
   current macro invocation frame and guarantee ^R == 10 at the beginning
   of macros.
 * Null-byte in strings not always handled transparently
   (SciTECO is not 8-bit clean.)
 * Saving another user's file will only preserve the user when run as root.
   Generally, it is hard to ensure that a) save point files can be created
   and b) the file mode and ownership of re-created files can be preserved.
   We should fall back silently to an (inefficient) memory copy or temporary
   file strategy if this is detected.
 * crashes on large files: S^EM^X$ (regexp: .*)
   Happens because the Glib regex engine is based on a recursive
   Perl regex library.
   This is apparently impossible to fix as long as we do not
   have control over the regex engine build. We should either use C++11
   regex support, UNIX regex (unportable) or some other library.
   Perhaps allowing us to interpret the SciTECO matching language
   directly.
 * It is still possible to crash SciTECO using recursive functions,
   since they map to the C++ program's call stack.
   It is perhaps best to use another ValueStack as a stack of
   macro strings and implement our own function calling.
 * SciTECO crashes can leave orphaned savepoint files lying around.
   Unfortunately, both the Windows and Linux ways of deleting files
   on close cannot be used here since that would disallow cheap
   savepoint restoration.
   On Windows we could work around this using
   MoveFileEx(file, NULL, MOVEFILE_DELAY_UNTIL_REBOOT)
   This should only happen on abnormal program termination
   since SciTECO's destructors should clean up everything
   (test whether this holds true for assertions...).
 * Clipboard registers are prone to race conditions if the
   contents change between get_size() and get_string() calls.
   Also it's a common idiom to query a string and its size,
   so the internal API must be changed.
 * Setting window title is broken on ncurses/XTerm.
   Perhaps do some XTerm magic here. We can also restore
   window titles on exit using XTerm.
 * Glib (error) messages are not integrated with SciTECO's
   logging system.
 * Auto-completions are prone to unexpected results when
   the insertion contains string-building characters, braces
   (for Q-Register auto-completions) or escape characters
   (just consider autocompleting after @FG/...).
   Insertions should thus be escaped.
   A new string building command should consequently be
   introduced to insert an ASCII character by digits,
   e.g. ^E0XXX.
   Also, auto-completions within string-building constructs
   (except Q-Reg specs) should generally be disabled since
   the result will be unpredictable.

Features:
 * Auto-indention could be implemented via context-sensitive
   immediate editing commands similar to tab-expansion.
   Avoids having to make LF a magic character in insertion
   commands.
 * :$ and :$$ to pop/return only single values
 * allow top-level macros to influence the proces return code.
   This can be used in macros to call $$ or ^C akin to exit(1).
 * Special macro assignment command.
   It could use the SciTECO parser for finding the end of the
   macro definition which is more reliable than @^Uq{}.
   Also this opens up new possibilities for optimizations.
   Macros could be special QRegs that are not backed by a
   Scintilla document but a normal string. This would immensely
   speed up macro calls.
 * Numbers could be separate states instead of stack operating
   commands. The current behaviour has few benefits.
   If a number is a regular command that stops parsing at the
   first invalid character in the current radix, we could write
   hexadcimal constants like 16^R0BEEF^D (still clumsy...).
   (On the other hand, the radix is runtime state and parsing
   must not depend on runtime state in SciTECO to ensure
   parseability of the language.)
   * Furthermore, this opens the possibility of floating point
     numbers. The "." command does not take arguments, so it
     could be part of the number syntax. This disallows constructs
     like "23." to push 23 and Dot which have to be replaced by
     "23,.".
   * In the most simple case, all TECO numbers could be
     floats/doubles with division/modulo having integer semantics.
     A separate floating point division operator could be introduced
     (e.g. ^/ with modulo being remapped to ^%).
   * SciTECO could also be "dynamically" typed by using
     integer and floating point types internally.
     The operator decides how to interpret the arguments
     and the return type.
 * Function key masking flag for the beginning of the command
   line. May be useful e.g. for solarized's F5 key (i.e. function
   key macros that need to terminate the command line as they
   cannot be rubbed out properly).
 * Function key macros should behave more like regular macros:
   If inserting a character results in an error, the entire
   macro should be rubbed out. This means it would be OK to
   let commands in function key macros fail and would fix, e.g.
   ^FCLOSE.
 * Function key macros could support special escape sequences
   that allow us to modify the parser state reliably.
   E.g. one construct could expand to the current string argument's
   termination character (which may not be Escape).
   In combination with a special function key macro state
   effective only in the start state of the string building
   state machine, perhaps only in insertion commands, this
   could be used to make the cursor movement keys work in
   insertion commands by automatically terminating the command.
 * Function key handling should always be enabled. This was
   configurable because of the way escape was handled in ncurses.
   Now that escape is always immediate, there is little benefit
   in having this still configurable. In fact if turned off,
   SciTECO would try to execute escape sequences.
   The ED flag could still exist and tell whether the function
   key macros are used at all (i.e. this is how Gtk behaves currently).
 * Mouse support. Not that hard to implement. Mouse events
   use a pseudo function key macro as in Curses.
   Using some special command, macros can query the current
   mouse state (this maps to an Interface method).
 * Support loading from stdin (--stdin) and writing to
   the current buffer to stdout on exit (--stdout).
   This will make it easy to write command line filters,
   We will need flags like --8-bit-clean and --quiet with
   single-letter forms to make it possible to write hash-bang
   lines like #!...sciteco-wrapper -q8iom
   Command line arguments should then also be handled
   differently, passing them in an array or single string
   register, so they no longer affect the unnamed buffer.
 * For third-party macro authors, it is useful to know
   the standard library path (e.g. to install new lexers).
   There could be a --print-path option, or with the --quiet
   and --stdout options, one could write:
   sciteco -qoe 'G[$SCITECOPATH]'
 * The C/C++ lexer supports preprocessor evaluation.
   This is currently always enabled but there are no defines.
   Could be added as a global reg to set up defines easily.
 * Now that we have redo/reinsertion:
   When ^G modifier is active, normal inserts could insert
   between effective and rubbed out command line - without
   resetting it. This would add another alternative to { and }
   for fixing up a command line.
 * some missing useful VideoTECO/TECO-11 commands and unnecessary
   incompatibilities:
   * EF with buffer id
   * ER command: read file into current buffer at dot
   * nEW to save a buffer by id
   * EI is used to open an indirect macro file in classic TECO.
     EM should thus be renamed EI.
     EM (position magtape in classic TECO) would be free again,
     e.g. for execute macro with string argument or as a special version
     of EI that considers $SCITECOPATH.
     Current use of EI (insert without string building) will have
     to move, e.g. to FI.
   * ^S (-(length) of last referenced string), ^Y as .+^S,.
   * ^Q convert line arg into character arg
   * ^A and stdio in general
 * Buffer ids should be "circular", i.e. interpreted modulo the
   number of buffers in the ring. This allows "%*" to wrap at the
   end of the buffer list.
 * instead of 0EB to show the list of buffers, there should perhaps
   be a special TAB-completion (^G mode?) that completes only buffers
   in the ring. It should also display the numeric buffer ids.
 * properly support Unicode encodings and the character-based model
   * link against libncursesw if possible
   * translate documents to Unicode strings
   * a position refers to a character/codepoint
 * Progress indication in commandline cursor:
   Perhaps blinking or invisible?
 * Command to free Q-Register (remove from table).
   e.g. FQ (free Q). :FQ could free by QRegister prefix name for
   the common use case of Q-Register subtables and lists.
 * TECO syntax highlighting
 * multiline commandline
   * perhaps use Scintilla view as mini buffer.
     This means patching Scintilla, so it does not break lines
     on new line characters.
   * A Scintilla view will allow syntax highlighting
 * improve GTK interface
   * proper command-line widget (best would be a Scintilla view, s.a.)
   * speed improvements
 * command line could highlight dead branches (e.g. gray them out)
 * backup files, or even better Journal files:
   could write a Macro file for each modified file containing
   only basic commands (no loops etc.). it is removed when the file is
   saved. in case of an abnormal program termination the
   journal file can be replayed. This could be done automatically
   in the profile.
 * Add special Q-Register for dot:
   Would simplify inserting dot with string building and saving/restoring
   dot on the QReg stack
 * :EL command could also be used to convert all EOLs in the current
   buffer.
 * exclusive access to all opened files/buffers (locking):
   SciTECO will never be able to notice when a file has been
   changed externally. Also reversing a file write will overwrite
   any changes another process could have done on the file.
   Therefore open buffers should be locked using the flock(), fcntl()
   or lockf() interfaces. On Windows we can even enforce mandatory locks.
 * Touch restored save point files - should perhaps be configurable.
   This is important when working with Makefiles, as make looks
   at the modification times of files.
 * At least on Windows we could implement file restoration via
   filesystem forks (called Alternate Data Streams) and fall back
   to the generic savepoint file solution if this is not possible
   (e.g. on a FAT32 drive).
   On the other hand, just like leftover savepoints, there could
   be leftover forks. And they wouldn't be as easy to find as the
   current files.
 * Instead of implementing split screens, it is better to leave
   tiling to programs dedicated to it (tmux, window manager).
   SciTECO could create pseudo-terminals (see pty(7)), set up
   one curses screen as the master of that PTY and spawn
   a process accessing it as a slave (e.g. urxvt -pty-fd).
   Each Scintilla view could then be associated with at most
   one curses screen.
   GTK+ would simply manage a list of windows.
 * Error handling in SciTECO macros: Allow throwing errors with
   e.g. [n]EE<description>$ where n is an error code, defaulting
   to 0 and description is the error string - there could be code-specific
   defaults. All internal error classes use defined codes.
   Errors could be catched using a structured try-catch-like construct
   or by defining an error handling label.
   Macros may retrieve the code and string of the last error.

Optimizations:
 * The Linux-specific memory limiting using mallinfo() is
   very slow (50% to 150% slower than the fallback implementation
   measured in batch mode).
   The fallback implementation does not come with much of a
   runtime penalty.
   Still I've found no faster way of measuring the process heap.
   A soft resource limit would be ideal but unfortunately,
   it lets malloc() return NULL and we're not in control of all
   the mallocs, so glib could abort before we have a chance to
   react on it.
   Since the slow-down affects interactive mode as well, disabling
   limiting in batch mode is merely a workaround.
   Perhaps Linux should simply use the fallback limiting as well
   (or support this via a configure option).
   On Windows (2000), the overhead is approx. the same.
 * Add G_UNLIKELY to all error throws.
 * Instead of using RTTI to implement the immediate editing command
   behaviours in Cmdline::process_edit_cmd() depending on the current
   state, this could be modelled via virtual methods in State.
   This would almost eradicate Cmdline::process_edit_cmd() and the
   huge switch-case statement, would be more efficient (but who cares
   in this case?) and would allow us to -fno-rtti saving a few bytes.
   However, this would mean to make some more Cmdline methods public.
   The implementations of the States' commandline editing handlers
   could all be concentrated in cmdline.cpp.
 * String::append() could be optimized by ORing a padding
   into the realloc() size (e.g. 0xFF).
   However, this has not proven effective on Linux/glibc
   probably because it will already allocate in blocks of
   roughly the same size.
   Should be tested on Windows, though.
 * Scintilla: SETDOCPOINTER resets representations, so we
   have to set SciTECO representations up again often when
   working with Q-Registers. Has been benchmarked to be a
   major slow-down.
   This should be patched in Scintilla.
 * commonly used (special) Q-Registers could be cached,
   saving the q-reg table lookup
 * refactor search commands (create proper base class)
 * refactor match-char state machine using MicroStateMachine class
 * The current C-like programming style of SciTECO causes
   problems with C++'s RAII. Exceptions have to be caught
   always in every stack frame that owns a heap object
   (e.g. glib string). This is often hard to predict.
   There are two solutions: Wrap
   every such C pointer in a class that implements RAII,
   e.g. using C++11 unique_ptr or by a custom class template.
   The downside is meta-programming madness and lots of overloading
   to make this convenient.
   Alternatively, we could avoid C++ exceptions largely and
   use a custom error reporting system similar to GError.
   This makes error handling and forwarding explicit as in
   plain C code. RTTI can be used to discern different
   exception types. By adding an error code to the error object
   (which we will need anyway for supporting error handling in SciTECO
   macros), we may even avoid RTTI.
 * The position can be eliminated from UndoTokens by
   rewriting the UndoStack into a stack of UndoToken lists.
   Should be a significant memory reduction in interactive mode.
 * RBTrees define an entry field for storing node color.
   This can be avoided on most
   platforms where G_MEM_ALIGN > 1 by encoding the color in the
   lowest bit of one of the pointers.
   The parent pointer is not required for RBTrees in general,
   but we do use the PREV/NEXT ops to iterate prefixes which requires
   the parent pointer to be maintained.
 * Add a configure-switch for LTO (--enable-lto).

Documentation:
 * Code docs (Doxygen). It's slowly getting better...
 * The ? command could be extended to support looking
   up help terms at dot in the current document (e.g. if called ?$).
   Furthermore, womanpages could contain "hypertext" links
   to help topics using special Troff markup and grosciteco support.