Age | Commit message (Collapse) | Author | Files | Lines |
|
* Added a test case for the known bug of out-of-place modifiers.
Well, this is a syntactic shortcoming rather than a true bug.
But I did run into it more than once.
|
|
* They swap the default order of skipping characters.
For positive arguments: first non-word chars, then word chars.
* This is especially useful after executing the non-at-modified versions.
For instance, at the beginning of a word, `@W` jumps to its end.
`@V` would delete the remainder of the word.
* Since they have to evaluate the at-modifier, which has syntactic
significance, the command implementations can no longer use
transition tables, so they are in the switch-statements instead.
|
|
* All the movement commands have shortcuts for their negative forms:
`R` instead of `-C`, `B` instead of `-L`.
Therefore there was always the need for a `-W` shortcut as well.
* `P` is a good choice because it is a file IO command in TECO-11,
that does not make sense supporting.
In Video TECO it toggles between display windows (ie. split screens)
and I do not plan to support multiple windows in SciTECO.
* Added to the cheat sheet.
|
|
the beginning of words now
* All commands and their documentations were inconsistent.
* ^W rubbed out to the beginning of words.
* Shift+Right (fnkeys.tes) moved to the beginning of the next word if
invoked at the beginning of a word and to the end of the next word otherwise.
* <W> (and <V> and <Y> by extension) moved to the end of the next word.
* The cheat sheet would claim that <W> moves to the beginning of the next word.
* Video TECO's <W> command would differ again from everything else.
With positive arguments, it moved to the beginning of words, while
with negative it moved to end of words.
I decided not to copy this behavior.
* It has been decided to adopt a consistent beginning-of-words policy.
-W therefore differs from Video TECO in moving to the beginning of the
current or previous word.
* teco_find_words() is now based on parsing the document pointer, instead
of relying on SCI_WORDENDPOSITION, since the latter cannot actually be
used to skip strictly non-word characters.
This requires a constant amount of Scintilla messages but will require fewer
messages only when moving for more than 3 words.
* The semantics of <W> are therefore now consistent with Vim and Emacs as well.
* Shift+Right/Left is still based on SCI_WORDENDPOSITION, so it's behavior
differs slightly from <W> for instance at the end of lines, as it will
stop at linebreaks.
* Unfortunately, these changes will break lots of macros, among others
the M#rf, M#sp and git.blame macros ("Useful macros" from the wiki).
|
|
* This was a regression introduced by 257a0bf128e109442dce91c4aaa1d97fed17ad1a.
* The undo token that frees newly allocated teco_machine_qregspec_t must actually
reset the pointer as well since any subsequent token, pushed by teco_undo_qregspec_own(),
will expect a valid pointer.
* Could have been done via
ctx->expectqreg = NULL;
teco_undo_qregspec_own(ctx->expectqreg);
but using a special clear function requires less memory and is easier to understand.
* Added test case. This wouldn't always crash, but should definitely show up in Valgrind.
|
|
* Objects, that are restored with TECO_DEFINE_UNDO_OBJECT_OWN(),
could actually leak memory on rubout since the old object was not
deleted when overwriting it.
* Now that it is, it is crucial to at least nullify objects/pointers
after calling the corresponding push-function.
These conditions are now explicitly documented.
* The test suite now runs successfully under Valgrind even with full leak checking.
|
|
* <W> was also using keyboard movement commands.
* This fixes an inconsistency between the handling of punctuation characters,
e.g. "(word" followed by -W vs. Y.
|
|
* This has __always__ been broken.
It's been especially annoying when pressing `Y` at the end of a line with trailing whitespace
since the linebreak would also be deleted.
This was because `Y` always deleted the entire word or non-word character-span.
This was inconsistent with `V`.
* We now use SCI_WORDSTART|ENDPOSITION instead of the keyboard commands.
It therefore also requires less Scintilla messages (4+2*n vs. 4+4*n).
Most importantly, we can now check for errors before changing the buffer,
so there is no need to undo anything in case of errors.
This should always be the preferred strategy.
* Added test case.
|
|
We cannot practically test `?` since it relies on women pages being installed
into the $SCITECOPATH.
While we do set $SCITECOPATH to the source tree, we cannot currently provide
the generated women pages to test scripts, unless installing the project first.
|
|
* Test case: @EQa// @?/EX/ -- Rubout should return you to the Q-Register view.
* The test suite has been extended.
Unfortunately we cannot currently directly check whether we're editing a Q-Register.
But we add a magic number of characters to the Q-Register, that we can check afterwards.
|
|
* Allowing label redefinitions might have been useful when used as comments,
since you will want to be able to define arbitrary comments.
However as flow control constructs, this introduced a certain ambiguity since
gotos might jump to different locations, depending on the progression
of the parser.
* On the other hand, making label redefinition an error might disqualify labels as
comments when writing or porting classic TECO code.
Therefore, it has been made a warning as a compromise.
* Added test case
|
|
characters)
* The teco_qreg_vtable_t::get_string() method should support returning the
length optionally (may be NULL).
This already worked with teco_doc_get_string(), even though it wasn't documented,
and therefore didn't cause problems with regular Q-Registers.
|
|
* As known from DEC TECO, but extended to convert absolute positions to line numbers as well.
:^Q returns the current line.
* Especially useful in macros that accept line arguments,
as it is much shorter than something like
^E@ES/LINEFROMPOSITION//+Q.l@ES/POSITIONFROMLINE//:^E-.
* On the other hand, the fact that ^Q checks the line range means we cannot
easily replace lexer.checkheader with something like
[:J 0,^Q::S...$ ]:
Using SCI_POSITIONFROMLINE still has the advantage that it returns `Z` for out-of-bounds ranges
which would be cumbersome to write with the current ^Q.
* Perhaps there should be a separate command for converting between absolute lines and positions
and :^Q should be repurposed to return a failure boolean for out-of-range values?
* fnkeys.tes could be simplified.
|
|
::FS as well)
* The colon modifier can now occur 2 times.
Specifying `@` more than once or `:` more than twice is an error now.
* Commands do not check for excess colon modifiers - almost every command would have
to check it. Instead, a double colon will simply behave like a single colon on most
commands.
* All search commands inherit the anchored semantics, but it's not very useful in some combinations
like -::S, ::N or ::FK.
That's why the `::` variants are not documented everywhere.
* The lexer.checkheader macro could be simplified and should also be faster now,
speeding up startup.
Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
|
|
* Can be freely combined with the colon-modifier as well.
:@Xq cut-appends to register q.
* This simply deletes the given buffer range after the copy or append operation
as if followed by another <K> command.
* This has indeed been a very annoying missing feature, as you often have to retype the
range for a K or D command.
At the same time, this cannot be reasonably solved with a macro since macros
do not accept Q-Register arguments -- so we would have to restrict ourselves to one or a few
selected registers.
I was also considering to solve this with a special stack operation that duplicates the
top values, so that Xq leaves arguments for K, but this couldn't work for cutting lines
and would also be longer to type.
* It's the first non-string command that accepts @.
Others may follow in the future.
We're approaching ITS TECO madness levels.
|
|
lengths (refs #27)
* Allows storing pattern matches into Q-Registers (^YXq).
* You can also refer to subpatterns marked by ^E[...] by passing a number > 0.
This is equivalent to \0-9 references in many programming languages.
* It's especially useful for supporting TECO's equivalent of structural regular expressions.
This will be done with additional macros.
* You can also simply back up to the beginning of an insertion or search.
So I...$^SC leaves dot at the beginning of the insertion.
S...$^SC leaves dot before the found pattern.
This has been previously requested by users.
* Perhaps there should be ^Y string building characters as well to backreference
in search-replacement commands (TODO).
This means that the search commands would have to store the matched text itself
in teco_range_t structures since FR deletes the matched text before
processing the replacement string.
It could also be made into a FR/FS-specific construct,
so we don't fetch the substrings unnecessarily.
* This differs from DEC TECO in always returning the same range even after dot movements,
since we are storing start/end byte positions instead of only the length.
Also DEC TECO does not support fetching subpattern ranges.
|
|
5597bc72671d0128e6f0dba446c4dc8d47bf37d0)
* Using teco_expressions_eval() is wrong since it does not pay attention to precedences.
If you have multiple higher precedence operators in a row, as in 2+3*4*5,
the lower precedence operators would be resolved prematurely.
* Instead we now call teco_expressions_calc() repeatedly but only for lower precedence
operators on the stack top.
This makes sure that as much of the expression as possible is evaluated at any given moment.
|
|
* It was possible to provoke operator right-associativity when placing a high-precedence
operator between two low-precedence operators.
1-6*5-1 evaluated to -28 instead of the expected -30.
* The reason is that SciTECO relies on operators to be resolved from left-to-right as soon as possible.
The higher precedence operator prevents that and pushing the 2nd "-" only evaluated 6*5.
At the end 1-30-1 would be left on the stack.
teco_expressions_eval() however evaluates from right-to-left which is wrong in this case.
* Instead, we now do a full eval on every operator with a lower precedence, making sure that 1-30 is
evaluated first.
|
|
* We cannot call it "." since that introduces a local register
and we don't want to add an unnecessary syntactic exception.
* Allows the idiom [: ... ]: to temporarily move around.
Also, you can now write ^E\: without having to store dot in a register first.
* In the future we might add an ^E register as well for byte offsets.
However, there are much fewer useful applications.
* Of course, you can now also write nU: instead of nJ, Q: instead of "." and
n%: instead of "nC.". However it's all not really useful.
|
|
* This would actually causes crashes when trying to format numbers.
* The ^R local register has a custom set_integer() method now,
so that the check is performed also when using nU.^X.
|
|
^R now (refs #17)
* This way the search mode and radix are local to the current macro frame,
unless the macro was invoked with :Mq.
If colon-modified, you can reproduce the same effect by calling
[.^X 0^X ... ].^X
* The radix register is cached in the Q-Reg table as an optimization.
This could be done with the other "special" registers as well, but at the
cost of larger stack frames.
* In order to allow constructs like [.^X typed with upcarets,
the Q-Register specification syntax has been extended:
^c is the corresponding control code instead of the register "^".
|
|
* Usually you will only want -^X for enabling case sensitive searches
and 0^X for case-insensitive searches (which is also the default).
* An open question is what happens if the user sets -^X and then calls
a macro. The search mode flag should probably be stacked away along
with the search-string. This means we'd need a ^X special Q-Reg as well,
so you can write [^X[_ 0^X S...$ ]_]^X.
Alternatively, the search mode flag should be a property of the
macro frame, along with the radix.
|
|
* The program exit code will usually not signal failures since they are caught earlier.
* Therefore, we always have to capture and check stderr.
|
|
characters in Q-Register)
* It was initialized only once, so it could inherit the wrong local Q-Register table.
A test case has been added for this particular bug.
* Also, if starting from the profile (batch mode), the state machine could be initialized
without undo, which then later cause problems on rubout in interactive mode.
For instance, if S^EG[a] fails and you would repeatedly type `]`, the Q-Reg name could
grow indefinitely. There were probably other issues as well.
Even crashes should have been possible, although I couldn't reproduce them.
* Since the state machine is required only for the pattern to regexp translation
and is performed anew for every character in interactive mode,
we now create a fresh state machine for every call and don't attempt
any undo.
There might be more efficient ways, like reusing the string building's
Q-Reg parser state machine.
|
|
* A test case has been added, although it might have been accidental
that on caused crashes.
|
|
string cells
Found thanks to the "infinite monkey" test.
|
|
* Any memory error will let the test case fail with code 66.
* You can also call
make check TESTSUITEFLAGS="--valgrind"
* There is no program test for Valgrind in configure.ac for the time being.
`valgrind` must be in $PATH.
* All CI testsuite runs under Ubuntu are now with Valgrind.
|
|
* There was some boilerplate code missing in teco_state_search_all_initial(),
that is present in teco_state_search_initial().
* Perhaps there should be a common function to avoid redundancies?
* This will also fix the initialization of the string argument codepage for <N>.
|
|
* Supports all immediate editing commands.
Naturally it cannot emulate arbitrary key presses since there is no
canonic ASCII-encoding of function keys.
Key macros are not consequently also not testable.
The --fake-cmdline parameter is instead treated very similar to
a key macro expansion.
* Most importantly this allows adding test cases for rubout behavior
and bugs that are quite common.
* Added regression test cases for the last two rubout bugs.
* It's not easy to pass control codes in command line arguments in
a portable manner, so the test cases will often use { and }.
Control codes could be used e.g. by defining variables like
RUBOUT=`printf '\b'`
and referencing them with ${RUBOUT}.
|
|
* The old bug of saving gchar in gints, so teco_eol_reader_t::last_char could become negative.
* When converting from an UTF-8 text with CRLF linebreaks, we could have data loss and corruptions.
* On strings ending in UTF-8 characters, teco_eol_reader_t::offset would overflow, resulting
in invalid reads and potentially insertion of data garbage.
I observed this with G~ on Gtk.
* Test cased updated. Couldn't reproduce the bug with the test suite, though.
|
|
* The previous check could result in false positives if you are editing
a local Q-Register, that will be destroyed at the end of the current macro frame,
and call another non-colon modified macro.
* It must instead be invalid to keep the register edited only if it belongs to the
local Q-Registers that are about to be freed.
In other words, the table that the currently edited Q-Register belongs to, must be
the one we're about to destroy.
* This fixes the solarized.toggle (F5) macro when using the Solarized color scheme.
|
|
allow breaking from within braces
For instance, you can now write <23(1;)> without leaving anything on the stack.
|
|
* makes it possible, albeit cumbersome, to escape pattern match characters
* For instance, to search for ^Q, you now have to type
S^Q^Q^Q^Q$.
To search for ^E you have to type
S^Q^Q^Q^E$.
But the last character cannot be typed with carets currently (FIXME?).
For pattern-only characters, two ^Q should be sufficient as in
S^Q^Q^X$.
* Perhaps it would be more elegant to abolish the difference between string building
and pattern matching characters to avoid double quoting.
But then all string building constructs like ^EQq should operate at the pattern level
as well (ie. match the contents of register q verbatim instead of being interpreted as a pattern).
TECOC and TECO-64 don't do that either.
If we leave everything as it is, at least a new string building construct should be added for
auto-quoting patterns (analoguous to ^EN and ^E@).
|
|
* Previously you could open files of arbitrary size and the limit would be checked only afterwards.
* Many, but not all, cases should now be detected earlier.
Since Scintilla allocates lots of memory as part of rendering,
you can still run into memory limits even after successfully loading the file.
* Loading extremely large files can also be potentially slow.
Therefore, it is now possible to interrupt via CTRL+C.
Again, if the UI is blocking because of stuff done as part of rendering,
you still may not be able to interrupt the "blocking" operation.
|
|
|
|
* @EQ$/.../ sets the current directory from the contents of the given file.
@E%$/.../ stores the currend directory in the given file.
* @EQ*/.../ will fail, just like ^U*...$.
@E%*/.../ stores the current buffer's name in the given file.
* It's especially useful with the clipboard registers.
There could still be a minor bug in @E%~/.../ with regard to EOL normalization
as teco_view_save() will use the EOL style of the current document, which
may not be the style of the Q-Reg contents.
Conversions can generally be avoided for these particular commands.
But without teco_view_save() we'd have to care about save point creation.
|
|
* This was unsafe and could easily result in crashes, since teco_qreg_current
would afterwards point to an already freed Q-Register.
* Since automatically editing another register or buffer is not easy to do right,
we throw an error instead.
|
|
This was throwing glib assertions.
|
|
* It wasn't failing on FreeBSD because there are different default
stacksize limits.
We now set it to 8MB everywhere.
|
|
* This fixes F< to the beginning of the macro, which was broken in 73d574b71a10d4661ada20275cafde75aff6c1ba.
teco_machine_main_t::macro_pc actually has to be signed as it is sometimes set to -1.
|
|
* while code is guaranteed to be in valid UTF-8, this cannot be
said about the result of string building.
* The search pattern can end up with invalid Unicode bytes even when
searching on UTF-8 buffers, e.g. if ^EQq inserts garbage.
There are currently no checks.
* When searching on a raw buffer, it must be possible to
search for arbitrary bytes (^EUq).
Since teco_pattern2regexp() was always expecting clean UTF-8 input,
this would sometimes skip over too many bytes and could even crash.
* Instead, teco_pattern2regexp() now takes the <S> target codepage
into account.
|
|
The following rules apply:
* All SciTECO macros __must__ be in valid UTF-8, regardless of the
the register's configured encoding.
This is checked against before execution, so we can use glib's non-validating
UTF-8 API afterwards.
* Things will inevitably get slower as we have to validate all macros first
and convert to gunichar for each and every character passed into the parser.
As an optimization, it may make sense to have our own inlineable version of
g_utf8_get_char() (TODO).
Also, Unicode glyphs in syntactically significant positions may be case-folded -
just like ASCII chars were. This is is of course slower than case folding
ASCII. The impact of this should be measured and perhaps we should restrict
case folding to a-z via teco_ascii_toupper().
* The language itself does not use any non-ANSI characters, so you don't have to
use UTF-8 characters.
* Wherever the parser expects a single character, it will now accept an arbitrary
Unicode/UTF-8 glyph as well.
In other words, you can call macros like M§ instead of having to write M[§].
You can also get the codepoint of any Unicode character with ^^x.
Pressing an Unicode character in the start state or in Ex and Fx will now
give a sane error message.
* When pressing a key which produces a multi-byte UTF-8 sequence, the character
gets translated back and forth multiple times:
1. It's converted to an UTF-8 string, either buffered or by IME methods (Gtk).
On Curses we could directly get a wide char using wget_wch(), but it's
not currently used, so we don't depend on widechar curses.
2. Parsed into gunichar for passing into the edit command callbacks.
This also validates the codepoint - everything later on can assume valid
codepoints and valid UTF-8 strings.
3. Once the edit command handling decides to insert the key into the command line,
it is serialized back into an UTF-8 string as the command line macro has
to be in UTF-8 (like all other macros).
4. The parser reads back gunichars without validation for passing into
the parser callbacks.
* Flickering in the Curses UI and Pango warnings in Gtk, due to incompletely
inserted and displayed UTF-8 sequences, are now fixed.
|
|
* The libtool wrapper binaries do not pass down UTF-8 strings correctly,
so the Unicode tests failed under some circumstances.
* As we aren't actually linking against any locally-built shared libraries,
we are passing --disable-shared to libtool which inhibts wrapper generation
on win32 and fixes the test suite.
* Also use up to date autotools. This didn't fix anything, though.
* test suite: try writing an Unicode filename as well
* There have been problems doing that on Win32 where UTF-8 was not
correctly passed down from the command line and some Windows API
calls were only working with ANSI filenames etc.
|
|
|
|
unsigned integer
* SCI_GETCHARAT is internally casted to `char`, which may be signed.
Characters > 127 therefore become negative and stay so when casted to sptr_t.
We therefore cast it back to guchar (unsigned char).
* The same is true whenever returning a string's character to SciTECO (teco_int_t)
as our string type is `gchar *`.
* <^^x> now also works for those characters.
Eventually, the parser will probably become UTF8-aware and this will
have to be done differently.
|
|
numbers now
* Instead of TECO_OP_NEW, there should perhaps simply be a flag of whether `,` was used.
|
|
* in fact, with a negative exponent the previous naive implementation would even hang indefinitely!
* Now uses the squaring algorithm.
This is slightly longer but significantly more efficient.
* added test cases
|
|
* passing an empty command string down to the shell would always do nothing,
so it doesn't make sense to support that.
* for the time being, we generate a proper error
* in the future, it might make sense to define some special behavior like repeating
the last command - but EC does not currently save the command line anywhere.
* The generated documentation is currently ugly (FIXME).
mandatory parameters are not properly detected by tedoc and we cannot keep apart
Q-Registers from mandatory parameters either.
Also, we should allow <param> markup in command summaries.
|
|
* This was setting only the teco_doc but wasn't calling the necessary
set_string() methods.
* The idiom [$ FG...$ ]$ to change the working directory temporarily
now works.
* Similarily you can now write [~ ^U~...$ ]~ to change the clipboard
temporarily.
* Added test suite cases. The clipboard is not tested since
it's not supported everywhere and would interfer with the host system.
* Resolved lots of redundancies in qreg.c.
The clipboard and workingdir Q-Regs have lots in common.
This is now abstracted in the "external" Q-Reg base "class"
(ie. via initializer TECO_INIT_QREG_EXTERNAL()).
It uses vtable calls which is slightly more inefficient than per
register implementations, but avoiding redundancies is probably more
important.
|
|
* it appears to behave similar to Mac OS with regard to recursions
|