Age | Commit message (Collapse) | Author | Files | Lines |
|
This broke builds e.g. on Ubuntu 20.04.
Regression was introduced in 51bd183f064d0c0ea5e0184d9f6b6b62e5c01e50.
|
|
* In principle --stdin and --stdout could have been done in pure TECO code using the
<^T> command.
Having built-in command-line arguments however has several advantages:
* Significantly faster than reading byte-wise with ^T.
* Performs EOL normalization, unless --8bit is specified of course.
* Significantly shortens command-lines.
`sciteco -qio` and `sciteco -qi` can be real replacements for sed and awk.
* You can even place SciTECO into the middle of a pipeline while editing
interactively:
foo | sciteco -qio --no-profile | bar
Unfortunately, this will not currently work when munging the profile
as command-line parameters are also transmitted via the unnamed buffer.
This should be changed to use special Q-registers (FIXME).
* --quiet can help to improve the test suite (TODO).
Should probably be the default in TE_CHECK().
* --stdin and --stdout make it possible to simplify many SciTECO scripts, avoiding
temporary files, especially for manpage generation (TODO).
* For processing potentially infinite streams, you will still have to
read using ^T.
|
|
* `static const char *p = "FOO"` is not a true constant since
the variable p can still be changed.
It has to be declared as `static const char *const p = "FOO"`,
so that the pointer itself is constant.
* In case of string constants, it's easier however to use `static const char p[] = "FOO"`.
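The difference between the three declarations can be sketched as follows. All names are hypothetical and are not taken from the SciTECO sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pointer to constant characters: the pointer itself can still be reassigned. */
static const char *mutable_ptr = "FOO";

/* Constant pointer to constant characters: neither pointer nor contents can change. */
static const char *const const_ptr = "FOO";

/* Array of constant characters: there is no separate pointer object at all. */
static const char const_array[] = "FOO";

/* Reassigning mutable_ptr compiles without complaint. */
static const char *retarget(void)
{
    mutable_ptr = "BAR";
    /* const_ptr = "BAR";  would be rejected: assignment of read-only variable */
    return mutable_ptr;
}

/* The constant pointer still refers to the original string. */
static const char *fixed(void)
{
    return const_ptr;
}

/* sizeof on the array yields the string length plus the terminating null byte. */
static size_t array_size(void)
{
    return sizeof(const_array);
}
```

The array form also allows sizeof to be used on the constant, which is not possible with either pointer form.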
|
|
* These were leaked e.g. in case of end-of-macro errors,
but also in case of syntax highlighting (teco_lexer_style()).
I considered solving this by overwriting more of the end_of_macro_cb,
but that did not always turn out to be trivial.
* Considering that the union in teco_machine_main_t saved only 3 machine words
of memory, I decided to sacrifice those for more robust memory management.
* teco_machine_qregspec_t cannot be directly embedded into teco_machine_main_t
due to recursive dependencies with teco_machine_stringbuilding_t.
It could now and should perhaps be allocated only once in teco_machine_main_init(),
but it would require more refactoring.
|
|
* This command exists in Video TECO.
In Video TECO it also supports reading multiple files with a glob pattern -- we do not support that
as I am not convinced of its usefulness.
* teco_view_load() has been extended, so it can read into dot without
discarding the existing document.
|
|
* The old behavior of throwing an error was inherited from Video TECO.
* The command is now more similar to TECO-11.
* Since -1 is taken, invalid and incomplete UTF-8 byte sequences
are now reported as -2/-3.
I wasn't really able to provoke -3, though.
|
|
* In other words, fixed `-9223372036854775808\` on --with-teco-integer=64
(which is the default).
* The reason is that ABS(G_MININT64) == G_MININT64 since -G_MININT64 == G_MININT64.
It is therefore important not to call ABS() on arbitrary teco_int_t's.
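One way to obtain the magnitude of an arbitrary signed 64-bit value without running into this problem is to negate in unsigned arithmetic. This is only a minimal sketch: it assumes teco_int_t is a 64-bit signed integer, the helper name is hypothetical, and it is not necessarily how the commit fixed the issue.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns |v| as an unsigned value.
 * Negating INT64_MIN in signed arithmetic overflows, so the
 * negation is performed on the unsigned representation instead,
 * where wrap-around is well defined.
 */
static uint64_t magnitude(int64_t v)
{
    return v < 0 ? UINT64_C(0) - (uint64_t)v : (uint64_t)v;
}
```

The result for the smallest integer is 9223372036854775808, which does not fit into a signed 64-bit integer, so the return type has to be unsigned.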
|
|
|
|
* Apparently g_utf8_get_char_validated() sometimes(!) returns -2 for null-characters,
so it was considered an invalid byte sequence.
* What's strange and unexplainable is that other uses of the function, such as those behind nA and nQq,
did not cause problems and returned 0 for null-bytes.
* This also fixes syntax highlighting of .teco_session files which use the null-byte as the
string terminator.
(.teco_session files are not highlighted automatically, though.)
|
|
line-character index
* checks for character consistency (of UTF-8 byte sequences) were slowing down things significantly in Scintilla
* It got even worse if the file indeed contained non-ANSI codepoints as reading in chunks of 1024
would sometimes mean that incomplete byte sequences would be read.
Some large 160 MB test files wouldn't load even after minutes.
They now load in seconds.
* This does NOT yet solve the slowdowns when operating on very long lines.
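To illustrate why fixed-size chunks are a problem for UTF-8 data, the following sketch computes how many bytes of a chunk form only complete sequences, so that a trailing partial sequence could be carried over to the next read. This is an illustration of the problem only; the function names are hypothetical and this is not necessarily what the commit does.

```c
#include <assert.h>
#include <stddef.h>

/* Expected length of a UTF-8 sequence, judging by its first byte. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1; /* invalid lead byte: treat it as a unit of its own */
}

/*
 * Returns the number of leading bytes in buf that do not end in
 * the middle of a multi-byte sequence.
 * Validation of the sequences themselves is left to the caller.
 */
static size_t utf8_complete_prefix(const unsigned char *buf, size_t len)
{
    /* The longest sequence has 4 bytes, so look at most 3 bytes back. */
    for (size_t back = 1; back <= 3 && back <= len; back++) {
        unsigned char c = buf[len - back];

        if ((c & 0xC0) == 0x80)
            continue; /* continuation byte, keep looking for the lead byte */

        /* c starts a sequence `back` bytes before the end of the chunk */
        return utf8_seq_len(c) > back ? len - back : len;
    }

    return len;
}
```

A reader would pass only the returned number of bytes on and prepend the remaining ones to the next chunk.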
|
|
"identifier" to enable lexing in the container
* SCI_SETILEXER(NULL) is not a reliable way to do that since
that's the default for all views.
* This was breaking the git.tes lexer for instance and was unnecessarily
driving teco_lexer_style() on plain-text documents.
* Since we currently do not implement the ILexer5 C++ interface
and teco_view_t is just a pointer alias, we are abusing the view's "identifier" instead.
This is probably sufficient, as long as there is only one lexer "in the container".
Otherwise, there should perhaps be a single C++ class that does nothing but
wrapping a callback into an ILexer5 object with a C ABI.
|
|
* this works by embedding the SciTECO parser and driving it always (exclusively)
in parse-only mode.
* A new teco_state_t::style determines the Scintilla style for any character
accepted in the given state.
* Therefore, the SciTECO lexer is always 100% exact and corresponds to the current
SciTECO grammar - it does not have to be maintained separately.
There are a few exceptions and tweaks, though.
* The contents of curly-brace escapes (`@^Uq{...}`) are rendered as ordinary
code using a separate parser instance.
This can be disabled with the lexer.sciteco.macrodef property.
Unfortunately, SciTECO does not currently allow setting lexer properties (FIXME).
* Labels and comments are currently styled the same.
This could change in the future once we introduce real comments.
* Lexers are usually implemented in C++, but I did not want to draw in C++.
Especially not since we'd have to include parser.h and other SciTECO headers,
which I really do not want to keep C++-compatible.
Instead, the lexer is implemented "in the container".
@ES/SCI_SETILEXER/sciteco/ is internally translated to SCI_SETILEXER(NULL)
and we get Scintilla notifications when styling the view becomes necessary.
This is then centrally forwarded to teco_lexer_style() which
uses the ordinary teco_view_ssm() API for styling.
* Once the command line becomes a Scintilla view even on Curses,
we can enable syntax highlighting of the command line macro.
|
|
|
|
* Previously you could open files of arbitrary size and the limit would be checked only afterwards.
* Many, but not all, cases should now be detected earlier.
Since Scintilla allocates lots of memory as part of rendering,
you can still run into memory limits even after successfully loading the file.
* Loading extremely large files can also be potentially slow.
Therefore, it is now possible to interrupt via CTRL+C.
Again, if the UI is blocking because of stuff done as part of rendering,
you still may not be able to interrupt the "blocking" operation.
|
|
* When enabled with bit 2 in the ED flags (0,4ED),
all registers and buffers will get the raw ANSI encoding (as if 0EE had been
called on them).
You can still manually change the encoding, e.g. by calling 65001EE afterwards.
* Also the ANSI mode sets up character representations for all bytes >= 0x80.
This is currently done only depending on the ED flag, not when setting 0EE.
* Since setting 16,4ED for 8-bit clean editing in a macro can be tricky -
the default unnamed buffer will still be at UTF-8 and at least a bunch
of environment registers as well - we added the command line option
`--8bit` (short `-8`) which configures the ED flags very early on.
As another advantage you can mung the profile in 8-bit mode as well
when using SciTECO as a sort of interactive hex editor.
* Disable UTF-8 checks in 8-bit clean mode (sample.teco_ini).
|
|
teco_state_start_get() (refs #5)
|
|
* this required adding several Q-Register vtable methods
* it should still be investigated whether the repeated calling of
SCI_ALLOCATELINECHARACTERINDEX causes any overhead.
|
|
* This works reasonably well unless lines are exceedingly long
(as on a line we always count characters).
The following test case is still slow (on Unicode buffers):
10000<@I/XX/> <%a-1:J;>
While the following is now also fast:
10000<@I/X^J/> <%a-1:J;>
* Commands with relative character offsets (C, R, A, D) have
a special optimization where they always count characters beginning
at dot, as long as the argument is not exceedingly large.
This means they are fast even on exceedingly long lines.
* The remaining commands (search, EC/EG, Xq) now accept glyph indexes.
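The optimization for relative offsets can be pictured as walking the UTF-8 buffer character by character, starting at dot. The cost then depends on the argument only and not on the line length. This is a simplified sketch over a flat byte buffer with a hypothetical function name. The real implementation works on Scintilla documents.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Moves n characters forward from byte offset dot in a UTF-8 buffer.
 * Returns the resulting byte offset or (size_t)-1 if the buffer
 * contains fewer than n characters after dot.
 */
static size_t utf8_forward(const unsigned char *buf, size_t len,
                           size_t dot, size_t n)
{
    size_t pos = dot;

    while (n > 0 && pos < len) {
        pos++;
        /* skip continuation bytes belonging to the same character */
        while (pos < len && (buf[pos] & 0xC0) == 0x80)
            pos++;
        n--;
    }

    return n == 0 ? pos : (size_t)-1;
}
```

Moving backwards works analogously by skipping continuation bytes in the other direction.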
|
|
|
|
|
|
* Courier has the quirk that letter sequences like "fi" are turned into ligatures
which breaks the monospaced nature of the display.
* We assume that "Monospace" is also more portable, although it hasn't yet been tested on Windows.
* only relevant for the Gtk UI of course
* It might be a good idea to set SCI_STYLESETCHECKMONOSPACED as well (FIXME?)
|
|
|
|
* NOTE: Selections are currently only used to highlight search results.
* The default selection colors were not always clearly visible with default settings (--no-profile)
and they were not uniform across platforms.
On Curses, the selection would be reversed, while on Gtk it had a lighter foreground color.
They are now always reversed (black on white background).
The default styles do not assume any color support - they use only black and white.
* Since these defaults cannot possibly work on every color scheme,
color.selfore and color.selback have been added to color.tes.
All existing color schemes have been updated to configure selections as reversed
to the default colors.
This especially fixes selection colors on Gtk.
* On solarized.tes, the caret style was already distinct from inversed default colors.
On terminal.tes, the color of the caret is now bright white, so it stands out
from the selection colors.
* In Curses, the caret color is currently __not__ applied to the command line where
it continues to be drawn reversed.
The command line drawing code is considered deprecated and will eventually be replaced
with a Scintilla minibuffer.
* In Gtk, we now apply the caret style to the commandline view as well.
* Fixed the comment color in solarized.light.
|
|
* Esp. with the new Scintilla version, the representation
setting as part of every SCI_SETDOCPOINTER has turned out to
be a performance bottleneck.
* The new Scintilla has a custom tweak/patch that disables any
automatic representation setting in Scintilla itself.
It is now sufficient to initialize the SciTECO-style representations
only once in the lifetime of any view.
|
|
* Previous Scintilla version was 3.6.4 and Scinterm was 1.7 (with lots of custom patches).
All of the patches are now either irrelevant or have been merged upstream.
* Since Scintilla 5 requires C++17, this increases the minimum GCC version at least
to 5.0. We may actually require even newer versions.
* I could not upgrade the scintilla-mirror (which was imported from Mercurial),
so the old sciteco-dev branch was renamed to sciteco-dev-pre-v2.0.0,
master was deleted and I reimported the entire Scintilla repo using
git-remote-hg.
This means that scintilla-mirror now contains two entirely separate trees.
But it is still possible to clone old SciTECO repos.
* The strategy/workflow of maintaining hotfix branches on scintilla-mirror has been changed.
Instead of having one sciteco-dev branch that is rebased onto new Scintilla upstream
releases and tagging SciTECO releases in scintilla-mirror (to keep the commits referenced),
we now create a branch for every Scintilla version we are based on (eg. sciteco-rel-5-1-3).
This branch is never rebased or deleted. Therefore, we are guaranteed to be able to
clone arbitrary SciTECO repo commits - not only releases.
Releases no longer have to be tagged in scintilla-mirror.
On the downside, fixup commits may accumulate in these new branches.
They can only be squashed once a new branch for a new Scintilla release is created
(e.g. by cherry-picking followed by rebase).
* Scinterm no longer has to reside in the Scintilla subdirectory,
so we added it as a regular submodule.
There are no more recursive submodules.
The Scinterm build system has not been improved at all, but we use
a trick based on VPATH to build Scinterm in scintilla/bin/.
* Scinterm is now in Git and we reference the upstream repo for the
time being.
We might mirror it and apply the same branching workflow as with Scintilla
if necessary.
The scinterm-mirror repository still exists but has not been touched.
We will also have to rewrite its master branch as it was a non-reproducible
Mercurial import.
* Scinterm now also comes with patches for Scintilla which we simply applied
on our sciteco-rel-5-1-3 branch.
* Scintilla 5 outsourced its lexers into the Lexilla project.
We added it as yet another submodule.
* All submodules have been moved into contrib/.
* The Scintilla API for setting lexers has consequently changed.
We now have to call SCI_SETILEXER(0, CreateLexer(name)).
As I did not want to introduce a separate command for setting lexers,
<ES> has been extended to allow setting lexers by name with the SCI_SETILEXER
message which effectively replaces SCI_SETLEXERLANGUAGE.
* The lexer macros (SCLEX_...) no longer serve any purpose - they weren't used
in the SciTECO standard library anyway - and have consequently been removed
from symbols-scilexer.c.
The style macros from SciLexer.h (SCE_...) are theoretically still useful - even
though they are not used by our current color schemes - and have therefore been
retained. They can be specified as wParam in <ES>.
* <ES> no longer allows symbolic constants for lParam.
This never made any sense since all supported symbols were always wParam.
* Scinterm supports new native cursor modes.
They are not used for the time being and the previous CARETSTYLE_BLOCK_AFTER
caret style is configured by default.
It makes no sense to enable native cursor modes now since the
command line should have a native cursor but is not yet a Scintilla view.
* Scintilla performed much worse after the upgrade than before,
so some optimizations will be necessary.
|
|
* This was an ancient bug, apparently present since d503c3b07c2157658f699294c44ad5be244727a5 (year 2014),
and it therefore also affected v0.6.4.
|
|
This is a total conversion of SciTECO to plain C (GNU C11).
The chance was taken to improve a lot of internal data structures,
fix fundamental bugs and lay the foundations of future features.
The GTK user interface is now in a usable state!
All changes have been squashed together.
The language itself has almost not changed at all, except for:
* Detection of string terminators (usually Escape) now takes
the string building characters into account.
A string is only terminated outside of string building characters.
In other words, you can now for instance write
I^EQ[Hello$world]$
This removes one of the last bits of shellisms which is out of
place in SciTECO where no tokenization/lexing is performed.
Consequently, the current termination character can also be
escaped using ^Q/^R.
This is used by auto completions to make sure that strings
are inserted verbatim and without unwanted side effects.
* All strings can now safely contain null-characters
(see also: 8-bit cleanliness).
The null-character itself (^@) is not (yet) a valid SciTECO
command, though.
An incomplete list of changes:
* We got rid of the BSD headers for RB trees and lists/queues.
The problem with them was that they used a form of metaprogramming
only to gain a bit of type safety. It also resulted in less
readable code. This was a C++ disease.
The new code avoids metaprogramming only to gain type safety.
The BSD tree.h has been replaced by rb3ptr by Jens Stimpfle
(https://github.com/jstimpfle/rb3ptr).
This implementation is also more memory efficient than BSD's.
The BSD list.h and queue.h have been replaced with a custom
src/list.h.
* Fixed crashes, performance issues and compatibility issues with
the Gtk 3 User Interface.
It is now more or less ready for general use.
The GDK lock is no longer used to avoid using deprecated functions.
On the downside, the new implementation (driving the Gtk event loop
stepwise) is even slower than the old one.
A few glitches remain (see TODO), but it is hoped that they will
be resolved by the Scintilla update which will be performed soon.
* A lot of program units have been split up, so they are shorter
and easier to maintain: core-commands.c, qreg-commands.c,
goto-commands.c, file-utils.h.
* Parser states are simply structs of callbacks now.
They still use a kind of polymorphism using a preprocessor trick.
TECO_DEFINE_STATE() takes an initializer list that will be
merged with the default list of field initializers.
To "subclass" states, you can simply define new macros that add
initializers to existing macros.
* Parsers no longer have a "transitions" table but the input_cb()
may use switch-case statements.
There is also teco_machine_main_transition_t now which can
be used to implement simple transitions. Additionally, you
can specify functions to execute during transitions.
This largely avoids long switch-case-statements.
* Parsers are embeddable/reusable now, at least in parse-only mode.
This does not currently bring any advantages but may later
be used to write a Scintilla lexer for TECO syntax highlighting.
Once parsers are fully embeddable, it will also be possible
to run TECO macros in a kind of coroutine which would allow
them to process string arguments in real time.
* undo.[ch] still uses metaprogramming extensively but via
the C preprocessor of course. On the downside, most undo
token generators must be instantiated explicitly (theoretically
we could have used embedded functions / trampolines to
instantiate automatically but this has turned out to be
dangerous).
There is a TECO_DEFINE_UNDO_CALL() to generate closures for
arbitrary functions now (ie. to call an arbitrary function
at undo-time). This simplified a lot of code and is much
shorter than manually pushing undo tokens in many cases.
* Instead of the ridiculous C++ Curiously Recurring Template
Pattern to achieve static polymorphism for user interface
implementations, we now simply declare all functions to
implement in interface.h and link in the implementations.
This is possible since we no longer have to define
interface subclasses (all state is static variables in
the interface's *.c files).
* Headers are now significantly shorter than in C++ since
we can often hide more of our "class" implementations.
* Memory counting is based on dlmalloc for most platforms now.
Unfortunately, there is no malloc implementation that
provides an efficient constant-time memory counter that
is guaranteed to decrease when freeing memory.
But since we use a defined malloc implementation now,
malloc_usable_size() can be used safely for tracking memory use.
malloc() replacement is very tricky on Windows, so we
use a poll thread on Windows. This can also be enabled
on other supported platforms using --disable-malloc-replacement.
All in all, I'm still not pleased with the state of memory
limiting. It is a mess.
* Error handling uses GError now. This has the advantage that
the GError codes can be reused once we support error catching
in the SciTECO language.
* Added a few more test suite cases.
* Haiku is no longer supported as builds are unstable and
I did not manage to debug them - quite possibly Haiku bugs
were responsible.
* GLib v2.44 or later is now required.
The GTK UI requires Gtk+ v3.12 or later now.
The GtkFlowBox fallback and sciteco-wrapper workaround are
no longer required.
* We now extensively use the GCC/Clang-specific g_auto
feature (automatic deallocations when leaving the current
code block).
* Updated copyright to 2021.
SciTECO has been in continuous development, even though there
have been no commits since 2018.
* Since these changes are so significant, the target release has
been set to v2.0.
It is planned that beginning with v3.0, the language will be
kept stable.
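The preprocessor trick behind "structs of callbacks" with default initializers can be sketched roughly as follows. It relies on the rule that a later designated initializer overrides an earlier one for the same field. The macro and type names below are hypothetical and considerably simpler than the real TECO_DEFINE_STATE(), which may well differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Overriding initializers is intentional here, so silence the warning. */
#pragma GCC diagnostic ignored "-Woverride-init"

typedef struct state state_t;

struct state {
    const char *name;
    int (*input_cb)(state_t *self, char chr);
    int is_start;
};

/* Default callback: accepts nothing. */
static int default_input_cb(state_t *self, char chr)
{
    (void)self;
    (void)chr;
    return 0;
}

/* Callback of a more specific state: accepts upper-case ASCII letters. */
static int upper_input_cb(state_t *self, char chr)
{
    (void)self;
    return chr >= 'A' && chr <= 'Z';
}

/*
 * Defaults come first, the caller's initializers are appended and
 * therefore win whenever they name the same field.
 */
#define DEFINE_STATE(NAME, ...) \
    static state_t NAME = { \
        .name = #NAME, \
        .input_cb = default_input_cb, \
        .is_start = 0, \
        __VA_ARGS__ \
    }

/* "Subclassing": a new macro that adds initializers to an existing one. */
#define DEFINE_START_STATE(NAME, ...) \
    DEFINE_STATE(NAME, .is_start = 1, __VA_ARGS__)

DEFINE_STATE(state_plain);
DEFINE_START_STATE(state_upper, .input_cb = upper_input_cb);
```

Calling DEFINE_STATE() without additional initializers relies on the compiler accepting empty variadic macro arguments, which GCC and Clang do in GNU C mode.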
|