sciteco/lib/lexers, branch session-extensions

improved the "asm" (x86 assembly) lexer

2025-04-02T23:28:30+00:00

There are still some glitches with non-mainstream assemblers like A86, though.

the ES command (send Scintilla message) now supports passing both wParam and lParam as null-terminated strings

2025-03-23T15:42:07+00:00

* Being able to embed null bytes into the lParam string is
  practically useless - there aren't any messages where this is useful
  and where there are no native SciTECO counterparts - so this case is now caught
  and the null byte separates wParam from lParam.
* wParam can be the empty string, but it is not supported to pass wParam as a
  string and lParam as the empty string.
  If the second string argument ends in ^@, lParam is popped from the stack instead.
* This is a temporary workaround until we can properly parse the Scintilla.iface and
  generate more elegant per-message wrappers.
* It in particular unlocks the SCI_SETREPRESENTATION and SCI_SETPROPERTY messages.
  The former allows us to write a special hex-editor macro which sets hexadecimal
  character representations, while the latter allows you to set lexer properties.
* The C-based lexers ("cpp" in Lexilla) can now take preprocessor definitions into account.
  This is disabled by default, unless you set lexer.c.defines before opening a file.
  You can also set it interactively and re-set the lexer. For instance:
  ^U[lexer.c.defines]NDEBUG$ M[lexer.set.c]
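
The null-byte separation described above can be sketched in C roughly as follows.
This is only an illustration, not the actual ES implementation: split_arg_t and
split_string_arg() are hypothetical names, and treating a string without a null
byte as a plain lParam is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not SciTECO's actual data structure. */
typedef struct {
    const char *wparam;   /* text before the first null byte */
    size_t wparam_len;
    const char *lparam;   /* text after the first null byte */
    size_t lparam_len;
    int has_separator;    /* 0 if the argument contained no null byte */
} split_arg_t;

/*
 * Split a length-counted string argument at its first null byte.
 * Assumption: without a null byte, the whole string is the lParam.
 */
static split_arg_t split_string_arg(const char *buf, size_t len)
{
    assert(buf != NULL);

    split_arg_t r = { buf, 0, buf, len, 0 };
    const char *sep = memchr(buf, '\0', len);

    if (sep != NULL) {
        r.has_separator = 1;
        r.wparam_len = (size_t)(sep - buf);
        r.lparam = sep + 1;
        /* An empty lParam here corresponds to the "ends in ^@" case,
         * where the caller would take lParam from the stack instead. */
        r.lparam_len = len - r.wparam_len - 1;
    }
    return r;
}
```

A caller would check has_separator and lparam_len to decide whether both strings
were given or whether lParam still has to come from the numeric stack.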

sciteco lexer: enable 2-char soft tabs by default

2025-03-12T11:06:58+00:00

* You practically never want to indent in SciTECO code with hard tabs, as the hard tab is
  an insertion command.
* 2-char soft tabs are the convention in SciTECO's included macros.
* Fixes the M#it macro among other things.
* If you do want to insert an insertion-with-tab command (ASCII 9), you will almost always
  want to type it as ^I instead.
  Real ASCII 9s should consequently be highlighted, i.e. there should be a character representation.
  Unfortunately, character representations are currently set in C code and cannot be changed via .

Asciidoc, Markdown and Git lexers: enable word wrapping by default

2025-03-08T19:45:42+00:00

These are all more or less plain text formats.

added "email" lexer for writing mails

2025-03-08T19:03:48+00:00

* Highlights both 1st level and 2nd level quotes and signatures.
* This also sets the edge to 78 columns, as is recommended for email, and
  enables word wrapping.
  The edge mode is not set, since it kind of looks ugly in Scinterm.
* Helps when using SciTECO as the email editor for instance in the
  Aerc mail client.
* Unfortunately, we cannot set up Scintilla to automatically break words
  after 78 columns (or perhaps that's a good thing).
  You can use the M#rf reformat-paragraph macro to reflow paragraphs
  before sending the mail.
  This will take the edge column into account even if no edge mode is set.
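
For illustration, reflowing a single paragraph to an edge column can be sketched
as a greedy in-place word wrap in C. This is not how M#rf is implemented (it is a
SciTECO macro); reflow() is a hypothetical helper that only handles one paragraph
with single spaces between words.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Greedy in-place reflow of one paragraph: existing line breaks are
 * joined and a new break replaces the last space that keeps the line
 * within `edge` characters. Words longer than `edge` are left intact.
 */
static void reflow(char *text, size_t edge)
{
    assert(text != NULL && edge > 0);

    size_t line_start = 0;   /* index of the first character of the current line */
    size_t last_space = 0;   /* index of the last space seen on the current line */
    int have_space = 0;

    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '\n')
            text[i] = ' ';   /* join with the following line */
        if (text[i] == ' ') {
            last_space = i;
            have_space = 1;
        }
        /* Character i would exceed the edge: break at the last space. */
        if (i - line_start >= edge && have_space) {
            text[last_space] = '\n';
            line_start = last_space + 1;
            have_space = 0;
        }
    }
}
```

For email, the edge argument would be 78, matching the edge column set by the lexer.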

introduced true block and EOL comments

2024-12-24T10:29:32+00:00

* Comments following the previous convention of !* ... *! are now true block comments,
  i.e. they are parsed faster, don't spam the goto table and allow
  embedding of exclamation marks - only "*!" terminates the comment.
* It is therefore now forbidden to have goto labels beginning with "*".
* Also support "!!" to introduce EOL comments (like C++'s //).
  This disallows empty labels, but they weren't useful anyway.
  This is the shortest way to begin a comment.
* All comment labels have been converted to true comments, to ensure
  that syntax highlighting works correctly.
  EOL comments are used for single line commented-out code, since it's
  easiest to uncomment - you don't have to jump to the line end.
  This is a pure convention / coding style.
  Other people might do it differently.
* It's of course still possible to abuse goto labels as comments
  as TECO did for ages.
* In lexing / syntax highlighting, labels and comments are highlighted differently.
* When syntax highlighting, a single "!" will first be highlighted as a label
  since it's not yet unambiguous. Once you type the second character (* or !),
  the first character is retroactively styled as a comment as well.
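
The distinction between labels, block comments and EOL comments can be sketched
in C as below. This is a simplified illustration and not the SciTECO parser:
bang_kind_t and classify_bang() are hypothetical names, and searching for "*!"
only after the opening "!*" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    TOK_NONE,           /* input does not start with '!' */
    TOK_LABEL,          /* !label! (or a lone "!" that is still ambiguous) */
    TOK_BLOCK_COMMENT,  /* !* ... *! */
    TOK_EOL_COMMENT     /* !! ... up to the end of the line */
} bang_kind_t;

/*
 * Classify the construct at the start of the null-terminated string s
 * and store its length in *len. Unterminated constructs extend to the
 * end of the input.
 */
static bang_kind_t classify_bang(const char *s, size_t *len)
{
    assert(s != NULL && len != NULL);

    size_t n = strlen(s);
    const char *end;

    *len = 0;
    if (n == 0 || s[0] != '!')
        return TOK_NONE;

    if (n >= 2 && s[1] == '*') {
        /* Only "*!" terminates, so single '!' characters may be embedded. */
        end = strstr(s + 2, "*!");
        *len = end != NULL ? (size_t)(end - s) + 2 : n;
        return TOK_BLOCK_COMMENT;
    }
    if (n >= 2 && s[1] == '!') {
        end = strchr(s + 2, '\n');
        *len = end != NULL ? (size_t)(end - s) : n;
        return TOK_EOL_COMMENT;
    }

    /* Anything else is a goto label ending at the next '!'. */
    end = strchr(s + 1, '!');
    *len = end != NULL ? (size_t)(end - s) + 1 : n;
    return TOK_LABEL;
}
```

Note that a lone "!" is reported as a label here, which mirrors the retroactive
restyling described above once the second character is typed.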

update sciteco.tes: this again highlights commands, but not Q-Register names

2024-12-13T21:23:47+00:00

It's a bit easier on the eyes.

fixup 244a54a18b7db6af177c9d10f3224772f08d7484: abuse the Scintilla view's "identifier" to enable lexing in the container

2024-12-13T12:28:04+00:00

* SCI_SETILEXER(NULL) is not a reliable way to do that since
  that's the default for all views.
* This was breaking the git.tes lexer for instance and was unnecessarily
  driving teco_lexer_style() on plain-text documents.
* Since we currently do not implement the ILexer5 C++ interface
  and teco_view_t is just a pointer alias, we are abusing the view's "identifier" instead.
  This is probably sufficient, as long as there is only one lexer "in the container".
  Otherwise, there should perhaps be a single C++ class that does nothing but
  wrap a callback into an ILexer5 object with a C ABI.

implemented Scintilla lexer for SciTECO code, i.e. TECO syntax highlighting

2024-12-12T21:58:14+00:00

* This works by embedding the SciTECO parser and always (exclusively) driving it
  in parse-only mode.
* A new teco_state_t::style determines the Scintilla style for any character
  accepted in the given state.
* Therefore, the SciTECO lexer is always 100% exact and corresponds to the current
  SciTECO grammar - it does not have to be maintained separately.
  There are a few exceptions and tweaks, though.
* The contents of curly-brace escapes (`@^Uq{...}`) are rendered as ordinary
  code using a separate parser instance.
  This can be disabled with the lexer.sciteco.macrodef property.
  Unfortunately, SciTECO does not currently allow setting lexer properties (FIXME).
* Labels and comments are currently styled the same.
  This could change in the future once we introduce real comments.
* Lexers are usually implemented in C++, but I did not want to draw in C++.
  Especially not since we'd have to include parser.h and other SciTECO headers
  that I really do not want to keep C++-compatible.
  Instead, the lexer is implemented "in the container".
  @ES/SCI_SETILEXER/sciteco/ is internally translated to SCI_SETILEXER(NULL)
  and we get Scintilla notifications when styling the view becomes necessary.
  This is then centrally forwarded to teco_lexer_style(), which
  uses the ordinary teco_view_ssm() API for styling.
* Once the command line becomes a Scintilla view even on Curses,
  we can enable syntax highlighting of the command line macro.
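
The idea of a per-state style can be illustrated with a toy state machine in C.
The grammar here is made up ('I' starts a string argument, '$' stands in for the
escape that ends it) and all names are hypothetical; the real lexer drives the
full SciTECO parser instead.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { STYLE_COMMAND, STYLE_STRING } style_t;
typedef enum { ST_START, ST_STRING, ST_COUNT } state_id_t;

/* Every parser state carries the style of the characters it accepts. */
static const style_t state_style[ST_COUNT] = {
    [ST_START]  = STYLE_COMMAND,
    [ST_STRING] = STYLE_STRING
};

/* Toy transition function: returns the state following character ch. */
static state_id_t step(state_id_t st, char ch)
{
    switch (st) {
    case ST_START:  return ch == 'I' ? ST_STRING : ST_START;
    case ST_STRING: return ch == '$' ? ST_START : ST_STRING;
    default:        return ST_START;
    }
}

/*
 * Style each character with the style of the state that accepts it,
 * then feed it to the state machine. No command is ever executed,
 * which corresponds to parse-only mode.
 */
static void style_text(const char *text, size_t len, style_t *styles)
{
    assert(text != NULL && styles != NULL);

    state_id_t st = ST_START;
    for (size_t i = 0; i < len; i++) {
        styles[i] = state_style[st];
        st = step(st, text[i]);
    }
}
```

Since the styles come directly from the parser states, there is no second grammar
to keep in sync with the parser.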

support the ::S anchored search (string comparison) command (and ::FD, ::FR, ::FS as well)

2024-12-06T14:20:52+00:00

* The colon modifier can now occur 2 times.
  Specifying `@` more than once or `:` more than twice is an error now.
* Commands do not check for excess colon modifiers - almost every command would have
  to check it. Instead, a double colon will simply behave like a single colon on most
  commands.
* All search commands inherit the anchored semantics, but it's not very useful in some combinations
  like -::S, ::N or ::FK.
  That's why the `::` variants are not documented everywhere.
* The lexer.checkheader macro could be simplified and should also be faster now,
  speeding up startup.
  Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
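
Conceptually, an anchored search is a comparison at the current position instead
of a scan through the buffer. A minimal sketch in C, assuming a plain literal
pattern without any of SciTECO's pattern match constructs (anchored_match() is a
hypothetical helper):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the literal pattern pat occurs in buf exactly at
 * position pos, 0 otherwise. Nothing after pos is scanned.
 */
static int anchored_match(const char *buf, size_t len, size_t pos,
                          const char *pat)
{
    assert(buf != NULL && pat != NULL);

    size_t plen = strlen(pat);
    if (pos > len || plen > len - pos)
        return 0;
    return memcmp(buf + pos, pat, plen) == 0;
}
```

Checking a file header like "#!" then only costs a comparison at position 0,
which is why such a check can be cheaper than an unanchored search.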