<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sciteco/lib/lexers, branch v2.3.0</title>
<subtitle>Scintilla-based Text Editor and COrrector</subtitle>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/'/>
<entry>
<title>introduced true block and EOL comments</title>
<updated>2024-12-24T10:29:32+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-12-24T10:29:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=ef897b418a4487196e1dbc18a97046f8f0aea2e8'/>
<id>ef897b418a4487196e1dbc18a97046f8f0aea2e8</id>
<content type='text'>
* The previous convention of !* ... *! are now true block comments,
  i.e. they are parsed faster, don't spam the goto table and allow
  embedding of exclamation marks - only "*!" terminates the comment.
* It is therefore now forbidden to have goto labels beginning with "*".
* Also support "!!" to introduce EOL comments (like C++'s //).
  This disallows empty labels, but they weren't useful anyway.
  This is the shortest way to begin a comment.
* All comment labels have been converted to true comments, to ensure
  that syntax highlighting works correctly.
  EOL comments are used for single line commented-out code, since it's
  easiest to uncomment - you don't have to jump to the line end.
  This is a pure convention / coding style.
  Other people might do it differently.
* It's of course still possible to abuse goto labels as comments
  as TECO did for ages.
* In lexing / syntax highlighting, labels and comments are highlighted differently.
* When syntax highlighting, a single "!" will first be highlighted as a label
  since it's not yet unambiguous. Once you type the second character (* or !),
  the first character is retroactively styled as a comment as well.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* The previous convention of !* ... *! are now true block comments,
  i.e. they are parsed faster, don't spam the goto table and allow
  embedding of exclamation marks - only "*!" terminates the comment.
* It is therefore now forbidden to have goto labels beginning with "*".
* Also support "!!" to introduce EOL comments (like C++'s //).
  This disallows empty labels, but they weren't useful anyway.
  This is the shortest way to begin a comment.
* All comment labels have been converted to true comments, to ensure
  that syntax highlighting works correctly.
  EOL comments are used for single line commented-out code, since it's
  easiest to uncomment - you don't have to jump to the line end.
  This is a pure convention / coding style.
  Other people might do it differently.
* It's of course still possible to abuse goto labels as comments
  as TECO did for ages.
* In lexing / syntax highlighting, labels and comments are highlighted differently.
* When syntax highlighting, a single "!" will first be highlighted as a label
  since it's not yet unambiguous. Once you type the second character (* or !),
  the first character is retroactively styled as a comment as well.
</pre>
</div>
</content>
</entry>
<entry>
<title>update sciteco.tes: this again highlights commands, but not Q-Register names</title>
<updated>2024-12-13T21:23:47+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-12-13T21:23:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=a673abe372df0a112e075737640dd3786587f870'/>
<id>a673abe372df0a112e075737640dd3786587f870</id>
<content type='text'>
It's a bit easier on the eyes.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's a bit easier on the eyes.
</pre>
</div>
</content>
</entry>
<entry>
<title>fixup 244a54a18b7db6af177c9d10f3224772f08d7484: abuse the Scintilla view's "identifier" to enable lexing in the container</title>
<updated>2024-12-13T12:28:04+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-12-13T12:17:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=7871c54db8fe2b8dcd1cb81aaec3b85099cb20a8'/>
<id>7871c54db8fe2b8dcd1cb81aaec3b85099cb20a8</id>
<content type='text'>
* SCI_SETILEXER(NULL) is not a reliable way to do that since
  that's the default for all views.
* This was breaking the git.tes lexer for instance and was unnecessarily
  driving teco_lexer_style() on plain-text documents.
* Since we currently do not implement the ILexer5 C++ interface
  and teco_view_t is just a pointer alias, we are abusing the view's "identifier" instead.
  This is probably sufficient, as long as there is only one lexer "in the container".
  Otherwise, there should perhaps be a single C++ class that does nothing but
  wrapping a callback into an ILexer5 object with a C ABI.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* SCI_SETILEXER(NULL) is not a reliable way to do that since
  that's the default for all views.
* This was breaking the git.tes lexer for instance and was unnecessarily
  driving teco_lexer_style() on plain-text documents.
* Since we currently do not implement the ILexer5 C++ interface
  and teco_view_t is just a pointer alias, we are abusing the view's "identifier" instead.
  This is probably sufficient, as long as there is only one lexer "in the container".
  Otherwise, there should perhaps be a single C++ class that does nothing but
  wrapping a callback into an ILexer5 object with a C ABI.
</pre>
</div>
</content>
</entry>
<entry>
<title>implemented Scintilla lexer for SciTECO code, i.e. TECO syntax highlighting</title>
<updated>2024-12-12T21:58:14+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-12-09T09:58:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=244a54a18b7db6af177c9d10f3224772f08d7484'/>
<id>244a54a18b7db6af177c9d10f3224772f08d7484</id>
<content type='text'>
* this works by embedding the SciTECO parser and driving it always (exclusively)
  in parse-only mode.
* A new teco_state_t::style determines the Scintilla style for any character
  accepted in the given state.
* Therefore, the SciTECO lexer is always 100% exact and corresponds to the current
  SciTECO grammer - it does not have to be maintained separately.
  There are a few exceptions and tweaks, though.
* The contents of curly-brace escapes (`@^Uq{...}`) are rendered as ordinary
  code using a separate parser instance.
  This can be disabled with the lexer.sciteco.macrodef property.
  Unfortunately, SciTECO does not currently allow setting lexer properties (FIXME).
* Labels and comments are currently styled the same.
  This could change in the future once we introduce real comments.
* Lexers are usually implemented in C++, but I did not want to draw in C++.
  Especially not since we'd have to include parser.h and other SciTECO headers,
  that really do not want to keep C++-compatible.
  Instead, the lexer is implemented "in the container".
  @ES/SCI_SETILEXER/sciteco/ is internally translated to SCI_SETILEXER(NULL)
  and we get Scintilla notifications when styling the view becomes necessary.
  This is then centrally forwarded to the teco_lexer_style() which
  uses the ordinary teco_view_ssm() API for styling.
* Once the command line becomes a Scintilla view even on Curses,
  we can enabled syntax highlighting of the command line macro.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* this works by embedding the SciTECO parser and driving it always (exclusively)
  in parse-only mode.
* A new teco_state_t::style determines the Scintilla style for any character
  accepted in the given state.
* Therefore, the SciTECO lexer is always 100% exact and corresponds to the current
  SciTECO grammer - it does not have to be maintained separately.
  There are a few exceptions and tweaks, though.
* The contents of curly-brace escapes (`@^Uq{...}`) are rendered as ordinary
  code using a separate parser instance.
  This can be disabled with the lexer.sciteco.macrodef property.
  Unfortunately, SciTECO does not currently allow setting lexer properties (FIXME).
* Labels and comments are currently styled the same.
  This could change in the future once we introduce real comments.
* Lexers are usually implemented in C++, but I did not want to draw in C++.
  Especially not since we'd have to include parser.h and other SciTECO headers,
  that really do not want to keep C++-compatible.
  Instead, the lexer is implemented "in the container".
  @ES/SCI_SETILEXER/sciteco/ is internally translated to SCI_SETILEXER(NULL)
  and we get Scintilla notifications when styling the view becomes necessary.
  This is then centrally forwarded to the teco_lexer_style() which
  uses the ordinary teco_view_ssm() API for styling.
* Once the command line becomes a Scintilla view even on Curses,
  we can enabled syntax highlighting of the command line macro.
</pre>
</div>
</content>
</entry>
<entry>
<title>support the ::S anchored search (string comparison) command (and ::FD, ::FR, ::FS as well)</title>
<updated>2024-12-06T14:20:52+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-12-06T14:20:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=e5884ab2166ab5a03294baa54601b8785e6d9727'/>
<id>e5884ab2166ab5a03294baa54601b8785e6d9727</id>
<content type='text'>
* The colon modifier can now occur 2 times.
  Specifying `@` more than once or `:` more than twice is an error now.
* Commands do not check for excess colon modifiers - almost every command would have
  to check it. Instead, a double colon will simply behave like a single colon on most
  commands.
* All search commands inherit the anchored semantics, but it's not very useful in some combinations
  like -::S, ::N or ::FK.
  That's why the `::` variants are not documented everywhere.
* The lexer.checkheader macro could be simplified and should also be faster now,
  speeding up startup.
  Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* The colon modifier can now occur 2 times.
  Specifying `@` more than once or `:` more than twice is an error now.
* Commands do not check for excess colon modifiers - almost every command would have
  to check it. Instead, a double colon will simply behave like a single colon on most
  commands.
* All search commands inherit the anchored semantics, but it's not very useful in some combinations
  like -::S, ::N or ::FK.
  That's why the `::` variants are not documented everywhere.
* The lexer.checkheader macro could be simplified and should also be faster now,
  speeding up startup.
  Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
</pre>
</div>
</content>
</entry>
<entry>
<title>added special Q-Register ":" for accessing dot</title>
<updated>2024-11-24T01:51:34+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-11-24T01:38:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=23c90e37ff48707c4aabdb8b1460df382a600d7a'/>
<id>23c90e37ff48707c4aabdb8b1460df382a600d7a</id>
<content type='text'>
* We cannot call it "." since that introduces a local register
  and we don't want to add an unnecessary syntactic exception.
* Allows the idiom [: ... ]: to temporarily move around.
  Also, you can now write ^E\: without having to store dot in a register first.
* In the future we might add an ^E register as well for byte offsets.
  However, there are much fewer useful applications.
* Of course, you can now also write nU: instead of nJ, Q: instead of "." and
  n%: instead of "nC.". However it's all not really useful.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* We cannot call it "." since that introduces a local register
  and we don't want to add an unnecessary syntactic exception.
* Allows the idiom [: ... ]: to temporarily move around.
  Also, you can now write ^E\: without having to store dot in a register first.
* In the future we might add an ^E register as well for byte offsets.
  However, there are much fewer useful applications.
* Of course, you can now also write nU: instead of nJ, Q: instead of "." and
  n%: instead of "nC.". However it's all not really useful.
</pre>
</div>
</content>
</entry>
<entry>
<title>Git lexer: added support for TAG_EDITMSG and MERGE_MSG</title>
<updated>2024-09-26T13:12:49+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-26T13:12:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c4a1c3a21818965f669ebc6a081f08323a6919f1'/>
<id>c4a1c3a21818965f669ebc6a081f08323a6919f1</id>
<content type='text'>
* Curses: "icons" have also been added
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Curses: "icons" have also been added
</pre>
</div>
</content>
</entry>
<entry>
<title>improved HTML lexer (html.tes)</title>
<updated>2024-09-17T02:46:12+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-09-17T02:46:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=3db9368a2619ae3bdb4afbc6f72625de6f01208a'/>
<id>3db9368a2619ae3bdb4afbc6f72625de6f01208a</id>
<content type='text'>
This previously highlighted little more than embedded Javascripts.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This previously highlighted little more than embedded Javascripts.
</pre>
</div>
</content>
</entry>
<entry>
<title>added an improvised lexer for styling Git commit messages</title>
<updated>2024-09-09T17:24:18+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-08-31T02:34:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=c222fa3fab470686020c04d10d5d0f5d3f7b6500'/>
<id>c222fa3fab470686020c04d10d5d0f5d3f7b6500</id>
<content type='text'>
It's not a real Lexilla lexer, but simply styles the document
once in lexer.set.git in order to highlight comment lines.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's not a real Lexilla lexer, but simply styles the document
once in lexer.set.git in order to highlight comment lines.
</pre>
</div>
</content>
</entry>
<entry>
<title>added troff/nroff lexer</title>
<updated>2024-08-18T16:48:06+00:00</updated>
<author>
<name>Robin Haberkorn</name>
<email>robin.haberkorn@googlemail.com</email>
</author>
<published>2024-08-18T16:34:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.fmsbw.de/sciteco/commit/?id=bbcf801ddfd3eb50203518c130beb400de7ca53f'/>
<id>bbcf801ddfd3eb50203518c130beb400de7ca53f</id>
<content type='text'>
* This is optimized for Groff, but works for Heirloom Troff and Neatroff as well.
  Currently, the Heirloom and Neatroff requests are just added ontop of the Groff
  ones. Theoretically, we could also try to separate the keyword lists into
  a base K&amp;R set with Groff, Heirloom and Neatroff ontop.
* The lexer necessarily has many restrictions, as Troff is fundamentally unparseable
  (like classic TECO) and needs a lot of per-request knowledge.
* The "*.mm" extension has been removed from the lexers/cpp.tes.
  I don't know what language this was for, and I prefer `*.mm` files
  to be considered Troff.
* Temporarily changed the lexilla submodule URL.
  The corresponding Lexila lexer is in the process of being upstreamed.
  Once it is, I will probably revert the submodule to the official repository,
  as the "troff" branch is not stable (can be rebased).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* This is optimized for Groff, but works for Heirloom Troff and Neatroff as well.
  Currently, the Heirloom and Neatroff requests are just added ontop of the Groff
  ones. Theoretically, we could also try to separate the keyword lists into
  a base K&amp;R set with Groff, Heirloom and Neatroff ontop.
* The lexer necessarily has many restrictions, as Troff is fundamentally unparseable
  (like classic TECO) and needs a lot of per-request knowledge.
* The "*.mm" extension has been removed from the lexers/cpp.tes.
  I don't know what language this was for, and I prefer `*.mm` files
  to be considered Troff.
* Temporarily changed the lexilla submodule URL.
  The corresponding Lexila lexer is in the process of being upstreamed.
  Once it is, I will probably revert the submodule to the official repository,
  as the "troff" branch is not stable (can be rebased).
</pre>
</div>
</content>
</entry>
</feed>
