aboutsummaryrefslogtreecommitdiffhomepage
path: root/src/lexer.c
AgeCommit message (Collapse)AuthorFilesLines
2025-01-13updated copyright to 2025Robin Haberkorn1-1/+1
2024-12-24introduced true block and EOL commentsRobin Haberkorn1-20/+31
* The previous convention of !* ... *! are now true block comments, i.e. they are parsed faster, don't spam the goto table and allow embedding of exclamation marks - only "*!" terminates the comment. * It is therefore now forbidden to have goto labels beginning with "*". * Also support "!!" to introduce EOL comments (like C++'s //). This disallows empty labels, but they weren't useful anyway. This is the shortest way to begin a comment. * All comment labels have been converted to true comments, to ensure that syntax highlighting works correctly. EOL comments are used for single line commented-out code, since it's easiest to uncomment - you don't have to jump to the line end. This is a pure convention / coding style. Other people might do it differently. * It's of course still possible to abuse goto labels as comments as TECO did for ages. * In lexing / syntax highlighting, labels and comments are highlighted differently. * When syntax highlighting, a single "!" will first be highlighted as a label since it's not yet unambiguous. Once you type the second character (* or !), the first character is retroactively styled as a comment as well.
2024-12-22fixed lexing (syntax highlighting) of the null-character (^@) in SciTECO codeRobin Haberkorn1-2/+6
* Apparently g_utf8_get_char_validated() sometimes(!) returns -2 for null-characters, so it was considered an invalid byte sequence. * What's strange and unexplainable is that other uses of the function, as are behind nA and nQq, did not cause problems and returned 0 for null-bytes. * This also fixes syntax higlighting of .teco_session files which use the null-byte as the string terminator. (.teco_session files are not highlighted automatically, though.)
2024-12-13implemented Scintilla lexer for SciTECO code, i.e. TECO syntax highlightingRobin Haberkorn1-0/+235
* this works by embedding the SciTECO parser and driving it always (exclusively) in parse-only mode. * A new teco_state_t::style determines the Scintilla style for any character accepted in the given state. * Therefore, the SciTECO lexer is always 100% exact and corresponds to the current SciTECO grammer - it does not have to be maintained separately. There are a few exceptions and tweaks, though. * The contents of curly-brace escapes (`@^Uq{...}`) are rendered as ordinary code using a separate parser instance. This can be disabled with the lexer.sciteco.macrodef property. Unfortunately, SciTECO does not currently allow setting lexer properties (FIXME). * Labels and comments are currently styled the same. This could change in the future once we introduce real comments. * Lexers are usually implemented in C++, but I did not want to draw in C++. Especially not since we'd have to include parser.h and other SciTECO headers, that really do not want to keep C++-compatible. Instead, the lexer is implemented "in the container". @ES/SCI_SETILEXER/sciteco/ is internally translated to SCI_SETILEXER(NULL) and we get Scintilla notifications when styling the view becomes necessary. This is then centrally forwarded to the teco_lexer_style() which uses the ordinary teco_view_ssm() API for styling. * Once the command line becomes a Scintilla view even on Curses, we can enabled syntax highlighting of the command line macro.