aboutsummaryrefslogtreecommitdiffhomepage
path: root/lib/lexers
AgeCommit message (Collapse)AuthorFilesLines
2025-04-03improved the "asm" (x86 assembly) lexerRobin Haberkorn1-1/+9
There are still some glitches with non-mainstream assemblers like A86, though.
2025-03-23the ES command (send Scintilla message) now supports passing both wParam and ↵Robin Haberkorn4-0/+0
lParam as null-terminated strings * Being able to embed null bytes into the lParam string is practically useless - there aren't any messages where this is useful and where there are no native SciTECO counterparts - so this case is now catched and the null-byte separates wParam from lParam. * wParam can be the empty string, but it is not supported to pass wParam as a string and lParam as the empty string. If the second string argument ends in ^@, lParam is popped from the stack instead. * This is a temporary workaround until we can properly parse the Scintilla.iface and generate more elegant per-message wrappers. * It in particular unlocks the SCI_SETREPRESENTATION and SCI_SETPROPERTY messages. The former allows us to write a special hex-editor macro which sets hexadecimal character representations, while the latter allows you to set lexer properties. * The C-based lexers ("cpp" in Lexilla) can now take preprocessor definitions into account. This is disabled by default, unless you set lexer.c.defines before opening a file. You can also set it interactively and re-set the lexer. For instance: ^U[lexer.c.defines]NDEBUG$ M[lexer.set.c]
2025-03-12sciteco lexer: enable 2-char soft tabs by defaultRobin Haberkorn1-0/+4
* You practically never want to indent in SciTECO code with hard tabs, as the hard tab is an insertion command. * 2-char soft tabs are the convention in SciTECO's included macros. * Fixes the M#it macro among other things. * If you do want to insert an insertion-with-tab command (ASCII 9), you almost always will want to type it ^I instead. Real ASCII 9s should consequently be highlighted, ie. there should be a character representation. Unfortunately, character representations are currently set in C code and cannot be changed via <ES>.
2025-03-08Asciidoc, Markdown and Git lexers: enable word wrapping by defaultRobin Haberkorn3-0/+6
These are all more or less plain text formats.
2025-03-08added "email" lexer for writing mailsRobin Haberkorn1-0/+34
* Highlights both 1st level and 2nd level quotes and signatures. * This also sets the edge to 78 columns, as is recommended for email and enables word wrapping. The edge mode is not set, since it kind of looks ugly in Scinterm. * Helps when using SciTECO as the email editor for instance in the Aerc mail client. * Unfortunately, we cannot set up Scintilla to automatically break words after 78 columns (or perhaps that's a good thing). You can use the M#rf reformat-paragraph macro to reflow paragraphs before sending the mail. This will take the edge column into account even if no edge mode is set.
2024-12-24introduced true block and EOL commentsRobin Haberkorn72-99/+109
* The previous convention of !* ... *! are now true block comments, i.e. they are parsed faster, don't spam the goto table and allow embedding of exclamation marks - only "*!" terminates the comment. * It is therefore now forbidden to have goto labels beginning with "*". * Also support "!!" to introduce EOL comments (like C++'s //). This disallows empty labels, but they weren't useful anyway. This is the shortest way to begin a comment. * All comment labels have been converted to true comments, to ensure that syntax highlighting works correctly. EOL comments are used for single line commented-out code, since it's easiest to uncomment - you don't have to jump to the line end. This is a pure convention / coding style. Other people might do it differently. * It's of course still possible to abuse goto labels as comments as TECO did for ages. * In lexing / syntax highlighting, labels and comments are highlighted differently. * When syntax highlighting, a single "!" will first be highlighted as a label since it's not yet unambiguous. Once you type the second character (* or !), the first character is retroactively styled as a comment as well.
2024-12-14update sciteco.tes: this again highlights commands, but not Q-Register namesRobin Haberkorn1-2/+2
It's a bit easier on the eyes.
2024-12-13fixup 244a54a18b7db6af177c9d10f3224772f08d7484: abuse the Scintilla view's ↵Robin Haberkorn1-1/+1
"identifier" to enable lexing in the container * SCI_SETILEXER(NULL) is not a reliable way to do that since that's the default for all views. * This was breaking the git.tes lexer for instance and was unnecessarily driving teco_lexer_style() on plain-text documents. * Since we currently do not implement the ILexer5 C++ interface and teco_view_t is just a pointer alias, we are abusing the view's "identifier" instead. This is probably sufficient, as long as there is only one lexer "in the container". Otherwise, there should perhaps be a single C++ class that does nothing but wrapping a callback into an ILexer5 object with a C ABI.
2024-12-13implemented Scintilla lexer for SciTECO code, i.e. TECO syntax highlightingRobin Haberkorn1-0/+20
* this works by embedding the SciTECO parser and driving it always (exclusively) in parse-only mode. * A new teco_state_t::style determines the Scintilla style for any character accepted in the given state. * Therefore, the SciTECO lexer is always 100% exact and corresponds to the current SciTECO grammer - it does not have to be maintained separately. There are a few exceptions and tweaks, though. * The contents of curly-brace escapes (`@^Uq{...}`) are rendered as ordinary code using a separate parser instance. This can be disabled with the lexer.sciteco.macrodef property. Unfortunately, SciTECO does not currently allow setting lexer properties (FIXME). * Labels and comments are currently styled the same. This could change in the future once we introduce real comments. * Lexers are usually implemented in C++, but I did not want to draw in C++. Especially not since we'd have to include parser.h and other SciTECO headers, that really do not want to keep C++-compatible. Instead, the lexer is implemented "in the container". @ES/SCI_SETILEXER/sciteco/ is internally translated to SCI_SETILEXER(NULL) and we get Scintilla notifications when styling the view becomes necessary. This is then centrally forwarded to the teco_lexer_style() which uses the ordinary teco_view_ssm() API for styling. * Once the command line becomes a Scintilla view even on Curses, we can enabled syntax highlighting of the command line macro.
2024-12-06support the ::S anchored search (string comparison) command (and ::FD, ::FR, ↵Robin Haberkorn4-4/+4
::FS as well) * The colon modifier can now occur 2 times. Specifying `@` more than once or `:` more than twice is an error now. * Commands do not check for excess colon modifiers - almost every command would have to check it. Instead, a double colon will simply behave like a single colon on most commands. * All search commands inherit the anchored semantics, but it's not very useful in some combinations like -::S, ::N or ::FK. That's why the `::` variants are not documented everywhere. * The lexer.checkheader macro could be simplified and should also be faster now, speeding up startup. Eventually this macro can be made superfluous, e.g. by using 1:FB or 0,1^Q::S.
2024-11-24added special Q-Register ":" for accessing dotRobin Haberkorn1-4/+2
* We cannot call it "." since that introduces a local register and we don't want to add an unnecessary syntactic exception. * Allows the idiom [: ... ]: to temporarily move around. Also, you can now write ^E\: without having to store dot in a register first. * In the future we might add an ^E register as well for byte offsets. However, there are much fewer useful applications. * Of course, you can now also write nU: instead of nJ, Q: instead of "." and n%: instead of "nC.". However it's all not really useful.
2024-09-26Git lexer: added support for TAG_EDITMSG and MERGE_MSGRobin Haberkorn1-0/+2
* Curses: "icons" have also been added
2024-09-17improved HTML lexer (html.tes)Robin Haberkorn1-3/+48
This previously highlighted little more than embedded Javascripts.
2024-09-09added an improvised lexer for styling Git commit messagesRobin Haberkorn1-0/+16
It's not a real Lexilla lexer, but simply styles the document once in lexer.set.git in order to highlight comment lines.
2024-08-18added troff/nroff lexerRobin Haberkorn2-1/+86
* This is optimized for Groff, but works for Heirloom Troff and Neatroff as well. Currently, the Heirloom and Neatroff requests are just added ontop of the Groff ones. Theoretically, we could also try to separate the keyword lists into a base K&R set with Groff, Heirloom and Neatroff ontop. * The lexer necessarily has many restrictions, as Troff is fundamentally unparseable (like classic TECO) and needs a lot of per-request knowledge. * The "*.mm" extension has been removed from the lexers/cpp.tes. I don't know what language this was for, and I prefer `*.mm` files to be considered Troff. * Temporarily changed the lexilla submodule URL. The corresponding Lexila lexer is in the process of being upstreamed. Once it is, I will probably revert the submodule to the official repository, as the "troff" branch is not stable (can be rebased).
2023-04-16added Asciidoc lexer configRobin Haberkorn1-0/+28
2022-11-27added Markdown and YAML lexer configsRobin Haberkorn2-0/+50
* For markdown.tes we should better introduce new predefined colors in the color scheme files since it doesn't map well to existing colors. For italic and bold, I am not using the predefined colors at all but only set the bold and italic style attributes -- this should still be portable across color schemes.
2022-11-21improved the C/C++ and Gob lexersRobin Haberkorn4-4/+13
* single quoted constants are highlighted like single quoted strings in all other auto-generated lexers using "CPP". * recognize /// and //! and comments after preprocessor statements
2022-11-21added lexers for Python and Linux Device TreesRobin Haberkorn2-0/+93
* The device tree lexer reuses CPP and has certain limitations. For once it does not recognize /keywords/ and secondly it confuses properties beginning with # as preprocessor statements.
2021-10-11upgraded to Scintilla 5.1.3 and Scinterm 3.1Robin Haberkorn71-71/+71
* Previous Scintilla version was 3.6.4 and Scinterm was 1.7 (with lots of custom patches). All of the patches are now either irrelevant or have been merged upstream. * Since Scintilla 5 requires C++17, this increases the minimum GCC version at least to 5.0. We may actually require even newer versions. * I could not upgrade the scintilla-mirror (which was imported from Mercurial), so the old sciteco-dev branch was renamed to sciteco-dev-pre-v2.0.0, master was deleted and I reimported the entire Scintilla repo using git-remote-hg. This means that scintilla-mirror now contains two entirely separate trees. But it is still possible to clone old SciTECO repos. * The strategy/workflow of maintaining hotfix branches on scintilla-mirror has been changed. Instead of having one sciteco-dev branch that is rebased onto new Scintilla upstream releases and tagging SciTECO releases in scintilla-mirror (to keep the commits referenced), we now create a branch for every Scintilla version we are based on (eg. sciteco-rel-5-1-3). This branch is never rebased or deleted. Therefore, we are guaranteed to be able to clone arbitrary SciTECO repo commits - not only releases. Releases no longer have to be tagged in scintilla-mirror. On the downside, fixup commits may accumulate in these new branches. They can only be squashed once a new branch for a new Scintilla release is created (e.g. by cherry-picking followed by rebase). * Scinterm does no longer have to reside in the Scintilla subdirectory, so we added it as a regular submodule. There are no more recursive submodules. The Scinterm build system has not been improved at all, but we use a trick based on VPATH to build Scinterm in scintilla/bin/. * Scinterm is now in Git and we reference the upstream repo for the time being. We might mirror it and apply the same branching workflow as with Scintilla if necessary. The scinterm-mirror repository still exists but has not been touched. We will also have to rewrite its master branch as it was a non-reproducible Mercurial import. * Scinterm now also comes with patches for Scintilla which we simply applied on our sciteco-rel-5-1-3 branch. * Scintilla 5 outsourced its lexers into the Lexilla project. We added it as yet another submodule. * All submodules have been moved into contrib/. * The Scintilla API for setting lexers has consequently changed. We now have to call SCI_SETILEXER(0, CreateLexer(name)). As I did not want to introduce a separate command for setting lexers, <ES> has been extended to allow setting lexers by name with the SCI_SETILEXER message which effectively replaces SCI_SETLEXERLANGUAGE. * The lexer macros (SCLEX_...) no longer serve any purpose - they weren't used in the SciTECO standard library anyway - and have consequently been removed from symbols-scilexer.c. The style macros from SciLexer.h (SCE_...) are theoretically still useful - even though they are not used by our current color schemes - and have therefore been retained. They can be specified as wParam in <ES>. * <ES> no longer allows symbolic constants for lParam. This never made any sense since all supported symbols were always wParam. * Scinterm supports new native cursor modes. They are not used for the time being and the previous CARETSTYLE_BLOCK_AFTER caret style is configured by default. It makes no sense to enable native cursor modes now since the command line should have a native cursor but is not yet a Scintilla view. * The Scintilla upgrade performed much worse than before, so some optimizations will be necessary.
2017-11-16CPP lexer: support *.ino files (Arduino IDE sketches)Robin Haberkorn1-1/+2
* a proper Arduino lexer supporting the special Arduino keywords/classes could in principle be written, but for the time being they're treated just like regular C++ sources
2016-11-22womanpage lexer: fixed popup stylingRobin Haberkorn1-1/+1
* the ESSTYLECLEARALL$$ was resetting the STYLE_CALLTIP (and others) resulting in wrongly-styled popups. * We now only change STYLE_DEFAULT for Gtk UIs and use `color.init` to reinitialize the other styles (not very elegant).
2016-11-18implemented self-documenting (online) help systemRobin Haberkorn1-0/+20
* the new "?" (help) command can be used to look up help topics. * help topics are index from $SCITECOPATH/women/*.woman.tec files. * looking up a help topic opens the corresponding "womanpage" and jumps to the position of the topic (it acts like an anchor into the document). * styling is performed by *.woman.tec files. * Setting up the Scintilla view and munging the *.tec file is performed by the new "woman.tes" lexer. On supporting UIs (Gtk), womanpages are shown in a variable-width font. * Woman pages are usually not hand-written, but generated from manpages. A special Groff post-processor grosciteco has been introduced for this purpose. It is much like grotty, but can output SciTECO macros for styling the document (ie. the *.woman.tec files). It is documented in its own man-page. * grosciteco also introduces sciteco.tmac - special Troff macros for controlling the formatting of the document in SciTECO. It also defines .SCITECO_TOPIC which can be used to mark up help topics/terms in Troff markup. * Woman pages are generated/formatted by grosciteco at compile-time, so they will work on platforms without Groff (ie. as on windows). * Groff has been added as a hard compile-time requirement. * The sciteco(1) and sciteco(7) man pages have been augmented with help topic anchors.
2016-02-19fixed Objective C++ lexing: it is now handled by cpp.tesRobin Haberkorn2-11/+14
* the *.mm extension is for Objective C++. Therefore cpp.tes should be responsible. * Objective C keywords have been added to lexer.c.basekeywords. It does not hurt adding them to all C descendants.
2016-02-17added lexing support for Gob2 (GObject Builder)Robin Haberkorn1-0/+36
* this assumes that Gob2 produces plain-C output (no C++ keywords are added) and all Gob keywords are real keywords - even though they might be used in function bodies or %{ %} enclosed blocks.
2016-02-17The "cpp" lexer configuration has been split into "c.tes" and "cpp.tes"Robin Haberkorn2-23/+63
* The keyword list is too different in C when compared to C++. The many additional keywords are annoying when editing plain C files. * Underscored C99 and C11 keywords (like _Bool) have been added to the "c.tes" lexer configuration. The C++ language does not contain these keywords. However, C has stdbool.h to define bool which is part of standard C++. * Therefore a macro "lexer.c.basekeywords" has been defined for all languages __directly__ derived (more or less supersets) of C. It contains most of the C99/C11 standard header shortcuts. * Objective C lexing is set up by c.tes since Objective C is a relatively strict superset of C. All Objective C keywords are handled by c.tes. Since they begin with "@", this should not cause problems when editing plain C files.
2016-02-17simplified "lexer.test..." macros using the $$ return commandRobin Haberkorn42-186/+186
* this is slightly more efficient than using repeated conditionals * the last :EN does not require a conditional, as its return value can simply be forwarded. * even without $$, this could have been done easier using a once-only loop and breaking out of the loop if :EN fails using :;. The last :EN result is still stored in QReg "_". * :EN could also be used to match header lines if lexer.tes would leave the first line (header line) in some Q-Reg, like the local .[header]. However, repeated :ENs would be necessary as globbing currently does not support {...,...} expansions. Since sooner or later, the header line must be evaluated for some lexer.set macro, this is probably more efficient than the current solution using SciTECO patterns and [lexer.checkheader] could be removed as well.
2016-02-15lexer library: avoid unnecessary argument discards after SETLEXERLANGUAGERobin Haberkorn69-69/+69
* causes problems with the $$ command implemented * was already fixed in scite2co.lua but the existing code was manually updated and generated with an earlier version of scite2co.lua
2015-07-21fixed scite2co.lua: use SCI_SETLEXERLANGUAGE instead of SCI_SETLEXERRobin Haberkorn69-69/+69
* the lexer names used in SciTE property files are not SCLEX constants but the internal LexerModule names, so auto-generated SciTECO lexer configurations can only be set by name, i.e. via SETLEXERLANGUAGE, since we cannot easily map those names to SCLEX constants. * should be about as fast as using SCI_SETLEXER (since SciTECO has to look up symbolic names as well at runtime). * this especially fixes opening *.mak files -- often Makefiles but identified as "Mako" files. The macro "lexer.set.mako" used the wrong SCLEX_ symbol. * will also fix the HTML and all other lexers that use the SCLEX_HTML/hypertext lexer. * all lexer files have been updated, to be more compatible with scite2co.lua's output. This eases lexer updates in the future.
2015-05-25extended <EN> command and used it to optimize "lexer.test..." macrosRobin Haberkorn69-218/+308
* EN may now be used for matching file names (similar to fnmatch(3)). This is used to check the current buffers file extension in the lexer configuration macros instead of using expensive Q-Register manipulations. This halves the overall startup time - it is now acceptable even with the current amount of lexer configurations. * EN may now be used for checking file types. session.tes has been simplified. * BREAKS macro portability (EN now has 2 string arguments). * The Globber class has been extended to allow filtering of glob results by file type.
2015-03-24added new lexer configs auto-generated by scite2co.luaRobin Haberkorn62-0/+3123
* these are still not all languages supported by Scintilla. scite2co.lua does not do a good job of generating styles when SciTE's property files use hardcoded colors/fonts. This commit only includes reasonably good conversion results. The remaining languages need some additional manual labor. * Even these lexers are not perfect and should be revised by comparing them with SciTE's properties. * So many lexers make the "lexer.auto" macro too slow. We need some optimization. E.g. the search-command optimization described in TODO, or an extended EN command for globbing manually specified file names.
2015-03-24reformatted existing lexer definitionsRobin Haberkorn7-145/+111
* they are updated with the results of scite2co.lua This makes it easier in the future to update lexer settings based on the property files of new SciTE releases.
2014-11-22added globbing command ENRobin Haberkorn7-28/+0
* implements the same globbing as the EB command already did * uses Globber helper class that behaves more like UNIX glob(). glib only has a glob-style pattern matcher. * The Globber class may be extended later to provide more UNIX-like globbing. * lexer.tes has been updated to make use of globbing. Now, lexers can be automatically loaded and registered at startup. To install a new lexer, it's sufficient to copy a file to the lexers/ directory.
2014-11-20lexer library: added M[lexer.checkheader] and M[lexer.checkname] for ↵Robin Haberkorn7-15/+18
matching a pattern against the first line of a buffer or its filename. This simplifies the "lexer.test..." macros and allows us to select lexers based on the #! line.
2014-11-19added first draft of new modular lexer systemRobin Haberkorn7-0/+270