.\" t
.ds ST \\fB@PACKAGE_NAME@\\fP
.
.
.TH "@PACKAGE@" 7 \
"@DATE@" \
"@PACKAGE_NAME@ Version @PACKAGE_VERSION@"
..
.
.SH NAME
@PACKAGE@ \-\- Language reference for the \*(ST text editor and language
.
.
.SH INTRODUCTION
.
\*(ST is a powerful editor and interactive programming language
following a unique paradigm of text editing.
It differs both from popular screen editors like \fIex\fP
and from traditional command-based editors like \fIed\fP.
Both of these paradigms can be understood as language-based.
Screen editors use simple languages based on commands that closely
correspond with key presses (keyboard commands) while command-based
editors use more complex, often Turing-complete languages.
Screen editors interpret their language immediately while command-based
editors do so only after complete input of an expression.
Some editors like \fIVi\fP or \fIemacs\fP support both screen-editing and
command-lines in different modes of operation.
While such editors represent a compromise between both paradigms
(they support both paradigms to some extent), \*(ST represents a
.BR synthesis .
In \*(ST the screen-editing and command languages are the same!
The \*(ST language can be interpreted interactively and commands can be
as simple as single key presses or as complex as nested high-level
structured programming constructs.
While screen-editors often support an undo-stack to undo simple operations,
\*(ST supports undoing on a per-character, per-command or user-configurable
level, undoing most of the side-effects on the overall editor state.
.LP
\*(ST is a member of the TECO family of text editor languages, invented by
Dan Murphy in the 1960s.
TECO was initially a purely command-driven editor that evolved screen editing
features later on, culminating in the invention of Emacs as a TECO
customization (macro package) in the 1970s.
Emacs later became an independent program that eventually dropped TECO as
its scripting language in favour of Lisp.
\*(ST is not the first attempt to devise a TECO-based interactive screen editor.
It is very similar to \fIVideoTECO\fP, a little-known editor that pioneered
the concept.
Where \fIVideoTECO\fP wanted to \(lqoutdo classic TECOs in every way\(rq,
\*(ST wants to outdo \fIVideoTECO\fP in every way.
.LP
When using \*(ST in interactive mode, it is important to know exactly how
\*(ST translates and processes user input.
Generally speaking, the user inputs TECO code that is parsed and executed
on the fly by a stream parser/executor with the possibility to rub out
code from the end of the input stream, reversing its side-effects.
The input stream is called the command line macro.
The language of the command-line macro is basically the same as the language
used in batch processing, with the exception of some commands that depend on
the command line macro or undo-capabilities that are disabled in batch mode.
To add to or remove from the command line macro, \*(ST supports a number
of special keys called immediate editing commands.
The key-processing for immediate editing commands is described in the next
two sections.
While immediate editing commands can be understood as yet another - albeit
limited - language for screen editing, \*(ST also supports regular commands
for command-line editing.
.
.
.SH KEY TRANSLATION
.
When the user presses a key or key-combination it is first translated to
an ASCII character.
All immediate editing commands and regular \*(ST commands operate on a
language based solely on
.B ASCII
characters.
The rules for translating keys are as follows:
.RS
.IP 1. 4
Keys with a printable representation (letters, digits and special
characters) are translated to their printable representation.
Shift-combinations automatically result in upper-case letters.
.IP 2.
Control-combinations (e.g. CTRL+A) are translated to control codes,
that is, codes smaller than 32.
The control code can be calculated by stripping the seventh bit from the
upper-case letter's ASCII code.
So for instance, the upper or lower case A (65) will be translated to code 1,
B to code 2, et cetera.
\*(ST echoes control codes as a caret followed by the corresponding
upper-case letter, so you seldom need to know a control code's actual
numeric value.
For instance, control code 2 - typed CTRL+B - will be echoed \(lq^B\(rq.
\*(ST also treats caret-letter combinations as equivalent to control codes
under most circumstances.
.IP 3.
A few keys with non-printable representation are translated to control
codes as well.
The most prominent is the Escape key - it is translated to code 27.
The Backspace key will be translated to code 8, and the Tab key to code 9.
Last but not least, the Return key is translated to the current buffer's
end of line sequence (linefeed, carriage return followed by linefeed
or just carriage return).
.IP 4.
A selection of other keys without printable representation (called
function keys) are translated to user-definable character sequences.
This feature is called function key macros and explained in the next
subsection.
.RE
.
.SS Function Key Macros
.
By default function keys except Escape, Backspace and Return are ignored by
\*(ST.
By setting bit 6 of the \fBED\fP flag variable, function key handling is
enabled:
.EX
0,64ED
.EE
This is usually performed in the editor profile.
With certain interfaces (curses) after enabling function keys,
the Escape key might only be handled after a short delay.
This is because it might be used by the terminal to transmit
Escape Sequences.
.LP
Enabling function keys also enables Function Key Macros.
These are Q-Register strings inserted into the command stream
(before immediate editing command handling) when certain function
keys (or combinations) are pressed.
The following Function Key Macro registers are supported:
.TP 9
.B ^FDOWN
.TQ
.B ^FUP
Inserted when the down/up cursor keys are pressed.
.TP
.B ^FLEFT
.TQ
.B ^FSLEFT
Inserted when the left or shift-left cursor keys are pressed.
.TP
.B ^FRIGHT
.TQ
.B ^FSRIGHT
Inserted when the right or shift-right cursor keys are pressed.
.TP
.B ^FHOME
.TQ
.B ^FSHOME
Inserted when the Home or shift-Home keys are pressed.
.TP
.BI ^FF x
Inserted when the Fx-key is pressed
.RI ( x
is a number between 0 and 63).
.TP
.B ^FDC
.TQ
.B ^FSDC
Inserted when the Delete or shift-Delete key is pressed.
.TP
.B ^FIC
.TQ
.B ^FSIC
Inserted when the Insert or shift-Insert key is pressed.
.TP
.B ^FPPAGE
.TQ
.B ^FNPAGE
Inserted when the Page-Up or Page-Down key is pressed.
.TP
.B ^FPRINT
.TQ
.B ^FSPRINT
Inserted when the Print or shift-Print key is pressed.
.TP
.B ^FA1
.TQ
.B ^FA3
.TQ
.B ^FB2
.TQ
.B ^FC1
.TQ
.B ^FC3
Inserted when the numeric key pad's upper left key (7),
upper right key (9), central key (5), lower left key (1),
or lower right key (3) is pressed and num-lock is disabled.
The key-pad's cursor keys are handled like the regular
cursor keys.
.TP
.B ^FEND
.TQ
.B ^FSEND
Inserted when the End or shift-End key is pressed.
.TP
.B ^FHELP
.TQ
.B ^FSHELP
Inserted when the Help or shift-Help key is pressed.
.
.LP
\(lq^F\(rq corresponds to CTRL+F in the above list but might
be typed with a caret due to string building characters
in long Q-Register names.
The names are all derived from key definitions of the curses
library - not all of them may be supported on any particular
user interface.
A set of useful Function Key Macros is provided in the
standard library
.BR fnkeys.tes .
It demonstrates how Function Key Macros may be used to define
alternate Escape keys (so the delay issue is not experienced),
or do insertion and command-line editing using function keys.
.
.
.SH COMMANDLINE EDITING
.
After all key presses have been translated to characters, \*(ST interprets
some of them in a special way to perform command line editing and a few
other actions that cannot depend on regular command execution.
These special characters are called Immediate Editing Commands and are
outlined in the following subsection.
All characters not handled as immediate editing commands are self-inserting,
i.e. they insert themselves into the command stream and may be processed
as regular commands or part of them.
.
.SS Immediate Editing Commands
.
In the following table, the Immediate Editing Commands are outlined.
Some of them are ubiquitous and are not used as regular commands
(because it would be hard to type them).
Some however are context-sensitive and are only available in or depend
on the current language context (at the end of the command line) that is
always known to \*(ST.
Because of this superior parsing and command line editing, editing is much
less dangerous and much more interactive than in classic TECO implementations.
Most of these commands are control codes, so their control code mnemonics
are given as well.
.
.TS
expand,allbox,tab(;);
LB LB LB LB LBW70
L L L L L.
Name;Code;Mnemonics;Context;Description
T{
Rub out
T};8;^H, Backspace;T{
Everywhere
T};T{
Rubs out the command line's last character.
T}
T{
Rub out word
T};23;^W;T{
String arguments
T};T{
Rub out last word according to Scintilla's definition of a word as set by
.BR SCI_SETWORDCHARS .
T}
\^;\^;\^;T{
Miscellaneous
T};T{
Rub out last command, i.e. rub out at least one character until a new
command could begin.
T}
T{
Rub out string
T};21;^U;T{
String arguments
T};T{
Rub out the entire current string argument.
T}
T{
Auto complete filename
T};20;^T;T{
String arguments
T};T{
Auto complete filename beginning after the last space
(or beginning of the string argument).
Fully completed filenames are terminated with a space.
T}
\^;9;^I, Tab;T{
File name arguments
T};T{
Auto complete filename beginning at the start of the argument.
Fully completed filenames terminate the string argument, except if the
\(lq{\(rq terminator is used.
T}
T{
Auto complete symbol
T};9;^I, Tab;T{
Scintilla symbol arguments
T};T{
In Scintilla Symbol arguments
.RB ( ES
commands), complete beginning with the symbol, terminating fully completed
symbols with a comma.
T}
T{
Terminate command line
T};27;^[, Escape;T{
Immediately after regular command ^[
T};T{
Two escape characters (string terminators do
.B not
count) terminate the command line, freeing all undo-stack related resources
and beginning a new command line.
T}
T{
Suspend process
T};26;^Z;T{
Everywhere
T};T{
Suspends process using
.BR SIGTSTP .
This is only an immediate editing command on platforms supporting this
signal (usually Unices).
The process can be suspended from another process as well, for instance by
pressing CTRL+Z in the console - this will also work if \*(ST is busy.
Therefore with GUI user interfaces (GTK+), this command will only work as
an immediate editing command in the GUI or as a signal dispatched from an
associated console or from another process.
T}
T{
Interrupt
T};3;^C;T{
\*(ST is busy
T};T{
.B "Not a real immediate editing command."
Will interrupt the current operation (character processing), yielding a
\*(ST error.
It depends on asynchronous delivery of the
.B SIGINT
signal and is useful to interrupt infinite loops.
Therefore with GUI user interfaces, the signal might have to be delivered
via the attached console or from another process.
If \*(ST is not busy,
.B ^C
is self-inserting and might be used as a regular command.
T}
.TE
.LP
The immediate editing commands that perform auto-completions do so in a
manner similar to POSIX shells.
Upon first invocation they try to fully or partially complete the file
name (or token).
If no completion can be performed, the invocation will display a list of
file names (or tokens) beginning with the token to complete in \*(ST's
popup area.
Note that no additional expansions are performed before attempting a
completion, so for instance \(lq~/foo\(rq will not complete a file in the
user's home directory (tilde is not part of the file name but
tilde-expansions are performed by the shell).
\*(ST does however perform completions after string building so that
\(lq^EQ{$HOME}/foo\(rq could be completed.
.
.
.SH ARITHMETICS AND EXPRESSIONS
.
\*(ST abstracts classic TECO's scheme of commands accepting at most
two prefix arguments and returning at most one value for the next command
into a much more elaborate scheme where commands accept
.I n
arguments and return
.I n
arguments.
This is done using an integer value stack.
The stack is used for passing arguments and for arithmetics.
\*(ST is thus a stack-based language.
Nevertheless unary prefix and binary infix operators including
operator precedence are supported.
Since the same stack is used for arithmetics and command arguments,
arithmetics and arbitrary code may be freely mixed like in
expression-centered languages and some classic TECOs.
In fact, in \*(ST digits are basically stack-operating commands.
For the sake of macro portability all integers are 64-bit signed
integers by default, regardless of the host's architecture.
The integer storage size may be changed at \*(ST build time however.
In this specific build, integers are @TECO_INTEGER@-bit.
.LP
Some commands expect or return booleans, most often signifying success
or failure.
Booleans are also integers but unlike in ordinary (sane) languages where
0 represents false and other values represent true, in \*(ST negative
values (smaller than zero) represent success and non-negative values
(greater than or equal to zero) represent failure.
Therefore \*(ST booleans may be negated by performing a binary
(bit by bit) negation, for instance using the
.B ^_
command.
.
.SS Operators
.
The following binary operators are supported with descending precedence:
.TP 2
.IB base " ^* " power
Raise \fIbase\fP to \fIpower\fP.
.TP
.IB a " * " b
Multiply \fIa\fP by \fIb\fP.
.TP
.IB a " / " b
Divide \fIa\fP by \fIb\fP.
It is an integer division, discarding any remainder.
.TP
.IB a " ^/ " b
Remainder of \fIa\fP divided by \fIb\fP.
It is actually a modulo operation conforming to the host
C compiler's modulo semantics.
.TP
.IB a " - " b
Subtract \fIb\fP from \fIa\fP.
.TP
.IB a " + " b
Add \fIb\fP to \fIa\fP.
.TP
.IB a " & " b
Binary AND.
.TP
.IB a " # " b
Binary OR.
.LP
Round brackets may be used for grouping expressions to override
operator precedence.
The opening bracket also represents an argument barrier, i.e. the first
command after the opening bracket does not see and cannot pop any
arguments from the stack.
This may be useful for mixing commands with arithmetic expressions.
An expression enclosed in brackets returns all of the values that have
been pushed onto the stack.
Another construct that may be used as an argument barrier to explicitly
separate arguments is the comma (\(lq,\(rq).
It is obligatory when trying to push a sequence of number constants
like in \(lq1,2\(rq but is optional in many contexts where it is
mandatory in classic TECO, like in \(lqQaQbD\(rq.
It is recommended to use it as much as possible in code where clarity
matters.
.LP
The only unary operator currently supported is
.RB \(lq - \(rq.
The minus-operator is special and has unique semantics.
It sets the so-called \fBprefix sign\fP, which is 1 by default, to -1.
This prefix sign may be evaluated by commands - most often it is the
default value if the argument stack is empty so that expressions like
\(lq-C\(rq are roughly equivalent to \(lq-1C\(rq.
However in front of opening brackets, like in \(lq-(1+2)\(rq, it is
roughly equivalent to \(lq-1*(1+2)\(rq so that values calculated in
brackets can be easily negated.
In front of digits the prefix sign specifies whether the digit is added
to or subtracted from the value on the stack so that in front of numbers
the result is a negative number.
.
.
.SH COMMAND SYNTAX
.
\*(ST's command syntax is quite flexible and diverse but may be
categorized into some base command types.
.TP
.B C
.TQ
.B EF
A simple command consists of one or more printable or control
characters whose case is insignificant.
It is executed as soon as it has been completely specified.
.TP
.BI Q q
A command identifier may be followed by a Q-Register specification
.IR q .
It specifies a Q-Register to be accessed by the command
(e.g. to query, set, increase).
.TP
.BI I string $
.TQ
.BI FS string1 $ string2 $
A command identifier (and Q-Register specification) may be followed
by one or more string arguments.
String arguments are terminated by Escape characters (27) by default,
but this may be changed using modifiers.
All string arguments may contain special string building characters,
for instance to embed other strings.
String building may be enabled or disabled by default for a command.
In interactive mode the command is often executed as soon as it has
been completely specified and updates to the string arguments are
handled interactively.
.
.SS Modifiers
.
A command's behaviour or syntax may be influenced by so-called
modifiers written in front of commands.
.LP
The colon (\fB:\fP) modifier usually prevents a command from
failing and instructs it to return a condition (success/failure)
boolean instead.
.EX
1000:C=
.EE
For instance if \(lq1000C\(rq would fail, \(lq1000:C\(rq will
return 0 instead.
.LP
The at (\fB@\fP) modifier allows the string termination character
to be changed for individual commands.
The alternative termination character must be specified just before
the first string argument.
For instance:
.EX
@FS/foo/bar/
.EE
Any character may be used as an alternative termination character.
There is one special case, though.
If specified as the opening curly brace (\fB{\fP), a string argument
will continue until the closing curly brace (\fB}\fP).
Curly braces must be balanced. After the closing curly brace the
termination character is reset to Escape and another one may be chosen.
This feature is especially useful for embedding TECO code in string
arguments, as in:
.EX
@^Um{ @FS{foo}/bar/ }
.EE
.
.SH Q-REGISTERS
.
\*(ST may store data in so-called Q-Registers.
Each Q-Register cell has an integer and string part (can store both
at the same time).
Strings are actually Scintilla documents.
Therefore Q-Register strings may be edited just like buffers
(see \fBEQ\fP command).
.LP
Q-Register cells have case-sensitive names and are stored in
Q-Register tables.
These tables are Red-Black trees internally.
Therefore the Q-Register namespace may be (ab)used as a complex
data structure e.g. for records, arrays and maps.
\*(ST provides a global Q-Register table (for global registers) and
arbitrarily many local Q-Register tables (for local registers).
Only one global and one local Q-Register table may be accessed at a time.
Macro invocations might create new local Q-Register tables for the
executed code.
\*(ST initializes the Q-Registers \(lqA\(rq to \(lqZ\(rq and
\(lq0\(rq to \(lq9\(rq in every Q-Register table.
Furthermore \*(ST defines and initializes the following special
global registers:
.TP 2
.BR _ " (underscore)"
Search string and search condition register.
.TP
.BR - " (minus)"
Replacement string register.
Its integer part is unused.
.TP
.BR * " (asterisk)"
Name and id of current buffer in ring.
.TP
.BR $ " (Escape)"
Command-line replacement register.
Its integer part is unused.
.LP
Some commands may create and initialize new registers if necessary,
while it is an error to access undefined registers for some other
commands.
The string part of a register is only ever initialized when accessed.
This is done opaquely, but allows you to use register tables as purely
numeric data structures without the overhead of empty Scintilla documents.
.
.SS Q-Register Specifications
.
Q-Registers may be referred to by commands using Q-Register specifications:
.TP 10
.I c
.TQ
.BI . c
Refers to a one character Q-Register.
The one character name is upper-cased.
If led by a dot, the name refers to a local Q-Register, otherwise to
a global one.
.TP
.BI # cc
.TQ
.BI .# cc
Refers to a two character global or local Q-Register whose name is
upper-cased.
.TP
.BI { name }
.TQ
.BI .{ name }
Refers to a Q-Register with an arbitrary
.IR name .
The name is \fBnot\fP upper-cased.
String building characters may be used so that Q-Register names may
be calculated.
Curly braces can be used in \fIname\fP as long as they are balanced.
The short single or double character specifications refer to registers
in the same namespace as long specifications.
For instance the specifications \(lqa\(rq and \(lqA\(rq are equivalent
to \(lq{A}\(rq.
.
.SS Push-Down List
.
Another data structure supported by \*(ST is the Q-Register push-down list.
Register contents may be pushed onto and popped from this list, for
instance to save and restore global registers modified by a macro.
The global Q-Register push-down list is handled using the
.BI [ q
and
.BI ] q
commands.
For instance to search in a macro without overwriting the contents
of the search register you could write:
.EX
[_ Sfoo$ ]_
.EE
.
.
.SH STRING-BUILDING CHARACTERS
.
As alluded to earlier, \*(ST supports special characters in command
string arguments and long Q-Register names.
These are called string-building characters.
String-building character processing may be enabled or disabled for
specific commands by default but is always enabled in long Q-Register
specifications.
String building and processing is performed in the following stages:
.RS
.IP 1. 4
Carets followed by characters are translated to control codes,
so \(lq^a\(rq and \(lq^A\(rq are equivalent to CTRL+A (code 1).
A double caret \(lq^^\(rq is translated to a single caret.
This caret-handling is independent of the caret-handling in command names.
.IP 2.
String building characters are processed, resulting in expansions
or translations of subsequent characters.
.IP 3.
Command-specific character processing.
Some commands, most notably the search and replace commands, might
interpret special characters and domain specific languages after
string building.
Care has been taken so that the string building and command-specific
languages do not clash (i.e. to minimize necessary escaping).
.RE
.LP
String building characters/expressions are always led by a control
character and their case is insignificant.
In the following list of supported expressions, the caret-notation
thus refers to the corresponding control code:
.TP
.BI ^Q c
.TQ
.BI ^R c
Escapes the character \fIc\fP.
The character is not handled as a string building character,
so for instance \(lq^Q^Q\(rq translates to \(lq^Q\(rq.
.TP
.B ^V^V
.TQ
.BI ^V c
Translates all following characters into lower case.
When \fB^V\fP is not followed by \fB^V\fP, a single character
\fIc\fP is lower-cased.
.TP
.B ^W^W
.TQ
.BI ^W c
Analogous to \fB^V\fP, but upper-cases characters.
.TP
.BI ^EQ q
Expands to the string contents of the Q-Register specified by \fIq\fP.
Currently, long Q-Register names have a separate independent level
of string building character processing, allowing you to build
Q-Register names whose content is then expanded.
.TP
.BI ^EU q
Expands to the character whose code is stored in the numeric part of
Q-Register \fIq\fP.
For instance if register \(lqA\(rq contains the code 66, \(lq^EUa\(rq
expands to the character \(lqB\(rq.
.TP
.BI ^E c
All remaining \fB^E\fP combinations are passed down unmodified.
Therefore \fB^E\fP pattern match characters do not have to be escaped.
.
.
.SH PATTERN MATCH CHARACTERS
.
\*(ST's search and replace commands allow the use of wildcards for
pattern matching.
These pattern match characters are all led by control characters and
their case is insignificant, so they usually require much less escaping
and thus less typing than regular expressions.
Nevertheless they describe a similar class of languages.
Pattern match character processing is performed after string building
by search and replace commands.
.LP
The following pattern match constructs are supported for matching
one character in different character classes
(caret-notations refer to the corresponding control characters):
.TP
.B ^S
.TQ
.B ^EB
Matches all non-alpha-numeric characters.
.TP
.B ^EA
Matches all alphabetic characters.
.TP
.B ^EC
Matches all symbol constituents.
These are currently defined as alpha-numeric characters,
dot (.) and dollar ($) signs.
.TP
.B ^ED
Matches all digits.
.TP
.BI ^EG q
Matches all characters in the string of the Q-Register specified by
\fIq\fP, i.e. one of the characters in the register.
.TP
.B ^EL
Matches all line break characters.
These are defined as carriage return, line-feed, vertical tab and
form feed.
.TP
.B ^ER
Matches all alpha-numeric characters.
.TP
.B ^EV
Matches all lower-case alphabetic characters.
.TP
.B ^EW
Matches all upper-case alphabetic characters.
.TP
.I c
All other (non-magic) characters represent a class that contains
only the character itself.
.LP
The following additional pattern match constructs are supported
(caret-notations refer to the corresponding control characters):
.TP
.B ^X
.TQ
.B ^EX
Matches any character.
.TP
.BI ^N class
Matches any character \fBnot\fP in \fIclass\fP.
All constructs listed above for matching classes may be used.
.TP
.BI ^EM pattern
Matches many occurrences (at least one) of \fIpattern\fP.
Any pattern match construct and non-magic character may be used.
.TP
.B ^ES
Matches any sequence of whitespace characters (at least one).
Whitespace characters are defined as line-break characters,
the space and horizontal tab characters.
.TP
.BI ^E[ pattern1 , pattern2 , ... ]
Matches one of a list of patterns.
Any pattern match construct may be used.
The pattern alternatives must be separated by commas.
.LP
All non-pattern-match-characters match themselves.
Note however that currently, all pattern matching is performed
.BR case-insensitive .
.
.
.SH FLOW CONTROL
.SS GOTOS AND LABELS
.SS LOOPS
.SS CONDITIONALS
.
.
.SH COMMAND REFERENCE
.
.COMMANDS
.
.
.SH COMPATIBILITY
.
\*(ST is not compatible with any particular TECO dialect but is quite
similar to
.BR VideoTECO .
Most VideoTECO and many Standard TECO programs should be portable
with little to no changes.
.
.
.SH SEE ALSO
.
.TP
Program invocation and options:
.BR sciteco (1)
.TP
Scintilla messages and other documentation:
.UR http://scintilla.org/ScintillaDoc.html
Scintilla
.UE
.
.
.SH AUTHOR
.
This manpage and the \*(ST program were written by
.MT robin.haberkorn@googlemail.com
Robin Haberkorn
.ME .
\# EOF