|
detect EMCurses
* Emscripten can (theoretically) be used to build a host-only, platform-independent version
of SciTECO (running under node.js instead of the browser).
* I ported netbsd-curses to Emscripten for that purpose. Therefore, adaptations for running
in the browser are now restricted to EMCurses.
|
|
* a proper Arduino lexer supporting the special Arduino keywords/classes
could in principle be written, but for the time being Arduino files are treated
just like regular C++ sources
|
|
* Array allocations were not properly accounted for since the compiler
would call the replacement new(), which assumes that it would
always be paired with the replacement sized deletion.
This is not true for array new[] allocations, resulting in
a constant increase of memory_usage and unrecoverable situations.
This problem, however, could in principle be fixed by avoiding
memory counting for arrays or by falling back to malloc_usable_size().
* The bigger problem was that some STLs (new_allocator) are broken, calling the
non-sized delete for regular new() calls, which could in principle
be matched by the sized delete.
This is also the reason why I had to provide a non-sized
delete replacement, which in reality introduced memory leaks.
* Since adding checks for the broken compiler versions or a configure-time
check that tries to detect these broken systems seems tedious,
I simply removed that optimization.
* This means we always have to rely on malloc_usable_size() now
for non-SciTECO-object memory measurement.
* Perhaps in the future, there should be an option for allowing
portable measurement at the cost of memory usage, by prefixing
each memory chunk with the chunk size (see the sketch below).
Maintainers could then decide to optimize their build for "speed"
at the cost of memory overhead.
* Another solution to this never-ending odyssey might be to introduce
our own allocator, replacing malloc() and allowing our own
precise measurements.
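* A minimal sketch of the chunk-prefix option mentioned above (purely
illustrative, not current SciTECO code; only the scalar operators are
shown, the array forms forward to them by default):

  #include <cstdlib>
  #include <cstddef>
  #include <new>
  #include <atomic>

  static std::atomic<size_t> memory_usage{0};

  void *operator new(size_t size)
  {
      /* reserve a max_align_t-sized header so the payload stays aligned */
      char *p = static_cast<char *>(std::malloc(sizeof(std::max_align_t) + size));
      if (!p)
          throw std::bad_alloc();
      *reinterpret_cast<size_t *>(p) = size;
      memory_usage += size;
      return p + sizeof(std::max_align_t);
  }

  void operator delete(void *ptr) noexcept
  {
      if (!ptr)
          return;
      char *p = static_cast<char *>(ptr) - sizeof(std::max_align_t);
      memory_usage -= *reinterpret_cast<size_t *>(p);
      std::free(p);
  }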
|
|
* it turned out to be possible to provoke memory_usage
overflows or underruns, resulting in unrecoverable states
* a possible reason is that, at least with G++ 5.4.0,
the compiler would sometimes call the (default) non-sized
delete followed by our custom sized delete/deallocator.
* This was true even after compiling Scintilla with -fsized-deallocation.
* therefore we now provide an empty non-sized delete.
* memory_usage counting can now be debugged by uncommenting
DEBUG_MAGIC in memory.cpp. This uses a magic value to detect
instrumented allocations being mixed with non-instrumented
allocations (see the sketch below).
* simplified the global sized-deallocation functions
(they are identical to the Object-class allocators).
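* Roughly, the DEBUG_MAGIC check amounts to something like this
(a simplified sketch; the actual constants and chunk layout in
memory.cpp may differ):

  #include <cstdlib>
  #include <cstddef>
  #include <cassert>

  #define DEBUG_MAGIC ((size_t)0xC0FFEE42)

  static void *instrumented_alloc(size_t size)
  {
      /* a max_align_t-sized header keeps the payload aligned;
       * OOM handling is omitted */
      char *p = static_cast<char *>(std::malloc(sizeof(std::max_align_t) + size));
      *reinterpret_cast<size_t *>(p) = DEBUG_MAGIC;
      return p + sizeof(std::max_align_t);
  }

  static void instrumented_free(void *ptr)
  {
      char *p = static_cast<char *>(ptr) - sizeof(std::max_align_t);
      /* fires when a non-instrumented allocation ends up here */
      assert(*reinterpret_cast<size_t *>(p) == DEBUG_MAGIC);
      *reinterpret_cast<size_t *>(p) = 0;    /* also catches double frees */
      std::free(p);
  }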
|
|
The Automake files could be simplified by updating CXXFLAGS
in configure.ac instead.
|
|
* avoid warnings
* make sure Doxygen finds RBEntryOwnString
* it would be nice to strip the top-level `SciTECO` namespace,
but this is not supported without some macro magic that
omits the namespace declaration when processing with
Doxygen.
|
|
* automatic conversion with `doxygen -u`
|
|
* when enabled, it will automatically upper-case all
one- or two-letter commands (which are case-insensitive).
* also affects the up-caret control commands, so that when inserted
they look more like real control commands.
* specifically does not affect case-insensitive Q-Register specifications
* the result is command lines that are more readable and conform
to the coding style used in SciTECO's standard library.
This also makes it easier to reuse command lines.
* Consequently, string-building and pattern-match characters should
be case-folded as well, but they aren't currently since
State::process_edit_cmd() does not have sufficient insight
into the MicroStateMachines. Also, it could not be delegated
to the MicroStateMachines.
Perhaps they should be abandoned in favour of embeddable
regular state machines, or regular state machines with a stack
of return states?
|
|
* this resulted in assertions (crashes!) for harmless typos like "+23="
* a test case has been added
|
|
|
|
* test case: HECcat$ on a large buffer (>= 64kb)
truncates the buffer or repeats its beginning
* it turns out that the incremental writing to the process' stdin
was broken. We were always writing data from the beginning of the buffer,
which fails if the stdin watcher must be activated more than once.
* Also, EOLWriter::convert() can validly return 0, even if bytes have
been written to the data sink, so this value cannot be used to
check whether the process has closed its stdin.
We now make sure that the entire buffer range is written to stdin
(see the sketch below).
* Piping large buffers no longer removes the buffer gap.
This makes little difference when filtering via EC since
it will change the buffer gap anyway.
Can make a huge difference when not touching the buffer, though
(e.g. HEGAcat$).
* I did not add a test suite case since that requires
a very large test file and it cannot be easily generated automatically.
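* For illustration, the corrected incremental write boils down to tracking
an offset into the buffer range (names are made up; the real code goes
through the Glib watcher and EOLWriter machinery):

  #include <unistd.h>
  #include <cstddef>

  struct StdinWriter {
      const char *buffer;    /* entire range to pipe into the child */
      size_t length;
      size_t offset;         /* bytes already written so far */
      int fd;
  };

  /* called on every activation of the stdin watcher */
  static bool stdin_writable_cb(StdinWriter &w)
  {
      while (w.offset < w.length) {
          ssize_t n = write(w.fd, w.buffer + w.offset, w.length - w.offset);
          if (n <= 0)
              /* pipe full (or error): wait for the next activation
               * instead of restarting from the beginning */
              return false;
          w.offset += n;
      }
      return true;    /* entire range written: stdin can be closed */
  }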
|
|
* StateQueryQReg is now derived from StateExpectQReg,
whose semantics have been changed slightly.
* The alternative would have been another common base class for both
StateQueryQReg and StateExpectQReg.
|
|
support
* Since netbsd-curses can act as a drop-in replacement for ncurses,
SciTECO builds with --with-interface=ncurses as well.
However, it is unintuitive for users to build with ncurses support
when actually linking against netbsd-curses, so another option has been added.
* The UNIX/TTY-specific code (which works with both ncurses and netbsd-curses)
was selected when NCURSES was detected at build time.
This does not work for netbsd-curses, so we define a new symbol
NETBSD_CURSES. At build time, a CURSES_TTY macro may now be defined.
* This effectively fixes stdio handling in interactive mode, window titles
and the XTerm clipboard support for netbsd-curses.
Some minor features like the reduced ESCDELAY are still broken.
|
|
* this is actually another independent Curses implementation for
Unix platforms I wasn't aware of.
I tested against this portable fork of it:
https://github.com/sabotage-linux/netbsd-curses
* Only a minimal change to Scinterm was necessary to support it.
* netbsd-curses might be useful for NetBSD support (which is
otherwise untested) and when building small statically
linked self-contained binaries since netbsd-curses is much
smaller than ncurses.
|
|
|
|
been shown to be unacceptably broken, so the fallback implementation has been improved
* mallinfo() is not only broken on 64-bit systems but slows things
down linearly with the memory size of the process.
E.g. after 500000<%A>, SciTECO will be sluggish! Shutting down
afterwards can take minutes...
mallinfo() was thus finally discarded as a memory measurement
technique.
* Reading /proc/self/statm has also been evaluated and discarded
because doing so frequently is even slower.
* Instead, the fallback implementation has been drastically improved
(see the sketch below):
* If possible use C++14 global sized deallocators, allowing memory measurements
across the entire C++ code base with minimal runtime overhead.
Since we only depend on C++11, a lengthy Autoconf check had to be introduced.
* Use malloc_usable_size() with global non-sized deallocators to
measure the approx. memory usage of the entire process (at least
the allocations done via C++).
The cheaper C++11 sized deallocators implemented via SciTECO::Object still
have precedence, so this affects Scintilla code only.
* With both improvements, the test case
sciteco -e '<@EU[X^E\a]"^E\a"%a>'
is now handled sufficiently well on glibc and performance is much
better.
* The jemalloc-specific technique has been removed since it no longer
brings any benefits compared to the improved fallback technique.
Even the case of using malloc_usable_size() in strict C++ mode is
up to 3 times faster.
* The new fallback implementation might actually be good enough for
Windows as well if some MSVCRT-specific support is added, like
using _msize() instead of malloc_usable_size().
This must be tested and benchmarked, so we keep the Windows-specific
implementation for the time being.
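* Condensed sketch of the improved fallback (HAVE_SIZED_DEALLOC stands in
for whatever symbol the Autoconf check actually defines; the precedence of
the SciTECO::Object allocators is left out):

  #include <cstdlib>
  #include <new>
  #include <atomic>
  #include <malloc.h>    /* malloc_usable_size(), e.g. on glibc */

  static std::atomic<size_t> memory_usage{0};

  void *operator new(size_t size)
  {
      void *p = std::malloc(size);
      if (!p)
          throw std::bad_alloc();
      memory_usage += malloc_usable_size(p);
      return p;
  }

  void operator delete(void *p) noexcept
  {
      /* non-sized path, e.g. for Scintilla code:
       * the chunk size has to be queried from the allocator */
      if (!p)
          return;
      memory_usage -= malloc_usable_size(p);
      std::free(p);
  }

  #ifdef HAVE_SIZED_DEALLOC    /* C++14 sized deallocation available */
  void operator delete(void *p, size_t) noexcept
  {
      /* the compiler passes the object size here, which permits a
       * measurement scheme without malloc_usable_size(); this sketch
       * simply keeps the accounting consistent */
      ::operator delete(p);
  }
  #endif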
|
|
* in a flat list of undo tokens, we need to store
the program counter (i.e. command-line position)
that the undo token corresponds to.
Since in general there is more than one undo token per
input character, this stored PCs redundantly (illustrated below).
* For input characters with no undo tokens
(only applies to NOPs like space in the command line
macro), this needs one more pointer than before.
* In case of 1 undo token per input character,
the new implementation uses approx. the same memory.
* In the most common case of more than one undo token
per input character, this saves at least 4 bytes per
undo token.
* In large macros and long loops the effect is especially
pronounced. E.g. 500000<%A> will use 8MB less memory
with the new implementation.
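* Roughly, the change in data layout (type and variable names are invented
for illustration):

  #include <vector>

  /* old: a flat chain where every token redundantly stores the
   * command-line position (program counter) it belongs to */
  struct UndoTokenOld {
      UndoTokenOld *next;
      int pc;            /* repeated for every token */
      /* ...token payload... */
  };

  /* new: tokens are chained per input character, so the position is
   * implied by the index into the per-character list of heads */
  struct UndoTokenNew {
      UndoTokenNew *next;
      /* ...token payload... */
  };
  std::vector<UndoTokenNew *> tokens_per_char;    /* one head per input character */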
|
|
editing key
* StateEscape should return the same fnmacro mask as StateStart
* When rubbing out a command, we should stop at StateEscape as well.
Therefore we reintroduced States::is_start().
RTTI is still not used.
|
|
as State::process_edit_cmd() virtual methods
* Cmdline::process_edit_cmd() was much too long and deeply nested.
It used RTTI excessively to implement the state-specific behaviour.
It became apparent that the behaviour is largely state-specific and could be
modelled much more elegantly as virtual methods of State.
* Basically, a state can now implement a method to customize its
command-line behaviour.
If the state does not define custom behaviour for
the key pressed, it can "chain" to the parent class' process_edit_cmd()
(see the sketch below).
This can be optimized into tail calls by the compiler.
* The State::process_edit_cmd() implementations are still isolated in
cmdline.cpp. This is not strictly necessary but keeps the
already large compilation units like parser.cpp from growing further.
Also, the edit command processing has little to do with the rest of
a state's functionality and is only used in interactive mode.
* As a result, we now have many small functions which are much easier to
maintain.
This makes adding new and more complex context-sensitive editing behaviour
easier.
* State-specific function key masking has been refactored by introducing
State::get_fnmacro_mask().
* This allowed us to remove the States::is_*() functions which have
always been a crutch to support context-sensitive key handling.
* RTTI is almost completely eradicated, except for exception handling
and StdError(). Both remaining cases can probably be avoided in the
future, allowing us to compile smaller binaries.
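* Schematically (State and process_edit_cmd() are from this change;
StateExpectFile and the key handling shown are only illustrative):

  class State {
  public:
      virtual ~State() {}

      /* default behaviour for keys the concrete state does not handle */
      virtual void process_edit_cmd(char key)
      {
          /* e.g. insert the key into the command line */
      }
  };

  class StateExpectFile : public State {    /* hypothetical subclass */
  public:
      void process_edit_cmd(char key) override
      {
          switch (key) {
          case '\t':
              /* state-specific behaviour, e.g. filename completion */
              break;
          default:
              /* "chain" to the parent class implementation;
               * a plain call the compiler can turn into a tail call */
              State::process_edit_cmd(key);
          }
      }
  };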
|
|
and added a FreeBSD/jemalloc-specific implementation
* largely reverts 39cfc573, but leaves in minor and documentation
changes.
* further experimentation with memory limiting via malloc() wrapping
has shown additional problems, like dlsym() itself calling malloc functions,
further restricting the implementation to glibc-specific means.
This means there was no implementation for FreeBSD and checks
would have to rely on undocumented internal implementation details
of different libcs, which is not a good thing.
* Other problems, like having to wrap calloc(), guarding against
underruns and multi-thread safety, have been identified
but could be worked around.
* A technique calculating the memory usage as sbrk(0) - &end
has been shown to be effective enough, at least on glibc.
However, even on glibc it has shortcomings, since malloc() will
sometimes use mmap() for allocations and the technique
relies on implementation details of the libc.
Furthermore, another malloc_trim(0) had to be added to the error
recovery in interactive mode, since glibc does not adjust the program break
automatically (to avoid syscalls, I presume).
* On FreeBSD/jemalloc, the sbrk(0) method totally fails because jemalloc
exclusively allocates via mmap() -> that solution was discarded as well.
* Since all evaluated techniques turn out to be highly platform
specific, I reverted to the simple and stable platform-specific
mallinfo() API on Linux.
* On FreeBSD/jemalloc, it's possible to use mallctl("stats.allocated")
for the same purpose - so it works there now, too (see the sketch below).
It's slower than the other techniques, though.
* A lengthy discussion has been added to memory.cpp, so that we
do not repeat the previous mistakes.
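* For reference, querying jemalloc looks roughly like this (FreeBSD exposes
mallctl() via <malloc_np.h>; error handling is reduced to a minimum and the
actual SciTECO code may differ):

  #include <stddef.h>
  #include <stdint.h>
  #include <malloc_np.h>    /* mallctl() on FreeBSD */

  static size_t get_allocated_bytes(void)
  {
      uint64_t epoch = 1;
      size_t allocated, len = sizeof(allocated);

      /* the statistics are cached; bumping the epoch refreshes them */
      mallctl("epoch", NULL, NULL, &epoch, sizeof(epoch));
      if (mallctl("stats.allocated", &allocated, &len, NULL, 0) != 0)
          return 0;
      return allocated;
  }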
|
|
* shouldn't make much of a difference, since we're in deep trouble
when they return NULL, but the wrappers should be transparent
instead of crashing in malloc_usable_size().
|
|
|
|
portable and faster hack
* Works by "hooking" into malloc() and friends and counting the
usable heap object sizes with malloc_usable_size().
Thus, it has no memory overhead.
* Will work at least on Linux and (Free)BSD.
Other UNIXoid systems may work as well - this is tested by ./configure.
* Usually faster than even the fallback implementation since the
memory limit is hit earlier.
* A similar approach could be tried on Windows (TODO).
* A proper memory limit that counts all malloc()s in the system can make
a huge difference, as this test case shows:
sciteco -e '<@EU[X^E\a]"^E\a"%a>'
Without it, SciTECO will allocate gigabytes before hitting the 500MB memory limit...
* Fixed the UNIX-function checks on BSDs.
|
|
|
|
|
|
* especially to improve building on FreeBSD 11
* We need GNU Make, not least because Scintilla/Scinterm
needs it. We now document that dependency and added
an Autoconf check from the autoconf-archive.
We make sure that the build process is invoked with GNU Make
by generating only GNUmakefiles.
The Makefile.am files have not been renamed, so this
change can be rolled back easily.
* Some GNU-Make-specific autoreconf warnings have been resolved
nevertheless, but not all of them, as this would have been
inelegant and we need GNU Make anyway.
* Declare ACLOCAL_AMFLAGS to appease autoreconf
* Added an explicit check for C++11 from the autoconf-archive.
In general we should support building with every C++11 compiler
that is sufficiently GNU-like.
* Do not use `sed` for in-place editing, as different sed implementations
have mutually incompatible syntax for this.
Instead of declaring and checking a dependency on GNU sed,
we simply use SciTECO for the editing task.
This improves code portability on BSDs.
* Similarly, BSD/POSIX `cmp` is supported now.
This fixes the test suite on BSD without declaring a
dependency on the GNU coreutils.
* Simplified sciteco-wrapper generation.
|
|
* fixes manpages, Groff warnings and building
womanpages for older Groff versions.
Groff v1.19 is in use e.g. on FreeBSD 11.
* tbl v1.19 has different column specifiers than
later versions. `X` cannot be used for expanded
columns in these Groff versions.
|
|
|
|
|
|
* equivalent to `xF` and currently ignored by grosciteco.
* older versions of Groff use `F` instead of `xF`, even though it
is not documented. Therefore this fixes building on systems
with slightly outdated versions of Groff like Haiku and OS X.
|
|
* we had an undocumented dependency on Groff v1.20, since
this version introduced the .device request.
* this broke the womanpage generation e.g. on OS X 10.6.
Even newer versions of OS X only appear to ship Groff v1.19.
* Since it makes sense to support the Groff shipping with OS X,
we work around this issue by reimplementing .device on platforms
that lack it.
* The fallback implementation still has subtle differences from
the real .device, but they are acceptable for the time being.
|
|
|
|
* fixes formatting of sciteco.7.html
* it is still not ideal since tables with rule="none" can only get
row borders by adding them to the table cells.
Perhaps the entire border handling should be done with CSS.
|
|
* some classic TECOs have this
* just like ^[, dollar works as a command only, not as a string terminator
* it improves the readability of macros using printable characters only
* it closes a gap in the language by allowing $$ (double-dollar) and
^[$ as printable ways to write the return-from-macro command.
^[^[ was not and is not possible.
* since command line termination is a regular interactive return-command
in SciTECO, double-dollar will also terminate the command line now.
This will be allowed unless it turns out to be a cause of trouble.
* The handling of unterminated commands has been cleaned up by
introducing State::end_of_macro() (see the sketch below).
Most commands (and thus states) except the start state cannot be
valid at the end of a macro since this indicates an unterminated/incomplete
command.
All lookahead commands (currently only ^[) end implicitly
at the end of a macro and so need a way to perform their action.
The virtual method allows these actions to be defined with the rest
of the state's implementation.
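* Schematically, the new hook looks like this (State, StateEscape and
end_of_macro() are from this change; the bodies are illustrative only):

  #include <stdexcept>

  class State {
  public:
      virtual ~State() {}

      /* called when a macro ends while this state is still active;
       * for most states that means an unterminated/incomplete command */
      virtual void end_of_macro()
      {
          throw std::runtime_error("unterminated command");
      }
  };

  class StateEscape : public State {
  public:
      void end_of_macro() override
      {
          /* the lookahead ^[ simply takes effect at the end of the macro */
      }
  };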
|
|
* The $$ would leave the current state pointing to the "escape" state,
which was manually fixed up in macro return handling but not in command-line
return (i.e. termination) handling.
Therefore the initial state at the start of the command line after $$
was the "escape" state.
The rubout-last-command immediate editing command would consequently
end up in an infinite loop trying to reach the start state.
* This has been fixed by setting the state before throwing Return().
Some additional paranoia assertions have been added to prevent this
bug in the future.
|
|
* the ESSTYLECLEARALL$$ was resetting STYLE_CALLTIP
(and other styles), resulting in wrongly-styled popups.
* We now only change STYLE_DEFAULT for Gtk UIs and
use `color.init` to reinitialize the other styles
(not very elegant).
|
|
* this has been broken since cb5e08b40d
|
|
* a table reference was stored in the UndoToken.
* since there are only two tables at a given moment, this can
be avoided by having two different undo tokens, one for globals
and one for locals.
* Practically, undo tokens for locals are only created for the
top-level local Q-Reg table, since macro calls with locals
set must_undo to false, as the local table is destroyed
on macro return.
|
|
* shouldn't really be an issue but since we already have
CTL_KEY_ESC_STR as a character literal, we may as well use it.
|
|
* on MSVCRT/MinGW, space allocated with alloca()/g_newa() was apparently
freed once the first exception was caught.
This prevented the proper destruction of local Q-Reg tables and
broke the Windows port.
* Since all alternatives to alloca() like VLAs are not practical,
the default Q-Register initialization has been moved out of the
QRegisterTable constructor into QRegisterTable::insert_defaults().
* The remaining QRegisterTable initialization and destruction is
very cheap, so we simply reserve an empty QRegisterTable for
local registers on every Execute::macro() call.
The default registers are only initialized when required, though.
* All of this has to change anyway once we replace the
C++ call-stack approach to macro calls with our own macro
call frame memory management.
|
|
* we can use root() instead of min() which is faster
|
|
performance issues with memory measurements
* Fixed build problems on Windows
* g_slice on Windows has been shown to be of little use either,
and it does not work well with the GetProcessMemoryInfo()
measurements.
Also, it brings the same problem as on Glibc: not even command-line
termination returns the memory to the OS.
Therefore, we don't use g_slice at all and documented this in a comment.
* The custom Linux and Windows memory measurement approaches
have been shown to be inefficient.
As a workaround, scripts disable memory limiting.
* A better approach -- but it will only work on Glibc -- might
be to hook into malloc(), realloc() and free() globally
and use the malloc_usable_size() of a heap object for
memory measurements. This will be relatively precise and cheap.
* We still need the "Object" base class in order to measure
memory usage as a fallback approach.
|
|
* a simple cast was missing due to C++ aliasing rules
|
|
|
|
* we were basing the glib allocators on throwing std::bad_alloc just like
the C++ operators. However, this was always unsafe since we were throwing
exceptions across plain-C frames (Glib).
Also, the memory vtable has been deprecated in Glib, resulting in
ugly warnings.
* Instead, we now let the C++ new/delete operators work like Glib
by basing them on g_malloc/g_slice.
This means they will assert and the application will terminate
abnormally in case of OOM. OOMs cannot be handled properly anyway, so it is
more important to have a good memory limiting mechanism.
* Memory limiting has been completely revised.
Instead of approximating undo stack sizes using virtual methods
(which is imprecise and comes with a performance penalty),
we now use a common base class SciTECO::Object to count the memory
required by all objects allocated within SciTECO (see the sketch below).
This is less precise than using global replacement new/deletes,
which would allow us to control allocations in all C++ code including
Scintilla, but they are only supported as of C++14 (GCC 5) and adding compile-time
checks would be cumbersome.
In any case, we're missing Glib allocations (esp. strings).
* As a platform-specific extension, on Linux/glibc we use mallinfo()
to count the exact memory usage of the process.
On Windows, we use GetProcessMemoryInfo() -- the latter implementation
is currently UNTESTED.
* We use g_malloc() for new/delete operators when there is
malloc_trim() since g_slice does not free heap chunks properly
(probably does its own mmap()ing), rendering malloc_trim() ineffective.
We've also benchmarked g_slice on Linux/glibc (malloc_trim() shouldn't
be available elsewhere) and found that it brings no significant
performance benefit.
On all other platforms, we use g_slice since it is assumed
that it at least does not hurt.
The new g_slice-based allocators should be tested on MSVCRT
since I assume that they bring a significant performance benefit
on Windows.
* Memory limiting now works in batch mode as well and is still
enabled by default.
* The old UndoTokenWithSize CRTP hack could be removed.
UndoStack operations should be a bit faster now.
But on the other hand, there will be an overhead due to repeated
memory limit checking on every processed character.
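* The counting base class boils down to something like this (simplified
sketch; the g_malloc/g_slice selection and the actual limit checks are
left out):

  #include <glib.h>
  #include <cstddef>

  namespace SciTECO {

  static gsize memory_usage = 0;    /* checked against the configured limit elsewhere */

  class Object {
  public:
      static void *operator new(size_t size)
      {
          memory_usage += size;
          return g_malloc(size);    /* aborts instead of throwing on OOM */
      }
      static void operator delete(void *ptr, size_t size) noexcept
      {
          memory_usage -= size;
          g_free(ptr);
      }

      virtual ~Object() {}
  };

  } /* namespace SciTECO */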
|
|
* test case: rubout 1U[foo]
* this probably also leaked memory if it didn't crash
* a cast from RBTree::remove() was missing.
This cast is necessary since QRegister uses multiple inheritance:
the offset of RBEntryString might not be 0 in QRegister
(see the example below).
Also, since the base class is no longer virtual, a cast to the
virtual QRegister class is necessary to ensure that subclass
destructors get called.
This might not have caused problems before since RBEntry was virtual
or the compiler just happened to reorder the instance structures.
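* A toy example of the pitfall (the real class hierarchy is more involved;
the base class names here are invented):

  struct RBEntryString {
      const char *key;
  };

  struct OtherBase {
      long payload;
  };

  /* RBEntryString is not the first base, so its sub-object lives at a
   * non-zero offset within QRegister */
  struct QRegister : OtherBase, RBEntryString {
      virtual ~QRegister() {}
  };

  void remove_entry(RBEntryString *entry)
  {
      /* deleting `entry` directly would free the wrong address and skip
       * ~QRegister(); the downcast adjusts the pointer and dispatches to
       * the (virtual) subclass destructors */
      delete static_cast<QRegister *>(entry);
  }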
|
|
implementation classes
* whenever the implementation class was not exactly RBEntryType,
it had to have a virtual destructor since RBTree cared about
cleanup and had to delete its members.
* Since it does not allocate them, it is consistent to remove RBTree::clear().
The destructor now only checks that subclasses have cleaned up.
Implementing cleanup in the subclasses is trivial.
* Consequently, RBEntryString no longer has to be virtual.
HelpIndex and GotoTables are completely non-virtual now
which saves memory (and a bit of cleanup speed).
For QRegister, not much changes, though.
|
|
|
|
* From what the documentation says, a dot may only be used
once to introduce a local Q-Register specification.
The parser was accepting arbitrarily many dots though.
* Now, ".." will refer to the local register ".".
|
|
* Using a common implementation in RBTreeString::auto_complete()
(see the sketch below).
This is very efficient even for huge tables since only
an O(log(n)) lookup is required and then all entries with a matching
prefix are iterated. Worst-case complexity is still O(n), since all
entries may be legitimate completions.
If necessary, the number of matching entries could be restricted, though.
* Auto-completes short and long Q-Reg names.
Short names are "case-insensitive" (since they are upper-cased).
Long specs are terminated with a closing bracket.
* Long spec completions may have problems with names containing
funny characters since they may be misinterpreted as string building
characters or contain braces. All the auto-completions suffered from
this problem already (see TODO).
* This greatly simplifies investigating the Q-Register name spaces
interactively and e.g. calling macros with long names, inserting
environment registers etc.
* Goto labels are terminated with commas since they may be part
of a computed goto.
* Help topics are matched case-insensitively (just like the topic
lookup itself) and are terminated with the escape character.
This greatly simplifies navigating womanpages and looking up
topics with long names.
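* The lookup strategy, transplanted onto std::map for brevity (illustrative
only; the real RBTreeString::auto_complete() operates on SciTECO's own
red-black tree):

  #include <map>
  #include <string>
  #include <vector>

  std::vector<std::string>
  complete_prefix(const std::map<std::string, int> &table, const std::string &prefix)
  {
      std::vector<std::string> matches;

      /* one O(log n) descent to the first possible match... */
      for (auto it = table.lower_bound(prefix);
           /* ...then iterate while the prefix still matches */
           it != table.end() && it->first.compare(0, prefix.size(), prefix) == 0;
           ++it)
          matches.push_back(it->first);

      return matches;
  }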
|