The new 3.0.0 release completes the conversion of the Java (and JavaScript) ANTLR4 runtime to TypeScript. It’s a significant improvement over the existing TS (and JS) runtimes, as it now includes all relevant parts of the Java runtime and has been optimized for performance. It’s now twice as fast for cold runs and 20% faster with a warm parser/lexer. See also the benchmark section below.

This makes it the fastest TypeScript (and JS) runtime currently available. The ANTLR4 JavaScript runtime is still slightly faster in short-running tests (e.g. 228ms vs 223ms for the query collection test), but system load and other factors have a much larger impact there than in tests that run for around 10 seconds.

So, what has changed in this new major release? In detail:

  • Everything that makes sense in the TypeScript environment has been converted (for example ListTokenSource, RuntimeMetaData and parse tree pattern matching, to name a few). That excludes things like the code point char streams, which directly deal with file and (Java) stream input – aspects that don’t matter in TypeScript. Consequently the class CharStreams has been removed. Use CharStream.fromString to feed your lexer.
  • The runtime has been profiled both with VS Code and Node.js directly, to identify bottlenecks. This showed excessive object creation and release, because of the way DFA generation is implemented. That led to big object initialization and garbage collection penalties. This has been improved and now most time is spent in generated code (which could be improved later).
  • A number of public members have been renamed, to match common TypeScript code style (e.g. INSTANCE to instance or _parseListener to parseListener).
  • Methods that can return null/undefined now explicitly say so (Java has no explicit type to mark possible null results and the ANTLR4 Java runtime is a bit lax in this regard). As a result, users of the new runtime have to do more checks for undefined results.
  • The lexer can now be configured at runtime to control what is cached in the DFA (in the Java runtime it’s always only ASCII chars) and which code points are acceptable (in the Java runtime always the entire Unicode range).
  • The JS doc in code has been reworked and all unsupported markup has been removed.
  • A lot of other code cleanup happened.
  • Test improvements:
    • Runtime tests have been ported to TypeScript, so they can be debugged and provide a much quicker turnaround.
    • Benchmarks now take separate measurements for lexer and parser runs, which shows much better how high the impact of predicates and actions in the lexer is (e.g. 2/3 of the time in the large inserts benchmark is used for lexing input).
  • Hash codes are now cached wherever possible.
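
To illustrate the hash code caching mentioned in the list above, here is a minimal sketch of the general technique. It is not taken from the runtime: the class StateKey, its fields and the computations counter are hypothetical names made up for this example. The idea is simply to compute the hash on first use and reuse it afterwards, which is only safe if the hashed fields never change.

```typescript
// Hypothetical example class, not part of the runtime's API.
class StateKey {
    // Counts how often a hash is actually computed (for demonstration only).
    public static computations = 0;

    public readonly state: number;
    public readonly alt: number;

    // Stays undefined until hashCode() is called for the first time.
    private cachedHashCode: number | undefined;

    public constructor(state: number, alt: number) {
        this.state = state;
        this.alt = alt;
    }

    public hashCode(): number {
        if (this.cachedHashCode === undefined) {
            ++StateKey.computations;

            // Simple 32-bit multiplicative hash over the immutable fields.
            let hash = 17;
            hash = (Math.imul(hash, 31) + this.state) | 0;
            hash = (Math.imul(hash, 31) + this.alt) | 0;
            this.cachedHashCode = hash;
        }

        return this.cachedHashCode;
    }
}

const key = new StateKey(3, 7);
key.hashCode(); // Computes and stores the hash.
key.hashCode(); // Returns the stored value without recomputing it.
```

Because both fields are readonly, the cached value can never become stale. For mutable objects the cache would have to be reset on every change, which is why caching is only possible in some places.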
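
The explicit undefined return types mentioned in the list above mean more checks in calling code. The following sketch shows two common ways to handle such results in TypeScript. It uses a made-up NamedNode interface and a hypothetical findChild helper instead of real runtime classes, so treat it as an illustration of the pattern only.

```typescript
// Hypothetical stand-in for a tree node, not a runtime type.
interface NamedNode {
    name: string;
    children: NamedNode[];
}

// The return type states explicitly that there may be no result.
function findChild(node: NamedNode, name: string): NamedNode | undefined {
    return node.children.find((child) => child.name === name);
}

// Variant 1: an explicit check, which narrows the type to NamedNode.
function describeChild(node: NamedNode, name: string): string {
    const child = findChild(node, name);
    if (child === undefined) {
        return `<no ${name}>`;
    }

    return `${child.name} with ${child.children.length} children`;
}

// Variant 2: optional chaining plus a default value.
function grandChildCount(node: NamedNode, name: string): number {
    return findChild(node, name)?.children.length ?? 0;
}
```

The explicit check is the better fit when the missing case needs its own handling, while optional chaining keeps simple lookups short.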
