Detailed notes on the njs JavaScript engine internals gathered during
the optional chaining (?.) implementation. Intended as a reference
for future feature work.
The parser is a stack-based state machine, not a recursive-descent parser. Each state is a function with the signature:

    static njs_int_t
    njs_parser_<state>(njs_parser_t *parser, njs_lexer_token_t *token,
        njs_queue_link_t *current);

State transitions use three primitives:

| Primitive | Effect |
|---|---|
| `njs_parser_next(parser, state)` | Set the immediate next state (replaces current) |
| `njs_parser_after(parser, current, target, deferred, state)` | Push a continuation onto the stack (runs after the immediate state completes) |
| `njs_parser_stack_pop(parser)` | Pop to the next continuation on the stack |

njs_parser_after inserts before the current link in the queue.
When multiple njs_parser_after calls are made in sequence, the
last push runs first after the immediate state completes.

Example (pushing A then B):

    njs_parser_after(parser, current, ..., stateA);  // push A
    njs_parser_after(parser, current, ..., stateB);  // push B before A

Execution order: immediate_state -> B -> A -> (rest of stack)

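
The ordering rule can be modeled in a few lines of standalone C. This is a minimal sketch, not njs code: `demo_parser_t`, `demo_after`, `demo_stack_pop`, and `demo_run` are hypothetical names, and the real parser keeps continuations in a queue with insert-before semantics rather than in a plain array. Under that simplification, the most recently deferred state is the first one resumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_STACK_MAX  8

/* Hypothetical stand-in for the parser: a LIFO of pending state names. */
typedef struct {
    const char  *pending[DEMO_STACK_MAX];
    size_t      count;
} demo_parser_t;

/* Models njs_parser_after(): defer a state until the immediate one is done. */
static void
demo_after(demo_parser_t *p, const char *state)
{
    assert(p->count < DEMO_STACK_MAX);
    p->pending[p->count++] = state;
}

/* Models njs_parser_stack_pop(): take the most recently deferred state. */
static const char *
demo_stack_pop(demo_parser_t *p)
{
    return (p->count > 0) ? p->pending[--p->count] : NULL;
}

/*
 * Runs "immediate", then drains the continuations, appending each state
 * name to out (space separated). Returns out for convenience.
 */
static const char *
demo_run(demo_parser_t *p, const char *immediate, char *out, size_t size)
{
    const char  *state;

    out[0] = '\0';

    for (state = immediate; state != NULL; state = demo_stack_pop(p)) {
        if (out[0] != '\0') {
            strncat(out, " ", size - strlen(out) - 1);
        }

        strncat(out, state, size - strlen(out) - 1);
    }

    return out;
}
```

Deferring `A` and then `B` before running an immediate state visits the states in the order immediate, B, A, which matches the execution order stated above.
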

- `parser->node`: the current AST node being built. States read and write this to pass results between states.
- `parser->target`: set by the `target` parameter of `njs_parser_after`. Used to pass a "parent" node into a continuation state (e.g., the function node for argument parsing, or a wrapper node for an optional chain).

Token access goes through three lexer calls:

    njs_lexer_consume_token(parser->lexer, N);       // consume N tokens
    njs_lexer_token(parser->lexer, 0);               // get current token (after consume)
    njs_lexer_peek_token(parser->lexer, token, 0);   // peek next token without consuming

The `token` parameter passed to each state function is the current
token. After consuming tokens, call `njs_lexer_token()` to get the new
current token.

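
The reason for re-fetching after a consume is easiest to see on a toy cursor. The sketch below is an assumption-laden model, not the njs lexer: `demo_lexer_t`, `demo_token`, `demo_peek`, and `demo_consume` are hypothetical, tokens are plain strings, and a peek offset of 0 is taken to mean "the token right after the current one".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token cursor; real njs tokens carry type, text and line. */
typedef struct {
    const char  **tokens;
    size_t      count;
    size_t      pos;
} demo_lexer_t;

/* Current token, or NULL at end of input. */
static const char *
demo_token(const demo_lexer_t *lx)
{
    return (lx->pos < lx->count) ? lx->tokens[lx->pos] : NULL;
}

/* Token `offset` places after the current one; does not move the cursor. */
static const char *
demo_peek(const demo_lexer_t *lx, size_t offset)
{
    size_t  at = lx->pos + 1 + offset;

    return (at < lx->count) ? lx->tokens[at] : NULL;
}

/* Advances the cursor by n tokens, stopping at end of input. */
static void
demo_consume(demo_lexer_t *lx, size_t n)
{
    assert(lx->pos <= lx->count);

    lx->pos = (n < lx->count - lx->pos) ? lx->pos + n : lx->count;
}
```

A pointer obtained before `demo_consume()` still refers to the old token afterwards, which is the same situation a state function is in with its `token` argument once it has consumed input.
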

    njs_parser_node_t *node = njs_parser_node_new(parser, NJS_TOKEN_TYPE);
    node->token_line = token->line;
    node->u.operation = NJS_VMCODE_XXX;   // for operation nodes
    node->left = ...;
    node->right = ...;

String/identifier nodes:

    njs_parser_node_t *str = njs_parser_node_string(parser->vm, token, parser);

`njs_parser_create_call(parser, node, ctor)` inspects `node->token_type`:

| Node Type | Result |
|---|---|
| `NJS_TOKEN_NAME` | Reuses the node, changes type to `FUNCTION_CALL` |
| `NJS_TOKEN_PROPERTY` | Creates `METHOD_CALL` wrapping the property |
| `NJS_TOKEN_OPTIONAL_CHAIN` (with `PROPERTY` right) | Unwraps to `METHOD_CALL` with plain `PROPERTY` |
| Everything else | Creates `FUNCTION_CALL` wrapping the node |

The distinction between `METHOD_CALL` and `FUNCTION_CALL` is critical for
`this` binding: `METHOD_CALL` uses `METHOD_FRAME` (which sets `this` to
the object), while `FUNCTION_CALL` uses `FUNCTION_FRAME` (which sets
`this` to `undefined`).

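
The table can be read as a small decision function. The following is a sketch under stated assumptions rather than the actual `njs_parser_create_call` logic: the `demo_*` enums and struct are hypothetical simplifications, and node rewriting (reuse, wrapping, unwrapping) is reduced to choosing the resulting call kind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified mirrors of the token types named above. */
typedef enum {
    DEMO_NAME,
    DEMO_PROPERTY,
    DEMO_OPTIONAL_CHAIN,
    DEMO_OTHER
} demo_token_t;

typedef enum { DEMO_FUNCTION_CALL, DEMO_METHOD_CALL } demo_call_t;

typedef struct demo_node_s  demo_node_t;

struct demo_node_s {
    demo_token_t  type;
    demo_node_t   *right;   /* chain body for OPTIONAL_CHAIN nodes */
};

/* Picks the call node kind from the callee's node type, as in the table. */
static demo_call_t
demo_call_kind(const demo_node_t *callee)
{
    assert(callee != NULL);

    switch (callee->type) {

    case DEMO_PROPERTY:
        /* obj.m(): METHOD_FRAME later binds this to obj. */
        return DEMO_METHOD_CALL;

    case DEMO_OPTIONAL_CHAIN:
        /* Only a chain whose body is a PROPERTY unwraps to a method call. */
        if (callee->right != NULL && callee->right->type == DEMO_PROPERTY) {
            return DEMO_METHOD_CALL;
        }

        return DEMO_FUNCTION_CALL;

    case DEMO_NAME:
    default:
        /* f() or (expr)(): FUNCTION_FRAME leaves this undefined. */
        return DEMO_FUNCTION_CALL;
    }
}
```

Getting this choice wrong would not show up as a parse failure; in this model it would only change which frame opcode is emitted later, and therefore what `this` is inside the callee.
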
In `njs_parser_unary_expression_after`, when `type == NJS_TOKEN_DELETE`:

- `NJS_TOKEN_PROPERTY` -> converted to `NJS_TOKEN_PROPERTY_DELETE` in-place, returned without a DELETE wrapper.
- `NJS_TOKEN_OPTIONAL_CHAIN` -> inner PROPERTY converted to PROPERTY_DELETE, then falls through to wrap in a DELETE unary node (so the short-circuit `undefined` is coerced to `true` by `NJS_VMCODE_DELETE`).
- `NJS_TOKEN_NAME` -> syntax error.
- Default -> wrapped in DELETE unary node (evaluates expression, returns `true`).

The generator is also a stack-based state machine, mirroring the parser's design. Each state is:

    static njs_int_t
    njs_generate_<state>(njs_vm_t *vm, njs_generator_t *generator,
        njs_parser_node_t *node);

State transitions:

| Primitive | Effect |
|---|---|
| `njs_generator_next(generator, state, node)` | Set immediate next state and node |
| `njs_generator_after(vm, generator, link, node, state, ctx, ctx_size)` | Push continuation |
| `njs_generator_stack_pop(vm, generator, ctx)` | Pop to next continuation |

Same as the parser: `njs_generator_after` inserts before the given
link. The last push runs first.

`njs_generator_after` accepts a `ctx` pointer and `ctx_size`. If
`ctx_size > 0`, the context is copied to a new allocation. The
continuation state accesses it via `generator->context`. This is used
to pass data like jump offsets between phases.

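
The copy-on-push contract matters because the value being passed is usually a local variable of the state that pushes the continuation. Below is a minimal sketch of that contract using `malloc` in place of the njs memory pool. `demo_continuation_t`, `demo_generator_after`, and `demo_take_jump_offset` are hypothetical names, and storing the caller's pointer unchanged when the size is 0 is an assumption, since the notes only describe the `ctx_size > 0` case.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical continuation record; the real generator keeps these queued. */
typedef struct {
    void    *context;
    size_t  size;
} demo_continuation_t;

/*
 * Models the ctx/ctx_size contract: with size > 0 the bytes are copied, so
 * the caller's (often stack-local) value may change or go out of scope.
 * With size == 0 the pointer is kept as-is (an assumption of this sketch).
 * Returns 0 on success and -1 on allocation failure.
 */
static int
demo_generator_after(demo_continuation_t *c, void *ctx, size_t size)
{
    c->context = ctx;
    c->size = size;

    if (size > 0) {
        c->context = malloc(size);
        if (c->context == NULL) {
            return -1;
        }

        memcpy(c->context, ctx, size);
    }

    return 0;
}

/* Reads a saved jump offset back in the continuation, then frees the copy. */
static long
demo_take_jump_offset(demo_continuation_t *c)
{
    long  offset;

    assert(c->size == sizeof(long));

    memcpy(&offset, c->context, sizeof(long));
    free(c->context);
    c->context = NULL;

    return offset;
}
```

In this model the pushing state can overwrite or discard its own offset variable right after the push; the continuation still reads the value that was current at push time.
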
njs_generate() in njs_generator.c has a large switch on
node->token_type that dispatches to specialized generators:

    case NJS_TOKEN_PROPERTY:
    case NJS_TOKEN_PROPERTY_DELETE:
        return njs_generate_3addr_operation(vm, generator, node, 0);

    case NJS_TOKEN_METHOD_CALL:
        return njs_generate_method_call(vm, generator, node);

    case NJS_TOKEN_FUNCTION_CALL:
        return njs_generate_function_call(vm, generator, node);

3-address operation (`njs_generate_3addr_operation`):

- Generate left child
- Generate right child
- Emit 3-addr instruction: `dst = op(src1, src2)`

Used for `PROPERTY_GET`, `PROPERTY_ATOM_GET`, `PROPERTY_DELETE`, arithmetic,
comparisons, etc. The `swap` parameter controls operand order.

Test-jump pattern (used by `??`, `||`, `&&`, `?.`):

- Phase 1: Generate the left/base expression
- Phase 2: Emit test-jump opcode (conditional jump), save jump offset
- Generate the right/body expression
- Phase 3: MOVE result if needed, patch the jump offset

The emit and patch steps look like this in the generator:

    // Phase 2: emit test-jump
    njs_generate_code(generator, njs_vmcode_test_jump_t, test_jump,
                      NJS_VMCODE_XXX, node);
    jump_offset = njs_code_offset(generator, test_jump);
    test_jump->value = node->left->index;
    test_jump->retval = node->index;

    // Phase 3: patch jump
    njs_code_set_jump_offset(generator, njs_vmcode_test_jump_t,
                             *((njs_jump_off_t *) generator->context));

Method call (`njs_generate_method_call`):

- Generate `prop->left` (object expression)
- Generate `prop->right` (method name expression)
- Emit `METHOD_FRAME` with `object = prop->left->index`, `method = prop->right->index`
- Generate arguments
- Emit `FUNCTION_CALL`

Function call (`njs_generate_function_call`):

- Generate function expression
- Emit `FUNCTION_FRAME` (`this = undefined`)
- Generate arguments
- Emit `FUNCTION_CALL`

Emitting an instruction:

    njs_generate_code(generator, type_t, var, NJS_VMCODE_XXX, node);

This macro allocates space in the code buffer, sets the opcode, and
assigns the pointer to `var`. The `node` is used for debug info (line
numbers).


    // Allocate a temporary index for a node
    node->index = njs_generate_node_temp_index_get(vm, generator, node);

    // Get destination index (reuses dest hint or allocates temp)
    node->index = njs_generate_dest_index(vm, generator, node);

    // Release children's temp indexes
    njs_generate_children_indexes_release(vm, generator, node);

    // Save current code position
    njs_jump_off_t offset = njs_code_offset(generator, instruction);

    // Later, patch the jump to point to current position
    njs_code_set_jump_offset(generator, instruction_type, offset);

`njs_vmcode.c` uses a computed-goto dispatch table
(`NJS_GOTO_ROW(opcode)` entries). Each opcode has a `CASE` handler.

| Opcode | Structure | Description |
|---|---|---|
| `FUNCTION_FRAME` | `njs_vmcode_function_frame_t` | Create call frame, `this = undefined` |
| `METHOD_FRAME` | `njs_vmcode_method_frame_t` | Property lookup + call frame, `this = object` |
| `FUNCTION_CALL` | `njs_vmcode_function_call_t` | Invoke the prepared frame |
| `PROPERTY_GET` | `njs_vmcode_3addr_t` | `dst = obj[key]` (computed key) |
| `PROPERTY_ATOM_GET` | `njs_vmcode_3addr_t` | `dst = obj.name` (string/number key, optimized) |
| `PROPERTY_DELETE` | `njs_vmcode_3addr_t` | `delete obj[key]`, returns boolean |
| `MOVE` | `njs_vmcode_move_t` | `dst = src` |
| `COALESCE` | `njs_vmcode_test_jump_t` | `??` operator: if value is null/undefined, jump |
| `OPTIONAL_CHAIN` | `njs_vmcode_test_jump_t` | `?.` operator: if null/undefined, set undefined + jump |
| `TEST_IF_TRUE` | `njs_vmcode_test_jump_t` | `\|\|` operator |
| `TEST_IF_FALSE` | `njs_vmcode_test_jump_t` | `&&` operator |
| `DELETE` | `njs_vmcode_2addr_t` | Always returns true (non-reference delete) |

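
To make the `OPTIONAL_CHAIN` row and the emit-then-patch pattern concrete, here is a toy generator and interpreter in standalone C. Everything in it is hypothetical (`demo_*` names, a two-register machine, `-1` standing in for null/undefined, a multiply-by-ten "body" standing in for the chain body such as a property get, and a `switch` instead of computed goto). It is a sketch of the technique under those assumptions, not the njs instruction format.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NULLISH   (-1)     /* stand-in for null/undefined */
#define DEMO_CODE_MAX  8

typedef enum { DEMO_OPTIONAL_CHAIN, DEMO_BODY, DEMO_STOP } demo_op_t;

typedef struct {
    demo_op_t  op;
    int        src;      /* register tested or read */
    int        dst;      /* register written */
    size_t     offset;   /* relative jump, used by DEMO_OPTIONAL_CHAIN only */
} demo_insn_t;

typedef struct {
    demo_insn_t  code[DEMO_CODE_MAX];
    size_t       length;
} demo_program_t;

/* Appends an instruction and returns its position for later patching. */
static size_t
demo_emit(demo_program_t *p, demo_op_t op, int src, int dst)
{
    demo_insn_t  *insn;

    assert(p->length < DEMO_CODE_MAX);

    insn = &p->code[p->length];
    insn->op = op;
    insn->src = src;
    insn->dst = dst;
    insn->offset = 0;

    return p->length++;
}

/* Generates code for "r0?.body" into r1 using the test-jump phases. */
static void
demo_generate(demo_program_t *p)
{
    size_t  jump;

    p->length = 0;

    /* Phase 2: emit the test-jump while its target is still unknown. */
    jump = demo_emit(p, DEMO_OPTIONAL_CHAIN, 0, 1);

    /* Chain body: only reached when r0 is not nullish. */
    demo_emit(p, DEMO_BODY, 0, 1);

    /* Phase 3: patch the jump to land just past the body. */
    p->code[jump].offset = p->length - jump;

    demo_emit(p, DEMO_STOP, 0, 0);
}

/* Interprets the program with r0 = base and returns r1. */
static int
demo_run(const demo_program_t *p, int base)
{
    int                regs[2] = { base, 0 };
    size_t             pc = 0;
    const demo_insn_t  *insn;

    for ( ;; ) {
        insn = &p->code[pc];

        switch (insn->op) {

        case DEMO_OPTIONAL_CHAIN:
            if (regs[insn->src] == DEMO_NULLISH) {
                regs[insn->dst] = DEMO_NULLISH;   /* set undefined + jump */
                pc += insn->offset;
                break;
            }

            pc++;
            break;

        case DEMO_BODY:
            regs[insn->dst] = regs[insn->src] * 10;
            pc++;
            break;

        case DEMO_STOP:
        default:
            return regs[1];
        }
    }
}
```

With a non-nullish base the body runs and its result lands in the destination register; with a nullish base the test-jump writes the "undefined" stand-in itself and skips the body entirely, so no separate MOVE is needed on the short-circuit path in this model.
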

    CASE (NJS_VMCODE_METHOD_FRAME):
        // operand2 = object, operand3 = method name/key
        // 1. Convert key to string if needed
        // 2. njs_value_property_val(vm, object, key, &method): property lookup
        // 3. Check method is a function
        // 4. njs_function_frame_create(vm, &method, object, nargs, ctor):
        //    creates frame with this = object

Note: `METHOD_FRAME` does its own property lookup, so it re-evaluates
the property access. For optional chaining method calls (`o.m?.()`),
the property is therefore accessed twice: once for the null check and
once for the `METHOD_FRAME`. This is acceptable for njs (no Proxy
support).


    CASE (NJS_VMCODE_FUNCTION_FRAME):
        // operand2 = function value
        // njs_function_frame_create(vm, function, &njs_value_undefined, nargs, ctor):
        //    creates frame with this = undefined

Token types are defined in `src/njs_lexer.h`. They serve a dual purpose:

- Lexer tokens (produced by the lexer during tokenization)
- AST node types (used in `njs_parser_node_t.token_type`)

Some token types are AST-only (never produced by the lexer):

    NJS_TOKEN_OPTIONAL_CHAIN,       // wrapper for ?. chain
    NJS_TOKEN_OPTIONAL_CHAIN_REF,   // placeholder for base index
    NJS_TOKEN_PROPERTY_DELETE,      // delete obj.prop
    NJS_TOKEN_METHOD_CALL,          // obj.method()
    NJS_TOKEN_FUNCTION_CALL,        // func()

Each token type needs a serialization entry in
`njs_parser_serialize_node()` (at the end of `njs_parser.c`) for error
messages and debugging:

    njs_token_serialize(NJS_TOKEN_OPTIONAL_CHAIN);
    njs_token_serialize(NJS_TOKEN_OPTIONAL_CHAIN_REF);

Files to touch when adding a new opcode:

- `src/njs_vmcode.h`: Add to the `njs_vmcode_t` enum.
- `src/njs_vmcode.c`: Add `NJS_GOTO_ROW()` entry in the dispatch table. Add `CASE` handler.
- `src/njs_disassembler.c`: Add disassembly output block.
- `src/njs_generator.c`: Add dispatch in `njs_generate()` and implement the generation function(s).
- `src/njs_lexer.h`: If new AST-only token types are needed, add before `NJS_TOKEN_RESERVED`.
- `src/njs_parser.c`: Add token serialization at the end.

The parser node structure (abridged):

    struct njs_parser_node_s {
        njs_token_type_t          token_type:16;
        uint8_t                   ctor:1;        // constructor call flag
        uint8_t                   hoist:1;       // statement hoisting (imports)
        uint8_t                   temporary;     // temp index flag
        uint32_t                  token_line;

        union {
            uint32_t                  length;
            njs_variable_reference_t  reference;
            njs_value_t               value;      // literal values
            njs_vmcode_t              operation;  // opcode for operations
            njs_parser_node_t         *object;
            njs_mod_t                 *module;
        } u;

        njs_str_t                 name;
        njs_index_t               index;         // result index after generation
        njs_parser_node_t         *left;         // first child / object
        njs_parser_node_t         *right;        // second child / property name
        njs_parser_node_t         *dest;         // destination hint
        // ... scope, etc.
    };

`node->index` is set during code generation. It encodes:

    index | level_type (4-bit) | var_type (4-bit)

Level types: LOCAL=0, CLOSURE=1, GLOBAL=2, STATIC=3.
After generating a node, its index field contains the register/slot
where the result is stored at runtime.
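
A packed index of this shape can be built and taken apart with shifts and masks. The sketch below assumes the fields are laid out from high to low bits in the order written above (slot number, then 4 bits of level type, then 4 bits of variable type in the lowest nibble). The notes do not state the bit positions, so treat both the layout and the `demo_*` names as assumptions rather than the njs definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Level types as listed in the notes. */
enum {
    DEMO_LEVEL_LOCAL = 0,
    DEMO_LEVEL_CLOSURE = 1,
    DEMO_LEVEL_GLOBAL = 2,
    DEMO_LEVEL_STATIC = 3
};

#define DEMO_FIELD_BITS  4
#define DEMO_FIELD_MASK  ((uint32_t) 0x0f)

/* Packs slot | level_type | var_type into one word (assumed bit order). */
static uint32_t
demo_index_pack(uint32_t slot, uint32_t level, uint32_t var_type)
{
    /* The slot must fit in the bits left over after the two 4-bit fields. */
    assert(slot <= (UINT32_MAX >> (2 * DEMO_FIELD_BITS)));

    return (slot << (2 * DEMO_FIELD_BITS))
           | ((level & DEMO_FIELD_MASK) << DEMO_FIELD_BITS)
           | (var_type & DEMO_FIELD_MASK);
}

/* Slot number: everything above the two 4-bit fields. */
static uint32_t
demo_index_slot(uint32_t index)
{
    return index >> (2 * DEMO_FIELD_BITS);
}

/* Level type: the second-lowest nibble. */
static uint32_t
demo_index_level(uint32_t index)
{
    return (index >> DEMO_FIELD_BITS) & DEMO_FIELD_MASK;
}

/* Variable type: the lowest nibble. */
static uint32_t
demo_index_var_type(uint32_t index)
{
    return index & DEMO_FIELD_MASK;
}
```

Under this assumed layout, slot `0x12` at the GLOBAL level with variable type `3` packs to `0x1223`, and the three accessors recover the original fields.
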

For `obj?.prop`:

    OPTIONAL_CHAIN (op: NJS_VMCODE_OPTIONAL_CHAIN)
    +-- left: NAME(obj)                  <- base, null-checked
    +-- right: PROPERTY(ref, "prop")     <- chain body

OPTIONAL_CHAIN_REF is a placeholder node whose index is set by the
generator to match the base's index, so the chain body can reference
the base's computed value without re-evaluating it.

For `obj.method?.()`, the `?.()` pattern on a property base:

    OPTIONAL_CHAIN
    +-- left: PROPERTY(NAME(obj), STRING("method"))
    +-- right: METHOD_CALL
        +-- left: PROPERTY(ref_obj, ref_method)    <- two refs
        +-- right: args

The generator walks the right subtree's left spine to find a METHOD_CALL whose PROPERTY has an OPTIONAL_CHAIN_REF as its right child (distinguishing it from a normal METHOD_CALL, where right is a STRING). It then extracts object/method-name indices from the base PROPERTY:

- `ref_obj->index = base_prop->left->index` (the object)
- `ref_method->index = base_prop->right->index` (the method name)

The METHOD_FRAME re-evaluates the property access but correctly sets
this to the object.

The `njs_generate_optional_chain_ref()` helper walks `node->left`
repeatedly to find the first `NJS_TOKEN_OPTIONAL_CHAIN_REF` in a
subtree. This works because AST nodes chain through left pointers:

    PROPERTY -> left -> METHOD_CALL -> left -> PROPERTY -> left -> ref

For the two-ref method call case, the generator walks the left spine looking for a METHOD_CALL with the special PROPERTY(ref, ref) pattern instead of using the simple ref-finder.

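
The simple ref-finder amounts to a linked-list search along `left`. Here is a minimal standalone sketch of that walk. `demo_node_t`, the `DEMO_*` token values, and `demo_find_ref` are hypothetical stand-ins, and the two-ref `PROPERTY(ref, ref)` detection is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEMO_PROPERTY,
    DEMO_METHOD_CALL,
    DEMO_OPTIONAL_CHAIN_REF,
    DEMO_NAME
} demo_token_t;

typedef struct demo_node_s  demo_node_t;

struct demo_node_s {
    demo_token_t  type;
    demo_node_t   *left;
    demo_node_t   *right;
};

/* Follows left pointers until the first OPTIONAL_CHAIN_REF, or NULL. */
static demo_node_t *
demo_find_ref(demo_node_t *node)
{
    while (node != NULL) {
        if (node->type == DEMO_OPTIONAL_CHAIN_REF) {
            return node;
        }

        node = node->left;
    }

    return NULL;
}
```

For a chain shaped like `PROPERTY -> METHOD_CALL -> PROPERTY -> ref`, the walk returns the placeholder at the bottom of the spine; a subtree with no placeholder on its left spine yields `NULL`, which a caller would have to treat as "nothing to patch".
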

    ./build/njs -d -c "expression"

    ./configure --debug-opcode=YES && make njs
    ./build/njs -o script.js

    ./configure --debug-generator=YES && make njs

    ./configure --address-sanitizer=YES && make njs

    // In JS code:
    debugger;

    gdb ./build/njs
    (gdb) break njs_vmcode_debugger
    (gdb) run script.js
    # Then inspect: p *njs_scope_value_get(vm, 0x0123)

| File | Purpose |
|---|---|
| `src/njs_lexer.h` | Token type enum, lexer structures |
| `src/njs_lexer.c` | Tokenizer implementation |
| `src/njs_parser.h` | Parser node structure, parser state |
| `src/njs_parser.c` | Parser state machine (~9600 lines) |
| `src/njs_generator.c` | Code generator state machine (~5200 lines) |
| `src/njs_vmcode.h` | VM opcode enum, instruction structures |
| `src/njs_vmcode.c` | VM interpreter loop (~1800 lines) |
| `src/njs_disassembler.c` | Bytecode disassembly output |
| `src/njs_scope.h` | Scope/index value access (`njs_scope_value`) |
| `src/njs_variable.c` | Variable resolution |
| `src/njs_function.c` | Function/frame management |
| `src/test/njs_unit_test.c` | Unit test cases (~23000 lines) |

- 4 spaces indentation, no tabs
- 80 character line limit
- C declarations at the top of function bodies
- `njs_slow_path()`/`njs_fast_path()` for branch prediction hints
- Pool-based memory allocation (`njs_mp_alloc`), no individual frees for parser nodes
- Error returns: `NJS_ERROR` for fatal, `NJS_DONE` for parse errors with message, `NJS_DECLINED` for "not handled"