An unfinished design document I worked on w/ Claude.
Didn't complete it because I don't really have a use case for it and C and Clojure have more incompatibilities than I'd assumed.
Note that this doesn't include the full "conversation" as I got Claude to remove comments once they were resolved.
Things outside of quotes are the design elements.
Quotes like this
Note
Or notes like this are from me
[!NOTE] π€
Notes like this are from Claude
This allowed me to have a "conversation" with the design doc over several sessions.
(defn ^:static helper ^void [] ;; static void helper(void)...)
(defn ^:inline fast-add ^int [^int a ^int b] ;; static inline int fast_add(...)
(+ a b))
(defn ^:export api-func ^void [] ;; with visibility attribute for shared libs...)
Variadic Functions
Note
We should stick to the clojure norms here
What would the typing be?
[!NOTE] π€
Tricky because C's ... is untyped. Options:
Untyped varargs (match C): & args compiles to ..., no type annotation possible
va_list passthrough: ^va_list args for functions that just forward
Typed array alternative: For type safety, encourage ^int* args ^size_t n instead
The & args syntax is Clojure-idiomatic. Just document that it maps to C's ... and is inherently untyped. The type annotation would go on the non-variadic params only.
Yeah, varargs are a bit weird in C it seems
The typing seems fine (as untyped)
I've updated the code, what would this compile to?
[!NOTE] π€
This is tricky. The issue is that C's ... doesn't give you a handle to the args; you need va_list to access them. The vprintf approach only works if you have a va_list. Two options:
Option A: Function can't access args directly
(defn my-printf ^int [^char* fmt & ...] ;; & ... means "passthrough only"
(printf fmt ...)) ;; special syntax to forward varargs
But C can't forward ... to another variadic function directly.
Expands at compile time, so varargs work naturally.
I'd suggest: & args compiles to ... but is only usable for calling other variadic functions via a compiler-generated wrapper that uses va_start/va_end. Or just document that variadics are best handled via macros.
Hmm, I think ... is out because it doesn't match any existing clojure syntax
How does squint handle this? Is there other prior art for clojure-like languages?
What's the minimum we can support here that's strictly clojure?
Maybe just not supporting varargs at all to start with?
[!NOTE] π€
How Squint handles varargs: Squint's & args compiles to JS rest parameters (...args), which are proper arrays. JS handles this natively, so no special treatment needed.
Prior art:
Carp: Has (defn foo [x & rest] ...) but rest becomes an Array, so there's no direct C varargs mapping
ClojureScript: Same as Squint, relies on JS
C itself: Varargs are fundamentally different from collected args; you can't iterate them without macros
Minimum viable approach: Not supporting & args for C varargs initially is reasonable. Options:
Defer: Document that C variadics are out of scope for v1. Users use macros or explicit arrays.
Macro-only: & args only works in defmacro, expands at compile time
va_list explicit: Support ^va_list as a type for wrapping existing variadic C functions
;; Option 3: Explicit va_list for wrapping
(defn my-vprintf ^int [^char* fmt ^va_list args]
(vprintf fmt args))
;; Calling variadic C functions works naturally
(printf"hello %s %d" name age) ;; just emits printf("hello %s %d", name, age)
I'd suggest: defer defining variadic functions but allow calling C variadic functions freely. Most use cases are calling printf/sprintf, not defining new variadics.
;; Calling variadic C functions works
(printf"Value: %d\n" x)
;; Defining variadic functions: use macros instead
(defmacro log [fmt & args]
`(printf ~fmt ~@args))
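As a rough sketch, assuming the hypothetical log macro above and a call like (log "x=%d\n" x), the macro expands at the Clojure level and the emitted C is just an ordinary printf call:

```c
#include <stdio.h>

int main(void) {
    int x = 42;
    /* (log "x=%d\n" x) expands to (printf "x=%d\n" x),
       which would presumably be emitted as a plain call: */
    printf("x=%d\n", x);
    return 0;
}
```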
Hmm, this is interesting. It's wrapping up a common C pattern in a nice syntax
I wonder if we could instead have the struct syntax support unions and have a special macro for tagged unions?
[!NOTE] π€
Good instinct. You could decompose it:
;; Low-level: raw union (no tag)
(defunion IntCollData
[^int* array-data
^IntNode* list-head])
;; Low-level: manual tagged struct
(defstruct IntColl
[^IntCollTag tag
^IntCollData data])
;; High-level: macro that generates both
(deftagged IntColl
(Array ^int* data ^size_t len)
(List ^IntNode* head))
This way defunion and defstruct are primitive, and deftagged (or a user macro) builds on them. More composable, less magic in the core.
I was more thinking there is struct and union functions like:
(defs
(struct
[^int* tag
(union [^(struct [^int* data ^size_t len]) array])
(union [^(struct [^IntNode* head]) list])]))
Or something like that anyway, I'm not sure the above would work.
What would?
[!NOTE] π€
The idea of composable struct and union forms is good. Here's a syntax that would work:
;; Named union (like defstruct but for unions)
(defunion Value
[^int i
^float f
^char* s])
;; Anonymous union inline in a struct
(defstruct Event
[^EventType type
^(union [^KeyData key]
[^MouseData mouse]) data])
;; Anonymous struct inline (less common but possible)
(defunion Packet
[^(struct [^uint16 len ^uint8 flags]) header
^uint32 raw])
The key insight: ^(union ...) and ^(struct ...) as type annotations for anonymous inline types. This parallels ^(array int 10) and ^(fn int -> int).
For a tagged union specifically:
(defstruct IntColl
[^IntCollTag tag
^(union [^(struct [^int* data ^size_t len]) array]
[^(struct [^IntNode* head]) list]) data])
The deftagged macro could then be sugar that generates both the enum and this struct pattern.
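For reference, a hedged sketch of the C such a tagged-union pattern would presumably generate (the enum constant names and the forward-declared IntNode are illustrative, not decided anywhere above):

```c
#include <stddef.h>

typedef struct IntNode IntNode;  /* assumed to be defined elsewhere */

typedef enum { INT_COLL_ARRAY, INT_COLL_LIST } IntCollTag;

typedef struct {
    IntCollTag tag;
    union {
        struct { int *data; size_t len; } array;  /* Array variant */
        struct { IntNode *head; } list;           /* List variant */
    } data;
} IntColl;
```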
I was thinking more than it being a type-function (which on looking at further doesn't seem very clojure-y)
that instead union was a function which wrapped the union {...} syntax
(defstruct IntColl
[^IntCollTag tag
(union [^(struct [^int* data ^size_t len]) array]
[^(struct [^IntNode* head]) list]) data])
Possibly the same would be true for struct? Although I'm not sure how that would work syntactically
Could you consider the above, plus also provide any prior art there is for function call-like type hints in clojure itself (or its libraries)?
[!NOTE] π€
I like the (union ...) form approach; it's more Clojure-y than type annotations. Here's how it could work:
(defstruct IntColl
[^IntCollTag tag
(union
[^int* data ^size_t len] ;; variant 1: array
[^IntNode* head]) ;; variant 2: list
])
The challenge is naming the variants and the union field itself. Options:
;; Option A: Named variants inline
(defstruct IntColl
[^IntCollTag tag
(union :data ;; field name
[:array ^int* data ^size_t len] ;; variant name + fields
[:list ^IntNode* head])])
;; Option B: Separate field name
(defstruct IntColl
[^IntCollTag tag
^(union [:array [^int* data ^size_t len]]
[:list [^IntNode* head]]) data])
Prior art for function-like type hints in Clojure:
Malli: Uses vectors like [:map [:x :int]], [:vector :int]
Spec: Uses (s/keys :req [:x]), (s/coll-of int?)
These all use function-call-like forms for complex types. So (union ...) and (struct ...) as forms (not type annotations) fit the pattern. The key question is: are they forms that emit code, or type annotations?
I'd suggest: (union ...) is a form within defstruct that describes an anonymous union field:
(defstruct IntColl
[^IntCollTag tag
(union data ;; field name first
(variant :array [^int* ptr ^size_t len])
(variant :list [^IntNode* head]))])
For runtime type choice:
(defenum IntColl
(Array ^int* data ^size_t len)
(List ^IntNode* head))
int *find(int *arr, size_t len, int target) {
  for (size_t i = 0; i < len; i++) {
    if (arr[i] == target) return arr + i;
  }
  return NULL;
}
Break and Continue
No explicit break or continue. Use loop/recur with conditional logic:
;; Print even numbers less than 50
(loop [i 0]
(when (< i 50)
(when (even? i)
(printf"%d\n" i))
(recur (inc i))))
Outputs:
for (int i = 0; i < 50; i++) {
  if (i % 2 == 0) {
    printf("%d\n", i);
  }
}
Arrays and Pointers
Array Access
(aget arr i) ;; arr[i]
(aset arr i val) ;; arr[i] = val
(aget matrix r c) ;; matrix[r][c]
Pointer Operations
Note
Would deref work better for getting the address of a pointer? Possibly not. I don't want to introduce much if any new syntax to c
Is pointer arithmetic compiled to a different type in C? Or can it use the same arithmetic as in clojure? (+ - etc)
Casting is interesting, there's likely some better syntax though? Possibly:
(my-func ^void* ptr) - cast when tagged at usage site (not definition)
[!NOTE] π€
Pointer arithmetic: In C, ptr + n is type-aware (adds n * sizeof(*ptr)). Using (+ ptr 5) would work if the compiler knows ptr is a pointer type; it can emit ptr + 5 directly. No need for ptr+ if we have type info.
Casting at call site: I like (my-func ^void* ptr); the metadata says "cast this before passing". Clean and consistent with type annotations. Could work for any expression:
deref vs addr: deref for *ptr is Clojure-ish (atoms use deref). For address-of, & reader macro would be nice but might conflict. (addr x) is fine, or even just prefix &x if the reader supports it.
Great, for arithmetic just use the existing operators (+ - etc)
For casting add a note and examples of that
I definitely don't want to add any reader syntax as clojure has barely any of that
The problem with implementing deref for a value to get its pointer is that we can no longer use deref for other types of values
What other options do we have that are nice and ergonomic? but still fit in all my requirements?
[!NOTE] π€
You're right that deref is overloaded in Clojure (atoms, refs, agents, futures). Options for pointer dereference:
deref with type dispatch: If compiler knows it's a pointer type, deref means *ptr. Risk: confusion.
* as a function: (* ptr), but it conflicts with multiplication. Bad.
ptr-get/ptr-set!: Explicit, clear, ugly.
@ reader macro: @ptr, but you don't want new reader syntax.
.- field access style: (.- ptr) is too cryptic.
Just use deref: It's the closest Clojure concept. The compiler knows types, so (deref atom-like-thing) vs (deref ^int* ptr) are distinguishable.
Recommendation: Keep deref for pointer dereference. In a C-targeting language, you won't have Clojure atoms/refs anyway. The mental model is "deref gets the value at a reference" β which is exactly what *ptr does.
For address-of, addr is fine and readable.
;; Pointer dereference
(deref ptr)          ;; *ptr

;; Address-of
(addr x)             ;; &x

;; Pointer arithmetic (just use regular operators)
(+ arr 5)            ;; arr + 5 (type-aware)
(- p q)              ;; p - q

;; Casting at call site
(func ^void* ptr)    ;; func((void*)ptr)
(func ^int (+ x y))  ;; func((int)(x + y))
Array Literals
Note
Feels like the syntax for a stack array can be worked on a bit more
[!NOTE] π€
Options:
;; Current
(stack-array int 100)

;; More Clojure-ish: make-array with type
(make-array int 100)         ;; Clojure has this!

;; Using type annotation on literal
^(array int 100) []          ;; empty fixed array

;; Just use let binding with type
(let [^(array int 100) arr]  ;; declares int arr[100]
  ...)
make-array exists in Clojure and is idiomatic. The distinction between stack/heap could be context-dependent (inside let = stack, returned = must be heap).
I like make-array, would metadata be the wrong way to mark whether this is stack-allocated or not?
Hmm, for heap stuff how is that done in C? Isn't it done explicitly with malloc/calloc?
I don't want things to magically be on the heap unless the user explicitly does that, or a function which is documented does that
[!NOTE] π€
You're exactly right. In C:
Stack: int arr[100]; is automatic storage, no explicit allocation
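A minimal C sketch of the distinction being discussed: stack arrays are automatic, and heap memory only appears when the user calls malloc/free explicitly (the function name and sizes here are illustrative):

```c
#include <stdlib.h>

void example(void) {
    int on_stack[100];              /* automatic storage, released when the function returns */

    int *on_heap = malloc(100 * sizeof *on_heap);  /* explicit heap allocation */
    if (on_heap != NULL) {
        /* ... use on_heap ... */
        free(on_heap);              /* explicit deallocation, nothing happens implicitly */
    }
    (void)on_stack;
}
```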
Functions like map, filter, reduce are macros that expand to loops:
Note
I think it's worth noting the reason for this
Afaik, it's for typing issues
Otherwise you would only be able to define reduce for one type signature, which is limiting
One thing that might need to be expanded on is how the types do get inferred by macros, and here specifically
What does the macro look like to do that?
[!NOTE] π€
Yes, the core reason is: C has no generics (aside from C11's _Generic), so you can't write one reduce function that works on all types.
How macros get type info: The macro would need access to compiler metadata. Options:
Explicit type args: (reduce ^int + 0 xs), where the user specifies the accumulator type
Infer from initial value: 0 is int, 0.0 is double; the macro deduces the type
Infer from collection type: If xs is annotated ^int*, macro knows element type
Basically types aren't checked during macro expansion
[!NOTE] π€
You're right: the macro doesn't need type info. It just expands to code, and the C compiler handles types. The expanded code uses the same variables, so types flow through naturally:
(reduce + 0 xs) ;; xs is ^int*, 0 is int
Expands to code where acc inherits type from 0 (int), and (aget xs i) inherits from xs (int). The C compiler sees:
int acc = 0;
for (...) { acc = acc + xs[i]; }
No type tracking needed in the macro; just textual expansion. The C type system does the rest.
(reduce + 0 xs)
Expands to:
int acc = 0;
for (size_t i = 0; i < xs_len; i++) {
  acc = acc + xs[i];
}
Closures
Note
This is an area I'm still unsure about
It feels like the language should pick up clojure's semantics here
But I don't like the idea of this language being too "thick" of a layer over C
This feels like something to consider more broadly across the whole language
Should most of the language be 1:1 transpilation, with some features as improvements?
Should we make everything fully representable?
Perhaps let's come back to this after considering more of the rest of the design.
We should also consider prior art here for other transpilation languages, what behaviour would be more or less surprising to a user?
[!NOTE] π€
Prior art from other transpilers:
Carp: Closures that capture must be explicit with copy or reference. No implicit heap allocation.
Nim: Has closures but they're heap-allocated with GC. Explicit {.closure.} pragma.
Zig: No closures at all. You pass context explicitly.
Rust: Closures work, but ownership/borrowing makes captures explicit.
Spectrum of options:
Zig-style (thin): No closures. Always pass context explicitly. Most C-like.
Carp-style (medium): Inline closures work, escaping closures require explicit handling.
I'd suggest Carp-style: inline closures "just work" (no allocation), escaping closures are a compiler error that tells you to restructure. This is honest about C's limitations while still being ergonomic for common cases.
Could you show me various examples of the medium approach?
What would it look like to handle it "manually" in that case?
[!NOTE] π€
Medium approach examples:
1. Inline closure β works automatically:
(let [multiplier 10]
(map (fn [x] (* x multiplier)) xs))
Compiler inlines multiplier since it's in scope. No allocation.
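A hedged sketch of what the inline-closure case might compile to in C. The function name, output buffer, and length parameter are illustrative; the point is that the captured multiplier stays a plain local and nothing is heap-allocated:

```c
#include <stddef.h>

/* Hypothetical output for:
   (let [multiplier 10] (map (fn [x] (* x multiplier)) xs))
   where xs is an int array of known length xs_len and the
   result is written into out. */
void scale_all(const int *xs, size_t xs_len, int *out) {
    int multiplier = 10;                 /* the captured binding is just a local */
    for (size_t i = 0; i < xs_len; i++) {
        out[i] = xs[i] * multiplier;     /* closure body inlined into the loop */
    }
}
```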
In this area I'm in two minds
I definitely want it to be possible to write code that compiles transparently to C without much ceremony
But it also seems very useful to provide some level of abstraction for areas like this
Maybe there's a "core" set of functions that have zero abstraction but certain features have more abstraction built in?
Maybe functions work on structs and arrays normally, but we provide a Vector (dynamic array) and maps with more features that work with
more functions.
I.e. you can only use reduce, map, filter, assoc etc when using types with a specific interface?
It would also be nice to support as much of the clojure standard library as possible with zero abstraction
[!NOTE] π€
I think a layered approach works:
Layer 0 (Core): Pure transpilation, zero magic
defn, defstruct, let, if, loop, arithmetic, aget/aset
Works on raw C types
Layer 1 (Macros): Zero-cost abstractions via macros
reduce, map, filter, doseq expand to loops
Work on anything with known length (arrays, structs with len field)
Monomorphize at compile time
Layer 2 (Runtime library, opt-in): Provided data structures
Vec, HashMap, String: dynamic, heap-allocated
Full assoc, get, conj support
Could use Boehm GC or manual memory
This way: simple C code uses Layer 0, functional style uses Layer 1 (still zero-cost), and people who want richer collections opt into Layer 2.
Hmm, is there a concept similar to interfaces we can use?
In clojure there are protocols, which we could consider as something similar
[!NOTE] π€
Protocol-like patterns in C:
1. Compile-time protocols (via macros):
A "protocol" could be a contract that macros check at expansion time:
3. Monomorphizing macro approach:
Most practical: reduce, map etc. are macros that generate type-specific code. They don't need protocols if they can see the type at macro-expansion time.
I'd suggest: start without protocols, see if macros + type annotations are sufficient. Add compile-time protocols later if there's demand.
Or simpler: <stdio.h> vs "myheader.h" as literal strings.
I think using the C syntax for requires makes sense:
(ns mylib
(:require ["<stdio.h>"]
["myheader.h"]))
What about :as, is that a feature that makes sense? Or should we require :refer :all or something
or maybe just support use which I guess does the same thing?
[!NOTE] π€
:as for C headers: Doesn't map well to C. C #include is textual inclusion; there's no namespacing. Everything becomes globally visible. So :as would be misleading.
What makes sense:
(:require ["<stdio.h>"]) β just includes, all symbols visible
No :as, no :refer β not applicable to C includes
For this project's own modules (other .cljc files), you could potentially have:
(ns mylib
(:require ["<stdio.h>"] ;; C header, just include
[myutils :as u])) ;; our .cljc module, could prefix
But that requires the compiler to track symbols and generate prefixed names. Low priority: start with just includes, no namespacing. C devs expect global symbols anyway.
Include
Use :require in the ns form with strings. Angle brackets vs quotes detected from string content:
"<stdio.h>" β #include <stdio.h> (starts with <)
"myheader.h" β #include "myheader.h" (no angle brackets)
I'm hoping we can avoid this with more thought for the design
I can see there's a use for these for interacting with other C files, but maybe we can have better first class support for that?
What might it look like?
Maybe something in the ns form? That might be too specific though
[!NOTE] π€
Common c-raw use cases and how to replace them:
| Use case | Instead of c-raw |
| --- | --- |
| Include guards | (ns foo (:guard :once)) or just document "use pragma once" |
| Pragma once | (pragma :once) as a first-class form |
| Define constants | (def ^:macro NAME value) |
| Define macros | Use Clojure macros instead |
| Platform ifdefs | (when-defined _WIN32 ...) |
| Extern C | (ns foo (:extern-c true)) for C++ compat |
You could probably eliminate 90% of c-raw needs. Keep it as an escape hatch but discourage use. If people need it a lot, that's a signal to add first-class support.
I think a pragma form makes sense which we can include in ns similar to require.
There are surely other pragmas that we might want to support in the future too
I think it makes sense for #define to be supported in def, I'm not sure about :macro. What other syntax could we use?
Are there libraries that require us to define C macros to use them?
I don't like when-defined, maybe we can have a defined? form and pattern match (when/if (defined? ...)). Are there cases of that in Squint?
I think we need to support extern C too, I'm not sure it makes sense in the ns form though. What would it look like as a separate form or as metadata?
[!NOTE] π€
Pragma in ns: Yes, (:pragma :once) or (:pragma ["pack" "push" 1]) for more complex ones.
#define via def: ^:define or ^:c-macro could work:
(def ^:define BUFFER_SIZE 4096) ;; #define BUFFER_SIZE 4096
(def ^:define MAX (fn [a b] ...)) ;; #define MAX(a, b) ...
Libraries requiring #defines: Yes, common patterns:
#define STB_IMAGE_IMPLEMENTATION before including stb headers
#define GLAD_GL_IMPLEMENTATION
Feature toggles like #define USE_OPENGL
defined? form: I like (if (defined? SYMBOL) ...) better than when-defined. It composes:
(if (defined? _WIN32)
(windows-impl)
(posix-impl))
Squint doesn't have this; JS doesn't have preprocessor conditionals.
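As a sketch, assuming (defined? _WIN32) compiles to a preprocessor conditional, the emitted C would presumably look like the following (the function name and the printf bodies stand in for the two implementations):

```c
#include <stdio.h>

/* Hypothetical output for:
   (if (defined? _WIN32) (windows-impl) (posix-impl)) */
void platform_impl(void) {
#ifdef _WIN32
    printf("windows implementation\n");   /* (windows-impl) */
#else
    printf("posix implementation\n");     /* (posix-impl) */
#endif
}
```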
(ns vec ; Just a name, doesn't mean anything
(:require ["<stdio.h>"] ; => #include <stdio.h>
["other.h"] ; => #include "other.h"
[something :as s])) ; Maybe has some other features or something
What does this translate to in Squint for comparison?
An interesting inspiration might be :gen-class
We could take inspiration from that for prefixing functions like some libraries do, maybe cram some other features into that
:prefix for function name prefixing (like :gen-class)
:export to list public symbols
(ns my.vec
(:require ["<stddef.h>"])
(:pragma :once)
(:prefix "vec_") ;; vec_new, vec_push, etc.
(:export [new push pop])) ;; only these are non-static
I'm not sure about :prefix but it seems fine for now, I think :export makes more sense as metadata on a def/defn/declare though
Remove export and remove this note
User Controls Output Type
The compiler only transpiles. User decides the output filename:
mycompiler vec.cljc -o vec.h # output as header
mycompiler vec.cljc -o vec.c # output as implementation
mycompiler main.cljc -o main.c
No Automatic Include Guards
Use (:pragma :once) in the ns form:
(ns vec
(:pragma :once))
Outputs:
#pragma once
Header vs Implementation Pattern
Note
The declarations don't feel clojure-y to me, but maybe they don't need to?
Maybe these could be:
(defstruct ^:declare Vec)
(declare ^:struct Vec)
[!NOTE] π€
I prefer (declare ^:struct Vec): it extends existing declare semantics with metadata. Consistent pattern:
(declare foo) forward declares a function (existing Clojure)
Clojure declare doesn't support specifying argument and return types
How else could this be specified? More metadata?
[!NOTE] π€
You're right that standard Clojure declare is just (declare foo). Options:
Option A: Extended declare (what we have)
(declare vec-new ^Vec* [])
Not standard Clojure, but clear and mirrors defn structure.
Option B: defn with :declare metadata
(defn ^:declare vec-new ^Vec* [] ...)
Problem: ... or empty body is weird.
Option C: Metadata map on declare
(declare ^{:returns Vec* :args []} vec-new)
More verbose but uses standard metadata.
Option D: defn without body = declaration
(defn vec-new ^Vec* []) ;; no body = declaration only
This overloads defn meaning based on body presence.
I'd suggest keeping the extended declare syntax: it's intuitive and parallels defn. It's a small deviation from Clojure, but this is a C-targeting language anyway.
vec.c.cljc (definitions):
(ns vec
(:require ["vec.h"]))
(defstruct Vec [^int* data ^size_t len ^size_t cap])
(defn vec-new ^Vec* []
...)
(defn vec-push ^void [^Vec* v ^int val]
...)
Header-Only Library Pattern
Use ifdef form for conditional compilation blocks:
(ns vec2
(:pragma :once))
;; Declarations (always visible)
(defstruct Vec2 [^float x ^float y])
(declare vec2-add ^Vec2 [^Vec2 a ^Vec2 b])
;; Implementation (guarded)
(ifdef VEC2_IMPL
(defn ^:static vec2-add ^Vec2 [^Vec2 a ^Vec2 b]
(Vec2. (+ (:x a) (:x b))
(+ (:y a) (:y b)))))
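A hedged sketch of the C this header-only pattern would presumably produce. The ^:static marker is dropped here so the declaration and definition agree, and consumers are assumed to define VEC2_IMPL in exactly one translation unit before including:

```c
#pragma once

/* Declarations (always visible) */
typedef struct { float x; float y; } Vec2;
Vec2 vec2_add(Vec2 a, Vec2 b);

/* Implementation (guarded) */
#ifdef VEC2_IMPL
Vec2 vec2_add(Vec2 a, Vec2 b) {
    Vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}
#endif
```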
I'm not sure about defconst, seems fine
Maybe (def ^:const BUFFER_SIZE 4096) would be better?
It feels like it would be difficult to tell from clojure whether something is a C macro or a runtime constant?
[!NOTE] π€
Good point about the distinction. Options:
;; C macro (#define)
(def ^:macro BUFFER_SIZE 4096)   ;; or (defmacro-c ...)

;; C const variable (static const int)
(def ^:const ^int BUFFER_SIZE 4096)
The difference matters:
#define has no type, evaluated at preprocessing
static const has a type, exists at runtime (but often optimized away)
Using metadata makes the distinction explicit. ^:macro for #define, ^:const for typed constant.
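For comparison, a minimal sketch of the two C forms those would presumably emit (MAX_CLIENTS is a made-up name so the two don't collide in one file):

```c
/* From (def ^:macro BUFFER_SIZE 4096): untyped, handled by the preprocessor */
#define BUFFER_SIZE 4096

/* From (def ^:const ^int MAX_CLIENTS 64): typed, a real object (usually optimized away) */
static const int MAX_CLIENTS = 64;
```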
Ok, I'm coming around to :macro now.
I think I would like some more examples of how :macro could be used throughout the language as well as in practice though
Can you give me some more of those in a note below?
[!NOTE] π€
Examples of ^:define (C preprocessor macros):
;; Simple constants
(def ^:define BUFFER_SIZE 4096)
(def ^:define PI 3.14159)
(def ^:define NULL 0)

;; Feature flags (common pattern with stb-style headers)
(def ^:define STB_IMAGE_IMPLEMENTATION) ;; no value = just defined

;; Macro functions
(def ^:define MAX (fn [a b] (if (> a b) a b)))
(def ^:define MIN (fn [a b] (if (< a b) a b)))
(def ^:define ARRAY_LEN (fn [arr] (/ (sizeof arr) (sizeof (aget arr 0)))))

;; Debug helpers using __LINE__, __FILE__
(def ^:define DEBUG_LOG (fn [msg] (printf "[%s:%d] %s\n" __FILE__ __LINE__ msg)))

;; Stringification (token pasting) - might need special syntax
(def ^:define STRINGIFY (fn [x] #x)) ;; needs thought
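The C these are meant to correspond to is standard preprocessor fare. A sketch of a few of the equivalents, with a small usage in main to show they behave as expected:

```c
#include <stdio.h>

#define BUFFER_SIZE 4096
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))
#define DEBUG_LOG(msg) printf("[%s:%d] %s\n", __FILE__, __LINE__, (msg))
#define STRINGIFY(x) #x

int main(void) {
    int xs[BUFFER_SIZE];
    DEBUG_LOG("array ready");
    printf("%zu elements, max(3,7)=%d, name=%s\n",
           ARRAY_LEN(xs), MAX(3, 7), STRINGIFY(xs));
    return 0;
}
```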
I wonder if we even need this?
Maybe lisp macros are sufficient?
I suppose, users may want to define C macros for use in other libraries?
[!NOTE] π€
Lisp macros are sufficient for your code. But C macros are needed when:
You're writing a header that C users will consume
You need preprocessor-specific features (__LINE__, __FILE__, token pasting)
You're wrapping existing C macro patterns
For pure Clojure-to-C projects, Lisp macros are better. But for interop with C ecosystem, having c-define as an escape hatch makes sense. Low priority though β can be added later if needed.
Could this instead be done using :define in def?
[!NOTE] π€
Yes! Using ^:define on def is cleaner:
;; Macro function via def
(def ^:define MIN (fn [a b] (if (< a b) a b)))
This is consistent with how we handle constants vs macros. No need for a separate c-define form.
As a side note, it would be cool if the language added an easy (but compatible with existing tools) way to configure builds
Maybe it's just a library that can be run or something?
Not important to start with though
[!NOTE] π€
A build.clj that runs with Babashka would be nice:
Future consideration: A build.clj script running via Babashka could provide a Clojure-native build system. Low priority; users can use Make/CMake initially.
Clojure Source (.cljc)
    ↓
[Reader]              - Clojure/JVM reads the source
    ↓
[Macro Expansion]     - Macros run on JVM
    ↓
[Code Generation]     - Walk AST, emit C
    ↓
C Source (.c/.h)
    ↓
[User's Build System] - gcc, clang, make, cmake, etc.
    ↓
Binary
Compiler Does
Note
Does it need to do the type checking? It could leave that to the C compiler completely maybe?
What benefits would there be to doing it ourselves?
Also possibly the build could work on a given folder like squint and be configured to compile files in place or to a different folder?
Broadly this is right though
[!NOTE] π€
Benefits of doing some type checking:
Better error messages: "expected ^int, got ^float at line 42" vs a cryptic C compiler error
Catch errors earlier: before generating invalid C
Enable type-based code gen: macros like reduce need to know types to expand correctly
IDE support: can provide autocomplete, error highlighting
What to check:
Minimal: Just track types for macro expansion, let C compiler do the rest
Medium: Check function call arities, basic type mismatches
Full: Replicate C's type system (probably overkill)
Type checking approach: Minimal. Catch obvious errors (arity, missing annotations, struct field access on non-structs) but delegate deep type checking to the C compiler. Generate #line directives to map C compiler errors back to source.
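For illustration, a sketch of how an emitted #line directive maps generated C back to the original source; the .cljc filename, line number, and function are hypothetical:

```c
/* Generated from vec.cljc; the directive makes the C compiler report
   errors in this region against vec.cljc line 12 instead of the .c file. */
#line 12 "vec.cljc"
int vec_sum(const int *data, int len) {
    int acc = 0;
    for (int i = 0; i < len; i++) acc = acc + data[i];
    return acc;
}
```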
Parse Clojure syntax
Expand macros (on JVM at compile time)
Type check (based on annotations)
Emit C code
Output to specified file
Compiler Does Not
Invoke C compiler
Generate build files
Add include guards automatically
Manage header/implementation split
Provide a runtime
Optional Runtime Library
Note
As mentioned above, maybe some functions require these types or specific interfaces?
Possibly they get included and assume the C compiler will tree-shake unused code if they don't get used?
[!NOTE] π€
C compilers don't tree-shake; they compile what you give them. Linkers can do dead code elimination with -flto (link-time optimization) or -ffunction-sections plus -Wl,--gc-sections.
Better approach: header-only library with guards:
// Only compile what's used
#ifdef CLJC_USE_VEC
// Vec implementation
#endif

#ifdef CLJC_USE_HASHMAP
// HashMap implementation
#endif
Or generate only the runtime functions that are actually used in the user's code. The compiler knows what's needed.
An optional library providing common utilities. User explicitly opts in.
Core Data Structures (if desired)
(require '[cljc.runtime :as rt])
;; Dynamic vector
(let [v (rt/vec-new)]
(rt/vec-push v 1)
(rt/vec-push v 2)
(rt/vec-free v))
Value Type (for dynamic typing)
(require '[cljc.dynamic :as d])
;; Boxed values with runtime type tags
(let [x (d/int 42)
y (d/string "hello")]
(d/print x)
(d/print y))
This would use Boehm GC or similar. Completely optional.
Open Questions and Gaps
Unresolved Design Decisions
Namespace handling: Do namespaces map to anything in C? Probably just for macro organization, no C output.
Possibly the name can be like a file path relative to the root? foo.bar.baz or maybe it means nothing. Unsure
I think the ns function is definitely useful as seen above for includes and pragmas
[!NOTE] π€
The namespace name could be used for:
Function name prefixing (if :prefix not set): my.vec/new → my_vec_new
Organizing macros at compile time
Documentation / tooling
But emitting nothing to C is also valid; just organizational. I'd say: optional prefixing, otherwise just for tooling.
Decision: Namespace names are organizational only, for macro scoping and tooling. No C output from namespace names unless the explicit :prefix option is used.
Map literals {:x 10 :y 20} require type context to work:
(let [^Point p {:x 10 :y 20}] ...) ;; type annotation tells compiler it's Point
Multi-dimensional Arrays
Use multi-arity aget/aset like standard Clojure:
(aget matrix i j) ;; matrix[i][j]
(aset matrix i j val) ;; matrix[i][j] = val
(aget cube x y z) ;; cube[x][y][z]
Collection Length for Seq Operations
Unsure here, needs more thought?
Maybe these functions require a function set that provides length?
[!NOTE] π€
Options for knowing collection length:
Convention: Arrays always come with length: (reduce + 0 xs xs-len)
Struct wrapper: (defstruct Slice [^int* data ^size_t len]); seq ops work on Slice
Macro inspects type: If xs is ^(array int 10), length is known. If ^int*, require explicit length.
Protocol-like pattern: Types that support seq ops must have a len field or function
Option 3 is cleanest: fixed arrays have compile-time length, pointers require explicit length parameter.
I really don't like the idea of requiring the length, it means changing the contract of these functions
What would a protocol-like pattern look like for this?
It seems like the only option that means that we don't change the syntax
[!NOTE] π€
Protocol-like pattern for seq operations:
The compiler could recognize types that are "seqable":
;; Fixed-size arrays: length known at compile time
(let [^(array int 10) arr (make-array int 10)]
  (reduce + 0 arr)) ;; compiler knows len=10

;; Slice struct: has :data and :len fields
(defstruct Slice [^int* data ^size_t len])
(let [^Slice s ...]
  (reduce + 0 s)) ;; compiler extracts (:data s) and (:len s)

;; Vec struct: same pattern
(defstruct Vec [^int* data ^size_t len ^size_t cap])
(reduce + 0 my-vec) ;; works because Vec has :data and :len
The "protocol" is implicit: any struct with ^T* data and ^size_t len fields is seqable. The macro expands differently based on the type:
;; For fixed array
(reduce + 0 arr) → loop over 0..10

;; For Slice/Vec
(reduce + 0 s) → loop over 0..(:len s) using (:data s)

;; For bare pointer → compiler error
(reduce + 0 ^int* ptr) → "Cannot iterate ^int*: unknown length. Use Slice or pass length."
This keeps the syntax clean while being explicit about what's supported. Bare pointers must be wrapped in a Slice or use explicit length.
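A hedged sketch of what the Slice case could expand to in C; the struct layout follows the (defstruct Slice ...) above, but the wrapper function name is illustrative:

```c
#include <stddef.h>

typedef struct { int *data; size_t len; } Slice;

/* Hypothetical expansion of (reduce + 0 s) where s is a Slice:
   the macro reads (:data s) and (:len s) and emits a plain loop. */
int slice_sum(Slice s) {
    int acc = 0;
    for (size_t i = 0; i < s.len; i++) {
        acc = acc + s.data[i];
    }
    return acc;
}
```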
REPL Implementation
Future consideration: Development REPL via dlopen trick:
Compile expression to .so
dlopen with RTLD_GLOBAL
Call eval function
Keep library loaded
Needs more design work. Low priority for v1.
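For the dlopen approach, a rough sketch of the host side of such a REPL, assuming each compiled expression exports a hypothetical cljc_eval entry point and the .so path is supplied by the compiler:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load a freshly compiled expression (e.g. an expr.so in a temp dir) and run it.
   RTLD_GLOBAL keeps its symbols visible to later expressions. */
int run_compiled_expr(const char *so_path) {
    void *handle = dlopen(so_path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    void (*eval_fn)(void) = (void (*)(void))dlsym(handle, "cljc_eval");
    if (eval_fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        return -1;
    }
    eval_fn();
    /* Intentionally never dlclose: keep the library loaded so its
       definitions remain available to subsequent expressions. */
    return 0;
}
```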
C++ Interop
Use extern "C" blocks for C++ compatibility:
(extern-c
(defn my-func ^void [] ...))
;; Or at ns level
(ns mylib
(:extern-c true))
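A sketch of the standard guard the generated header would presumably contain for C++ consumers (the header name and function are illustrative):

```c
/* mylib.h (generated) */
#pragma once

#ifdef __cplusplus
extern "C" {
#endif

void my_func(void);

#ifdef __cplusplus
}
#endif
```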
Features Explicitly Not Included
Garbage collection (use runtime library if needed)
Runtime reflection
Runtime eval
Persistent data structures (use runtime library if needed)
Automatic memory management
Exceptions (use return codes or setjmp/longjmp via c-raw)
Comparison to Similar Projects
| Project      | Language       | Target       | GC             | Types    |
| ------------ | -------------- | ------------ | -------------- | -------- |
| This project | Clojure syntax | C            | No             | Explicit |
| Squint       | ClojureScript  | JavaScript   | JS GC          | Dynamic  |
| Carp         | Lisp           | C            | Borrow checker | Inferred |
| Zig          | Custom         | Machine code | No             | Explicit |
| Nim          | Python-like    | C            | Optional       | Inferred |
Example: Complete Small Program
point.h.cljc:
(ns point
(:pragma :once))
(defstruct Point [^int x ^int y])
(declare point-new ^Point [^int x ^int y])
(declare point-add ^Point [^Point a ^Point b])
(declare point-print ^void [^Point* p])
point.c.cljc:
(ns point
(:require ["<stdio.h>"]
["point.h"]))
(defn point-new ^Point [^int x ^int y]
(Point. x y))
(defn point-add ^Point [^Point a ^Point b]
(Point. (+ (:x a) (:x b))
(+ (:y a) (:y b))))
(defn point-print ^void [^Point* p]
(printf "Point(%d, %d)\n" (:x p) (:y p)))
main.c.cljc:
(ns main
(:require ["point.h"]))
(defn main ^int []
(let [a (point-new 10 20)
b (point-new 5 5)
c (point-add a b)]
(point-print (addr c))
0))
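To close the loop, a hedged sketch of the C that main.c.cljc might transpile to. It assumes hyphens in names become underscores, that Point is returned by value, and that point.h is the header generated from point.h.cljc; the exact output formatting is of course undecided:

```c
#include <stdio.h>
#include "point.h"

int main(void) {
    Point a = point_new(10, 20);
    Point b = point_new(5, 5);
    Point c = point_add(a, b);
    point_print(&c);   /* (point-print (addr c)) */
    return 0;
}
```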