Analysis of kysely-clickhouse Library

Comprehensive analysis of the kysely-clickhouse library addressing open questions about limitations, implementation completeness, code quality, and ClickHouse-specific SQL support.

1. Inherent Limitations of Using kysely-clickhouse vs Direct ClickHouse Client

Answer: There are NO inherent architectural limitations, but there ARE practical limitations due to incomplete implementation.

What Works Well:

Query building: Standard CRUD operations (SELECT, INSERT, UPDATE, DELETE) work fine
Type safety: Full TypeScript support with database schema typing
Schema introspection: Works correctly via information_schema
Migrations: Kysely's migration system is supported

Current Practical Limitations:

No query-level SETTINGS support: ClickHouse's SETTINGS clause is critical for performance tuning but not accessible through Kysely's API
No table ENGINE specification: Cannot specify ClickHouse table engines (MergeTree, ReplicatedMergeTree, etc.) through Kysely's schema builder
No ORDER BY in CREATE TABLE: MergeTree tables require ORDER BY clause, not supported in Kysely
No PARTITION BY support: ClickHouse's partitioning is a first-class feature but unavailable
No MATERIALIZED/ALIAS column support: These ClickHouse-specific column types aren't in Kysely's schema builder
No TTL support: Data lifecycle management via TTL is unavailable

Workaround:

All ClickHouse-specific features can be accessed via sql template tags (raw SQL):

await db.schema
  .createTable('events')
  .$call(qb => sql`${qb} ENGINE = MergeTree() ORDER BY timestamp`.execute())

Verdict: Not an inherent limitation of Kysely's design, but ClickHouse-specific features require raw SQL workarounds.

2. Completeness of Implementation & Extension Points

Kysely Extension Points (5 total):

✅ Dialect: Fully implemented
✅ Driver: Fully implemented
✅ QueryCompiler: Implemented (uses MysqlQueryCompiler)
✅ DatabaseIntrospector: Fully implemented
✅ DialectAdapter: Fully implemented

What's Implemented:

src/ClickhouseDialect.ts:34-35

createQueryCompiler(): QueryCompiler {
  return new MysqlQueryCompiler();
}

Uses MySQL's query compiler, which works because ClickHouse's basic SQL syntax (SELECT, INSERT, etc.) is MySQL-compatible.

src/ClickhouseAdapter.ts:3-24

export class ClickhouseAdapter extends DialectAdapterBase {
  get supportsTransactionalDdl(): boolean {
    return false  // ✅ Correct - ClickHouse has no transactions
  }
  
  get supportsReturning(): boolean {
    return true   // ⚠️ Questionable - see below
  }
  
  async acquireMigrationLock(): Promise<void> {
    // Empty implementation - no locking
  }
}

src/ClickhouseConnection.ts:43-46

prepareQuery<O>(compiledQuery: CompiledQuery): string {
  // Converts UPDATE to ALTER TABLE UPDATE
  return compiledSql.replace(
    /^update ((`\w+`\.)*`\w+`) set/i,
    "alter table $1 update"
  )
}

Smart workaround to translate standard UPDATE to ClickHouse's ALTER TABLE UPDATE syntax.

src/ClickhouseConnection.ts:49-103 Handles INSERT with special cases:

ValuesNode → Uses ClickHouse native insert() API with JSONCompactEachRowWithNames format
SelectQueryNode → Uses parameterized queries for INSERT...SELECT

Missing/Incomplete Extension Points:

Should Implement a ClickHouseQueryCompiler:

The current MysqlQueryCompiler works for basic queries but lacks:

ENGINE clause support in CREATE TABLE
ORDER BY clause support in CREATE TABLE (required for MergeTree)
PARTITION BY clause support
SETTINGS clause in queries
TTL clause support
MATERIALIZED, ALIAS, EPHEMERAL column types

3. Code Quality Issues & Recommended Improvements

🔴 Critical Issues:

3.1. Incorrect `supportsReturning` Setting

src/ClickhouseAdapter.ts:8-10

get supportsReturning(): boolean {
  return true
}

Problem: ClickHouse does NOT support RETURNING clause. Setting this to true is misleading.

Evidence from ClickHouse docs: No mention of RETURNING in INSERT/UPDATE/DELETE operations. ClickHouse uses different patterns:

INSERT returns summary statistics
UPDATE (ALTER TABLE UPDATE) is asynchronous with no return values
DELETE (ALTER TABLE DELETE) is asynchronous with no return values

Severity: HIGH - Could cause runtime errors if users try to use .returning() API

Fix: Change to return false

3.2. Empty Migration Lock Implementation

src/ClickhouseAdapter.ts:12-24

async acquireMigrationLock(): Promise<void> {
  // Empty - no implementation
}

async releaseMigrationLock(): Promise<void> {
  // Empty - no implementation  
}

Problem: Multiple migration processes could run concurrently without any locking mechanism. Since ClickHouse doesn't support transactions (supportsTransactionalDdl = false), this is dangerous.

Severity: HIGH - Could corrupt migration state

Solutions:

Use system.zookeeper for distributed locking (if ZooKeeper is available)
Use a lock table pattern with ReplacingMergeTree engine
Document that migrations should only run from single process

3.3. Unsafe String Escaping in prepareQuery

src/ClickhouseConnection.ts:28-40

prepareQuery<O>(compiledQuery: CompiledQuery): string {
  let i = 0
  const compiledSql = compiledQuery.sql.replace(/\?/g, () => {
    const param = compiledQuery.parameters[i++]
    if (typeof param === 'number') {
      return `${param}`
    }
    return `'${param.replace(/'/gm, `\\'`).replace(/\\"/g, '\\\\"')}'`
  })
}

Problems:

Using regex replace for SQL escaping is fragile
Escape sequence \\' is incorrect - should be '' (double single quote) per SQL standard
The .replace(/\\"/g, '\\\\"') line is confusing and potentially wrong
No handling for NULL values, dates, arrays, etc.

Severity: HIGH - SQL injection risk for complex data types

Recommendation: Let ClickHouse client library handle parameterization rather than manual string interpolation

3.4. Incomplete Type Inference in INSERT...SELECT Parameterization

src/ClickhouseConnection.ts:78-92

const query = compiledQuery.sql.replace(/\?/g, () => {
  const val = compiledQuery.parameters[counter];
  if (typeof val === 'string') {
    return `{p${counter++}: String}`;
  }
  if (typeof val === 'number') {
    return `{p${counter++}: UInt32}`;  // ⚠️ What about floats, negative numbers?
  }
  if (typeof val === 'object' && val instanceof Date) {
    return `{p${counter++}: DateTime}`;
  }
  return `{p${counter++}: String}`;  // Fallback
});

Problems:

UInt32 assumption is wrong for floats, negative numbers, or large integers
No support for arrays, tuples, or other ClickHouse types
Fallback to String is too broad

Severity: MEDIUM - Type coercion errors at runtime

3.5. Missing Error Handling

src/ClickhouseConnection.ts:148-158

async beginTransaction() {
  throw new Error('Transactions are not supported.');
}

Problem: The error message is correct, but Kysely might still try to begin transactions during migrations if not properly configured.

Severity: LOW - Error is explicit, but could be prevented

🟡 Design Issues:

3.6. Session ID Per Connection

src/ClickhouseConnection.ts:22

session_id: randomUUID(),

Issue: Creating a new connection for every query with random session IDs prevents:

Using temporary tables across queries
Session-level settings persistence
Proper connection pooling benefits

Severity: MEDIUM - Impacts temporary table functionality (as seen in tests)

Note: The test uses kysely.connection().execute() wrapper, which presumably maintains the same connection, so temporary tables work within that scope.

3.7. Inconsistent Query Execution Paths

The executeQuery method has 5 different code paths:

INSERT with ValuesNode → ClickHouse client insert() API
INSERT with SelectQueryNode → parameterized command()
SELECT → query() with manual string interpolation
UPDATE → query() with manual string interpolation
Everything else → command()

Severity: MEDIUM - Hard to maintain, test, and debug

✅ Good Design Decisions:

UPDATE translation: Converting UPDATE to ALTER TABLE UPDATE is correct
Using ClickHouse native insert(): For VALUES inserts, using the native API with JSONCompactEachRowWithNames format is efficient
Session-aware connections: Using session_id enables temporary tables within a connection scope
date_time_input_format: 'best_effort': Good default for flexible date parsing

4. ClickHouse-Specific SQL Support

Current Support:

Table Engines: ❌ NOT SUPPORTED

Workaround: Use raw SQL with sql template tag

await db.schema.createTable('events')
  .$call(qb => sql`${qb} ENGINE = MergeTree() ORDER BY timestamp`.execute())

Query-Level SETTINGS: ❌ NOT SUPPORTED

Workaround: Use raw SQL

await sql`SELECT * FROM events SETTINGS final = 1`.execute(db)

ALTER TABLE UPDATE: ✅ SUPPORTED (automatic translation in prepareQuery)

await db.updateTable('users')
  .set({ status: 'active' })
  .where('id', '>', 100)
  .execute()
// Translates to: ALTER TABLE users UPDATE status = 'active' WHERE id > 100

INSERT Variations: ✅ PARTIALLY SUPPORTED

VALUES format: ✅ Supported via native API
INSERT...SELECT: ✅ Supported via parameterized queries
SETTINGS in INSERT: ❌ Not supported
FORMAT specification: ❌ Not supported

Temporary Tables: ✅ SUPPORTED

await db.schema
  .createTable('temp_table')
  .temporary()
  .addColumn('id', 'bigint')
  .execute()

Works via kysely.connection().execute(async (db) => { ... }) pattern

Key ClickHouse SQL Differences from MySQL

CREATE TABLE Syntax

ClickHouse requires ENGINE and ORDER BY for MergeTree tables:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [NULL|NOT NULL] [DEFAULT|MATERIALIZED|EPHEMERAL|ALIAS expr1],
    name2 [type2] [NULL|NOT NULL] [DEFAULT|MATERIALIZED|EPHEMERAL|ALIAS expr2],
    ...
    PRIMARY KEY(expr1[, expr2,...])
) ENGINE = MergeTree()
ORDER BY expr
[PARTITION BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[TTL expr [...] ]
[SETTINGS name = value, ...]

Key differences from MySQL:

ENGINE clause is REQUIRED - specifies table engine (MergeTree, Memory, Log, etc.)
ORDER BY clause is REQUIRED for MergeTree - defines sorting key
PARTITION BY clause - ClickHouse-specific partitioning
PRIMARY KEY - different semantics than MySQL (sparse index, not unique)
TTL (Time To Live) - automatic data expiration
SETTINGS - table-level engine settings
MATERIALIZED columns - auto-calculated columns, excluded from SELECT *
EPHEMERAL columns - temporary columns for DEFAULT expressions, not stored
ALIAS columns - virtual computed columns

UPDATE Syntax

ClickHouse uses ALTER TABLE UPDATE - fundamentally different:

ALTER TABLE [db.]table [ON CLUSTER cluster] UPDATE column1 = expr1 [, ...] 
[IN PARTITION partition_id] WHERE filter_expr

Key characteristics:

Not a standard UPDATE - implemented as a "mutation"
Asynchronous - returns immediately, executes in background
Rewrites entire data parts - not row-by-row updates
Heavy operation - designed for infrequent use (opposite of OLTP)
No atomicity - SELECTs during mutation see mixed old/new data
Cannot update primary or partition key columns

Query-Level SETTINGS

ClickHouse supports SETTINGS in SELECT, INSERT, and other queries:

SELECT * FROM table SETTINGS setting1 = value1, setting2 = value2;

INSERT INTO table VALUES (...) SETTINGS async_insert = 1;

Common query-level settings:

select_sequential_consistency = 1 - for replicated setups
final = 1 - forces materialization of all data parts
async_insert = 1 - asynchronous inserts
optimize_read_in_order = 1 - query optimization

Summary & Recommendations

Should You Use This Library?

For Query Building: ✅ YES

Standard SELECT/INSERT/DELETE/UPDATE operations work well
Type safety is excellent
Schema introspection works

For Migrations: ⚠️ WITH CAUTION

Works, but requires raw SQL for ClickHouse-specific table features
No migration locking (run migrations from single process only)
Document this limitation clearly

For Production Use: ⚠️ YES, BUT... You'll need to:

Fix the supportsReturning bug
Implement proper migration locking OR document single-process restriction
Improve string escaping in prepareQuery or switch to client-side parameterization
Use raw SQL for ClickHouse-specific features (ENGINE, SETTINGS, PARTITION BY, etc.)

Priority Improvements:

[Critical] Fix supportsReturning = false in src/ClickhouseAdapter.ts:8-10
[Critical] Implement migration locking or document restrictions in src/ClickhouseAdapter.ts:12-24
[High] Fix SQL escaping vulnerabilities in src/ClickhouseConnection.ts:28-40
[High] Improve type inference in INSERT...SELECT parameterization in src/ClickhouseConnection.ts:78-92
[Medium] Consider implementing ClickHouseQueryCompiler for ENGINE/SETTINGS support
[Low] Add connection pooling for better performance

Should You Implement a ClickHouse Query Compiler?

Short answer: EVENTUALLY, YES

The current MySQL query compiler works for ~80% of use cases. However:

Pros of custom compiler:

Native support for ENGINE clause
Native support for ORDER BY in CREATE TABLE
Native support for PARTITION BY
Native support for SETTINGS clause
Support for MATERIALIZED, ALIAS, EPHEMERAL columns
Support for TTL clauses
Better error messages

Cons:

Significant development effort (~500-1000 lines of code)
Maintenance burden as ClickHouse evolves
Current workarounds with sql tags work fine

Recommendation: Start by fixing critical bugs, then consider implementing a custom compiler if community adoption grows and users frequently need ClickHouse-specific features.

Quick Reference: File Locations

Dialect implementation: src/ClickhouseDialect.ts
Adapter (capabilities): src/ClickhouseAdapter.ts
Connection logic: src/ClickhouseConnection.ts
Driver: src/ClickhouseDriver.ts
Introspector: src/ClickhouseIntrospector.ts
Tests: tests/clickhouse.spec.ts

All critical issues are in:

src/ClickhouseAdapter.ts:8-10 (supportsReturning)
src/ClickhouseAdapter.ts:12-24 (migration locking)
src/ClickhouseConnection.ts:28-40 (string escaping)
src/ClickhouseConnection.ts:78-92 (type inference)

Conclusion

The kysely-clickhouse library is a solid foundation with good design decisions around UPDATE translation and INSERT handling. However, it has several critical bugs that should be fixed before production use. The lack of ClickHouse-specific SQL feature support is not a blocker since raw SQL workarounds exist, but implementing a custom ClickHouseQueryCompiler would significantly improve the developer experience for users working with ClickHouse's unique features like table engines, partitioning, and query-level settings.

For schema migrations specifically, the library works but requires awareness of its limitations around migration locking and the need for raw SQL when creating ClickHouse-optimized table structures.

loganlinn/kysely-clickhouse-analysis.md

Select an option

No results found

Select an option

No results found

Analysis of kysely-clickhouse Library

1. Inherent Limitations of Using kysely-clickhouse vs Direct ClickHouse Client

What Works Well:

Current Practical Limitations:

Workaround:

2. Completeness of Implementation & Extension Points

Kysely Extension Points (5 total):

What's Implemented:

Missing/Incomplete Extension Points:

3. Code Quality Issues & Recommended Improvements

🔴 Critical Issues:

3.1. Incorrect `supportsReturning` Setting

3.2. Empty Migration Lock Implementation

3.3. Unsafe String Escaping in prepareQuery

3.4. Incomplete Type Inference in INSERT...SELECT Parameterization

3.5. Missing Error Handling

🟡 Design Issues:

3.6. Session ID Per Connection

3.7. Inconsistent Query Execution Paths

✅ Good Design Decisions:

4. ClickHouse-Specific SQL Support

Current Support:

Key ClickHouse SQL Differences from MySQL

CREATE TABLE Syntax

UPDATE Syntax

Query-Level SETTINGS

Summary & Recommendations

Should You Use This Library?

Priority Improvements:

Should You Implement a ClickHouse Query Compiler?

Quick Reference: File Locations

Conclusion

loganlinn/kysely-clickhouse-analysis.md

Analysis of kysely-clickhouse Library

1. Inherent Limitations of Using kysely-clickhouse vs Direct ClickHouse Client

What Works Well:

Current Practical Limitations:

Workaround:

2. Completeness of Implementation & Extension Points

Kysely Extension Points (5 total):

What's Implemented:

Missing/Incomplete Extension Points:

3. Code Quality Issues & Recommended Improvements

🔴 Critical Issues:

3.1. Incorrect supportsReturning Setting

3.2. Empty Migration Lock Implementation

3.3. Unsafe String Escaping in prepareQuery

3.4. Incomplete Type Inference in INSERT...SELECT Parameterization

3.5. Missing Error Handling

🟡 Design Issues:

3.6. Session ID Per Connection

3.7. Inconsistent Query Execution Paths

✅ Good Design Decisions:

4. ClickHouse-Specific SQL Support

Current Support:

Key ClickHouse SQL Differences from MySQL

CREATE TABLE Syntax

UPDATE Syntax

Query-Level SETTINGS

Summary & Recommendations

Should You Use This Library?

Priority Improvements:

Should You Implement a ClickHouse Query Compiler?

Quick Reference: File Locations

Conclusion

3.1. Incorrect `supportsReturning` Setting