Skip to content

Instantly share code, notes, and snippets.

@loganlinn
Created December 8, 2025 04:07
Show Gist options
  • Select an option

  • Save loganlinn/8d706f8693399cda09d89bb5cc5f2eec to your computer and use it in GitHub Desktop.

Select an option

Save loganlinn/8d706f8693399cda09d89bb5cc5f2eec to your computer and use it in GitHub Desktop.
Analysis of kysely-clickhouse: Limitations, Implementation Completeness, Code Quality & ClickHouse SQL Support

Analysis of kysely-clickhouse Library

Comprehensive analysis of the kysely-clickhouse library addressing open questions about limitations, implementation completeness, code quality, and ClickHouse-specific SQL support.


1. Inherent Limitations of Using kysely-clickhouse vs Direct ClickHouse Client

Answer: There are NO inherent architectural limitations, but there ARE practical limitations due to incomplete implementation.

What Works Well:

  • Query building: Standard CRUD operations (SELECT, INSERT, UPDATE, DELETE) work fine
  • Type safety: Full TypeScript support with database schema typing
  • Schema introspection: Works correctly via information_schema
  • Migrations: Kysely's migration system is supported

Current Practical Limitations:

  1. No query-level SETTINGS support: ClickHouse's SETTINGS clause is critical for performance tuning but not accessible through Kysely's API
  2. No table ENGINE specification: Cannot specify ClickHouse table engines (MergeTree, ReplicatedMergeTree, etc.) through Kysely's schema builder
  3. No ORDER BY in CREATE TABLE: MergeTree tables require ORDER BY clause, not supported in Kysely
  4. No PARTITION BY support: ClickHouse's partitioning is a first-class feature but unavailable
  5. No MATERIALIZED/ALIAS column support: These ClickHouse-specific column types aren't in Kysely's schema builder
  6. No TTL support: Data lifecycle management via TTL is unavailable

Workaround:

All ClickHouse-specific features can be accessed via sql template tags (raw SQL):

await db.schema
  .createTable('events')
  .$call(qb => sql`${qb} ENGINE = MergeTree() ORDER BY timestamp`.execute())

Verdict: Not an inherent limitation of Kysely's design, but ClickHouse-specific features require raw SQL workarounds.


2. Completeness of Implementation & Extension Points

Kysely Extension Points (5 total):

  1. Dialect: Fully implemented
  2. Driver: Fully implemented
  3. QueryCompiler: Implemented (uses MysqlQueryCompiler)
  4. DatabaseIntrospector: Fully implemented
  5. DialectAdapter: Fully implemented

What's Implemented:

src/ClickhouseDialect.ts:34-35

createQueryCompiler(): QueryCompiler {
  return new MysqlQueryCompiler();
}

Uses MySQL's query compiler, which works because ClickHouse's basic SQL syntax (SELECT, INSERT, etc.) is MySQL-compatible.

src/ClickhouseAdapter.ts:3-24

export class ClickhouseAdapter extends DialectAdapterBase {
  get supportsTransactionalDdl(): boolean {
    return false  // ✅ Correct - ClickHouse has no transactions
  }
  
  get supportsReturning(): boolean {
    return true   // ⚠️ Questionable - see below
  }
  
  async acquireMigrationLock(): Promise<void> {
    // Empty implementation - no locking
  }
}

src/ClickhouseConnection.ts:43-46

prepareQuery<O>(compiledQuery: CompiledQuery): string {
  // Converts UPDATE to ALTER TABLE UPDATE
  return compiledSql.replace(
    /^update ((`\w+`\.)*`\w+`) set/i,
    "alter table $1 update"
  )
}

Smart workaround to translate standard UPDATE to ClickHouse's ALTER TABLE UPDATE syntax.

src/ClickhouseConnection.ts:49-103 Handles INSERT with special cases:

  • ValuesNode → Uses ClickHouse native insert() API with JSONCompactEachRowWithNames format
  • SelectQueryNode → Uses parameterized queries for INSERT...SELECT

Missing/Incomplete Extension Points:

Should Implement a ClickHouseQueryCompiler:

The current MysqlQueryCompiler works for basic queries but lacks:

  • ENGINE clause support in CREATE TABLE
  • ORDER BY clause support in CREATE TABLE (required for MergeTree)
  • PARTITION BY clause support
  • SETTINGS clause in queries
  • TTL clause support
  • MATERIALIZED, ALIAS, EPHEMERAL column types

3. Code Quality Issues & Recommended Improvements

🔴 Critical Issues:

3.1. Incorrect supportsReturning Setting

src/ClickhouseAdapter.ts:8-10

get supportsReturning(): boolean {
  return true
}

Problem: ClickHouse does NOT support RETURNING clause. Setting this to true is misleading.

Evidence from ClickHouse docs: No mention of RETURNING in INSERT/UPDATE/DELETE operations. ClickHouse uses different patterns:

  • INSERT returns summary statistics
  • UPDATE (ALTER TABLE UPDATE) is asynchronous with no return values
  • DELETE (ALTER TABLE DELETE) is asynchronous with no return values

Severity: HIGH - Could cause runtime errors if users try to use .returning() API

Fix: Change to return false


3.2. Empty Migration Lock Implementation

src/ClickhouseAdapter.ts:12-24

async acquireMigrationLock(): Promise<void> {
  // Empty - no implementation
}

async releaseMigrationLock(): Promise<void> {
  // Empty - no implementation  
}

Problem: Multiple migration processes could run concurrently without any locking mechanism. Since ClickHouse doesn't support transactions (supportsTransactionalDdl = false), this is dangerous.

Severity: HIGH - Could corrupt migration state

Solutions:

  1. Use system.zookeeper for distributed locking (if ZooKeeper is available)
  2. Use a lock table pattern with ReplacingMergeTree engine
  3. Document that migrations should only run from single process

3.3. Unsafe String Escaping in prepareQuery

src/ClickhouseConnection.ts:28-40

prepareQuery<O>(compiledQuery: CompiledQuery): string {
  let i = 0
  const compiledSql = compiledQuery.sql.replace(/\?/g, () => {
    const param = compiledQuery.parameters[i++]
    if (typeof param === 'number') {
      return `${param}`
    }
    return `'${param.replace(/'/gm, `\\'`).replace(/\\"/g, '\\\\"')}'`
  })
}

Problems:

  • Using regex replace for SQL escaping is fragile
  • Escape sequence \\' is incorrect - should be '' (double single quote) per SQL standard
  • The .replace(/\\"/g, '\\\\"') line is confusing and potentially wrong
  • No handling for NULL values, dates, arrays, etc.

Severity: HIGH - SQL injection risk for complex data types

Recommendation: Let ClickHouse client library handle parameterization rather than manual string interpolation


3.4. Incomplete Type Inference in INSERT...SELECT Parameterization

src/ClickhouseConnection.ts:78-92

const query = compiledQuery.sql.replace(/\?/g, () => {
  const val = compiledQuery.parameters[counter];
  if (typeof val === 'string') {
    return `{p${counter++}: String}`;
  }
  if (typeof val === 'number') {
    return `{p${counter++}: UInt32}`;  // ⚠️ What about floats, negative numbers?
  }
  if (typeof val === 'object' && val instanceof Date) {
    return `{p${counter++}: DateTime}`;
  }
  return `{p${counter++}: String}`;  // Fallback
});

Problems:

  • UInt32 assumption is wrong for floats, negative numbers, or large integers
  • No support for arrays, tuples, or other ClickHouse types
  • Fallback to String is too broad

Severity: MEDIUM - Type coercion errors at runtime


3.5. Missing Error Handling

src/ClickhouseConnection.ts:148-158

async beginTransaction() {
  throw new Error('Transactions are not supported.');
}

Problem: The error message is correct, but Kysely might still try to begin transactions during migrations if not properly configured.

Severity: LOW - Error is explicit, but could be prevented


🟡 Design Issues:

3.6. Session ID Per Connection

src/ClickhouseConnection.ts:22

session_id: randomUUID(),

Issue: Creating a new connection for every query with random session IDs prevents:

  • Using temporary tables across queries
  • Session-level settings persistence
  • Proper connection pooling benefits

Severity: MEDIUM - Impacts temporary table functionality (as seen in tests)

Note: The test uses kysely.connection().execute() wrapper, which presumably maintains the same connection, so temporary tables work within that scope.


3.7. Inconsistent Query Execution Paths

The executeQuery method has 5 different code paths:

  1. INSERT with ValuesNode → ClickHouse client insert() API
  2. INSERT with SelectQueryNode → parameterized command()
  3. SELECT → query() with manual string interpolation
  4. UPDATE → query() with manual string interpolation
  5. Everything else → command()

Severity: MEDIUM - Hard to maintain, test, and debug


✅ Good Design Decisions:

  1. UPDATE translation: Converting UPDATE to ALTER TABLE UPDATE is correct
  2. Using ClickHouse native insert(): For VALUES inserts, using the native API with JSONCompactEachRowWithNames format is efficient
  3. Session-aware connections: Using session_id enables temporary tables within a connection scope
  4. date_time_input_format: 'best_effort': Good default for flexible date parsing

4. ClickHouse-Specific SQL Support

Current Support:

Table Engines: ❌ NOT SUPPORTED

  • Workaround: Use raw SQL with sql template tag
await db.schema.createTable('events')
  .$call(qb => sql`${qb} ENGINE = MergeTree() ORDER BY timestamp`.execute())

Query-Level SETTINGS: ❌ NOT SUPPORTED

  • Workaround: Use raw SQL
await sql`SELECT * FROM events SETTINGS final = 1`.execute(db)

ALTER TABLE UPDATE: ✅ SUPPORTED (automatic translation in prepareQuery)

await db.updateTable('users')
  .set({ status: 'active' })
  .where('id', '>', 100)
  .execute()
// Translates to: ALTER TABLE users UPDATE status = 'active' WHERE id > 100

INSERT Variations: ✅ PARTIALLY SUPPORTED

  • VALUES format: ✅ Supported via native API
  • INSERT...SELECT: ✅ Supported via parameterized queries
  • SETTINGS in INSERT: ❌ Not supported
  • FORMAT specification: ❌ Not supported

Temporary Tables: ✅ SUPPORTED

await db.schema
  .createTable('temp_table')
  .temporary()
  .addColumn('id', 'bigint')
  .execute()

Works via kysely.connection().execute(async (db) => { ... }) pattern


Key ClickHouse SQL Differences from MySQL

CREATE TABLE Syntax

ClickHouse requires ENGINE and ORDER BY for MergeTree tables:

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [NULL|NOT NULL] [DEFAULT|MATERIALIZED|EPHEMERAL|ALIAS expr1],
    name2 [type2] [NULL|NOT NULL] [DEFAULT|MATERIALIZED|EPHEMERAL|ALIAS expr2],
    ...
    PRIMARY KEY(expr1[, expr2,...])
) ENGINE = MergeTree()
ORDER BY expr
[PARTITION BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[TTL expr [...] ]
[SETTINGS name = value, ...]

Key differences from MySQL:

  • ENGINE clause is REQUIRED - specifies table engine (MergeTree, Memory, Log, etc.)
  • ORDER BY clause is REQUIRED for MergeTree - defines sorting key
  • PARTITION BY clause - ClickHouse-specific partitioning
  • PRIMARY KEY - different semantics than MySQL (sparse index, not unique)
  • TTL (Time To Live) - automatic data expiration
  • SETTINGS - table-level engine settings
  • MATERIALIZED columns - auto-calculated columns, excluded from SELECT *
  • EPHEMERAL columns - temporary columns for DEFAULT expressions, not stored
  • ALIAS columns - virtual computed columns

UPDATE Syntax

ClickHouse uses ALTER TABLE UPDATE - fundamentally different:

ALTER TABLE [db.]table [ON CLUSTER cluster] UPDATE column1 = expr1 [, ...] 
[IN PARTITION partition_id] WHERE filter_expr

Key characteristics:

  • Not a standard UPDATE - implemented as a "mutation"
  • Asynchronous - returns immediately, executes in background
  • Rewrites entire data parts - not row-by-row updates
  • Heavy operation - designed for infrequent use (opposite of OLTP)
  • No atomicity - SELECTs during mutation see mixed old/new data
  • Cannot update primary or partition key columns

Query-Level SETTINGS

ClickHouse supports SETTINGS in SELECT, INSERT, and other queries:

SELECT * FROM table SETTINGS setting1 = value1, setting2 = value2;

INSERT INTO table VALUES (...) SETTINGS async_insert = 1;

Common query-level settings:

  • select_sequential_consistency = 1 - for replicated setups
  • final = 1 - forces materialization of all data parts
  • async_insert = 1 - asynchronous inserts
  • optimize_read_in_order = 1 - query optimization

Summary & Recommendations

Should You Use This Library?

For Query Building: ✅ YES

  • Standard SELECT/INSERT/DELETE/UPDATE operations work well
  • Type safety is excellent
  • Schema introspection works

For Migrations: ⚠️ WITH CAUTION

  • Works, but requires raw SQL for ClickHouse-specific table features
  • No migration locking (run migrations from single process only)
  • Document this limitation clearly

For Production Use: ⚠️ YES, BUT... You'll need to:

  1. Fix the supportsReturning bug
  2. Implement proper migration locking OR document single-process restriction
  3. Improve string escaping in prepareQuery or switch to client-side parameterization
  4. Use raw SQL for ClickHouse-specific features (ENGINE, SETTINGS, PARTITION BY, etc.)

Priority Improvements:

  1. [Critical] Fix supportsReturning = false in src/ClickhouseAdapter.ts:8-10
  2. [Critical] Implement migration locking or document restrictions in src/ClickhouseAdapter.ts:12-24
  3. [High] Fix SQL escaping vulnerabilities in src/ClickhouseConnection.ts:28-40
  4. [High] Improve type inference in INSERT...SELECT parameterization in src/ClickhouseConnection.ts:78-92
  5. [Medium] Consider implementing ClickHouseQueryCompiler for ENGINE/SETTINGS support
  6. [Low] Add connection pooling for better performance

Should You Implement a ClickHouse Query Compiler?

Short answer: EVENTUALLY, YES

The current MySQL query compiler works for ~80% of use cases. However:

Pros of custom compiler:

  • Native support for ENGINE clause
  • Native support for ORDER BY in CREATE TABLE
  • Native support for PARTITION BY
  • Native support for SETTINGS clause
  • Support for MATERIALIZED, ALIAS, EPHEMERAL columns
  • Support for TTL clauses
  • Better error messages

Cons:

  • Significant development effort (~500-1000 lines of code)
  • Maintenance burden as ClickHouse evolves
  • Current workarounds with sql tags work fine

Recommendation: Start by fixing critical bugs, then consider implementing a custom compiler if community adoption grows and users frequently need ClickHouse-specific features.


Quick Reference: File Locations

  • Dialect implementation: src/ClickhouseDialect.ts
  • Adapter (capabilities): src/ClickhouseAdapter.ts
  • Connection logic: src/ClickhouseConnection.ts
  • Driver: src/ClickhouseDriver.ts
  • Introspector: src/ClickhouseIntrospector.ts
  • Tests: tests/clickhouse.spec.ts

All critical issues are in:

  • src/ClickhouseAdapter.ts:8-10 (supportsReturning)
  • src/ClickhouseAdapter.ts:12-24 (migration locking)
  • src/ClickhouseConnection.ts:28-40 (string escaping)
  • src/ClickhouseConnection.ts:78-92 (type inference)

Conclusion

The kysely-clickhouse library is a solid foundation with good design decisions around UPDATE translation and INSERT handling. However, it has several critical bugs that should be fixed before production use. The lack of ClickHouse-specific SQL feature support is not a blocker since raw SQL workarounds exist, but implementing a custom ClickHouseQueryCompiler would significantly improve the developer experience for users working with ClickHouse's unique features like table engines, partitioning, and query-level settings.

For schema migrations specifically, the library works but requires awareness of its limitations around migration locking and the need for raw SQL when creating ClickHouse-optimized table structures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment