Reliability & Observability Issues Report
Generated: 2025-02-06
Repository: 0xMiden/miden-node
Total Open Issues: 106
Issues Addressed by PRs: 13
Unaddressed Issues: 93
This report analyzes all open issues in the miden-node repository, filters out those already addressed by open pull requests, and categorizes the rest by priority (HIGH, MEDIUM, LOW). The focus is on reliability and observability concerns rather than new features.
Critical Performance & Reliability
| # | Title | Labels | Description |
|---|---|---|---|
| 1539 | perf: move to a single, locked writer database connection | store | Move to a single, locked writer database connection to avoid contention and improve reliability |
| 1538 | perf: avoid loading and storing full Accounts from and to DB | store | Critical performance issue - loading full accounts for partial deltas is wasteful (~250ms per block) |
| 1546 | fix: get rid of the free-floating Mutex | store | Remove a free-floating mutex that may be causing lock contention |
| 1537 | tracking: performace of apply_block | store | Tracking issue for apply_block performance - critical path for block processing |
| 1565 | Sanity check or limit vault key count coming via VaultAssetWitnessesRequest | store | Security/reliability - prevent unbounded requests that could cause DoS |
| 1052 | Separate worker/connection pools for the Store | store | Prevent resource starvation between different store services |
| 1302 | Better handling of IO errors | | Improve error handling for I/O failures - critical for reliability |
| 1505 | Add resource consumption tracking and metrics for node components | telemetry | Essential for observability - track disk, memory, CPU usage |
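As an illustration of the request-size check asked for in #1565, the following is a minimal sketch only. The limit value, the error type, and the function name are hypothetical and do not come from the miden-node codebase; the real limit would be chosen by the maintainers.

```rust
/// Hypothetical upper bound on vault keys per request (assumption, not a project value).
pub const MAX_VAULT_KEYS: usize = 1_000;

/// Hypothetical error returned when a request is rejected up front.
#[derive(Debug, PartialEq, Eq)]
pub enum RequestError {
    TooManyVaultKeys { got: usize, max: usize },
}

/// Rejects an oversized request before any store work is done,
/// so a single caller cannot force unbounded computation.
pub fn check_vault_key_count(key_count: usize) -> Result<(), RequestError> {
    if key_count > MAX_VAULT_KEYS {
        return Err(RequestError::TooManyVaultKeys {
            got: key_count,
            max: MAX_VAULT_KEYS,
        });
    }
    Ok(())
}
```

A handler would call the check on the incoming key list length and map the error to a client-facing status before touching the database.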
Database & Data Management
| # | Title | Labels | Description |
|---|---|---|---|
| 1467 | db/limits: Limit SQL query results | store | Prevent unbounded query results that could cause memory issues |
| 927 | Consider storing raw blocks in DB | store | Alternative storage strategy for blocks - affects data persistence |
| 633 | [devops] database backups | CI | Critical for production - database backup strategy |
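One common way to bound query results, as #1467 asks, is to cap the page size on the server and hand back a cursor for the next page. The sketch below models this over an in-memory slice rather than a real SQL query; `MAX_ROWS_PER_PAGE`, `Page`, and `fetch_page` are hypothetical names used for illustration only.

```rust
/// Hypothetical server-side cap on rows returned by one query (assumption).
pub const MAX_ROWS_PER_PAGE: usize = 3;

/// One page of results plus the cursor for the next call.
pub struct Page<T> {
    pub rows: Vec<T>,
    /// Offset to pass to the next call, or `None` when the data is exhausted.
    pub next_offset: Option<usize>,
}

/// Returns at most `MAX_ROWS_PER_PAGE` rows starting at `offset`,
/// regardless of how many rows the caller requested.
pub fn fetch_page<T: Clone>(table: &[T], offset: usize, requested: usize) -> Page<T> {
    // Clamp to at least 1 so a zero-sized request cannot stall pagination.
    let limit = requested.clamp(1, MAX_ROWS_PER_PAGE);
    let start = offset.min(table.len());
    let end = start.saturating_add(limit).min(table.len());
    let next_offset = if end < table.len() { Some(end) } else { None };
    Page {
        rows: table[start..end].to_vec(),
        next_offset,
    }
}
```

In a real store the same idea would translate to a `LIMIT` clause with a bounded value, with the caller looping until no further cursor is returned.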
Shutdown & Initialization
| # | Title | Labels | Description |
|---|---|---|---|
| 1534 | feat: enforce shutdown sequencing of components | | Ensure clean shutdown to prevent data loss |
| 91 | graceful shutdown | node, store, rpc, block-producer | Implement proper graceful shutdown for all components |

| # | Title | Labels | Description |
|---|---|---|---|
| 1150 | Investigate store component initialization order | store, rpc, block-producer, node | Ensure proper component startup order |
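The shutdown and startup ordering concerns in #1534, #91, and #1150 can be sketched as a supervisor that records the order in which components start and stops them in reverse. This is an illustrative sketch only: the `Component` trait, the `Node` supervisor, and the start order used in the usage note are assumptions, not the node's actual design.

```rust
/// Hypothetical interface for a node component that can be stopped.
pub trait Component {
    fn name(&self) -> &'static str;
    fn shutdown(&mut self);
}

/// Hypothetical supervisor that remembers the start order.
pub struct Node {
    started: Vec<Box<dyn Component>>,
}

impl Node {
    pub fn new() -> Self {
        Self { started: Vec::new() }
    }

    /// Registers a component as started; later components may depend on earlier ones.
    pub fn start(&mut self, component: Box<dyn Component>) {
        self.started.push(component);
    }

    /// Stops components in reverse start order, so dependents stop
    /// before the components they depend on. Returns the stop order.
    pub fn shutdown(&mut self) -> Vec<&'static str> {
        let mut order = Vec::new();
        while let Some(mut component) = self.started.pop() {
            component.shutdown();
            order.push(component.name());
        }
        order
    }
}

/// Trivial component used only for demonstration.
pub struct Named(pub &'static str);

impl Component for Named {
    fn name(&self) -> &'static str {
        self.0
    }
    fn shutdown(&mut self) {}
}
```

If, for example, the store were started first, then the block-producer, then the RPC server, `shutdown` would stop them as rpc, block-producer, store, so that nothing is still writing when the store closes.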
Performance Optimizations
| # | Title | Labels | Description |
|---|---|---|---|
| 1548 | perf: avoid iterating over nullifiers twice | store | Optimization - reduce duplicate iterations |
| 1544 | perf/investigate: move the forest update out of apply_block | store | Performance investigation - may be deferred |
| 1543 | perf: ensure we do paged loading on forest load in State::load | store | Ensure efficient loading of forest data |
| 1542 | perf: investigate using barriers over oneshot channels | store | Performance investigation for concurrency primitives |
| 1541 | perf: consider moving compute_block_note_tree to separate task | store | Parallel opportunity for note tree computation |
| 1243 | Benchmark and improve performance of SyncTransaction query | store, optimization | Query performance optimization |
| 777 | Optimize block insertion and block input retrieval | store, block-producer | Critical path optimization |
| 1200 | Look into reducing vector allocations, especially for fixed-size data | store | Memory optimization |
| 1201 | Look into using arrays instead of vectors for fixed-size data | | Memory optimization |
| 1199 | Consider using HashMap and HashSet instead of BTreeMap and BTreeSet | | Performance optimization for lookups |
Observability & Telemetry
| # | Title | Labels | Description |
|---|---|---|---|
| 1618 | Use #[track_caller] on log/trace helpers | telemetry | Improved error tracking in logs |
| 1331 | Reduce INFO level noise | | Reduce log noise for better signal |
| 935 | Investigate about grouping metrics by status code | delegated proving | Better metrics organization |
| 1550 | Use a dedicated retry crate | | Improve retry behavior with proper library |
| 1498 | Revisit representations of InnerForest when migrating to LargeSmtForest | store | Data structure for memory efficiency |
| 1416 | Allow user to query historical nullifier tree | store, rpc | Historical query capability |
Error Handling & Validation
| # | Title | Labels | Description |
|---|---|---|---|
| 1091 | Add error variant for connection errors for Remote Prover | delegated proving | Better error handling for remote proving |
| 255 | Improve error message for invalid data format | | Better user experience with clearer errors |
| 537 | [block-producer]: improve InflightState error handling | block-producer | Better error handling |

| # | Title | Labels | Description |
|---|---|---|---|
| 1615 | Load all standard note scripts into DataStore at startup | | Ensure required scripts are available |
| 1188 | Make note script insertion procedure more robust | store | Handle partial/hidden note scripts better |
| 1253 | Consider erasing data in proposed mempool nodes | block-producer, mempool | Privacy/cleanup consideration |

| # | Title | Labels | Description |
|---|---|---|---|
| 1254 | Add tests for paginated queries in the store | store, tests | Test coverage for pagination |
| 1233 | Consistency tests, test utils and expose test helpers | | Better testing infrastructure |
| 222 | Add End-to-End tests for the Miden Node | tests, CI | E2E testing for production confidence |
| 291 | Create Test Runner that does not rely on Miden client | CI | Independent test runner |
Code Health & Refactoring
| # | Title | Labels | Description |
|---|---|---|---|
| 1494 | refactor: accounts.rs is too large | store | Technical debt - file too large |
| 1477 | Revisit and add missing conversions to conv.rs | store | Complete conversion utilities |
| 1465 | db/diesel: Explicitness of Transaction vs Connection for database queries | store | Type safety for database operations |
| 1134 | store: refactor queries.rs and types.rs | store | Code organization |
| 1161 | store/queries: Address technical debt and minor issues | store | Technical debt cleanup |

| # | Title | Labels | Description |
|---|---|---|---|
| 670 | [devops] Deploy without nuking db | CI | Important for production - preserve DB on deploy |
| 973 | Create Dockerfile for Proving Service and add to build-docker and publish workflows | CI | Containerization for deployment |
| 1222 | Investigate github cache reduction | CI | CI optimization |
| 1142 | Switch from make to cargo-make | CI | Build system improvement |
| 1024 | Minimize duplicate dependencies | | Dependency management |

| # | Title | Labels | Description |
|---|---|---|---|
| 1625 | Update Node documentation and architecture diagrams | documentation | Keep docs current |
| 1619 | Update documentation for v0.13 | documentation | Release documentation |
| 628 | Missing panic and errors docs | documentation, CI | Document error cases |
| 1080 | Revisit limits for list parameters in proto messages | documentation, good first issue, node | API documentation |
Minor Improvements & Nice-to-Haves
| # | Title | Labels | Description |
|---|---|---|---|
| 1639 | Add debuginfo to our systemd binaries | node | Better debugging support |
| 1528 | Refactor ConversionError enum | | Code quality |
| 1518 | store/GetAccount: Dynamic response based on map leaf number | store | API improvement |
| 1478 | Stream GetNetworkAccountIds response | store, network transactions | API improvement |
| 1458 | Clean up crates/rpc/src/server/api.rs | good first issue | Code cleanup |
| 1456 | Simplify env var names for node URLs | | Configuration simplification |
| 1444 | Refactor RPC messages to query storage maps by slot ID instead of name | store, rpc | API improvement |
| 1439 | Improve mempool graph replacement API | mempool | API improvement |
| 1431 | Bind to ports instead of urls | node | Configuration improvement |
| 1341 | Consider removing proxy from remote prover binary | delegated proving, on hold | Simplification |
| 1072 | gRPC extension traits | | Code organization |
| 1059 | Simulate genesis bootstrap with real transactions | | Testing tooling |
| 1040 | Ensure miden-base versions match in RPC header check | | Version compatibility |
| 602 | Consider changing account ID's SQL backing type | optimization | Database optimization |
| 600 | refactor: change the native type for account IDs to reduce allocations | | Memory optimization |
| 597 | [CI] Add client integration test as an optional PR job | CI | CI enhancement |
Feature Work (Lower Priority for Reliability)
| # | Title | Labels | Description |
|---|---|---|---|
| 1628 | Persist ntx state | network transactions | Feature for network transactions |
| 1620 | Rename NetworkTransactionBuilder | network transactions | Naming/refactoring |
| 1592 | Deferred Block Proving | store, delegated proving | Feature work |
| 1605 | Rethink transaction sync endpoint | rpc | API redesign |
| 1270 | Proof of Authority: block producer signatures | | Security feature |
| 1314 | Improving actor design | network transactions | Architecture improvement |
| 1306 | Split out network transaction builder to its own service | network transactions | Architecture |
| 1242 | Express mempool constraints ito fees | block-producer, mempool | Feature |
| 1238 | Improve note inclusion path serialization | rpc | Optimization |
| 1209 | Simpler decorators removal from ProvenTransaction and ProvenBatch | rpc | Simplification |
| 1171 | Replace Pingora proxy with custom tonic-based service | delegated proving, on hold | Feature |
| 1112 | Support submitting ProvenBatch | rpc, block-producer, mempool | Feature |
| 995 | Merge miden-proving-service and miden-proving-service-client | delegated proving, on hold | Refactoring |
| 922 | Mempool testing R&D | mempool | Research |
| 762 | Decouple stress-test from block-producer | | Testing |
| 594 | Improve tx reverting strategy | mempool | Feature |
| 573 | Refactor tests | tests | Code quality |
| 341 | Block streaming endpoint | store, rpc | Feature |
CLI & Interface (Nice to Have)
| # | Title | Labels | Description |
|---|---|---|---|
| 137 | Add a hosted/web interface to interact with the servers | | User interface |
| 95 | Add CLI interface to interact with the servers | | CLI tool |
| 83 | add control plane | | Management interface |
| 52 | Add rate limiting to the servers | rpc | Security feature |
| 50 | Endpoint to download initial sync state | store, rpc | Feature |
| 23 | Add persistence to the mempool | block-producer, mempool | Feature |

| Category | Count |
|---|---|
| HIGH Priority | 16 |
| MEDIUM Priority | 52 |
| LOW Priority | 57 |
| Total Unaddressed Issues | 93 |
Issues Already Addressed by Open PRs
| Issue | PR | Title |
|---|---|---|
| #1618 | #1651 | Use #[track_caller] on log/trace helpers |
| #1641 | #1646 | Add typed error codes for GetAccount endpoint |
| #1591 | #1636 | Remove the SynState endpoint |
| #1429 | #1624 | Genesis config: custom account types |
| #1316 | #1614 | Guardrails: Tx and Block Validation + deferred Block Proving |
| #1538 | #1567 | perf: avoid loading and storing full Accounts from and to DB |
| #1470 | #1502 | db/diesel: Better docs / accessibility for schema changes |
| #1406 | #1500 | Create a miden-node-tracing crate |
| #1218 | #1499 | Extract RocksDbStorage smt storage backend into a separate crate |
| #1304 | #1296 | Historical account database data cleanup |
| #617 | #1158 | Return one batch proof per storage map on GetAccount |
Immediate Actions (HIGH Priority)
Database Connection Management (#1539, #1052) - Implement single writer connection and separate worker pools
Performance Optimization (#1538) - The PR (#1567) exists but needs review/merge
Resource Monitoring (#1505) - Implement metrics for disk, memory, CPU usage
Database Backups (#633) - Critical for production reliability
Graceful Shutdown (#91, #1534) - Ensure clean component shutdown
Query Limits (#1467) - Prevent unbounded queries
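The single, locked writer idea from #1539 can be illustrated with a small sketch in which the mutex owns the writer connection, so the connection is reachable only through the lock. `WriterConn` and `Db` are hypothetical stand-ins and do not reflect the store's actual types or its database driver.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for a real database write connection (hypothetical).
#[derive(Default)]
pub struct WriterConn {
    pub applied_blocks: Vec<u32>,
}

/// Handle shared by all tasks. The mutex owns the connection, so every
/// write is serialized and no lock floats free of the data it guards.
#[derive(Clone, Default)]
pub struct Db {
    writer: Arc<Mutex<WriterConn>>,
}

impl Db {
    /// Applies a block while holding the writer lock for the whole write.
    pub fn apply_block(&self, block_num: u32) {
        let mut conn = self.writer.lock().expect("writer mutex poisoned");
        conn.applied_blocks.push(block_num);
    }

    /// Number of blocks written so far.
    pub fn applied_count(&self) -> usize {
        self.writer
            .lock()
            .expect("writer mutex poisoned")
            .applied_blocks
            .len()
    }
}

/// Spawns `writers` threads that each apply one block through the shared handle.
pub fn write_concurrently(db: &Db, writers: u32) {
    let handles: Vec<_> = (0..writers)
        .map(|n| {
            let db = db.clone();
            thread::spawn(move || db.apply_block(n))
        })
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked");
    }
}
```

Read-only queries would typically go through a separate pool of connections, which is the separation #1052 describes; that part is not shown here.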
Short Term (MEDIUM Priority)
Reduce log noise (#1331) while improving error tracking (#1618)
Improve retry logic (#1550)
Add E2E tests (#222)
Complete pagination tests (#1254)
CI cache optimization (#1222)
Dependency cleanup (#1024)
Documentation updates
CLI and web interface development
Feature work (deferred proving, network transactions, etc.)