Note: This gist outlines architectural patterns and infrastructure used for research. Source code and proprietary datasets are excluded per lab data policy.
Reinforcement Learning Environment
- Gymnasium Wrappers: Custom observation-based environments with flexible state transitions for cognitive task modeling.
- Configurable Rewards: Decoupled reward structure from environment logic to allow rapid iteration on task definitions.
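A minimal sketch of the decoupled-reward pattern using the standard Gymnasium API. The class and parameter names (`CognitiveTaskEnv`, `reward_fn`, `n_states`) are illustrative placeholders, not the lab's actual code, and the transition logic is a toy stand-in.

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces


class CognitiveTaskEnv(gym.Env):
    """Observation-based task with an injectable reward function."""

    def __init__(self, n_states=4, n_actions=2, reward_fn=None, max_steps=50):
        super().__init__()
        self.observation_space = spaces.Discrete(n_states)
        self.action_space = spaces.Discrete(n_actions)
        # Reward is decoupled from the transition logic, so task definitions
        # can be iterated on without touching the environment dynamics.
        self.reward_fn = reward_fn or (lambda state, action, next_state: 0.0)
        self.max_steps = max_steps

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self._state = 0
        self._t = 0
        return self._state, {}

    def step(self, action):
        # Toy stochastic transition; real task dynamics would go here.
        next_state = int(self.np_random.integers(self.observation_space.n))
        reward = self.reward_fn(self._state, action, next_state)
        self._state = next_state
        self._t += 1
        terminated = False
        truncated = self._t >= self.max_steps
        return next_state, reward, terminated, truncated, {}


# Swapping reward structures is a one-line change:
env = CognitiveTaskEnv(reward_fn=lambda s, a, s2: 1.0 if s2 == 3 else 0.0)
obs, info = env.reset(seed=0)
```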
Parameter Optimization
- Gradient-Free Search: Integrated PyBADS (Bayesian Adaptive Direct Search) for optimizing agents in non-differentiable landscapes.
- Posterior Estimation: PyVBMC (Variational Bayesian Monte Carlo) for approximate posterior estimation, model comparison, and uncertainty quantification; a fitting sketch covering both tools follows this list.
- Flexible Fitting Pipeline: Supports hot-swapping model architectures and initialization strategies without rewriting the optimization loop.
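A hedged sketch of the fitting flow: PyBADS for a gradient-free point estimate and PyVBMC for an approximate posterior over the same parameters. The objective `neg_log_lik`, bounds, and starting point are placeholders for the actual model; the calls follow the libraries' documented quickstart usage.

```python
import numpy as np
from pybads import BADS
from pyvbmc import VBMC


def neg_log_lik(theta):
    # Placeholder objective; in practice this evaluates the cognitive model
    # (possibly via simulation) against behavioral data.
    theta = np.atleast_1d(theta)
    return float(np.sum((theta - 0.5) ** 2))


x0 = np.array([0.2, 0.2])
lb, ub = np.array([0.0, 0.0]), np.array([1.0, 1.0])      # hard bounds
plb, pub = np.array([0.1, 0.1]), np.array([0.9, 0.9])    # plausible bounds

# Gradient-free point estimate with BADS.
bads = BADS(neg_log_lik, x0, lb, ub, plb, pub)
bads_result = bads.optimize()
print("BADS estimate:", bads_result["x"], "fval:", bads_result["fval"])

# Approximate posterior with VBMC; the target is an unnormalized log density
# (here simply the negated objective).
vbmc = VBMC(lambda t: -neg_log_lik(t), x0, lb, ub, plb, pub)
vp, results = vbmc.optimize()
print("ELBO:", results.get("elbo"))
```

Because both tools share the same objective signature and bound conventions, swapping in a different model architecture or initialization only changes `neg_log_lik` and `x0`, not the surrounding loop.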
HPC Infrastructure
- Slurm Integration: Automated job dispatching via submitit; handles job arrays, timeouts, and requeuing logic.
- Massive Parallelism: Scales fitting procedures across hundreds of CPUs concurrently.
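A minimal sketch of the submitit dispatch pattern. Partition name, resource limits, and the per-subject fit function are placeholders; each array task would run one model fit.

```python
import submitit


def fit_subject(subject_id):
    # Placeholder: load data, run the BADS/VBMC pipeline, return fitted params.
    return {"subject": subject_id, "params": None}


executor = submitit.AutoExecutor(folder="slurm_logs")
executor.update_parameters(
    timeout_min=240,              # per-job wall clock limit
    slurm_partition="cpu",        # assumed partition name
    cpus_per_task=1,
    slurm_array_parallelism=256,  # cap on concurrently running array tasks
)

subject_ids = list(range(200))
jobs = executor.map_array(fit_subject, subject_ids)  # one Slurm job array
results = [job.result() for job in jobs]             # blocks until all finish
```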
Data Pipeline
- Heterogeneous Ingestion: Pandas-based normalization for complex time-series data (behavioral and neuroimaging).
- Serialization: Optimized I/O for rapid iterative testing of model hypotheses.
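A hedged sketch of the ingestion step: per-session time series with heterogeneous fields are normalized into one long-format DataFrame and cached to disk for fast reload. Column names, the toy data, and the choice of Parquet (which requires pyarrow or fastparquet) are assumptions for illustration.

```python
import pandas as pd


def normalize_session(raw: dict, subject_id: str) -> pd.DataFrame:
    # Each raw session may carry different fields; map them onto a common schema.
    df = pd.DataFrame(raw)
    df["subject_id"] = subject_id
    df["time_s"] = df["time_ms"] / 1000.0  # unify time units to seconds
    return df[["subject_id", "time_s", "choice", "reward"]]


sessions = {
    "s01": {"time_ms": [0, 500, 1000], "choice": [0, 1, 1], "reward": [0, 1, 0]},
    "s02": {"time_ms": [0, 450, 900], "choice": [1, 1, 0], "reward": [1, 0, 0]},
}
data = pd.concat(
    [normalize_session(raw, sid) for sid, raw in sessions.items()],
    ignore_index=True,
)
data.to_parquet("behavior.parquet")       # fast round-trip for iterative testing
data = pd.read_parquet("behavior.parquet")
```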
Tech Stack
- Core: Python, NumPy, Pandas
- RL: Gymnasium
- Optimization: PyBADS, PyVBMC
- Infra: Slurm, Submitit