@txoof
Created December 12, 2025 07:41

Setting up ClearML Projects for RL Training

ClearML Remote Training with GitHub

Problem

ClearML remote workers clone your repository in an isolated environment.
They do not have access to your local SSH keys, SSH config, or GitHub login state.

This commonly causes jobs to fail with errors like:

fatal: Could not read from remote repository
Please make sure you have the correct access rights

This document describes a stable, low-friction setup that works in shared school infrastructure and avoids constant Git authentication problems.


Core Idea

Your local Git setup and the repository URL ClearML uses do not need to be the same.

  • You use SSH locally so you can push without pain
  • ClearML clones over HTTPS, so for a public repo the remote worker needs no credentials

These concerns are intentionally separated.
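
The SSH and HTTPS forms of a GitHub URL differ only in their prefix, so the HTTPS URL given to ClearML can be derived from the SSH remote used locally. A minimal sketch, assuming standard `git@github.com:OWNER/REPO.git` remotes; the helper name `ssh_to_https` is hypothetical and not part of ClearML or Git:

```python
import re

# Matches scp-style GitHub SSH remotes such as git@github.com:OWNER/REPO.git
_SSH_REMOTE = re.compile(r"^git@github\.com:(?P<owner>[^/]+)/(?P<repo>.+?)(?:\.git)?$")


def ssh_to_https(remote_url: str) -> str:
    """Return the HTTPS clone URL for a GitHub SSH remote (hypothetical helper).

    URLs that are already HTTPS are returned unchanged.
    """
    remote_url = remote_url.strip()
    if remote_url.startswith("https://"):
        return remote_url
    match = _SSH_REMOTE.match(remote_url)
    if match is None:
        raise ValueError(f"Not a standard GitHub SSH remote: {remote_url!r}")
    return f"https://github.com/{match['owner']}/{match['repo']}.git"
```

The result could be passed as the `repository` argument when telling ClearML what to clone, while the local `origin` remote stays on SSH.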

Requirements

One team member must:

  • Set up a public GitHub repo
  • Add the other team members as collaborators
  • Share the repo link with the team

Local Git Setup (Do This Once)

Point your local clone at the SSH remote (or clone with this URL in the first place):

git remote set-url origin git@github.com:OWNER_USERNAME/YOUR_REPO.git

From this point on, all local work (commit, push, pull) happens normally over SSH.


ClearML Setup (Force HTTPS Explicitly)

In your training script, explicitly tell ClearML which repository to clone.

Example:

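from clearml import Task

# Register this run with the ClearML server
task = Task.init(
    project_name='Pendulum-v1/yourname',
    task_name='Pendulum_test',
)

# Docker image the remote worker runs the job in
task.set_base_docker('deanis/2023y2b-rl:latest')

# Override the auto-detected repository with the public HTTPS URL
task.set_script(
    repository='https://github.com/OWNER_USERNAME/YOUR_REPO.git',
    branch='your_branch_name',
    working_dir='.',
    entry_point='RL_pendulum_training.py',
)

# Stop the local run and enqueue the task for a remote worker
task.execute_remotely(queue_name='default')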

This ensures:

  • ClearML does not rely on the SSH remote detected from your local clone
  • The remote worker clones over HTTPS
  • Public repos clone without authentication

Daily Workflow

  1. Make changes locally
  2. Commit everything you want to run remotely
  3. Push using SSH
  4. Run the training script
  5. ClearML queues the job and clones over HTTPS

You do not switch remotes back and forth.


Key Understandings for ClearML Projects

  • ClearML queues jobs on a remote server
  • A job should represent a specific experiment, usually defined by a set of hyperparameters
  • ClearML works best when each job explores a clear parameter configuration
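
One way to keep each job tied to a single configuration is to encode the hyperparameters in the task name. A minimal sketch; `task_name_for` is a hypothetical helper, and the naming scheme is an assumption rather than anything ClearML requires:

```python
def task_name_for(base: str, params: dict) -> str:
    """Build a task name that encodes one hyperparameter configuration.

    Keys are sorted so the same configuration always yields the same name.
    """
    parts = [f"{key}={params[key]}" for key in sorted(params)]
    return "_".join([base, *parts])
```

The returned string could be used as `task_name` in `Task.init`, which makes runs with different settings easy to tell apart in the ClearML web UI.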

Key Parameters to Explore

  • Learning rate
  • Batch size
  • Number of steps
  • Reward function

Reward functions are the hardest to tune because they interact strongly with all other parameters.
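
The numeric parameters above can be exposed as command-line arguments so that each run states its configuration explicitly. A minimal sketch using `argparse`, whose parsed arguments ClearML can usually record as the task's hyperparameters; the argument names and default values here are placeholders, not recommendations, and the reward function is left out because it is code rather than a single number:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line hyperparameters; all defaults are placeholder values."""
    parser = argparse.ArgumentParser(description="RL training hyperparameters")
    parser.add_argument("--learning_rate", type=float, default=3e-4)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--n_steps", type=int, default=2048)
    return parser
```

In a training script this would typically be used as `args = build_parser().parse_args()`, with `args.learning_rate` and the other values passed on to the training code.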


Git Hygiene Requirements

ClearML expects a relatively clean Git state.

  • Uncommitted changes often cause jobs to fail
  • Large diffs or experimental local code undermine reproducibility
  • Always commit and push before enqueueing a job; the remote worker only sees what is on GitHub

Running ClearML from a messy working tree is a common source of failure.
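
A training script can guard against this by checking the working tree before it enqueues anything. A minimal sketch, assuming `git` is on the PATH; both function names are hypothetical, and the check covers only uncommitted changes, not commits that have not been pushed:

```python
import subprocess


def has_uncommitted_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` output lists any changed or untracked file."""
    return any(line.strip() for line in porcelain_output.splitlines())


def git_status_porcelain(repo_dir: str = ".") -> str:
    """Run `git status --porcelain` in repo_dir and return its raw output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `has_uncommitted_changes(git_status_porcelain())` before `task.execute_remotely(...)` and aborting when it returns `True` is one way to enforce the commit-first rule.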


Recommended Team Repository Structure

A shared team repository should have the following properties:

  • Shared access for all team members
  • A clean, stable core training wrapper
  • One branch per team member for experimentation

Branch Discipline

  • Each team member explores parameters in their own branch
  • Experimental branches should remain reasonably clean
  • Avoid long-lived, chaotic branches with unrelated changes

Team Workflow for Hyperparameter Exploration

  1. As a group, agree on which hyperparameters to explore
  2. Each team member focuses on one parameter space
  3. Avoid exhaustive grid searches; they are usually inefficient
  4. Prefer parameter sweeps over sensible ranges
  5. When promising values emerge, other team members adjust their sweeps
  6. Gradually converge toward better configurations

This allows parallel exploration without stepping on each other’s work.
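
Random sampling within agreed ranges is one common alternative to an exhaustive grid. The sketch below illustrates that technique and is not necessarily the sweep method intended here; the ranges, the batch-size choices, and both function names are illustrative assumptions:

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_configs(n: int, seed: int = 0) -> list:
    """Return n random configurations; the ranges below are illustrative only."""
    rng = random.Random(seed)
    return [
        {
            "learning_rate": sample_log_uniform(1e-5, 1e-2, rng),
            "batch_size": rng.choice([32, 64, 128, 256]),
        }
        for _ in range(n)
    ]
```

Sampling the learning rate on a log scale spreads trials evenly across orders of magnitude, and a fixed seed lets each team member reproduce their own list of configurations.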


Troubleshooting Appendix (Advanced)

Multiple GitHub Accounts and SSH Aliases

Some users have multiple GitHub accounts configured via SSH aliases:

Host buas_git
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_school

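Do not use SSH aliases in the repository URL that ClearML sees. The alias exists only in your local ~/.ssh/config, so a remote worker cannot resolve a host such as buas_git.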

Instead:

  1. Keep the remote URL standard:
    git@github.com:YOUR_USERNAME/YOUR_REPO.git

  2. Pin the SSH key for this repo only:
    git config core.sshCommand "ssh -i ~/.ssh/id_rsa_school -o IdentitiesOnly=yes"

This affects only the current repository and does not interfere with other projects.
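
To spot a remote URL that still goes through an alias, the host part of the URL can be compared against `github.com`. A minimal sketch; `uses_ssh_alias` is a hypothetical helper that handles only scp-style remotes of the form `user@host:path`:

```python
def uses_ssh_alias(remote_url: str) -> bool:
    """True if an scp-style SSH remote points at a host other than github.com.

    Such a host (for example an alias from ~/.ssh/config) only resolves on
    the machine that defines it.
    """
    if "://" in remote_url or "@" not in remote_url:
        return False  # HTTPS and other URL schemes are not scp-style SSH remotes
    host = remote_url.split("@", 1)[1].split(":", 1)[0]
    return host != "github.com"
```

The string to check is whatever `git remote get-url origin` prints for the repository.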


Summary

  • Use SSH locally
  • Force HTTPS for ClearML
  • Commit before enqueueing
  • Keep branches clean
  • Coordinate parameter exploration as a team

This setup avoids authentication issues and scales well for shared ClearML infrastructure in coursework.
