# Install smartmontools if needed
sudo apt install smartmontools
# Check each drive
sudo smartctl -H /dev/sda
sudo smartctl -H /dev/sdg
# Full SMART report
sudo smartctl -a /dev/sda
sudo smartctl -a /dev/sdg
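To avoid reading the health output by eye for every drive, the check above can be wrapped in a small helper. This is a minimal sketch: the function name `smart_health_status` is hypothetical, and it assumes the usual `smartctl -H` wording ("test result: PASSED" on ATA drives, "SMART Health Status: OK" on SCSI/SAS drives).

```shell
# Print PASSED, FAILED, or UNKNOWN for `smartctl -H` output read from stdin.
# smart_health_status is a hypothetical helper name, not part of smartmontools.
smart_health_status() {
    awk '
        /overall-health self-assessment test result: PASSED/ { s = "PASSED" }
        /SMART Health Status: OK/                            { s = "PASSED" }
        /overall-health self-assessment test result: FAILED/ { s = "FAILED" }
        END { print (s == "" ? "UNKNOWN" : s) }
    '
}

# Usage (needs root and the real drives):
#   for dev in /dev/sda /dev/sdg; do
#       printf '%s: ' "$dev"
#       sudo smartctl -H "$dev" | smart_health_status
#   done
```

Reading from stdin keeps the parsing separate from the privileged `smartctl` call, so the helper can be tried on saved output first.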
Check for Drive Errors
# Kernel messages for disk errors
sudo dmesg | grep -iE 'error|fail|sd[a-z]'
# Check system logs
sudo journalctl -u mdmonitor --since "24 hours ago"
Monthly Maintenance Plan
1. SMART Self-Test (run monthly)
# Start short self-test (~2 min)
sudo smartctl -t short /dev/sda
sudo smartctl -t short /dev/sdg
# Check results after test completes
sudo smartctl -l selftest /dev/sda
sudo smartctl -l selftest /dev/sdg
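If the monthly test is scripted, the newest self-test log entry can be checked automatically. The sketch below assumes the common ATA log layout, where the most recent entry starts with `# 1` and a clean run reads "Completed without error". The helper name `latest_selftest_ok` is hypothetical.

```shell
# Exit 0 if the newest entry ("# 1") of `smartctl -l selftest` output on stdin
# reports "Completed without error"; exit 1 otherwise (including no entry).
# latest_selftest_ok is a hypothetical helper name.
latest_selftest_ok() {
    awk '
        /^# *1 / { found = 1; ok = ($0 ~ /Completed without error/) }
        END { exit !(found && ok) }
    '
}

# Usage (needs root and a real drive):
#   sudo smartctl -l selftest /dev/sda | latest_selftest_ok || echo "sda: check self-test log"
```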
2. Array Consistency Check (run monthly)
# Trigger a check (non-destructive, runs in background)
# Use tee: with "sudo echo ... > file" the redirect runs without root and fails
echo check | sudo tee /sys/block/md0/md/sync_action
# Monitor progress
cat /proc/mdstat
# Check result when done
sudo cat /sys/block/md0/md/mismatch_cnt
# Should be 0 (or very low)
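The "wait, then read the count" step can be sketched as one function that reports whether the check is still running and, once it is idle, what `mismatch_cnt` says. The function name `md_check_result` is hypothetical. The sysfs directory is a parameter so the logic can be tried against a scratch directory.

```shell
# Report the outcome of an md consistency check.
# $1: the array's sysfs md directory (defaults to /sys/block/md0/md).
# md_check_result is a hypothetical helper name.
md_check_result() {
    md_dir=${1:-/sys/block/md0/md}
    action=$(cat "$md_dir/sync_action") || return 2
    if [ "$action" != "idle" ]; then
        # A check, repair, resync, or recovery is still running.
        echo "in progress ($action)"
        return 1
    fi
    count=$(cat "$md_dir/mismatch_cnt") || return 2
    if [ "$count" -eq 0 ]; then
        echo "clean"
    else
        echo "mismatches: $count"
    fi
}

# Usage: md_check_result            # md0
#        md_check_result /sys/block/md1/md
```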
3. Review SMART Attributes to Watch
sudo smartctl -A /dev/sda | grep -E 'Reallocated|Pending|Uncorrectable|Power_On|Temperature'
Reallocated_Sector_Ct - bad sectors remapped (watch for increases)
Current_Pending_Sector - sectors waiting to be remapped
Offline_Uncorrectable - sectors that couldn't be read
Temperature_Celsius - keep below 50°C
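The watch list above can be turned into a rough filter that only prints attributes needing attention. This sketch assumes the standard ATA attribute table from `smartctl -A`, with the attribute name in column 2 and the raw value starting in column 10. The name `smart_attr_warnings` is hypothetical, and the default limit mirrors the 50°C guideline above.

```shell
# Read `smartctl -A` output on stdin and print a WARN line for each watched
# attribute that looks unhealthy. Prints nothing when everything is fine.
# $1: temperature limit in Celsius (default 50).
# smart_attr_warnings is a hypothetical helper name.
smart_attr_warnings() {
    awk -v max_temp="${1:-50}" '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" {
            # Any non-zero raw count is worth a look.
            if ($10 + 0 > 0) print "WARN " $2 " raw=" $10
        }
        $2 == "Temperature_Celsius" {
            if ($10 + 0 >= max_temp + 0) print "WARN " $2 " raw=" $10
        }
    '
}

# Usage (needs root and a real drive):
#   sudo smartctl -A /dev/sda | smart_attr_warnings
```

A non-zero reallocated count is not necessarily an emergency. The trend over successive runs matters more, so logging the output each month is a reasonable addition.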
Hot-Swap Procedure (when a drive fails)
1. Identify the failed drive
cat /proc/mdstat
# Look for [U_] or [_U] - underscore shows failed position
sudo mdadm --detail /dev/md0
# Shows "faulty" or "removed" next to bad drive
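Spotting the underscore in `/proc/mdstat` can also be automated. The following sketch prints the name of every array whose member map (such as `[U_]`) contains an underscore. The helper name `md_degraded_arrays` is hypothetical, and it assumes the usual mdstat layout where the map appears on the status line after the `mdN :` line.

```shell
# Read /proc/mdstat-style text on stdin and print each degraded array name.
# md_degraded_arrays is a hypothetical helper name.
md_degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        # The first bracketed run of U/_ after the array line is the member map.
        name != "" && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
            name = ""
        }
    '
}

# Usage:
#   md_degraded_arrays < /proc/mdstat
```

Empty output means no array is missing a member. Any printed name is a candidate for `mdadm --detail`.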
2. Remove the failed drive from array
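# Replace sdX1 with the actual failed device.
# If it is not already marked faulty, mark it failed first:
sudo mdadm --fail /dev/md0 /dev/sdX1
sudo mdadm --remove /dev/md0 /dev/sdX1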
3. Physically swap the drive
Power down if needed (or hot-swap if supported)
Remove failed drive
Insert new drive (must be same size or larger: 8TB)