Python B/E Refactoring Playbook
# Python Project Refactor Playbook
A comprehensive guide for refactoring monolithic Python applications (Flask, Django, FastAPI) into maintainable, production-ready codebases.
## 📋 Pre-Refactoring Assessment
### 1. Current State Analysis
```bash
# Analyze codebase structure
find . -name "*.py" -exec wc -l {} + | sort -nr
find . -name "*.py" -exec grep -l "^class\|^def\|^@app.route" {} \;
# Check dependencies
pip list --outdated
pip check
```
**Red Flags to Look For:**
- Single files > 500 lines
- Mixed concerns (database, business logic, routing in same file)
- Hardcoded credentials or configuration
- No error handling or logging
- Inline SQL queries
- No input validation
- Missing type hints
- No testing structure
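
The first red flag can be checked mechanically. The sketch below is one possible way to list oversized files from Python rather than the shell; the function name `find_large_files` and the 500-line default are illustrative assumptions, not part of any existing tool.

```python
from pathlib import Path


def find_large_files(root: str, max_lines: int = 500) -> list[tuple[str, int]]:
    """Return (path, line_count) for .py files longer than max_lines, largest first."""
    oversized = []
    for path in Path(root).rglob("*.py"):
        # errors="replace" keeps the scan going if a file is not valid UTF-8
        with path.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > max_lines:
            oversized.append((str(path), line_count))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files(".")` from the project root gives a shortlist of files to split first.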
### 2. Architecture Assessment Template
```python
"""
CURRENT ARCHITECTURE ANALYSIS:
File Structure:
- app.py: [LINE_COUNT] lines - [CONCERNS: routing, auth, data, templates]
- Database: [TYPE] - [SCHEMA_COMPLEXITY]
- Authentication: [IMPLEMENTATION_TYPE]
- API Design: [REST/GraphQL/Mixed]
- Error Handling: [PRESENT/MISSING]
- Logging: [CONFIGURED/MISSING]
- Testing: [COVERAGE_%]
- Configuration: [HARDCODED/ENV_VARS/CONFIG_FILES]
Technical Debt:
- [LIST_SPECIFIC_ISSUES]
- [SECURITY_CONCERNS]
- [PERFORMANCE_BOTTLENECKS]
"""
```
## 🎯 Refactoring Strategy
### Phase 1: Extract Configuration & Constants
**Goal**: Get all hardcoded values into one place
```python
# config.py - All configuration in one place
class Config:
    # Database
    DATABASE_PATH = 'Data/financial_data.db'
    CACHE_TIMEOUT = 300  # 5 minutes

    # Business Intelligence Constants
    FY25_REVENUE_GROWTH = 49.2  # %
    FY25_PROFIT_GROWTH = 57.7  # %
    MERITON_SEASONALITY = 58.1  # %

    # Seasonal Factors (validated 30-month analysis)
    SEASONAL_FACTORS = {
        1: 0.95, 2: 0.82, 3: 1.02, 4: 0.92,
        5: 0.95, 6: 1.18, 7: 1.11, 8: 1.17,
        9: 0.94, 10: 1.18, 11: 0.73, 12: 1.23
    }

    # Strategic Cost Buckets
    STRATEGIC_BUCKETS = {
        'personnel_costs': {
            'name': 'Personnel Costs',
            'metrics': ['SALARIES & WAGES', 'PAYROLL TAXES', ...],
            'show_in_operating': True
        },
        # ... other buckets
    }
```
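
Centralizing constants is also the natural moment to deal with the "hardcoded credentials" red flag: secrets and machine-specific paths can be read from the environment while everything else stays a plain class attribute. The following is a minimal sketch, not a prescribed design. The variable names `DASHBOARD_DB_PATH`, `DASHBOARD_CACHE_TIMEOUT`, and `DASHBOARD_SECRET_KEY` and the helper `env_int` are hypothetical.

```python
import os
from typing import Mapping


def env_int(name: str, default: int, env: Mapping[str, str] = os.environ) -> int:
    """Read an integer setting, falling back to the default when it is unset."""
    raw = env.get(name)
    return default if raw is None else int(raw)


class Config:
    # Falls back to the path used elsewhere in this playbook when unset
    DATABASE_PATH = os.environ.get("DASHBOARD_DB_PATH", "Data/financial_data.db")
    CACHE_TIMEOUT = env_int("DASHBOARD_CACHE_TIMEOUT", 300)  # seconds
    # Secrets get no default: None means "not configured"
    SECRET_KEY = os.environ.get("DASHBOARD_SECRET_KEY")
```

This keeps the "simple Python class with constants" shape while removing secrets from source control.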
### Phase 2: Extract Data Access Layer
**Goal**: Separate database operations from business logic
```python
# data/database.py - Clean database abstraction
import sqlite3
from contextlib import contextmanager

from config import Config


class DatabaseConnection:
    @contextmanager
    def get_connection(self):
        conn = self._create_optimized_connection()
        try:
            yield conn
        finally:
            conn.close()

    def _create_optimized_connection(self):
        conn = sqlite3.connect(Config.DATABASE_PATH)
        # Apply all performance optimizations
        conn.execute('PRAGMA journal_mode = WAL')
        conn.execute('PRAGMA cache_size = 10000')
        return conn


# data/queries.py - SQL queries in one place
class FinancialQueries:
    @staticmethod
    def get_performance_data(company_filter, report_date, period_type):
        # Clean, parameterized queries
        pass

    @staticmethod
    def get_trends_data(company, period_type, dates):
        # Batch queries for efficiency
        pass
```
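
The "clean, parameterized queries" placeholder above can be illustrated with a small self-contained sketch. It assumes a hypothetical `financials` table with `company`, `report_date`, `metric`, and `amount` columns, and runs against an in-memory SQLite database so it can be tried without the real data file. The real schema will differ.

```python
import sqlite3


def get_performance_rows(conn, company, report_date):
    """Fetch rows using bound parameters instead of string formatting."""
    sql = (
        "SELECT metric, amount FROM financials "
        "WHERE company = ? AND report_date = ? "
        "ORDER BY metric"
    )
    return conn.execute(sql, (company, report_date)).fetchall()


# Demo data in an in-memory database (table and values are made up)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financials (company TEXT, report_date TEXT, metric TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO financials VALUES (?, ?, ?, ?)",
    [
        ("ExampleCo", "2025-06-30", "SALARIES & WAGES", 1000.0),
        ("ExampleCo", "2025-06-30", "PAYROLL TAXES", 80.0),
        ("OtherCo", "2025-06-30", "SALARIES & WAGES", 500.0),
    ],
)

rows = get_performance_rows(conn, "ExampleCo", "2025-06-30")
# A value containing quotes is treated as data, not as SQL
injected = get_performance_rows(conn, "x' OR '1'='1", "2025-06-30")
conn.close()
```

Because the values travel as bound parameters, the quoted input matches no company and returns no rows instead of altering the query.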
### Phase 3: Extract Business Logic
**Goal**: Pure business logic functions that are easy to test and modify
```python
# business/insights.py - All insight generation logic
from config import Config


class InsightGenerator:
    def __init__(self, config: Config):
        self.config = config

    def generate_landing_commentary(self, kpis, report_date):
        """Generate executive-level insights for landing page"""
        # Pure function - easy to test and modify
        pass

    def generate_performance_commentary(self, revenue_var, margin_var, profit_var):
        """Generate performance insights with FY25 context"""
        pass


# business/seasonal.py - HVAC seasonal intelligence
class SeasonalAnalyzer:
    def __init__(self, seasonal_factors: dict):
        self.seasonal_factors = seasonal_factors

    def analyze_hvac_patterns(self, trends_data):
        """Analyze HVAC seasonal patterns"""
        pass

    def get_meriton_seasonal_context(self, month_num, actual_revenue):
        """Get season-specific business context"""
        pass


# business/predictions.py - AI prediction engine
class PredictionEngine:
    def calculate_prediction_confidence(self, values, bucket_name):
        """Calculate confidence based on data quality"""
        pass

    def generate_smart_predictions(self, historical_data, months_ahead):
        """Generate predictions with seasonal adjustments"""
        pass
```
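
To make "pure function, easy to test" concrete, here is a small sketch of a function that depends only on its arguments. It assumes the seasonal factors are multiplicative indices (1.0 = an average month), which this playbook does not state, and the name `seasonally_adjusted` is hypothetical.

```python
# Factors copied from the Phase 1 config example
SEASONAL_FACTORS = {
    1: 0.95, 2: 0.82, 3: 1.02, 4: 0.92,
    5: 0.95, 6: 1.18, 7: 1.11, 8: 1.17,
    9: 0.94, 10: 1.18, 11: 0.73, 12: 1.23,
}


def seasonally_adjusted(actual_revenue: float, month_num: int, factors: dict) -> float:
    """Divide out the month's factor so different months can be compared.

    Assumes multiplicative factors; no database, request, or global state is touched.
    """
    if month_num not in factors:
        raise ValueError(f"no seasonal factor for month {month_num}")
    factor = factors[month_num]
    if factor <= 0:
        raise ValueError("seasonal factor must be positive")
    return actual_revenue / factor
```

Because the factors are passed in, a test can supply its own small dictionary without importing `Config`.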
### Phase 4: Clean API Layer
**Goal**: Thin API layer that orchestrates business logic
```python
# api/routes.py - Clean route definitions
from flask import Blueprint, jsonify, request

from api.validators import validate_performance_params
from auth import login_required  # assumes auth.py exposes this decorator
from business.insights import InsightGenerator
from config import Config
from data.queries import FinancialQueries

dashboard_bp = Blueprint('dashboard', __name__)


@dashboard_bp.route('/api/performance-data')
@login_required
def get_performance_data():
    # Input validation
    params = validate_performance_params(request.args)
    # Data retrieval
    queries = FinancialQueries()
    data = queries.get_performance_data(**params)
    # Business logic
    insights = InsightGenerator(Config())
    # Phase 3 declares (revenue_var, margin_var, profit_var); unpack those from data as needed
    commentary = insights.generate_performance_commentary(data)
    # Response formatting
    return jsonify({
        'data': data,
        'commentary': commentary,
        'status': 'success'
    })
```
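
The route above calls `validate_performance_params`, which the playbook places in `api/validators.py` but does not show. A minimal sketch of what it might look like follows. The query-string keys (`company`, `report_date`, `period_type`) and the accepted period types are assumptions; only the returned keys are taken from the `get_performance_data` signature in Phase 2.

```python
from datetime import date
from typing import Mapping

# Hypothetical set of accepted values; the playbook does not list them
ALLOWED_PERIOD_TYPES = {"month", "quarter", "year"}


def validate_performance_params(args: Mapping[str, str]) -> dict:
    """Return cleaned keyword arguments, or raise ValueError with a readable message."""
    company = (args.get("company") or "").strip()
    if not company:
        raise ValueError("company is required")

    raw_date = args.get("report_date") or ""
    try:
        report_date = date.fromisoformat(raw_date)
    except ValueError:
        raise ValueError("report_date must be an ISO date such as 2025-06-30") from None

    period_type = args.get("period_type") or "month"
    if period_type not in ALLOWED_PERIOD_TYPES:
        raise ValueError(f"period_type must be one of {sorted(ALLOWED_PERIOD_TYPES)}")

    # Keys match FinancialQueries.get_performance_data so the route can use **params
    return {
        "company_filter": company,
        "report_date": report_date.isoformat(),
        "period_type": period_type,
    }
```

The function only needs a `.get` method, so it works with a plain dictionary in tests and with Flask's `request.args` in the route.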
## 🏗️ Proposed File Structure
```
app/
├── app.py              # Main Flask app (< 100 lines)
├── config.py           # All configuration & constants
├── auth.py             # Authentication logic
├── cache.py            # Caching utilities
├── data/
│   ├── __init__.py
│   ├── database.py     # Database connection & optimization
│   ├── queries.py      # All SQL queries
│   └── models.py       # Data transformation utilities
├── business/
│   ├── __init__.py
│   ├── insights.py     # Insight generation engine
│   ├── seasonal.py     # HVAC seasonal intelligence
│   ├── predictions.py  # AI prediction algorithms
│   └── formatters.py   # Business data formatting
├── api/
│   ├── __init__.py
│   ├── routes.py       # All API routes
│   └── validators.py   # Input validation
└── utils/
    ├── __init__.py
    └── helpers.py      # Generic utility functions
```
## 📝 Refactoring Process
### Step 1: Create Structure
```bash
mkdir -p app/{data,business,api,utils}
touch app/{data,business,api,utils}/__init__.py
```
### Step 2: Extract Constants First
```python
# Start with config.py - move all hardcoded values
# This immediately improves maintainability
```
### Step 3: Move Database Logic
```python
# Extract database.py and queries.py
# Keep existing optimization logic intact
```
### Step 4: Extract Business Logic Functions
```python
# Move insight generation functions to business/
# Keep sophisticated algorithms intact - just organize them
```
### Step 5: Thin Out Routes
```python
# Routes become orchestrators, not implementers
# Easy to see what each endpoint does
```
## ✅ POC-Friendly Patterns
### 1. Keep Complex Logic Intact
```python
# ✅ Don't break complex algorithms - just move them
def generate_hvac_seasonal_analysis(self, monthly_avg, seasonal_factors, metric_name):
    """
    KEEP THIS COMPLEX LOGIC EXACTLY AS IS
    Just move it to business/seasonal.py
    """
    # All the existing sophisticated logic here
    pass
```
### 2. Simple Dependency Injection
```python
# ✅ Simple constructor injection - no fancy DI containers
class InsightGenerator:
    def __init__(self, config, seasonal_analyzer):
        self.config = config
        self.seasonal_analyzer = seasonal_analyzer
```
### 3. Preserve Existing Optimizations
```python
# ✅ Keep all existing caching and performance optimizations
from functools import lru_cache

@lru_cache(maxsize=10)
def get_available_dates_cached(self, table_name):
    # Keep this exactly as is
    pass
```
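
After moving a cached method, one way to confirm the cache still works is `cache_info()`, which `functools.lru_cache` attaches to the wrapped function. The sketch below uses a hypothetical `DateLookup` class with a placeholder return value. Note that when `lru_cache` decorates a method, `self` is part of the cache key, so the cache is shared across instances and keeps them alive while entries remain.

```python
from functools import lru_cache


class DateLookup:
    """Hypothetical stand-in for the class that owns the cached method."""

    def __init__(self):
        self.query_count = 0

    @lru_cache(maxsize=10)
    def get_available_dates_cached(self, table_name):
        # Counts how often the "expensive" body actually runs
        self.query_count += 1
        return ("2025-05-31", "2025-06-30")  # placeholder data


lookup = DateLookup()
first = lookup.get_available_dates_cached("financials")
second = lookup.get_available_dates_cached("financials")

# cache_info() reports hits and misses for the decorated function
info = DateLookup.get_available_dates_cached.cache_info()
```

If `info.hits` stays at zero in a smoke test after the move, the decorator was probably dropped or the arguments are no longer hashable.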
### 4. Minimal Abstractions
```python
# ✅ Don't create abstract base classes or complex inheritance
# Just use simple classes with clear responsibilities
```
## 🚨 What NOT to Do (POC Anti-Patterns)
### ❌ Over-Engineering
```python
# DON'T create complex inheritance hierarchies
from abc import ABC, abstractmethod

class AbstractInsightGenerator(ABC):
    @abstractmethod
    def generate_insight(self): pass

class SeasonalInsightGenerator(AbstractInsightGenerator):
    pass  # This is overkill for a POC
```
### ❌ Too Many Layers
```python
# DON'T create repositories, services, and DTOs
# Keep it simple: queries → business logic → API
```
### ❌ Complex Configuration
```python
# DON'T use complex config systems
# Simple Python class with constants is fine
```
## 🧪 Testing Strategy (Minimal but Effective)
```python
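# tests/test_insights.py - Test the complex business logic
from business.seasonal import SeasonalAnalyzer
from config import Config

def test_seasonal_analysis():
    analyzer = SeasonalAnalyzer(Config.SEASONAL_FACTORS)
    # sample_data: a small trends dataset you define in the test module
    result = analyzer.analyze_hvac_patterns(sample_data)
    assert 'summer_cooling' in result
    assert result['seasonal_variance'] > 0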
```
## 📊 Success Metrics
- ✅ No single file > 300 lines
- ✅ Business logic separated from API routes
- ✅ Configuration centralized in config.py
- ✅ All existing functionality preserved
- ✅ Easy to find and modify insight generation logic
- ✅ Database queries isolated and reusable
- ✅ Still easy to add new endpoints or modify existing ones
## 🔧 Migration Commands
```bash
# 1. Create backup
cp "Updated_Data/President Dashboard - Updated Data/app.py" app_backup.py
# 2. Run existing app to ensure it works
python app.py
# 3. Create new structure (we'll do this step by step)
mkdir -p app/{data,business,api,utils}
# 4. After each refactoring step, test
python -m pytest tests/ -v
python app.py # Manual smoke test
```
## 💡 POC Refactoring Principles
1. **Preserve Sophistication**: Don't dumb down the complex insight algorithms
2. **Organize, Don't Rewrite**: Move code, don't rewrite it
3. **Test After Each Step**: Ensure nothing breaks
4. **Keep It Runnable**: Always have a working version
5. **Focus on Readability**: Make it easy to find and edit business logic
6. **Maintain Performance**: Keep all existing optimizations
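
Principles 2 and 3 can be backed by a simple snapshot check: record what a function returns before it is moved, then confirm the moved copy returns the same thing. This is a minimal sketch under the assumption that the results are JSON-serializable; `check_against_snapshot` and the snapshot directory are hypothetical names.

```python
import json
from pathlib import Path


def check_against_snapshot(name: str, result, snapshot_dir: str) -> bool:
    """Record the result on the first run; compare against the recording afterwards."""
    path = Path(snapshot_dir) / f"{name}.json"
    # sort_keys makes the serialized form independent of dict ordering
    current = json.dumps(result, sort_keys=True, indent=2)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(current, encoding="utf-8")
        return True
    return path.read_text(encoding="utf-8") == current
```

Run it once against the original `app.py` functions to create the recordings, then again after each move; a `False` result means behavior changed, not just location.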
## 🎯 Expected Outcome
After refactoring, you should have:
- **Clear separation** of concerns without over-engineering
- **Easy to modify** business logic and insights
- **Preserved functionality** with all existing features working
- **Better organization** making it easier to add new features
- **Maintainable POC** that can evolve as needed
**Time Investment**: ~4-6 hours for complete refactor
**Risk Level**: Low (moving code, not rewriting logic)
**Benefit**: Much easier to maintain and modify going forward
This approach respects that it's a POC while making it much more maintainable and editable! 🚀