Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save carefree-ladka/199c32b470168b2fdf67663c3f054a67 to your computer and use it in GitHub Desktop.

Select an option

Save carefree-ladka/199c32b470168b2fdf67663c3f054a67 to your computer and use it in GitHub Desktop.
SQL Complete Interview Preparation Guide 🎯

SQL Complete Interview Preparation Guide 🎯

Master SQL with 150+ Problems, Solutions & Concepts


Table of Contents

  1. SQL Basics & Fundamentals
  2. SELECT Queries & Filtering
  3. Aggregate Functions & GROUP BY
  4. JOINs (INNER, LEFT, RIGHT, FULL, CROSS, SELF)
  5. Subqueries & Nested Queries
  6. Window Functions (MUST KNOW)
  7. Common Table Expressions (CTEs)
  8. String Functions
  9. Date & Time Functions
  10. CASE Statements & Conditional Logic
  11. Set Operations (UNION, INTERSECT, EXCEPT)
  12. Indexes & Performance Optimization
  13. Constraints & Data Integrity
  14. Transactions & ACID Properties
  15. Advanced SQL Patterns
  16. 30 MUST-DO Interview Questions
  17. SQL Optimization Techniques
  18. Real-World Scenarios

Sample Database Schema

We'll use these tables throughout the guide:

-- Employees Table
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20),
    hire_date DATE,
    job_id VARCHAR(10),
    salary DECIMAL(10, 2),
    commission_pct DECIMAL(4, 2),
    manager_id INT,
    department_id INT
);

-- Departments Table
CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(50),
    manager_id INT,
    location_id INT
);

-- Jobs Table
CREATE TABLE jobs (
    job_id VARCHAR(10) PRIMARY KEY,
    job_title VARCHAR(50),
    min_salary DECIMAL(10, 2),
    max_salary DECIMAL(10, 2)
);

-- Locations Table
CREATE TABLE locations (
    location_id INT PRIMARY KEY,
    street_address VARCHAR(100),
    postal_code VARCHAR(20),
    city VARCHAR(50),
    state_province VARCHAR(50),
    country_id CHAR(2)
);

-- Orders Table
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    ship_date DATE,
    status VARCHAR(20),
    total_amount DECIMAL(10, 2)
);

-- Order Items Table
CREATE TABLE order_items (
    order_item_id INT PRIMARY KEY,
    order_id INT,
    product_id INT,
    quantity INT,
    unit_price DECIMAL(10, 2)
);

-- Products Table
CREATE TABLE products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100),
    category VARCHAR(50),
    price DECIMAL(10, 2),
    stock_quantity INT
);

-- Customers Table
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    city VARCHAR(50),
    country VARCHAR(50),
    registration_date DATE
);

1. SQL Basics & Fundamentals

1.1 Data Definition Language (DDL)

-- CREATE TABLE
CREATE TABLE students (
    student_id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    enrollment_date DATE DEFAULT CURRENT_DATE,
    gpa DECIMAL(3, 2) CHECK (gpa >= 0 AND gpa <= 4.0)
);

-- ALTER TABLE - Add Column
ALTER TABLE students
ADD COLUMN phone VARCHAR(20);

-- ALTER TABLE - Modify Column
ALTER TABLE students
MODIFY COLUMN email VARCHAR(150);

-- ALTER TABLE - Drop Column
ALTER TABLE students
DROP COLUMN phone;

-- DROP TABLE
DROP TABLE IF EXISTS students;

-- TRUNCATE TABLE (removes all data, keeps structure)
TRUNCATE TABLE students;

1.2 Data Manipulation Language (DML)

-- INSERT - Single Row
INSERT INTO employees (employee_id, first_name, last_name, salary, hire_date)
VALUES (1, 'John', 'Doe', 75000, '2023-01-15');

-- INSERT - Multiple Rows
INSERT INTO employees (employee_id, first_name, last_name, salary, hire_date)
VALUES 
    (2, 'Jane', 'Smith', 80000, '2023-02-01'),
    (3, 'Mike', 'Johnson', 70000, '2023-03-10'),
    (4, 'Sarah', 'Williams', 85000, '2023-04-05');

-- UPDATE - Single Condition
UPDATE employees
SET salary = 78000
WHERE employee_id = 1;

-- UPDATE - Multiple Conditions
UPDATE employees
SET salary = salary * 1.10,
    commission_pct = 0.05
WHERE department_id = 10 AND salary < 80000;

-- DELETE - With Condition
DELETE FROM employees
WHERE employee_id = 1;

-- DELETE - All Rows (use carefully!)
-- DELETE FROM employees;

1.3 Data Query Language (DQL)

-- SELECT - All Columns
SELECT * FROM employees;

-- SELECT - Specific Columns
SELECT first_name, last_name, salary FROM employees;

-- SELECT - With Alias
SELECT 
    first_name AS "First Name",
    last_name AS "Last Name",
    salary AS "Annual Salary"
FROM employees;

-- SELECT DISTINCT - Unique Values
SELECT DISTINCT department_id FROM employees;

-- SELECT - With Calculations
SELECT 
    first_name,
    last_name,
    salary,
    salary * 12 AS annual_salary,
    salary * 0.10 AS tax
FROM employees;

2. SELECT Queries & Filtering

2.1 WHERE Clause

-- Basic Filtering
SELECT * FROM employees
WHERE salary > 80000;

-- Multiple Conditions - AND
SELECT * FROM employees
WHERE salary > 70000 AND department_id = 10;

-- Multiple Conditions - OR
SELECT * FROM employees
WHERE department_id = 10 OR department_id = 20;

-- NOT Operator
SELECT * FROM employees
WHERE NOT department_id = 10;

-- BETWEEN Operator
SELECT * FROM employees
WHERE salary BETWEEN 70000 AND 90000;

-- IN Operator
SELECT * FROM employees
WHERE department_id IN (10, 20, 30);

-- LIKE Operator - Pattern Matching
SELECT * FROM employees
WHERE first_name LIKE 'J%';  -- Starts with J

SELECT * FROM employees
WHERE email LIKE '%@gmail.com';  -- Ends with @gmail.com

SELECT * FROM employees
WHERE first_name LIKE '_ohn';  -- Second letter onwards is 'ohn'

-- IS NULL / IS NOT NULL
SELECT * FROM employees
WHERE commission_pct IS NULL;

SELECT * FROM employees
WHERE manager_id IS NOT NULL;

2.2 ORDER BY Clause

-- ORDER BY - Ascending (Default)
SELECT first_name, last_name, salary
FROM employees
ORDER BY salary;

-- ORDER BY - Descending
SELECT first_name, last_name, salary
FROM employees
ORDER BY salary DESC;

-- ORDER BY - Multiple Columns
SELECT first_name, last_name, department_id, salary
FROM employees
ORDER BY department_id ASC, salary DESC;

-- ORDER BY - Column Position
SELECT first_name, last_name, salary
FROM employees
ORDER BY 3 DESC;  -- Orders by 3rd column (salary)

2.3 LIMIT & OFFSET

-- LIMIT - Top N Records
SELECT first_name, last_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;

-- LIMIT with OFFSET - Pagination
SELECT first_name, last_name, salary
FROM employees
ORDER BY employee_id
LIMIT 10 OFFSET 20;  -- Skip 20 rows, then get 10 rows

-- Alternative Syntax (MySQL)
SELECT first_name, last_name, salary
FROM employees
ORDER BY employee_id
LIMIT 20, 10;  -- Skip 20, get 10

3. Aggregate Functions & GROUP BY

3.1 Aggregate Functions

-- COUNT - Total Rows
SELECT COUNT(*) AS total_employees
FROM employees;

-- COUNT - Non-NULL Values
SELECT COUNT(commission_pct) AS employees_with_commission
FROM employees;

-- COUNT DISTINCT - Unique Values
SELECT COUNT(DISTINCT department_id) AS unique_departments
FROM employees;

-- SUM - Total
SELECT SUM(salary) AS total_salary_expense
FROM employees;

-- AVG - Average
SELECT AVG(salary) AS average_salary
FROM employees;

-- MIN - Minimum Value
SELECT MIN(salary) AS lowest_salary
FROM employees;

-- MAX - Maximum Value
SELECT MAX(salary) AS highest_salary
FROM employees;

-- Multiple Aggregates
SELECT 
    COUNT(*) AS total_employees,
    SUM(salary) AS total_salary,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees;

3.2 GROUP BY Clause

-- GROUP BY - Single Column
SELECT 
    department_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;

-- GROUP BY - Multiple Columns
SELECT 
    department_id,
    job_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id, job_id
ORDER BY department_id, job_id;

-- GROUP BY with WHERE
SELECT 
    department_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE salary > 50000
GROUP BY department_id;

3.3 HAVING Clause

-- HAVING - Filter Grouped Results
SELECT 
    department_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 5;

-- HAVING with Multiple Conditions
SELECT 
    department_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 5 AND AVG(salary) > 70000;

-- WHERE vs HAVING
-- WHERE filters before grouping, HAVING filters after grouping
SELECT 
    department_id,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE hire_date > '2022-01-01'  -- Filter before grouping
GROUP BY department_id
HAVING AVG(salary) > 60000;     -- Filter after grouping

4. JOINs

4.1 INNER JOIN

-- INNER JOIN - Basic
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;

-- INNER JOIN - Multiple Tables
SELECT 
    e.first_name,
    e.last_name,
    d.department_name,
    l.city,
    l.country_id
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
INNER JOIN locations l ON d.location_id = l.location_id;

-- INNER JOIN with WHERE
SELECT 
    e.first_name,
    e.last_name,
    e.salary,
    d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
WHERE e.salary > 70000;

4.2 LEFT JOIN (LEFT OUTER JOIN)

-- LEFT JOIN - Returns all from left table
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;

-- Find Employees Without Department
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id
WHERE d.department_id IS NULL;

4.3 RIGHT JOIN (RIGHT OUTER JOIN)

-- RIGHT JOIN - Returns all from right table
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;

-- Find Departments Without Employees
SELECT 
    d.department_id,
    d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id
WHERE e.employee_id IS NULL;

4.4 FULL OUTER JOIN

-- FULL OUTER JOIN (MySQL doesn't support directly, use UNION)
SELECT 
    e.employee_id,
    e.first_name,
    d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id

UNION

SELECT 
    e.employee_id,
    e.first_name,
    d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;

4.5 CROSS JOIN

-- CROSS JOIN - Cartesian Product
SELECT 
    e.first_name,
    d.department_name
FROM employees e
CROSS JOIN departments d;

-- Useful for generating combinations
SELECT 
    p.product_name,
    s.size
FROM products p
CROSS JOIN (SELECT 'Small' AS size UNION SELECT 'Medium' UNION SELECT 'Large') s;

4.6 SELF JOIN

-- SELF JOIN - Find Employee and Their Manager
SELECT 
    e.employee_id,
    e.first_name AS employee_name,
    e.last_name AS employee_lastname,
    m.first_name AS manager_name,
    m.last_name AS manager_lastname
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

-- Find Employees in Same Department
SELECT 
    e1.first_name || ' ' || e1.last_name AS employee1,
    e2.first_name || ' ' || e2.last_name AS employee2,
    e1.department_id
FROM employees e1
INNER JOIN employees e2 
    ON e1.department_id = e2.department_id
    AND e1.employee_id < e2.employee_id;  -- Avoid duplicates

5. Subqueries & Nested Queries

5.1 Single-Row Subqueries

-- Subquery in WHERE Clause
SELECT first_name, last_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

-- Subquery with IN
SELECT first_name, last_name, department_id
FROM employees
WHERE department_id IN (
    SELECT department_id 
    FROM departments 
    WHERE location_id = 1700
);

-- Subquery in SELECT
SELECT 
    first_name,
    last_name,
    salary,
    (SELECT AVG(salary) FROM employees) AS avg_salary,
    salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;

5.2 Multiple-Row Subqueries

-- ALL Operator
SELECT first_name, last_name, salary
FROM employees
WHERE salary > ALL (
    SELECT salary 
    FROM employees 
    WHERE department_id = 10
);

-- ANY Operator
SELECT first_name, last_name, salary
FROM employees
WHERE salary > ANY (
    SELECT salary 
    FROM employees 
    WHERE department_id = 10
);

-- EXISTS Operator
SELECT department_name
FROM departments d
WHERE EXISTS (
    SELECT 1 
    FROM employees e 
    WHERE e.department_id = d.department_id
);

-- NOT EXISTS Operator
SELECT department_name
FROM departments d
WHERE NOT EXISTS (
    SELECT 1 
    FROM employees e 
    WHERE e.department_id = d.department_id
);

5.3 Correlated Subqueries

-- Find Employees Earning More Than Avg in Their Department
SELECT 
    e.first_name,
    e.last_name,
    e.salary,
    e.department_id
FROM employees e
WHERE salary > (
    SELECT AVG(salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
);

-- Find Departments with More Than 5 Employees
SELECT d.department_name
FROM departments d
WHERE (
    SELECT COUNT(*)
    FROM employees e
    WHERE e.department_id = d.department_id
) > 5;

5.4 Subqueries in FROM Clause (Derived Tables)

-- Using Subquery as a Table
SELECT 
    dept_summary.department_id,
    dept_summary.avg_salary,
    d.department_name
FROM (
    SELECT 
        department_id,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) AS dept_summary
JOIN departments d ON dept_summary.department_id = d.department_id
WHERE dept_summary.avg_salary > 70000;

6. Window Functions

6.1 ROW_NUMBER()

-- Assign Row Numbers
SELECT 
    employee_id,
    first_name,
    last_name,
    department_id,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;

-- Row Number Within Each Department
SELECT 
    employee_id,
    first_name,
    last_name,
    department_id,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_rank
FROM employees;

6.2 RANK() and DENSE_RANK()

-- RANK() - Leaves gaps after ties
SELECT 
    employee_id,
    first_name,
    last_name,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

-- DENSE_RANK() - No gaps after ties
SELECT 
    employee_id,
    first_name,
    last_name,
    salary,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_dense_rank
FROM employees;

-- Compare RANK vs DENSE_RANK
SELECT 
    employee_id,
    first_name,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS rank_with_gaps,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS rank_no_gaps
FROM employees;

6.3 NTILE()

-- Divide into N Equal Groups (Quartiles)
SELECT 
    employee_id,
    first_name,
    salary,
    NTILE(4) OVER (ORDER BY salary DESC) AS salary_quartile
FROM employees;

-- Divide by Department into 3 Groups
SELECT 
    employee_id,
    first_name,
    department_id,
    salary,
    NTILE(3) OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_salary_group
FROM employees;

6.4 LAG() and LEAD()

-- LAG() - Access Previous Row
SELECT 
    employee_id,
    first_name,
    salary,
    LAG(salary, 1) OVER (ORDER BY salary) AS previous_salary,
    salary - LAG(salary, 1) OVER (ORDER BY salary) AS salary_diff
FROM employees;

-- LEAD() - Access Next Row
SELECT 
    employee_id,
    first_name,
    hire_date,
    LEAD(hire_date, 1) OVER (ORDER BY hire_date) AS next_hire_date
FROM employees;

-- Compare with Previous Employee in Same Department
SELECT 
    employee_id,
    first_name,
    department_id,
    salary,
    LAG(salary, 1) OVER (PARTITION BY department_id ORDER BY hire_date) AS prev_emp_salary
FROM employees;

6.5 FIRST_VALUE() and LAST_VALUE()

-- FIRST_VALUE() - First Value in Window
SELECT 
    employee_id,
    first_name,
    department_id,
    salary,
    FIRST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary DESC) AS highest_salary_in_dept
FROM employees;

-- LAST_VALUE() - Last Value in Window
SELECT 
    employee_id,
    first_name,
    department_id,
    salary,
    LAST_VALUE(salary) OVER (
        PARTITION BY department_id 
        ORDER BY salary DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS lowest_salary_in_dept
FROM employees;

6.6 Running Totals and Moving Averages

-- Running Total (Cumulative Sum)
SELECT 
    order_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (ORDER BY order_date) AS running_total
FROM orders;

-- Running Total by Customer
SELECT 
    customer_id,
    order_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS customer_running_total
FROM orders;

-- Moving Average (Last 3 rows)
SELECT 
    order_date,
    total_amount,
    AVG(total_amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3days
FROM orders;

7. Common Table Expressions (CTEs)

7.1 Basic CTE

-- Simple CTE
WITH dept_avg AS (
    SELECT 
        department_id,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT 
    e.first_name,
    e.last_name,
    e.salary,
    d.avg_salary
FROM employees e
JOIN dept_avg d ON e.department_id = d.department_id
WHERE e.salary > d.avg_salary;

7.2 Multiple CTEs

-- Multiple CTEs
WITH 
high_earners AS (
    SELECT employee_id, first_name, last_name, salary, department_id
    FROM employees
    WHERE salary > 80000
),
dept_counts AS (
    SELECT 
        department_id,
        COUNT(*) AS emp_count
    FROM employees
    GROUP BY department_id
)
SELECT 
    h.first_name,
    h.last_name,
    h.salary,
    d.emp_count
FROM high_earners h
JOIN dept_counts d ON h.department_id = d.department_id;

7.3 Recursive CTE

-- Recursive CTE - Employee Hierarchy
WITH RECURSIVE emp_hierarchy AS (
    -- Anchor: Start with top-level employees (no manager)
    SELECT 
        employee_id,
        first_name,
        last_name,
        manager_id,
        1 AS level
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive: Get employees managed by previous level
    SELECT 
        e.employee_id,
        e.first_name,
        e.last_name,
        e.manager_id,
        eh.level + 1
    FROM employees e
    INNER JOIN emp_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT 
    employee_id,
    REPEAT('  ', level - 1) || first_name || ' ' || last_name AS employee_name,
    level
FROM emp_hierarchy
ORDER BY level, employee_id;

-- Generate Numbers 1 to 100
WITH RECURSIVE numbers AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1
    FROM numbers
    WHERE n < 100
)
SELECT n FROM numbers;

8. String Functions

8.1 String Manipulation

-- CONCAT - Concatenate Strings
SELECT 
    CONCAT(first_name, ' ', last_name) AS full_name
FROM employees;

-- CONCAT_WS - Concat With Separator
SELECT 
    CONCAT_WS(', ', last_name, first_name) AS name
FROM employees;

-- UPPER / LOWER - Case Conversion
SELECT 
    UPPER(first_name) AS upper_name,
    LOWER(last_name) AS lower_name
FROM employees;

-- LENGTH / CHAR_LENGTH - String Length
SELECT 
    first_name,
    LENGTH(first_name) AS byte_length,
    CHAR_LENGTH(first_name) AS char_length
FROM employees;

-- SUBSTRING - Extract Substring
SELECT 
    email,
    SUBSTRING(email, 1, POSITION('@' IN email) - 1) AS username,
    SUBSTRING(email, POSITION('@' IN email) + 1) AS domain
FROM employees;

-- LEFT / RIGHT - Extract from Left/Right
SELECT 
    phone,
    LEFT(phone, 3) AS area_code,
    RIGHT(phone, 4) AS last_four
FROM employees;

8.2 String Searching and Replacement

-- POSITION / LOCATE - Find Position
SELECT 
    email,
    POSITION('@' IN email) AS at_position
FROM employees;

-- REPLACE - Replace Substring
SELECT 
    email,
    REPLACE(email, '@company.com', '@newcompany.com') AS new_email
FROM employees;

-- TRIM / LTRIM / RTRIM - Remove Spaces
SELECT 
    TRIM('  Hello World  ') AS trimmed,
    LTRIM('  Hello World  ') AS left_trimmed,
    RTRIM('  Hello World  ') AS right_trimmed;

8.3 Pattern Matching

-- LIKE - Pattern Matching
SELECT * FROM employees
WHERE first_name LIKE 'J%';  -- Starts with J

-- REGEXP / RLIKE - Regular Expression (MySQL)
SELECT * FROM employees
WHERE email REGEXP '^[a-z]+@gmail\\.com$';

-- Find emails with numbers
SELECT * FROM employees
WHERE email REGEXP '[0-9]';

9. Date & Time Functions

9.1 Current Date/Time

-- Current Date and Time
SELECT 
    CURRENT_DATE AS today,
    CURRENT_TIME AS now_time,
    CURRENT_TIMESTAMP AS now_datetime,
    NOW() AS now_function;

9.2 Date Extraction

-- Extract Parts of Date
SELECT 
    hire_date,
    YEAR(hire_date) AS hire_year,
    MONTH(hire_date) AS hire_month,
    DAY(hire_date) AS hire_day,
    DAYNAME(hire_date) AS day_of_week,
    QUARTER(hire_date) AS quarter
FROM employees;

-- EXTRACT Function
SELECT 
    hire_date,
    EXTRACT(YEAR FROM hire_date) AS year,
    EXTRACT(MONTH FROM hire_date) AS month,
    EXTRACT(DAY FROM hire_date) AS day
FROM employees;

9.3 Date Arithmetic

-- Add/Subtract Dates
SELECT 
    order_date,
    DATE_ADD(order_date, INTERVAL 30 DAY) AS due_date,
    DATE_SUB(order_date, INTERVAL 1 WEEK) AS week_before
FROM orders;

-- DATEDIFF - Difference in Days
SELECT 
    order_date,
    ship_date,
    DATEDIFF(ship_date, order_date) AS days_to_ship
FROM orders;

-- Age Calculation
SELECT 
    first_name,
    hire_date,
    DATEDIFF(CURRENT_DATE, hire_date) / 365 AS years_employed
FROM employees;

9.4 Date Formatting

-- DATE_FORMAT (MySQL)
SELECT 
    hire_date,
    DATE_FORMAT(hire_date, '%Y-%m-%d') AS iso_format,
    DATE_FORMAT(hire_date, '%M %d, %Y') AS readable_format,
    DATE_FORMAT(hire_date, '%W, %M %d, %Y') AS full_format
FROM employees;

-- TO_CHAR (PostgreSQL)
SELECT 
    hire_date,
    TO_CHAR(hire_date, 'YYYY-MM-DD') AS iso_format,
    TO_CHAR(hire_date, 'Month DD, YYYY') AS readable_format
FROM employees;

10. CASE Statements & Conditional Logic

10.1 Simple CASE

-- Simple CASE Expression
SELECT 
    first_name,
    last_name,
    department_id,
    CASE department_id
        WHEN 10 THEN 'Administration'
        WHEN 20 THEN 'Marketing'
        WHEN 30 THEN 'Purchasing'
        WHEN 40 THEN 'Human Resources'
        ELSE 'Other'
    END AS department_name
FROM employees;

10.2 Searched CASE

-- Searched CASE Expression
SELECT 
    first_name,
    last_name,
    salary,
    CASE 
        WHEN salary < 50000 THEN 'Low'
        WHEN salary BETWEEN 50000 AND 80000 THEN 'Medium'
        WHEN salary > 80000 THEN 'High'
        ELSE 'Unknown'
    END AS salary_category
FROM employees;

-- Multiple Conditions
SELECT 
    first_name,
    last_name,
    salary,
    commission_pct,
    CASE 
        WHEN salary > 80000 AND commission_pct IS NOT NULL THEN 'Senior Sales'
        WHEN salary > 80000 THEN 'Senior'
        WHEN commission_pct IS NOT NULL THEN 'Sales'
        ELSE 'Regular'
    END AS employee_type
FROM employees;

10.3 CASE in Aggregations

-- Conditional Aggregation
SELECT 
    department_id,
    COUNT(*) AS total_employees,
    COUNT(CASE WHEN salary > 70000 THEN 1 END) AS high_earners,
    COUNT(CASE WHEN salary <= 70000 THEN 1 END) AS regular_earners,
    SUM(CASE WHEN commission_pct IS NOT NULL THEN salary ELSE 0 END) AS sales_total_salary
FROM employees
GROUP BY department_id;

-- Pivot-like Query
SELECT 
    YEAR(order_date) AS order_year,
    SUM(CASE WHEN MONTH(order_date) = 1 THEN total_amount ELSE 0 END) AS jan_sales,
    SUM(CASE WHEN MONTH(order_date) = 2 THEN total_amount ELSE 0 END) AS feb_sales,
    SUM(CASE WHEN MONTH(order_date) = 3 THEN total_amount ELSE 0 END) AS mar_sales,
    SUM(CASE WHEN MONTH(order_date) = 4 THEN total_amount ELSE 0 END) AS apr_sales
FROM orders
GROUP BY YEAR(order_date);

10.4 COALESCE and NULLIF

-- COALESCE - Return First Non-NULL Value
SELECT 
    first_name,
    last_name,
    COALESCE(commission_pct, 0) AS commission_pct,
    salary * (1 + COALESCE(commission_pct, 0)) AS total_compensation
FROM employees;

-- NULLIF - Return NULL if values are equal
SELECT 
    product_name,
    price,
    NULLIF(price, 0) AS safe_price  -- Returns NULL if price is 0
FROM products;

-- Avoid Division by Zero
SELECT 
    department_id,
    SUM(salary) / NULLIF(COUNT(*), 0) AS avg_salary
FROM employees
GROUP BY department_id;

11. Set Operations

11.1 UNION

-- UNION - Combines and removes duplicates
SELECT first_name, last_name FROM employees WHERE department_id = 10
UNION
SELECT first_name, last_name FROM employees WHERE department_id = 20;

-- UNION ALL - Combines and keeps duplicates (faster)
SELECT first_name, last_name FROM employees WHERE department_id = 10
UNION ALL
SELECT first_name, last_name FROM employees WHERE department_id = 20;

11.2 INTERSECT

-- INTERSECT - Returns common rows (MySQL 8.0.31+)
SELECT employee_id FROM employees WHERE salary > 70000
INTERSECT
SELECT employee_id FROM employees WHERE department_id = 10;

-- Alternative using INNER JOIN
SELECT DISTINCT e1.employee_id
FROM employees e1
INNER JOIN employees e2 ON e1.employee_id = e2.employee_id
WHERE e1.salary > 70000 AND e2.department_id = 10;

11.3 EXCEPT (MINUS)

-- EXCEPT - Returns rows from first query not in second (MySQL 8.0.31+)
SELECT employee_id FROM employees WHERE department_id = 10
EXCEPT
SELECT employee_id FROM employees WHERE salary > 70000;

-- Alternative using NOT IN
SELECT employee_id 
FROM employees 
WHERE department_id = 10
AND employee_id NOT IN (
    SELECT employee_id FROM employees WHERE salary > 70000
);

12. Indexes & Performance Optimization

12.1 Creating Indexes

-- Single Column Index
CREATE INDEX idx_lastname ON employees(last_name);

-- Composite Index (Multiple Columns)
CREATE INDEX idx_dept_salary ON employees(department_id, salary);

-- Unique Index
CREATE UNIQUE INDEX idx_email ON employees(email);

-- Full-Text Index (MySQL)
CREATE FULLTEXT INDEX idx_product_name ON products(product_name);

-- Partial Index (PostgreSQL)
CREATE INDEX idx_active_employees ON employees(employee_id)
WHERE status = 'active';

12.2 Viewing and Dropping Indexes

-- Show Indexes
SHOW INDEX FROM employees;

-- Drop Index
DROP INDEX idx_lastname ON employees;

-- Analyze Table
ANALYZE TABLE employees;

12.3 Query Optimization Tips

-- Use EXPLAIN to analyze query
EXPLAIN SELECT * FROM employees WHERE salary > 70000;

-- βœ… GOOD: Use indexed columns in WHERE
SELECT * FROM employees WHERE employee_id = 100;

-- ❌ BAD: Functions on indexed columns prevent index usage
SELECT * FROM employees WHERE YEAR(hire_date) = 2023;

-- βœ… GOOD: Rewrite without function
SELECT * FROM employees 
WHERE hire_date >= '2023-01-01' AND hire_date < '2024-01-01';

-- βœ… GOOD: Use EXISTS instead of IN for large subqueries
SELECT * FROM employees e
WHERE EXISTS (
    SELECT 1 FROM departments d 
    WHERE d.department_id = e.department_id AND d.location_id = 1700
);

-- βœ… GOOD: Use JOINs instead of subqueries when possible
SELECT e.*
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
WHERE d.location_id = 1700;

13. Constraints & Data Integrity

13.1 Primary Key

-- Primary Key at Creation
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name VARCHAR(100)
);

-- Composite Primary Key
CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    enrollment_date DATE,
    PRIMARY KEY (student_id, course_id)
);

-- Add Primary Key to Existing Table
ALTER TABLE students
ADD PRIMARY KEY (student_id);

13.2 Foreign Key

-- Foreign Key at Creation
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- Foreign Key with Actions
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

-- Add Foreign Key to Existing Table
ALTER TABLE orders
ADD FOREIGN KEY (customer_id) REFERENCES customers(customer_id);

13.3 Other Constraints

-- NOT NULL Constraint
CREATE TABLE employees (
    employee_id INT NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    email VARCHAR(100)
);

-- UNIQUE Constraint
CREATE TABLE employees (
    employee_id INT UNIQUE,
    email VARCHAR(100) UNIQUE
);

-- CHECK Constraint
CREATE TABLE employees (
    employee_id INT,
    salary DECIMAL(10, 2) CHECK (salary > 0),
    age INT CHECK (age >= 18 AND age <= 65)
);

-- DEFAULT Constraint
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    order_date DATE DEFAULT CURRENT_DATE,
    status VARCHAR(20) DEFAULT 'pending'
);

14. Transactions & ACID Properties

14.1 Transaction Basics

-- Start Transaction
START TRANSACTION;
-- OR
BEGIN;

-- Execute Queries
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;

-- Commit Transaction
COMMIT;

-- Or Rollback if Error
ROLLBACK;

14.2 Savepoints

START TRANSACTION;

INSERT INTO employees (employee_id, first_name) VALUES (100, 'John');

SAVEPOINT sp1;

INSERT INTO employees (employee_id, first_name) VALUES (101, 'Jane');

SAVEPOINT sp2;

INSERT INTO employees (employee_id, first_name) VALUES (102, 'Bob');

-- Rollback to sp2
ROLLBACK TO SAVEPOINT sp2;

-- Commit remaining changes
COMMIT;

14.3 Isolation Levels

-- Set Isolation Level
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Example
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT * FROM accounts WHERE account_id = 1;
-- ... more queries
COMMIT;

15. Advanced SQL Patterns

15.1 Pivot Tables

-- Pivot: Transform Rows to Columns
SELECT 
    customer_id,
    SUM(CASE WHEN MONTH(order_date) = 1 THEN total_amount ELSE 0 END) AS jan,
    SUM(CASE WHEN MONTH(order_date) = 2 THEN total_amount ELSE 0 END) AS feb,
    SUM(CASE WHEN MONTH(order_date) = 3 THEN total_amount ELSE 0 END) AS mar
FROM orders
GROUP BY customer_id;

15.2 Running Calculations

-- Calculate Running Difference
WITH numbered_orders AS (
    SELECT 
        order_id,
        order_date,
        total_amount,
        ROW_NUMBER() OVER (ORDER BY order_date) AS rn
    FROM orders
)
SELECT 
    o1.order_id,
    o1.total_amount,
    o2.total_amount AS prev_amount,
    o1.total_amount - COALESCE(o2.total_amount, 0) AS diff_from_prev
FROM numbered_orders o1
LEFT JOIN numbered_orders o2 ON o1.rn = o2.rn + 1;

15.3 Finding Gaps in Sequences

-- Find Missing IDs
SELECT 
    t1.employee_id + 1 AS missing_id_start,
    t2.employee_id - 1 AS missing_id_end
FROM employees t1
CROSS JOIN employees t2
WHERE t1.employee_id < t2.employee_id
AND NOT EXISTS (
    SELECT 1 FROM employees t3 
    WHERE t3.employee_id BETWEEN t1.employee_id + 1 AND t2.employee_id - 1
)
AND t2.employee_id - t1.employee_id > 1;

15.4 Hierarchical Queries

-- Find All Subordinates of a Manager
WITH RECURSIVE subordinates AS (
    -- Start with specific manager
    SELECT employee_id, first_name, last_name, manager_id, 1 AS level
    FROM employees
    WHERE employee_id = 100
    
    UNION ALL
    
    SELECT e.employee_id, e.first_name, e.last_name, e.manager_id, s.level + 1
    FROM employees e
    INNER JOIN subordinates s ON e.manager_id = s.employee_id
)
SELECT * FROM subordinates;

16. 30 MUST-DO Interview Questions

Q1: Find Nth Highest Salary

-- Method 1: Using LIMIT and OFFSET
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 2;  -- 3rd highest (N-1)

-- Method 2: Using Subquery
SELECT MAX(salary)
FROM employees
WHERE salary < (
    SELECT MAX(salary) FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
);

-- Method 3: Using DENSE_RANK
WITH ranked_salaries AS (
    SELECT 
        salary,
        DENSE_RANK() OVER (ORDER BY salary DESC) AS rank_num
    FROM employees
)
SELECT DISTINCT salary
FROM ranked_salaries
WHERE rank_num = 3;

Q2: Find Duplicate Records

-- Find Duplicate Emails
SELECT email, COUNT(*) AS count
FROM employees
GROUP BY email
HAVING COUNT(*) > 1;

-- Get All Duplicate Rows
SELECT e.*
FROM employees e
INNER JOIN (
    SELECT email
    FROM employees
    GROUP BY email
    HAVING COUNT(*) > 1
) duplicates ON e.email = duplicates.email;

Q3: Delete Duplicate Rows (Keep One)

-- Using ROW_NUMBER (MySQL 8.0+)
DELETE FROM employees
WHERE employee_id IN (
    SELECT employee_id FROM (
        SELECT 
            employee_id,
            ROW_NUMBER() OVER (PARTITION BY email ORDER BY employee_id) AS rn
        FROM employees
    ) t
    WHERE rn > 1
);

-- Alternative: Create New Table
CREATE TABLE employees_unique AS
SELECT DISTINCT * FROM employees;

DROP TABLE employees;
RENAME TABLE employees_unique TO employees;

Q4: Employees Earning More Than Their Managers

SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    e.salary AS emp_salary,
    m.first_name AS manager_name,
    m.salary AS manager_salary
FROM employees e
INNER JOIN employees m ON e.manager_id = m.employee_id
WHERE e.salary > m.salary;

Q5: Department with Highest Average Salary

SELECT 
    d.department_name,
    AVG(e.salary) AS avg_salary
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY avg_salary DESC
LIMIT 1;

-- Alternative: Using Subquery
SELECT department_name, avg_salary
FROM (
    SELECT 
        d.department_name,
        AVG(e.salary) AS avg_salary
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.department_id
    GROUP BY d.department_name
) dept_avg
WHERE avg_salary = (
    SELECT MAX(avg_salary) FROM (
        SELECT AVG(salary) AS avg_salary
        FROM employees
        GROUP BY department_id
    ) max_avg
);

Q6: Second Highest Salary

-- Method 1: Using LIMIT
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;

-- Method 2: Using MAX with Subquery
SELECT MAX(salary) AS second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

-- Method 3: Handle NULL case
SELECT 
    (SELECT DISTINCT salary 
     FROM employees 
     ORDER BY salary DESC 
     LIMIT 1 OFFSET 1) AS SecondHighestSalary;

Q7: Consecutive Numbers

-- Find numbers that appear at least 3 times consecutively
WITH numbered AS (
    SELECT 
        num,
        ROW_NUMBER() OVER (ORDER BY id) AS rn,
        ROW_NUMBER() OVER (PARTITION BY num ORDER BY id) AS grp
    FROM logs
)
SELECT DISTINCT num AS ConsecutiveNums
FROM numbered
GROUP BY num, rn - grp
HAVING COUNT(*) >= 3;

Q8: Customers Who Never Ordered

-- Using LEFT JOIN
SELECT c.customer_id, c.first_name, c.last_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

-- Using NOT IN
SELECT customer_id, first_name, last_name
FROM customers
WHERE customer_id NOT IN (SELECT DISTINCT customer_id FROM orders);

-- Using NOT EXISTS (Most Efficient)
SELECT c.customer_id, c.first_name, c.last_name
FROM customers c
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
);

Q9: Monthly Sales Growth

WITH monthly_sales AS (
    SELECT 
        DATE_FORMAT(order_date, '%Y-%m') AS month,
        SUM(total_amount) AS total_sales
    FROM orders
    GROUP BY DATE_FORMAT(order_date, '%Y-%m')
)
SELECT 
    month,
    total_sales,
    LAG(total_sales) OVER (ORDER BY month) AS prev_month_sales,
    total_sales - LAG(total_sales) OVER (ORDER BY month) AS growth,
    ROUND(
        (total_sales - LAG(total_sales) OVER (ORDER BY month)) / 
        LAG(total_sales) OVER (ORDER BY month) * 100, 2
    ) AS growth_percentage
FROM monthly_sales;

Q10: Top 3 Products by Sales in Each Category

WITH product_sales AS (
    SELECT 
        p.category,
        p.product_name,
        SUM(oi.quantity * oi.unit_price) AS total_sales,
        RANK() OVER (PARTITION BY p.category ORDER BY SUM(oi.quantity * oi.unit_price) DESC) AS sales_rank
    FROM products p
    INNER JOIN order_items oi ON p.product_id = oi.product_id
    GROUP BY p.category, p.product_name
)
SELECT category, product_name, total_sales
FROM product_sales
WHERE sales_rank <= 3;

Q11: Cumulative Sum

SELECT 
    order_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (ORDER BY order_date) AS cumulative_total
FROM orders;

Q12: Find Median

-- Using Window Functions
WITH numbered AS (
    SELECT 
        salary,
        ROW_NUMBER() OVER (ORDER BY salary) AS rn,
        COUNT(*) OVER () AS total_count
    FROM employees
)
SELECT AVG(salary) AS median_salary
FROM numbered
WHERE rn IN (FLOOR((total_count + 1) / 2), CEILING((total_count + 1) / 2));

Q13: Self Join - Find Pairs

-- Find all pairs of employees in same department
SELECT 
    e1.first_name AS employee1,
    e2.first_name AS employee2,
    e1.department_id
FROM employees e1
INNER JOIN employees e2 
    ON e1.department_id = e2.department_id
    AND e1.employee_id < e2.employee_id;  -- Avoid duplicates and self-pairing

Q14: Year Over Year Growth

WITH yearly_sales AS (
    SELECT 
        YEAR(order_date) AS year,
        SUM(total_amount) AS total_sales
    FROM orders
    GROUP BY YEAR(order_date)
)
SELECT 
    year,
    total_sales,
    LAG(total_sales) OVER (ORDER BY year) AS prev_year_sales,
    ROUND(
        ((total_sales - LAG(total_sales) OVER (ORDER BY year)) / 
        LAG(total_sales) OVER (ORDER BY year)) * 100, 2
    ) AS yoy_growth_pct
FROM yearly_sales;

Q15: Active Users (Consecutive Days)

-- Find users active for at least 3 consecutive days
WITH user_dates AS (
    SELECT DISTINCT 
        user_id,
        DATE(login_time) AS login_date
    FROM user_logins
),
date_groups AS (
    SELECT 
        user_id,
        login_date,
        DATE_SUB(login_date, INTERVAL ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) DAY) AS grp
    FROM user_dates
)
SELECT user_id
FROM date_groups
GROUP BY user_id, grp
HAVING COUNT(*) >= 3;

Q16: Swap Salary Values

-- Swap salaries of two specific employees
UPDATE employees
SET salary = CASE employee_id
    WHEN 1 THEN (SELECT salary FROM employees WHERE employee_id = 2)
    WHEN 2 THEN (SELECT salary FROM employees WHERE employee_id = 1)
    ELSE salary
END
WHERE employee_id IN (1, 2);

Q17: Tree Node Type

-- Classify each node as Leaf, Root, or Inner
SELECT 
    employee_id,
    CASE
        WHEN manager_id IS NULL THEN 'Root'
        WHEN employee_id IN (SELECT DISTINCT manager_id FROM employees WHERE manager_id IS NOT NULL) THEN 'Inner'
        ELSE 'Leaf'
    END AS node_type
FROM employees;

Q18: Department Highest Salary

-- Find employees with highest salary in each department
WITH dept_max_salary AS (
    SELECT 
        department_id,
        MAX(salary) AS max_salary
    FROM employees
    GROUP BY department_id
)
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    e.department_id,
    e.salary
FROM employees e
INNER JOIN dept_max_salary d 
    ON e.department_id = d.department_id 
    AND e.salary = d.max_salary;

Q19: Customers Who Bought Products A and B

-- Find customers who bought both Product A and Product B
SELECT o.customer_id
FROM orders o
INNER JOIN order_items oi ON o.order_id = oi.order_id
WHERE oi.product_id IN (1, 2)  -- Assuming 1=A, 2=B
GROUP BY o.customer_id
HAVING COUNT(DISTINCT oi.product_id) = 2;

Q20: First and Last Values

-- Get first and last order for each customer
SELECT 
    customer_id,
    MIN(order_date) AS first_order,
    MAX(order_date) AS last_order,
    DATEDIFF(MAX(order_date), MIN(order_date)) AS days_between
FROM orders
GROUP BY customer_id;

Q21: Rank Dense Rank Row Number Differences

SELECT 
    employee_id,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
    RANK() OVER (ORDER BY salary DESC) AS rank_num,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_num
FROM employees;

Q22: Percentage of Total

SELECT 
    department_id,
    SUM(salary) AS dept_total,
    ROUND(SUM(salary) * 100.0 / (SELECT SUM(salary) FROM employees), 2) AS pct_of_total
FROM employees
GROUP BY department_id;

Q23: Find All Dates in Range

-- Generate all dates in a range
WITH RECURSIVE date_range AS (
    SELECT DATE('2023-01-01') AS date
    UNION ALL
    SELECT DATE_ADD(date, INTERVAL 1 DAY)
    FROM date_range
    WHERE date < '2023-12-31'
)
SELECT date FROM date_range;

Q24: Employees Hired in Last N Months

-- Employees hired in last 6 months
SELECT *
FROM employees
WHERE hire_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH);

Q25: Salary Above Department Average

SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    e.salary,
    e.department_id,
    d.avg_salary
FROM employees e
INNER JOIN (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) d ON e.department_id = d.department_id
WHERE e.salary > d.avg_salary;

Q26: Trips and Users (Cancellation Rate)

-- Calculate cancellation rate by day
SELECT 
    request_date,
    ROUND(
        SUM(CASE WHEN status LIKE 'cancelled%' THEN 1 ELSE 0 END) * 1.0 / COUNT(*),
        2
    ) AS cancellation_rate
FROM trips
WHERE request_date BETWEEN '2023-10-01' AND '2023-10-03'
GROUP BY request_date;

Q27: Find Managers with at Least 5 Direct Reports

SELECT 
    m.employee_id,
    m.first_name,
    m.last_name,
    COUNT(e.employee_id) AS direct_reports
FROM employees m
INNER JOIN employees e ON m.employee_id = e.manager_id
GROUP BY m.employee_id, m.first_name, m.last_name
HAVING COUNT(e.employee_id) >= 5;

Q28: Average Salary Excluding Min and Max

SELECT 
    department_id,
    ROUND(
        (SUM(salary) - MAX(salary) - MIN(salary)) / 
        (COUNT(*) - 2),
        2
    ) AS avg_salary_excl_extremes
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 2;

Q29: Friend Requests Acceptance Rate

-- Calculate friend request acceptance rate
SELECT 
    ROUND(
        IFNULL(
            (SELECT COUNT(DISTINCT requester_id, accepter_id) FROM accepted) /
            (SELECT COUNT(DISTINCT sender_id, send_to_id) FROM requests),
            0
        ),
        2
    ) AS accept_rate;

Q30: Students and Examinations

-- Show all combinations of students and subjects with exam count
SELECT 
    s.student_id,
    s.student_name,
    sub.subject_name,
    COUNT(e.subject_name) AS attended_exams
FROM students s
CROSS JOIN subjects sub
LEFT JOIN examinations e 
    ON s.student_id = e.student_id 
    AND sub.subject_name = e.subject_name
GROUP BY s.student_id, s.student_name, sub.subject_name
ORDER BY s.student_id, sub.subject_name;

17. SQL Optimization Techniques

17.1 Query Optimization Checklist

-- βœ… Use SELECT with specific columns instead of SELECT *
SELECT employee_id, first_name, last_name FROM employees;

-- βœ… Use WHERE clause to filter early
SELECT * FROM employees WHERE department_id = 10;

-- βœ… Use indexes on frequently queried columns
CREATE INDEX idx_dept_id ON employees(department_id);

-- βœ… Avoid functions on indexed columns in WHERE
-- ❌ BAD
SELECT * FROM employees WHERE YEAR(hire_date) = 2023;
-- βœ… GOOD
SELECT * FROM employees WHERE hire_date >= '2023-01-01' AND hire_date < '2024-01-01';

-- βœ… Use EXISTS instead of IN for subqueries
-- Better performance with large datasets
SELECT * FROM employees e
WHERE EXISTS (SELECT 1 FROM departments d WHERE d.department_id = e.department_id);

-- βœ… Use LIMIT to restrict result set
SELECT * FROM employees LIMIT 100;

-- βœ… Avoid SELECT DISTINCT when not needed
-- Use GROUP BY if aggregation is needed

-- βœ… Use UNION ALL instead of UNION when duplicates don't matter
-- UNION ALL is faster as it doesn't remove duplicates

-- βœ… Use appropriate JOIN types
-- INNER JOIN is faster than OUTER JOIN

17.2 Index Strategy

-- Composite index for multiple columns often used together
CREATE INDEX idx_dept_sal ON employees(department_id, salary);

-- This query will use the index
SELECT * FROM employees WHERE department_id = 10 AND salary > 70000;

-- Covering index (includes all columns in SELECT)
CREATE INDEX idx_covering ON employees(department_id, salary, first_name);

-- Query fully satisfied by index (no table access needed)
SELECT department_id, salary, first_name 
FROM employees 
WHERE department_id = 10;

18. Real-World Scenarios

Scenario 1: E-commerce Analytics

-- Top 5 customers by revenue
SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    SUM(o.total_amount) AS total_revenue,
    COUNT(o.order_id) AS order_count,
    AVG(o.total_amount) AS avg_order_value
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_revenue DESC
LIMIT 5;

-- Customer Lifetime Value
WITH customer_metrics AS (
    SELECT 
        customer_id,
        MIN(order_date) AS first_order,
        MAX(order_date) AS last_order,
        COUNT(order_id) AS order_count,
        SUM(total_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
)
SELECT 
    customer_id,
    order_count,
    total_spent,
    DATEDIFF(last_order, first_order) AS customer_lifespan_days,
    total_spent / NULLIF(order_count, 0) AS avg_order_value
FROM customer_metrics;

Scenario 2: User Retention Analysis

-- Monthly Active Users (MAU)
SELECT 
    DATE_FORMAT(activity_date, '%Y-%m') AS month,
    COUNT(DISTINCT user_id) AS monthly_active_users
FROM user_activity
GROUP BY DATE_FORMAT(activity_date, '%Y-%m');

-- User Retention Rate (% of users active in consecutive months)
WITH monthly_users AS (
    SELECT 
        DATE_FORMAT(activity_date, '%Y-%m') AS month,
        user_id
    FROM user_activity
    GROUP BY DATE_FORMAT(activity_date, '%Y-%m'), user_id
)
SELECT 
    m1.month AS current_month,
    COUNT(DISTINCT m1.user_id) AS current_users,
    COUNT(DISTINCT m2.user_id) AS retained_users,
    ROUND(COUNT(DISTINCT m2.user_id) * 100.0 / COUNT(DISTINCT m1.user_id), 2) AS retention_rate
FROM monthly_users m1
LEFT JOIN monthly_users m2 
    ON m1.user_id = m2.user_id
    AND m2.month = DATE_FORMAT(DATE_ADD(STR_TO_DATE(CONCAT(m1.month, '-01'), '%Y-%m-%d'), INTERVAL 1 MONTH), '%Y-%m')
GROUP BY m1.month;

Scenario 3: Inventory Management

-- Products Running Low on Stock
SELECT 
    product_id,
    product_name,
    stock_quantity,
    CASE 
        WHEN stock_quantity = 0 THEN 'Out of Stock'
        WHEN stock_quantity < 10 THEN 'Critical'
        WHEN stock_quantity < 50 THEN 'Low'
        ELSE 'Adequate'
    END AS stock_status
FROM products
WHERE stock_quantity < 50
ORDER BY stock_quantity;

-- Reorder Report with Sales Velocity
WITH sales_velocity AS (
    SELECT 
        oi.product_id,
        COUNT(DISTINCT o.order_id) AS orders_last_30_days,
        SUM(oi.quantity) AS units_sold_30_days,
        SUM(oi.quantity) / 30.0 AS avg_daily_sales
    FROM orders o
    INNER JOIN order_items oi ON o.order_id = oi.order_id
    WHERE o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY)
    GROUP BY oi.product_id
)
SELECT 
    p.product_id,
    p.product_name,
    p.stock_quantity,
    sv.avg_daily_sales,
    ROUND(p.stock_quantity / NULLIF(sv.avg_daily_sales, 0), 1) AS days_of_inventory,
    CASE 
        WHEN p.stock_quantity / NULLIF(sv.avg_daily_sales, 0) < 7 THEN 'Urgent Reorder'
        WHEN p.stock_quantity / NULLIF(sv.avg_daily_sales, 0) < 14 THEN 'Reorder Soon'
        ELSE 'OK'
    END AS reorder_status
FROM products p
LEFT JOIN sales_velocity sv ON p.product_id = sv.product_id
ORDER BY days_of_inventory;

Quick Reference: SQL Command Categories

DDL (Data Definition Language)

  • CREATE - Create database objects
  • ALTER - Modify structure
  • DROP - Delete objects
  • TRUNCATE - Remove all rows

DML (Data Manipulation Language)

  • INSERT - Add rows
  • UPDATE - Modify rows
  • DELETE - Remove rows
  • MERGE - Insert/Update/Delete

DQL (Data Query Language)

  • SELECT - Query data

DCL (Data Control Language)

  • GRANT - Give permissions
  • REVOKE - Remove permissions

TCL (Transaction Control Language)

  • COMMIT - Save changes
  • ROLLBACK - Undo changes
  • SAVEPOINT - Set transaction point

Common SQL Interview Questions (Conceptual)

Q: What is the difference between WHERE and HAVING?

  • WHERE filters rows before grouping
  • HAVING filters groups after aggregation

Q: What is the difference between INNER JOIN and OUTER JOIN?

  • INNER JOIN returns only matching rows
  • OUTER JOIN returns matching rows plus unmatched rows from one or both tables

Q: What is a Primary Key vs Foreign Key?

  • Primary Key uniquely identifies each row
  • Foreign Key references Primary Key in another table

Q: What are ACID properties?

  • Atomicity: All or nothing
  • Consistency: Valid state always
  • Isolation: Transactions don't interfere
  • Durability: Committed data persists

Q: What is normalization? Process of organizing data to reduce redundancy:

  • 1NF: Atomic values, no repeating groups
  • 2NF: 1NF + no partial dependencies
  • 3NF: 2NF + no transitive dependencies

Q: Clustered vs Non-Clustered Index?

  • Clustered: Physically orders data, one per table
  • Non-Clustered: Separate structure, multiple per table

Best Practices

  1. βœ… Use meaningful table and column names
  2. βœ… Always use WHERE clause to limit results
  3. βœ… Use appropriate data types
  4. βœ… Create indexes on frequently queried columns
  5. βœ… Use transactions for related operations
  6. βœ… Avoid SELECT * in production code
  7. βœ… Use prepared statements to prevent SQL injection
  8. βœ… Comment complex queries
  9. βœ… Test queries on small datasets first
  10. βœ… Monitor query performance with EXPLAIN

Conclusion

This guide covers essential SQL concepts and interview questions. Practice these queries regularly to build strong SQL skills. Remember:

  • Master the basics before advanced topics
  • Understand JOINs thoroughly
  • Practice window functions - they're heavily tested
  • Learn to optimize queries
  • Work through real-world scenarios

Good luck with your SQL interviews! πŸš€


Practice makes perfect. Keep querying!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment