MySQL Mastery Guide

Complete guide from beginner to advanced

Understanding MySQL: The World's Most Popular Open-Source Database

MySQL is an open-source relational database management system (RDBMS) that uses Structured Query Language (SQL). Originally created by MySQL AB in 1995 and now owned by Oracle Corporation, MySQL has become the world's most popular open-source database, powering millions of web applications worldwide.

MySQL is known for its reliability, performance, and ease of use. It's a core component of the LAMP stack (Linux, Apache, MySQL, PHP/Python/Perl) and powers major platforms such as Facebook, Twitter, YouTube, and WordPress, handling massive amounts of data.

Did you know? The name "MySQL" combines "My", the name of co-founder Michael Widenius's daughter, with "SQL", for Structured Query Language. MySQL's dolphin logo is named "Sakila", chosen from a list of names suggested by users in a "Name the Dolphin" contest.

1. MySQL Basics & Fundamentals

Getting Started with MySQL

MySQL can be installed on various operating systems and accessed through command-line tools, graphical interfaces, or programming languages. Understanding the basic commands is essential for database management.
-- Connect to MySQL server
mysql -u root -p

-- Show available databases
SHOW DATABASES;

-- Create a new database
CREATE DATABASE company;

-- Use a specific database
USE company;

-- Show tables in current database
SHOW TABLES;

-- Show current user and database
SELECT USER(), DATABASE();

-- Get MySQL version
SELECT VERSION();

-- Basic SELECT statement
SELECT 'Hello, MySQL!' AS greeting;

-- Working with variables
SET @name = 'John Doe';
SELECT @name;

-- Mathematical operations
SELECT 10 + 5 AS addition,
       10 - 5 AS subtraction,
       10 * 5 AS multiplication,
       10 / 3 AS division,
       10 % 3 AS modulus;

-- String operations
SELECT CONCAT('Hello', ' ', 'World') AS greeting,
       UPPER('hello') AS upper_case,
       LOWER('HELLO') AS lower_case,
       LENGTH('MySQL') AS string_length,
       SUBSTRING('MySQL', 1, 3) AS substring;

-- Date and time functions
SELECT NOW() AS current_datetime,
       CURDATE() AS `current_date`,   -- CURRENT_DATE is a reserved word, so the alias needs backticks
       CURTIME() AS `current_time`,   -- likewise CURRENT_TIME
       DAYNAME(NOW()) AS day_name,
       MONTHNAME(NOW()) AS month_name;

Basic CRUD Operations

CRUD (Create, Read, Update, Delete) operations form the foundation of database interactions. These operations allow you to manage data in your database tables.
-- Create a sample table
CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    salary DECIMAL(10,2),
    hire_date DATE,
    department VARCHAR(50)
);

-- INSERT - Create new records
INSERT INTO employees (first_name, last_name, email, salary, hire_date, department)
VALUES 
('John', 'Doe', 'john.doe@company.com', 50000.00, '2020-01-15', 'Engineering'),
('Jane', 'Smith', 'jane.smith@company.com', 60000.00, '2019-03-20', 'Marketing'),
('Bob', 'Johnson', 'bob.johnson@company.com', 55000.00, '2021-06-10', 'Engineering'),
('Alice', 'Brown', 'alice.brown@company.com', 52000.00, '2022-02-28', 'HR');

-- SELECT - Read data (various ways)
SELECT * FROM employees;  -- All columns, all rows

SELECT first_name, last_name, salary 
FROM employees;  -- Specific columns

SELECT first_name, last_name, salary
FROM employees
WHERE department = 'Engineering';  -- With condition

SELECT first_name, last_name, salary
FROM employees
WHERE salary > 55000
ORDER BY salary DESC;  -- With ordering

SELECT department, COUNT(*) as employee_count
FROM employees
GROUP BY department;  -- Aggregation

-- UPDATE - Modify existing records
UPDATE employees 
SET salary = salary * 1.10  -- 10% raise
WHERE department = 'Engineering';

UPDATE employees
SET email = 'john.newemail@company.com'
WHERE id = 1;

-- DELETE - Remove records
DELETE FROM employees 
WHERE id = 4;  -- Delete specific employee

-- Safe DELETE with SELECT first
SELECT * FROM employees WHERE last_name = 'Johnson';  -- Check what will be deleted
DELETE FROM employees WHERE last_name = 'Johnson';

-- TRUNCATE - Remove all records (faster than DELETE)
TRUNCATE TABLE employees;  -- Resets auto_increment counter

-- DROP - Remove entire table
DROP TABLE employees;
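One practical difference worth noting: with the InnoDB storage engine, DELETE is transactional and can be rolled back, while TRUNCATE performs an implicit commit and cannot. A minimal sketch (assuming the employees table and data above still exist):
-- DELETE can be undone inside a transaction
START TRANSACTION;
DELETE FROM employees WHERE department = 'HR';
ROLLBACK;  -- the deleted HR rows are restored

-- TRUNCATE implicitly commits; a subsequent ROLLBACK cannot bring the rows back
TRUNCATE TABLE employees;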

Working with Multiple Tables

Real-world databases typically consist of multiple related tables. Understanding how to work with and relate these tables is crucial for effective database design.
-- Create related tables for a simple e-commerce system
CREATE TABLE customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    phone VARCHAR(20),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE products (
    product_id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    description TEXT,
    price DECIMAL(10,2) NOT NULL,
    stock_quantity INT DEFAULT 0,
    category VARCHAR(50),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT,
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    total_amount DECIMAL(10,2),
    status ENUM('pending', 'processing', 'shipped', 'delivered', 'cancelled') DEFAULT 'pending',
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

CREATE TABLE order_items (
    order_item_id INT AUTO_INCREMENT PRIMARY KEY,
    order_id INT,
    product_id INT,
    quantity INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (order_id) REFERENCES orders(order_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

-- Insert sample data
INSERT INTO customers (first_name, last_name, email, phone) VALUES
('Alice', 'Johnson', 'alice.johnson@email.com', '555-0101'),
('Bob', 'Williams', 'bob.williams@email.com', '555-0102'),
('Carol', 'Davis', 'carol.davis@email.com', '555-0103');

INSERT INTO products (product_name, description, price, stock_quantity, category) VALUES
('Laptop', 'High-performance laptop', 999.99, 50, 'Electronics'),
('Mouse', 'Wireless computer mouse', 29.99, 100, 'Electronics'),
('Keyboard', 'Mechanical keyboard', 79.99, 75, 'Electronics'),
('Book', 'Programming guide', 39.99, 200, 'Books');

INSERT INTO orders (customer_id, total_amount, status) VALUES
(1, 1029.98, 'delivered'),
(2, 79.99, 'processing'),
(1, 29.99, 'shipped');

INSERT INTO order_items (order_id, product_id, quantity, unit_price) VALUES
(1, 1, 1, 999.99),  -- Laptop
(1, 2, 1, 29.99),   -- Mouse
(2, 3, 1, 79.99),   -- Keyboard
(3, 2, 1, 29.99);   -- Mouse

-- Query across multiple tables
SELECT 
    c.first_name,
    c.last_name,
    o.order_id,
    o.order_date,
    o.total_amount
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;

-- Complex query with multiple joins
SELECT 
    c.first_name,
    c.last_name,
    o.order_id,
    p.product_name,
    oi.quantity,
    oi.unit_price,
    (oi.quantity * oi.unit_price) as item_total
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE o.status = 'delivered'
ORDER BY o.order_date DESC;

2. Data Types & Operators

Numeric Data Types

MySQL provides various numeric data types to store different kinds of numbers with appropriate precision and storage requirements.
-- Integer types
CREATE TABLE integer_types (
    -- Signed integers (can be negative)
    tiny_int TINYINT,           -- 1 byte: -128 to 127
    small_int SMALLINT,         -- 2 bytes: -32768 to 32767
    medium_int MEDIUMINT,       -- 3 bytes: -8388608 to 8388607
    regular_int INT,            -- 4 bytes: -2147483648 to 2147483647
    big_int BIGINT,             -- 8 bytes: -2^63 to 2^63-1
    
    -- Unsigned integers (only positive)
    tiny_int_unsigned TINYINT UNSIGNED,    -- 0 to 255
    small_int_unsigned SMALLINT UNSIGNED,  -- 0 to 65535
    int_unsigned INT UNSIGNED,             -- 0 to 4294967295
    big_int_unsigned BIGINT UNSIGNED,      -- 0 to 2^64-1
    
    -- Boolean (alias for TINYINT(1))
    is_active BOOLEAN,          -- 0 or 1
    is_verified TINYINT(1)      -- Same as BOOLEAN
);

-- Decimal types for exact precision
CREATE TABLE decimal_types (
    -- DECIMAL(precision, scale)
    -- precision: total digits, scale: digits after decimal
    price DECIMAL(10, 2),       -- 10 digits total, 2 after decimal: 12345678.99
    tax_rate DECIMAL(5, 4),     -- 5 digits total, 4 after decimal: 0.0825
    percentage DECIMAL(3, 0),   -- 3 digits, no decimal: 100
    
    -- Floating point (approximate)
    float_num FLOAT,            -- 4 bytes, single precision
    double_num DOUBLE,          -- 8 bytes, double precision
    float_precise FLOAT(10, 2)  -- with specified precision (syntax deprecated as of MySQL 8.0.17)
);

-- Insert examples
INSERT INTO integer_types VALUES 
(127, 32767, 8388607, 2147483647, 9223372036854775807,
255, 65535, 4294967295, 18446744073709551615, 1, 1);

INSERT INTO decimal_types VALUES
(12345.67, 0.0825, 100, 123.456, 12345.6789, 1234.56);

-- Numeric functions and operations
SELECT 
    10 + 5 AS addition,
    10 - 5 AS subtraction,
    10 * 5 AS multiplication,
    10 / 3 AS division,
    10 % 3 AS modulus,
    POW(2, 3) AS power,
    SQRT(25) AS square_root,
    ROUND(15.756, 2) AS rounded,
    FLOOR(15.756) AS floor_value,
    CEILING(15.756) AS ceiling_value,
    ABS(-15) AS absolute_value,
    RAND() AS random_number;
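The "exact vs. approximate" distinction above has visible consequences. Decimal literals in MySQL use exact-value arithmetic, while DOUBLE uses binary floating point, which is why DECIMAL is preferred for money:
SELECT 0.1 + 0.2 = 0.3 AS exact_decimal,  -- true: decimal literals are exact
       CAST(0.1 AS DOUBLE) + CAST(0.2 AS DOUBLE) = 0.3 AS approx_double;  -- false: rounding error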

String and Text Data Types

MySQL offers various string data types for storing text data, each with different storage characteristics and performance implications.
-- String and text types
CREATE TABLE string_types (
    -- Fixed-length strings (padded with spaces)
    char_field CHAR(10),        -- Always 10 characters, padded
    
    -- Variable-length strings
    varchar_field VARCHAR(255), -- Up to 255 characters
    
    -- Text types for larger content
    tiny_text TINYTEXT,         -- 255 bytes
    regular_text TEXT,          -- 65,535 bytes
    medium_text MEDIUMTEXT,     -- 16,777,215 bytes
    long_text LONGTEXT,         -- 4,294,967,295 bytes
    
    -- Binary data types
    binary_field BINARY(10),    -- Fixed-length binary
    varbinary_field VARBINARY(255), -- Variable-length binary
    blob_field BLOB,            -- Binary large object
    
    -- Enum and Set types
    status ENUM('active', 'inactive', 'pending'), -- One value from list
    tags SET('featured', 'new', 'sale', 'popular') -- Multiple values from list
);

-- Insert examples
INSERT INTO string_types VALUES 
('Hello', 'World', 'Short text', 'Regular text content', 
'Medium text content here', 'Very long text content...',
UNHEX('48656C6C6F'), UNHEX('576F726C64'), UNHEX('42696E617279'),
'active', 'featured,new');

-- String functions and operations
SELECT 
    -- Basic string functions
    CONCAT('Hello', ' ', 'World') AS concatenation,
    CONCAT_WS(', ', 'John', 'Doe', 'CEO') AS concat_with_separator,
    LENGTH('MySQL') AS length,
    CHAR_LENGTH('MySQL') AS char_length,
    
    -- Case functions
    UPPER('hello world') AS uppercase,
    LOWER('HELLO WORLD') AS lowercase,
    
    -- Substring and position
    SUBSTRING('Hello World', 1, 5) AS substring,
    LEFT('Hello World', 5) AS left_part,
    RIGHT('Hello World', 5) AS right_part,
    LOCATE('World', 'Hello World') AS position,
    
    -- Trim and pad
    TRIM('   Hello   ') AS trimmed,
    LTRIM('   Hello') AS left_trimmed,
    RTRIM('Hello   ') AS right_trimmed,
    LPAD('Hello', 10, '*') AS left_padded,
    RPAD('Hello', 10, '*') AS right_padded,
    
    -- Replace and insert
    REPLACE('Hello World', 'World', 'MySQL') AS replaced,
    INSERT('Hello World', 7, 5, 'MySQL') AS inserted,
    
    -- Formatting
    REVERSE('Hello') AS reversed,
    REPEAT('Ha', 3) AS repeated;

-- Pattern matching with LIKE
SELECT 
    first_name,
    last_name
FROM employees
WHERE first_name LIKE 'J%';  -- Starts with J

SELECT 
    product_name
FROM products
WHERE product_name LIKE '%computer%';  -- Contains 'computer'

SELECT 
    email
FROM customers
WHERE email LIKE '%@gmail.com';  -- Ends with @gmail.com
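For patterns that LIKE cannot express, MySQL also supports regular expressions via REGEXP (a sketch, using the same tables as the examples above):
SELECT first_name, last_name
FROM employees
WHERE first_name REGEXP '^[JB]';  -- starts with J or B

SELECT email
FROM customers
WHERE email REGEXP '@(gmail|yahoo)\\.com$';  -- ends with either domain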

Date, Time, and JSON Data Types

MySQL provides comprehensive support for date/time operations and modern JSON data type for semi-structured data.
-- Date and time types
CREATE TABLE datetime_types (
    -- Date only
    birth_date DATE,
    
    -- Time only
    meeting_time TIME,
    
    -- Date and time combinations
    created_at DATETIME,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
    -- Year
    birth_year YEAR,
    
    -- JSON data type (MySQL 5.7+)
    user_preferences JSON,
    metadata JSON
);

-- Insert examples
INSERT INTO datetime_types VALUES 
('1990-05-15', '14:30:00', '2023-01-15 10:30:00', 
CURRENT_TIMESTAMP, 1990, 
'{"theme": "dark", "notifications": true, "language": "en"}',
'{"created_by": "system", "version": 1}');

-- Date and time functions
SELECT 
    -- Current date/time
    NOW() AS current_datetime,
    CURDATE() AS `current_date`,  -- reserved word, so the alias needs backticks
    CURTIME() AS `current_time`,
    CURRENT_TIMESTAMP AS current_ts,
    
    -- Date parts
    YEAR(NOW()) AS current_year,
    MONTH(NOW()) AS current_month,
    DAY(NOW()) AS current_day,
    HOUR(NOW()) AS current_hour,
    MINUTE(NOW()) AS current_minute,
    SECOND(NOW()) AS current_second,
    
    -- Date names
    DAYNAME(NOW()) AS day_name,
    MONTHNAME(NOW()) AS month_name,
    
    -- Date arithmetic
    DATE_ADD(NOW(), INTERVAL 7 DAY) AS next_week,
    DATE_SUB(NOW(), INTERVAL 1 MONTH) AS last_month,
    DATEDIFF('2023-12-31', '2023-01-01') AS days_diff,
    
    -- Formatting
    DATE_FORMAT(NOW(), '%W, %M %e, %Y') AS formatted_date,
    TIME_FORMAT(NOW(), '%h:%i %p') AS formatted_time;

-- JSON functions (MySQL 5.7+)
SELECT 
    -- Create JSON
    JSON_OBJECT('name', 'John', 'age', 30) AS json_object,
    JSON_ARRAY('apple', 'banana', 'orange') AS json_array,
    
    -- Extract from JSON
    JSON_EXTRACT('{"name": "John", "age": 30}', '$.name') AS extracted_name,
    user_preferences->>'$.theme' AS theme,  -- ->> is shorthand for JSON_UNQUOTE(JSON_EXTRACT(...))
    
    -- Modify JSON
    JSON_SET('{"name": "John"}', '$.age', 30) AS json_set,
    JSON_INSERT('{"name": "John"}', '$.city', 'NYC') AS json_insert,
    JSON_REMOVE('{"name": "John", "age": 30}', '$.age') AS json_remove,
    
    -- Search in JSON
    JSON_SEARCH(user_preferences, 'one', 'dark') AS json_search,
    JSON_CONTAINS(user_preferences, '"dark"', '$.theme') AS json_contains,
    
    -- JSON keys and values
    JSON_KEYS(user_preferences) AS json_keys,
    JSON_LENGTH(user_preferences) AS json_length
FROM datetime_types;

-- Working with JSON data
UPDATE datetime_types 
SET user_preferences = JSON_SET(
    user_preferences,
    '$.notifications', false,
    '$.font_size', 'medium'
)
WHERE JSON_EXTRACT(user_preferences, '$.theme') = 'dark';

-- Query JSON data
SELECT *
FROM datetime_types
WHERE JSON_EXTRACT(user_preferences, '$.notifications') = true;
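JSON paths cannot be indexed directly. A common workaround (assuming MySQL 5.7+) is a generated column that extracts the value, with an ordinary index on it:
ALTER TABLE datetime_types
ADD COLUMN theme VARCHAR(20)
    GENERATED ALWAYS AS (user_preferences->>'$.theme') STORED,
ADD INDEX idx_theme (theme);

-- This query can now use idx_theme instead of scanning every JSON document
SELECT * FROM datetime_types WHERE theme = 'dark';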

3. SQL Queries & Joins

Advanced SELECT Queries

Mastering SELECT statements with WHERE clauses, aggregation, grouping, and sorting is essential for effective data retrieval.
-- Sample data setup (drop the earlier employees table if it still exists)
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE,
    manager_id INT
);

INSERT INTO employees VALUES
(1, 'John', 'Doe', 'Engineering', 75000, '2020-01-15', NULL),
(2, 'Jane', 'Smith', 'Engineering', 80000, '2019-03-20', 1),
(3, 'Bob', 'Johnson', 'Marketing', 65000, '2021-06-10', 1),
(4, 'Alice', 'Brown', 'HR', 60000, '2022-02-28', 1),
(5, 'Charlie', 'Wilson', 'Engineering', 90000, '2018-11-05', NULL),
(6, 'Diana', 'Lee', 'Marketing', 70000, '2020-07-12', 5),
(7, 'Edward', 'Davis', 'Sales', 55000, '2023-01-08', 5),
(8, 'Fiona', 'Garcia', 'Engineering', 85000, '2019-09-15', 1);

-- WHERE clause with various operators
SELECT * FROM employees WHERE salary > 70000;
SELECT * FROM employees WHERE department IN ('Engineering', 'Marketing');
SELECT * FROM employees WHERE hire_date BETWEEN '2020-01-01' AND '2021-12-31';
SELECT * FROM employees WHERE last_name LIKE 'D%';  -- Starts with D
SELECT * FROM employees WHERE manager_id IS NULL;

-- Combining conditions
SELECT * FROM employees 
WHERE department = 'Engineering' 
  AND salary > 80000 
  AND hire_date > '2019-01-01';

SELECT * FROM employees 
WHERE (department = 'Engineering' OR department = 'Marketing')
  AND salary BETWEEN 60000 AND 80000;
-- Note: AND binds tighter than OR, so without the parentheses this would
-- match all Engineering rows plus only Marketing rows in the salary range

-- ORDER BY with multiple columns
SELECT first_name, last_name, salary, department
FROM employees
ORDER BY department ASC, salary DESC;

-- LIMIT and OFFSET for pagination
SELECT first_name, last_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;  -- Top 5 highest paid

SELECT first_name, last_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5 OFFSET 5;  -- Next 5 (page 2)
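LIMIT/OFFSET pagination slows down as the offset grows, because MySQL still reads and discards all the skipped rows. A common alternative is keyset (seek) pagination, which continues after the last row seen. A sketch, assuming the employees table above, with the tuple values taken from the last row of the previous page:
SELECT emp_id, first_name, last_name, salary
FROM employees
WHERE (salary, emp_id) < (75000, 1)  -- last (salary, emp_id) from page 1
ORDER BY salary DESC, emp_id DESC
LIMIT 5;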

-- DISTINCT for unique values
SELECT DISTINCT department FROM employees;

-- Aggregation functions
SELECT 
    COUNT(*) AS total_employees,
    COUNT(DISTINCT department) AS total_departments,
    AVG(salary) AS average_salary,
    MAX(salary) AS max_salary,
    MIN(salary) AS min_salary,
    SUM(salary) AS total_salary_budget
FROM employees;

-- GROUP BY with HAVING
SELECT 
    department,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 65000  -- Filter groups
ORDER BY avg_salary DESC;

-- CASE statements for conditional logic
SELECT 
    first_name,
    last_name,
    salary,
    CASE 
        WHEN salary > 80000 THEN 'High'
        WHEN salary BETWEEN 60000 AND 80000 THEN 'Medium'
        ELSE 'Low'
    END AS salary_grade,
    CASE department
        WHEN 'Engineering' THEN 'Tech'
        WHEN 'Marketing' THEN 'Business'
        ELSE 'Other'
    END AS department_category
FROM employees
ORDER BY salary DESC;

JOIN Operations

JOINs are fundamental for combining data from multiple tables. Understanding different JOIN types is crucial for complex queries.
-- Create related tables
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL,
    location VARCHAR(100)
);

DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    dept_id INT,
    salary DECIMAL(10,2),
    hire_date DATE,
    FOREIGN KEY (dept_id) REFERENCES departments(dept_id)
);

CREATE TABLE projects (
    project_id INT PRIMARY KEY,
    project_name VARCHAR(100),
    dept_id INT,
    budget DECIMAL(12,2),
    start_date DATE,
    FOREIGN KEY (dept_id) REFERENCES departments(dept_id)
);

-- Insert sample data
INSERT INTO departments VALUES
(1, 'Engineering', 'New York'),
(2, 'Marketing', 'Chicago'),
(3, 'Sales', 'Los Angeles'),
(4, 'HR', 'Boston');

INSERT INTO employees VALUES
(1, 'John', 'Doe', 1, 75000, '2020-01-15'),
(2, 'Jane', 'Smith', 1, 80000, '2019-03-20'),
(3, 'Bob', 'Johnson', 2, 65000, '2021-06-10'),
(4, 'Alice', 'Brown', 2, 60000, '2022-02-28'),
(5, 'Charlie', 'Wilson', NULL, 90000, '2018-11-05'),
(6, 'Diana', 'Lee', 3, 70000, '2020-07-12');

INSERT INTO projects VALUES
(1, 'Website Redesign', 1, 50000, '2023-01-01'),
(2, 'Social Media Campaign', 2, 30000, '2023-02-15'),
(3, 'Product Launch', 1, 75000, '2023-03-01'),
(4, 'Market Research', 2, 25000, '2023-01-20'),
(5, 'Sales Training', 3, 15000, '2023-02-01');

-- INNER JOIN (most common)
SELECT 
    e.first_name,
    e.last_name,
    d.dept_name,
    d.location
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- LEFT JOIN (all employees, even without department)
SELECT 
    e.first_name,
    e.last_name,
    d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- RIGHT JOIN (all departments, even without employees)
SELECT 
    e.first_name,
    e.last_name,
    d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;

-- FULL OUTER JOIN (simulated in MySQL)
SELECT 
    e.first_name,
    e.last_name,
    d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
UNION
SELECT 
    e.first_name,
    e.last_name,
    d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;
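Note that UNION also removes duplicate rows between the two halves. A common variant keeps the deduplication explicit by using UNION ALL and restricting the second half to the rows the first half missed:
SELECT e.first_name, e.last_name, d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
UNION ALL
SELECT e.first_name, e.last_name, d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id
WHERE e.emp_id IS NULL;  -- only departments with no matching employee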

-- CROSS JOIN (Cartesian product)
SELECT 
    e.first_name,
    d.dept_name
FROM employees e
CROSS JOIN departments d;

-- SELF JOIN (employees and their managers example)
ALTER TABLE employees ADD COLUMN manager_id INT;
UPDATE employees SET manager_id = 1 WHERE emp_id IN (2,3,4);
UPDATE employees SET manager_id = 5 WHERE emp_id = 6;

SELECT 
    e.first_name AS employee_name,
    e.last_name AS employee_last_name,
    m.first_name AS manager_name,
    m.last_name AS manager_last_name
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.emp_id;

-- Multiple table JOIN
SELECT 
    e.first_name,
    e.last_name,
    d.dept_name,
    p.project_name,
    p.budget
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
INNER JOIN projects p ON d.dept_id = p.dept_id
ORDER BY e.last_name, p.project_name;

-- Complex JOIN with aggregation
SELECT 
    d.dept_name,
    COUNT(e.emp_id) AS employee_count,
    COUNT(p.project_id) AS project_count,
    AVG(e.salary) AS avg_salary,
    SUM(p.budget) AS total_budget
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id
LEFT JOIN projects p ON d.dept_id = p.dept_id
GROUP BY d.dept_id, d.dept_name
ORDER BY total_budget DESC;

-- JOIN with subquery
SELECT 
    e.first_name,
    e.last_name,
    e.salary,
    dept_stats.avg_dept_salary
FROM employees e
INNER JOIN (
    SELECT 
        dept_id,
        AVG(salary) AS avg_dept_salary
    FROM employees
    GROUP BY dept_id
) dept_stats ON e.dept_id = dept_stats.dept_id
WHERE e.salary > dept_stats.avg_dept_salary;
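On MySQL 8.0+, the same "above department average" comparison can be written with a window function instead of the derived-table join:
SELECT first_name, last_name, salary
FROM (
    SELECT first_name, last_name, salary,
           AVG(salary) OVER (PARTITION BY dept_id) AS avg_dept_salary
    FROM employees
) t
WHERE salary > avg_dept_salary;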

Subqueries and Common Table Expressions

Subqueries and CTEs provide powerful ways to write complex queries by breaking them into manageable parts.
-- Sample data setup
CREATE TABLE sales (
    sale_id INT PRIMARY KEY,
    product_id INT,
    salesperson_id INT,
    sale_date DATE,
    quantity INT,
    unit_price DECIMAL(10,2),
    region VARCHAR(50)
);

INSERT INTO sales VALUES
(1, 101, 1, '2023-01-15', 2, 25.00, 'North'),
(2, 102, 2, '2023-01-16', 1, 50.00, 'South'),
(3, 101, 1, '2023-02-10', 3, 25.00, 'North'),
(4, 103, 3, '2023-02-15', 2, 75.00, 'East'),
(5, 102, 2, '2023-03-01', 1, 50.00, 'South'),
(6, 101, 3, '2023-03-05', 4, 25.00, 'West'),
(7, 104, 1, '2023-03-10', 1, 100.00, 'North');

-- Scalar subquery (returns single value)
SELECT 
    product_id,
    quantity,
    unit_price,
    (SELECT AVG(unit_price) FROM sales) AS avg_unit_price
FROM sales
WHERE unit_price > (SELECT AVG(unit_price) FROM sales);

-- Correlated subquery (references outer query)
SELECT 
    s1.product_id,
    s1.region,
    s1.quantity
FROM sales s1
WHERE s1.quantity > (
    SELECT AVG(s2.quantity)
    FROM sales s2
    WHERE s2.region = s1.region
);

-- Subquery in WHERE clause
SELECT *
FROM sales
WHERE salesperson_id IN (
    SELECT salesperson_id
    FROM sales
    GROUP BY salesperson_id
    HAVING SUM(quantity * unit_price) > 150
);

-- Subquery in FROM clause (derived table)
SELECT 
    region,
    total_sales,
    average_sale
FROM (
    SELECT 
        region,
        SUM(quantity * unit_price) AS total_sales,
        AVG(quantity * unit_price) AS average_sale
    FROM sales
    GROUP BY region
) regional_stats
WHERE total_sales > 100;

-- EXISTS and NOT EXISTS
SELECT *
FROM sales s1
WHERE EXISTS (
    SELECT 1
    FROM sales s2
    WHERE s2.product_id = s1.product_id
    AND s2.quantity > 3
);
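NOT EXISTS inverts the test. For example, products that never sold more than 3 units in any single transaction:
SELECT DISTINCT s1.product_id
FROM sales s1
WHERE NOT EXISTS (
    SELECT 1
    FROM sales s2
    WHERE s2.product_id = s1.product_id
      AND s2.quantity > 3
);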

-- Common Table Expressions (CTEs) - MySQL 8.0+
WITH regional_sales AS (
    SELECT 
        region,
        SUM(quantity * unit_price) AS total_sales,
        COUNT(*) AS transaction_count
    FROM sales
    GROUP BY region
),
sales_performance AS (
    SELECT 
        region,
        total_sales,
        transaction_count,
        total_sales / transaction_count AS avg_transaction_value
    FROM regional_sales
)
SELECT 
    region,
    total_sales,
    transaction_count,
    avg_transaction_value,
    CASE 
        WHEN avg_transaction_value > 50 THEN 'High'
        WHEN avg_transaction_value > 25 THEN 'Medium'
        ELSE 'Low'
    END AS performance_rating
FROM sales_performance
ORDER BY total_sales DESC;

-- Recursive CTE (MySQL 8.0+)
CREATE TABLE categories (
    category_id INT PRIMARY KEY,
    category_name VARCHAR(50),
    parent_id INT
);

INSERT INTO categories VALUES
(1, 'Electronics', NULL),
(2, 'Computers', 1),
(3, 'Laptops', 2),
(4, 'Tablets', 2),
(5, 'Smartphones', 1),
(6, 'Gaming', NULL),
(7, 'Consoles', 6),
(8, 'Games', 6);

WITH RECURSIVE category_tree AS (
    -- Anchor member
    SELECT 
        category_id,
        category_name,
        parent_id,
        0 AS level,
        CAST(category_name AS CHAR(1000)) AS path
    FROM categories
    WHERE parent_id IS NULL
    
    UNION ALL
    
    -- Recursive member
    SELECT 
        c.category_id,
        c.category_name,
        c.parent_id,
        ct.level + 1,
        CONCAT(ct.path, ' > ', c.category_name)
    FROM categories c
    INNER JOIN category_tree ct ON c.parent_id = ct.category_id
)
SELECT 
    category_id,
    category_name,
    level,
    path
FROM category_tree
ORDER BY path;

-- Multiple CTEs in single query
WITH monthly_sales AS (
    SELECT 
        YEAR(sale_date) AS sale_year,
        MONTH(sale_date) AS sale_month,
        SUM(quantity * unit_price) AS monthly_total
    FROM sales
    GROUP BY YEAR(sale_date), MONTH(sale_date)
),
running_totals AS (
    SELECT 
        sale_year,
        sale_month,
        monthly_total,
        SUM(monthly_total) OVER (
            ORDER BY sale_year, sale_month
        ) AS running_total
    FROM monthly_sales
)
SELECT 
    sale_year,
    sale_month,
    monthly_total,
    running_total,
    LAG(monthly_total) OVER (ORDER BY sale_year, sale_month) AS prev_month_sales,
    (monthly_total - LAG(monthly_total) OVER (ORDER BY sale_year, sale_month)) 
    / LAG(monthly_total) OVER (ORDER BY sale_year, sale_month) * 100 AS growth_percentage
FROM running_totals
ORDER BY sale_year, sale_month;

4. DDL & Database Design

Database Design and Normalization

Proper database design involves normalization to reduce data redundancy and improve data integrity. Understanding normal forms is crucial for creating efficient databases.
-- Example: E-commerce database design
-- First Normal Form (1NF) - Atomic values, no repeating groups

CREATE TABLE customers_1nf (
    customer_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20)
);

CREATE TABLE orders_1nf (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    -- Instead of having multiple product columns, we create separate table
    FOREIGN KEY (customer_id) REFERENCES customers_1nf(customer_id)
);

CREATE TABLE order_items_1nf (
    order_id INT,
    product_id INT,
    quantity INT,
    unit_price DECIMAL(10,2),
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (order_id) REFERENCES orders_1nf(order_id)
);

-- Second Normal Form (2NF) - No partial dependencies
-- Move product details to separate table

CREATE TABLE products_2nf (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100),
    category_id INT,
    price DECIMAL(10,2)
);

CREATE TABLE categories_2nf (
    category_id INT PRIMARY KEY,
    category_name VARCHAR(50),
    description TEXT
);

-- Update order_items to reference products
ALTER TABLE order_items_1nf
ADD FOREIGN KEY (product_id) REFERENCES products_2nf(product_id);

ALTER TABLE products_2nf
ADD FOREIGN KEY (category_id) REFERENCES categories_2nf(category_id);

-- Third Normal Form (3NF) - No transitive dependencies
-- Remove derived and transitive dependencies

CREATE TABLE customers_3nf (
    customer_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20),
    address_id INT  -- Reference to address table
);

CREATE TABLE addresses (
    address_id INT PRIMARY KEY,
    street VARCHAR(100),
    city VARCHAR(50),
    state VARCHAR(50),
    zip_code VARCHAR(20),
    country VARCHAR(50)
);

-- Complete normalized e-commerce schema
-- (drop earlier versions of these tables first, children before parents,
-- so the foreign key references do not block the drops)
DROP TABLE IF EXISTS payments, order_items, orders, products, categories, addresses, customers;
CREATE TABLE customers (
    customer_id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    phone VARCHAR(20),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE addresses (
    address_id INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT NOT NULL,
    address_type ENUM('billing', 'shipping') NOT NULL,
    street VARCHAR(100) NOT NULL,
    city VARCHAR(50) NOT NULL,
    state VARCHAR(50) NOT NULL,
    zip_code VARCHAR(20) NOT NULL,
    country VARCHAR(50) DEFAULT 'USA',
    is_default BOOLEAN DEFAULT FALSE,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

CREATE TABLE categories (
    category_id INT AUTO_INCREMENT PRIMARY KEY,
    category_name VARCHAR(50) UNIQUE NOT NULL,
    description TEXT,
    parent_category_id INT,
    FOREIGN KEY (parent_category_id) REFERENCES categories(category_id)
);

CREATE TABLE products (
    product_id INT AUTO_INCREMENT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    description TEXT,
    sku VARCHAR(50) UNIQUE NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    cost DECIMAL(10,2),
    category_id INT NOT NULL,
    stock_quantity INT DEFAULT 0,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    FOREIGN KEY (category_id) REFERENCES categories(category_id)
);

CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    total_amount DECIMAL(10,2) NOT NULL,
    status ENUM('pending', 'confirmed', 'shipped', 'delivered', 'cancelled') DEFAULT 'pending',
    shipping_address_id INT NOT NULL,
    billing_address_id INT NOT NULL,
    notes TEXT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
    FOREIGN KEY (shipping_address_id) REFERENCES addresses(address_id),
    FOREIGN KEY (billing_address_id) REFERENCES addresses(address_id)
);

CREATE TABLE order_items (
    order_item_id INT AUTO_INCREMENT PRIMARY KEY,
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    quantity INT NOT NULL CHECK (quantity > 0),
    unit_price DECIMAL(10,2) NOT NULL,
    line_total DECIMAL(10,2) GENERATED ALWAYS AS (quantity * unit_price) STORED,
    FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
    FOREIGN KEY (product_id) REFERENCES products(product_id),
    UNIQUE KEY unique_order_product (order_id, product_id)
);

CREATE TABLE payments (
    payment_id INT AUTO_INCREMENT PRIMARY KEY,
    order_id INT NOT NULL,
    payment_method ENUM('credit_card', 'paypal', 'bank_transfer') NOT NULL,
    amount DECIMAL(10,2) NOT NULL,
    payment_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    status ENUM('pending', 'completed', 'failed', 'refunded') DEFAULT 'pending',
    transaction_id VARCHAR(100),
    FOREIGN KEY (order_id) REFERENCES orders(order_id)
);

-- Indexes for performance
CREATE INDEX idx_customers_email ON customers(email);
CREATE INDEX idx_products_category ON products(category_id);
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_order_items_order ON order_items(order_id);
CREATE INDEX idx_order_items_product ON order_items(product_id);
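After creating indexes, it's worth confirming the optimizer actually uses them. EXPLAIN shows the chosen access path (the exact output depends on data volume and table statistics):
EXPLAIN
SELECT o.order_id, o.order_date, o.total_amount
FROM orders o
WHERE o.customer_id = 1
  AND o.order_date >= '2023-01-01';
-- The "key" column should show idx_orders_customer_date if the index is used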

Constraints and Data Integrity

Constraints enforce data integrity rules at the database level, ensuring data quality and consistency.
-- Comprehensive constraints example
CREATE TABLE employees_constraints (
    -- Primary key constraint
    employee_id INT AUTO_INCREMENT PRIMARY KEY,
    
    -- NOT NULL constraints
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    
    -- UNIQUE constraints (email is also NOT NULL)
    email VARCHAR(100) NOT NULL UNIQUE,
    ssn VARCHAR(11) UNIQUE,
    
    -- CHECK constraints (MySQL 8.0.16+)
    age INT CHECK (age >= 18 AND age <= 65),
    salary DECIMAL(10,2) CHECK (salary > 0),
    
    -- ENUM constraint
    gender ENUM('M', 'F', 'Other') NOT NULL,
    status ENUM('active', 'inactive', 'suspended') DEFAULT 'active',
    
    -- DEFAULT constraints
    hire_date DATE DEFAULT (CURRENT_DATE),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
    -- Foreign key constraint
    department_id INT,
    manager_id INT,
    FOREIGN KEY (department_id) REFERENCES departments(department_id),
    FOREIGN KEY (manager_id) REFERENCES employees_constraints(employee_id)
);

-- Adding constraints to existing tables
ALTER TABLE employees_constraints
ADD CONSTRAINT chk_salary_range CHECK (salary BETWEEN 30000 AND 200000);

-- A column must exist before it can be constrained
ALTER TABLE employees_constraints
ADD COLUMN employee_code VARCHAR(20),
ADD CONSTRAINT unique_employee_code UNIQUE (employee_code);

ALTER TABLE employees_constraints
ADD COLUMN phone VARCHAR(20) NOT NULL;

-- Complex check constraints
CREATE TABLE products_constraints (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
    cost DECIMAL(10,2) CHECK (cost >= 0),
    markup DECIMAL(10,2) GENERATED ALWAYS AS (price - cost) STORED,
    CONSTRAINT chk_price_greater_than_cost CHECK (price >= cost),
    CONSTRAINT chk_valid_markup CHECK (markup >= 0)
);

-- Table-level constraints
CREATE TABLE project_assignments (
    project_id INT,
    employee_id INT,
    role VARCHAR(50) NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE,
    allocation_percentage INT CHECK (allocation_percentage BETWEEN 0 AND 100),
    
    -- Composite primary key
    PRIMARY KEY (project_id, employee_id),
    
    -- Foreign keys
    FOREIGN KEY (project_id) REFERENCES projects(project_id) ON DELETE CASCADE,
    FOREIGN KEY (employee_id) REFERENCES employees(employee_id) ON DELETE CASCADE,
    
    -- Table-level check constraint
    CONSTRAINT chk_valid_dates CHECK (end_date IS NULL OR end_date > start_date)
);

-- Demonstrating constraint violations
-- This will fail due to NOT NULL constraint
INSERT INTO employees_constraints (first_name, last_name, email) 
VALUES (NULL, 'Doe', 'john.doe@email.com');

-- This will fail due to UNIQUE constraint
INSERT INTO employees_constraints (first_name, last_name, email) 
VALUES ('John', 'Doe', 'existing@email.com');
INSERT INTO employees_constraints (first_name, last_name, email) 
VALUES ('Jane', 'Smith', 'existing@email.com');  -- Duplicate email

-- This will fail due to CHECK constraint
INSERT INTO products_constraints (product_id, product_name, price, cost)
VALUES (1, 'Test Product', 50.00, 60.00);  -- Price less than cost

-- This will fail due to FOREIGN KEY constraint
INSERT INTO project_assignments (project_id, employee_id, role, start_date)
VALUES (999, 1, 'Developer', '2023-01-01');  -- Invalid project_id

-- Handling constraints in applications
-- Using INSERT IGNORE to skip rows that violate constraints
INSERT IGNORE INTO employees_constraints (first_name, last_name, email)
VALUES 
('John', 'Doe', 'john.doe@email.com'),
('Jane', 'Smith', 'john.doe@email.com');  -- This row will be skipped

-- Using ON DUPLICATE KEY UPDATE for handling duplicates
-- (the VALUES() function in this position is deprecated since MySQL 8.0.20;
--  newer code uses a row alias: VALUES (...) AS new ... first_name = new.first_name)
INSERT INTO employees_constraints (first_name, last_name, email)
VALUES ('John', 'Doe', 'john.doe@email.com')
ON DUPLICATE KEY UPDATE 
    first_name = VALUES(first_name),
    last_name = VALUES(last_name);

-- Viewing constraint information
SELECT 
    TABLE_NAME,
    CONSTRAINT_NAME,
    CONSTRAINT_TYPE
FROM information_schema.TABLE_CONSTRAINTS
WHERE TABLE_SCHEMA = DATABASE();

-- Disabling and enabling constraints (be careful!)
SET FOREIGN_KEY_CHECKS = 0;  -- Disable foreign key checks
-- Perform operations that might violate constraints
SET FOREIGN_KEY_CHECKS = 1;  -- Re-enable foreign key checks
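On the application side, these violations surface as driver exceptions that should be caught rather than allowed to crash the program. A minimal sketch using Python's stdlib `sqlite3` as a portable stand-in (MySQL drivers such as mysql-connector raise their own `IntegrityError` in the same way; the table here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)")

def add_employee(email):
    """Insert a row, reporting constraint violations instead of crashing."""
    try:
        conn.execute("INSERT INTO emp (email) VALUES (?)", (email,))
        return "inserted"
    except sqlite3.IntegrityError as e:
        return f"rejected: {e}"

print(add_employee("a@example.com"))  # inserted
print(add_employee("a@example.com"))  # rejected (UNIQUE violation)
print(add_employee(None))             # rejected (NOT NULL violation)
```

The same try/except shape works for CHECK and FOREIGN KEY violations; only the error message differs.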

Views, Indexes, and Stored Procedures

Advanced database objects like views, indexes, and stored procedures enhance functionality, performance, and maintainability.
-- Creating views for simplified data access
CREATE VIEW employee_directory AS
SELECT 
    e.employee_id,
    CONCAT(e.first_name, ' ', e.last_name) AS full_name,
    e.email,
    e.phone,
    d.department_name,
    m.first_name AS manager_first_name,
    m.last_name AS manager_last_name,
    e.hire_date
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id
LEFT JOIN employees m ON e.manager_id = m.employee_id
WHERE e.status = 'active';

-- Updatable view
CREATE VIEW active_employees AS
SELECT 
    employee_id,
    first_name,
    last_name,
    email,
    department_id
FROM employees
WHERE status = 'active'
WITH CHECK OPTION;  -- Ensures inserts/updates satisfy view condition

-- Materialized view (simulated with table)
CREATE TABLE department_stats (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(50),
    employee_count INT,
    avg_salary DECIMAL(10,2),
    total_budget DECIMAL(12,2),
    last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Indexes for performance optimization
-- Single-column indexes
CREATE INDEX idx_employees_last_name ON employees(last_name);
CREATE INDEX idx_employees_department ON employees(department_id);
CREATE INDEX idx_employees_hire_date ON employees(hire_date);

-- Composite indexes
CREATE INDEX idx_employees_name_department ON employees(last_name, first_name, department_id);
-- Descending index (MySQL 8.0+); named differently to avoid clashing with
-- the idx_orders_customer_date index created earlier on this table
CREATE INDEX idx_orders_customer_date_desc ON orders(customer_id, order_date DESC);

-- Unique indexes
CREATE UNIQUE INDEX idx_employees_email ON employees(email);
CREATE UNIQUE INDEX idx_products_sku ON products(sku);

-- Partial indexes: MySQL does not support them, so the PostgreSQL-style
-- syntax below is shown for comparison only and will NOT run in MySQL
-- CREATE INDEX idx_active_products ON products(product_id) WHERE is_active = TRUE;

-- Full-text search indexes
CREATE TABLE articles (
    article_id INT PRIMARY KEY,
    title VARCHAR(200),
    content TEXT,
    author VARCHAR(100),
    published_date DATE,
    FULLTEXT(title, content)
);

-- Using full-text search
SELECT 
    article_id,
    title,
    MATCH(title, content) AGAINST('database performance' IN NATURAL LANGUAGE MODE) AS relevance
FROM articles
WHERE MATCH(title, content) AGAINST('database performance' IN NATURAL LANGUAGE MODE)
ORDER BY relevance DESC;
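Conceptually, natural-language relevance is a per-row score of how often (and how distinctively) the search terms occur. This toy Python sketch mimics only the term-frequency part; MySQL's actual ranking formula also weighs how rare each term is across the whole table:

```python
def relevance(text, query):
    """Toy relevance: count occurrences of each query term in the text.
    (MySQL's real formula also discounts terms common to many rows.)"""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())

# Hypothetical articles, standing in for rows of the articles table
articles = [
    ("Tuning database performance", "Indexes improve database performance"),
    ("Cooking basics", "How to boil pasta"),
]
scored = [(title, relevance(title + " " + body, "database performance"))
          for title, body in articles]
scored.sort(key=lambda pair: -pair[1])  # ORDER BY relevance DESC
print(scored)
```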

-- Stored procedures
DELIMITER //

CREATE PROCEDURE GetEmployeeDetails(IN emp_id INT)
BEGIN
    SELECT 
        e.employee_id,
        CONCAT(e.first_name, ' ', e.last_name) AS full_name,
        e.email,
        e.phone,
        d.department_name,
        e.salary,
        e.hire_date
    FROM employees e
    LEFT JOIN departments d ON e.department_id = d.department_id
    WHERE e.employee_id = emp_id;
END //

CREATE PROCEDURE UpdateEmployeeSalary(
    IN emp_id INT,
    IN new_salary DECIMAL(10,2),
    OUT old_salary DECIMAL(10,2),
    OUT result_message VARCHAR(100)
)
BEGIN
    DECLARE current_salary DECIMAL(10,2);
    
    -- Get current salary
    SELECT salary INTO current_salary
    FROM employees
    WHERE employee_id = emp_id;
    
    IF current_salary IS NULL THEN
        SET result_message = 'Employee not found';
    ELSEIF new_salary <= 0 THEN
        SET result_message = 'Invalid salary amount';
    ELSE
        -- Update salary
        UPDATE employees 
        SET salary = new_salary
        WHERE employee_id = emp_id;
        
        SET old_salary = current_salary;
        SET result_message = CONCAT('Salary updated from ', current_salary, ' to ', new_salary);
    END IF;
END //

CREATE PROCEDURE GetDepartmentStatistics(IN dept_id INT)
BEGIN
    SELECT 
        d.department_name,
        COUNT(e.employee_id) AS employee_count,
        AVG(e.salary) AS average_salary,
        MIN(e.salary) AS min_salary,
        MAX(e.salary) AS max_salary,
        SUM(e.salary) AS total_budget
    FROM departments d
    LEFT JOIN employees e ON d.department_id = e.department_id
    WHERE d.department_id = dept_id
    GROUP BY d.department_id, d.department_name;
END //

DELIMITER ;

-- Using stored procedures
CALL GetEmployeeDetails(1);
CALL GetDepartmentStatistics(1);

SET @old_salary = 0;
SET @message = '';
CALL UpdateEmployeeSalary(1, 80000, @old_salary, @message);
SELECT @old_salary, @message;

-- Functions
DELIMITER //

CREATE FUNCTION CalculateAnnualBonus(
    emp_id INT,
    bonus_percentage DECIMAL(5,2)
) 
RETURNS DECIMAL(10,2)
READS SQL DATA
DETERMINISTIC
BEGIN
    DECLARE annual_salary DECIMAL(10,2);
    DECLARE bonus_amount DECIMAL(10,2);
    
    SELECT salary * 12 INTO annual_salary
    FROM employees
    WHERE employee_id = emp_id;
    
    SET bonus_amount = annual_salary * (bonus_percentage / 100);
    
    RETURN bonus_amount;
END //

CREATE FUNCTION GetEmployeeTenure(emp_id INT)
RETURNS INT
READS SQL DATA
DETERMINISTIC
BEGIN
    DECLARE tenure_years INT;
    
    SELECT TIMESTAMPDIFF(YEAR, hire_date, CURDATE()) INTO tenure_years
    FROM employees
    WHERE employee_id = emp_id;
    
    RETURN tenure_years;
END //

DELIMITER ;

-- Using functions
SELECT 
    employee_id,
    first_name,
    last_name,
    salary,
    CalculateAnnualBonus(employee_id, 10) AS annual_bonus,
    GetEmployeeTenure(employee_id) AS tenure_years
FROM employees;

5. Advanced SQL Features

Window Functions

Window functions perform calculations across a set of table rows that are somehow related to the current row, without collapsing rows like GROUP BY does.
-- Sample data for window functions
CREATE TABLE sales_data (
    sale_id INT PRIMARY KEY,
    salesperson VARCHAR(50),
    region VARCHAR(50),
    sale_date DATE,
    amount DECIMAL(10,2),
    product_category VARCHAR(50)
);

INSERT INTO sales_data VALUES
(1, 'John Doe', 'North', '2023-01-15', 1000.00, 'Electronics'),
(2, 'Jane Smith', 'South', '2023-01-16', 1500.00, 'Furniture'),
(3, 'John Doe', 'North', '2023-02-10', 800.00, 'Electronics'),
(4, 'Bob Johnson', 'East', '2023-02-15', 2000.00, 'Appliances'),
(5, 'Jane Smith', 'South', '2023-03-01', 1200.00, 'Furniture'),
(6, 'John Doe', 'North', '2023-03-05', 900.00, 'Electronics'),
(7, 'Alice Brown', 'West', '2023-03-10', 1800.00, 'Appliances'),
(8, 'Bob Johnson', 'East', '2023-04-01', 2200.00, 'Electronics'),
(9, 'Jane Smith', 'South', '2023-04-05', 1300.00, 'Furniture'),
(10, 'John Doe', 'North', '2023-04-10', 1100.00, 'Electronics');

-- Ranking functions
SELECT 
    salesperson,
    region,
    sale_date,
    amount,
    -- Row number (unique sequential integers)
    ROW_NUMBER() OVER (ORDER BY amount DESC) AS row_num,
    
    -- Rank (with gaps for ties)
    RANK() OVER (ORDER BY amount DESC) AS rank_amount,
    
    -- Dense rank (no gaps for ties)
    DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rank_amount,
    
    -- Percent rank (relative rank 0-1)
    PERCENT_RANK() OVER (ORDER BY amount DESC) AS percent_rank_amount
FROM sales_data;

-- Partitioned window functions
SELECT 
    salesperson,
    region,
    sale_date,
    amount,
    -- Row number within each region
    ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS region_row_num,
    
    -- Rank within each salesperson
    RANK() OVER (PARTITION BY salesperson ORDER BY amount DESC) AS salesperson_rank,
    
    -- Running total by region
    SUM(amount) OVER (
        PARTITION BY region 
        ORDER BY sale_date 
        ROWS UNBOUNDED PRECEDING
    ) AS running_total_region,
    
    -- Moving average (3-month)
    AVG(amount) OVER (
        PARTITION BY salesperson
        ORDER BY sale_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3_month
FROM sales_data;
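The `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW` frame used above averages at most three rows: the current one plus the two before it in the partition. A quick Python check of that frame logic:

```python
def moving_avg(values, preceding=2):
    """Average over a ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW frame."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - preceding): i + 1]  # frame shrinks at the start
        out.append(round(sum(window) / len(window), 2))
    return out

# One salesperson's amounts, already ordered by sale_date
amounts = [1000.0, 800.0, 900.0, 1100.0]
print(moving_avg(amounts))  # [1000.0, 900.0, 900.0, 933.33]
```

Note that the first rows average fewer values, exactly as the SQL frame does; there is no NULL padding.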

-- Aggregate window functions
SELECT 
    salesperson,
    region,
    sale_date,
    amount,
    -- Total and average by salesperson
    SUM(amount) OVER (PARTITION BY salesperson) AS total_by_salesperson,
    AVG(amount) OVER (PARTITION BY salesperson) AS avg_by_salesperson,
    
    -- Percentage of salesperson total
    amount / SUM(amount) OVER (PARTITION BY salesperson) * 100 AS pct_of_salesperson_total,
    
    -- Comparison to region average
    amount - AVG(amount) OVER (PARTITION BY region) AS diff_from_region_avg,
    
    -- Cumulative percentage
    SUM(amount) OVER (
        PARTITION BY region 
        ORDER BY amount DESC
    ) / SUM(amount) OVER (PARTITION BY region) * 100 AS cumulative_pct
FROM sales_data;

-- LEAD and LAG for accessing previous/next rows
SELECT 
    salesperson,
    sale_date,
    amount,
    -- Previous sale amount
    LAG(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
    ) AS prev_sale_amount,
    
    -- Next sale amount
    LEAD(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
    ) AS next_sale_amount,
    
    -- Difference from previous sale
    amount - LAG(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
    ) AS diff_from_prev,
    
    -- Percentage growth from previous
    (amount - LAG(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
    )) / LAG(amount) OVER (
        PARTITION BY salesperson 
        ORDER BY sale_date
    ) * 100 AS growth_pct
FROM sales_data;
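LAG results are easy to verify by hand. This Python sketch replays the previous-amount and difference columns for one partition (John Doe's sales from the sample data above):

```python
# Reproduce LAG(amount) OVER (PARTITION BY salesperson ORDER BY sale_date)
sales = [("2023-01-15", 1000.0), ("2023-02-10", 800.0),
         ("2023-03-05", 900.0), ("2023-04-10", 1100.0)]

rows = []
prev = None
for sale_date, amount in sorted(sales):              # ORDER BY sale_date
    diff = None if prev is None else amount - prev   # amount - LAG(amount)
    rows.append((sale_date, amount, prev, diff))
    prev = amount                                    # becomes LAG for the next row

for row in rows:
    print(row)
# The first row of each partition has NULL (None) for LAG and the difference
```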

-- FIRST_VALUE and LAST_VALUE
SELECT 
    salesperson,
    region,
    sale_date,
    amount,
    -- First sale in region
    FIRST_VALUE(amount) OVER (
        PARTITION BY region 
        ORDER BY sale_date
    ) AS first_sale_in_region,
    
    -- Best sale in region
    FIRST_VALUE(amount) OVER (
        PARTITION BY region 
        ORDER BY amount DESC
    ) AS best_sale_in_region,
    
    -- Last sale in region
    LAST_VALUE(amount) OVER (
        PARTITION BY region 
        ORDER BY sale_date
        RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_sale_in_region
FROM sales_data;

-- NTILE for dividing data into buckets
SELECT 
    salesperson,
    amount,
    -- Divide into 4 quartiles
    NTILE(4) OVER (ORDER BY amount DESC) AS amount_quartile
FROM sales_data;

-- Divide salespeople into performance tiers
-- (window functions cannot be nested inside another window's ORDER BY,
--  so compute per-salesperson totals in a CTE first)
WITH salesperson_totals AS (
    SELECT salesperson, SUM(amount) AS total_sales
    FROM sales_data
    GROUP BY salesperson
)
SELECT 
    salesperson,
    total_sales,
    CASE NTILE(4) OVER (ORDER BY total_sales DESC)
        WHEN 1 THEN 'Top Performer'
        WHEN 2 THEN 'High Performer'
        WHEN 3 THEN 'Medium Performer'
        ELSE 'Low Performer'
    END AS performance_tier
FROM salesperson_totals;
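NTILE distributes rows into n groups as evenly as possible, giving the earlier groups the extra row when the count doesn't divide evenly. A small Python model of that assignment rule:

```python
def ntile(n, row_count):
    """Return the NTILE(n) bucket (1-based) for each row position in order,
    matching SQL's rule: larger buckets come first when row_count % n != 0."""
    base, extra = divmod(row_count, n)
    buckets = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets

print(ntile(4, 10))  # [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
```

With 10 rows and 4 buckets, the first two buckets get 3 rows each and the last two get 2.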

-- Complex window function example: Sales trends
WITH monthly_sales AS (
    SELECT 
        salesperson,
        DATE_FORMAT(sale_date, '%Y-%m') AS sale_month,
        SUM(amount) AS monthly_total
    FROM sales_data
    GROUP BY salesperson, DATE_FORMAT(sale_date, '%Y-%m')
)
SELECT 
    salesperson,
    sale_month,
    monthly_total,
    -- Moving average of last 3 months
    AVG(monthly_total) OVER (
        PARTITION BY salesperson
        ORDER BY sale_month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3month,
    
    -- Month-over-month growth
    monthly_total - LAG(monthly_total) OVER (
        PARTITION BY salesperson
        ORDER BY sale_month
    ) AS mom_growth,
    
    -- Percentage of yearly total (the CTE only exposes sale_month,
    -- so derive the year from its 'YYYY-MM' string)
    monthly_total / SUM(monthly_total) OVER (
        PARTITION BY salesperson, LEFT(sale_month, 4)
    ) * 100 AS pct_of_year,
    
    -- Rank within month
    RANK() OVER (
        PARTITION BY sale_month
        ORDER BY monthly_total DESC
    ) AS monthly_rank
FROM monthly_sales
ORDER BY salesperson, sale_month;

Advanced Data Manipulation

Advanced DML operations including UPSERT, bulk operations, and complex data transformations provide powerful tools for data management.
-- UPSERT operations (INSERT ... ON DUPLICATE KEY UPDATE)
CREATE TABLE products_upsert (
    product_id INT AUTO_INCREMENT PRIMARY KEY,
    sku VARCHAR(50) UNIQUE NOT NULL,
    product_name VARCHAR(100) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    stock_quantity INT DEFAULT 0,
    last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

-- Basic UPSERT
INSERT INTO products_upsert (sku, product_name, price, stock_quantity)
VALUES ('LAPTOP-001', 'Gaming Laptop', 999.99, 50)
ON DUPLICATE KEY UPDATE
    product_name = VALUES(product_name),
    price = VALUES(price),
    stock_quantity = VALUES(stock_quantity),
    last_updated = CURRENT_TIMESTAMP;

-- UPSERT with multiple rows
INSERT INTO products_upsert (sku, product_name, price, stock_quantity)
VALUES 
    ('MOUSE-001', 'Wireless Mouse', 29.99, 100),
    ('KEYBOARD-001', 'Mechanical Keyboard', 79.99, 75),
    ('MONITOR-001', '27-inch Monitor', 199.99, 30)
ON DUPLICATE KEY UPDATE
    product_name = VALUES(product_name),
    price = VALUES(price),
    stock_quantity = stock_quantity + VALUES(stock_quantity);  -- Add to existing stock

-- REPLACE (DELETE + INSERT)
REPLACE INTO products_upsert (sku, product_name, price, stock_quantity)
VALUES ('LAPTOP-001', 'Updated Gaming Laptop', 899.99, 60);

-- INSERT IGNORE (skip duplicates)
INSERT IGNORE INTO products_upsert (sku, product_name, price, stock_quantity)
VALUES 
    ('LAPTOP-001', 'Another Laptop', 799.99, 40),  -- This will be skipped
    ('TABLET-001', 'New Tablet', 299.99, 25);      -- This will be inserted
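The UPSERT pattern is portable across engines even though the syntax varies. A runnable sketch using Python's stdlib `sqlite3`, whose `ON CONFLICT ... DO UPDATE` clause mirrors `ON DUPLICATE KEY UPDATE` (with `excluded` playing the role of MySQL's `VALUES()`; table and SKUs are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE products (
    sku TEXT PRIMARY KEY, name TEXT, price REAL, stock INTEGER)""")

# "excluded" refers to the row that failed to insert; bare "stock" is the existing row
upsert = """INSERT INTO products (sku, name, price, stock)
            VALUES (?, ?, ?, ?)
            ON CONFLICT(sku) DO UPDATE SET
                name  = excluded.name,
                price = excluded.price,
                stock = stock + excluded.stock"""

conn.execute(upsert, ("LAPTOP-001", "Gaming Laptop", 999.99, 50))
conn.execute(upsert, ("LAPTOP-001", "Gaming Laptop", 899.99, 10))  # second run updates

price, stock = conn.execute(
    "SELECT price, stock FROM products WHERE sku = 'LAPTOP-001'").fetchone()
print(price, stock)  # 899.99 60
```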

-- Bulk operations with CTEs
-- (in MySQL the WITH clause belongs to the SELECT part of an
--  INSERT ... SELECT; it cannot precede the INSERT keyword)
INSERT INTO products_upsert (sku, product_name, price, stock_quantity)
WITH new_products AS (
    SELECT 'LAPTOP-002' AS sku, 'Business Laptop' AS product_name, 799.99 AS price, 40 AS stock_quantity
    UNION ALL
    SELECT 'TABLET-002', 'Drawing Tablet', 399.99, 20
    UNION ALL
    SELECT 'PHONE-001', 'Smartphone', 599.99, 100
)
SELECT sku, product_name, price, stock_quantity
FROM new_products
ON DUPLICATE KEY UPDATE
    product_name = VALUES(product_name),
    price = VALUES(price),
    stock_quantity = VALUES(stock_quantity);

-- Conditional updates with CASE
UPDATE products_upsert
SET price = CASE 
    WHEN stock_quantity > 50 THEN price * 0.9  -- 10% discount for high stock
    WHEN stock_quantity < 10 THEN price * 1.1  -- 10% increase for low stock
    ELSE price
END,
stock_quantity = CASE 
    WHEN product_name LIKE '%Laptop%' THEN stock_quantity + 10  -- Restock laptops
    ELSE stock_quantity
END
WHERE price > 500;

-- Multi-table updates
CREATE TABLE product_categories (
    product_id INT PRIMARY KEY,
    category_name VARCHAR(50),
    is_featured BOOLEAN DEFAULT FALSE
);

-- Update based on join
UPDATE products_upsert p
JOIN product_categories pc ON p.product_id = pc.product_id
SET p.price = p.price * 0.85  -- 15% discount for featured products
WHERE pc.is_featured = TRUE;

-- Update with subquery (order_items has no date column of its own,
-- so take the order date from the parent orders row)
UPDATE products_upsert
SET stock_quantity = stock_quantity + (
    SELECT COALESCE(SUM(oi.quantity), 0)
    FROM order_items oi
    JOIN orders o ON o.order_id = oi.order_id
    WHERE oi.product_id = products_upsert.product_id
    AND o.order_date >= DATE_SUB(NOW(), INTERVAL 30 DAY)
)
WHERE product_id IN (SELECT product_id FROM product_categories WHERE is_featured = TRUE);

-- Advanced DELETE operations
-- Delete with join
DELETE p
FROM products_upsert p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE p.stock_quantity = 0
AND oi.order_item_id IS NULL;  -- Never been ordered

-- Delete using subquery
DELETE FROM products_upsert
WHERE product_id IN (
    SELECT product_id 
    FROM (
        SELECT product_id
        FROM products_upsert
        WHERE stock_quantity = 0
        AND last_updated < DATE_SUB(NOW(), INTERVAL 6 MONTH)
    ) AS old_products
);

-- Temporary tables for complex operations
CREATE TEMPORARY TABLE temp_sales_summary AS
SELECT 
    salesperson,
    COUNT(*) AS sale_count,
    SUM(amount) AS total_sales,
    AVG(amount) AS avg_sale_amount
FROM sales_data
GROUP BY salesperson;

-- Use temporary table in operations (assumes a sales_commissions table exists)
UPDATE sales_commissions sc
JOIN temp_sales_summary tss ON sc.salesperson = tss.salesperson
SET sc.commission_amount = tss.total_sales * sc.commission_rate
WHERE sc.commission_calculated = FALSE;

-- Clean up temporary table
DROP TEMPORARY TABLE temp_sales_summary;

-- Data archiving with INSERT ... SELECT
CREATE TABLE sales_archive LIKE sales_data;

-- Archive old sales data
INSERT INTO sales_archive
SELECT *
FROM sales_data
WHERE sale_date < DATE_SUB(NOW(), INTERVAL 2 YEAR);

-- Delete archived data
DELETE FROM sales_data
WHERE sale_date < DATE_SUB(NOW(), INTERVAL 2 YEAR);

-- MERGE-like operations (using multiple statements)
CREATE TABLE IF NOT EXISTS products_archive LIKE products_upsert;

START TRANSACTION;

-- Archive records that will be updated
INSERT INTO products_archive
SELECT * FROM products_upsert
WHERE sku IN ('LAPTOP-001', 'MOUSE-001');

-- Update the records
UPDATE products_upsert
SET price = price * 0.9,
    last_updated = NOW()
WHERE sku IN ('LAPTOP-001', 'MOUSE-001');

COMMIT;

-- Complex data transformation example
-- Update product prices based on multiple factors
UPDATE products_upsert p
LEFT JOIN (
    SELECT 
        product_id,
        COUNT(*) AS order_count,
        AVG(quantity) AS avg_order_quantity
    FROM order_items
    GROUP BY product_id
) oi_stats ON p.product_id = oi_stats.product_id
LEFT JOIN product_categories pc ON p.product_id = pc.product_id
SET p.price = CASE 
    WHEN pc.is_featured = TRUE AND oi_stats.order_count > 10 THEN p.price * 1.15  -- Popular featured: +15%
    WHEN oi_stats.order_count > 20 THEN p.price * 1.10  -- Very popular: +10%
    WHEN oi_stats.order_count < 5 AND p.stock_quantity > 50 THEN p.price * 0.85   -- Slow moving: -15%
    WHEN p.stock_quantity < 10 THEN p.price * 1.05  -- Low stock: +5%
    ELSE p.price
END,
p.last_updated = NOW()
WHERE p.price != CASE 
    WHEN pc.is_featured = TRUE AND oi_stats.order_count > 10 THEN p.price * 1.15
    WHEN oi_stats.order_count > 20 THEN p.price * 1.10
    WHEN oi_stats.order_count < 5 AND p.stock_quantity > 50 THEN p.price * 0.85
    WHEN p.stock_quantity < 10 THEN p.price * 1.05
    ELSE p.price
END;  -- Only update if price actually changes

6. Performance & Optimization

Query Optimization and Execution Plans

Understanding how MySQL executes queries and using execution plans to identify performance bottlenecks is crucial for database performance.
-- Sample large table for optimization examples
CREATE TABLE large_orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date DATETIME NOT NULL,
    total_amount DECIMAL(12,2) NOT NULL,
    status ENUM('pending', 'processing', 'shipped', 'delivered', 'cancelled') NOT NULL,
    shipping_method VARCHAR(50),
    payment_method VARCHAR(50),
    region VARCHAR(50),
    salesperson_id INT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_customer_id (customer_id),
    INDEX idx_order_date (order_date),
    INDEX idx_status (status),
    INDEX idx_region_status (region, status)
);

-- Insert sample data (the cross join below yields 5^8 = 390,625 rows)
-- Note: in practice, you'd use a script to generate larger volumes
INSERT INTO large_orders (customer_id, order_date, total_amount, status, shipping_method, payment_method, region, salesperson_id)
SELECT 
    FLOOR(RAND() * 10000) + 1 AS customer_id,
    DATE_ADD('2020-01-01', INTERVAL FLOOR(RAND() * 1460) DAY) AS order_date,
    ROUND(RAND() * 1000 + 10, 2) AS total_amount,
    ELT(FLOOR(RAND() * 5) + 1, 'pending', 'processing', 'shipped', 'delivered', 'cancelled') AS status,
    ELT(FLOOR(RAND() * 4) + 1, 'standard', 'express', 'overnight', 'international') AS shipping_method,
    ELT(FLOOR(RAND() * 3) + 1, 'credit_card', 'paypal', 'bank_transfer') AS payment_method,
    ELT(FLOOR(RAND() * 4) + 1, 'North', 'South', 'East', 'West') AS region,
    FLOOR(RAND() * 100) + 1 AS salesperson_id
FROM 
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) a,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) b,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) c,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) d,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) e,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) f,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) g,
    (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) h;

-- Analyze table statistics
ANALYZE TABLE large_orders;

-- Using EXPLAIN to understand query execution
EXPLAIN FORMAT=JSON
SELECT 
    customer_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_spent
FROM large_orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
    AND status = 'delivered'
    AND region = 'North'
GROUP BY customer_id
HAVING total_spent > 1000
ORDER BY total_spent DESC
LIMIT 10;

-- Simple EXPLAIN output
EXPLAIN 
SELECT * FROM large_orders 
WHERE customer_id = 1234 
AND order_date > '2023-01-01';

-- Common performance issues and solutions

-- 1. Missing indexes
-- Slow query (full table scan)
EXPLAIN SELECT * FROM large_orders WHERE region = 'North' AND status = 'pending';

-- The composite index (region, status) already exists on this table as
-- idx_region_status; creating it again would fail with a duplicate key name
-- CREATE INDEX idx_region_status ON large_orders(region, status);

-- 2. Index not being used
-- Function on indexed column prevents index usage
EXPLAIN SELECT * FROM large_orders WHERE YEAR(order_date) = 2023;

-- Rewrite to use index
EXPLAIN SELECT * FROM large_orders WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- 3. Using OR conditions
EXPLAIN SELECT * FROM large_orders WHERE region = 'North' OR region = 'South';

-- Rewrite using IN
EXPLAIN SELECT * FROM large_orders WHERE region IN ('North', 'South');

-- Or use UNION
EXPLAIN 
SELECT * FROM large_orders WHERE region = 'North'
UNION ALL
SELECT * FROM large_orders WHERE region = 'South';

-- 4. Subquery optimization
-- Correlated subquery (slow)
EXPLAIN
SELECT customer_id, total_amount
FROM large_orders lo1
WHERE total_amount > (
    SELECT AVG(total_amount)
    FROM large_orders lo2
    WHERE lo2.customer_id = lo1.customer_id
);

-- Rewrite with window function (MySQL 8.0+)
EXPLAIN
SELECT customer_id, total_amount
FROM (
    SELECT 
        customer_id,
        total_amount,
        AVG(total_amount) OVER (PARTITION BY customer_id) AS avg_customer_amount
    FROM large_orders
) AS customer_stats
WHERE total_amount > avg_customer_amount;
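The two queries above are equivalent: each row is compared against its customer's average, computed once per partition instead of once per row. A quick Python check of that equivalence on toy data:

```python
# (customer_id, total_amount) rows
orders = [(1, 100.0), (1, 300.0), (2, 50.0), (2, 50.0), (2, 200.0)]

# Correlated-subquery logic: recompute the customer's average for each row
def customer_avg(cid):
    amounts = [a for c, a in orders if c == cid]
    return sum(amounts) / len(amounts)

via_subquery = [(c, a) for c, a in orders if a > customer_avg(c)]

# Window-function logic: compute each partition's average once, then filter
avgs = {c: customer_avg(c) for c, _ in orders}
via_window = [(c, a) for c, a in orders if a > avgs[c]]

print(via_subquery == via_window, via_subquery)
```

The window form does the aggregation work once per customer, which is why it tends to be much faster on large tables.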

-- Query performance analysis
-- The Performance Schema must be enabled at server startup
-- (performance_schema=ON in my.cnf); it cannot be toggled at runtime

-- Check slow queries
SELECT 
    query,
    db,
    exec_count,
    avg_latency,
    max_latency,
    rows_sent,
    rows_examined
FROM sys.statement_analysis
ORDER BY avg_latency DESC
LIMIT 10;

-- Identify tables with full scans
SELECT 
    object_schema,
    object_name,
    count_star,
    avg_timer_wait
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NULL  -- Rows accessed without using any index
ORDER BY count_star DESC
LIMIT 10;

-- Monitor index usage
SELECT 
    OBJECT_SCHEMA,
    OBJECT_NAME,
    INDEX_NAME,
    COUNT_READ,
    COUNT_FETCH,
    COUNT_INSERT,
    COUNT_UPDATE,
    COUNT_DELETE
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = DATABASE()
ORDER BY COUNT_READ DESC;

-- Query tuning techniques

-- 1. Force index usage (when optimizer chooses wrong index)
SELECT * FROM large_orders FORCE INDEX (idx_order_date)
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
AND customer_id = 1234;

-- 2. Use covering indexes
EXPLAIN
SELECT order_id, customer_id, order_date
FROM large_orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- Create covering index
CREATE INDEX idx_covering_order_date ON large_orders(order_date, customer_id, order_id);

-- 3. Batch processing for large updates
-- Break one huge UPDATE into small batches to keep transactions short.
-- Loop constructs like REPEAT only work inside stored programs, and
-- UPDATE supports LIMIT but not OFFSET, so keep updating matching rows
-- until none remain:
DELIMITER //

CREATE PROCEDURE ProcessPendingOrders(IN batch_size INT)
BEGIN
    REPEAT
        UPDATE large_orders 
        SET status = 'processing'
        WHERE status = 'pending'
        LIMIT batch_size;
        -- Add delay between batches if needed: DO SLEEP(1);
    UNTIL ROW_COUNT() = 0 END REPEAT;
END //

DELIMITER ;

CALL ProcessPendingOrders(1000);
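The same keep-going-until-nothing-matches pattern is often driven from application code, updating by primary-key batches so each transaction stays short. A runnable sketch with stdlib `sqlite3` as a stand-in (table name and batch size are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("pending",)] * 2500)

BATCH = 1000
processed = 0
while True:
    # Grab one batch of pending ids, then update exactly those rows
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM orders WHERE status = 'pending' LIMIT ?", (BATCH,))]
    if not ids:
        break
    placeholders = ",".join("?" * len(ids))
    conn.execute(f"UPDATE orders SET status = 'processing' "
                 f"WHERE id IN ({placeholders})", ids)
    conn.commit()  # commit per batch to keep each transaction short
    processed += len(ids)

print(processed)  # 2500
```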

-- 4. Use temporary tables for complex queries
CREATE TEMPORARY TABLE temp_customer_stats AS
SELECT 
    customer_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_spent,
    MAX(order_date) AS last_order_date
FROM large_orders
WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 YEAR)
GROUP BY customer_id;

-- Then use the temporary table
SELECT 
    customer_id,
    order_count,
    total_spent
FROM temp_customer_stats
WHERE total_spent > 1000
ORDER BY total_spent DESC;

-- Clean up
DROP TEMPORARY TABLE temp_customer_stats;

Indexing Strategies and Best Practices

Proper indexing is often the single biggest factor in database performance. Understanding the different index types and when to use each is essential.
-- Different index types and their use cases

-- 1. B-tree indexes (default for most cases)
CREATE TABLE products_indexed (
    product_id INT AUTO_INCREMENT PRIMARY KEY,
    sku VARCHAR(50) UNIQUE,
    product_name VARCHAR(200) NOT NULL,
    category_id INT NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    stock_quantity INT DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    is_active BOOLEAN DEFAULT TRUE,
    
    -- Single column indexes
    INDEX idx_category_id (category_id),
    INDEX idx_price (price),
    INDEX idx_created_at (created_at),
    
    -- Composite indexes
    INDEX idx_category_price (category_id, price),
    INDEX idx_active_created (is_active, created_at),
    
    -- Covering index
    INDEX idx_covering_product (category_id, price, product_name)
);

-- Note: MySQL has no partial (filtered) indexes, so an inline
-- "INDEX ... WHERE is_active = TRUE" clause would be a syntax error here

-- 2. Unique indexes
-- (sku already has a UNIQUE index from its column definition,
--  so a second idx_unique_sku would be redundant)
CREATE UNIQUE INDEX idx_unique_product_name ON products_indexed(product_name);

-- 3. Full-text indexes (for text search)
CREATE TABLE articles_ft (
    article_id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(500) NOT NULL,
    content TEXT NOT NULL,
    author VARCHAR(100),
    published_date DATE,
    FULLTEXT(title, content)
);

-- Using full-text search
SELECT 
    article_id,
    title,
    MATCH(title, content) AGAINST('database performance' IN NATURAL LANGUAGE MODE) AS relevance
FROM articles_ft
WHERE MATCH(title, content) AGAINST('database performance' IN NATURAL LANGUAGE MODE)
ORDER BY relevance DESC;

-- Boolean mode search
SELECT 
    article_id,
    title
FROM articles_ft
WHERE MATCH(title, content) AGAINST('+MySQL -Oracle' IN BOOLEAN MODE);

-- 4. Spatial indexes (for geographic data)
CREATE TABLE locations (
    location_id INT AUTO_INCREMENT PRIMARY KEY,
    location_name VARCHAR(100) NOT NULL,
    -- an SRID attribute is required for the optimizer to use the spatial index
    coordinates POINT NOT NULL SRID 4326,
    SPATIAL INDEX(coordinates)
);

-- Insert spatial data (SRID 4326 uses latitude-longitude axis order)
INSERT INTO locations (location_name, coordinates)
VALUES 
    ('Office', ST_GeomFromText('POINT(40.7128 -74.0060)', 4326)),
    ('Warehouse', ST_GeomFromText('POINT(40.7589 -73.9851)', 4326));

-- Spatial query: locations within 1 km of the office
SELECT 
    location_name,
    ST_AsText(coordinates) AS coordinates
FROM locations
WHERE ST_Distance_Sphere(coordinates, ST_GeomFromText('POINT(40.7128 -74.0060)', 4326)) < 1000;
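`ST_Distance_Sphere` returns the great-circle distance in meters. The underlying haversine formula is easy to sketch in Python (assuming a 6,371 km spherical Earth; MySQL's default radius is very close to this):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a 6,371 km sphere."""
    R = 6371000.0
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * R * asin(sqrt(a))

# Office (40.7128, -74.0060) to Warehouse (40.7589, -73.9851): roughly 5.4 km,
# which is why the 1 km spatial query above would not match the warehouse
d = haversine_m(40.7128, -74.0060, 40.7589, -73.9851)
print(round(d))
```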

-- Index analysis and maintenance

-- Show index statistics
SELECT 
    TABLE_NAME,
    INDEX_NAME,
    COLUMN_NAME,
    SEQ_IN_INDEX,
    CARDINALITY,
    INDEX_TYPE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME = 'products_indexed'
ORDER BY TABLE_NAME, INDEX_NAME, SEQ_IN_INDEX;

-- Analyze index usage
SELECT 
    OBJECT_TYPE,
    OBJECT_SCHEMA,
    OBJECT_NAME,
    INDEX_NAME,
    COUNT_READ,
    COUNT_FETCH,
    COUNT_INSERT,
    COUNT_UPDATE,
    COUNT_DELETE
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = DATABASE()
AND OBJECT_NAME = 'products_indexed'
ORDER BY COUNT_READ DESC;

-- Identify unused indexes
SELECT 
    OBJECT_SCHEMA,
    OBJECT_NAME,
    INDEX_NAME
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE INDEX_NAME IS NOT NULL
AND COUNT_STAR = 0
AND OBJECT_SCHEMA NOT IN ('mysql', 'performance_schema', 'information_schema')
ORDER BY OBJECT_SCHEMA, OBJECT_NAME;

-- Index fragmentation analysis
SELECT 
    TABLE_NAME,
    INDEX_NAME,
    ROUND(stat_value * @@innodb_page_size / 1024 / 1024, 2) AS index_size_mb
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
AND stat_name = 'size'
ORDER BY stat_value * @@innodb_page_size DESC;

-- Rebuild fragmented indexes
OPTIMIZE TABLE products_indexed;

-- Or for specific indexes
ALTER TABLE products_indexed ENGINE=InnoDB;

-- Indexing best practices

-- 1. Choose the right columns to index
-- Index columns used in WHERE, JOIN, ORDER BY, GROUP BY
CREATE INDEX idx_orders_customer_date ON large_orders(customer_id, order_date);
CREATE INDEX idx_products_category_active ON products_indexed(category_id, is_active);

-- 2. Composite index column order
-- Put most selective columns first
CREATE INDEX idx_selective_first ON large_orders(region, status, order_date);

-- 3. Avoid over-indexing
-- Each index adds overhead for INSERT/UPDATE/DELETE

-- 4. Use covering indexes
-- Include all columns needed in the query
CREATE INDEX idx_covering_order_details ON large_orders(
    customer_id, 
    order_date, 
    total_amount, 
    status
);

-- 5. Consider index on foreign keys
ALTER TABLE order_items ADD INDEX idx_product_id (product_id);

-- 6. Partial indexes (not supported natively)
-- MySQL has no CREATE INDEX ... WHERE syntax; simulate a partial index
-- with a generated column that is NULL outside the condition
ALTER TABLE products_indexed
    ADD COLUMN active_high_price DECIMAL(10,2)
    GENERATED ALWAYS AS (IF(is_active AND price > 100, price, NULL)) STORED;
CREATE INDEX idx_active_high_price ON products_indexed(active_high_price);

-- Indexing anti-patterns

-- 1. Indexing low-cardinality columns
CREATE INDEX idx_gender ON users(gender);  -- Usually bad (only 2-3 values)

-- 2. Too many indexes on frequently updated tables
-- Each index slows down INSERT/UPDATE/DELETE operations

-- 3. Indexing very long columns
CREATE INDEX idx_long_content ON articles(content(100));  -- Prefix index

-- 4. Redundant indexes
CREATE INDEX idx_a_b ON table1(a, b);
CREATE INDEX idx_a ON table1(a);  -- Redundant: (a) is a leftmost prefix of idx_a_b
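The redundancy rule above is the leftmost-prefix rule: an index is superfluous when its column list is a leading prefix of another index's column list. A small Python sketch of that check (a hypothetical helper, not a MySQL API):

```python
def is_redundant(candidate, existing_indexes):
    """True if candidate's columns are a leftmost prefix of any existing index."""
    cols = list(candidate)
    return any(list(idx)[:len(cols)] == cols for idx in existing_indexes)

indexes = [("a", "b")]                   # idx_a_b
print(is_redundant(("a",), indexes))     # True:  ("a") is a prefix of ("a", "b")
print(is_redundant(("b",), indexes))     # False: ("b") is not a leftmost prefix
```

The same rule explains why MySQL can use `idx_a_b` to satisfy queries filtering only on `a`, but not queries filtering only on `b`.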

-- Advanced indexing scenarios

-- 1. Descending indexes (MySQL 8.0+)
CREATE INDEX idx_orders_date_desc ON large_orders(order_date DESC);

-- 2. Functional indexes (MySQL 8.0+)
CREATE INDEX idx_month_year ON large_orders((YEAR(order_date)), (MONTH(order_date)));

-- 3. Multi-valued indexes (MySQL 8.0.17+ for JSON arrays)
CREATE TABLE users_mv (
    user_id INT PRIMARY KEY,
    name VARCHAR(100),
    tags JSON,
    INDEX idx_tags ((CAST(tags->'$[*]' AS CHAR(255) ARRAY)))
);

-- Index monitoring and tuning

-- Slow query log analysis
-- Enable slow query log
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 2;  -- Queries taking more than 2 seconds
SET GLOBAL log_queries_not_using_indexes = 1;

-- Analyze slow queries
SELECT 
    query,
    db,
    exec_count,
    avg_latency,
    rows_sent,
    rows_examined,
    rows_sent / NULLIF(rows_examined, 0) AS selectivity
FROM sys.statement_analysis
WHERE rows_examined > 1000
ORDER BY avg_latency DESC
LIMIT 10;

-- Index suggestions based on query patterns
SELECT 
    TABLE_SCHEMA,
    TABLE_NAME,
    COLUMN_NAME,
    DATA_TYPE,
    COUNT(*) AS query_count
FROM information_schema.COLUMNS c
JOIN mysql.slow_log sl ON sl.db = c.TABLE_SCHEMA  -- requires log_output to include TABLE
WHERE sl.sql_text LIKE CONCAT('%', c.COLUMN_NAME, '%')
AND sl.sql_text LIKE '%WHERE%'
GROUP BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
ORDER BY query_count DESC;

7. Database Administration

User Management and Security

Proper user management and security practices are essential for protecting sensitive data and ensuring database integrity.
-- User management and privileges

-- View current users
SELECT user, host, authentication_string FROM mysql.user;

-- Create new user
CREATE USER 'app_user'@'localhost' IDENTIFIED BY 'secure_password123';
CREATE USER 'readonly_user'@'%' IDENTIFIED BY 'readonly_pass';
CREATE USER 'admin_user'@'192.168.1.%' IDENTIFIED BY 'admin_pass';

-- Grant privileges
-- Application user (limited privileges)
GRANT SELECT, INSERT, UPDATE, DELETE ON company.* TO 'app_user'@'localhost';

-- Read-only user
GRANT SELECT ON company.* TO 'readonly_user'@'%';

-- Admin user (all privileges on specific database)
GRANT ALL PRIVILEGES ON company.* TO 'admin_user'@'192.168.1.%';

-- Grant specific table privileges
GRANT SELECT, INSERT ON company.products TO 'app_user'@'localhost';
GRANT SELECT ON company.salaries TO 'hr_user'@'localhost';

-- Grant stored procedure privileges
GRANT EXECUTE ON PROCEDURE company.CalculateBonus TO 'app_user'@'localhost';

-- View user privileges
SHOW GRANTS FOR 'app_user'@'localhost';
SELECT * FROM mysql.user WHERE user = 'app_user';

-- Revoke privileges
REVOKE DELETE ON company.* FROM 'app_user'@'localhost';
REVOKE ALL PRIVILEGES ON company.salaries FROM 'hr_user'@'localhost';

-- Modify users
ALTER USER 'app_user'@'localhost' IDENTIFIED BY 'new_secure_password123';
RENAME USER 'old_user'@'localhost' TO 'new_user'@'localhost';

-- Lock and unlock users
ALTER USER 'app_user'@'localhost' ACCOUNT LOCK;
ALTER USER 'app_user'@'localhost' ACCOUNT UNLOCK;

-- Drop users
DROP USER 'old_user'@'localhost';

-- Roles (MySQL 8.0+)
CREATE ROLE 'read_only';
CREATE ROLE 'data_entry';
CREATE ROLE 'admin';

-- Grant privileges to roles
GRANT SELECT ON company.* TO 'read_only';
GRANT SELECT, INSERT, UPDATE ON company.* TO 'data_entry';
GRANT ALL PRIVILEGES ON company.* TO 'admin';

-- Assign roles to users
GRANT 'read_only' TO 'readonly_user'@'%';
GRANT 'data_entry' TO 'app_user'@'localhost';
GRANT 'admin' TO 'admin_user'@'192.168.1.%';

-- Set default roles
SET DEFAULT ROLE 'data_entry' TO 'app_user'@'localhost';

-- Security best practices

-- Password validation plugin (MySQL 8.0+)
INSTALL COMPONENT 'file://component_validate_password';
SET GLOBAL validate_password.policy = MEDIUM;

-- Secure connections
CREATE USER 'secure_user'@'%' IDENTIFIED BY 'password' REQUIRE SSL;

-- Password expiration
ALTER USER 'app_user'@'localhost' PASSWORD EXPIRE INTERVAL 90 DAY;

-- Resource limits (queries, updates, and connections per hour)
CREATE USER 'limited_user'@'localhost' 
IDENTIFIED BY 'password'
WITH MAX_QUERIES_PER_HOUR 1000
MAX_UPDATES_PER_HOUR 100
MAX_CONNECTIONS_PER_HOUR 50
MAX_USER_CONNECTIONS 10;

-- Audit logging (Enterprise feature or using triggers)
CREATE TABLE security_audit (
    audit_id INT AUTO_INCREMENT PRIMARY KEY,
    user_name VARCHAR(50),
    action_type VARCHAR(50),
    table_name VARCHAR(50),
    record_id INT,
    old_values JSON,
    new_values JSON,
    changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ip_address VARCHAR(45)
);

-- Example audit trigger
DELIMITER //
CREATE TRIGGER audit_employee_changes
    AFTER UPDATE ON employees
    FOR EACH ROW
BEGIN
    IF NEW.salary != OLD.salary OR NEW.department_id != OLD.department_id THEN
        INSERT INTO security_audit (
            user_name, 
            action_type, 
            table_name, 
            record_id, 
            old_values, 
            new_values,
            ip_address
        )
        VALUES (
            USER(),
            'UPDATE',
            'employees',
            NEW.employee_id,
            JSON_OBJECT(
                'salary', OLD.salary,
                'department_id', OLD.department_id
            ),
            JSON_OBJECT(
                'salary', NEW.salary,
                'department_id', NEW.department_id
            ),
            -- Get client IP (requires additional setup)
            'unknown'
        );
    END IF;
END //
DELIMITER ;
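The `JSON_OBJECT` payloads built by the trigger can also be assembled application-side before writing the audit row. A sketch that diffs old and new rows into the same `old_values`/`new_values` shape (field names mirror the trigger; the helper itself is illustrative):

```python
import json

def audit_payload(old_row, new_row, watched=("salary", "department_id")):
    """Return (old_values, new_values) JSON strings covering watched fields that changed."""
    changed = [f for f in watched if old_row.get(f) != new_row.get(f)]
    if not changed:
        return None, None  # nothing worth auditing
    old_values = json.dumps({f: old_row[f] for f in changed})
    new_values = json.dumps({f: new_row[f] for f in changed})
    return old_values, new_values

old = {"salary": 50000, "department_id": 3}
new = {"salary": 55000, "department_id": 3}
print(audit_payload(old, new))  # only the salary change is recorded
```

Doing the diff in the application keeps the audit rows small; the trigger version has the advantage of catching changes made outside the application as well.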

-- Database backup and recovery strategies

-- Logical backup with mysqldump
-- Command line: mysqldump -u root -p company > company_backup.sql

-- Partial backups
-- Backup specific tables
-- mysqldump -u root -p company employees departments > hr_tables.sql

-- Backup with conditions
-- mysqldump -u root -p company --where="salary>50000" employees > high_salary_employees.sql

-- Backup stored procedures and functions
-- mysqldump -u root -p --routines --no-data company > schema_and_routines.sql

-- Physical backup (file system level)
-- Requires MySQL to be stopped or using special tools

-- Point-in-time recovery with binary logs
-- Enable binary logging in my.cnf:
-- log_bin = /var/log/mysql/mysql-bin.log
-- expire_logs_days = 7

-- View binary logs
SHOW BINARY LOGS;
SHOW BINLOG EVENTS IN 'mysql-bin.000001';

-- Backup automation with events
DELIMITER //
CREATE EVENT daily_backup
ON SCHEDULE EVERY 1 DAY
STARTS '2023-01-01 02:00:00'
DO
BEGIN
    -- This would typically call an external script
    -- For demonstration, we'll just log
    INSERT INTO backup_log (backup_type, status, completed_at)
    VALUES ('scheduled', 'completed', NOW());
END //
DELIMITER ;

-- Recovery procedures

-- Restore from logical backup
-- mysql -u root -p company < company_backup.sql

-- Point-in-time recovery
-- mysqlbinlog mysql-bin.000001 --start-datetime="2023-01-15 10:00:00" | mysql -u root -p

-- Disaster recovery planning

-- Replication setup
-- On master:
-- server_id = 1
-- log_bin = mysql-bin
-- binlog_do_db = company

-- On slave:
-- server_id = 2
-- relay_log = mysql-relay-bin
-- read_only = 1

-- Monitor replication (SHOW REPLICA STATUS in MySQL 8.0.22+)
SHOW SLAVE STATUS\G

-- High availability with group replication (MySQL 8.0+)
-- Requires special configuration for multi-master replication

-- Performance monitoring

-- Show process list
SHOW PROCESSLIST;
SELECT * FROM information_schema.PROCESSLIST;

-- Show current locks
SHOW ENGINE INNODB STATUS;
-- MySQL 5.7:
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
-- MySQL 8.0 (the tables above were removed):
SELECT * FROM performance_schema.data_locks;
SELECT * FROM performance_schema.data_lock_waits;

-- Monitor table sizes
SELECT 
    TABLE_SCHEMA,
    TABLE_NAME,
    ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS size_mb,
    TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY (DATA_LENGTH + INDEX_LENGTH) DESC;

-- Monitor connection usage
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';

-- Buffer pool usage
SHOW STATUS LIKE 'Innodb_buffer_pool%';

-- Maintenance operations

-- Optimize tables
OPTIMIZE TABLE large_orders, products, customers;

-- Check and repair tables
CHECK TABLE large_orders;
REPAIR TABLE large_orders;  -- Only for MyISAM

-- Update statistics
ANALYZE TABLE large_orders, products, customers;

-- Clean up old data
PURGE BINARY LOGS BEFORE '2023-01-01 00:00:00';

-- Reset performance schema if needed
-- TRUNCATE TABLE performance_schema.events_waits_history_long;

Backup, Recovery, and High Availability

Robust backup strategies, disaster recovery plans, and high availability setups are critical for production database systems.
-- Comprehensive backup and recovery strategies

-- 1. Logical backups with mysqldump

-- Full database backup
-- mysqldump -u root -p --single-transaction --routines --triggers --events company > full_backup.sql

-- Backup with compression
-- mysqldump -u root -p company | gzip > company_backup_$(date +%Y%m%d).sql.gz

-- Backup specific schemas
-- mysqldump -u root -p --databases company hr finance > multiple_dbs.sql

-- Backup with master data for replication
-- mysqldump -u root -p --single-transaction --master-data=2 company > backup_with_binlog_info.sql

-- 2. MySQL Enterprise Backup (for InnoDB)
-- Requires commercial license
-- mysqlbackup --backup-dir=/backups --user=root --password backup

-- 3. File system snapshots (LVM, ZFS, etc.)
-- Requires coordination with MySQL

-- Backup automation script example
-- Save as /scripts/mysql_backup.sh
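The shell script itself isn't shown; its core logic, sketched in Python for illustration (the backup directory and database name are placeholders, and the command is only assembled, not executed):

```python
from datetime import datetime

def build_backup_command(database, backup_dir="/backups"):
    """Assemble a mysqldump pipeline with a timestamped, compressed output file."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    outfile = f"{backup_dir}/{database}_backup_{stamp}.sql.gz"
    # --single-transaction takes a consistent InnoDB snapshot without locking tables
    return (
        f"mysqldump --single-transaction --routines --triggers --events "
        f"{database} | gzip > {outfile}"
    )

print(build_backup_command("company"))
```

A real script would also run the command, check its exit status, and call the `CreateBackupLog`/`UpdateBackupLog` procedures below to record the result.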

DELIMITER //
CREATE PROCEDURE CreateBackupLog()
BEGIN
    INSERT INTO backup_audit (
        backup_type,
        backup_file,
        file_size,
        status,
        created_at
    ) VALUES (
        'scheduled',
        CONCAT('company_backup_', DATE_FORMAT(NOW(), '%Y%m%d_%H%i%s'), '.sql'),
        0,
        'started',
        NOW()
    );
END //

-- Parameters are prefixed p_ so they don't shadow column names;
-- otherwise "WHERE backup_file = backup_file" is always true and updates every row
CREATE PROCEDURE UpdateBackupLog(IN p_backup_file VARCHAR(255), IN p_file_size BIGINT)
BEGIN
    UPDATE backup_audit 
    SET status = 'completed',
        file_size = p_file_size,
        completed_at = NOW()
    WHERE backup_file = p_backup_file;
END //
DELIMITER ;

-- Create backup audit table
CREATE TABLE backup_audit (
    audit_id INT AUTO_INCREMENT PRIMARY KEY,
    backup_type ENUM('scheduled', 'manual', 'emergency'),
    backup_file VARCHAR(255),
    file_size BIGINT,
    status ENUM('started', 'completed', 'failed', 'verified', 'verification_failed'),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    completed_at TIMESTAMP NULL,
    error_message TEXT
);

-- Recovery procedures

-- 1. Full database recovery
-- mysql -u root -p < full_backup.sql

-- 2. Partial recovery (specific tables)
-- mysql -u root -p company < employees_backup.sql

-- 3. Point-in-time recovery
-- Step 1: Restore last full backup
-- mysql -u root -p < full_backup.sql

-- Step 2: Apply binary logs since backup
-- mysqlbinlog --start-datetime="2023-01-15 10:00:00" mysql-bin.000001 mysql-bin.000002 | mysql -u root -p

-- 4. Table-level recovery from backup
-- Create temporary database for extraction
CREATE DATABASE temp_recovery;
-- mysql -u root -p temp_recovery < full_backup.sql
-- mysqldump -u root -p temp_recovery specific_table > recovered_table.sql
-- mysql -u root -p company < recovered_table.sql
-- DROP DATABASE temp_recovery;

-- Disaster recovery testing

-- Test recovery procedure
CREATE DATABASE recovery_test;
-- mysql -u root -p recovery_test < full_backup.sql

-- Verify data integrity
SELECT 
    TABLE_NAME,
    TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'recovery_test';

-- Compare with production
SELECT 
    'production' as source,
    COUNT(*) as row_count
FROM company.employees
UNION ALL
SELECT 
    'recovery_test' as source,
    COUNT(*) as row_count
FROM recovery_test.employees;

-- High availability setups

-- 1. Master-Slave replication
-- On master configuration:
-- server_id = 1
-- log_bin = /var/log/mysql/mysql-bin.log
-- binlog_do_db = company

-- On slave configuration:
-- server_id = 2
-- relay_log = /var/log/mysql/mysql-relay-bin.log
-- read_only = 1

-- Setup replication
-- On master:
CREATE USER 'repl'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
SHOW MASTER STATUS;

-- On slave:
CHANGE MASTER TO
MASTER_HOST='master_host',
MASTER_USER='repl',
MASTER_PASSWORD='repl_password',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=107;

START SLAVE;

-- Monitor replication
SHOW SLAVE STATUS\G

-- 2. Master-Master replication
-- Both servers act as masters
-- Requires careful conflict resolution

-- 3. Group Replication (MySQL 8.0+)
-- Provides built-in high availability
-- Configuration example:
-- plugin_load_add='group_replication.so'
-- group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
-- group_replication_start_on_boot=off
-- group_replication_local_address= "node1:33061"
-- group_replication_group_seeds= "node1:33061,node2:33061,node3:33061"

-- Start group replication
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

-- 4. MySQL InnoDB Cluster (MySQL 8.0+)
-- Provides complete high availability solution
-- Uses Group Replication + MySQL Router

-- Monitoring and maintenance

-- Replication monitoring
SELECT 
    channel_name,
    service_state,
    last_error_number,
    last_error_message
FROM performance_schema.replication_connection_status;

-- Lag monitoring
SHOW SLAVE STATUS;
SELECT 
    TIMESTAMPDIFF(SECOND, MAX(ts), NOW()) AS replication_lag_seconds
FROM company.replication_heartbeat;

-- Automated failover procedures

-- Failover script concept (illustrative only; production failover is normally
-- driven by an external orchestration tool, which also repoints clients)
DELIMITER //
CREATE PROCEDURE PromoteSlaveToMaster()
BEGIN
    -- Stop slave
    STOP SLAVE;
    
    -- Reset slave
    RESET SLAVE ALL;
    
    -- Make writable
    SET GLOBAL read_only = 0;
    
    -- Update application configuration
    -- This would typically call an external script
    
    -- Log the promotion
    INSERT INTO failover_events (event_type, old_master, new_master, occurred_at)
    VALUES ('promotion', @old_master_host, @@hostname, NOW());
END //

CREATE PROCEDURE SetupReplicationToNewMaster(IN new_master_host VARCHAR(255))
BEGIN
    -- Stop current replication
    STOP SLAVE;
    
    -- Configure new master
    CHANGE MASTER TO
    MASTER_HOST = new_master_host,
    MASTER_USER = 'repl',
    MASTER_PASSWORD = 'repl_password',
    MASTER_AUTO_POSITION = 1;
    
    -- Start replication
    START SLAVE;
    
    -- Log the setup
    INSERT INTO failover_events (event_type, new_master, occurred_at)
    VALUES ('replication_setup', new_master_host, NOW());
END //
DELIMITER ;

-- Backup verification

-- Create backup verification procedure
DELIMITER //
CREATE PROCEDURE VerifyBackup(IN p_backup_file VARCHAR(255))
BEGIN
    DECLARE backup_row_count INT;
    DECLARE production_row_count INT;
    
    -- Count rows in backup (restored to a temp database beforehand)
    SELECT COUNT(*) INTO backup_row_count 
    FROM temp_restore.employees;
    
    -- Count rows in production
    SELECT COUNT(*) INTO production_row_count 
    FROM company.employees;
    
    -- Compare; the parameter is named p_backup_file so the WHERE clause
    -- matches the column instead of comparing the parameter to itself
    IF backup_row_count = production_row_count THEN
        UPDATE backup_audit 
        SET status = 'verified',
            completed_at = NOW()
        WHERE backup_file = p_backup_file;
    ELSE
        UPDATE backup_audit 
        SET status = 'verification_failed',
            error_message = CONCAT('Row count mismatch: backup=', backup_row_count, ' production=', production_row_count)
        WHERE backup_file = p_backup_file;
    END IF;
END //
DELIMITER ;

-- Emergency procedures

-- Database corruption recovery
-- Check for corruption
CHECK TABLE company.employees EXTENDED;

-- Repair if needed (for MyISAM)
REPAIR TABLE company.employees;

-- For InnoDB, use innodb_force_recovery
-- Set in my.cnf: innodb_force_recovery = 1 (then 2, 3, etc. if needed)

-- Data recovery from corrupted tables
-- Try to dump data
-- mysqldump -u root -p company employees > recovered_data.sql

-- Performance under recovery
-- Monitor recovery progress
SHOW ENGINE INNODB STATUS;

-- Set recovery settings
-- innodb_force_recovery is not a dynamic variable, so it cannot be set
-- with SET GLOBAL; add it to my.cnf instead:
-- [mysqld]
-- innodb_force_recovery = 1
-- Restart MySQL and attempt data recovery
-- Then remove innodb_force_recovery and restart normally

-- Documentation and runbooks

-- Create maintenance documentation table
CREATE TABLE maintenance_runbooks (
    runbook_id INT AUTO_INCREMENT PRIMARY KEY,
    procedure_name VARCHAR(100),
    description TEXT,
    steps JSON,
    estimated_duration INT,
    risk_level ENUM('low', 'medium', 'high'),
    last_performed TIMESTAMP NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Example runbook entry
INSERT INTO maintenance_runbooks (
    procedure_name,
    description,
    steps,
    estimated_duration,
    risk_level
) VALUES (
    'Monthly Database Maintenance',
    'Routine maintenance including optimization and statistics update',
    JSON_ARRAY(
        'Backup database',
        'Check table integrity',
        'Optimize fragmented tables',
        'Update statistics',
        'Verify backup',
        'Document results'
    ),
    120,  -- 2 hours
    'low'
);

8. Programming & Integration

Stored Procedures, Functions, and Triggers

MySQL supports stored routines and triggers for implementing business logic directly in the database, improving performance and data consistency.
-- Stored procedures for complex business logic

DELIMITER //

-- Procedure with parameters and error handling
CREATE PROCEDURE ProcessEmployeeBonus(
    IN emp_id INT,
    IN bonus_percentage DECIMAL(5,2),
    OUT bonus_amount DECIMAL(10,2),
    OUT result_message VARCHAR(200)
)
BEGIN
    DECLARE current_salary DECIMAL(10,2);
    DECLARE employee_exists INT DEFAULT 0;
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        SET result_message = 'Error processing bonus';
        SET bonus_amount = 0;
    END;
    
    -- Check if employee exists
    SELECT COUNT(*) INTO employee_exists
    FROM employees
    WHERE employee_id = emp_id;
    
    IF employee_exists = 0 THEN
        SET result_message = 'Employee not found';
        SET bonus_amount = 0;
    ELSEIF bonus_percentage <= 0 OR bonus_percentage > 50 THEN
        SET result_message = 'Invalid bonus percentage';
        SET bonus_amount = 0;
    ELSE
        START TRANSACTION;
        
        -- Get current salary
        SELECT salary INTO current_salary
        FROM employees
        WHERE employee_id = emp_id
        FOR UPDATE;  -- Lock the row
        
        -- Calculate bonus
        SET bonus_amount = (current_salary * 12) * (bonus_percentage / 100);
        
        -- Insert bonus record
        INSERT INTO employee_bonuses (
            employee_id,
            bonus_amount,
            bonus_percentage,
            calculated_date
        ) VALUES (
            emp_id,
            bonus_amount,
            bonus_percentage,
            CURDATE()
        );
        
        -- Update employee record
        UPDATE employees
        SET last_bonus_date = CURDATE(),
            last_bonus_amount = bonus_amount
        WHERE employee_id = emp_id;
        
        COMMIT;
        
        SET result_message = CONCAT(
            'Bonus processed successfully: $', 
            ROUND(bonus_amount, 2)
        );
    END IF;
END //

-- Function for calculated fields
CREATE FUNCTION CalculateEmployeeTenure(emp_id INT)
RETURNS INT
READS SQL DATA
NOT DETERMINISTIC  -- depends on CURDATE() and table data
BEGIN
    DECLARE tenure_years INT;
    
    SELECT TIMESTAMPDIFF(YEAR, hire_date, CURDATE())
    INTO tenure_years
    FROM employees
    WHERE employee_id = emp_id;
    
    RETURN tenure_years;
END //

-- Function with complex logic
CREATE FUNCTION GetEmployeePerformanceRating(emp_id INT)
RETURNS VARCHAR(20)
READS SQL DATA
NOT DETERMINISTIC  -- result depends on current table data
BEGIN
    DECLARE avg_salary DECIMAL(10,2);
    DECLARE emp_salary DECIMAL(10,2);
    DECLARE performance_rating VARCHAR(20);
    
    -- Get department average salary
    SELECT AVG(salary) INTO avg_salary
    FROM employees e
    WHERE e.department_id = (
        SELECT department_id 
        FROM employees 
        WHERE employee_id = emp_id
    );
    
    -- Get employee salary
    SELECT salary INTO emp_salary
    FROM employees
    WHERE employee_id = emp_id;
    
    -- Determine performance rating
    IF emp_salary > avg_salary * 1.2 THEN
        SET performance_rating = 'Excellent';
    ELSEIF emp_salary > avg_salary THEN
        SET performance_rating = 'Good';
    ELSEIF emp_salary > avg_salary * 0.8 THEN
        SET performance_rating = 'Average';
    ELSE
        SET performance_rating = 'Needs Improvement';
    END IF;
    
    RETURN performance_rating;
END //

-- Triggers for data integrity

-- Before insert trigger
CREATE TRIGGER before_employee_insert
    BEFORE INSERT ON employees
    FOR EACH ROW
BEGIN
    -- Set default hire date if not provided
    IF NEW.hire_date IS NULL THEN
        SET NEW.hire_date = CURDATE();
    END IF;
    
    -- Validate email format
    IF NEW.email NOT LIKE '%_@__%.__%' THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'Invalid email format';
    END IF;
    
    -- Set created timestamp
    SET NEW.created_at = NOW();
END //

-- After update trigger for audit trail
CREATE TRIGGER after_employee_update
    AFTER UPDATE ON employees
    FOR EACH ROW
BEGIN
    -- Log only the fields that actually changed
    -- (an unconditional insert here would record a bogus salary change
    -- whenever only the department changed)
    IF NEW.salary != OLD.salary THEN
        INSERT INTO employee_audit (
            employee_id, changed_field, old_value, new_value, changed_by, changed_at
        )
        VALUES (
            NEW.employee_id, 'salary', OLD.salary, NEW.salary, USER(), NOW()
        );
    END IF;
    
    IF NEW.department_id != OLD.department_id THEN
        INSERT INTO employee_audit (
            employee_id, changed_field, old_value, new_value, changed_by, changed_at
        )
        VALUES (
            NEW.employee_id, 'department_id', OLD.department_id, NEW.department_id, USER(), NOW()
        );
    END IF;
END //

-- Delete-prevention trigger (enforcing soft deletes)
-- Note: a trigger on employees cannot UPDATE employees itself, and the
-- SIGNAL rolls back the statement anyway; the soft delete (setting
-- status = 'terminated') belongs in a separate termination procedure
CREATE TRIGGER before_employee_delete
    BEFORE DELETE ON employees
    FOR EACH ROW
BEGIN
    SIGNAL SQLSTATE '45000'
    SET MESSAGE_TEXT = 'Employees are soft-deleted. Use termination procedure.';
END //

-- Complex stored procedure with cursors
CREATE PROCEDURE GenerateDepartmentReport(IN dept_id INT)
BEGIN
    DECLARE done INT DEFAULT FALSE;
    DECLARE emp_id INT;
    DECLARE emp_name VARCHAR(100);
    DECLARE emp_salary DECIMAL(10,2);
    DECLARE total_salary DECIMAL(12,2) DEFAULT 0;
    DECLARE employee_count INT DEFAULT 0;
    
    DECLARE emp_cursor CURSOR FOR
        SELECT employee_id, CONCAT(first_name, ' ', last_name), salary
        FROM employees
        WHERE department_id = dept_id
        AND status = 'active';
    
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
    
    -- Create temporary table for report
    CREATE TEMPORARY TABLE dept_report (
        employee_id INT,
        employee_name VARCHAR(100),
        salary DECIMAL(10,2),
        salary_rank INT,
        department_avg DECIMAL(10,2)
    );
    
    OPEN emp_cursor;
    
    read_loop: LOOP
        FETCH emp_cursor INTO emp_id, emp_name, emp_salary;
        IF done THEN
            LEAVE read_loop;
        END IF;
        
        SET total_salary = total_salary + emp_salary;
        SET employee_count = employee_count + 1;
        
        INSERT INTO dept_report (employee_id, employee_name, salary)
        VALUES (emp_id, emp_name, emp_salary);
    END LOOP;
    
    CLOSE emp_cursor;
    
    -- Fill in the department average
    UPDATE dept_report
    SET department_avg = total_salary / NULLIF(employee_count, 0);
    
    -- Return the report; the rank is computed with a window function here
    -- because a correlated subquery would reference the TEMPORARY table
    -- twice, which MySQL rejects ("Can't reopen table")
    SELECT 
        employee_id,
        employee_name,
        salary,
        RANK() OVER (ORDER BY salary DESC) AS salary_rank,
        department_avg
    FROM dept_report
    ORDER BY salary DESC;
    
    -- Clean up
    DROP TEMPORARY TABLE dept_report;
    
END //

DELIMITER ;

-- Using stored procedures and functions

-- Call procedure with output parameters
SET @bonus_amt = 0;
SET @message = '';
CALL ProcessEmployeeBonus(1, 10, @bonus_amt, @message);
SELECT @bonus_amt, @message;

-- Use function in queries
SELECT 
    employee_id,
    first_name,
    last_name,
    salary,
    CalculateEmployeeTenure(employee_id) AS tenure_years,
    GetEmployeePerformanceRating(employee_id) AS performance
FROM employees
WHERE status = 'active';

-- Event scheduler for automated tasks

-- Enable event scheduler
SET GLOBAL event_scheduler = ON;

-- Create event for daily maintenance
DELIMITER //
CREATE EVENT daily_maintenance
ON SCHEDULE EVERY 1 DAY
STARTS '2023-01-01 02:00:00'
DO
BEGIN
    -- Update employee tenure
    UPDATE employees
    SET tenure_years = CalculateEmployeeTenure(employee_id)
    WHERE status = 'active';
    
    -- Archive old audit records
    INSERT INTO employee_audit_archive
    SELECT * FROM employee_audit
    WHERE changed_at < DATE_SUB(NOW(), INTERVAL 1 YEAR);
    
    DELETE FROM employee_audit
    WHERE changed_at < DATE_SUB(NOW(), INTERVAL 1 YEAR);
    
    -- Log maintenance completion
    INSERT INTO maintenance_log (task_name, completed_at)
    VALUES ('daily_maintenance', NOW());
END //
DELIMITER ;

-- View events
SHOW EVENTS;
SELECT * FROM information_schema.EVENTS;

Application Integration and Best Practices

Integrating MySQL with applications requires proper connection management, query optimization, and error handling for robust performance.
-- Application integration patterns

-- 1. Connection pooling and management
-- Use connection pools in your application
-- Set appropriate timeout values
SET GLOBAL wait_timeout = 600;
SET GLOBAL interactive_timeout = 600;
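Connection pooling itself is usually provided by the driver or framework; the mechanism can be sketched with a thread-safe queue (`DummyConnection` stands in for a real driver connection, e.g. from mysql-connector-python):

```python
import queue

class DummyConnection:
    """Stand-in for a real driver connection object."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

class ConnectionPool:
    """Minimal fixed-size pool: connections are reused, never re-created."""
    def __init__(self, size):
        self._pool = queue.Queue(maxsize=size)
        for i in range(size):
            self._pool.put(DummyConnection(i))

    def acquire(self, timeout=5.0):
        # Blocks until a connection is free, bounding concurrent connections
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        # Return the connection for reuse instead of closing it
        self._pool.put(conn)

pool = ConnectionPool(size=3)
conn = pool.acquire()
print(conn.conn_id)
pool.release(conn)
```

The pool size should stay comfortably below the server's `max_connections`, and idle connections must be validated or refreshed before `wait_timeout` closes them server-side.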

-- 2. Prepared statements for security and performance

-- Using prepared statements in MySQL
PREPARE employee_stmt FROM 
'SELECT * FROM employees WHERE department_id = ? AND salary > ?';

SET @dept_id = 1;
SET @min_salary = 50000;
EXECUTE employee_stmt USING @dept_id, @min_salary;

DEALLOCATE PREPARE employee_stmt;
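The same placeholder discipline applies in application code. A runnable sketch using Python's DB-API, with sqlite3 standing in for a MySQL driver (mysql-connector-python follows the identical pattern, with `%s` placeholders instead of `?`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, department_id INTEGER, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, 1, 60000), (2, 1, 45000), (3, 2, 70000)],
)

# Parameters are passed separately, never interpolated into the SQL string,
# which prevents SQL injection and lets the driver reuse the statement
rows = conn.execute(
    "SELECT employee_id FROM employees WHERE department_id = ? AND salary > ?",
    (1, 50000),
).fetchall()
print(rows)  # [(1,)]
```

String-formatting user input into SQL (`f"... salary > {value}"`) is the anti-pattern the placeholder syntax exists to replace.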

-- 3. Transaction management

-- Basic transaction
START TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;

-- Check for errors in application code
-- If no errors:
COMMIT;

-- If errors:
ROLLBACK;
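The commit-or-rollback pattern translates directly to application code. A sqlite3 sketch of the transfer above (account IDs and balances are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500.0), (2, 100.0)])
conn.commit()

try:
    # Both updates succeed or neither does
    conn.execute("UPDATE accounts SET balance = balance - 100 WHERE account_id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 100 WHERE account_id = 2")
    conn.commit()
except sqlite3.Error:
    conn.rollback()  # undo any partial work on failure

balances = dict(conn.execute("SELECT account_id, balance FROM accounts"))
print(balances)  # {1: 400.0, 2: 200.0}
```

The important property is that the rollback path is reached on *any* error between the two updates, so money is never debited without being credited.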

-- 4. Error handling in stored procedures

DELIMITER //
CREATE PROCEDURE TransferFunds(
    IN from_account INT,
    IN to_account INT,
    IN amount DECIMAL(10,2),
    OUT success BOOLEAN,
    OUT error_msg VARCHAR(200)
)
BEGIN
    DECLARE from_balance DECIMAL(10,2);
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        SET success = FALSE;
        SET error_msg = 'Database error during transfer';
    END;
    
    START TRANSACTION;
    
    -- Check if from account exists and has sufficient funds
    SELECT balance INTO from_balance
    FROM accounts
    WHERE account_id = from_account
    FOR UPDATE;
    
    IF from_balance IS NULL THEN
        ROLLBACK;
        SET success = FALSE;
        SET error_msg = 'Source account not found';
    ELSEIF from_balance < amount THEN
        ROLLBACK;
        SET success = FALSE;
        SET error_msg = 'Insufficient funds';
    ELSE
        -- Perform transfer
        UPDATE accounts SET balance = balance - amount 
        WHERE account_id = from_account;
        
        UPDATE accounts SET balance = balance + amount 
        WHERE account_id = to_account;
        
        -- Log the transaction
        INSERT INTO transactions (from_account, to_account, amount, transaction_date)
        VALUES (from_account, to_account, amount, NOW());
        
        COMMIT;
        SET success = TRUE;
        SET error_msg = NULL;
    END IF;
END //
DELIMITER ;

-- 5. Bulk operations for better performance

-- Bulk insert
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
VALUES 
(1, 101, 2, 25.00),
(1, 102, 1, 50.00),
(1, 103, 3, 15.00);

-- Bulk update using CASE
UPDATE products
SET price = CASE
    WHEN category_id = 1 THEN price * 0.9  -- 10% discount
    WHEN category_id = 2 THEN price * 1.1  -- 10% increase
    ELSE price
END,
stock_quantity = CASE
    WHEN stock_quantity < 10 THEN stock_quantity + 20  -- Restock
    ELSE stock_quantity
END;

-- 6. Pagination for large result sets

-- Efficient pagination using keyset (cursor) pagination
SELECT *
FROM orders
WHERE order_id > ?  -- Last seen order_id
AND customer_id = ?
ORDER BY order_id
LIMIT 20;

-- Traditional LIMIT/OFFSET (less efficient for large offsets)
SELECT *
FROM orders
WHERE customer_id = ?
ORDER BY order_date DESC
LIMIT 20 OFFSET 40;  -- Page 3
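The efficiency difference is that keyset pagination seeks straight to the last-seen key, while OFFSET must scan and discard all the skipped rows. The paging logic, simulated over an in-memory list of order IDs:

```python
def keyset_page(order_ids, last_seen_id, page_size=20):
    """Return the next page of IDs greater than last_seen_id.

    Mirrors: WHERE order_id > ? ORDER BY order_id LIMIT page_size
    """
    return sorted(o for o in order_ids if o > last_seen_id)[:page_size]

order_ids = list(range(1, 101))  # 100 orders
page1 = keyset_page(order_ids, last_seen_id=0, page_size=3)
page2 = keyset_page(order_ids, last_seen_id=page1[-1], page_size=3)
print(page1, page2)  # [1, 2, 3] [4, 5, 6]
```

The caller carries the last ID of each page forward as the cursor for the next request; with an index on `order_id`, the database version of this seek is O(page size) regardless of how deep the page is.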

-- 7. Full-text search integration

-- Create full-text index
CREATE TABLE product_search (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(200),
    description TEXT,
    category VARCHAR(100),
    FULLTEXT(product_name, description, category)
);

-- Search with relevance scoring
SELECT 
    product_id,
    product_name,
    MATCH(product_name, description, category) AGAINST('wireless bluetooth speaker' IN NATURAL LANGUAGE MODE) AS relevance
FROM product_search
WHERE MATCH(product_name, description, category) AGAINST('wireless bluetooth speaker' IN NATURAL LANGUAGE MODE)
ORDER BY relevance DESC;

-- 8. JSON data handling

-- Store and query JSON data
CREATE TABLE user_preferences (
    user_id INT PRIMARY KEY,
    preferences JSON,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO user_preferences (user_id, preferences)
VALUES (1, '{
    "theme": "dark",
    "notifications": {
        "email": true,
        "push": false,
        "sms": true
    },
    "language": "en",
    "font_size": "medium"
}');

-- Query JSON data
SELECT 
    user_id,
    preferences->>'$.theme' AS theme,
    preferences->>'$.notifications.email' AS email_notifications,
    JSON_EXTRACT(preferences, '$.language') AS language
FROM user_preferences
WHERE preferences->>'$.theme' = 'dark';

-- Update JSON data
UPDATE user_preferences
SET preferences = JSON_SET(
    preferences,
    '$.theme', 'light',
    '$.font_size', 'large'
)
WHERE user_id = 1;
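To make the `JSON_SET` semantics concrete: it sets a value at each path, creating keys that don't exist and overwriting ones that do, while leaving the rest of the document untouched. A pure-Python sketch of that behavior for simple `$.a.b` paths (the real function also handles arrays and quoted keys):

```python
import json

# Minimal sketch of JSON_SET semantics: set each (path, value) pair,
# creating missing keys and overwriting existing ones.
def json_set(doc: str, *pairs) -> str:
    data = json.loads(doc)
    for path, value in zip(pairs[::2], pairs[1::2]):
        keys = path.lstrip("$.").split(".")  # handles simple '$.a.b' paths only
        node = data
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return json.dumps(data)

prefs = '{"theme": "dark", "font_size": "medium"}'
updated = json.loads(json_set(prefs, "$.theme", "light", "$.font_size", "large"))
print(updated["theme"], updated["font_size"])  # light large
```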

-- 9. Application-level caching strategies

-- Query result caching (using application cache)
-- Cache frequently accessed, rarely changed data
SELECT 
    product_id,
    product_name,
    price,
    category
FROM products
WHERE is_active = TRUE
AND category_id = ?;

-- Cache this result for 1 hour in application cache
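The caching pattern above lives entirely in the application: check the cache, fall through to MySQL on a miss, store the result with a time-to-live. A minimal in-process sketch is below; a real deployment would typically use Redis or Memcached instead of a dict, and the query/row values here are illustrative stand-ins.

```python
import time

# A minimal TTL cache for query results. In production this role is
# usually played by Redis or Memcached, not an in-process dict.
class QueryCache:
    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = QueryCache(ttl_seconds=3600)
key = ("active_products", 5)  # (query name, category_id)

rows = cache.get(key)
if rows is None:
    # Cache miss: this is where the SELECT would run against MySQL.
    rows = [(101, "Speaker", 49.99, "Audio")]  # stand-in for fetched rows
    cache.set(key, rows)

print(cache.get(key) == rows)  # True until the TTL expires
```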

-- 10. Monitoring and logging

-- Create application log table
CREATE TABLE application_logs (
    log_id INT AUTO_INCREMENT PRIMARY KEY,
    log_level ENUM('DEBUG', 'INFO', 'WARN', 'ERROR'),
    message TEXT,
    context JSON,
    user_id INT,
    ip_address VARCHAR(45),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_log_level (log_level),
    INDEX idx_created_at (created_at)
);

-- Log slow queries (requires slow_query_log=ON and log_output='TABLE')
INSERT INTO application_logs (log_level, message, context, created_at)
SELECT 
    'WARN',
    'Slow query detected',
    JSON_OBJECT(
        'query', sql_text,
        'execution_time', TIME_TO_SEC(query_time),
        'rows_examined', rows_examined
    ),
    NOW()
FROM mysql.slow_log
WHERE TIME_TO_SEC(query_time) > 10;  -- Queries taking more than 10 seconds

-- 11. Database versioning and migrations

-- Create migrations table
CREATE TABLE schema_migrations (
    version VARCHAR(50) PRIMARY KEY,
    description TEXT,
    applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Example migration
INSERT INTO schema_migrations (version, description)
VALUES ('20230115001', 'Add email verification to users table');

ALTER TABLE users 
ADD COLUMN email_verified BOOLEAN DEFAULT FALSE,
ADD COLUMN verification_token VARCHAR(100);
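A migration runner built on that table is small: read the already-applied versions, apply anything newer, and record each one, so re-running the tool is safe. A sketch under the same schema (with `sqlite3` standing in for a MySQL connection, and the migration list as an illustrative example):

```python
import sqlite3

# Minimal migration runner: apply each pending migration once and record
# it in schema_migrations. sqlite3 stands in for a MySQL connection.
MIGRATIONS = [
    ("20230115001", "Add email verification to users table",
     "ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT 0"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("""
    CREATE TABLE IF NOT EXISTS schema_migrations (
        version TEXT PRIMARY KEY,
        description TEXT,
        applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")

def run_migrations(conn):
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_migrations")}
    for version, description, sql in MIGRATIONS:
        if version in applied:
            continue  # already applied to this database
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_migrations (version, description) VALUES (?, ?)",
            (version, description),
        )
    conn.commit()

run_migrations(conn)
run_migrations(conn)  # idempotent: nothing is applied twice
count = conn.execute("SELECT COUNT(*) FROM schema_migrations").fetchone()[0]
print(count)  # 1
```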

-- 12. Data validation and sanitization

-- Use constraints and triggers for data validation
-- (CHECK constraints are enforced from MySQL 8.0.16; earlier versions parse but ignore them)
ALTER TABLE users
ADD CONSTRAINT chk_valid_email 
CHECK (email REGEXP '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$');

-- Trigger for additional validation
DELIMITER //
CREATE TRIGGER validate_user_data
    BEFORE INSERT ON users
    FOR EACH ROW
BEGIN
    -- Ensure username is alphanumeric
    IF NEW.username REGEXP '[^a-zA-Z0-9]' THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'Username must be alphanumeric';
    END IF;
    
    -- Set default values
    IF NEW.created_at IS NULL THEN
        SET NEW.created_at = NOW();
    END IF;
END //
DELIMITER ;
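It's common to mirror these checks in the application layer too, so obviously bad input is rejected before a database round trip, with the constraint and trigger remaining as the last line of defense. A sketch using the same rules (the regexes here restate the ones above and are simplified, not a complete email validator):

```python
import re

# Application-side mirror of the database checks: alphanumeric username,
# simplified email pattern. The CHECK constraint and trigger still enforce
# these rules server-side.
USERNAME_RE = re.compile(r"^[a-zA-Z0-9]+$")
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def validate_user(username: str, email: str) -> list:
    """Return a list of validation errors (empty if the input is valid)."""
    errors = []
    if not USERNAME_RE.match(username):
        errors.append("Username must be alphanumeric")
    if not EMAIL_RE.match(email):
        errors.append("Invalid email address")
    return errors

ok = validate_user("alice01", "alice@example.com")
bad = validate_user("bad user!", "not-an-email")
print(ok)   # []
print(bad)  # both error messages
```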

-- 13. Performance monitoring integration

-- Create performance metrics table
CREATE TABLE performance_metrics (
    metric_id INT AUTO_INCREMENT PRIMARY KEY,
    metric_name VARCHAR(100),
    metric_value DECIMAL(15,4),
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_metric_name (metric_name),
    INDEX idx_recorded_at (recorded_at)
);

-- Log average query time in seconds (mysql.slow_log stores query_time as TIME)
INSERT INTO performance_metrics (metric_name, metric_value)
SELECT 
    'query_execution_time',
    AVG(TIME_TO_SEC(query_time))
FROM mysql.slow_log
WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR);

-- 14. Backup integration

-- Create backup status table
CREATE TABLE backup_status (
    backup_id INT AUTO_INCREMENT PRIMARY KEY,
    backup_type ENUM('FULL', 'INCREMENTAL', 'LOG'),
    status ENUM('STARTED', 'COMPLETED', 'FAILED'),
    file_path VARCHAR(500),
    file_size BIGINT,
    started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    completed_at TIMESTAMP NULL,
    error_message TEXT
);

-- Update backup status
INSERT INTO backup_status (backup_type, status, started_at)
VALUES ('FULL', 'STARTED', NOW());

-- After backup completes
UPDATE backup_status 
SET status = 'COMPLETED',
    completed_at = NOW(),
    file_size = 1024000  -- Example size
WHERE backup_id = LAST_INSERT_ID();

-- 15. Security best practices

-- Use application-specific users with limited privileges
CREATE USER 'web_app'@'%' IDENTIFIED BY 'secure_password';
GRANT SELECT, INSERT, UPDATE ON company.* TO 'web_app'@'%';
GRANT EXECUTE ON PROCEDURE company.TransferFunds TO 'web_app'@'%';

-- Enable SSL for secure connections
ALTER USER 'web_app'@'%' REQUIRE SSL;

-- Regular security audits
CREATE TABLE security_audit (
    audit_id INT AUTO_INCREMENT PRIMARY KEY,
    action_type VARCHAR(50),
    user_name VARCHAR(50),
    table_name VARCHAR(50),
    query_text TEXT,
    ip_address VARCHAR(45),
    audit_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

💻 Hands-On Practice Exercises

Beginner Level

  1. Create a database for a library system with books, authors, and members
  2. Write queries to find the most borrowed books and active members
  3. Design a student grade management system with courses and grades
  4. Practice basic CRUD operations with proper constraints
  5. Create views for commonly accessed data combinations

Intermediate Level

  1. Design and normalize an e-commerce database with products, orders, and customers
  2. Write complex queries with multiple joins and aggregations
  3. Create stored procedures for business logic like order processing
  4. Implement triggers for audit trails and data validation
  5. Optimize queries using proper indexing strategies

Advanced Level

  1. Design a multi-tenant SaaS application database
  2. Implement database partitioning for large tables
  3. Set up master-slave replication for high availability
  4. Create comprehensive backup and recovery procedures
  5. Design and implement a data warehouse with star schema

🛠️ Essential MySQL Tools

MySQL Workbench

Visual tool for database design, development, and administration

  • Visual SQL Development
  • Database Design & Modeling
  • Server Administration
  • Data Migration

GUI for writing queries, designing ER diagrams, and managing database servers

phpMyAdmin

Web-based MySQL administration tool

  • Web Interface
  • User Management
  • Import/Export
  • Query Browser

Browser-based database management, ideal for web hosting environments

MySQL Shell

Advanced client and code editor for MySQL

  • JavaScript & Python Support
  • X DevAPI
  • AdminAPI
  • SQL & NoSQL

Command-line interface with modern features for developers and DBAs

Percona Toolkit

Advanced command-line tools for MySQL

  • Query Analysis
  • Schema Management
  • Data Synchronization
  • Performance Tools

Expert tools for database optimization, monitoring, and maintenance

🎯 MySQL Career Paths

Database Developer

Stored Procedures, Query Optimization, Database Design

Database Administrator

Performance Tuning, Backup/Recovery, Security, High Availability

Data Architect

Database Design, Data Modeling, Enterprise Architecture

Continue Your MySQL Journey!

You've now explored the comprehensive landscape of MySQL database management, from basic SQL queries to advanced administration and optimization techniques. MySQL continues to be one of the world's most popular and reliable database systems.

Continue exploring advanced topics like database clustering, cloud database services, big data integration, and stay updated with the latest MySQL features and best practices in the ever-evolving database landscape.

Happy Querying! 🐬

This comprehensive guide covers MySQL from absolute fundamentals to advanced database administration concepts.

Designed for clarity and depth - perfect for beginners and professional database administrators alike.

© 2025 MySQL Mastery Guide | Keep Learning, Keep Querying!