Compare commits

4 commits: ecae2e0484 ... b1595ee9af

| SHA1 |
|---|
| b1595ee9af |
| cecd55cd3d |
| 259340df46 |
| 59cd724959 |
.gitignore (vendored, 2 changes)

@@ -3,4 +3,4 @@ venv/
 export/
 *_host_ids.txt
 *.log
-partitioning/tests/
+backup/
partitioning/CODE_DOCUMENTATION.md (new file, 60 lines)

@@ -0,0 +1,60 @@
# Code Documentation: ZabbixPartitioner

## Class: ZabbixPartitioner

### Core Methods

#### `__init__(self, config: Dict[str, Any], dry_run: bool = False)`

Initializes the partitioner with configuration and runtime mode.

- **config**: Dictionary containing database connection and partitioning rules.
- **dry_run**: If True, SQL queries are logged but not executed.

#### `connect_db(self)`

Context manager for database connections.

- Handles the connection lifecycle (open/close).
- Sets strict session variables:
  - `wait_timeout = 86400` (24h) to prevent timeouts during long operations.
  - `sql_log_bin = 0` (if configured) to prevent replication of partitioning commands.
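The lifecycle described above can be sketched with `contextlib.contextmanager`. This is an illustrative mock, not the script's actual code: `FakeConnection` stands in for the real driver connection (the real method would call something like `pymysql.connect`), and the config keys are assumptions based on this diff.

```python
from contextlib import contextmanager

class FakeConnection:
    """Stand-in for a real MySQL connection (illustration only)."""
    def __init__(self):
        self.executed = []
        self.closed = False
    def execute(self, sql):
        self.executed.append(sql)
    def close(self):
        self.closed = True

@contextmanager
def connect_db(config):
    """Open a connection, apply strict session settings, always close it."""
    conn = FakeConnection()  # a real script would call pymysql.connect(...)
    try:
        conn.execute("SET SESSION wait_timeout = 86400")  # survive long ALTERs
        if not config.get('replicate_sql', True):
            conn.execute("SET SESSION sql_log_bin = 0")   # keep DDL out of the binlog
        yield conn
    finally:
        conn.close()  # runs even if the body raises

with connect_db({'replicate_sql': False}) as conn:
    pass
print(conn.closed)  # True
```

The `finally` block is what makes the manager robust: the connection is closed whether the partitioning work succeeds or raises.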
#### `run(self, mode: str)`

Main entry point for execution.

- **mode**:
  - `'init'`: Initial setup. Calls `initialize_partitioning`.
  - `'maintenance'` (default): Routine operation. Calls `create_future_partitions` and `drop_old_partitions`.

### Logic Methods

#### `initialize_partitioning(table: str, period: str, premake: int, retention_str: str)`

Converts a standard table to a partitioned table.

- **Strategies** (via `initial_partitioning_start` config):
  - `retention`: Starts from (Now - Retention). Creates `p_archive` for older data. FAST.
  - `db_min`: Queries `SELECT MIN(clock)`. PRECISE but SLOW.

#### `create_future_partitions(table: str, period: str, premake: int)`

Ensures sufficient future partitions exist.

- Calculates required partitions based on the current time + `premake` count.
- Checks `information_schema` for existing partitions.
- Adds missing partitions using `ALTER TABLE ... ADD PARTITION`.

#### `drop_old_partitions(table: str, period: str, retention_str: str)`

Removes partitions older than the retention period.

- Parses partition names (e.g., `p2023_01_01`) to extract their date.
- Compares against the calculated retention cutoff date.
- Drops qualifying partitions using `ALTER TABLE ... DROP PARTITION`.
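The name parsing step can be illustrated with `datetime.strptime`. This is a hypothetical helper matching the naming scheme documented here (`p2023_01_01` daily, `p2023_01` monthly); the actual implementation may differ.

```python
from datetime import datetime

def parse_partition_date(name):
    """Extract the date encoded in a partition name; None for e.g. `p_archive`."""
    for fmt in ("p%Y_%m_%d", "p%Y_%m"):  # try daily first, then monthly
        try:
            return datetime.strptime(name, fmt)
        except ValueError:
            continue
    return None

print(parse_partition_date("p2023_01_01"))  # 2023-01-01 00:00:00
```

Names that encode no date (such as the `p_archive` catch-all) return `None` and would therefore never match the retention cutoff, which is the safe behavior for a drop routine.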
### Helper Methods

#### `get_table_min_clock(table: str) -> Optional[datetime]`

- Queries the table for the oldest timestamp. Used in the `db_min` initialization strategy.

#### `has_incompatible_primary_key(table: str) -> bool`

- **Safety Critical**: Verifies that the table's Primary Key includes the `clock` column.
- Returns `True` if incompatible (prevents partitioning to avoid MySQL errors).
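The primary-key check can be sketched against `information_schema.KEY_COLUMN_USAGE`. This is a hypothetical version taking a DB-API cursor plus explicit schema/table names; the real method presumably resolves those from its own connection and config.

```python
def has_incompatible_primary_key(cursor, schema, table):
    """True if the table's PRIMARY KEY does not include `clock`.

    MySQL requires the partitioning column to be part of every unique key,
    so attempting to partition such a table fails with an error.
    """
    cursor.execute(
        "SELECT COLUMN_NAME FROM information_schema.KEY_COLUMN_USAGE"
        " WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
        " AND CONSTRAINT_NAME = 'PRIMARY'",
        (schema, table),
    )
    pk_columns = {row[0] for row in cursor.fetchall()}
    return 'clock' not in pk_columns
```

Running the check before the `ALTER TABLE` is what makes it "safety critical": it turns a mid-migration MySQL error into a clean refusal.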
#### `get_partition_name(dt: datetime, period: str) -> str`

- Generates standard partition names:
  - Daily: `pYYYY_MM_DD`
  - Monthly: `pYYYY_MM`

#### `get_partition_description(dt: datetime, period: str) -> str`

- Generates the `VALUES LESS THAN` expression for the partition (start of the NEXT period).
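Together, the two helpers can be sketched as follows. The behavior is inferred from the naming scheme above and from the diff's comment that the description is a `"YYYY-MM-DD HH:MM:SS"` string (presumably wrapped in `UNIX_TIMESTAMP()` by `PARTITION_TEMPLATE`, since `clock` is an integer column); callers are assumed to pass an already-truncated `dt`.

```python
from datetime import datetime, timedelta

def get_partition_name(dt, period):
    """pYYYY_MM_DD for daily partitions, pYYYY_MM for monthly ones."""
    return dt.strftime("p%Y_%m_%d" if period == 'daily' else "p%Y_%m")

def get_partition_description(dt, period):
    """Start of the NEXT period, formatted for the VALUES LESS THAN clause."""
    if period == 'daily':
        nxt = dt + timedelta(days=1)
    else:
        # First day of the following month (bool adds as 0/1 to the year).
        nxt = datetime(dt.year + (dt.month == 12), dt.month % 12 + 1, 1)
    return nxt.strftime("%Y-%m-%d %H:%M:%S")

print(get_partition_name(datetime(2023, 1, 5), 'daily'))  # p2023_01_05
```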
partitioning/REFACTORING_NOTES.md (new file, 39 lines)

@@ -0,0 +1,39 @@
# Refactoring Notes: Zabbix Partitioning Script

## Overview

The `zabbix_partitioning.py` script has been significantly refactored to improve maintainability, reliability, and compatibility with modern Zabbix versions (7.x).

## Key Changes

### 1. Architecture: Class-Based Structure

- **Old**: Procedural script with global variables and scattered logic.
- **New**: Encapsulated in a `ZabbixPartitioner` class.
- **Purpose**: Improves modularity, testability, and state management. Allows the script to be easily imported or extended.

### 2. Database Connection Management

- **Change**: Implemented `contextlib.contextmanager` for database connections.
- **Purpose**: Ensures database connections are robustly opened and closed, even if errors occur. Handles `wait_timeout` and binary logging settings automatically for every session.

### 3. Logging

- **Change**: Replaced custom `print` statements with Python's standard `logging` module.
- **Purpose**:
  - Allows consistent log formatting.
  - Supports configurable output destinations (console vs. syslog) via the config file.
  - Provides granular log levels (INFO for standard ops, DEBUG for SQL queries).
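A minimal sketch of such a console/syslog switch is below. The function name, the `/dev/log` socket, and the format string are assumptions for illustration; the actual script's handler setup may differ.

```python
import logging
import logging.handlers

def setup_logging(dest='console', debug=False):
    """Route logs to stderr or syslog; DEBUG level exposes SQL queries."""
    logger = logging.getLogger('zabbix_partitioning')
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    if dest == 'syslog':
        # Assumes a local syslog socket at /dev/log (typical on Linux).
        handler = logging.handlers.SysLogHandler(address='/dev/log')
    else:
        handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    return logger
```

Because both destinations go through the same logger, call sites stay identical regardless of the configured output.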
### 4. Configuration Handling

- **Change**: Improved validation and parsing of the YAML configuration.
- **Purpose**:
  - Removed unused parameters (e.g., `timezone`, as the script relies on system local time).
  - Added support for custom database ports (critical for non-standard deployments or containerized tests).
  - Explicitly handles the `replicate_sql` flag to control binary logging (it is now integrated into the partitioning logic).

### 5. Type Safety

- **Change**: Added comprehensive Python type hinting (e.g., `List`, `Dict`, `Optional`).
- **Purpose**: Makes the code self-documenting and allows IDEs/linters to catch potential errors before execution.

### 6. Zabbix 7.x Compatibility

- **Change**: Added logic to verify Zabbix database version and schema requirements.
- **Purpose**:
  - Checks the `dbversion` table.
  - **Critical**: Validates that target tables have the `clock` column as part of their Primary Key before attempting partitioning, preventing potential data corruption or MySQL errors.
@@ -40,6 +40,14 @@ logging: syslog
 # premake: Number of partitions to create in advance
 premake: 10
 
+# initial_partitioning_start: Strategy for the first partition during initialization (--init).
+# Options:
+#   db_min:    (Default) Queries SELECT MIN(clock) to ensure ALL data is covered. Can be slow on huge tables.
+#   retention: Starts partitioning from (Now - Retention Period).
+#              Creates a 'p_archive' partition for all data older than retention.
+#              Much faster as it skips the MIN(clock) query. (Recommended for large DBs)
+initial_partitioning_start: db_min
+
 # replicate_sql: False - Disable binary logging. Partitioning changes are NOT replicated to slaves (use for independent maintenance).
 # replicate_sql: True  - Enable binary logging. Partitioning changes ARE replicated to slaves (use for a consistent cluster schema).
 replicate_sql: False
@@ -371,7 +371,7 @@ class ZabbixPartitioner:
         for name in to_drop:
             self.execute_query(f"ALTER TABLE `{table}` DROP PARTITION {name}")
 
-    def initialize_partitioning(self, table: str, period: str, premake: int):
+    def initialize_partitioning(self, table: str, period: str, premake: int, retention_str: str):
         """Initial partitioning for a table (convert regular table to partitioned)."""
         self.logger.info(f"Initializing partitioning for {table}")
@@ -384,35 +384,63 @@ class ZabbixPartitioner:
             self.logger.info(f"Table {table} is already partitioned.")
             return
 
-        # Check for data
-        min_clock = self.get_table_min_clock(table)
+        init_strategy = self.config.get('initial_partitioning_start', 'db_min')
+        start_dt = None
+        p_archive_ts = None
 
-        if not min_clock:
-            # Empty table. Start from NOW
-            start_dt = self.truncate_date(datetime.now(), period)
+        if init_strategy == 'retention':
+            self.logger.info(f"Strategy 'retention': Calculating start date from retention ({retention_str})")
+            retention_date = self.get_lookback_date(retention_str)
+            # Start granular partitions from the retention date
+            start_dt = self.truncate_date(retention_date, period)
+            # Create a catch-all for anything older
+            p_archive_ts = int(start_dt.timestamp())
         else:
-            # Table has data.
-            # For a safe migration, we usually create a catch-all for old data (p_old) or just start partitions covering existing data.
-            # This script's strategy: Create partitions starting from min_clock.
-            start_dt = self.truncate_date(min_clock, period)
+            # Default 'db_min' strategy
+            self.logger.info("Strategy 'db_min': Querying table for minimum clock (may be slow)")
+            min_clock = self.get_table_min_clock(table)
+
+            if not min_clock:
+                # Empty table. Start from NOW
+                start_dt = self.truncate_date(datetime.now(), period)
+            else:
+                # Table has data.
+                start_dt = self.truncate_date(min_clock, period)
 
         # Build list of partitions from start_dt up to NOW + premake
         target_dt = self.get_next_date(self.truncate_date(datetime.now(), period), period, premake)
 
         curr = start_dt
-        partitions_def = {}
-
-        # If we have an archive partition, add it first
-        if p_archive_ts:
-            partitions_def['p_archive'] = str(p_archive_ts)
-
-        while curr < target_dt:
-            name = self.get_partition_name(curr, period)
-            desc = self.get_partition_description(curr, period)
-            partitions_def[name] = desc
-            curr = self.get_next_date(curr, period, 1)
-
-        # Re-doing the loop to be cleaner on types
         parts_sql = []
-        for name, timestamp_expr in sorted(partitions_def.items()):
-            parts_sql.append(PARTITION_TEMPLATE % (name, timestamp_expr))
+
+        # 1. Archive Partition
+        if p_archive_ts:
+            parts_sql.append(f"PARTITION p_archive VALUES LESS THAN ({p_archive_ts}) ENGINE = InnoDB")
+
+        # 2. Granular Partitions
+        # We need to iterate again from start_dt
+        curr = start_dt
+        while curr < target_dt:
+            name = self.get_partition_name(curr, period)
+            desc_date_str = self.get_partition_description(curr, period)  # Returns "YYYY-MM-DD HH:MM:SS"
+            parts_sql.append(PARTITION_TEMPLATE % (name, desc_date_str))
+            curr = self.get_next_date(curr, period, 1)
 
         query = f"ALTER TABLE `{table}` PARTITION BY RANGE (`clock`) (\n" + ",\n".join(parts_sql) + "\n)"
-        self.logger.info(f"Applying initial partitioning to {table} ({len(partitions_def)} partitions)")
+        self.logger.info(f"Applying initial partitioning to {table} ({len(parts_sql)} partitions)")
         self.execute_query(query)
 
     def run(self, mode: str):
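The `retention_str` values come from the config (`'7d'`, `'365d'`) and feed `get_lookback_date` above. A rough sketch of such a parser, assuming day-based retention strings (the real helper may accept more units):

```python
import re
from datetime import datetime, timedelta

def get_lookback_date(retention_str, now=None):
    """Convert a retention string like '7d' into the cutoff datetime (now - N days)."""
    m = re.fullmatch(r'(\d+)d', retention_str.strip())
    if not m:
        raise ValueError(f"Unsupported retention format: {retention_str!r}")
    return (now or datetime.now()) - timedelta(days=int(m.group(1)))

print(get_lookback_date('7d', now=datetime(2024, 1, 10)))  # 2024-01-03 00:00:00
```

Under the `retention` strategy this cutoff becomes both the start of the granular partitions and the upper bound of the `p_archive` catch-all.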
@@ -437,7 +465,7 @@ class ZabbixPartitioner:
             retention = item[table]
 
             if mode == 'init':
-                self.initialize_partitioning(table, period, premake)
+                self.initialize_partitioning(table, period, premake, retention)
             else:
                 # Maintenance mode (Add new, remove old)
                 self.create_future_partitions(table, period, premake)
zabbix-tests/partitioning/README.md (new file, 36 lines)

@@ -0,0 +1,36 @@
# Zabbix Partitioning Tests

This directory contains a Docker-based test environment for the Zabbix partitioning script.

## Prerequisites

- Docker & Docker Compose
- Python 3

## Setup & Run

1. Start the database container:

   ```bash
   docker compose up -d
   ```

   This will start a MySQL 8.0 container and import the Zabbix schema.

2. Create a valid config (done automatically):

   The `test_config.yaml` references the running container.

3. Run the partitioning script:

   ```bash
   # Create a virtual environment if needed
   python3 -m venv venv
   ./venv/bin/pip install pymysql pyyaml

   # Dry run
   ./venv/bin/python3 ../../partitioning/zabbix_partitioning.py -c test_config.yaml --dry-run --init

   # Live run
   ./venv/bin/python3 ../../partitioning/zabbix_partitioning.py -c test_config.yaml --init
   ```

## Cleanup

```bash
docker compose down
rm -rf venv
```
zabbix-tests/partitioning/docker-compose.yml (new file, 14 lines)

@@ -0,0 +1,14 @@
services:
  zabbix-db:
    image: mysql:8.0
    container_name: zabbix-partition-test
    environment:
      MYSQL_ROOT_PASSWORD: root_password
      MYSQL_DATABASE: zabbix
      MYSQL_USER: zbx_part
      MYSQL_PASSWORD: zbx_password
    volumes:
      - ../../partitioning/schemas/70-schema-mysql.txt:/docker-entrypoint-initdb.d/schema.sql
    ports:
      - "33060:3306"
    command: --default-authentication-plugin=mysql_native_password
zabbix-tests/partitioning/find_tables.py (new file, 31 lines)

@@ -0,0 +1,31 @@
import re


def get_partitionable_tables(schema_path):
    with open(schema_path, 'r', encoding='utf-8', errors='ignore') as f:
        content = f.read()

    # Split into CREATE TABLE statements
    tables = content.split('CREATE TABLE')
    valid_tables = []

    for table_def in tables:
        # Extract table name
        name_match = re.search(r'`(\w+)`', table_def)
        if not name_match:
            continue
        table_name = name_match.group(1)

        # Check for PRIMARY KEY definition
        pk_match = re.search(r'PRIMARY KEY \((.*?)\)', table_def, re.DOTALL)
        if pk_match:
            pk_cols = pk_match.group(1)
            if 'clock' in pk_cols:
                valid_tables.append(table_name)

    return valid_tables


if __name__ == '__main__':
    tables = get_partitionable_tables('/opt/git/Zabbix/partitioning/70-schema-mysql.txt')
    print("Partitionable tables (PK contains 'clock'):")
    for t in tables:
        print(f"  - {t}")
zabbix-tests/partitioning/test_config.yaml (new file, 25 lines)

@@ -0,0 +1,25 @@
database:
  type: mysql
  host: 127.0.0.1
  socket:
  user: root
  passwd: root_password
  db: zabbix
  # Port mapping in docker-compose is 33060
  port: 33060

partitions:
  daily:
    - history: 7d
    - history_uint: 7d
    - history_str: 7d
    - history_log: 7d
    - history_text: 7d
    - history_bin: 7d
    - trends: 365d
    - trends_uint: 365d

logging: console
premake: 2
replicate_sql: False
initial_partitioning_start: retention
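The `partitions` section is a mapping from period to a list of single-key `table: retention` entries, which is why the `run` loop in the diff reads `retention = item[table]`. A hypothetical sketch of walking that structure once the YAML is loaded (literal dict used here so the example is self-contained):

```python
# Mirrors the shape produced by yaml.safe_load on the config above.
config = {
    'partitions': {
        'daily': [
            {'history': '7d'},
            {'trends': '365d'},
        ],
    },
}

schedule = []
for period, items in config['partitions'].items():
    for item in items:
        # Each list entry is a one-key dict: {table_name: retention}
        for table, retention in item.items():
            schedule.append((table, period, retention))

print(schedule)  # [('history', 'daily', '7d'), ('trends', 'daily', '365d')]
```

The list-of-one-key-dicts shape preserves the order tables appear in the config, at the cost of this extra unwrapping step.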
zabbix-tests/partitioning/wait_for_db.py (new file, 25 lines)

@@ -0,0 +1,25 @@
import time
import pymysql
import sys

config = {
    'host': '127.0.0.1',
    'port': 33060,
    'user': 'root',
    'password': 'root_password',
    'database': 'zabbix'
}

max_retries = 90
for i in range(max_retries):
    try:
        conn = pymysql.connect(**config)
        print("Database is ready!")
        conn.close()
        sys.exit(0)
    except Exception as e:
        print(f"Waiting for DB... ({e})")
        time.sleep(2)

print("Timeout waiting for DB")
sys.exit(1)