Compare commits
2 Commits
71e85c2e6e ... 064b0ab6ca
@@ -174,7 +174,101 @@ Alternatively, use systemd timers for more robust scheduling and logging.

---

---

## 8. Troubleshooting

- **Connection Refused**: Check `host` and `port` in the config, and make sure MySQL is running.
- **Access Denied (1227)**: The DB user needs the `SUPER` privilege to disable binary logging (`replicate_sql: False`). Either grant the privilege or set `replicate_sql: True` (if the extra replication load is acceptable). A config sketch follows this list.
- **Primary Key Error**: "Primary Key does not include 'clock'". The table cannot be partitioned by range on `clock` without schema changes; remove the table from the config.

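As a minimal sketch of the config keys referenced above (using the same names that appear in the generated configuration later in this document):

```yaml
# 'Access Denied (1227)': keep binary logging enabled for partitioning statements
# so the DB user does not need the SUPER privilege.
replicate_sql: true

# 'Primary Key Error': only list tables whose primary key includes 'clock';
# leave the offending table (e.g. 'auditlog' on an unmodified schema) out of this section.
partitions:
  daily:
  - history: 14d
  - history_uint: 14d
```
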
## 9. Docker Usage

You can run the partitioning script as a stateless Docker container. This is ideal for Kubernetes CronJobs or environments where you don't want to manage Python dependencies on the host.

### 9.1 Build the Image

The image is not yet published to a public registry, so you must build it locally:

```bash
cd /opt/git/Zabbix/partitioning
docker build -t zabbix-partitioning -f docker/Dockerfile .
```

### 9.2 Operations

The container uses `entrypoint.py` to auto-generate the configuration file from environment variables at runtime; a sketch of the generated file is shown below.

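With only the defaults from the table in section 9.3, the generated `/etc/zabbix/zabbix_partitioning.conf` would look roughly like the following (key order may differ, since the file is produced by `yaml.dump`, and empty interval lists are dropped):

```yaml
database:
  type: mysql
  host: localhost
  port: 3306
  socket: ''
  user: zabbix
  passwd: zabbix
  db: zabbix
logging: console
premake: 10
replicate_sql: false
initial_partitioning_start: db_min
partitions:
  daily:
  - history: 14d
  - history_uint: 14d
  - history_str: 14d
  - history_log: 14d
  - history_text: 14d
  - history_bin: 14d
  monthly:
  - trends: 365d
  - trends_uint: 365d
```
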
#### Scenario A: Dry Run (Check Configuration)

Verify that your connection and retention settings are correct without making changes.

```bash
docker run --rm \
  -e DB_HOST=10.0.0.5 -e DB_USER=zabbix -e DB_PASSWORD=secret \
  -e RETENTION_HISTORY=7d \
  -e RETENTION_TRENDS=365d \
  -e RUN_MODE=dry-run \
  zabbix-partitioning
```

#### Scenario B: Initialization (First Run)

Convert your existing tables to partitioned tables.

> [!WARNING]
> Ensure a backup exists and that the Zabbix Housekeeper is disabled!

```bash
docker run --rm \
  -e DB_HOST=10.0.0.5 -e DB_USER=zabbix -e DB_PASSWORD=secret \
  -e RETENTION_HISTORY=14d \
  -e RETENTION_TRENDS=365d \
  -e RUN_MODE=init \
  zabbix-partitioning
```

#### Scenario C: Daily Maintenance (Cron/Scheduler)

Run this daily (for example via cron or a Kubernetes CronJob) to create future partitions and drop expired ones. A sketch of a CronJob manifest follows the example.

```bash
docker run --rm \
  -e DB_HOST=10.0.0.5 -e DB_USER=zabbix -e DB_PASSWORD=secret \
  -e RETENTION_HISTORY=14d \
  -e RETENTION_TRENDS=365d \
  zabbix-partitioning
```

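The following is a minimal sketch of such a Kubernetes CronJob, assuming the locally built image has been pushed to a registry the cluster can pull from; the image path and Secret name are illustrative, not part of this repository:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: zabbix-partitioning
spec:
  schedule: "30 2 * * *"        # once per day at 02:30
  concurrencyPolicy: Forbid     # never run two maintenance jobs in parallel
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: zabbix-partitioning
              image: registry.example.com/zabbix-partitioning:latest   # hypothetical registry path
              env:
                - name: DB_HOST
                  value: "10.0.0.5"
                - name: DB_USER
                  value: "zabbix"
                - name: DB_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: zabbix-db-credentials    # hypothetical Secret
                      key: password
                - name: RETENTION_HISTORY
                  value: "14d"
                - name: RETENTION_TRENDS
                  value: "365d"
```
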
#### Scenario D: Custom Overrides

You can override the retention period for specific tables or change their partitioning interval.

*Example: force `history_log` to be partitioned **weekly** with 30-day retention.*

```bash
docker run --rm \
  -e DB_HOST=10.0.0.5 \
  -e RETENTION_HISTORY=7d \
  -e PARTITION_WEEKLY_history_log=30d \
  zabbix-partitioning
```

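With the overrides above, `entrypoint.py` skips `history_log` when filling in the daily defaults and adds it to the weekly list instead, so the generated `partitions` section would look roughly like this:

```yaml
partitions:
  daily:
  - history: 7d
  - history_uint: 7d
  - history_str: 7d
  - history_text: 7d
  - history_bin: 7d
  weekly:
  - history_log: 30d
  monthly:
  - trends: 365d
  - trends_uint: 365d
```
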
#### Scenario E: SSL Connection

Mount your certificates into the container and point the SSL variables at them.

```bash
docker run --rm \
  -e DB_HOST=zabbix-db \
  -e DB_SSL_CA=/certs/ca.pem \
  -e DB_SSL_CERT=/certs/client-cert.pem \
  -e DB_SSL_KEY=/certs/client-key.pem \
  -v /path/to/local/certs:/certs \
  zabbix-partitioning
```

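For reference, these variables map onto an `ssl` block nested under `database` in the generated config, roughly:

```yaml
database:
  host: zabbix-db
  ssl:
    ca: /certs/ca.pem
    cert: /certs/client-cert.pem
    key: /certs/client-key.pem
```
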
### 9.3 Supported Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `DB_HOST` | localhost | Database hostname |
| `DB_PORT` | 3306 | Database port |
| `DB_SOCKET` | - | Database socket path (used instead of host/port) |
| `DB_USER` | zabbix | Database user |
| `DB_PASSWORD` | zabbix | Database password |
| `DB_NAME` | zabbix | Database name |
| `DB_SSL_CA` | - | Path to CA certificate |
| `DB_SSL_CERT` | - | Path to client certificate |
| `DB_SSL_KEY` | - | Path to client key |
| `RETENTION_HISTORY` | 14d | Retention for `history*` tables |
| `RETENTION_TRENDS` | 365d | Retention for `trends*` tables |
| `RETENTION_AUDIT` | 365d | Retention for `auditlog` (if enabled) |
| `ENABLE_AUDITLOG_PARTITIONING` | false | Set to `true` to partition `auditlog` |
| `REPLICATE_SQL` | false | Set to `true` to keep partitioning SQL in the binary log (see Troubleshooting) |
| `PREMAKE` | 10 | Maps to the `premake` config option (partitions created in advance) |
| `INITIAL_PARTITIONING_START` | db_min | Maps to the `initial_partitioning_start` config option |
| `RUN_MODE` | maintenance | `init` (initialize), `maintenance` (daily run), `dry-run`, `discovery`, or `check` |
| `DRY_RUN_INIT` | false | With `RUN_MODE=dry-run`, set to `true` to also simulate initialization (`--init`) |
| `CHECK_TARGET` | - | Table name to check; required when `RUN_MODE=check` |
| `PARTITION_DAILY_[TABLE]` | - | Custom daily retention (e.g., `PARTITION_DAILY_mytable=30d`) |
| `PARTITION_WEEKLY_[TABLE]` | - | Custom weekly retention |
| `PARTITION_MONTHLY_[TABLE]` | - | Custom monthly retention |

partitioning/docker/Dockerfile (new file, 16 lines)
@@ -0,0 +1,16 @@
FROM python:3.12-slim

# Install dependencies
RUN pip install --no-cache-dir pymysql pyyaml

# Copy main script and entrypoint
# Note: Build context should be the parent directory 'partitioning/'
COPY script/zabbix_partitioning.py /usr/local/bin/
RUN mkdir -p /etc/zabbix
COPY docker/entrypoint.py /usr/local/bin/entrypoint.py

# Set permissions
RUN chmod +x /usr/local/bin/zabbix_partitioning.py /usr/local/bin/entrypoint.py

# Entrypoint
ENTRYPOINT ["python3", "/usr/local/bin/entrypoint.py"]
partitioning/docker/entrypoint.py (new file, 114 lines)
@@ -0,0 +1,114 @@
import os
import sys
import yaml
import subprocess

def generate_config():
    # Base Configuration
    config = {
        'database': {
            'type': 'mysql',
            'host': os.getenv('DB_HOST', 'localhost'),
            'user': os.getenv('DB_USER', 'zabbix'),
            'passwd': os.getenv('DB_PASSWORD', 'zabbix'),
            'db': os.getenv('DB_NAME', 'zabbix'),
            'port': int(os.getenv('DB_PORT', 3306)),
            'socket': os.getenv('DB_SOCKET', '')
        },
        'logging': 'console',
        'premake': int(os.getenv('PREMAKE', 10)),
        'replicate_sql': os.getenv('REPLICATE_SQL', 'False').lower() == 'true',
        'initial_partitioning_start': os.getenv('INITIAL_PARTITIONING_START', 'db_min'),
        'partitions': {
            'daily': [],
            'weekly': [],
            'monthly': []
        }
    }

    # SSL Config (cert/key are only added when a CA is given)
    if os.getenv('DB_SSL_CA'):
        config['database']['ssl'] = {'ca': os.getenv('DB_SSL_CA')}
        if os.getenv('DB_SSL_CERT'):
            config['database']['ssl']['cert'] = os.getenv('DB_SSL_CERT')
        if os.getenv('DB_SSL_KEY'):
            config['database']['ssl']['key'] = os.getenv('DB_SSL_KEY')

    # Retention Mapping
    retention_history = os.getenv('RETENTION_HISTORY', '14d')
    retention_trends = os.getenv('RETENTION_TRENDS', '365d')
    retention_audit = os.getenv('RETENTION_AUDIT', '365d')

    # Standard Zabbix Tables
    history_tables = ['history', 'history_uint', 'history_str', 'history_log', 'history_text', 'history_bin']
    trends_tables = ['trends', 'trends_uint']

    # Auditlog: disabled by default because the Zabbix 7.0+ 'auditlog' table lacks 'clock' in its primary key.
    # Only enable it if the schema has been altered manually and it is explicitly requested.

    # Collect override table names first so the defaults below do not duplicate them
    overrides = set()
    for key in os.environ:
        if key.startswith(('PARTITION_DAILY_', 'PARTITION_WEEKLY_', 'PARTITION_MONTHLY_')):
            table = key.split('_', 2)[-1].lower()
            overrides.add(table)

    for table in history_tables:
        if table not in overrides:
            config['partitions']['daily'].append({table: retention_history})

    for table in trends_tables:
        if table not in overrides:
            config['partitions']['monthly'].append({table: retention_trends})

    if os.getenv('ENABLE_AUDITLOG_PARTITIONING', 'false').lower() == 'true':
        config['partitions']['weekly'].append({'auditlog': retention_audit})

    # Custom/Generic Overrides
    # Look for env vars like PARTITION_DAILY_mytable=7d
    for key, value in os.environ.items():
        if key.startswith('PARTITION_DAILY_'):
            table = key.replace('PARTITION_DAILY_', '').lower()
            config['partitions']['daily'].append({table: value})
        elif key.startswith('PARTITION_WEEKLY_'):
            table = key.replace('PARTITION_WEEKLY_', '').lower()
            config['partitions']['weekly'].append({table: value})
        elif key.startswith('PARTITION_MONTHLY_'):
            table = key.replace('PARTITION_MONTHLY_', '').lower()
            config['partitions']['monthly'].append({table: value})

    # Drop empty interval lists so the generated YAML stays minimal
    config['partitions'] = {k: v for k, v in config['partitions'].items() if v}

    print("Generated Configuration:")
    print(yaml.dump(config, default_flow_style=False))

    with open('/etc/zabbix/zabbix_partitioning.conf', 'w') as f:
        yaml.dump(config, f, default_flow_style=False)

def main():
    generate_config()

    cmd = [sys.executable, '/usr/local/bin/zabbix_partitioning.py', '-c', '/etc/zabbix/zabbix_partitioning.conf']

    run_mode = os.getenv('RUN_MODE', 'maintenance')
    if run_mode == 'init':
        cmd.append('--init')
    elif run_mode == 'dry-run':
        cmd.append('--dry-run')
        if os.getenv('DRY_RUN_INIT') == 'true':
            cmd.append('--init')
    elif run_mode == 'discovery':
        cmd.append('--discovery')
    elif run_mode == 'check':
        target = os.getenv('CHECK_TARGET')
        if not target:
            print("Error: CHECK_TARGET env var required for check mode")
            sys.exit(1)
        cmd.append('--check-days')
        cmd.append(target)

    print(f"Executing: {' '.join(cmd)}")
    result = subprocess.run(cmd)
    sys.exit(result.returncode)

if __name__ == "__main__":
    main()
@@ -13,6 +13,7 @@ import argparse
import pymysql
from pymysql.constants import CLIENT
import yaml
import json
import logging
import logging.handlers
from datetime import datetime, timedelta
@@ -443,12 +444,81 @@ class ZabbixPartitioner:
            self.logger.info(f"Applying initial partitioning to {table} ({len(parts_sql)} partitions)")
            self.execute_query(query)

    def run(self, mode: str):
    def discovery(self):
        """Output Zabbix Low-Level Discovery (LLD) JSON."""
        partitions_conf = self.config.get('partitions', {})
        discovery_data = []

        for period, tables in partitions_conf.items():
            if not tables:
                continue
            for item in tables:
                table = list(item.keys())[0]
                discovery_data.append({"{#TABLE}": table, "{#PERIOD}": period})

        print(json.dumps(discovery_data))

    def check_partitions_coverage(self, table: str, period: str) -> int:
        """
        Check how many days of future partitions exist for a table.
        Returns: number of days from now until the end of the last partition.
        """
        top_partition_ts = self.execute_query(
            """SELECT MAX(`partition_description`) FROM `information_schema`.`partitions`
               WHERE `table_schema` = %s AND `table_name` = %s AND `partition_name` IS NOT NULL""",
            (self.db_name, table), fetch='one'
        )

        if not top_partition_ts:
            return 0

        # partition_description is "VALUES LESS THAN (TS)",
        # so it represents the END of the partition (i.e. the start of the next one)
        end_ts = int(top_partition_ts)
        end_dt = datetime.fromtimestamp(end_ts)
        now = datetime.now()

        diff = end_dt - now
        return max(0, diff.days)

    def run(self, mode: str, target_table: str = None):
        """Main execution loop."""
        with self.connect_db():
            self.check_compatibility()

            partitions_conf = self.config.get('partitions', {})

            # --- Discovery Mode ---
            if mode == 'discovery':
                self.discovery()
                return

            # --- Check Mode ---
            if mode == 'check':
                if not target_table:
                    # Zabbix queries tables one by one, so a specific target is required here
                    raise ConfigurationError("Target table required for check mode")

                # Find the configured period for the table
                found_period = None
                for period, tables in partitions_conf.items():
                    for item in tables:
                        if list(item.keys())[0] == target_table:
                            found_period = period
                            break
                    if found_period:
                        break

                if not found_period:
                    # Table not present in the config
                    print("-1")  # Error code
                    return

                days_left = self.check_partitions_coverage(target_table, found_period)
                print(days_left)
                return

            # --- Normal Mode (Init/Maintain) ---
            self.check_compatibility()
            premake = self.config.get('premake', 10)

            if mode == 'delete':
@@ -473,8 +543,10 @@ class ZabbixPartitioner:

            # Housekeeping extras
            if mode != 'init' and not self.dry_run:
                # delete_extra_data logic...
                pass  # Can add back specific cleanups like the `sessions` table if desired
            self.logger.info("Partitioning completed successfully")

            if mode != 'init' and not self.dry_run:
                pass

def setup_logging(config_log_type: str):
    logger = logging.getLogger('zabbix_partitioning')
@@ -484,7 +556,7 @@ def setup_logging(config_log_type: str):

    if config_log_type == 'syslog':
        handler = logging.handlers.SysLogHandler(address='/dev/log')
        formatter = logging.Formatter('%(name)s: %(message)s')  # Syslog has its own timestamps usually
        formatter = logging.Formatter('%(name)s: %(message)s')
    else:
        handler = logging.StreamHandler(sys.stdout)
@@ -497,6 +569,11 @@ def parse_args():
    parser.add_argument('-i', '--init', action='store_true', help='Initialize partitions')
    parser.add_argument('-d', '--delete', action='store_true', help='Remove partitions (Not implemented)')
    parser.add_argument('--dry-run', action='store_true', help='Simulate queries')

    # Monitoring args
    parser.add_argument('--discovery', action='store_true', help='Output Zabbix LLD JSON')
    parser.add_argument('--check-days', type=str, help='Check days of future partitions left for table', metavar='TABLE')

    return parser.parse_args()

def load_config(path):
@@ -515,20 +592,47 @@ def main():
        with open(conf_path, 'r') as f:
            config = yaml.safe_load(f)

        setup_logging(config.get('logging', 'console'))
        logger = logging.getLogger('zabbix_partitioning')
        # For discovery/check we want minimal logging or mode-specific output, which is handled in run(),
        # but basic logging is still needed for DB errors

        mode = 'maintain'
        if args.init: mode = 'init'
        target = None

        if args.discovery:
            mode = 'discovery'
            config['logging'] = 'console'
            # We don't want log lines mixing with the JSON output,
            # so the mode is checked before logging is set up
        elif args.check_days:
            mode = 'check'
            target = args.check_days
        elif args.init: mode = 'init'
        elif args.delete: mode = 'delete'

        # Setup logging
        # For discovery or check, info logs to stdout are muted to keep the output clean,
        # unless errors happen.
        if mode in ['discovery', 'check']:
            logging.basicConfig(level=logging.ERROR)  # Only show errors
        else:
            setup_logging(config.get('logging', 'console'))

        logger = logging.getLogger('zabbix_partitioning')

        if args.dry_run:
            logger.info("Starting in DRY-RUN mode")

        # ZabbixPartitioner expects a dict config
        app = ZabbixPartitioner(config, dry_run=args.dry_run)
        app.run(mode)
        app.run(mode, target)

    except Exception as e:
        # Important: Zabbix log monitoring needs to see "Failed".
        # We print to stderr on script failure; logging handles the log file.
        try:
            logging.getLogger('zabbix_partitioning').critical(f"Partitioning failed: {e}")
        except:
            pass
        print(f"Critical Error: {e}", file=sys.stderr)
        sys.exit(1)
partitioning/zabbix_partitioning_template.yaml (new file, 65 lines)
@@ -0,0 +1,65 @@
zabbix_export:
  version: '7.0'
  template_groups:
    - uuid: e29f7cbf75cf41cb81078cb4c10d584a
      name: 'Templates/Databases'
  templates:
    - uuid: 69899eb3126b4c62b70351f305b69dd9
      template: 'Zabbix Partitioning Monitor'
      name: 'Zabbix Partitioning Monitor'
      description: |
        Monitor Zabbix Database Partitioning.
        Prerequisites:
        1. Install zabbix_partitioning.py on the Zabbix Server/Proxy.
        2. Configure userparameters for automatic discovery:
           UserParameter=zabbix.partitioning.discovery[*], /usr/local/bin/zabbix_partitioning.py -c $1 --discovery
           UserParameter=zabbix.partitioning.check[*], /usr/local/bin/zabbix_partitioning.py -c $1 --check-days $2

        Or use Docker wrapper scripts.

      groups:
        - name: 'Templates/Databases'
      items:
        - uuid: bc753e750cc2485f917ba1f023c87d05
          name: 'Partitioning Last Run Status'
          type: TRAP
          key: partitioning.run.status
          delay: 0
          history: 7d
          trends: '0'
          value_type: TEXT
          description: 'Send "Success" or "Failed" via zabbix_sender or check log file'
          triggers:
            - uuid: 25497978dbb943e49dac8f3b9db91c29
              expression: 'find(/Zabbix Partitioning Monitor/partitioning.run.status,,"like","Failed")=1'
              name: 'Zabbix Partitioning Failed'
              priority: HIGH
              description: 'The partitioning script reported a failure.'
              tags:
                - tag: services
                  value: database

      discovery_rules:
        - uuid: 097c96467035468a80ce5c519b0297bb
          name: 'Partitioning Discovery'
          key: 'zabbix.partitioning.discovery[/etc/zabbix/zabbix_partitioning.conf]'
          delay: 1h
          description: 'Discover partitioned tables'
          item_prototypes:
            - uuid: 1fbff85191c244dca956be7a94bf08a3
              name: 'Partitions remaining: {#TABLE}'
              key: 'zabbix.partitioning.check[/etc/zabbix/zabbix_partitioning.conf, {#TABLE}]'
              delay: 12h
              history: 7d
              description: 'Days until the last partition runs out for {#TABLE}'
              tags:
                - tag: component
                  value: partitioning
                - tag: table
                  value: '{#TABLE}'
              trigger_prototypes:
                - uuid: da23fae76a41455c86c58267d6d9f86d
                  expression: 'last(/Zabbix Partitioning Monitor/zabbix.partitioning.check[/etc/zabbix/zabbix_partitioning.conf, {#TABLE}])<=3'
                  name: 'Partitioning critical: {#TABLE} has less than 3 days of partitions'
                  priority: HIGH
                  description: 'New partitions are not being created. Check the script logs.'
@@ -7,15 +7,19 @@ database:
  db: zabbix
  # Port mapping in docker-compose is 33060
  port: 33060

partitions:
  # daily: Partitions created daily
  daily:
    - history: 7d
    - history_uint: 7d
    - history_str: 7d
    - history_log: 7d
    - history_text: 7d
    - history_bin: 7d
  # weekly: Partitions created weekly
  weekly:
    - history_log: 7d
  # monthly: Partitions created monthly
  monthly:
    - trends: 365d
    - trends_uint: 365d