# PostgreSQL Partitioning for Zabbix

This is the declarative (PostgreSQL procedures based) partitioning implementation for the Zabbix `history`, `trends`, and `auditlog` tables on PostgreSQL. This solution is intended to replace standard Zabbix housekeeping for the configured tables.

Partitioning is very useful for large environments because it completely eliminates the housekeeper from the process. Instead of huge `DELETE` queries on several million rows, fast DDL queries (`ALTER TABLE`) are executed, which drop an entire partition at once.

> [!WARNING]
> **High-Load Environments**:
> 1. **Data Visibility**: After enabling partitioning, old data remains in `*_old` tables and is **NOT visible** in Zabbix. You must migrate data manually if needed.
> 2. **Disable Housekeeping**: You **MUST** disable the Zabbix Housekeeper for History and Trends in *Administration -> Housekeeping*. Failure to do so will cause massive `DELETE` loads.

## Table of Contents

- [Architecture](#architecture)
  - [Components](#components)
- [Prerequisites: Database & User Creation](#prerequisites-database--user-creation)
- [Installation](#installation)
- [Configuration](#configuration)
  - [Modifying Retention](#modifying-retention)
- [Maintenance](#maintenance)
  - [Scheduling Maintenance](#scheduling-maintenance)
- [Monitoring & Permissions](#monitoring--permissions)
  - [Versioning](#versioning)
  - [Least Privilege Access (`zbx_monitor`)](#least-privilege-access-zbx_monitor)
- [Implementation Details](#implementation-details)
  - [`auditlog` Table](#auditlog-table)
  - [Converting Existing Tables](#converting-existing-tables)
- [Upgrades](#upgrades)
- [Appendix: Zabbix Server & Frontend RDS Configuration](#appendix-zabbix-server--frontend-rds-configuration)

## Architecture

The solution uses PostgreSQL native declarative partitioning (`PARTITION BY RANGE`). All procedures, information, statistics, and configuration are stored in the `partitions` schema to maintain full separation from the Zabbix schema.

### Components

1. **Configuration Table**: `partitions.config` defines retention policies.
2. **Maintenance Procedure**: `partitions.run_maintenance()` manages the partition lifecycle.
3. **Monitoring View**: `partitions.monitoring` provides system state visibility.
4. **Version Table**: `partitions.version` records the installed version of the partitioning solution.

## Prerequisites: Database & User Creation

If you are deploying Zabbix on a fresh database instance (like AWS RDS) rather than a local server, you must first create the `zabbix` user and database using your administrator account (e.g., `postgres`).

1. Connect to your DB instance as the administrator:

   ```bash
   psql "host=YOUR_RDS_HOST port=5432 user=postgres dbname=postgres sslmode=require"
   ```

2. Create the user and database:

   ```sql
   CREATE USER zabbix WITH PASSWORD 'your_secure_password';
   -- On Cloud DBs like RDS, the master user must inherit the new role to grant ownership
   GRANT zabbix TO postgres;
   CREATE DATABASE zabbix OWNER zabbix;
   ```

## Installation

The installation is performed by executing the SQL scripts in the following order:

1. Initialize the schema (`00_partitions_init.sql`).
2. Install the maintenance procedures (`01_maintenance.sql`).
3. Enable partitioning on the tables (`02_enable_partitioning.sql`).
4. Install the monitoring views (`03_monitoring_view.sql`).

**Command Example:**

You can deploy these scripts manually against your Zabbix database using `psql`. Navigate to the `procedures/` directory and run:

```bash
# Connect as the zabbix database user
export PGPASSWORD="your_zabbix_password"
DB_HOST="localhost" # Or your RDS endpoint
DB_NAME="zabbix"
DB_USER="zabbix"

for script in 00_partitions_init.sql 01_maintenance.sql 02_enable_partitioning.sql 03_monitoring_view.sql; do
  echo "Applying $script..."
  psql -h $DB_HOST -U $DB_USER -d $DB_NAME -f "$script"
done
```

## Configuration

Partitioning policies are defined in the `partitions.config` table.
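To review the currently active policies, you can query the table directly (a simple read-only check, safe to run at any time):

```sql
-- Show the retention policy for every partitioned table
SELECT * FROM partitions.config;
```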
| Column | Type | Description |
|--------|------|-------------|
| `table_name` | text | Name of the Zabbix table (e.g., `history`, `trends`). |
| `period` | text | Partition interval: `day`, `week`, or `month`. |
| `keep_history` | interval | Data retention period (e.g., `30 days`, `12 months`). |
| `future_partitions` | integer | Number of future partitions to pre-create (buffer). Default: `5`. |
| `last_updated` | timestamp | Timestamp of the last successful maintenance run. |

### Modifying Retention

To change the retention period for a table, update the configuration:

```sql
UPDATE partitions.config SET keep_history = '60 days' WHERE table_name = 'history';
```

## Maintenance

The maintenance procedure `partitions.run_maintenance()` is responsible for:

1. Creating future partitions (current period + `future_partitions` buffer).
2. Creating past partitions (backward coverage based on `keep_history`).
3. Dropping partitions older than `keep_history`.

This procedure should be scheduled to run periodically (e.g., daily via `pg_cron` or system cron).

```sql
CALL partitions.run_maintenance();
```

### Scheduling Maintenance

To ensure partitions are created in advance and old data is cleaned up, the maintenance procedure should be scheduled to run automatically. It is recommended to run the maintenance **twice a day** (e.g., at 05:30 and 23:30).

* **Primary Run**: Creates new future partitions and drops old ones.
* **Secondary Run**: Acts as a safety check. Since the procedure is idempotent (safe to run multiple times), a second run ensures everything is consistent if the first run failed or was interrupted.

You can schedule this using one of the following methods:

#### Option 1: `pg_cron` (Recommended)

`pg_cron` is a cron-based job scheduler that runs directly inside the database as an extension.
> [!NOTE]
> **Cloud Managed Databases (AWS RDS, Aurora, Azure, GCP):**
> Managed databases generally ship with `pg_cron` pre-installed and handle authentication and connections for you automatically. You do **not** need to install OS packages or configure a `.pgpass` file! Simply modify your RDS Parameter Group to include `shared_preload_libraries = 'pg_cron'` and `cron.database_name = 'zabbix'`, reboot the instance, and execute `CREATE EXTENSION pg_cron;`.

**Setup `pg_cron` (Self-Hosted):**

1. Install the package via your OS package manager (e.g., `postgresql-15-cron` on Debian/Ubuntu, or `pg_cron_15` on RHEL/CentOS).
2. Configure it by modifying `postgresql.conf`:

   ```ini
   shared_preload_libraries = 'pg_cron'
   cron.database_name = 'zabbix' # Define the database where pg_cron will run
   ```

3. Restart PostgreSQL:

   ```bash
   systemctl restart postgresql
   ```

4. Connect to your `zabbix` database as a superuser and create the extension:

   ```sql
   CREATE EXTENSION pg_cron;
   ```

5. Schedule the job:

   ```sql
   SELECT cron.schedule('zabbix_partition_maintenance', '30 5,23 * * *', 'CALL partitions.run_maintenance();');
   ```

6. **Manage your `pg_cron` jobs** (run as superuser):
   - To **list all active schedules**: `SELECT * FROM cron.job;`
   - To **view execution logs/history**: `SELECT * FROM cron.job_run_details;`
   - To **remove/unschedule** the job: `SELECT cron.unschedule('zabbix_partition_maintenance');`

**⚠️ Troubleshooting `pg_cron` Connection Errors:**

If your cron jobs fail to execute and you see `FATAL: password authentication failed` in your PostgreSQL logs, it is because `pg_cron` attempts to connect via TCP (`localhost`) by default, which usually requires a password.
**Solution A: Use Local Unix Sockets (Easier)**

Edit your `postgresql.conf` to force `pg_cron` to use the local Unix socket (which uses passwordless `peer` authentication):

```ini
cron.host = '/var/run/postgresql' # Or '/tmp', depending on your OS
```

*(Restart PostgreSQL after making this change.)*

**Solution B: Provide a Password (`.pgpass`)**

If you *must* connect via TCP with a specific database user and password, the `pg_cron` background worker needs a way to authenticate. You provide this by creating a `.pgpass` file for the OS `postgres` user.

1. Switch to the OS database user:

   ```bash
   sudo su - postgres
   ```

2. Create or append your database credentials to `~/.pgpass` using the format `hostname:port:database:username:password`:

   ```bash
   echo "localhost:5432:zabbix:zabbix:my_secure_password" >> ~/.pgpass
   ```

3. Set strict permissions (PostgreSQL will ignore the file if permissions are too loose):

   ```bash
   chmod 0600 ~/.pgpass
   ```

#### Option 2: Systemd Timers

Systemd timers provide better logging and error handling than standard cron.

1. Create a service file **`/etc/systemd/system/zabbix-partitions.service`**:

   ```ini
   [Unit]
   Description=Zabbix PostgreSQL Partition Maintenance
   After=network.target postgresql.service

   [Service]
   Type=oneshot
   User=postgres
   ExecStart=/usr/bin/psql -d zabbix -c "CALL partitions.run_maintenance();"
   ```

2. Create a timer file **`/etc/systemd/system/zabbix-partitions.timer`**:

   ```ini
   [Unit]
   Description=Run Zabbix Partition Maintenance Twice Daily

   [Timer]
   OnCalendar=*-*-* 05:30:00
   OnCalendar=*-*-* 23:30:00
   Persistent=true

   [Install]
   WantedBy=timers.target
   ```

3. Enable and start the timer:

   ```bash
   systemctl daemon-reload
   systemctl enable --now zabbix-partitions.timer
   ```

#### Option 3: System Cron (`crontab`)

Standard system cron is a simple fallback.
**Example Crontab Entry (`crontab -e`):**

```bash
# Run Zabbix partition maintenance twice daily (05:30 and 23:30)
30 5,23 * * * psql -U zabbix -d zabbix -c "CALL partitions.run_maintenance();" >> /var/log/zabbix_maintenance.log 2>&1
```

**Docker Environment:**

If running in Docker, you can execute it via the host's cron by targeting the container:

```bash
30 5,23 * * * docker exec zabbix-db-test psql -U zabbix -d zabbix -c "CALL partitions.run_maintenance();"
```

## Monitoring & Permissions

System state can be monitored via the `partitions.monitoring` view. It includes a `future_partitions` column which counts how many partitions exist *after* the current period. This is useful for alerting (e.g., trigger if `future_partitions < 2`).

```sql
SELECT * FROM partitions.monitoring;
```

### Versioning

To check the installed version of the partitioning solution:

```sql
SELECT * FROM partitions.version ORDER BY installed_at DESC LIMIT 1;
```

### Least Privilege Access (`zbx_monitor`)

For monitoring purposes, it is recommended to create a dedicated user with read-only access to the monitoring view.

```sql
CREATE USER zbx_monitor WITH PASSWORD 'secure_password';
GRANT USAGE ON SCHEMA partitions TO zbx_monitor;
GRANT SELECT ON partitions.monitoring TO zbx_monitor;
```

## Implementation Details

### `auditlog` Table

The standard Zabbix `auditlog` table has a primary key on `(auditid)`. Partitioning by `clock` requires the partition key to be part of the primary key. To avoid placing a heavy, blocking lock on a highly active `auditlog` table while altering its primary key, the enablement script (`02_enable_partitioning.sql`) detects it and handles it exactly like the history tables: it renames the live table to `auditlog_old` and instantly creates a new, empty partitioned `auditlog` table pre-configured with the required `(auditid, clock)` composite primary key.
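Conceptually, the new partitioned table follows the sketch below. This is illustrative only: the actual script mirrors the exact column list of your Zabbix version's schema, and the column types shown here are assumptions.

```sql
-- Illustrative sketch only; 02_enable_partitioning.sql copies the real Zabbix schema.
CREATE TABLE auditlog (
    auditid  varchar(25) NOT NULL,
    clock    integer     NOT NULL DEFAULT 0,
    -- ... remaining auditlog columns ...
    PRIMARY KEY (auditid, clock)  -- the partition key must be part of the PK
) PARTITION BY RANGE (clock);
```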
### Converting Existing Tables

The enablement script achieves practically zero downtime by automatically renaming the existing tables to `table_name_old` and creating new partitioned tables matching the exact schema.

* **Note**: Data from the old tables is NOT automatically migrated, to minimize downtime.
* New data flows into the new partitioned tables immediately.
* Old data remains accessible in `table_name_old` for manual lookup or migration if required.

## Upgrades

When upgrading Zabbix:

1. **Backup**: Ensure a full database backup exists.
2. **Compatibility**: Zabbix upgrade scripts may attempt to `ALTER` tables. PostgreSQL supports `ALTER TABLE` on partitioned tables for adding columns, which propagates to all partitions.
3. **Failure Scenarios**: If an upgrade script fails due to partitioning, the table may need to be temporarily reverted or the partition structure manually adjusted.

---

## Appendix: Zabbix Server & Frontend RDS Configuration

If you are running Zabbix against an external Cloud database (like AWS RDS) via SSL (`verify-full`), you must explicitly configure both the Zabbix Server daemon and the Web Frontend to enforce SSL and locate the downloaded Root CA certificate.

**Prerequisite:** Download your cloud provider's root certificate (e.g., `global-bundle.pem`) and place it in a secure location on your Zabbix Server (e.g., `/etc/zabbix/global-bundle.pem`).

### 1. Zabbix Server (`/etc/zabbix/zabbix_server.conf`)

Ensure the following database lines are active:

```ini
DBHost=YOUR_RDS_ENDPOINT.amazonaws.com
DBPort=5432
DBName=zabbix
DBUser=zabbix
DBPassword=your_secure_password
DBTLSConnect=verify_full
DBTLSCAFile=/etc/zabbix/global-bundle.pem
```

### 2. Zabbix Frontend PHP (`/etc/zabbix/web/zabbix.conf.php`)

If you used the Web Setup Wizard, it might not configure the Root CA File correctly.
Update your config array to enforce encryption and verify the host certificate:

```php
$DB['TYPE']        = 'POSTGRESQL';
$DB['SERVER']      = 'YOUR_RDS_ENDPOINT.amazonaws.com';
$DB['PORT']        = '5432';
$DB['DATABASE']    = 'zabbix';
$DB['USER']        = 'zabbix';
$DB['PASSWORD']    = 'your_secure_password';
$DB['SCHEMA']      = '';
$DB['ENCRYPTION']  = true;
$DB['VERIFY_HOST'] = true;
$DB['CA_FILE']     = '/etc/zabbix/global-bundle.pem';
```
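Before restarting the Zabbix services, you can sanity-check certificate validation from the Zabbix server host with `psql` (substitute your actual endpoint and password; the connection should succeed only if the server certificate validates against the CA bundle):

```bash
psql "host=YOUR_RDS_ENDPOINT.amazonaws.com port=5432 dbname=zabbix user=zabbix \
      sslmode=verify-full sslrootcert=/etc/zabbix/global-bundle.pem" \
     -c "SELECT version();"
```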