Refactor auditlog preparation, rename procedures sequentially, and update test suite

This commit is contained in:
Maksym Buz
2026-03-26 15:57:35 +00:00
parent 14f38efafd
commit 2b7a69ba11
12 changed files with 128 additions and 95 deletions

View File

@@ -34,7 +34,7 @@ The solution is divided into a series of SQL scripts that must be executed seque
* PostgreSQL range partitioning requires the partition key (in this case, `clock`) to be part of the Primary Key.
* This script dynamically locates the existing Primary Key (usually just `auditid`) and alters it to a composite key `(auditid, clock)`.
### 3. `02_maintenance.sql`
### 3. `01_maintenance.sql`
* **Purpose:** Contains the core PL/pgSQL procedural logic that manages the lifecycle of the partitions.
* **Key Functions/Procedures:**
* `partition_exists()`: Queries `pg_class` to verify if a specific child partition partition exists.
@@ -43,14 +43,14 @@ The solution is divided into a series of SQL scripts that must be executed seque
* `maintain_table()`: The orchestrator for a single table. It calculates the necessary UTC timestamps, calls `create_partition()` to build the future buffer, calls `create_partition()` recursively backward to cover the retention period, and finally calls `drop_old_partitions()`.
* `run_maintenance()`: The global loop that iterates through `partitions.config` and triggers `maintain_table()` for every configured Zabbix table.
### 4. `03_enable_partitioning.sql`
### 4. `02_enable_partitioning.sql`
* **Purpose:** The migration script that actually executes the partition conversion on the live database.
* **Actions:**
* It takes the original Zabbix table (e.g., `history`) and renames it to `history_old` (`ALTER TABLE ... RENAME TO ...`).
* It immediately creates a new partitioned table with the original name, inheriting the exact structure of the old table (`CREATE TABLE ... (LIKE ... INCLUDING ALL) PARTITION BY RANGE (clock)`).
* It triggers the first maintenance run so new incoming data has immediate partitions to land in.
### 5. `04_monitoring_view.sql`
### 5. `03_monitoring_view.sql`
* **Purpose:** Provides an easy-to-read observability layer.
* **Actions:**
* Creates the `partitions.monitoring` view by joining `pg_class`, `pg_inherits`, `pg_tablespace`, and `pg_size_pretty`.

View File

@@ -1,7 +1,6 @@
-- ============================================================================
-- SCRIPT: 00_partitions_init.sql
-- DESCRIPTION: Creates the 'partitions' schema and configuration table.
-- Defines the structure for managing Zabbix partitioning.
-- Creates the 'partitions' schema and configuration table.
-- Defines the structure for managing Zabbix partitioning.
-- ============================================================================
CREATE SCHEMA IF NOT EXISTS partitions;

View File

@@ -1,27 +0,0 @@
-- ============================================================================
-- SCRIPT: 01_auditlog_prep.sql
-- DESCRIPTION: Modifies the 'auditlog' table Primary Key to include 'clock'.
-- This is REQUIRED for range partitioning by 'clock'.
-- ============================================================================
DO $$
BEGIN
-- Check if PK needs modification
-- Original PK is typically on (auditid) named 'auditlog_pkey'
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'auditlog_pkey'
AND conrelid = 'auditlog'::regclass
) THEN
-- Verify if 'clock' is already in PK (basic check)
-- Realistically, if 'auditlog_pkey' exists on default Zabbix, it's just (auditid).
RAISE NOTICE 'Dropping existing Primary Key on auditlog...';
ALTER TABLE auditlog DROP CONSTRAINT auditlog_pkey;
RAISE NOTICE 'Creating new Primary Key on auditlog (auditid, clock)...';
ALTER TABLE auditlog ADD PRIMARY KEY (auditid, clock);
ELSE
RAISE NOTICE 'Constraint auditlog_pkey not found. Skipping or already modified.';
END IF;
END $$;

View File

@@ -1,6 +1,5 @@
-- ============================================================================
-- SCRIPT: 02_maintenance.sql
-- DESCRIPTION: Core functions for Zabbix partitioning (Create, Drop, Maintain).
-- Core functions for Zabbix partitioning (Create, Drop, Maintain).
-- ============================================================================
-- Function to check if a partition exists

View File

@@ -1,7 +1,6 @@
-- ============================================================================
-- SCRIPT: 03_enable_partitioning.sql
-- DESCRIPTION: Converts standard Zabbix tables to Partitioned tables.
-- WARNING: This renames existing tables to *_old.
-- Converts standard Zabbix tables to Partitioned tables.
-- WARNING: This renames existing tables to *_old.
-- ============================================================================
DO $$
@@ -23,15 +22,21 @@ BEGIN
WHERE c.relname = v_table;
-- Check if table exists and is NOT already partitioned
IF EXISTS (SELECT 1 FROM pg_class WHERE relname = v_table AND relkind = 'r') THEN
RAISE NOTICE 'Converting table % to partitioned table...', v_table;
-- 1. Rename existing table
EXECUTE format('ALTER TABLE %I.%I RENAME TO %I', v_schema, v_table, v_old_table);
-- 2. Create new partitioned table (copying structure)
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
-- 2. Create new partitioned table (handling auditlog PK uniquely)
IF v_table = 'auditlog' THEN
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING DEFAULTS INCLUDING COMMENTS) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
EXECUTE format('ALTER TABLE %I.%I ADD PRIMARY KEY (auditid, clock)', v_schema, v_table);
EXECUTE format('CREATE INDEX IF NOT EXISTS auditlog_1 ON %I.%I (userid, clock)', v_schema, v_table);
EXECUTE format('CREATE INDEX IF NOT EXISTS auditlog_2 ON %I.%I (clock)', v_schema, v_table);
ELSE
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
END IF;
-- 3. Create initial partitions
RAISE NOTICE 'Creating initial partitions for %...', v_table;

View File

@@ -1,6 +1,5 @@
-- ============================================================================
-- SCRIPT: 04_monitoring_view.sql
-- DESCRIPTION: Creates a view to monitor partition status and sizes.
-- Creates a view to monitor partition status and sizes.
-- ============================================================================
CREATE OR REPLACE VIEW partitions.monitoring AS

View File

@@ -70,18 +70,15 @@ if [[ -f "$SQL_DIR/schema.sql" ]]; then
if [[ -f "../procedures/00_partitions_init.sql" ]]; then
cp "../procedures/00_partitions_init.sql" ./init_scripts/01_10_partitions_init.sql
fi
if [[ -f "../procedures/01_auditlog_prep.sql" ]]; then
cp "../procedures/01_auditlog_prep.sql" ./init_scripts/01_20_auditlog_prep.sql
fi
if [[ -f "../procedures/02_maintenance.sql" ]]; then
cp "../procedures/02_maintenance.sql" ./init_scripts/01_30_maintenance.sql
fi
if [[ -f "../procedures/03_enable_partitioning.sql" ]]; then
cp "../procedures/03_enable_partitioning.sql" ./init_scripts/01_40_enable.sql
fi
if [[ -f "../procedures/04_monitoring_view.sql" ]]; then
cp "../procedures/04_monitoring_view.sql" ./init_scripts/01_50_monitoring.sql
fi
if [[ -f "../procedures/01_maintenance.sql" ]]; then
cp "../procedures/01_maintenance.sql" ./init_scripts/01_30_maintenance.sql
fi
if [[ -f "../procedures/02_enable_partitioning.sql" ]]; then
cp "../procedures/02_enable_partitioning.sql" ./init_scripts/01_40_enable.sql
fi
if [[ -f "../procedures/03_monitoring_view.sql" ]]; then
cp "../procedures/03_monitoring_view.sql" ./init_scripts/01_50_monitoring.sql
fi
else
echo -e "${RED}Error: schema.sql not found in $SQL_DIR${NC}"
exit 1

View File

@@ -1,26 +0,0 @@
-- ============================================================================
-- Modifies the 'auditlog' table Primary Key to include 'clock'.
-- This is REQUIRED for range partitioning by 'clock'.
-- ============================================================================
DO $$
BEGIN
-- Check if PK needs modification
-- Original PK is typically on (auditid) named 'auditlog_pkey'
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'auditlog_pkey'
AND conrelid = 'auditlog'::regclass
) THEN
-- Verify if 'clock' is already in PK (basic check)
-- Realistically, if 'auditlog_pkey' exists on default Zabbix, it's just (auditid).
RAISE NOTICE 'Dropping existing Primary Key on auditlog...';
ALTER TABLE auditlog DROP CONSTRAINT auditlog_pkey;
RAISE NOTICE 'Creating new Primary Key on auditlog (auditid, clock)...';
ALTER TABLE auditlog ADD PRIMARY KEY (auditid, clock);
ELSE
RAISE NOTICE 'Constraint auditlog_pkey not found. Skipping or already modified.';
END IF;
END $$;

View File

@@ -22,15 +22,21 @@ BEGIN
WHERE c.relname = v_table;
-- Check if table exists and is NOT already partitioned
IF EXISTS (SELECT 1 FROM pg_class WHERE relname = v_table AND relkind = 'r') THEN
RAISE NOTICE 'Converting table % to partitioned table...', v_table;
-- 1. Rename existing table
EXECUTE format('ALTER TABLE %I.%I RENAME TO %I', v_schema, v_table, v_old_table);
-- 2. Create new partitioned table (copying structure)
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
-- 2. Create new partitioned table (handling auditlog PK uniquely)
IF v_table = 'auditlog' THEN
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING DEFAULTS INCLUDING COMMENTS) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
EXECUTE format('ALTER TABLE %I.%I ADD PRIMARY KEY (auditid, clock)', v_schema, v_table);
EXECUTE format('CREATE INDEX IF NOT EXISTS auditlog_1 ON %I.%I (userid, clock)', v_schema, v_table);
EXECUTE format('CREATE INDEX IF NOT EXISTS auditlog_2 ON %I.%I (clock)', v_schema, v_table);
ELSE
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
END IF;
-- 3. Create initial partitions
RAISE NOTICE 'Creating initial partitions for %...', v_table;

View File

@@ -19,14 +19,28 @@ All procedures, information, statistics and configuration are stored in the `par
3. **Monitoring View**: `partitions.monitoring` provides system state visibility.
4. **Version Table**: `partitions.version` provides information about installed version of the partitioning solution.
## Prerequisites: Database & User Creation
If you are deploying Zabbix on a fresh database instance (like AWS RDS) rather than a local server, you must first create the `zabbix` user and database using your administrator account (e.g., `postgres`).
1. Connect to your DB instance as the administrator:
```bash
psql "host=YOUR_RDS_HOST port=5432 user=postgres dbname=postgres sslmode=require"
```
2. Create the user and database:
```sql
CREATE USER zabbix WITH PASSWORD 'your_secure_password';
-- On Cloud DBs like RDS, the master user must inherit the new role to grant ownership
GRANT zabbix TO postgres;
CREATE DATABASE zabbix OWNER zabbix;
```
## Installation
The installation is performed by executing the SQL procedures in the following order:
1. Initialize schema (`00_partitions_init.sql`).
2. Auditlog PK adjustment (`01_auditlog_prep.sql`).
3. Install maintenance procedures (`02_maintenance.sql`).
4. Enable partitioning on tables (`03_enable_partitioning.sql`).
5. Install monitoring views (`04_monitoring_view.sql`).
2. Install maintenance procedures (`01_maintenance.sql`).
3. Enable partitioning on tables (`02_enable_partitioning.sql`).
4. Install monitoring views (`03_monitoring_view.sql`).
**Command Example:**
You can deploy these scripts manually against your Zabbix database using `psql`. Navigate to the `procedures/` directory and run:
@@ -38,7 +52,7 @@ DB_HOST="localhost" # Or your RDS endpoint
DB_NAME="zabbix"
DB_USER="zabbix"
for script in 00_partitions_init.sql 01_auditlog_prep.sql 02_maintenance.sql 03_enable_partitioning.sql 04_monitoring_view.sql; do
for script in 00_partitions_init.sql 01_maintenance.sql 02_enable_partitioning.sql 03_monitoring_view.sql; do
echo "Applying $script..."
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -f "$script"
done
@@ -90,7 +104,11 @@ You can schedule this using one of the following methods:
#### Option 1: `pg_cron` (Recommended)
`pg_cron` is a cron-based job scheduler that runs directly inside the database as an extension.
**Setup `pg_cron`:**
> [!NOTE]
> **Cloud Managed Databases (AWS RDS, Aurora, Azure, GCP):**
> Managed databases generally have `pg_cron` pre-installed and handle the authentication/connections securely for you automatically. You do **not** need to install OS packages or configure a `.pgpass` file! Simply modify your RDS Parameter Group to include `shared_preload_libraries = 'pg_cron'` and `cron.database_name = 'zabbix'`, reboot the instance, and execute `CREATE EXTENSION pg_cron;`.
**Setup `pg_cron` (Self-Hosted):**
1. Install the package via your OS package manager (e.g., `postgresql-15-cron` on Debian/Ubuntu, or `pg_cron_15` on RHEL/CentOS).
2. Configure it modifying `postgresql.conf`:
```ini
@@ -114,6 +132,31 @@ You can schedule this using one of the following methods:
- To **view execution logs/history**: `SELECT * FROM cron.job_run_details;`
- To **remove/unschedule** the job: `SELECT cron.unschedule('zabbix_partition_maintenance');`
**⚠️ Troubleshooting `pg_cron` Connection Errors:**
If your cron jobs fail to execute and you see `FATAL: password authentication failed` in your PostgreSQL logs, it is because `pg_cron` attempts to connect via TCP (`localhost`) by default, which usually requires a password.
**Solution A: Use Local Unix Sockets (Easier)**
Edit your `postgresql.conf` to force `pg_cron` to use the local Unix socket (which uses passwordless `peer` authentication):
```ini
cron.host = '/var/run/postgresql' # Or '/tmp', depending on your OS
```
*(Restart PostgreSQL after making this change).*
**Solution B: Provide a Password (`.pgpass`)**
If you *must* connect via TCP with a specific database user and password, the `pg_cron` background worker needs a way to authenticate. You provide this by creating a `.pgpass` file for the OS `postgres` user.
1. Switch to the OS database user:
```bash
sudo su - postgres
```
2. Create or append your database credentials to `~/.pgpass` using the format `hostname:port:database:username:password`:
```bash
echo "localhost:5432:zabbix:zabbix:my_secure_password" >> ~/.pgpass
```
3. Set strict permissions (PostgreSQL will ignore the file if permissions are too loose):
```bash
chmod 0600 ~/.pgpass
```
#### Option 2: Systemd Timers
Systemd timers provide better logging and error handling properties than standard cron.
@@ -189,13 +232,14 @@ GRANT SELECT ON partitions.monitoring TO zbx_monitor;
## Implementation Details
### `auditlog` Table
The standard `auditlog` table Primary Key is `(auditid)`. Partitioning by `clock` requires the partition key to be part of the Primary Key. The initialization script modifies the PK to `(auditid, clock)`.
The standard Zabbix `auditlog` table has a primary key on `(auditid)`. Partitioning by `clock` requires the partition key to be part of the primary key.
To prevent placing a heavy, blocking lock on a highly active `auditlog` table to alter its primary key, the enablement script (`02_enable_partitioning.sql`) detects it and handles it exactly like the history tables: it automatically renames the live, existing table to `auditlog_old`, and instantly creates a brand new, empty partitioned `auditlog` table pre-configured with the required `(auditid, clock)` composite primary key.
### Converting Existing Tables
The enablement script renames the existing table to `table_name_old` and creates a new partitioned table with the same structure.
* **Note**: Data from the old table is NOT automatically migrated to minimize downtime.
* New data flows into the new partitioned table immediately.
* Old data remains accessible in `table_name_old` for manual query or migration if required.
The enablement script guarantees practically zero downtime by automatically renaming the existing tables to `table_name_old` and creating new partitioned tables matching the exact schema.
* **Note**: Data from the old tables is NOT automatically migrated to minimize downtime.
* New data flows into the new partitioned tables immediately.
* Old data remains accessible in `table_name_old` for manual lookup or migration if required.
## Upgrades
@@ -203,3 +247,40 @@ When upgrading Zabbix:
1. **Backup**: Ensure a full database backup exists.
2. **Compatibility**: Zabbix upgrade scripts may attempt to `ALTER` tables. PostgreSQL supports `ALTER TABLE` on partitioned tables for adding columns, which propagates to partitions.
3. **Failure Scenarios**: If an upgrade script fails due to partitioning, the table may need to be temporarily reverted or the partition structure manually adjusted.
---
## Appendix: Zabbix Server & Frontend RDS Configuration
If you are running Zabbix against an external Cloud database (like AWS RDS) via SSL (`verify-full`), you must explicitly configure both the Zabbix Server daemon and the Web Frontend to enforce SSL and locate the downloaded Root CA Certificate.
**Prerequisite:** Download your cloud provider's root certificate (e.g., `global-bundle.pem`) and place it in a secure location on your Zabbix Server (e.g., `/etc/zabbix/global-bundle.pem`).
### 1. Zabbix Server (`/etc/zabbix/zabbix_server.conf`)
Ensure the following database lines are active:
```ini
DBHost=YOUR_RDS_ENDPOINT.amazonaws.com
DBPort=5432
DBName=zabbix
DBUser=zabbix
DBPassword=your_secure_password
DBTLSConnect=verify_full
DBTLSCAFile=/etc/zabbix/global-bundle.pem
```
### 2. Zabbix Frontend PHP (`/etc/zabbix/web/zabbix.conf.php`)
If you used the Web Setup Wizard, it might not configure the Root CA File correctly. Update your config array to enforce encryption and verify the host certificate:
```php
$DB['TYPE'] = 'POSTGRESQL';
$DB['SERVER'] = 'YOUR_RDS_ENDPOINT.amazonaws.com';
$DB['PORT'] = '5432';
$DB['DATABASE'] = 'zabbix';
$DB['USER'] = 'zabbix';
$DB['PASSWORD'] = 'your_secure_password';
$DB['SCHEMA'] = '';
$DB['ENCRYPTION'] = true;
$DB['VERIFY_HOST'] = true;
$DB['CA_FILE'] = '/etc/zabbix/global-bundle.pem';
```