feat: Enhanced partitioning procedures with schema awareness.

This commit is contained in:
Maksym Buz
2026-02-20 21:46:27 +00:00
parent 9d1b84225c
commit ea3e89effa
9 changed files with 323531 additions and 18 deletions

63
ARCHITECTURE.md Normal file
View File

@@ -0,0 +1,63 @@
# Zabbix PostgreSQL Partitioning Architecture
This document provides a brief technical overview of the components, logic, and dynamic querying mechanisms that power the PostgreSQL partitioning solution for Zabbix.
## Schema-Agnostic Design
A core architectural principle of this solution is its **schema-agnostic design**. It does not assume that your Zabbix database is installed in the default `public` schema.
When the procedures need to create, drop, or manipulate a partitioned table (e.g., `history`), they do not hardcode the schema. Instead, they dynamically query PostgreSQL's internal system catalogs (`pg_class` and `pg_namespace`) to locate exactly which schema the target table belongs to:
```sql
SELECT n.nspname INTO v_schema
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = v_table;
```
This ensures that the partitioning scripts will work flawlessly, even in custom Zabbix deployments where tables are housed in alternative schemas.
## File Structure & Queries
The solution is divided into a series of SQL scripts that must be executed sequentially to set up the environment.
### 1. `00_partitions_init.sql`
* **Purpose:** Initializes the foundation for the partitioning system.
* **Actions:**
* Creates the isolated `partitions` schema to keep everything separate from Zabbix's own structure.
* Creates the `partitions.config` table (which stores retention policies).
* Creates the `partitions.version` table for tracking the installed version.
### 2. `01_auditlog_prep.sql`
* **Purpose:** Prepares the Zabbix `auditlog` table for partitioning.
* **Actions:**
* PostgreSQL range partitioning requires the partition key (in this case, `clock`) to be part of the Primary Key.
* This script dynamically locates the existing Primary Key (usually just `auditid`) and alters it to a composite key `(auditid, clock)`.
### 3. `02_maintenance.sql`
* **Purpose:** Contains the core PL/pgSQL procedural logic that manages the lifecycle of the partitions.
* **Key Functions/Procedures:**
* `partition_exists()`: Queries `pg_class` to verify if a specific child partition partition exists.
* `create_partition()`: Executes the DDL `CREATE TABLE ... PARTITION OF ... FOR VALUES FROM (x) TO (y)` to generate a new time-bound chunk.
* `drop_old_partitions()`: Iterates over existing child partitions (using `pg_inherits`) and calculates their age based on their suffix. Drops those older than the defined `keep_history` policy.
* `maintain_table()`: The orchestrator for a single table. It calculates the necessary UTC timestamps, calls `create_partition()` to build the future buffer, calls `create_partition()` recursively backward to cover the retention period, and finally calls `drop_old_partitions()`.
* `run_maintenance()`: The global loop that iterates through `partitions.config` and triggers `maintain_table()` for every configured Zabbix table.
### 4. `03_enable_partitioning.sql`
* **Purpose:** The migration script that actually executes the partition conversion on the live database.
* **Actions:**
* It takes the original Zabbix table (e.g., `history`) and renames it to `history_old` (`ALTER TABLE ... RENAME TO ...`).
* It immediately creates a new partitioned table with the original name, inheriting the exact structure of the old table (`CREATE TABLE ... (LIKE ... INCLUDING ALL) PARTITION BY RANGE (clock)`).
* It triggers the first maintenance run so new incoming data has immediate partitions to land in.
### 5. `04_monitoring_view.sql`
* **Purpose:** Provides an easy-to-read observability layer.
* **Actions:**
* Creates the `partitions.monitoring` view by joining `pg_class`, `pg_inherits`, `pg_tablespace`, and `pg_size_pretty`.
* This view aggregates the total size of each partitioned family and calculates how many "future partitions" exist as a safety buffer.
## Automated Scheduling (`pg_cron`)
While `systemd` timers or standard `cron` can be used to trigger the maintenance, the recommended approach (especially for AWS RDS/Aurora deployments) is using the `pg_cron` database extension.
`pg_cron` allows you to schedule the `CALL partitions.run_maintenance();` procedure directly within PostgreSQL, ensuring the database autonomously manages its own housekeeping without requiring external OS-level access or triggers.

View File

@@ -11,7 +11,6 @@ BEGIN
SELECT 1 FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = p_partition_name
AND n.nspname = 'public'
);
END;
$$ LANGUAGE plpgsql;
@@ -28,7 +27,17 @@ DECLARE
v_start_ts bigint;
v_end_ts bigint;
v_suffix text;
v_parent_schema text;
BEGIN
-- Determine the schema of the parent table
SELECT n.nspname INTO v_parent_schema
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = p_parent_table;
IF NOT FOUND THEN
RAISE EXCEPTION 'Parent table % not found', p_parent_table;
END IF;
-- (No changes needed for time here as passed params are already UTC-adjusted in caller)
v_start_ts := extract(epoch from p_start_time)::bigint;
v_end_ts := extract(epoch from p_end_time)::bigint;
@@ -43,8 +52,8 @@ BEGIN
IF NOT partitions.partition_exists(v_partition_name) THEN
EXECUTE format(
'CREATE TABLE public.%I PARTITION OF public.%I FOR VALUES FROM (%s) TO (%s)',
v_partition_name, p_parent_table, v_start_ts, v_end_ts
'CREATE TABLE %I.%I PARTITION OF %I.%I FOR VALUES FROM (%s) TO (%s)',
v_parent_schema, v_partition_name, v_parent_schema, p_parent_table, v_start_ts, v_end_ts
);
END IF;
END;
@@ -61,16 +70,19 @@ DECLARE
v_partition record;
v_partition_date timestamp with time zone;
v_suffix text;
v_partition_schema text;
BEGIN
-- Calculate cutoff timestamp
v_cutoff_ts := extract(epoch from (now() - p_retention))::bigint;
FOR v_partition IN
SELECT
child.relname AS partition_name
child.relname AS partition_name,
n.nspname AS partition_schema
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
JOIN pg_namespace n ON child.relnamespace = n.oid
WHERE parent.relname = p_parent_table
LOOP
-- Parse partition suffix to determine age
@@ -85,14 +97,14 @@ BEGIN
-- To be safe, adding 1 month to check vs cutoff.
IF extract(epoch from (v_partition_date + '1 month'::interval)) < v_cutoff_ts THEN
RAISE NOTICE 'Dropping old partition %', v_partition.partition_name;
EXECUTE format('DROP TABLE public.%I', v_partition.partition_name);
EXECUTE format('DROP TABLE %I.%I', v_partition.partition_schema, v_partition.partition_name);
COMMIT; -- Release lock immediately
END IF;
ELSIF length(v_suffix) = 8 THEN -- YYYYMMDD
v_partition_date := to_timestamp(v_suffix, 'YYYYMMDD') AT TIME ZONE 'UTC';
IF extract(epoch from (v_partition_date + '1 day'::interval)) < v_cutoff_ts THEN
RAISE NOTICE 'Dropping old partition %', v_partition.partition_name;
EXECUTE format('DROP TABLE public.%I', v_partition.partition_name);
EXECUTE format('DROP TABLE %I.%I', v_partition.partition_schema, v_partition.partition_name);
COMMIT; -- Release lock immediately
END IF;
END IF;

View File

@@ -10,27 +10,35 @@ DECLARE
v_table text;
v_old_table text;
v_pk_sql text;
v_schema text;
BEGIN
FOR v_row IN SELECT * FROM partitions.config LOOP
v_table := v_row.table_name;
v_old_table := v_table || '_old';
-- Determine schema
SELECT n.nspname INTO v_schema
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = v_table;
-- Check if table exists and is NOT already partitioned
IF EXISTS (SELECT 1 FROM pg_class WHERE relname = v_table AND relkind = 'r') THEN
RAISE NOTICE 'Converting table % to partitioned table...', v_table;
-- 1. Rename existing table
EXECUTE format('ALTER TABLE public.%I RENAME TO %I', v_table, v_old_table);
EXECUTE format('ALTER TABLE %I.%I RENAME TO %I', v_schema, v_table, v_old_table);
-- 2. Create new partitioned table (copying structure)
EXECUTE format('CREATE TABLE public.%I (LIKE public.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_table, v_old_table);
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
-- 3. Create initial partitions
RAISE NOTICE 'Creating initial partitions for %...', v_table;
CALL partitions.maintain_table(v_table, v_row.period, v_row.keep_history, v_row.future_partitions);
-- Optional: Migrate existing data
-- EXECUTE format('INSERT INTO public.%I SELECT * FROM public.%I', v_table, v_old_table);
-- EXECUTE format('INSERT INTO %I.%I SELECT * FROM %I.%I', v_schema, v_table, v_schema, v_old_table);
ELSIF EXISTS (SELECT 1 FROM pg_class WHERE relname = v_table AND relkind = 'p') THEN
RAISE NOTICE 'Table % is already partitioned. Skipping conversion.', v_table;

View File

@@ -11,7 +11,6 @@ BEGIN
SELECT 1 FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = p_partition_name
AND n.nspname = 'public'
);
END;
$$ LANGUAGE plpgsql;
@@ -28,7 +27,17 @@ DECLARE
v_start_ts bigint;
v_end_ts bigint;
v_suffix text;
v_parent_schema text;
BEGIN
-- Determine the schema of the parent table
SELECT n.nspname INTO v_parent_schema
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = p_parent_table;
IF NOT FOUND THEN
RAISE EXCEPTION 'Parent table % not found', p_parent_table;
END IF;
-- (No changes needed for time here as passed params are already UTC-adjusted in caller)
v_start_ts := extract(epoch from p_start_time)::bigint;
v_end_ts := extract(epoch from p_end_time)::bigint;
@@ -43,8 +52,8 @@ BEGIN
IF NOT partitions.partition_exists(v_partition_name) THEN
EXECUTE format(
'CREATE TABLE public.%I PARTITION OF public.%I FOR VALUES FROM (%s) TO (%s)',
v_partition_name, p_parent_table, v_start_ts, v_end_ts
'CREATE TABLE %I.%I PARTITION OF %I.%I FOR VALUES FROM (%s) TO (%s)',
v_parent_schema, v_partition_name, v_parent_schema, p_parent_table, v_start_ts, v_end_ts
);
END IF;
END;
@@ -61,16 +70,19 @@ DECLARE
v_partition record;
v_partition_date timestamp with time zone;
v_suffix text;
v_partition_schema text;
BEGIN
-- Calculate cutoff timestamp
v_cutoff_ts := extract(epoch from (now() - p_retention))::bigint;
FOR v_partition IN
SELECT
child.relname AS partition_name
child.relname AS partition_name,
n.nspname AS partition_schema
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
JOIN pg_namespace n ON child.relnamespace = n.oid
WHERE parent.relname = p_parent_table
LOOP
-- Parse partition suffix to determine age
@@ -85,14 +97,14 @@ BEGIN
-- To be safe, adding 1 month to check vs cutoff.
IF extract(epoch from (v_partition_date + '1 month'::interval)) < v_cutoff_ts THEN
RAISE NOTICE 'Dropping old partition %', v_partition.partition_name;
EXECUTE format('DROP TABLE public.%I', v_partition.partition_name);
EXECUTE format('DROP TABLE %I.%I', v_partition.partition_schema, v_partition.partition_name);
COMMIT; -- Release lock immediately
END IF;
ELSIF length(v_suffix) = 8 THEN -- YYYYMMDD
v_partition_date := to_timestamp(v_suffix, 'YYYYMMDD') AT TIME ZONE 'UTC';
IF extract(epoch from (v_partition_date + '1 day'::interval)) < v_cutoff_ts THEN
RAISE NOTICE 'Dropping old partition %', v_partition.partition_name;
EXECUTE format('DROP TABLE public.%I', v_partition.partition_name);
EXECUTE format('DROP TABLE %I.%I', v_partition.partition_schema, v_partition.partition_name);
COMMIT; -- Release lock immediately
END IF;
END IF;

View File

@@ -10,27 +10,35 @@ DECLARE
v_table text;
v_old_table text;
v_pk_sql text;
v_schema text;
BEGIN
FOR v_row IN SELECT * FROM partitions.config LOOP
v_table := v_row.table_name;
v_old_table := v_table || '_old';
-- Determine schema
SELECT n.nspname INTO v_schema
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = v_table;
-- Check if table exists and is NOT already partitioned
IF EXISTS (SELECT 1 FROM pg_class WHERE relname = v_table AND relkind = 'r') THEN
RAISE NOTICE 'Converting table % to partitioned table...', v_table;
-- 1. Rename existing table
EXECUTE format('ALTER TABLE public.%I RENAME TO %I', v_table, v_old_table);
EXECUTE format('ALTER TABLE %I.%I RENAME TO %I', v_schema, v_table, v_old_table);
-- 2. Create new partitioned table (copying structure)
EXECUTE format('CREATE TABLE public.%I (LIKE public.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_table, v_old_table);
EXECUTE format('CREATE TABLE %I.%I (LIKE %I.%I INCLUDING ALL) PARTITION BY RANGE (clock)', v_schema, v_table, v_schema, v_old_table);
-- 3. Create initial partitions
RAISE NOTICE 'Creating initial partitions for %...', v_table;
CALL partitions.maintain_table(v_table, v_row.period, v_row.keep_history, v_row.future_partitions);
-- Optional: Migrate existing data
-- EXECUTE format('INSERT INTO public.%I SELECT * FROM public.%I', v_table, v_old_table);
-- EXECUTE format('INSERT INTO %I.%I SELECT * FROM %I.%I', v_schema, v_table, v_schema, v_old_table);
ELSIF EXISTS (SELECT 1 FROM pg_class WHERE relname = v_table AND relkind = 'p') THEN
RAISE NOTICE 'Table % is already partitioned. Skipping conversion.', v_table;

File diff suppressed because it is too large Load Diff

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,49 @@
ALTER TABLE history RENAME TO history_old;
CREATE TABLE history (
itemid bigint NOT NULL,
clock integer DEFAULT '0' NOT NULL,
value DOUBLE PRECISION DEFAULT '0.0000' NOT NULL,
ns integer DEFAULT '0' NOT NULL,
PRIMARY KEY (itemid,clock,ns)
);
ALTER TABLE history_uint RENAME TO history_uint_old;
CREATE TABLE history_uint (
itemid bigint NOT NULL,
clock integer DEFAULT '0' NOT NULL,
value numeric(20) DEFAULT '0' NOT NULL,
ns integer DEFAULT '0' NOT NULL,
PRIMARY KEY (itemid,clock,ns)
);
ALTER TABLE history_str RENAME TO history_str_old;
CREATE TABLE history_str (
itemid bigint NOT NULL,
clock integer DEFAULT '0' NOT NULL,
value varchar(255) DEFAULT '' NOT NULL,
ns integer DEFAULT '0' NOT NULL,
PRIMARY KEY (itemid,clock,ns)
);
ALTER TABLE history_log RENAME TO history_log_old;
CREATE TABLE history_log (
itemid bigint NOT NULL,
clock integer DEFAULT '0' NOT NULL,
timestamp integer DEFAULT '0' NOT NULL,
source varchar(64) DEFAULT '' NOT NULL,
severity integer DEFAULT '0' NOT NULL,
value text DEFAULT '' NOT NULL,
logeventid integer DEFAULT '0' NOT NULL,
ns integer DEFAULT '0' NOT NULL,
PRIMARY KEY (itemid,clock,ns)
);
ALTER TABLE history_text RENAME TO history_text_old;
CREATE TABLE history_text (
itemid bigint NOT NULL,
clock integer DEFAULT '0' NOT NULL,
value text DEFAULT '' NOT NULL,
ns integer DEFAULT '0' NOT NULL,
PRIMARY KEY (itemid,clock,ns)
);

File diff suppressed because it is too large Load Diff