feat: enterprise audit fixes (schema resolution, race conditions, documentation)

2026-04-30 10:50:36 +00:00
parent fb65b2f1e7
commit 7305b79943
9 changed files with 399 additions and 52 deletions
--- a/postgresql/procedures/README.md
+++ b/postgresql/procedures/README.md
@@ -21,6 +21,8 @@ This is the declarative partitioning implementation for Zabbix `history*`, `tren
 - [Implementation Details](#implementation-details)
  - [`auditlog` Table](#auditlog-table)
  - [Converting Existing Tables](#converting-existing-tables)
+- [PostgreSQL Tuning](#postgresql-tuning)
+- [Uninstall / Reverting](#uninstall--reverting)
 - [Upgrades](#upgrades)

 ## Architecture
@@ -36,27 +38,10 @@ All procedures, information, statistics and configuration are stored in the `par

 ## Installation

-The installation is performed by executing the SQL procedures in the following order:
-1.  Initialize schema (`00_schema_create.sql`).
-2.  Install maintenance procedures (`01_maintenance.sql`).
-3.  Enable partitioning on tables (`02_enable_partitioning.sql`).
-4.  Install monitoring views (`03_monitoring_view.sql`).
+> [!IMPORTANT]
+> **Please refer to the [MANUAL.md](MANUAL.md) for the complete, step-by-step, foolproof installation instructions.**
+> The manual contains critical safety procedures, backup warnings, and copy-pasteable commands for a safe deployment.

-**Command Example:**
-You can deploy these scripts manually against your Zabbix database using `psql`. Navigate to the `procedures/` directory and run:
-
-```bash
-# Connect as the zabbix database user
-export PGPASSWORD="your_zabbix_password"
-DB_HOST="localhost" # Or your DB endpoint
-DB_NAME="zabbix"
-DB_USER="zbxpart_admin"
-
-for script in 00_schema_create.sql 01_maintenance.sql 02_enable_partitioning.sql 03_monitoring_view.sql; do
-    echo "Applying $script..."
-    psql -h $DB_HOST -U $DB_USER -d $DB_NAME -f "$script"
-done
-```

 ## Configuration

@@ -213,6 +198,24 @@ System state can be monitored via the `partitions.monitoring` view. It includes
 SELECT * FROM partitions.monitoring;
 ```

+### Zabbix Agent Integration
+To monitor the state of the partitions directly from Zabbix, you need to provide the Zabbix Agent with the SQL query used to fetch this data. You can generate the required `partitions.get_all.sql` file on the agent host with this command:
+
+```bash
+cat << 'EOF' | sudo tee /etc/zabbix/zabbix_agent2.d/partitions.get_all.sql > /dev/null
+SELECT 
+    table_name,
+    period,
+    keep_history::text AS keep_history,
+    configured_future_partitions,
+    actual_future_partitions,
+    total_size_bytes,
+    EXTRACT(EPOCH FROM (now() - last_updated)) AS age_seconds
+FROM partitions.monitoring;
+EOF
+```
+*(Make sure to adjust the destination path according to your Zabbix Agent template directory)*
+
 ### Versioning
 To check the installed version of the partitioning solution:
 ```sql
@@ -243,9 +246,59 @@ The enablement script guarantees practically zero downtime by automatically rena
 *   New data flows into the new partitioned tables immediately.
 *   Old data remains accessible in `table_name_old` for manual lookup or migration if required.

-## Upgrades
+### Housekeeper Interceptor
+Even when Zabbix Housekeeping is disabled in the UI for History and Trends, the Zabbix Server daemon may still generate and insert tasks into the `housekeeper` table (e.g., when an item or trigger is deleted, it schedules the deletion of its historical data). Without intervention, this results in the `housekeeper` table bloating massively over time, leading to slow sequential scans and `autovacuum` overhead.

-When upgrading Zabbix:
+To prevent this, the extension installs a `BEFORE INSERT` trigger on the `housekeeper` table.
+*   When Zabbix attempts to insert a housekeeper task, the trigger intercepts it and checks if the target table is managed in `partitions.config`.
+*   If the table is partitioned (like `history`), the trigger **silently discards the insert** (`RETURN NULL`), so the row is never written and the table does not bloat.
+*   If the table is not partitioned (like `events` or `sessions`), the task is allowed to be recorded and is cleaned up naturally by Zabbix.
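
One way to check that the interceptor described above is in place is to query the PostgreSQL catalog and the `housekeeper` table directly. The following is a minimal sketch, not part of the shipped scripts: the function names are hypothetical, and `DB_HOST`, `DB_USER` and `DB_NAME` are assumed environment variables whose defaults are illustrative only. The trigger's actual name is not stated here, so the sketch simply lists whatever user-defined triggers exist.

```bash
# Hypothetical helpers; DB_HOST, DB_USER and DB_NAME are assumed environment
# variables, with defaults that are illustrative only.

# Print the names of user-defined (non-internal) triggers on "housekeeper".
list_housekeeper_triggers() {
    psql -h "${DB_HOST:-localhost}" -U "${DB_USER:-zbxpart_admin}" -d "${DB_NAME:-zabbix}" -At \
        -c "SELECT tgname FROM pg_trigger WHERE tgrelid = 'housekeeper'::regclass AND NOT tgisinternal;"
}

# Print "tablename|count" for the tasks currently queued in "housekeeper".
count_housekeeper_tasks() {
    psql -h "${DB_HOST:-localhost}" -U "${DB_USER:-zbxpart_admin}" -d "${DB_NAME:-zabbix}" -At \
        -c "SELECT tablename, count(*) FROM housekeeper GROUP BY tablename ORDER BY tablename;"
}
```

If the trigger is active, `list_housekeeper_triggers` should print at least one name, and the counts reported for partitioned tables such as `history` should stop growing between runs of `count_housekeeper_tasks`. Rows queued before the trigger was installed may still be present.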
+
+## PostgreSQL Tuning
+
+Before or immediately after enabling partitioning, you should tune your `postgresql.conf`. The standard configuration is not optimized for partitioned tables and might cause performance degradation or out-of-memory errors.
+
+| Parameter | Recommended | Description |
+|-----------|-------------|-------------|
+| `max_locks_per_transaction`| `512` (or higher) | **Requires DB Restart.** Default is `64`, which is far too low. PostgreSQL takes a separate lock on every partition a statement touches. With many partitions (e.g., history x 30 days), operations like `pg_dump`, `VACUUM`, or queries crossing multiple partition boundaries can fail with *“out of shared memory”*. |
+| `jit` | `off` | **Highly Recommended.** JIT compilation adds per-query overhead. With many partitions, it can drastically increase CPU usage as PostgreSQL attempts to optimize simple queries across dozens of partitions. |
+
+**Default parameters to verify:**
+The following are usually set correctly by default, but you should verify them just in case:
+*   `enable_partition_pruning = on` : **Critical.** Ensures PostgreSQL only queries the necessary partitions instead of scanning everything.
+*   `enable_partitionwise_join = off` : Zabbix does not do massive joins on history tables; enabling this only wastes planner CPU time.
+*   `enable_partitionwise_aggregate = off` : Zabbix doesn't perform complex DB-side `GROUP BY` aggregations on history. Leave it disabled.
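
The values listed above can be verified from the command line with `SHOW`. The helper below is a sketch under stated assumptions: `check_pg_setting` is a hypothetical name, and `DB_HOST`, `DB_USER` and `DB_NAME` are assumed environment variables with illustrative defaults. It compares values as plain strings, so a `max_locks_per_transaction` above `512` would be reported as a mismatch even though it is acceptable.

```bash
# Hypothetical helper: compare one PostgreSQL setting against an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the value could not be read.
check_pg_setting() {
    local name="$1" expected="$2" actual
    actual="$(psql -h "${DB_HOST:-localhost}" -U "${DB_USER:-zbxpart_admin}" \
        -d "${DB_NAME:-zabbix}" -At -c "SHOW ${name};")" || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK   ${name} = ${actual}"
    else
        echo "WARN ${name} = ${actual} (expected ${expected})"
        return 1
    fi
}
```

For example, `check_pg_setting jit off` and `check_pg_setting enable_partition_pruning on` cover two of the parameters discussed in this section.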
+
+## Uninstall / Reverting
+
+If you wish to stop using partitioning and revert to standard, unpartitioned tables without data loss, follow these steps carefully.
+
+> [!CAUTION]
+> Reverting partitioning replaces your partitioned tables with standard empty tables. If you need to retain data from the partitioned period, you must manually migrate it before dropping the partition sets. **Always stop Zabbix Server before proceeding.**
+
+1. **Stop Zabbix Server** to prevent new data from being inserted during the transition.
+2. **Execute Undo Script:** Run the `04_undo_partitioning.sql` script to recreate non-partitioned tables matching your original Zabbix schema. This script will rename your current partitioned tables to `*_part` (`history_part`, `trends_part`, etc.) and automatically create native, clean tables (`history`, `trends`) in their place.
+   ```bash
+   psql -h "$DB_HOST" -U zbxpart_admin -d zabbix -f 04_undo_partitioning.sql
+   ```
+3. **Data Migration (Optional):** If you want to keep the metrics collected during the partitioned period, you must manually insert them into the newly created regular tables. This step can take hours depending on table sizes.
+   ```sql
+   INSERT INTO history SELECT * FROM history_part;
+   INSERT INTO trends SELECT * FROM trends_part;
+   -- Repeat for all tables you wish to restore
+   ```
+4. **Cleanup:** Once you have migrated the data you need (or if you don't need it at all), you can drop the heavy partitioned tables and remove the partitioning extensions completely.
+   ```sql
+   DROP TABLE history_part CASCADE;
+   DROP TABLE history_uint_part CASCADE;
+   -- Repeat for all *_part tables ...
+   
+   -- To drop the automatic maintenance infrastructure:
+   DROP SCHEMA partitions CASCADE;
+   ```
+5. **Start Zabbix Server & Re-enable Housekeeper:** Once the tables are replaced, you can start the server. *Don't forget to re-enable Housekeeping for History and Trends in the Zabbix UI!*
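
The "repeat for all tables" part of the data migration step can be scripted. This is a minimal sketch, assuming the `*_part` naming described above; `gen_migration_sql` is a hypothetical helper, and the list of tables to pass is up to you because it depends on which tables were partitioned. It only prints SQL, so the statements can be reviewed before anything is executed.

```bash
# Hypothetical helper: print one INSERT ... SELECT statement per table name.
# Names containing anything other than a-z, 0-9 and "_" are skipped, so the
# arguments cannot inject arbitrary SQL.
gen_migration_sql() {
    local table
    for table in "$@"; do
        case "$table" in
            *[!a-z0-9_]*|"")
                echo "skipping invalid table name: $table" >&2
                continue
                ;;
        esac
        printf 'INSERT INTO %s SELECT * FROM %s_part;\n' "$table" "$table"
    done
}
```

Running `gen_migration_sql history trends` prints the two statements shown in step 3. The output can then be piped to `psql` once it has been checked.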
+
+## Upgrades
 1.  **Backup**: Ensure a full database backup exists.
 2.  **Compatibility**: Zabbix upgrade scripts may attempt to `ALTER` tables. PostgreSQL supports `ALTER TABLE` on partitioned tables for adding columns, which propagates to partitions.
 3.  **Failure Scenarios**: If an upgrade script fails due to partitioning, the table may need to be temporarily reverted or the partition structure manually adjusted.