Hydra is an open-source extension that adds columnar tables to Postgres for efficient analytical reporting. With Hydra, you can analyze billions of rows instantly without changing code.
Hydra augments Postgres’ existing row-based tables, enabling developers to tailor Postgres to their application’s custom transactional and analytical needs. Hydra combines a columnar format, vectorized execution, and parallelism to supercharge modern applications that aren’t wholly transactional or fully analytical, such as real-time dashboards, IoT, geospatial and logistics apps, and time-series workloads.
Tailor Postgres to your modern, real-time apps
👸 Columnar tables = OLAP
OLAP (Online Analytical Processing) is designed to support analytical workloads, such as data mining, reporting, and business intelligence. OLAP systems typically use a multidimensional data model, which allows users to analyze data from multiple perspectives and at different levels of detail. OLAP systems are often used in decision support applications, where users need to quickly and easily analyze large amounts of data.
🤴 Row tables (heap) = OLTP
OLTP (Online Transactional Processing) is optimized for a large number of small, frequent transactions that insert, update, delete, and retrieve data from a database. This type of system handles real-time data processing with record lookups, fast writes, and high concurrency, and is useful for order entry, sales, financial applications, and more.
👸🤝🤴 Row + Columnar tables = HTAP
HTAP (Hybrid Transactional Analytical Processing) combines the strengths of OLTP and OLAP in a single system. When transactions occur, they are instantly accessible for analytics and machine learning. HTAP is commonly used when reporting latency must be low, such as in financial analysis, IoT alerting, fraud detection, supply chain management, customer-facing dashboards, and applications with real-time decision making.
Columnar tables 101
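Columnar tables are organized transversely from row tables: instead of storing all the values of one row together, they store all the values of one column together. For example, take the following table stored in row format: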
The same data stored in columnar can be visualized as follows:
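As a rough illustration of the two layouts, here is a minimal Python sketch. The sample data, the column names, and the `to_columnar` helper are all hypothetical and invented for this example; they are not taken from Hydra's documentation, and the sketch says nothing about Hydra's on-disk format.

```python
# Hypothetical sample data, invented for illustration only.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10},
    {"id": 2, "city": "Lima", "amount": 25},
    {"id": 3, "city": "Oslo", "amount": 40},
]


def to_columnar(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columnar(rows)

# Row layout: summing "amount" walks every complete row.
total_from_rows = sum(row["amount"] for row in rows)

# Columnar layout: the same sum reads a single list and ignores
# the "id" and "city" values entirely.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the difference is how much unrelated data has to be read to compute them, which is why aggregates over a few columns tend to favor the columnar layout.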
Learn more in our docs.
Using Hydra
It’s super simple to swap table format:
Benchmarks
Hydra columnar tables enable the fastest Postgres aggregates on earth.
Review Clickbench for comprehensive results and the list of 42 queries tested.
This benchmark represents a typical workload in the following areas: clickstream and traffic analysis, web analytics, machine-generated data, structured logs, and events data.
Benchmarks were run on a c6a.4xlarge (16 vCPU, 32 GB RAM) with 500 GB of gp2 storage.
For our continuous benchmark results, see BENCHMARKS.
Release Notes
Aggregate queries are over 60% faster compared to the Hydra 1.0 beta release. Spatial indexes and pg_hint_plan are now enabled for performance optimization.
Please refer to the Hydra 1.0 beta release notes here.
Aggregate vectorization
We added vectorization of integer and date data that is stored in a columnar table. Vectorization happens automatically whenever applicable. The following aggregate functions are vectorized:
- MIN
- MAX
- COUNT
- SUM
- AVG
Vectorization can make aggregate queries over 60% faster. If no vectorized aggregate is found or the execution plan is not suitable, Hydra falls back to standard Postgres execution.
This optimization is for Postgres 14+ only.
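The "vectorize when possible, otherwise fall back" behavior described above can be sketched conceptually in Python. This is not Hydra's implementation: the `BATCH_KERNELS` registry, the `aggregate` and `fallback_aggregate` functions, and the choice of which function and type pairs have a batch kernel are assumptions made purely for illustration.

```python
from datetime import date

# Hypothetical registry of whole-column kernels, keyed by
# (aggregate name, value type). Which pairs are present is an
# assumption for this sketch, not a statement about Hydra.
BATCH_KERNELS = {
    ("MIN", int): min,
    ("MAX", int): max,
    ("COUNT", int): len,
    ("SUM", int): sum,
    ("AVG", int): lambda values: sum(values) / len(values),
    ("MIN", date): min,
    ("MAX", date): max,
    ("COUNT", date): len,
}


def fallback_aggregate(func, values):
    """Generic path: fold one value at a time."""
    state = None
    count = 0
    for value in values:
        count += 1
        if state is None:
            state = value
        elif func == "MIN":
            state = min(state, value)
        elif func == "MAX":
            state = max(state, value)
        elif func in ("SUM", "AVG"):
            state = state + value
    if func == "COUNT":
        return count
    if count == 0:
        return None
    if func == "AVG":
        return state / count
    return state


def aggregate(func, values):
    """Use a batch kernel when one is registered, else fall back."""
    value_types = {type(value) for value in values}
    if len(value_types) == 1:
        kernel = BATCH_KERNELS.get((func, value_types.pop()))
        if kernel is not None:
            return kernel(values), "vectorized"
    return fallback_aggregate(func, values), "fallback"
```

For example, `aggregate("SUM", [1, 2, 3])` takes the batch path, while `aggregate("SUM", [1.5, 2.5])` has no registered kernel in this sketch and goes through the generic fold. The caller gets a correct result either way; only the execution path differs.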
Spatial index types and pg_hint_plan
Developer Changelog
View our full CHANGELOG on GitHub.