What is a Columnar Database? Definition & FAQs

Our blog keeps you up to date with recent news about the ScyllaDB NoSQL database and related technologies, success stories and developer how-tos.

Read the Blog

Columnar Database Definition

Relational databases typically store data in rows, which in turn consist of columns. A columnar database, also called a column-oriented database or a wide-column store, is a database that stores the values of each column together, rather than storing the values of each row together. By organizing data in this way, columnar databases usually exhibit faster queries and support aggregate functions over columns of data. A columnar database design does not require the same columns to be present for all rows, allowing more flexible usage, and reducing space that would be reserved in empty columns in a relational database management system (RDBMS).

What are the advantages of a Columnar Database?

The manner in how a columnar database stores data together is functionally similar to defining an index on every column in a row-based database. For example, calculating the average price or sum of values is fast, because there are no page scans, no traversing all rows on the disk to find the prices to average. Because the data in each column is contiguous, sorting can also be performed efficiently. Columnar databases are also highly scalable and compress well.

What are the drawbacks of a Columnar Database?

Because the column data is stored together in a columnar database, a row of data is split across multiple sections. When a row is written, each column is written separately, causing slower writes and lower consistency during the write, but the data will become consistent eventually as all the writes complete. Therefore, one trade-off with columnar databases is slower write times in exchange for faster aggregate function times. In short, row operations are slower, but column operations are faster.

Use Cases for Columnar Databases

Columnar databases are well suited for big data processing, business intelligence (BI), and analytics. IoT (Internet of Things) devices feed high volumes of data into data stores. IoT records may have limited data elements with new records arriving in a steady stream, so a big data columnar database would be ideal for analyzing the data. Cellular telecommunications, for example, relies on constant communication between the handset and the towers. Telecom providers need to analyze that data in near real time in order to detect problems with a tower.

Columnar Database Examples

Apache Cassandra led the way with columnar databases. Many other vendors have followed the Cassandra model with their own unique variations, while also implementing CQL (Cassandra Query Language). Columnar databases that use CQL include Apache Cassandra, DataStax, Microsoft Azure Cosmos DB, and ScyllaDB, which is a native C++ rewrite of Cassandra.

Other databases, such as Apache HBase, use their own query language. Apache Kudu was built to run on Hadoop, while MariaDB was forked from the MySQL RDBMS and supports a subset of SQL commands.

Does ScyllaDB Offer Solutions for Operational Databases?

NoSQL-based operational database systems can truly harness the power of big data by using technologies such as ScyllaDB, which is a drop-in replacement for Apache Cassandra that provides built-in schedulers, a native memory allocator, automatic configuration capabilities, high scalability, and support of global and local indexes. These features enable orders-of- magnitude increase in database performance, increased throughput and storage capacity, reduced hardware costs, and simplified usability and maintainability.

Why ScyllaDB?

Is ScyllaDB right for me?

ScyllaDB University

ScyllaDB Blog

Columnar Database

Columnar Database Definition