Open Source Database Guide

What is an Open Source Database?

An open source database is any database application with an open source license and codebase that renders it free to download, view, distribute, modify, and reuse. Developers have more freedom with open source licenses to use existing database technologies along with their own ideas and innovation to build new applications.

The first open source databases were basic SQL databases, such as PostgreSQL [1986] and MySQL [1995]. Today many open source NoSQL databases also exist, including document stores such as MongoDB [2007] and Couchbase [2010], and wide-column databases such as Apache Cassandra [2008] and ScyllaDB [2016] for analysis of large data sets, such as those produced by social media, AdTech, or the Internet of Things (IoT).

Advantages of Open Source Databases

Why use open source databases? Although the best open source database depends largely on the specific needs of the project and user, open source databases share some general advantages.

Zero Cost

Commercial databases can demand major upfront and recurring costs for infrastructure, software licenses, and technical expertise before anything is developed or deployed. Commercial license fees generally rise with the features included or the level of usage permitted. In contrast, open source databases typically carry no license fees for either the database or its features.

Open Core

Some companies offer both OSS and enterprise database options with additional features and support. In such “open core” models, the foundational version or core of the product is free, but additional features require commercial licensing. This way, users can start by trialing the OSS version and easily move to the more robust, better-supported version when it’s time to go into production or scale their applications.

Database Extensions and Plugins

There is an impressive range of plugins and extensions available for many open source databases, covering integration, real-time monitoring, and custom data formats. Users can generally add them to the standard database platform and configure them precisely as needed at no additional cost. However, some open source business models allow for additional commercial licenses (so-called dual licensing or multi-licensing) that may require a purchase to use certain features, extensions, or plugins.

Community Support and Development

As described above, open source refers to source code that is available for users to deploy and modify as they choose. It also refers to the ethos of an open source community and culture, in which developers and users support peers looking to modify or adopt the software for their own use. This can lead to a robust community that develops new products and features and provides free support.

As new business models emerge, open source software is likely to be available to take advantage of the opportunities they create. In addition, since users customize their open source databases in countless ways, the most popular open source products are very flexible and benefit from massive user communities. These communities generate documentation, YouTube training videos, and Q&A forums with free advice from open source database experts.

However, some open source contributors and communities maintain themselves by giving away the open source database code for free use, yet charging other users for application development, cloud hosted services, or technical services and support.

Disadvantages of Open Source Databases

There are several potential disadvantages to open source databases as well, although the most robust, widely used open source databases mitigate them.

Lack of Vendor Accountability

Commercial software supported by a vendor comes with explicit warranties and support agreements. Open source software, however, is often more of an “as-is” offering, with little recourse in case of defects.

While many open source communities are as responsive as (or at times even more responsive than) the vendor of a comparable commercial product, users considering open source databases should objectively research the contributors and community backing the project to assess risks such as:

    • Do they have a large number of coders, or does the project rely upon a few key contributors?
    • What is the release cadence? Do they regularly release updates, or are updates irregular and infrequent?
    • How many bugs are opened against the open source project, and how long do those bugs take to resolve on average?
    • Is the community unified, or fractured? For example, some users may feel so dissatisfied with an open source community they may clone or fork an open source database project to evolve it a different way. While this might be the sign of a vibrant and growing developer community, it can also indicate competing camps of developers and possibly incompatible variants.
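Several of the questions above can be answered with simple numbers. The following is a minimal sketch, assuming you have already exported issue and commit data from the project's tracker and repository. The sample records, field names, and function names are hypothetical and only illustrate the calculations.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical sample data; real values would come from the project's
# issue tracker and commit history.
issues = [
    {"opened": date(2024, 1, 1), "closed": date(2024, 1, 4)},
    {"opened": date(2024, 1, 10), "closed": date(2024, 1, 20)},
    {"opened": date(2024, 2, 1), "closed": None},  # still open
]
commit_authors = ["alice", "alice", "bob", "alice", "carol", "alice"]


def mean_days_to_resolve(issues):
    """Average number of days between opening and closing, ignoring open issues."""
    durations = [
        (issue["closed"] - issue["opened"]).days
        for issue in issues
        if issue["closed"] is not None
    ]
    return mean(durations) if durations else None


def top_contributor_share(authors, top_n=1):
    """Fraction of commits made by the top_n most active authors."""
    if not authors:
        return 0.0
    counts = Counter(authors)
    top_commits = sum(count for _, count in counts.most_common(top_n))
    return top_commits / len(authors)


print(mean_days_to_resolve(issues))
print(top_contributor_share(commit_authors))
```

A high top-contributor share suggests the project relies on a few key people, and a long average resolution time may indicate slow responsiveness. Neither number is conclusive on its own.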

Maintenance and Support

For users that need 24/7 enterprise support with an SLA, open-source products and services may not be the best fit.

Documentation

Documentation breadth and depth for open source databases can vary by release and platform. This can become an issue when users deploy additional open source utilities or plugins with the database for increased functionality, particularly where documentation sources are not robust. Relying on blog posts, community emails, and other typical online sources can turn into a time-consuming exercise in trial and error.

Features

Some open source databases, especially those with wide adoption, have mature capabilities. However, commercial databases may offer more mature or more advanced functionality, which can be particularly important for workloads that rely on specific database features.

Closed Source vs Open Source Databases

Closed source, commercial databases are built on proprietary source code that users cannot access, distribute, modify, or reuse. To use the database within applications, users often pay licensing or subscription fees. The codebase is maintained exclusively by the company that created the database, so if there are any issues with the database management system or you need new features, you’ll have to wait for the vendor to address them. In contrast, any user can access, modify, and contribute to the source code of an open source database.

Open Source Database Examples

There are many open source databases to choose from. Examples of popular open source relational databases include CockroachDB, PostgreSQL, MySQL, and MariaDB. Examples of popular open source NoSQL document databases include MongoDB and CouchDB. Examples of popular open source NoSQL wide-column databases include Cassandra and ScyllaDB. Examples of open source key-value databases include Redis, Riak, and ScyllaDB. Examples of open source graph databases include Neo4j and JanusGraph.

How to Select the Best Open Source Database

When selecting an open source database management system for your production environment, consider the factors outlined in the vendor-agnostic expert guides below. They might be helpful as you select the best open source database for your team and projects.

Martin Heller is a well-respected industry expert. His database selection guide outlines 12 key questions you should consider when evaluating and selecting an open source database:

    • How much data do you expect to store when the application is mature?
    • How many users do you expect to handle simultaneously at peak load?
    • What availability, scalability, latency, throughput, and data consistency does your application need?
    • How often will your database schemas change?
    • What is the geographic distribution of your user population?
    • What is the natural “shape” of your data?
    • Does your application need online transaction processing (OLTP), analytic queries (OLAP), or both?
    • What ratio of reads to writes do you expect in production?
    • Do you need geographic queries and/or full-text queries?
    • What are your preferred programming languages?
    • Do you have a budget? If so, will it cover licenses and support contracts?
    • Are there legal restrictions on your data storage?

His article expands on each of those questions and explores their implications for your database evaluation.
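One common way to turn answers to questions like these into a comparison is a weighted decision matrix. The sketch below is not taken from Heller's guide. The criteria, weights, scores, and candidate names are all hypothetical placeholders that a team would replace with its own evaluation.

```python
# Hypothetical importance weights (1-5) for a few selection criteria.
weights = {
    "scalability": 5,
    "latency": 4,
    "schema_flexibility": 2,
    "budget_fit": 3,
}

# Hypothetical scores (0-10) assigned by the team after evaluating each option.
candidates = {
    "candidate_a": {"scalability": 9, "latency": 8, "schema_flexibility": 5, "budget_fit": 6},
    "candidate_b": {"scalability": 6, "latency": 7, "schema_flexibility": 9, "budget_fit": 9},
}


def weighted_score(scores, weights):
    """Weighted average of the scores, using the given importance weights."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


# Rank candidates from highest to lowest weighted score.
ranking = sorted(
    candidates,
    key=lambda name: weighted_score(candidates[name], weights),
    reverse=True,
)

for name in ranking:
    print(name, round(weighted_score(candidates[name], weights), 2))
```

The result depends heavily on the weights, so it is worth agreeing on them before scoring any candidate.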

GigaOm offers a number of reports that might be helpful when selecting an open source database. For example:

    • GigaOm Radar for Cloud Database Platforms
    • GigaOm Radar for Graph Databases
    • Key Criteria for Evaluating Graph Databases
    • Key Criteria for Evaluating Cloud Database Platforms

G2 offers a database selection guide, as well as an extensive library of user reviews on open source SQL and NoSQL databases.

Each open source database is rated on its:

    • Ease of use
    • Quality of support
    • Ease of setup

In each review, real users respond to questions such as:

    • What do you like best?
    • What do you dislike?
    • What problems are you solving with the product?
    • What benefits have you realized?

NoSQL Masterclasses: Advance Your NoSQL Knowledge

Looking for extensive training on data modeling, database migration, and high performance for NoSQL databases? Our experts offer 3-hour masterclasses that assist practitioners who want to migrate from SQL to NoSQL or advance their understanding of NoSQL data modeling. Each free, self-paced class covers techniques and best practices for these NoSQL concepts that will help you steer clear of mistakes that could inconvenience any engineering team.

You can access the complete course here.
