SQLStarters

Open-source vs. commercial database systems

While SQL Server has been one of the cornerstones of my 25+ years career in IT, I’ve occasionally wandered out of my commercial databases comfort zone to the open-source database space (PostgreSQL and less frequently MySQL). These forays to the open-source database systems have definitely become more common as I started working with the public clouds, and for a reason. Public clouds have been one of the biggest things to drive the adaptation of open-source database systems in a while.

Open-source or commercial database

While I see a lot of interest in moving to use open-source database systems, in my opinion at least, I don’t often see enough debate around the capabilities of open-source databases compared to commercial ones. Are there still reasons to pay top dollar (or euro) for enterprise grade database systems?

The answer to that question is, that yes there is. There are database workloads out there, that require top security, top performance and the highest level of availability and 24/7 support available. No question about it. And these are things, that you can only get by spending money on them.

You can definitely build a high performing and secure system with open-source databases, especially by using public cloud and PaaS deployment models. However, when it comes to your most business critical database workloads, the commercial systems will provide the most value for the money.

About commercial database systems

When it comes to commercial, relational database systems, the top 3 on my mind are SQL Server, Oracle and DB2. When you look at the feature lists on all of them, they’re long and very similar in content. Besides having similar set of features and capabilities, there are couple other things that are common between each of them.

  • Their owning companies have invested ungodly amount of money in their development.
  • Each company also has a premium PaaS option available in their own cloud (Azure, OCI, IBM Cloud).
  • All of these database systems continue to receive updates and new versions in regular pace.
  • Each of the companies have a large organization to provide 24/7 support for their products.

The topic of this post isn’t to tell you which commercial database system to pick (it should be SQL Server anyway), but I wanted to point out that development of the commercial database systems is rather similar.

About open-source database systems

Similarly to commercial database systems, development of the open-source systems tends to be the same as well. In this post, my focus is on 2 of the most popular open-source database systems: PostgreSQL and MySQL. Out of these 2, the MySQL has a bit more complicated offering, as it’s actually owned by Oracle. There is a community edition, which is free, but Oracle has also rolled out a commercial offering. Not so surprisingly, the commercial editions are more feature packed.

It should be noted, that while PostgreSQL is a fully open-sourced project, one of the major contributors to its development has historically been EDB. And maybe not too surprisingly, they have developed quite a feature packed, commercial PostgreSQL offering, of their own.

See the pattern forming here?

The major hyperscalers also have open-source database PaaS solutions, but they also differ in their approach, especially in the case of MySQL. Microsoft uses the community editions in Azure Database for MySQL, whereas AWS Aurora MySQL is, more or less compatible, with the community edition. While they’re both MySQL, the approach AWS has, creates a vendor lock to AWS cloud.

Comparing open-source to commercial databases

Now that we’ve looked into the background, let’s look at the actual, technological differences between open-source and commercial database systems. In this case, I’ll just focus on using SQL Server as the commercial database option, and compare it against PostgreSQL and MySQL.

Assume that other commercial databases, like Oracle and DB2 would match the capabilities of SQL Server.

Engine capabilities

Let’s start by looking at the engine capabilities. While we’re comparing relational database systems here, that’s not all they are. Unlike the purpose-built databases, all of these are what multi-model databases. A multi-model database, as you can probably guess, can support multiple data models using the same engine.

SQL ServerPostgreSQLMySQL
RelationalRelationalRelational
JSONJSONDocument
XMLSpatial
Spatial
Graph
Key-value
Table: Supported database models

While the list for PostgreSQL and MySQL are considerably shorter, it should again be noted that there are commercial plugins, extensions, etc. to bring many of these same models available for both of the open-source databases.

High Availability and Disaster Recovery

Let’s then compare HA and DR capabilities for each system. I am only listing features available natively (not added in by the cloud, 3rd party product, or commercial edition) for each database.

SQL ServerPostgreSQLMySQL
Availability GroupsStandby DatabaseReplication
Failover ClusteringReplicationInnoDB Clustering
Log Shopping
Database Mirroring
Replication
Table: Comparing HA/DR capabilities

Here SQL Server provides most capabilities out of the box, compared to open-source alternatives. Now, there are many alternative ways to do high availability with PostgreSQL and MySQL, but they mostly rely on 3rd party open-source solutions (load balancers, clustering services, etc.)

Replication capabilities

As replication is something we can find from each of these three, let’s compare that next.

SQL ServerPostgreSQLMySQL
Transactional replicationStreaming replicationMaster-Slave replication
Snapshot replicationLogical replicationMulti-source replication
Merge replication
Peer-to-Peer migration
Table: Comparing replication options

Both PostgreSQL and MySQL have additional open-source and commercial options for other replication methods. Let’s then move to look a bit more at the internals, and indexes are always interesting.

Supported index types

SQL ServerPostgreSQLMySQL
ClusteredB-TreeB-Tree
NonclusteredHashHash
UniqueGINSpatial
Index with included columnsBRINFull-text
Index with computed columnsGiSTCovering
ColumnstoreSP-GiSTComposite
Memory-optimized nonclusteredUnique
SpatialClustered
XML
Full-text
Filtered
Hash
Table: Available index types by engine

And I could produce plenty of similar types of comparison tables, but I guess these should make the point already. There are also some things that can completely surprise someone who is used to managing, let’s say SQL Server, by proving some surprising lessons learned. Like the process of vacuuming.

Moreover, you can also expect to get the higher performance out of the commercial database systems.

To wrap it up

Open-source database systems are popular and for a good reason, they offer reasonably sized feature sets for a low cost. Run it as a PaaS in a cloud platform, and for that additional cost you get a database that can do many of the things that commercial, enterprise grade database systems can do.

And where you have those business critical workloads that require a bit more, there are some great commercial database systems out there that will cover all your needs.

Published by

Leave a Reply

%d bloggers like this: