The Rise of PostgreSQL:

The Rise of PostgreSQL:
Image created with DALL-E

Today, I am looking at the two most popular open source relational Database Management Systems MySQL and PostgreSQL.

We can observe a consistent trend in DB Engines Rankings, namely, MySQL going down and PostgreSQL rising up.

Stack Overflow Developer Survey 2022 shows what is popular to the professional developer.

Stack Overflow Developer Survey 2022
In May 2022 over 70,000 developers told us how they learn and level up, which tools they’re using, and what they want.

Professional Developers Section show they work with PostgreSQL the most out of the database. I believe this is the first time it was above MySQL.

The majority of my experiences have been with scaling MySQL, so why has Google invested so much development into their Google Cloud SQL (PostgreSQL) service while the Google Cloud SQL (MySQL) service lacked until recently? Also Google recently introduced AlloyDB, which offers PostgreSQL clustering with full HA, but this article will not be discussing AlloyDB yet and will instead focus on why PostgreSQL has seen a surge in popularity.

What makes PostgreSQL rise while MySQL drops in use?

There are few key reasons I believe give PostgreSQL the upper hand.

  • I often hear that Oracle owning MySQL is a reason to dismiss it, which could explain why MySQL is becoming less popular. While I understand this feeling, I believe it should not be a cause for dismissal. In fact, Oracle has done a lot for the MySQL open source project, and many "drop in" versions, such as Percona and MariaDB, still draw from it.
  • Another reason is PostgreSQL can answer more complex data structures than MySQL. PostgreSQL can handle complex data structures such as arrays, JSON, XML, and key-value data types, which MySQL cannot fully support. PostgreSQL also supports advanced data types like geometric, network, and bit-string types as well as custom data types. Furthermore, PostgreSQL can store data in native PostgreSQL data types, allowing for more efficient storage and access than MySQL.

Note: MySQL can handle JSON using a set of functions that allow for manipulation, comparison, querying and parsing of JSON documents, although it does not have a dedicated JSON data type and can be limited.

Due to these important reasons, I believe developers in the open-source community contribute a lot of value to PostgreSQL. The ability to handle more complex data structures gives PostgreSQL an edge in the current data landscape.

When to use PostgreSQL or MySQL?

PostgreSQL:

When data integrity and complex transactions are a priority, PostgreSQL is often chosen as the go-to database. It offers superior performance when dealing with larger databases than MySQL and is also suitable for advanced data manipulation tasks, such as geospatial analysis, full-text search, and processing of large datasets.

MySQL:

MySQL is often utilized to make web-based applications simpler, particularly those which have heavy relational read workloads. It is especially beneficial in these applications because it is adept at quickly performing primary key lookups in heavily concurrent, read-intensive workloads. For applications with more complex data requirements, PostgreSQL is often used.

Here is a table for some comparisons.

OpenSource RDMS:

PostgreSQL

MySQL

Type:

OLTP, OLAP (object-relational database)

OTLP, OLAP (relational database)

Typical Use Cases:

  • GIS Data

  • Web

  • Scientific Data (integrates with Matlab and R)

  • OLTP transactions

  • Web

  • LAMP Stack Applications

  • E-commerce

 

Benefits:

  • PostgreSQL offers more sophisticated data types, and lets objects inherit properties.

  • It offers more Advanced indexes liked partial Index, Bloom Filters.

  • On the flip side, it also makes it more complex to work with PostgreSQL.

  • MySQL can use 16 Alternative Storage Engines while PSQL was one. 

  • Requires less Memory than PSQL with higher connections because of how it allocates memory. This make it easier for capacity planing than with PSQL. 

Excels:

  • PSQL's scalability and high performance makes is suited for larger, complicated database projects when properly architected.

  •  MySQL Excels in a web-base applications because it is performs better in simple primary key lookups in high-concurrency read-intensive workloads. 

When to use:

  • Need Structured Data (SQL) with ACID transaction guarantee. 

  • "Distributed SQL" is required (via Citus Data plugin).  

  • A multi-model database including Geospatial Data type is expected

  • A sophisticated query planner is wanted

  • Need Structured Data (SQL) with ACID transaction guarantee. 

  • Horizontal Scalability is a key requirement. 

  • Multi-Master ACID is a requirement.  (InnodbCluster)

When Not to Use: 

  • Multi-Master ACID transaction is a must-have feature.

  • Data is Semi-structured, i.e., JSON data with advanced query planers.

  • (PSQL’s JSON feature is limited.) 

  • A Converged database (OLTP and OLAP) is required.

  • Data is extremely relational (e.g., Social Media), i.e., Graph like data.

  • "Distributed SQL” is required where millions of transactions should be handled in a globally distributed database.

  • Data is extremely relational i.e., Graph like data.

  • A Converged database (OLTP and OLAP) is required.

  • Data is Semi-structured, i.e., JSON data with advanced query features

  • (MySQL’s JSON feature is limited.) 

 

Security:

  • SSL Support 

  • TLS Support 

Replication:

  • Asynchronous 

  • Cascade

  • Synchronous

  • Asynchronous

  • Synchronous 

  • SemiSynchronous

Community: 

  •  Vendor Backed Community with access to free resources 

  •   (support with paid version) 

  • Poplular Forks like:

    • Percona 

    • MariaDB

    • Drizzle 

Databases As a Service: 

 


Open Source Clustering MySQL and PostgreSQL:

MySQL Cluster:

Percona’s XtraDB Cluster:

Percona XtraDB Cluster
Documentation

Percona XtraDB Cluster

Key components used:

Galera Replication:

  • Synchronous
  • Multi-Source

Configuration script for ProxySQL:

  • Connection Multiplexing
  • ProxySQL Query Rules

Oracle's MySQL Community:

InnoDB Cluster: (Not to be confused with MySQL NDB Cluster)

MySQL :: MySQL Shell 8.0 :: 7 MySQL InnoDB Cluster

MySQL :: MySQL Shell 8.0 :: 7 MySQL InnoDB Cluster

All components native to MySQL:

Key Components used:

MySQL Group Replication (a group of database server which replicates to each other with fault tolerance).

  • Synchronous
  • Multi-Source
  • MySQL Router (query router to the healthy database nodes)
  • MySQL Shell (helper, client, configuration tool)

Note: MySQL Innodb Cluster utilizes Group Replication as part of its suite for High Availability. Group Replication is seen as the future of MySQL replication and clustering, however, Percona XtraDB Cluster and MariaDB Galera Cluster are other MySQL database cluster solutions that may be better suited for certain use cases. As I have not had significant experience with Group Replication on a large scale, I cannot attest to its efficacy.

PostgreSQL:

PostgreSQL with a combination of tools:

  • PostreSQL Replication

It can be configured as synchronous, and multi-Source if needed. With the use of Patroni and HAproxy it can be an open source solution for HA like the MySQL cluster options.

  • Patroni

Patroni is a reliable open source solution for PostgreSQL high availability. It is written in Python and uses etcd or Zookeeper to store its data. It can be used to set up and manage a highly available PostgreSQL cluster with multiple standby servers.

  • HAProxy

HAproxy is an event-driven, non-blocking engine that combines a fast I/O layer with a priority-based, multi-threaded scheduler. Its architecture is specifically designed for forwarding data quickly and efficiently, with a minimal number of operations.

High Availability PostgreSQL Cluster using Patroni and HAProxy

High Availability PostgreSQL Cluster using Patroni and HAProxy
Highly Available PostgreSQL Cluster using Patroni and HAProxy is a blog article from JFrog on for Open Source Database Suggestions

GCP Variant:

MySQL:

  • No cluster solution native to Google

PostgreSQL:

Cloud Agnostic:

MySQL:

PostgreSQL: