Understanding Redis
What is Redis?
Redis (Remote Dictionary Server) is an open source key-value database server.
The most accurate way to describe Redis is as a data structure server. This unique characteristic is the main reason for its popularity and for its adoption in so many real-world projects.
Instead of iterating over, sorting, and ordering database rows, what if the information were available from the start in exactly the data structures the programmer needs? Early on, Redis was used in much the same way as Memcached. But as Redis evolved, this database management system (DBMS) found applications in many other situations: implementing publish/subscribe mechanisms, streaming data processing, and systems built around queues.
Here are the data types supported by Redis (a brief example of working with a few of them follows the list):
- String
- Bitmap
- Bitfield
- Hash
- List
- Set
- Sorted set
- Geospatial
- HyperLogLog
- Stream
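Here is a minimal sketch of working with several of these types using the redis-py client; the host, port, and key names are illustrative, and a local Redis server on the default port is assumed:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("greeting", "hello")                                # String
r.lpush("tasks", "a", "b")                                # List
r.hset("user:1", mapping={"name": "Ada", "lang": "en"})   # Hash
r.sadd("tags", "db", "cache")                             # Set
r.zadd("leaderboard", {"ada": 3200, "bob": 2900})         # Sorted set

# Top two leaderboard entries, highest score first.
print(r.zrevrange("leaderboard", 0, 1, withscores=True))
```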
Redis is an in-memory database most often used as a cache in front of another, "real" database such as MySQL or PostgreSQL. A Redis-based cache improves application performance: it takes advantage of the speed of memory and relieves the application's central database of load for the following kinds of data:
- Data that rarely changes and is accessed frequently by the application.
- Non-mission-critical data that changes frequently.
Examples of such data include session stores and page caches, as well as dashboard content such as leaderboards and reports that aggregate data from different sources.
The traditional approach to using Redis works as follows: a client sends a request to the application, which must fetch the data needed to fulfill it. The application first checks the Redis cache. If the cache contains the data, a cache hit occurs and the data is returned as usual. On a cache miss, the application queries the persistent store (in this case, a MySQL database), loads the result into the cache, and then uses it.
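As a sketch of this cache-aside pattern with the redis-py client: `fetch_user_from_mysql` is a hypothetical stand-in for a query against the primary database, and the key naming and TTL are illustrative choices:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_user_from_mysql(user_id: int) -> dict:
    # Hypothetical placeholder for a query against the primary database.
    return {"id": user_id, "name": "Ada"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:                    # cache hit: return immediately
        return json.loads(cached)
    user = fetch_user_from_mysql(user_id)     # cache miss: go to MySQL
    r.set(key, json.dumps(user), ex=300)      # load into the cache, 5-minute TTL
    return user
```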
But in many cases, Redis guarantees a high enough level of data safety that it can be used as a true primary database. And adding Redis modules and various high-availability (HA) configurations to the system makes Redis extremely interesting for certain use cases and workloads.
Another important feature of Redis is that it blurs the boundary between a cache and a data store. The key point is that reading and manipulating data in memory is far faster than the same operations in traditional DBMSs backed by hard disk drives (HDDs) or solid-state drives (SSDs); the well-known table of latency and throughput numbers that every programmer should know makes that gap concrete.
Initially, Redis was most often compared to Memcached, a system that at the time offered nothing in the way of long-term data storage.
Memcached was created by Brad Fitzpatrick in 2003, six years before Redis appeared. It started as a Perl project and was later rewritten in C. For a time, Memcached was the standard caching tool. The main differences from Redis are that Memcached supports fewer data types and is limited in its key eviction policy: it only supports LRU (Least Recently Used), evicting the data that has gone unused the longest.
Another difference between these stores is that Redis is single-threaded while Memcached is multi-threaded. Memcached can deliver excellent performance in limited caching environments, but running it as a distributed cluster requires additional client-side configuration, whereas Redis supports such scenarios out of the box.
The following table summarizes the differences between Memcached and Redis that are relevant today:

|  | Memcached | Redis |
| --- | --- | --- |
| Data types | Strings only | Strings, hashes, lists, sets, sorted sets, streams, and more |
| Eviction policy | LRU only | Several configurable policies |
| Threading model | Multi-threaded | Single-threaded core |
| Persistence | None | RDB snapshots and/or AOF |
| Distributed operation | Requires client-side setup | Supported out of the box |
Today, Redis lets you configure exactly how data is saved to disk. In the very beginning, the system only used snapshots: asynchronous copies of the in-memory data were written to disk for long-term storage. Unfortunately, this mechanism has a drawback: any data changed or added in the intervals between snapshots can be lost.
Redis has evolved significantly since its inception in 2009. We will cover most of the architectural and topological decisions specific to Redis, which will allow you to study this system and include it in your arsenal of data storage systems.
Redis Architecture
Before we start talking about the internal mechanisms of Redis, let's look at the different options for deploying this storage and discuss the trade-offs that those who choose one or another option have to make.
We will mainly focus on the following configurations:
- Single Redis instance.
- Redis HA.
- Redis Sentinel.
- Redis Cluster.
You can choose one or another configuration depending on the specifics of your project and its scale.
Single Redis instance
A simple Redis deployment option, represented by a single Main database.
The simplest way to deploy Redis is as a single instance. This gives a project a small data store that can speed up its services as the project grows. The drawback is obvious: if that single Redis instance fails or becomes unavailable, every client request to Redis fails with it, degrading the performance and responsiveness of the whole system.
Given enough memory and server resources, a single instance can be quite powerful. This approach is mainly used for caching, and it can yield a serious performance boost with minimal setup effort. With sufficient server resources, you can even deploy such a Redis service on the same machine the main application runs on.
To work successfully with Redis, it is important to understand some of its data management concepts. Requests to Redis are served from data held in memory. If the instance is configured for persistence, Redis forks a child process that writes the data to disk, either as RDB (Redis Database) files, which are very compact point-in-time snapshots of the data, or as an append-only file (AOF), to which data is only ever appended.
These two mechanisms give Redis durable storage, support various replication strategies, and make more complex Redis-based topologies possible. If the server is not configured for persistence, data is lost on a reboot or failure. If persistence is enabled, then on restart the data from the RDB snapshot or the AOF is loaded back into memory, after which the instance can serve client requests again.
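You can check how a running instance is configured with redis-py; a minimal sketch, assuming a local instance on the default port (the values shown in the comments are typical defaults, not guarantees):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# 'save' holds the RDB snapshot schedule (an empty string means snapshots are off);
# 'appendonly' tells you whether the AOF is enabled.
print(r.config_get("save"))        # e.g. {'save': '3600 1 300 100 60 10000'}
print(r.config_get("appendonly"))  # e.g. {'appendonly': 'no'}
```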
Given the above, let's now look at Redis configurations that are, so to speak, more distributed than the single instance we've just considered.
Redis HA
High availability configuration. The system consists of a main database (Main), the master node, and a secondary database (Secondary), a slave node; the state of the nodes is synchronized by replication.
Another popular Redis configuration consists of a master node and slave nodes whose state is synchronized through replication. As data is written to the master Redis instance, copies of the corresponding commands are streamed to the replication buffers of the slave nodes, keeping their data in sync. The slave tier can consist of one or more Redis instances, which can help scale reads or provide fault tolerance if communication with the master node is lost.
High availability (HA) is a characteristic of a system that aims to provide a consistent level of service (usually measured as uptime) over longer-than-average periods of time.
High availability systems are designed to have no single point of failure, which allows them to recover from failures gracefully and quickly. HA configurations require reliable communication links to minimize the chance of data loss in transit from the master to the slave node. In addition, such systems must detect and recover from failures automatically.
As we now move into the realm of distributed systems, with its many well-known fallacies, we need to become familiar with a few new concepts. What was once simple and straightforward now becomes more complex.
Data Replication in Redis
Each Redis master node has a replication ID and a replication offset. These two values determine when a slave node can continue replicating from where it left off and when a full resynchronization is needed. The offset is incremented by every write that occurs on the master.
More specifically, when a Redis slave is only a few offset steps behind the master, it receives the outstanding commands from the master and applies them to its data until the two are in sync. If the two instances cannot agree on a replication ID, or the master has no record of the offset, the slave requests a full synchronization: the master creates a fresh RDB snapshot and sends it to the slave, while buffering all updates that arrive between the snapshot's creation and the current moment. Once the slave has loaded the snapshot, those buffered updates are sent over, and replication continues as normal.
If two Redis instances have the same replication ID and offset, they hold exactly the same data. You might wonder why Redis needs a replication ID at all. When a Redis instance is promoted to master, or starts out as a master, it is assigned a new replication ID. The previous ID is used to work out which instance the newly promoted node was replicating from before. That allows partial synchronization with the other slaves to take place, because the new master remembers its old replication ID.
For example, suppose two Redis instances, a master and a slave, have the same replication ID but offsets that differ by a few hundred commands; replaying those commands on the lagging instance would make the two data sets identical. If instead the replication IDs differ, and we do not know the previous ID of the node that was just demoted to a slave and connected to the master (that is, the instances share no common ancestor), we must perform a resource-intensive full synchronization.
On the other hand, if the previous replication ID is known, the two nodes share a common ancestor, and therefore common data, and a cheaper partial synchronization can be performed using the offset.
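You can watch these values on a live instance: the `INFO replication` section exposes the replication ID and offset. A minimal sketch with redis-py, assuming a local instance:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

info = r.info("replication")
# 'master_replid' is the replication ID; 'master_repl_offset' is the offset
# that grows with every write to the replication stream.
print(info["role"], info["master_replid"], info["master_repl_offset"])
```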
Redis Sentinel
Deploying a system using Redis Sentinel. This deployment consists of Sentinel nodes, a master node, and slave nodes.
Redis Sentinel is itself a distributed system, and like all distributed systems, it has its strengths and weaknesses. Sentinel runs as a cluster of cooperating Sentinel processes that coordinate their view of the system's state, providing a high-availability configuration for Redis. Since Sentinel exists to protect the Redis store from failures, it is only logical that the service itself should have no single point of failure.
The Sentinel service solves several problems. First, it verifies that the current master and slave nodes are up and reachable, so each Sentinel process (together with the others) can react when communication with the master and/or slave nodes is lost. Second, it plays a role in service discovery, much as Zookeeper and Consul do in other systems: when a new client wants to write to the Redis store, Sentinel tells it which Redis instance is currently the master.
So Sentinel nodes constantly monitor the availability of the Redis instances and pass this information on to clients, allowing clients to react when the store fails.
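Here is a minimal sketch of how a client might use Sentinel for discovery with redis-py; the Sentinel hostnames and the service name `mymaster` (the conventional default) are assumptions about your deployment:

```python
from redis.sentinel import Sentinel

# Addresses of the Sentinel processes (hypothetical hosts, default Sentinel port).
sentinel = Sentinel(
    [("sentinel-1", 26379), ("sentinel-2", 26379), ("sentinel-3", 26379)],
    socket_timeout=0.5,
)

print(sentinel.discover_master("mymaster"))  # current master's (host, port)

master = sentinel.master_for("mymaster", socket_timeout=0.5)   # for writes
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)   # for reads

master.set("greeting", "hello")
print(replica.get("greeting"))
```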
Here are the functions that Sentinel nodes perform:
- Monitoring: Ensure that master and slave nodes are operating as expected.
- Notifying administrators: The system alerts administrators to incidents affecting Redis instances.
- Managing failover: Sentinel nodes can initiate a failover process if the master Redis instance is unavailable and a sufficient number (quorum) of nodes agree that this is the case.
- Configuration management: Sentinel nodes also act as a system that allows the current master Redis instance to be discovered.
Using Redis Sentinel to solve these problems makes failure detection reliable: declaring a failure requires multiple Sentinel nodes to agree that the current master Redis instance is unreachable. Reaching such agreement is called achieving a quorum. This increases the system's reliability and protects it from a single misbehaving process that merely cannot connect to the master node.
A quorum is the minimum number of votes a distributed system needs before it is allowed to perform certain operations, such as a failover. The number is configurable, but it should reflect the number of nodes in the system. Most distributed systems have three or five nodes, giving quorums of two or three votes. An odd number of nodes is preferable where the system must break ties.
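The usual rule is a strict majority of the nodes. As a tiny sketch of the arithmetic:

```python
def majority_quorum(nodes: int) -> int:
    # The smallest number of votes that constitutes a strict majority.
    return nodes // 2 + 1

for n in (3, 5, 7):
    q = majority_quorum(n)
    print(f"{n} nodes: quorum {q}, tolerates {n - q} failure(s)")
```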
Redis Sentinel also has its drawbacks. Therefore, we will look at some recommendations and practical tips regarding this service.
Redis Sentinel can be deployed in a number of ways. Honestly, I'd need details about your particular setup to give any meaningful recommendations. As a general rule, I'd recommend running a Sentinel node alongside each of your application servers (if possible). This removes any difference in network reachability between the Sentinel nodes and the clients that use Redis.
Sentinel can also run on the same machines as the Redis instances, or even on independent nodes, but this complicates things in various ways. I recommend using at least three nodes with a quorum of at least two. Here is a simple table relating the number of servers in the cluster, the quorum, and the number of failures the cluster can tolerate:

| Servers | Quorum | Tolerated failures |
| --- | --- | --- |
| 3 | 2 | 1 |
| 5 | 3 | 2 |
| 7 | 4 | 3 |
These metrics will vary from system to system, but the general idea is similar to what is expressed in the table.
Think about what could go wrong in a system running Sentinel. Run such a system long enough and you will encounter all of these problems.
- What if the Sentinel nodes fall out of quorum?
- What if the network splits and the old master Redis instance ends up in a smaller group of Sentinel nodes? What happens to the data written to that Redis instance? (Hint: this data will be lost after a full system recovery.)
- What happens if the network topologies of the Sentinel nodes and the client (application) nodes are inconsistent?
There are no absolute guarantees of resiliency, especially since writes to disk (more on that below) are asynchronous. There is also the pesky issue of when clients learn about a new leader: how many writes are lost while the new leader is unknown? The Redis developers recommend querying for the new leader when establishing new connections. Depending on your system configuration, this can mean significant data loss.
There are a few ways to mitigate this data loss by forcing the Redis master to replicate writes to at least one slave. Remember that replication in Redis is asynchronous and has its own drawbacks: you must track acknowledgments independently, and if the master cannot get an acknowledgment from at least one slave, it should stop accepting writes.
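Redis exposes this safeguard as configuration: `min-replicas-to-write` and `min-replicas-max-lag` make the master refuse writes when too few replicas have acknowledged recently enough. A sketch of setting them at runtime with redis-py (the values are illustrative; in production you would normally put these in redis.conf):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Refuse writes unless at least 1 replica is connected
# and has acknowledged within the last 10 seconds.
r.config_set("min-replicas-to-write", "1")
r.config_set("min-replicas-max-lag", "10")
```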
Redis Cluster
Redis cluster. Clients perform read/write operations by interacting with Redis master (M1, M2, M3) nodes. Data is replicated between master and slave (S1, S2, S3) nodes. Other clients perform data read operations by accessing slave nodes. The Gossip protocol is used to determine the overall state of the cluster.
I'm sure many people wonder what to do when the data no longer fits in memory on a single machine. These days, the maximum RAM available in a single server is 24 TiB, and AWS offers such configurations. That is a lot, but for some systems it is not enough even for the cache.
Redis Cluster allows you to horizontally scale Redis storage.
As a system grows, its owner can choose one of the following three options:
- Work less (nobody does this, because we are insatiable monsters).
- Increase the power of individual computers.
- Distribute the load across more small computers.
We will only take the last two items on this list seriously. They are known as vertical and horizontal scaling, respectively. Vertical scaling means buying more powerful machines in the hope that the extra computing power will cope with the growing load. Even if that meets expectations at first, you will eventually run into the limits of the hardware.
After vertical scaling has exhausted itself (and ideally, long before that), you will have to turn to horizontal scaling: distributing the load across many smaller machines, each responsible for a small subtask of one large task.
Let's get the terminology straight. Deciding to use Redis Cluster means deciding to distribute the stored data across many machines. This is called sharding; each Redis instance in the cluster is then said to store a shard, or fragment, of the full data set.
This approach creates a new problem: when you send data to the cluster, how do you know which Redis instance (shard) stores it? There are several ways to decide; Redis Cluster uses algorithmic sharding.
To find the shard for a given key, we hash the key and take the result modulo the number of shards. Because the hash function is deterministic, a given key always maps to the same shard, so when we later need to read the data, we can compute exactly where it lives.
What happens if a new shard is added to the system after some time? This triggers a process called resharding.
Suppose the key foo was assigned to shard 0; under the new shard configuration it might map to shard 5. Moving data around to fit the new configuration is slow, and unrealistic if we want storage expansion to be fast. It also hurts the availability of Redis Cluster.
Redis Cluster solves this problem with so-called "hash slots" to which data is mapped. There are 16384 of these slots. This gives us a sensible way to spread data across the cluster: when new shards are added, we only move hash slots between shards rather than individual key assignments, which simplifies adding new master Redis instances to the cluster.
This can be done without any system downtime and with minimal impact on performance. Let's look at an example. Suppose the cluster has two master nodes:
- Node M1 contains hash slots 0 through 8191.
- Node M2 contains hash slots 8192 through 16383.
To assign key foo to a hash slot, we compute the key's deterministic hash and take it modulo the number of hash slots (16384), giving a slot number from 0 to 16383. In this example, the data for the key lands on node M2. Now suppose we add a new node, M3. The new mapping of nodes to hash slots becomes:
- Node M1 contains hash slots 0 through 5460.
- Node M2 contains hash slots 5461 through 10922.
- Node M3 contains hash slots 10923 through 16383.
The keys living in the hash slots that changed owners travel with their slots: slots 5461 through 8191 move from M1 to M2, and slots 10923 through 16383 move from M2 to M3. But the correspondence between individual keys and hash slots is preserved, since keys are distributed across slots, not across nodes. This is how the mechanism solves the resharding problem inherent in naive algorithmic sharding.
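For illustration, here is a sketch of the slot computation. Redis Cluster hashes keys with CRC16 (the XMODEM variant, polynomial 0x1021) modulo 16384; this pure-Python version is for understanding rather than production, and it ignores the hash-tag rule, under which only the part of a key between `{` and `}` is hashed:

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    return crc16_xmodem(key.encode()) % 16384

print(hash_slot("foo"))  # the slot, and hence the node, that owns "foo"
```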
Gossip Protocol
Redis Cluster uses a gossip protocol to determine the overall health of the cluster. In the illustration above there are three master (M) nodes and three slave (S) nodes, all constantly communicating with one another to know which shards are alive and ready to serve requests. If enough shards agree that node M1 is not responding, they can promote S1, M1's slave, to master to keep the cluster alive. The number of nodes required to trigger this procedure is configurable, and getting it right is essential: get it wrong, and the cluster can end up split into pieces, unable to resolve the situation where equal numbers of nodes vote for and against. This phenomenon is called "split brain". As a general rule, keep an odd number of master nodes in the cluster, each with two slaves; this is a good basis for a reliable system.
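From the application's point of view, most of this machinery is invisible. redis-py ships a cluster-aware client that discovers the topology and routes each key to the shard owning its hash slot; a minimal sketch, assuming a cluster node is reachable locally on port 7000:

```python
from redis.cluster import RedisCluster

# Connecting to any one node is enough; the client discovers the rest
# of the cluster and routes each key to the shard that owns it.
rc = RedisCluster(host="localhost", port=7000, decode_responses=True)

rc.set("foo", "bar")
print(rc.get("foo"))
```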
Redis Persistence Models
If you plan to store data in Redis and expect it to be kept reliably, it is important to understand how Redis achieves this. In many situations, losing data stored in Redis is no big deal: when Redis serves as a cache, say, or stores data for some real-time analytics system.
In other cases, developers need some guarantees about the persistence of data and the ability to recover it.
Redis is a fast store, and any guarantees of durability are secondary to speed. This may be a controversial statement, but it is true.
Persistent storage models in Redis. Data from memory is copied either to RDB, as snapshots, or to AOF. If a Redis instance fails but the instance's data was stored in persistent storage, that data is loaded into a new Redis instance.
No persistent storage
If necessary, persistence can be disabled entirely. This is the fastest Redis configuration, but it provides no durability guarantees at all.
RDB files
Persistence with RDB files means creating snapshots of the data as of certain points in time. Snapshots are created at configured intervals.
The main disadvantage of this mechanism is that data written between snapshots is lost if Redis fails. The mechanism also relies on forking the main process, which on large data sets can cause short pauses in request processing. On the other hand, RDB files load into memory much faster than an AOF.
AOF
The AOF-based persistence mechanism logs every write operation the server is asked to perform. On startup, these operations are replayed to recreate the original data set.
This approach is much more reliable than RDB. Instead of snapshots of the store's state, we have a file to which data is only ever appended. Incoming operations are buffered in the journal, though not immediately flushed to durable storage. The journal contains the actual commands which, if data needs to be restored, are re-run in the order they were originally executed.
Then, when possible, the journal is flushed to disk with fsync (the timing is configurable); only then is the data truly persistent. The downside is that this storage format is not compact: it requires more disk space than RDB files.
The fsync() call transfers ("flushes") all modified in-memory data (that is, modified file buffer pages) for the file referred to by the file descriptor fd to the disk (or other permanent storage device). As a result, all the modified information can be recovered even after a serious failure or a system reboot.
For various reasons, changes made to an open file first go to the cache, and the fsync() call ensures that they are physically saved to the disk, that is, they can be read from the disk later.
Why not use both RDB and AOF?
You can combine AOF and RDB in the same Redis instance. If you are willing to trade some speed for data safety, this is an acceptable way to run Redis. Just remember that after a restart, Redis will use the AOF to restore the data, since it holds the more complete version.
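As a sketch of enabling both mechanisms at runtime with redis-py (the snapshot schedule is illustrative; in production you would normally set these in redis.conf):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# RDB: snapshot if at least 1 key changed in 900 s, or 10 keys in 300 s.
r.config_set("save", "900 1 300 10")

# AOF: log every write; fsync the journal to disk once per second.
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")
```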
Forking Redis processes
Now that we have covered the persistence mechanisms in Redis, let's talk about how a single-threaded application like Redis actually performs them.
A fork of the Redis process is created. The snapshot contains the data as of a certain point in time; after the fork is created, the child process copies that data to disk.
I think the best thing about Redis is how it uses forking and copy-on-write to persist data with high performance.
A fork is when the operating system, at a process's request, creates a new process by copying the parent. The result is a new process ID and some other useful bookkeeping, and the newly created child process can interact with its parent.
Now comes the fun part. Redis is a process that has a huge amount of memory allocated. How can we copy such a process and not run out of memory?
When a process is forked, the parent and child share memory, and Redis begins creating the snapshot in the child process. That is possible thanks to a memory-sharing technique called copy-on-write: the fork allocates no new memory up front, only references to the already allocated pages. If nothing in memory changes while the child flushes data to disk, no new memory is ever allocated.
If there are changes, the following happens. The operating system kernel tracks references to each memory page; when a page with more than one reference is modified, the change is written to a new page. The child process knows nothing about these changes, since it has a stable snapshot of memory at its disposal. As a result, only a small amount of extra memory is used by the fork, and we can quickly and efficiently produce point-in-time snapshots of a store that may be many gigabytes in size.
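A small sketch of the same idea in Python on a Unix system: after os.fork(), the child keeps seeing memory as it was at fork time, even though the parent continues mutating it:

```python
import os
import time

data = {"counter": 0}

pid = os.fork()
if pid == 0:
    # Child: holds a consistent snapshot of memory as of the fork.
    time.sleep(1)                            # give the parent time to mutate
    print("child sees:", data["counter"])    # still 0
    os._exit(0)
else:
    data["counter"] = 42                     # copy-on-write duplicates the page
    os.waitpid(pid, 0)
    print("parent sees:", data["counter"])   # 42
```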
Summary
We've covered various aspects of Redis, the fast in-memory data store. If you haven't worked with it before, but having reviewed its capabilities you can see how it might be useful to you, we recommend visiting the project's website and trying Redis in action.