Redis Data Types

Data Types and Application Scenarios#

Redis provides a rich set of data types, commonly including five: String, Hash, List, Set, and Zset (sorted set). With the updates of Redis versions, four more data types were later supported: BitMap (added in version 2.2), HyperLogLog (added in version 2.8), GEO (added in version 3.2), and Stream (added in version 5.0).

String#

String is the most basic key-value structure, where the key is a unique identifier and the value is the actual content. The value can be not only a string but also a number (integer or floating point), and a single value can hold at most 512 MB of data.

Internal Implementation#

The underlying data structure implementation of the String type mainly consists of int and SDS (Simple Dynamic Strings).

Compared to C's native strings:

  • SDS can store not only text data but also binary data. This is because SDS uses the value of the len attribute instead of a null character to determine if the string has ended, and all APIs of SDS handle the data stored in the buf[] array in a binary manner. Therefore, SDS can store not only text data but also binary data such as images, audio, video, and compressed files.
  • The time complexity for obtaining the length of a string in SDS is O(1). C language strings do not record their own length, so obtaining the length has a complexity of O(n); whereas the SDS structure records the string length using the len attribute, so the complexity is O(1).
  • Redis's SDS API is safe, and concatenating strings will not cause buffer overflow. Before concatenating strings, SDS checks whether the SDS space meets the requirements, and if the space is insufficient, it will automatically expand, thus avoiding buffer overflow issues.

The internal encoding of string objects has three types: int, raw, and embstr.

If a string object stores an integer value that can be represented by a long type, the integer value will be stored in the ptr attribute of the string object structure (converting void* to long), and the encoding of the string object will be set to int.

If the string object stores a string and the length of this string is less than or equal to 32 bytes (Redis version 2.+), the string object will use a simple dynamic string (SDS) to store this string and set the object's encoding to embstr. The embstr encoding is an optimized encoding method specifically for storing short strings.

If the string object stores a string and the length of this string is greater than 32 bytes (Redis version 2.+), the string object will use a simple dynamic string (SDS) to store this string and set the object's encoding to raw.

Note that the boundary between embstr encoding and raw encoding varies across different Redis versions:

  • Redis 2.+ is 32 bytes
  • Redis 3.0-4.0 is 39 bytes
  • Redis 5.0 is 44 bytes
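
These encodings can be inspected with the OBJECT ENCODING command. A quick sketch on a recent Redis (5.0+, where the embstr/raw boundary is 44 bytes):

> SET number 12345
OK
> OBJECT ENCODING number
"int"
> SET greeting "hello"
OK
> OBJECT ENCODING greeting
"embstr"
> SET greeting "this string is definitely longer than the forty-four byte embstr limit"
OK
> OBJECT ENCODING greeting
"raw"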

Both embstr and raw encodings use SDS to store values, but the difference is that embstr allocates a contiguous block of memory to store redisObject and SDS through a single memory allocation function, while raw allocates two separate memory blocks to store redisObject and SDS by calling memory allocation functions twice. This approach has many benefits:

  • The number of memory allocations required to create a string object using embstr is reduced from two to one compared to raw encoding;
  • Releasing an embstr-encoded string object also only requires calling a memory release function once;
  • Since all data of the embstr-encoded string object is stored in a contiguous memory block, it can better utilize CPU cache to improve performance.

However, embstr also has drawbacks:

  • If the length of the string increases and memory must be reallocated, the entire redisObject and SDS must be reallocated together, so embstr-encoded string objects are effectively read-only: Redis provides no functions for modifying an embstr-encoded string object in place. When we execute any modification command (e.g., APPEND) on an embstr-encoded string object, the program first converts the object's encoding from embstr to raw and then performs the modification.

Common Commands#

# Set key-value type value
> SET name lin
OK
# Get corresponding value by key
> GET name
"lin"
# Check if a key exists
> EXISTS name
(integer) 1
# Return the length of the string value stored by the key
> STRLEN name
(integer) 3
# Delete the value corresponding to a key
> DEL name
(integer) 1

# Batch set key-value type values
> MSET key1 value1 key2 value2 
OK
# Batch get multiple values corresponding to keys
> MGET key1 key2 
1) "value1"
2) "value2"

# Set key-value type value
> SET number 0
OK
# Increment the number value stored in the key
> INCR number
(integer) 1
# Add 10 to the number value stored in the key
> INCRBY number 10
(integer) 11
# Decrement the number value stored in the key
> DECR number
(integer) 10
# Subtract 10 from the number value stored in the key
> DECRBY number 10
(integer) 0

# Set key to expire in 60 seconds (this method sets the expiration time for an existing key)
> EXPIRE name  60 
(integer) 1
# Check how long the data will expire
> TTL name 
(integer) 51

# Set key-value type value and set the expiration time for this key to 60 seconds
> SET key  value EX 60
OK
> SETEX key  60 value
OK

# Insert if not exists
> SETNX key value
(integer) 1

Application Scenarios#

Caching Objects#
  • Directly cache the entire object's JSON, command example: SET user:1 '{"name":"xiaolin", "age":18}'.
  • Split the object into individual keys of the form user:ID:attribute, use MSET to store them and MGET to retrieve each attribute value, command example: MSET user:1:name xiaolin user:1:age 18 user:2:name xiaomei user:2:age 20.
General Counting#

Since Redis processes commands in a single-threaded manner, the execution of commands is atomic. Therefore, the String data type is suitable for counting scenarios, such as counting access times, likes, shares, inventory quantities, etc.
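
For example, a minimal read-counter sketch (the key name article:readcount:1001 is illustrative):

# Each visit to article 1001 increments its read count atomically
> INCR article:readcount:1001
(integer) 1
> INCR article:readcount:1001
(integer) 2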

Distributed Locks#

The SET command has an NX parameter that can achieve "insert only if the key does not exist," which can be used to implement distributed locks:

  • If the key does not exist, it indicates successful insertion, which can represent a successful lock;
  • If the key exists, it indicates insertion failure, which can represent a failed lock.

Generally, an expiration time is also added to the distributed lock, and the command for the distributed lock is as follows:

SET lock_key unique_value NX PX 10000

The unlocking process is to delete the lock_key key, but it cannot be deleted randomly; it must ensure that the client executing the operation is the one that acquired the lock. Therefore, when unlocking, we first check whether the lock's unique_value belongs to the locking client; if so, we delete the lock_key key.
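
Because the ownership check and the delete must happen atomically, unlocking is usually done with a small Lua script; a minimal sketch:

# Delete lock_key only if its value still equals unique_value
> EVAL "if redis.call('GET', KEYS[1]) == ARGV[1] then return redis.call('DEL', KEYS[1]) else return 0 end" 1 lock_key unique_value
(integer) 1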

Shared Session Information#

Typically, when developing a backend management system, we use sessions to save user session (login) states. These session information will be stored on the server side, but this only applies to single-system applications. If it is a distributed system, this model will no longer be applicable.

Therefore, we need to leverage Redis to uniformly store and manage these session information, so that regardless of which server the request is sent to, the server will go to the same Redis to retrieve the relevant session information, thus solving the problem of session storage in distributed systems.
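
A minimal sketch, assuming the application generates the session id abc123 and uses a 30-minute expiry:

# Store the session data with a 30-minute TTL
> SET session:abc123 '{"user_id":1,"logged_in":true}' EX 1800
OK
# Any application server behind the load balancer can read it back
> GET session:abc123
"{\"user_id\":1,\"logged_in\":true}"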

List#

The List is a simple list of strings, sorted in the order of insertion, and elements can be added to the List from either the head or the tail. The maximum length of a list is 2^32 - 1, meaning each list can support over 4 billion elements.

Internal Implementation#

The underlying data structure of the List type is implemented by doubly linked lists or zip lists:

  • If the number of elements in the list is less than 512 (default value, configurable by list-max-ziplist-entries), and the value of each element is less than 64 bytes (default value, configurable by list-max-ziplist-value), Redis will use a zip list as the underlying data structure for the List type;
  • If the number of elements in the list does not meet the above conditions, Redis will use a doubly linked list as the underlying data structure for the List type;

However, after Redis version 3.2, the underlying data structure of the List data type is implemented solely by quicklist, replacing both the doubly linked list and the zip list.
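
You can verify this with OBJECT ENCODING; on Redis 3.2 through 6.x a list reports quicklist:

> RPUSH mylist a b c
(integer) 3
> OBJECT ENCODING mylist
"quicklist"    # small lists on Redis 7.2+ may report "listpack"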

Common Commands#

# Insert one or more values at the head (left) of the key list, with the last value at the front
LPUSH key value [value ...] 
# Insert one or more values at the tail (right) of the key list
RPUSH key value [value ...]
# Remove and return the head element of the key list
LPOP key     
# Remove and return the tail element of the key list
RPOP key 

# Return the elements in the specified range of the list key, with the range specified by the start and stop offsets, starting from 0
LRANGE key start stop

# Pop an element from the head of the key list, blocking for timeout seconds if none exists; if timeout=0, block indefinitely
BLPOP key [key ...] timeout
# Pop an element from the tail of the key list, blocking for timeout seconds if none exists; if timeout=0, block indefinitely
BRPOP key [key ...] timeout

Application Scenarios#

Message Queue#

A message queue must meet three requirements when storing and retrieving messages: message ordering, handling duplicate messages, and ensuring message reliability. Redis's List and Stream data types can meet these three requirements for message queues.

What are the shortcomings of using List as a message queue?

List does not support multiple consumers consuming the same message because once a consumer pulls a message, that message is deleted from the List and cannot be consumed again by other consumers.

To allow a message to be consumed by multiple consumers, multiple consumers must be grouped into a consumer group so that they can consume the same message, but the List type does not support the implementation of consumer groups.
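
A minimal producer/consumer sketch over a hypothetical queue key mq: the producer pushes at the tail while the consumer blocks on the head:

# Producer: append a message to the tail of the list
> RPUSH mq "order:1001"
(integer) 1
# Consumer: block until a message arrives at the head (timeout 0 = wait forever)
> BLPOP mq 0
1) "mq"
2) "order:1001"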

Hash#

Hash is a collection of key-value pairs (key - value), where the value is in the form: value=[{field1, value1}, ... {fieldN, valueN}]. Hash is particularly suitable for storing objects.

Internal Implementation#

The underlying data structure of the Hash type is implemented by either a zip list or a hash table:

  • If the number of elements in the hash type is less than 512 (default value, configurable by hash-max-ziplist-entries), and all values are less than 64 bytes (default value, configurable by hash-max-ziplist-value), Redis will use a zip list as the underlying data structure for the Hash type;
  • If the hash type elements do not meet the above conditions, Redis will use a hash table as the underlying data structure for the Hash type.

In Redis 7.0, the zip list data structure has been deprecated and replaced by the listpack data structure.
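
Again observable via OBJECT ENCODING; a small hash reports listpack on Redis 7.0+ and switches to hashtable once the thresholds are exceeded:

> HSET profile name Tom
(integer) 1
> OBJECT ENCODING profile
"listpack"    # "ziplist" before Redis 7.0; large hashes report "hashtable"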

Common Commands#

# Store a key-value pair in a hash table
HSET key field value   
# Get the value corresponding to the field key in the hash table
HGET key field

# Store multiple key-value pairs in a hash table
HMSET key field value [field value...] 
# Batch get multiple field values in the hash table
HMGET key field [field ...]       
# Delete field values in the hash table
HDEL key field [field ...]    

# Return the number of fields in the hash table
HLEN key       
# Return all key-value pairs in the hash table
HGETALL key 

# Increment the value of field in the hash table by n
HINCRBY key field n                         

Application Scenarios#

Caching Objects#

The structure of Hash type (key, field, value) is similar to that of objects (object id, attribute, value), and can also be used to store objects.

# Store key-value pairs in the hash table uid:1
> HMSET uid:1 name Tom age 15
OK
# Store key-value pairs in the hash table uid:2
> HMSET uid:2 name Jerry age 13
OK
# Get all key-value pairs for user id 1 in the hash table
> HGETALL uid:1
1) "name"
2) "Tom"
3) "age"
4) "15"
Shopping Cart#

Using the user id as the key, the product id as the field, and the product quantity as the value perfectly constitutes the three elements of a shopping cart.
The commands involved are as follows:

  • Add product: HSET cart:{user_id} {product_id} 1
  • Add quantity: HINCRBY cart:{user_id} {product_id} 1
  • Total number of products: HLEN cart:{user_id}
  • Delete product: HDEL cart:{user_id} {product_id}
  • Get all products in the shopping cart: HGETALL cart:{user_id}

Initially, only the product ID is stored in Redis, and when displaying the specific information of the product, it still needs to query the database once with the product ID to get the complete product information.
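
A short transcript with hypothetical ids (user 1001, product 2001):

# Add product 2001 to user 1001's cart
> HSET cart:1001 2001 1
(integer) 1
# User clicks "+1" on the same product
> HINCRBY cart:1001 2001 1
(integer) 2
# Number of distinct products in the cart
> HLEN cart:1001
(integer) 1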

Set#

The Set type is an unordered collection of unique string elements; its storage order does not follow the insertion order.

A set can store up to 2^32-1 elements. The concept is similar to sets in mathematics, allowing for intersection, union, difference, etc. Therefore, the Set type not only supports CRUD operations within the set but also supports intersection, union, and difference operations among multiple sets.

Internal Implementation#

The underlying data structure of the Set type is implemented by either a hash table or an integer set:

  • If all elements in the set are integers and the number of elements is less than 512 (default value, configurable by set-max-intset-entries), Redis will use an integer set as the underlying data structure for the Set type;
  • If the elements in the set do not meet the above conditions, Redis will use a hash table as the underlying data structure for the Set type.
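
This switch is easy to observe: an all-integer set reports intset, and adding a non-integer member converts it:

> SADD nums 1 2 3
(integer) 3
> OBJECT ENCODING nums
"intset"
> SADD nums hello
(integer) 1
> OBJECT ENCODING nums
"hashtable"    # small sets on Redis 7.2+ may report "listpack"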

Common Commands#

# Add elements to the set key, ignoring if the element already exists; if the key does not exist, create it
SADD key member [member ...]
# Remove elements from the set key
SREM key member [member ...] 
# Get all elements in the set key
SMEMBERS key
# Get the number of elements in the set key
SCARD key

# Check if member exists in the set key
SISMEMBER key member

# Randomly select count elements from the set key, elements are not removed from the key
SRANDMEMBER key [count]
# Randomly select count elements from the set key, elements are removed from the key
SPOP key [count]

# Intersection operation
SINTER key [key ...]
# Store the intersection result in a new set destination
SINTERSTORE destination key [key ...]

# Union operation
SUNION key [key ...]
# Store the union result in a new set destination
SUNIONSTORE destination key [key ...]

# Difference operation
SDIFF key [key ...]
# Store the difference result in a new set destination
SDIFFSTORE destination key [key ...]

Application Scenarios#

The Set type is suitable for data deduplication and ensuring data uniqueness, and can also be used to count intersections, differences, and unions among multiple sets. However, there is a potential risk. The computational complexity of the difference, union, and intersection operations of sets is relatively high, and in the case of large data volumes, directly executing these calculations may lead to Redis instance blocking.

Likes#

The Set type can ensure that a user can only like once. Here is an example where the key is the article id and the value is the user id.

# User uid:1 likes article article:1
> SADD article:1 uid:1
(integer) 1
# User uid:2 likes article article:1
> SADD article:1 uid:2
(integer) 1
# User uid:3 likes article article:1
> SADD article:1 uid:3
(integer) 1
User uid:1 cancels the like on article:1.

> SREM article:1 uid:1
(integer) 1
Get all users who liked article:1:

> SMEMBERS article:1
1) "uid:3"
2) "uid:2"
Get the number of users who liked article:1:

> SCARD article:1
(integer) 2
Check if user uid:1 liked article:1:

> SISMEMBER article:1 uid:1
(integer) 0  # Returns 0 indicating no like, returns 1 indicating liked
Mutual Followers#

The Set type supports intersection operations, so it can be used to calculate mutual friends, public accounts, etc.

The key can be the user id, and the value can be the id of the public accounts followed.

# User uid:1 follows public accounts with ids 5, 6, 7, 8, 9
> SADD uid:1 5 6 7 8 9
(integer) 5
# User uid:2 follows public accounts with ids 7, 8, 9, 10, 11
> SADD uid:2 7 8 9 10 11
(integer) 5

Mutual public accounts followed by uid:1 and uid:2:

# Get mutual follows
> SINTER uid:1 uid:2
1) "7"
2) "8"
3) "9"

Recommend public accounts followed by uid:1 to uid:2:

> SDIFF uid:1 uid:2
1) "5"
2) "6"
Check whether a given public account (say, id 5) is followed by uid:1 and by uid:2:

> SISMEMBER uid:1 5
(integer) 1 # Returns 1, indicating followed
> SISMEMBER uid:2 5
(integer) 0 # Returns 0, indicating not followed
Lottery Activities#

Store the usernames of winners in a certain activity. The Set type can ensure that the same user does not win twice.

The key is the name of the lottery activity, and the value is the employee name. Put all employee names into the lottery box:

> SADD lucky Tom Jerry John Sean Marry Lindy Sary Mark
(integer) 8
If repeated wins are allowed, the SRANDMEMBER command can be used.

# Draw 1 first prize:
> SRANDMEMBER lucky 1
1) "Tom"
# Draw 2 second prizes:
> SRANDMEMBER lucky 2
1) "Mark"
2) "Jerry"
# Draw 3 third prizes:
> SRANDMEMBER lucky 3
1) "Sary"
2) "Tom"
3) "Jerry"
If repeated wins are not allowed, the SPOP command can be used.

# Draw 1 first prize
> SPOP lucky 1
1) "Sary"
# Draw 2 second prizes
> SPOP lucky 2
1) "Jerry"
2) "Mark"
# Draw 3 third prizes
> SPOP lucky 3
1) "John"
2) "Sean"
3) "Lindy"

Zset#

The Zset type (sorted set) adds a sorting attribute, the score, compared to the Set type. In a sorted set, each stored element consists of two values: the member value and its sorting score.

Sorted sets retain the property that sets cannot have duplicate members (the score can be duplicated), but the difference is that the elements in sorted sets can be sorted.

Internal Implementation#

The underlying data structure of the Zset type is implemented by either a zip list or a skip list:

  • If the number of elements in the sorted set is less than 128 and each element's value is less than 64 bytes, Redis will use a zip list as the underlying data structure for the Zset type;
  • If the elements in the sorted set do not meet the above conditions, Redis will use a skip list as the underlying data structure for the Zset type;

In Redis 7.0, the zip list data structure has been deprecated and replaced by the listpack data structure.
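
This is observable with OBJECT ENCODING (the exact encoding name depends on the version):

> ZADD rank 1 a 2 b
(integer) 2
> OBJECT ENCODING rank
"listpack"    # "ziplist" before Redis 7.0; large sorted sets report "skiplist"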

Common Commands#

Common operations for Zset:

# Add elements with scores to the sorted set key
ZADD key score member [[score member]...]   
# Remove elements from the sorted set key
ZREM key member [member...]                 
# Return the score of the element member in the sorted set key
ZSCORE key member
# Return the number of elements in the sorted set key
ZCARD key 

# Increment the score of the element member in the sorted set key by increment
ZINCRBY key increment member 

# Get elements from the sorted set key in ascending order from start index to stop index
ZRANGE key start stop [WITHSCORES]
# Get elements from the sorted set key in descending order from start index to stop index
ZREVRANGE key start stop [WITHSCORES]

# Return members in the sorted set within a specified score range, sorted from low to high.
ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT offset count]

# Return members within a specified member range, sorted in lexicographical order, with the same score.
ZRANGEBYLEX key min max [LIMIT offset count]
# Return members within a specified member range, sorted in reverse lexicographical order, with the same score
ZREVRANGEBYLEX key max min [LIMIT offset count]
Zset aggregation operations (compared to the Set type, the Zset type does not support difference operations):

# Union calculation (scores of identical elements are summed); numkeys is the number of input keys, and WEIGHTS can set a score multiplier for each key
ZUNIONSTORE destkey numkeys key [key...] 
# Intersection calculation (scores of identical elements are summed); numkeys is the number of input keys, and WEIGHTS can set a score multiplier for each key
ZINTERSTORE destkey numkeys key [key...]

Application Scenarios#

Leaderboard#
# article:1 received 200 likes
> ZADD user:xiaolin:ranking 200 article:1
(integer) 1
# article:2 received 40 likes
> ZADD user:xiaolin:ranking 40 article:2
(integer) 1
# article:3 received 100 likes
> ZADD user:xiaolin:ranking 100 article:3
(integer) 1
# article:4 received 50 likes
> ZADD user:xiaolin:ranking 50 article:4
(integer) 1
# article:5 received 150 likes
> ZADD user:xiaolin:ranking 150 article:5
(integer) 1
To add a like to article article:4, you can use the ZINCRBY command (to increment the score of the element member in the sorted set key):

> ZINCRBY user:xiaolin:ranking 1 article:4
"51"
To check the number of likes for a specific article, you can use the ZSCORE command (to return the score of the element in the sorted set key):

> ZSCORE user:xiaolin:ranking article:4
"50"
To get the top 3 articles with the most likes for Xiaolin, you can use the ZREVRANGE command (to get elements from the sorted set key in descending order from start index to stop index):

# WITHSCORES indicates that the score should also be displayed
> ZREVRANGE user:xiaolin:ranking 0 2 WITHSCORES
1) "article:1"
2) "200"
3) "article:5"
4) "150"
5) "article:3"
6) "100"
To get articles with likes between 100 and 200 for Xiaolin, you can use the ZRANGEBYSCORE command (to return members in the sorted set within a specified score range, sorted from low to high):

> ZRANGEBYSCORE user:xiaolin:ranking 100 200 WITHSCORES
1) "article:3"
2) "100"
3) "article:5"
4) "150"
5) "article:1"
6) "200"

BitMap#

Bitmap is a contiguous array of bits (0s and 1s) in which an element can be located by its offset. A Bitmap uses the smallest unit, a single bit, to represent the value or state of an element, and setting or reading a bit has a time complexity of O(1).

Since a bit is the smallest unit in a computer, using it for storage is very space-efficient, making it particularly suitable for scenarios with large data volumes and binary statistics.

Internal Implementation#

Bitmap is not a separate underlying structure: it is a statistics-oriented data type built on the String type, which serves as its underlying data structure.

The String type is stored as a binary byte array, so Redis utilizes each bit of the byte array to represent the binary state of an element. You can think of Bitmap as a bit array.

Basic Bitmap Operations:#

# Set value, where value can only be 0 or 1
SETBIT key offset value

# Get value
GETBIT key offset

# Get the count of values equal to 1 in the specified range
# start and end are in bytes
BITCOUNT key start end
Bitmap operation:

# Operations between Bitmaps
# operation is a bitwise operator, one of:
#   AND bitwise AND &
#   OR bitwise OR |
#   XOR bitwise XOR ^
#   NOT bitwise NOT ~
# destkey is the key where the calculated result is stored
# key ... are the keys participating in the operation; there can be several, separated by spaces (NOT accepts only one key)
# When BITOP processes strings of different lengths, the missing part of the shorter string is treated as 0. The return value is the length of the string saved to destkey (in bytes), equal to the length of the longest input key.
BITOP operation destkey key [key ...]

# Return the position of the first bit set to the specified value (0/1) in the key
BITPOS key value [start [end]]

Application Scenarios#

Sign-in Statistics#
Assuming we want to count the sign-in status of user ID 100 for June 2022, we can perform the following steps.

First, execute the command below to record that the user signed in on June 3rd.

SETBIT uid:sign:100:202206 2 1
Next, check whether the user signed in on June 3rd.

GETBIT uid:sign:100:202206 2 
Finally, count the number of sign-ins for the user in June.

BITCOUNT uid:sign:100:202206
This way, we know the user's sign-in status for June.

We can execute this command to get the first sign-in date for userID = 100 in June 2022:

BITPOS uid:sign:100:202206 1
Count of Users with Continuous Sign-ins#

To count the number of users with 3 consecutive sign-ins, we can perform an AND operation on three bitmaps and save the result to destmap, then execute BITCOUNT on destmap, as follows:

# AND operation
BITOP AND destmap bitmap:01 bitmap:02 bitmap:03
# Count the number of bits = 1
BITCOUNT destmap

Even if 100 million sign-in records are generated in a day, the Bitmap occupies little memory, approximately 12 MB (10^8 / 8 / 1024 / 1024). The memory overhead for a week of such Bitmaps is about 84 MB. It is also advisable to set an expiration time on the Bitmaps so Redis deletes expired sign-in data and saves memory.

HyperLogLog#

HyperLogLog provides approximate deduplication counting. In Redis, each HyperLogLog key only requires 12 KB of memory to count the cardinality of nearly 2^64 different elements, making HyperLogLog very space-efficient compared to Set and Hash types, which consume more memory as the number of elements increases.

Internal Implementation#

Common Commands#

# Add specified elements to HyperLogLog
PFADD key element [element ...]

# Return the estimated cardinality of the given HyperLogLog.
PFCOUNT key [key ...]

# Merge multiple HyperLogLogs into one HyperLogLog
PFMERGE destkey sourcekey [sourcekey ...]

Application Scenarios#

Counting Millions of Webpage UVs#
When counting UVs, you can use the PFADD command (used to add new elements to HyperLogLog) to add each user visiting the page to HyperLogLog.

PFADD page1:uv user1 user2 user3 user4 user5
Next, you can directly obtain the UV value of page1 using the PFCOUNT command, which returns the statistical result of HyperLogLog.

PFCOUNT page1:uv

GEO#

Mainly used to store geographical location information and perform operations on the stored information. In daily life, we increasingly rely on searching for "nearby restaurants" and calling cars on ride-hailing apps, all of which depend on location-based services (LBS) applications. The data accessed by LBS applications is a set of latitude and longitude information associated with people or objects, and it must be able to query adjacent latitude and longitude ranges, making GEO very suitable for LBS service scenarios.

Internal Implementation#

GEO does not design a new underlying data structure but directly uses the Sorted Set collection type.

The GEO type uses the GeoHash encoding method to convert latitude and longitude into the weight score of elements in the Sorted Set. The two key mechanisms involved are "dividing the two-dimensional map into intervals" and "encoding the intervals." When a set of latitude and longitude falls within a certain interval, the encoding value of that interval is used to represent it, and the encoding value is treated as the weight score of the Sorted Set element.

This way, we can store latitude and longitude in the Sorted Set and utilize the "ordered range lookup by weight" feature provided by the Sorted Set to meet the frequent "search nearby" needs in LBS services.

Common Commands#

# Store specified geographical locations, adding one or more longitude, latitude, and location name (member) to the specified key.
GEOADD key longitude latitude member [longitude latitude member ...]

# Return the positions (longitude and latitude) of all the given members from the specified key, returning nil for members not found.
GEOPOS key member [member ...]

# Return the distance between two given locations.
GEODIST key member1 member2 [m|km|ft|mi]

# Get the geographical location set within a specified range based on user-provided latitude and longitude coordinates.
GEORADIUS key longitude latitude radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [COUNT count] [ASC|DESC] [STORE key] [STOREDIST key]

Application Scenarios#

Execute the following command to store the current latitude and longitude of the vehicle with ID 33 in the GEO collection:
> GEOADD cars:locations 116.034579 39.030452 33

When a user wants to find nearby ride-hailing cars, the LBS application can use the GEORADIUS command.
For example, when the LBS application executes the following command, Redis will look for vehicle information within 5 kilometers centered around the user's latitude and longitude (116.054579, 39.030452) and return it to the LBS application.

> GEORADIUS cars:locations 116.054579 39.030452 5 km ASC COUNT 10

Stream#

Redis Stream is a new data type added in Redis version 5.0, specifically designed for message queues. It perfectly implements message queues, supporting message persistence, automatic generation of globally unique IDs, acknowledgment message confirmation patterns, consumer group modes, etc., making message queues more stable and reliable.

Common Commands#

Stream message queue operation commands:

XADD: insert a message, preserving order, with optional auto-generation of a globally unique ID;
XLEN: query the message length;
XREAD: read messages, optionally starting from a given ID;
XDEL: delete messages by message ID;
DEL: delete the entire Stream;
XRANGE: read messages within an ID range;
XREADGROUP: read messages as part of a consumer group;
XPENDING: query the messages "read but not acknowledged" by each consumer in a consumer group;
XACK: acknowledge to the message queue that a message has been processed;

Application Scenarios#

The producer inserts a message using the XADD command:

# * indicates that Redis will automatically generate a globally unique ID for the inserted data
# Insert a message into the message queue named mymq, with the key name and value xiaolin
> XADD mymq * name xiaolin
"1654254953808-0"

The consumer reads messages from the message queue using the XREAD command, specifying a message ID and starting to read from the next message after that ID (note that it starts reading from the next message after the input message ID, not querying the input ID message).

# Start reading all subsequent messages from the message with ID 1654254953807-0 (in this example, there is a total of 1 message).
> XREAD STREAMS mymq 1654254953807-0
1) 1) "mymq"
   2) 1) 1) "1654254953808-0"
         2) 1) "name"
            2) "xiaolin"

If you want to implement blocking reads (blocking when there is no data), you can set the BLOCK configuration option when calling XREAD to achieve a blocking read operation similar to BRPOP.

For example, the following command sets the BLOCK 10000 configuration option, where 10000 is in milliseconds, indicating that when XREAD reads the latest message, if no message arrives, XREAD will block for 10000 milliseconds (i.e., 10 seconds) before returning.

# The "$" symbol at the end of the command indicates reading the latest message
> XREAD BLOCK 10000 STREAMS mymq $
(nil)
(10.00s)

Streams can create consumer groups using XGROUP. After creating a consumer group, Streams can use the XREADGROUP command to allow consumers within the group to read messages.

Create two consumer groups, both consuming messages from the mymq message queue, specifying to start reading from the first message:

# Create a consumer group named group1, 0-0 indicates starting from the first message.
> XGROUP CREATE mymq group1 0-0
OK
# Create a consumer group named group2, 0-0 indicates starting from the first message.
> XGROUP CREATE mymq group2 0-0
OK

Consumers in the consumer group group1 can read all messages from the mymq message queue as follows:

# The last parameter ">" indicates starting to read from the first unconsumed message.
> XREADGROUP GROUP group1 consumer1 STREAMS mymq >
1) 1) "mymq"
   2) 1) 1) "1654254953808-0"
         2) 1) "name"
            2) "xiaolin"

Once a message in the message queue is read by one consumer in the consumer group, it cannot be read by other consumers in that group. However, consumers from different consumer groups can consume the same message (provided that the different consumer groups specified the same position to start reading messages when creating the groups).
The purpose of using consumer groups is to allow multiple consumers within the group to share the load of reading messages, so we usually let each consumer read part of the messages, thus achieving a balanced distribution of message reading load among multiple consumers.

How to ensure that consumers can still read unprocessed messages after a failure or crash when using Streams to implement a message queue?

Streams automatically use an internal queue (also known as the PENDING List) to retain messages read by each consumer in the consumer group until the consumer uses the XACK command to notify Streams that "the message has been processed."

That's all for the message queue implemented based on Streams. To summarize:

  • Message ordering: XADD/XREAD
  • Blocking reads: XREAD block
  • Duplicate message processing: Streams automatically generate globally unique IDs when using the XADD command;
  • Message reliability: Internally uses the PENDING List to automatically save messages, using the XPENDING command to view messages that have been read but not confirmed by the consumer group, and consumers use XACK to confirm messages;
  • Supports consumer group consumption of data.

What are the differences between Redis's Stream message queue and professional message queues?

A professional message queue must achieve two major aspects:

  • Messages must not be lost.
  • Messages can accumulate.
  1. Can Redis Stream messages be lost?

A message queue is essentially divided into three parts: producer, queue middleware, and consumer.
Can Redis Stream message queue guarantee that data is not lost in all three stages?

  • Will the Redis producer lose messages? Whether the producer loses messages depends on how well it handles exceptions: as long as it receives an ACK from the MQ middleware after producing and submitting a message, the send succeeded. So if return values and exceptions are handled properly, and the message is resent on failure, this stage will not lose messages.
  • Will the Redis consumer lose messages? No, because Streams (MQ middleware) will automatically use the internal queue (also known as the PENDING List) to retain messages read by each consumer in the consumer group that have not been confirmed. Consumers can check the messages that have been read but not confirmed after restarting using the XPENDING command. Once the consumer completes the business logic, it sends the consumption confirmation XACK command, which also ensures that messages are not lost.
  • Will the Redis message middleware lose messages? Yes, Redis can lose data in the following two scenarios:
    • AOF persistence is configured to write to disk every second, but this write process is asynchronous, and there is a possibility of data loss when Redis crashes.
    • Master-slave replication is also asynchronous, and there is a possibility of data loss during master-slave switching.
  2. Can Redis Stream messages accumulate?

Since Redis stores all data in memory, message accumulation causes Redis's memory to grow continuously; if it exceeds the machine's memory limit, Redis faces the risk of being OOM-killed.
Therefore, Redis Stream provides the functionality to specify the maximum length of the queue to avoid this situation.
When the maximum length of the queue is specified, if the queue length exceeds the limit, old messages will be deleted, and only a fixed length of new messages will be retained. Thus, it can be seen that when message accumulation occurs, if a maximum length is specified, there is still a possibility of losing messages.
However, professional message queues like Kafka and RabbitMQ store their data on disk, meaning that when messages accumulate, they simply occupy more disk space.
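
The maximum-length cap mentioned above is set with the MAXLEN option of XADD; a sketch:

# Keep at most 10000 entries; when the limit is exceeded, the oldest entries are evicted
> XADD mymq MAXLEN 10000 * name xiaolin
"1654254953809-0"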

Therefore, using Redis as a queue comes with two issues:

  • Redis itself may lose data;
  • Facing message pressure, memory resources will become tight;

Thus, whether Redis can be used as a message queue depends on your business scenario:

  • If your business scenario is simple enough, not sensitive to data loss, and the probability of message accumulation is relatively low, using Redis as a queue is completely feasible.
  • If your business has a massive amount of messages, a high probability of message accumulation, and cannot tolerate data loss, it is better to use a professional message queue middleware.

Redis Data Structures#

Key-Value Pairs#

In Redis, the key in the key-value pair is a string object, while the value can be either a string object or an object of a collection data type, such as List, Hash, Set, and Zset objects.

How are these key-value pairs stored in Redis?

Redis uses a "hash table" to store all key-value pairs, and the biggest advantage of a hash table is that it allows us to quickly find key-value pairs in O(1) time complexity. A hash table is essentially an array, and the elements in the array are called hash buckets.

How are Redis's hash buckets used to store key-value data?

Hash buckets store pointers (dictEntry*) to key-value entries rather than the entries themselves, so key-value data is found by following pointers. Because the value of a key-value pair can be a string object or a collection object, the entry does not store the value directly either: it holds a void *key and a void *value pointer that point to the actual key object and value object. This way, even when the value is a collection type, it can still be reached through the void *value pointer.

SDS#

Let's introduce each member variable in the data structure:

  • len records the length of the string. This way, when obtaining the string length, we only need to return the value of this member variable, and the time complexity is O(1).
  • alloc is the length of the space allocated for the character array. This way, when modifying the string, we can calculate the remaining space size using alloc - len to determine whether the space meets the modification requirements. If it does not meet the requirements, the space of SDS will be automatically expanded to the size required for the modification before executing the actual modification operation. Therefore, using SDS does not require manually modifying the space size of SDS, and there will be no buffer overflow issues as mentioned earlier.
  • flags indicate different types of SDS. A total of 5 types are designed: sdshdr5, sdshdr8, sdshdr16, sdshdr32, and sdshdr64, which will be explained later.
  • buf[] is the character array used to store actual data. It can store not only strings but also binary data.

Compared to C's native strings, SDS has the advantages already described in the String section (binary-safe storage, O(1) length lookup, and overflow-safe APIs), plus one more:
  • Saves memory space. The SDS structure has a flags member variable that indicates the type of SDS.
    Redis has designed 5 types: sdshdr5, sdshdr8, sdshdr16, sdshdr32, and sdshdr64. The main difference among these 5 types lies in the data types of the len and alloc member variables in their data structures.
    The len and alloc members of sdshdr16 are both of type uint16_t, meaning the string length and allocated space size it can record cannot exceed 2^16 − 1.
    The reason for designing different types of structures for SDS is to flexibly save strings of different sizes, thereby effectively saving memory space. For example, when saving small strings, the structure header occupies less space.
    In addition to designing different types of structures, Redis also employs specific compilation optimizations to save memory space by declaring __attribute__ ((packed)) in the struct, which instructs the compiler to cancel the optimization alignment of the structure during the compilation process and align according to the actual occupied byte size.
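
For reference, this is the shape of one such header in Redis's sds.h (sdshdr8; note the packed attribute discussed above):

struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len;         /* used length of buf */
    uint8_t alloc;       /* allocated space, excluding the header and null terminator */
    unsigned char flags; /* 3 least-significant bits hold the type, 5 bits unused */
    char buf[];          /* character array that stores the actual data */
};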

Linked List#

The list structure provides a head pointer for the linked list, a tail node, the number of nodes len, and customizable dup, free, and match functions.
Since the memory of linked lists is non-contiguous, it means that CPU cache cannot be well utilized, and the value of each linked list node requires allocation of a linked list node structure header, resulting in higher memory overhead. Therefore, in Redis 3.0, the List object uses a "zip list" as the underlying data structure implementation when the data volume is relatively small, which has the advantage of saving memory space and is a compact memory data structure.
In Redis 5.0, a new data structure called listpack was designed, which inherited the compact memory layout of the zip list. Ultimately, in the latest Redis version, the zip list, which was one of the underlying data structure implementations for Hash and Zset objects, was replaced by listpack.

Zip List#

The zip list was developed by Redis to save memory; it is a sequential data structure composed of contiguous memory blocks, somewhat similar to an array.
However, the zip list also has its drawbacks:

  • It cannot store too many elements; otherwise, the query efficiency will decrease;
  • When adding or modifying an element, the memory space occupied by the zip list needs to be reallocated, which may even trigger a chain update issue.

Therefore, the zip list is only used as the underlying data structure when the number of elements contained in Redis objects (List, Hash, Zset) is relatively small, or when the element values are not large.

Structural Design#

The zip list has three fields in its header, plus an end marker:

  • zlbytes records the total memory bytes occupied by the entire zip list;
  • zltail records the offset of the "tail" node from the starting address, indicating the distance to the tail;
  • zllen records the number of nodes contained in the zip list;
  • zlend marks the end point of the zip list, with a fixed value of 0xFF (decimal 255).

In the zip list, locating the first or last element is O(1): the first entry sits immediately after the header, and the zltail field gives the offset of the tail entry. Locating any other element, however, requires scanning entry by entry, which is O(N), so zip lists are not suitable for storing many elements.

Additionally, the structure of a zip list node (entry) is as follows:

  • prevlen records the length of the "previous node," which is used for backward traversal;
  • encoding records the "type and length" of the actual data in the current node, with two main types: string and integer.
  • data records the actual data of the current node, with the type and length determined by encoding.

This design philosophy of allocating different space sizes based on data size and type is precisely what Redis adopts to save memory.

Let's discuss how prevlen and encoding allocate different space sizes based on data size and type.

The prevlen attribute in each node of the zip list records the "length of the previous node," and the space size of the prevlen attribute depends on the length value of the previous node. For example:

  • If the length of the previous node is less than 254 bytes, the prevlen attribute needs 1 byte of space to store this length value;
  • If the length of the previous node is greater than or equal to 254 bytes, the prevlen attribute needs 5 bytes of space to store this length value;

The space size of the encoding attribute depends on whether the data is a string or an integer, and on the length of the string:

  • If the data of the current node is an integer, the encoding will use 1 byte of space for encoding, meaning the encoding length is 1 byte. By confirming the integer type through encoding, the actual size of the integer data can be determined. For example, if the encoding confirms that the data is an int16 integer, then the length of the data is the size of int16.
  • If the data of the current node is a string, the encoding will use 1 byte/2 bytes/5 bytes of space for encoding based on the length of the string. The first two bits of the encoding indicate the data type, and the subsequent bits indicate the actual length of the string data, i.e., the length of data.

Chain Update#

When adding or modifying an element in the zip list, if there is insufficient space, the memory space occupied by the zip list needs to be reallocated. When a newly inserted element is larger, it may cause the prevlen of subsequent elements to change, leading to the "chain update" problem, causing every element's space to be reallocated and degrading the performance of accessing the zip list.
Therefore, the zip list is only used in scenarios where the number of nodes is small enough; even if a chain update occurs, it is still acceptable.

Hash Table#

A hash table is an array (dictEntry table), where each element in the array is a pointer to a "hash table node (dictEntry)."
The dictEntry structure not only contains pointers to the key and value but also contains a pointer to the next hash table node, which can link multiple key-value pairs with the same hash value together, thus solving the problem of hash collisions. This is known as chained hashing.
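
A sketch of the node structure, following dict.h in Redis 6 (the union lets small values such as integers live directly inside the entry):

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    // Next entry in the same bucket; this pointer forms the collision chain
    struct dictEntry *next;
} dictEntry;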

Hash Collision#

Chained Hashing#

The implementation method is that each hash table node has a next pointer to point to the next hash table node, so multiple hash table nodes can be linked together using this single linked list formed by the next pointer. This way, multiple nodes assigned to the same hash bucket can be connected using this single linked list, thus solving the hash collision problem.

The limitation of chained hashing is also evident; as the length of the linked list increases, the time taken to query data at that position will increase, as querying a linked list has a time complexity of O(n).

To solve this problem, rehashing is needed, which involves expanding the size of the hash table.

Rehash#

When using hash tables, Redis defines a dict structure that contains two hash tables (ht[2]). The reason for defining two hash tables is that rehashing requires both hash tables.
During normal service request phases, all inserted data will be written to "hash table 1," while "hash table 2" has not yet been allocated space.

As the data gradually increases, triggering the rehash operation, this process is divided into three steps:

  • Allocate space for "hash table 2," which is generally twice the size of "hash table 1";
  • Migrate the data from "hash table 1" to "hash table 2";
  • After migration is complete, the space of "hash table 1" will be released, and "hash table 2" will be set as "hash table 1," then a new empty hash table will be created in "hash table 2" for the next rehash preparation.
The second step has a problem: if the data volume of "hash table 1" is very large, migrating it to "hash table 2" involves copying a large amount of data, which may block Redis and prevent it from serving other requests.

Trigger Conditions:
The conditions for triggering the rehash operation mainly include two:

  • When the load factor is greater than or equal to 1, and Redis is not executing the bgsave or bgrewriteaof commands, meaning no RDB snapshot or AOF rewrite is being performed, rehashing will occur.
  • When the load factor is greater than or equal to 5, indicating that hash collisions are very severe, rehashing will be forcibly performed regardless of whether RDB snapshots or AOF rewrites are being executed.
Incremental Rehash#

The data migration work is no longer completed in one go but is done in multiple steps.
The steps for incremental rehash are as follows:

  • Allocate space for "hash table 2";
  • During the rehash period, every time a hash-table element is added, deleted, looked up, or updated, Redis not only performs the requested operation but also migrates all key-value pairs at one index position of "hash table 1" over to "hash table 2", moving sequentially through the index positions;
  • As the number of client requests for hash table operations increases, eventually, at some point, all key-value pairs from "hash table 1" will be migrated to "hash table 2," completing the rehash operation.

Integer Set#

The integer set is one of the underlying implementations of the Set object. When a Set object contains only integer value elements and the number of elements is not large, it will use the integer set data structure as the underlying implementation.

The integer set is essentially a contiguous block of memory, and its structure is defined as follows:

typedef struct intset {
    // Encoding method
    uint32_t encoding;
    // Number of elements in the set
    uint32_t length;
    // Array to store elements
    int8_t contents[];
} intset;

As seen, the container for storing elements is an array called contents. Although contents is declared as an array of type int8_t, it does not actually store any int8_t type elements; the actual type of the contents array depends on the value of the encoding attribute in the intset structure.

Upgrade Operation of Integer Set#

The integer set has an upgrade rule: when a new element is added to the integer set, if the type of the new element (int32_t) is longer than that of all existing elements in the integer set (int16_t), the integer set needs to be upgraded. This means expanding the space of the contents array according to the type of the new element (int32_t) before adding the new element to the integer set, while maintaining the order of the integer set.

The process of upgrading the integer set does not involve reallocating a new type of array; instead, it expands the space on the original array and divides each element according to the size of the type. If the encoding attribute value is INTSET_ENC_INT16, then the interval for each element is 16 bits.

This approach saves memory. Note, however, that the integer set does not support downgrade operations.

Skip List#

Redis uses skip lists as the underlying implementation for Zset objects, which allows for average O(logN) complexity for node lookups.

typedef struct zset {
    dict *dict;
    zskiplist *zsl;
} zset;

During data insertion or update operations for Zset objects, data is inserted or updated in both the skip list and the hash table, ensuring that the information recorded in both the skip list and the hash table remains consistent.
Zset objects support range queries (such as ZRANGEBYSCORE operations) because their data structure design uses skip lists, and they can obtain element weights in constant complexity (such as ZSCORE operations) because they also use hash tables for indexing.
The hash table in the struct zset is only used to obtain element weights in constant complexity, while most operations are implemented by the skip list.

Skip List Structural Design#

A skip list is an improved version of a linked list, implementing a "multi-level" ordered linked list, which allows for faster data positioning.

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;
typedef struct zskiplistNode {
    // Element value of the Zset object
    sds ele;
    // Element weight value
    double score;
    // Backward pointer
    struct zskiplistNode *backward;
  
    // Level array of the node, storing forward pointers and spans for each level
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;
    } level[];
} zskiplistNode;

Zset objects need to store both "elements" and "element weights," corresponding to the sds type variable ele and the double type variable score in the skip list node structure. Each skip list node has a backward pointer (struct zskiplistNode *backward) that points to the previous node, facilitating access from the tail node of the skip list, making reverse lookups convenient.

A skip list is a hierarchical linked list, and each level can contain multiple nodes, with each node connected by pointers. This feature is achieved through the level array of the skip list node structure.

Each element in the level array represents a level of the skip list, represented by the zskiplistLevel structure. For example, level[0] represents the first level, and level[1] represents the second level. The zskiplistLevel structure defines "pointers to the next skip list node" and "spans," where spans are used to record the distance between two nodes, which is actually used to calculate the ranking of this node in the skip list.

Skip List Node Query Process#

The process of looking up a skip list node starts from the highest level of the header node, traversing each level sequentially. When traversing a skip list node at a certain level, the weight of the node and the SDS type data are used for judgment, with two judgment conditions:

  • If the weight of the current node is "less than" the weight to be searched, the skip list will access the next node at that level.
  • If the weight of the current node is "equal to" the weight to be searched, and the SDS type data of the current node is "less than" the data to be searched, the skip list will access the next node at that level.
  • If neither of the above two conditions is met, or if the next node is null, the skip list will use the next layer pointer in the level array of the currently traversed node and continue searching along the next layer pointer, effectively jumping to the next layer to continue searching.
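
In code form, the lookup walk looks roughly like this, given a target score and element ele (a simplified sketch modeled on the traversal loop in Redis's t_zset.c):

zskiplistNode *x = zsl->header;
for (int i = zsl->level - 1; i >= 0; i--) {
    // Move right on this level while the next node is still "smaller"
    // (lower score, or equal score with lexicographically smaller element)
    while (x->level[i].forward &&
           (x->level[i].forward->score < score ||
            (x->level[i].forward->score == score &&
             sdscmp(x->level[i].forward->ele, ele) < 0))) {
        x = x->level[i].forward;
    }
    // Falling out of the while loop drops the search down one level
}
// x->level[0].forward is now the first node >= (score, ele), if it exists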

Setting the Number of Layers for Skip List Nodes#

The ratio of the number of nodes in adjacent layers of the skip list affects the query performance of the skip list.
The ideal ratio of the number of nodes in adjacent layers of the skip list is 2:1, which can reduce the lookup complexity to O(logN).

Redis adopts a clever method: when creating nodes, the skip list randomly generates the number of layers for each node, without strictly maintaining the ratio of the number of nodes in adjacent layers to be 2:1.

Specifically, when creating a node, a random number in the range [0, 1) is generated; whenever the random number is less than 0.25 (a 25% probability), the number of layers increases by 1. This repeats until a random number of at least 0.25 is drawn, which fixes the number of layers for that node.

This approach means that the probability of increasing the number of layers does not exceed 25%, and the higher the layer, the lower the probability, with a maximum limit of 64 layers.
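
This is essentially Redis's zslRandomLevel; a sketch of it under the constants the text describes (ZSKIPLIST_P = 0.25, capped by ZSKIPLIST_MAXLEVEL):

#define ZSKIPLIST_MAXLEVEL 64   /* maximum number of layers for a node */
#define ZSKIPLIST_P 0.25        /* probability of promoting a node one more layer */

int zslRandomLevel(void) {
    int level = 1;
    // Each additional layer is granted with probability 0.25
    while ((random() & 0xFFFF) < (ZSKIPLIST_P * 0xFFFF))
        level += 1;
    return (level < ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL;
}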

Why Use Skip Lists Instead of Balanced Trees?#

  • From the perspective of memory usage, skip lists are more flexible than balanced trees. Each node in a balanced tree contains 2 pointers (pointing to the left and right subtrees), while each node in a skip list contains an average of 1/(1-p) pointers, depending on the size of parameter p. If implemented as in Redis, with p=1/4, each node averages 1.33 pointers, which is advantageous compared to balanced trees.
  • When performing range queries, operations on skip lists are simpler than on balanced trees. In a balanced tree, after finding the specified range's minimum value, we still need to continue searching for other nodes that do not exceed the maximum value in in-order traversal. Without certain modifications to the balanced tree, this in-order traversal is not easy to implement. However, performing range queries on a skip list is very simple; after finding the minimum value, we can traverse the first layer linked list for several steps to achieve this.
  • From the perspective of algorithm implementation difficulty, skip lists are much simpler than balanced trees. In balanced trees, insertion and deletion operations may trigger adjustments of subtrees, which is complex logic, while in skip lists, insertion and deletion only require modifying the pointers of adjacent nodes, making operations simple and fast.

Quicklist#

In Redis 3.2, the underlying implementation of List objects was changed to the quicklist data structure. Essentially, a quicklist is a combination of a "doubly linked list + zip list," as a quicklist is a linked list, and each element in the list is a zip list.

Quicklist addresses the issues of zip lists by controlling the size or number of elements in the zip lists within each linked list node to avoid chain update problems. The fewer or smaller the elements in the zip list, the smaller the impact of chain updates, thus providing better access performance.

Structural Design#

typedef struct quicklist {
    // Head of the quicklist
    quicklistNode *head;
    // Tail of the quicklist
    quicklistNode *tail;
    // Total number of elements across all zip lists
    unsigned long count;
    // Number of quicklistNodes
    unsigned long len;
    ...
} quicklist;

typedef struct quicklistNode {
    // Previous quicklistNode
    struct quicklistNode *prev;
    // Next quicklistNode
    struct quicklistNode *next;
    // Pointer to this node's zip list
    unsigned char *zl;
    // Size of the zip list in bytes
    unsigned int sz;
    // Number of elements in the zip list
    unsigned int count : 16;
    ...
} quicklistNode;

The elements of the linked list nodes no longer simply store element values but instead store a zip list, so the quicklistNode structure contains a pointer to the zip list *zl.
When adding an element to the quicklist, it does not create a new linked list node like a regular linked list. Instead, it checks whether the zip list at the insertion position can accommodate the element. If it can, the element is directly saved to the zip list in the quicklistNode structure; if it cannot, a new quicklistNode structure is created.

Quicklist controls the size or number of elements in the zip list within the quicklistNode structure to mitigate potential chain update risks, but this does not completely solve the chain update problem.

Listpack#

Since quicklistNode still uses zip lists to store elements, the chain update problem of zip lists arises from their structural design. Therefore, to completely solve this problem, a new data structure called listpack was designed in Redis 5.0 to replace zip lists. The main feature of listpack is that each node in listpack no longer contains the length of the previous node, which avoids the chain update risk that zip lists face due to the need to store the length field of the previous node.

Structural Design#

Each listpack node (entry) mainly includes three parts:

  • encoding defines the encoding type of the element, which encodes different lengths of integers and strings;
  • data stores the actual data;
  • len records the total length of encoding + data;

As can be seen, listpack does not have the field for recording the length of the previous node like zip lists. Listpack only records the length of the current node, so when we add a new element to listpack, it does not affect the length fields of other nodes, thus avoiding the chain update problem of zip lists.
