AMQP 1.0 Benchmarks

This blog post demonstrates that native AMQP 1.0 in RabbitMQ 4.0 provides significant performance and scalability improvements compared to AMQP 1.0 in RabbitMQ 3.13.

Additionally, this blog post suggests that AMQP 1.0 can perform slightly better than AMQP 0.9.1 in RabbitMQ 4.0.

Setup

The following setup applies to all benchmarks in this blog post:

  • Intel NUC 11
  • 8 CPU cores
  • 32 GB RAM
  • Ubuntu 22.04
  • Single node RabbitMQ server
  • Server runs with (only) 3 scheduler threads (set via the +S 3 runtime flag)
  • Erlang/OTP 27.0.1
  • Clients and server run on the same box

We use the latest RabbitMQ versions available at the time of writing.

The following advanced.config is applied:

[
  {rabbit, [
    {loopback_users, []}
  ]},

  {rabbitmq_management_agent, [
    {disable_metrics_collector, true}
  ]}
].

Metrics collection is disabled in the rabbitmq_management_agent plugin.
For production environments, Prometheus is the recommended option.

RabbitMQ server is started as follows:

make run-broker \
  TEST_TMPDIR="$HOME/scratch/rabbit/test" \
  RABBITMQ_CONFIG_FILE="$HOME/scratch/rabbit/advanced.config" \
  PLUGINS="rabbitmq_prometheus rabbitmq_management rabbitmq_amqp1_0" \
  RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+S 3"

The rabbitmq_amqp1_0 plugin is a no-op plugin in RabbitMQ 4.0.

The AMQP 1.0 benchmarks run quiver in a Docker container:

$ docker run -it --rm --add-host host.docker.internal:host-gateway ssorj/quiver:latest
bash-5.1# quiver --version
quiver 0.4.0-SNAPSHOT

Classic Queues

This section benchmarks classic queues.

We declare a classic queue called my-classic-queue:

deps/rabbitmq_management/bin/rabbitmqadmin declare queue \
  name=my-classic-queue queue_type=classic durable=true

AMQP 1.0 in 4.0

The client sends and receives 1 million messages.
Each message contains a payload of 12 bytes.
The receiver repeatedly tops up 200 link credits at a time.

# quiver //host.docker.internal//queues/my-classic-queue \
  --durable --count 1m --duration 10m --body-size 12 --credit 200

RESULTS

Count ............................................. 1,000,000 messages
Duration ............................................... 10.1 seconds
Sender rate .......................................... 99,413 messages/s
Receiver rate ........................................ 99,423 messages/s
End-to-end rate ...................................... 99,413 messages/s

Latencies by percentile:

0% ........ 0 ms 90.00% ........ 1 ms
25% ........ 1 ms 99.00% ........ 2 ms
50% ........ 1 ms 99.90% ........ 2 ms
100% ........ 9 ms 99.99% ........ 9 ms
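
As a point of reference, the same message flow can be reproduced outside quiver with the Azure/go-amqp client that also appears later in this post. The following sketch mirrors the benchmark settings (durable 12-byte messages sent to the /queues/my-classic-queue target, and a receiver to which the library grants 200 link credits, topping them up automatically); the address and credit value come from the benchmark above, everything else is purely illustrative:

package main

import (
    "context"
    "log"

    "github.com/Azure/go-amqp"
)

func main() {
    ctx := context.TODO()

    conn, err := amqp.Dial(ctx, "amqp://localhost",
        &amqp.ConnOptions{SASLType: amqp.SASLTypeAnonymous()})
    if err != nil {
        log.Fatal("open connection:", err)
    }
    session, err := conn.NewSession(ctx, nil)
    if err != nil {
        log.Fatal("begin session:", err)
    }

    // Sender: durable messages with a 12-byte payload, addressed to
    // /queues/my-classic-queue (routed via the default exchange).
    sender, err := session.NewSender(ctx, "/queues/my-classic-queue", nil)
    if err != nil {
        log.Fatal("attach sender:", err)
    }
    msg := &amqp.Message{
        Header: &amqp.MessageHeader{Durable: true},
        Data:   [][]byte{[]byte("hello world!")}, // 12-byte body
    }
    if err := sender.Send(ctx, msg, nil); err != nil {
        log.Fatal("send:", err)
    }

    // Receiver: Credit: 200 corresponds to quiver's --credit 200; the
    // library keeps topping up the link credit as messages are settled.
    receiver, err := session.NewReceiver(ctx, "/queues/my-classic-queue",
        &amqp.ReceiverOptions{Credit: 200})
    if err != nil {
        log.Fatal("attach receiver:", err)
    }
    received, err := receiver.Receive(ctx, nil)
    if err != nil {
        log.Fatal("receive:", err)
    }
    if err := receiver.AcceptMessage(ctx, received); err != nil {
        log.Fatal("accept:", err)
    }
    log.Println("message accepted")
}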

AMQP 1.0 in 3.13

# quiver //host.docker.internal//amq/queue/my-classic-queue \
  --durable --count 1m --duration 10m --body-size 12 --credit 200

RESULTS

Count ............................................. 1,000,000 messages
Duration ............................................... 45.9 seconds
Sender rate .......................................... 43,264 messages/s
Receiver rate ........................................ 21,822 messages/s
End-to-end rate ...................................... 21,790 messages/s

Latencies by percentile:

0% ....... 67 ms 90.00% .... 24445 ms
25% .... 23056 ms 99.00% .... 24780 ms
50% .... 23433 ms 99.90% .... 24869 ms
100% .... 24873 ms 99.99% .... 24873 ms

The same benchmark against RabbitMQ 3.13 results in 4.5 times lower throughput.

Detailed test execution
---------------------- Sender -----------------------  --------------------- Receiver ----------------------  --------
Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Lat [ms]
----------------------------------------------------- ----------------------------------------------------- --------
2.1 130,814 65,342 8 79.1 2.1 3,509 1,753 1 7.5 777
4.1 206,588 37,849 6 79.1 4.1 5,995 1,242 0 7.5 2,458
6.1 294,650 43,987 6 79.1 6.1 9,505 1,753 1 7.5 5,066
8.1 360,184 32,734 5 79.4 8.1 13,893 2,194 0 7.5 6,190
10.1 458,486 49,102 6 79.4 10.1 15,793 950 1 7.5 9,259
12.1 524,020 32,734 5 79.4 12.1 21,644 2,923 1 7.5 11,163
14.1 622,322 49,102 5 79.4 14.1 25,154 1,753 1 7.5 13,451
16.1 687,856 32,734 4 79.4 16.1 27,639 1,241 1 7.5 15,246
18.1 786,158 49,102 6 81.0 18.1 30,124 1,241 1 7.5 17,649
20.1 884,460 49,102 6 81.0 20.1 32,610 1,242 1 7.5 19,408
22.1 949,994 32,734 4 81.0 22.1 35,535 1,462 0 7.5 21,293
24.1 999,912 24,934 4 81.8 24.1 38,167 1,315 1 7.5 23,321
26.1 999,974 31 2 0.0 26.1 117,745 39,749 11 7.5 24,475
- - - - - 28.1 202,589 42,380 11 7.5 24,364
- - - - - 30.1 292,554 44,938 13 7.5 24,244
- - - - - 32.1 377,691 42,526 15 7.5 23,955
- - - - - 34.1 469,704 45,961 14 7.5 23,660
- - - - - 36.1 555,719 42,965 12 7.5 23,463
- - - - - 38.1 649,048 46,618 12 7.5 23,264
- - - - - 40.1 737,696 44,280 15 7.5 23,140
- - - - - 42.1 826,491 44,353 15 7.5 23,100
- - - - - 44.1 917,187 45,303 16 7.5 23,066
- - - - - 46.1 999,974 41,394 14 0.0 22,781

AMQP 0.9.1 in 4.0

For our AMQP 0.9.1 benchmarks we use PerfTest.
We aim for a comparison that is reasonably fair to our previous AMQP 1.0 benchmark.

Since an AMQP 1.0 /queues/:queue target address sends to the default exchange, we also send to the default exchange via AMQP 0.9.1.
Since we used durable messages with AMQP 1.0, we set the persistent flag in AMQP 0.9.1.
Since RabbitMQ settles with the released outcome when a message cannot be routed, we set the mandatory flag in AMQP 0.9.1.
Since RabbitMQ 4.0 uses a default rabbit.max_link_credit of 128, granting 128 more credits to the sending client when the remaining credit falls below 0.5 * 128, we configure the AMQP 0.9.1 publisher to have at most 1.5 * 128 = 192 messages unconfirmed at a time.
Since we used 200 link credits in the previous run, we configure the AMQP 0.9.1 consumer with a prefetch of 200.

$ java -jar target/perf-test.jar \
  --predeclared --exchange amq.default \
  --routing-key my-classic-queue --queue my-classic-queue \
  --flag persistent --flag mandatory \
  --pmessages 1000000 --size 12 --confirm 192 --qos 200 --multi-ack-every 200

id: test-151706-485, sending rate avg: 88534 msg/s
id: test-151706-485, receiving rate avg: 88534 msg/s
id: test-151706-485, consumer latency min/median/75th/95th/99th 99/975/1320/1900/2799 µs
id: test-151706-485, confirm latency min/median/75th/95th/99th 193/1691/2113/2887/3358 µs
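
For illustration, the publisher-side constraints listed above can also be expressed directly with the amqp091-go client used later in this post. This is only a sketch under the benchmark's assumptions (default exchange, routing key my-classic-queue, persistent and mandatory publishing, roughly 192 unconfirmed messages at a time); PerfTest remains the tool that produced the numbers above, and unroutable returns are not handled here:

package main

import (
    "context"
    "log"

    amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
    conn, err := amqp.Dial("amqp://guest:guest@localhost")
    if err != nil {
        log.Fatal("open connection:", err)
    }
    ch, err := conn.Channel()
    if err != nil {
        log.Fatal("open channel:", err)
    }
    if err := ch.Confirm(false); err != nil { // put the channel in confirm mode
        log.Fatal("confirm mode:", err)
    }

    // Cap the number of unconfirmed messages at roughly 192,
    // mirroring PerfTest's --confirm 192.
    inFlight := make(chan *amqp.DeferredConfirmation, 192)
    go func() {
        for dc := range inFlight {
            dc.Wait() // blocks until the broker confirms, freeing a slot
        }
    }()

    body := []byte("hello world!") // 12-byte payload
    for i := 0; i < 1_000_000; i++ {
        dc, err := ch.PublishWithDeferredConfirmWithContext(
            context.TODO(),
            "",                 // default exchange
            "my-classic-queue", // routing key = queue name
            true,               // mandatory
            false,              // immediate (must be false)
            amqp.Publishing{
                DeliveryMode: amqp.Persistent,
                Body:         body,
            })
        if err != nil {
            log.Fatal("publish:", err)
        }
        inFlight <- dc // blocks while ~192 confirms are outstanding
    }
}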

Summary

Figure 1: Classic queue end-to-end message rate

Quorum Queues

This section benchmarks quorum queues.

We declare a quorum queue called my-quorum-queue:

deps/rabbitmq_management/bin/rabbitmqadmin declare queue \
  name=my-quorum-queue queue_type=quorum durable=true

Flow Control Configuration

For highest data safety, quorum queues fsync all Ra commands including:

  • enqueue: sender enqueues a message
  • settle: receiver accepts a message
  • credit: receiver tops up link credit

Before a quorum queue confirms receipt of a message to the publisher, it ensures that any file modifications are flushed to disk, making the data safe even if the RabbitMQ node crashes shortly after.

The SSD of my Linux box is slow, taking 5-15 ms per fsync.
Since we want to compare AMQP protocol implementations without being bottlenecked by a cheap disk, the tests in this section increase flow control settings:

advanced.config
[
  {rabbit, [
    {loopback_users, []},

    %% RabbitMQ internal flow control for AMQP 0.9.1
    %% Default: {400, 200}
    {credit_flow_default_credit, {5000, 2500}},

    %% Maximum incoming-window of AMQP 1.0 session.
    %% Default: 400
    {max_incoming_window, 5000},

    %% Maximum link-credit RabbitMQ grants to AMQP 1.0 sender.
    %% Default: 128
    {max_link_credit, 2000},

    %% Maximum link-credit RabbitMQ AMQP 1.0 session grants to sending queue.
    %% Default: 256
    {max_queue_credit, 5000}
  ]},

  {rabbitmq_management_agent, [
    {disable_metrics_collector, true}
  ]}
].

This configuration allows more Ra commands to be batched before RabbitMQ calls fsync.
For production use cases, we recommend enterprise-grade high performance disks that fsync faster, in which case there is likely no need to increase flow control settings.

RabbitMQ flow control settings present a trade-off:

  • Low values ensure stability in production.
  • High values can result in higher performance for individual connections but may lead to higher memory spikes when many connections publish large messages concurrently.

RabbitMQ uses conservative flow control default settings to favour stability in production over winning performance benchmarks.

AMQP 1.0 in 4.0

# quiver //host.docker.internal//queues/my-quorum-queue \
  --durable --count 1m --duration 10m --body-size 12 --credit 5000

RESULTS

Count ............................................. 1,000,000 messages
Duration ............................................... 12.0 seconds
Sender rate .......................................... 83,459 messages/s
Receiver rate ........................................ 83,396 messages/s
End-to-end rate ...................................... 83,181 messages/s

Latencies by percentile:

0% ........ 9 ms 90.00% ....... 47 ms
25% ....... 27 ms 99.00% ....... 61 ms
50% ....... 35 ms 99.90% ....... 76 ms
100% ....... 81 ms 99.99% ....... 81 ms

Default Flow Control Settings

The previous benchmark calls fsync 1,244 times in the ra_log_wal module (which implements the Raft write-ahead log).

The same benchmark with default flow control settings calls fsync 15,493 times, resulting in significantly lower throughput:

# quiver //host.docker.internal//queues/my-quorum-queue \
  --durable --count 1m --duration 10m --body-size 12 --credit 5000

RESULTS

Count ............................................. 1,000,000 messages
Duration .............................................. 100.2 seconds
Sender rate ........................................... 9,986 messages/s
Receiver rate ......................................... 9,987 messages/s
End-to-end rate ....................................... 9,983 messages/s

Latencies by percentile:

0% ....... 10 ms 90.00% ....... 24 ms
25% ....... 14 ms 99.00% ....... 30 ms
50% ....... 18 ms 99.90% ....... 38 ms
100% ....... 55 ms 99.99% ....... 47 ms

Each fsync took 5.9 ms on average.

(15,493 - 1,244) * 5.9 ms = 84 seconds

Therefore, this benchmark with default flow control settings is blocked for 84 seconds longer executing fsync than the previous benchmark with increased flow control settings.
This shows how critical enterprise-grade, high-performance disks are to get the best results out of quorum queues.
For your production workloads, we recommend using disks with lower fsync latency rather than tweaking RabbitMQ flow control settings.

It’s worth noting that the Raft write-ahead log (WAL) is shared by all quorum queue replicas on a given RabbitMQ node.
This means that ra_log_wal automatically batches multiple Raft commands (operations) into a single fsync call when there are dozens of quorum queues with hundreds of connections.
Consequently, flushing an individual Ra command to disk becomes cheaper on average when there is more traffic on the node.
Our benchmark ran somewhat artificially with a single connection publishing as fast as possible.

AMQP 1.0 in 3.13

# quiver //host.docker.internal//amq/queue/my-quorum-queue \
  --durable --count 1m --duration 10m --body-size 12 --credit 5000

---------------------- Sender ----------------------- --------------------- Receiver ---------------------- --------
Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Lat [ms]
----------------------------------------------------- ----------------------------------------------------- --------
2.1 163,582 81,709 11 84.2 2.1 29,548 14,759 3 7.5 840
4.1 336,380 86,356 12 185.3 4.1 29,840 146 0 7.5 2,331
6.1 524,026 93,729 14 328.0 6.1 29,840 0 0 7.5 0
8.1 687,864 81,837 11 462.3 8.1 31,302 730 1 7.5 6,780
10.1 884,470 98,303 14 605.4 10.1 31,447 72 0 7.5 7,897
12.1 999,924 57,669 7 687.5 12.1 31,447 0 0 7.5 0
14.1 999,924 0 0 687.5 14.1 31,447 0 0 7.5 0
16.1 999,924 0 0 687.5 16.1 31,447 0 1 7.5 0
18.1 999,924 0 1 688.3 18.1 31,447 0 0 7.5 0
receiver timed out
20.1 999,924 0 0 688.3 20.1 31,447 0 0 7.5 0

RabbitMQ 3.13 cannot handle this workload and the benchmark fails.

Default Flow Control Settings

The benchmark also fails with default flow control settings:

# quiver //host.docker.internal//amq/queue/my-quorum-queue \
  --durable --count 1m --duration 10m --body-size 12 --credit 5000

---------------------- Sender ----------------------- --------------------- Receiver ---------------------- --------
Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Lat [ms]
----------------------------------------------------- ----------------------------------------------------- --------
2.1 130,814 65,342 9 70.0 2.1 26,915 13,437 6 7.5 1,213
4.1 196,348 32,718 5 70.2 4.1 28,084 584 0 7.5 3,093
6.1 261,882 32,734 7 70.2 6.1 30,131 1,022 1 7.5 4,952
8.1 360,184 49,126 6 70.2 8.1 32,325 1,096 0 7.5 6,637
10.1 425,718 32,734 6 70.2 10.1 34,225 949 1 7.5 8,089
12.1 491,252 32,734 5 70.2 12.1 34,225 0 0 7.5 0
14.1 589,554 49,102 7 70.2 14.1 34,225 0 0 7.5 0
16.1 655,088 32,734 5 70.2 16.1 34,225 0 0 7.5 0
18.1 720,622 32,734 6 70.2 18.1 34,225 0 0 7.5 0
receiver timed out

AMQP 0.9.1 in 4.0

Since we set max_link_credit to 2,000, we allow for a maximum of 2,000 * 1.5 = 3,000 unconfirmed messages in the publisher.

$ java -jar target/perf-test.jar \
  --predeclared --exchange amq.default \
  --routing-key my-quorum-queue --queue my-quorum-queue \
  --flag persistent --flag mandatory \
  --pmessages 1000000 --size 12 --confirm 3000 --qos 5000 --multi-ack-every 5000

id: test-085526-136, sending rate avg: 70067 msg/s
id: test-085526-136, receiving rate avg: 70067 msg/s
id: test-085526-136, consumer latency min/median/75th/95th/99th 8803/33127/40424/53407/62883 µs
id: test-085526-136, confirm latency min/median/75th/95th/99th 8551/30323/38317/52103/63131 µs

Default Flow Control Settings

$ java -jar target/perf-test.jar \
  --predeclared --exchange amq.default \
  --routing-key my-quorum-queue --queue my-quorum-queue \
  --flag persistent --flag mandatory \
  --pmessages 1000000 --size 12 --confirm 192 --qos 5000 --multi-ack-every 5000

id: test-084359-441, sending rate avg: 9931 msg/s
id: test-084359-441, receiving rate avg: 9931 msg/s
id: test-084359-441, consumer latency min/median/75th/95th/99th 7512/17054/26256/34249/38641 µs
id: test-084359-441, confirm latency min/median/75th/95th/99th 9432/16586/23918/32636/36858 µs

These results are similar to those of the AMQP 1.0 in 4.0 benchmark with default flow control settings because both benchmarks are bottlenecked by my slow disk.

Summary

Figure 2: Quorum queue end-to-end message rate

Streams

This section benchmarks streams.

We declare a stream called my-stream:

deps/rabbitmq_management/bin/rabbitmqadmin declare queue \
  name=my-stream queue_type=stream durable=true

(We run with default RabbitMQ flow control settings.)

We want the receiver to start consuming from the very beginning of the stream.
Quiver doesn’t support passing a filter field on the source where we could set a rabbitmq:stream-offset-spec value of first.
Therefore, for this benchmark it’s easier to patch RabbitMQ to use the stream offset spec first by default instead of next:

git diff
diff --git a/deps/rabbit/src/rabbit_stream_queue.erl b/deps/rabbit/src/rabbit_stream_queue.erl
index e36ad708eb..acd193d76f 100644
--- a/deps/rabbit/src/rabbit_stream_queue.erl
+++ b/deps/rabbit/src/rabbit_stream_queue.erl
@@ -344,7 +344,7 @@ consume(Q, Spec, #stream_client{} = QState0)
{term(), non_neg_integer()}) ->
{ok, osiris:offset_spec()} | {error, term()}.
parse_offset_arg(undefined) ->
- {ok, next};
+ {ok, first};
parse_offset_arg({_, <<"first">>}) ->
{ok, first};
parse_offset_arg({_, <<"last">>}) ->
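
As a side note, AMQP 1.0 clients that do support source filters can request the offset spec per receiver instead of patching the broker. Here is a sketch with the Azure/go-amqp client, assuming the rabbitmq:stream-offset-spec filter understood by RabbitMQ streams (quiver was still used for the numbers below):

package main

import (
    "context"
    "log"

    "github.com/Azure/go-amqp"
)

func main() {
    ctx := context.TODO()
    conn, err := amqp.Dial(ctx, "amqp://localhost",
        &amqp.ConnOptions{SASLType: amqp.SASLTypeAnonymous()})
    if err != nil {
        log.Fatal("open connection:", err)
    }
    session, err := conn.NewSession(ctx, nil)
    if err != nil {
        log.Fatal("begin session:", err)
    }
    receiver, err := session.NewReceiver(ctx, "/queues/my-stream",
        &amqp.ReceiverOptions{
            Credit: 5000,
            Filters: []amqp.LinkFilter{
                // "first" starts at the beginning of the stream;
                // "last" and "next" (the default) are also valid offset specs.
                amqp.NewLinkFilter("rabbitmq:stream-offset-spec", 0, "first"),
            },
        })
    if err != nil {
        log.Fatal("attach receiver:", err)
    }
    for {
        msg, err := receiver.Receive(ctx, nil)
        if err != nil {
            log.Fatal("receive:", err)
        }
        if err := receiver.AcceptMessage(ctx, msg); err != nil {
            log.Fatal("accept:", err)
        }
    }
}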

AMQP 1.0 in 4.0

# quiver //host.docker.internal//queues/my-stream \
  --durable --count 1m --duration 10m --body-size 12 --credit 5000

---------------------- Sender ----------------------- --------------------- Receiver ---------------------- --------
Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Lat [ms]
----------------------------------------------------- ----------------------------------------------------- --------
2.1 278,782 139,321 25 8.0 2.1 215,185 107,539 22 7.6 224
4.1 554,492 137,717 25 8.0 4.1 434,027 109,312 24 7.6 651
6.1 825,082 135,160 25 8.0 6.1 650,236 107,997 26 7.6 1,079
8.1 999,992 87,368 17 0.0 8.1 888,973 119,249 29 7.6 1,469
- - - - - 10.1 999,993 55,455 13 0.0 1,583

RESULTS

Count ............................................. 1,000,000 messages
Duration ................................................ 8.9 seconds
Sender rate ......................................... 136,705 messages/s
Receiver rate ....................................... 112,587 messages/s
End-to-end rate ..................................... 112,196 messages/s

Latencies by percentile:

0% ........ 7 ms 90.00% ..... 1553 ms
25% ...... 519 ms 99.00% ..... 1612 ms
50% ..... 1011 ms 99.90% ..... 1615 ms
100% ..... 1616 ms 99.99% ..... 1616 ms

The stream delivers substantially higher throughput than the classic and quorum queue benchmarks above.

Note that end-to-end latencies are very high simply because the sender can write into the stream faster than RabbitMQ can dispatch messages to the consumer (“receiver” in quiver terms).

AMQP 1.0 in 3.13

# quiver //host.docker.internal//amq/queue/my-stream \
  --durable --count 1m --duration 10m --body-size 12 --credit 5000

---------------------- Sender ----------------------- --------------------- Receiver ---------------------- --------
Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Time [s] Count [m] Rate [m/s] CPU [%] RSS [M] Lat [ms]
----------------------------------------------------- ----------------------------------------------------- --------
2.1 196,350 98,077 12 70.1 2.1 4,094 2,045 0 7.7 195
4.1 392,956 98,205 13 138.5 4.1 4,094 0 0 7.7 0
6.1 524,026 65,470 10 196.5 6.1 4,094 0 0 7.7 0
8.1 655,096 65,470 11 259.4 8.1 4,094 0 0 7.7 0
10.1 786,166 65,470 10 307.5 10.1 4,094 0 0 7.7 0
receiver timed out
12.1 917,236 65,470 9 355.5 12.1 4,094 0 0 7.7 0

RabbitMQ 3.13 cannot handle this workload and the benchmark fails.

AMQP 0.9.1 in 4.0

$ java -jar target/perf-test.jar \
  --predeclared --exchange amq.default \
  --routing-key my-stream --queue my-stream \
  --flag persistent --flag mandatory \
  --pmessages 1000000 --size 12 --confirm 192 --qos 5000 --multi-ack-every 5000

id: test-104223-225, sending rate avg: 88912 msg/s
id: test-104223-225, receiving rate avg: 88912 msg/s
id: test-104223-225, consumer latency min/median/75th/95th/99th 701/1340/1523/2500/4524 µs
id: test-104223-225, confirm latency min/median/75th/95th/99th 788/1983/2130/2437/2970 µs

Since streams store messages in AMQP 1.0 format, this workload requires RabbitMQ to translate each message between AMQP 0.9.1 and AMQP 1.0.
This explains why stream throughput is lower when using AMQP 0.9.1 clients compared to AMQP 1.0 clients.
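
For completeness, an AMQP 0.9.1 consumer can choose where to start reading in the stream via the x-stream-offset consumer argument, so the broker patch above is not needed on this path. The following amqp091-go sketch matches the benchmark's prefetch of 5,000 and cumulative acknowledgements every 5,000 messages; everything else is illustrative:

package main

import (
    "log"

    amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
    conn, err := amqp.Dial("amqp://guest:guest@localhost")
    if err != nil {
        log.Fatal("open connection:", err)
    }
    ch, err := conn.Channel()
    if err != nil {
        log.Fatal("open channel:", err)
    }
    if err := ch.Qos(5000, 0, false); err != nil { // matches --qos 5000
        log.Fatal("qos:", err)
    }
    deliveries, err := ch.Consume(
        "my-stream", // queue
        "",          // consumer tag
        false,       // autoAck must be false for streams
        false, false, false,
        amqp.Table{"x-stream-offset": "first"},
    )
    if err != nil {
        log.Fatal("consume:", err)
    }
    n := 0
    for d := range deliveries {
        n++
        if n%5000 == 0 {
            _ = d.Ack(true) // cumulative ack, like --multi-ack-every 5000
        }
    }
}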

Summary

Figure 3: Stream end-to-end message rate

Many Connections

This section compares memory usage of connecting 40,000 clients with two AMQP 1.0 sessions / AMQP 0.9.1 channels per connection.

Setup

make run-broker \
  TEST_TMPDIR="$HOME/scratch/rabbit/test" \
  RABBITMQ_CONFIG_FILE="$HOME/scratch/rabbit/rabbitmq.conf" \
  PLUGINS="rabbitmq_amqp1_0" \
  RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+P 3000000 +S 6" \
  ERL_MAX_PORTS=3000000

In the following rabbitmq.conf, we use small buffer sizes to better compare the memory usage of the protocol implementations.

tcp_listen_options.sndbuf = 2048
tcp_listen_options.recbuf = 2048
vm_memory_high_watermark.relative = 0.95
vm_memory_high_watermark_paging_ratio = 0.95
loopback_users = none

AMQP 1.0

package main

import (
    "context"
    "log"
    "time"

    "github.com/Azure/go-amqp"
)

func main() {
    for i := 0; i < 40_000; i++ {
        if i%1000 == 0 {
            log.Printf("opened %d connections", i)
        }
        conn, err := amqp.Dial(
            context.TODO(),
            "amqp://localhost",
            &amqp.ConnOptions{SASLType: amqp.SASLTypeAnonymous()})
        if err != nil {
            log.Fatal("open connection:", err)
        }
        _, err = conn.NewSession(context.TODO(), nil)
        if err != nil {
            log.Fatal("begin session:", err)
        }
        _, err = conn.NewSession(context.TODO(), nil)
        if err != nil {
            log.Fatal("begin session:", err)
        }
    }
    log.Println("opened all connections")
    time.Sleep(5 * time.Hour)
}

AMQP 0.9.1

package main

import (
    "log"
    "time"

    amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
    for i := 0; i < 40_000; i++ {
        if i%1000 == 0 {
            log.Printf("opened %d connections", i)
        }
        conn, err := amqp.Dial("amqp://guest:guest@localhost")
        if err != nil {
            log.Fatal("open connection:", err)
        }
        _, err = conn.Channel()
        if err != nil {
            log.Fatal("open channel:", err)
        }
        _, err = conn.Channel()
        if err != nil {
            log.Fatal("open channel:", err)
        }
    }
    log.Println("opened all connections")
    time.Sleep(5 * time.Hour)
}

AMQP 1.0 in 4.0

The examples below directly invoke erlang:memory/0 on the node, a function that returns the memory size in bytes for each memory type.

Tip: To retrieve the same information from a running node, use rabbitmq-diagnostics like so:

rabbitmq-diagnostics -s memory_breakdown

This command can format the numbers using different information units (e.g. MiB, GiB) and supports JSON output with --formatter=json:

# pipes the output to `jq` for more readable formatting
rabbitmq-diagnostics -s memory_breakdown --formatter=json | jq

Here are the runtime-reported memory footprint numbers:

1> erlang:memory().
[{total,5330809208},
{processes,4788022888},
{processes_used,4787945960},
{system,542786320},
{atom,999681},
{atom_used,974364},
{binary,194810368},
{code,19328950},
{ets,94161808}]

2> erlang:system_info(process_count).
360312

AMQP 1.0 in 3.13

To compare, the runtime-reported memory footprint numbers in this test are:

1> erlang:memory().
[{total,12066294144},
{processes,11156497904},
{processes_used,11156461208},
{system,909796240},
{atom,1089809},
{atom_used,1062780},
{binary,192784464},
{code,22068126},
{ets,318872128}]

2> erlang:system_info(process_count).
1480318

We observe that the memory usage of processes in RabbitMQ 3.13 is 11.2 GB compared to only 4.8 GB in RabbitMQ 4.0 (a reduction of about 57%).
As explained in the previous blog post, the RabbitMQ 3.13 implementation of AMQP 1.0 is resource heavy because each AMQP 1.0 session in the plugin includes an AMQP 0.9.1 client and maintains AMQP 0.9.1 state.

AMQP 0.9.1 in 4.0

1> erlang:memory().
[{total,5409763512},
{processes,4716150248},
{processes_used,4715945080},
{system,693613264},
{atom,991489},
{atom_used,962578},
{binary,187229040},
{code,19118766},
{ets,235605424}]

2> erlang:system_info(process_count).
600314

Summary

Figure 4: Memory usage of 40,000 connections and 80,000 sessions / channels

Conclusion

This blog post demonstrated that the new native AMQP 1.0 implementation in RabbitMQ 4.0 performs multiple times better than AMQP 1.0 in RabbitMQ 3.13.

We also observed that AMQP 1.0 can perform better than AMQP 0.9.1.
However, it’s challenging to provide a fair comparison.
This blog post used an AMQP 1.0 client written in C and an AMQP 0.9.1 client written in Java.
Therefore, we do not claim or promise that you will observe better throughput with your AMQP 1.0 workloads.
The AMQP 0.9.1 implementation in RabbitMQ performs well since it has been stable and optimized for over 15 years.

Use cases where AMQP 1.0 will likely outperform AMQP 0.9.1 include:

  • Sending to or receiving from a stream because a stream encodes messages in AMQP 1.0 format (as covered in this blog post).
  • Leveraging queue locality using the RabbitMQ AMQP 1.0 Java client. (This feature will be covered separately.)