Elastic Compute Service: Use NetACC to accelerate TCP applications

Last Updated: Oct 16, 2024

Network Accelerator (NetACC) lets you use elastic Remote Direct Memory Access (eRDMA) to accelerate network communication for TCP applications that require low latency and high throughput, without modifying application code. This topic describes NetACC and how to use it.


NetACC is in public preview.

Introduction to NetACC

NetACC is a user-mode network acceleration library that leverages the benefits of eRDMA, such as low latency and high throughput, and uses compatible socket interfaces to accelerate existing TCP applications.


NetACC is suitable for scenarios that involve high network overheads.

  • High packets per second (PPS) scenarios, especially workloads that send and receive a large number of small packets: NetACC can reduce CPU overhead and improve system throughput, for example, when Redis processes requests.

  • Network latency-sensitive scenarios: eRDMA provides lower network latency than TCP to accelerate network responses.

  • Repeated creation of short-lived connections: NetACC can accelerate the establishment of subsequent connections to reduce the connection setup time and improve system performance.

Install NetACC

  • Installation methods

    • Use the eRDMA driver to install NetACC

      When you install the eRDMA driver, NetACC is automatically installed. For information about how to install the eRDMA driver, see the Configure eRDMA on an existing ECS instance section of the "Configure eRDMA on an enterprise-level instance" topic.

    • Separately install NetACC

      If you want to use a specific version of NetACC or use NetACC only temporarily on an Elastic Compute Service (ECS) instance, run the following command on the instance to install NetACC separately:

      sudo curl -fsSL | sudo sh
  • Configuration file

    After you install NetACC, the /etc/netacc.conf configuration file is automatically generated. You can configure the parameters of NetACC, such as NACC_SOR_IO_THREADS and NACC_LOG_PATH, in the configuration file. The following sample shows how to configure the parameters in the configuration file:

    Sample /etc/netacc.conf configuration file

    # The size of a buffer. If a data block to be sent is large, you can increase the size to improve performance or reduce the size to save memory. 
    # int
    # The size of the first memory region (MR) registered by RDMA. You can reduce the size to save memory.
    # The value must be the NACC_SOR_MSG_SIZE value multiplied by a power of 2. The minimum value is 1.
    # The maximum size of an MR registered by RDMA. Valid values: 1 to 512. Unit: MB. You can reduce the size to save memory.
    # The value must be the NACC_RDMA_MR_MIN_INC_SIZE value multiplied by a power of 2. Valid values: 1 to 512. Unit: MB.
    # The number of links that can reuse a queue pair (QP). You can increase the value to improve performance. In specific scenarios, set this parameter to 1.
    # int
    # The number of NetACC threads. If the throughput is high, increase the value.
    # int
    # The expiration time of empty QPs. Unit: milliseconds. A value of 0 specifies that the empty QPs immediately expire. A value of -1 specifies that the empty QPs never expire.
    # The total number of empty QPs allowed.
    # The total number of empty QPs allowed for each destination address.
    # The probability of using RDMA to establish connections. Valid values: 0 to 100.
    # Specifies whether RDMA is enabled by default.
    # The log level.
    # 0: TRACE
    # 1: DEBUG
    # 2: INFO
    # 3: WARN
    # 4: ERROR
    # 5: FATAL
    # The log path.
    # The following parameters are infrequently used or do not need to be configured.
    # The thread affinity.
    # string
    # Specifies whether to preferentially use TCP to establish a connection.
    # bool
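To check which value a parameter currently has, a small helper can read it from the configuration file. The following sketch assumes that parameters are written as KEY=VALUE lines and that lines starting with # are comments. Both the syntax and the helper name netacc_conf_get are assumptions made for illustration and are not part of NetACC.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetACC): print the value of one parameter
# from a configuration file. Assumes KEY=VALUE lines; lines that start with #
# are treated as comments. If a key occurs more than once, the last value wins.
netacc_conf_get() {
    conf_file=$1
    key=$2
    LC_ALL=C awk -F= -v key="$key" '
        /^[ \t]*#/ { next }              # skip comment lines
        index($0, "=") > 0 {
            name = $1
            gsub(/[ \t]/, "", name)      # tolerate spaces around the key
            if (name == key) {
                value = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = value
                seen = 1
            }
        }
        END { if (seen) print found; else exit 1 }
    ' "$conf_file"
}
```

For example, netacc_conf_get /etc/netacc.conf NACC_SOR_IO_THREADS prints the configured number of NetACC threads, and the function returns a non-zero status if the parameter is not set in the file.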

Use NetACC

You can use NetACC in applications by running the netacc_run command or configuring the LD_PRELOAD environment variable.

Run the netacc_run command

netacc_run is a tool that loads NetACC when an application starts. Prefix the command that starts your application (COMMAND) with netacc_run to start the application with NetACC loaded.

netacc_run provides multiple parameters to tune the performance of NetACC. For example, -t specifies the number of I/O threads, and -p specifies the maximum number of connections that can share a QP. The parameters that you specify on the netacc_run command line override the parameters in the configuration file.

netacc_run command parameters

netacc_run -h
Usage: netacc_run [ OPTIONS ] COMMAND

Run COMMAND using NetACC for TCP sockets

   -f <path>   set config file, default /etc/netacc.conf
   -p <num>    set max connections per QP, default 1
   -t <num>    set netacc io threads, default 4
   -s <num>    set netacc message size, default 16384
   -F <num>    fast connect mode, default 0
   -d          enable debug mode
   -T          use TCP first in connect
   -P <num>    polling cq time ms
   -A <str>    affinity CPU list, 0 | 1-3 | 1,3,4
   -i <num>    set cq comp_vector, default 0
   -h          display this message
   -v          display version info
  • Examples:

    In the following examples, Redis applications are used. Add netacc_run before a Redis command to start a Redis application and load NetACC at the same time.

    • Run the following command to start the Redis service and load NetACC at the same time:

      netacc_run redis-server
    • Run the following command to start the redis-benchmark utility and load NetACC at the same time:

      netacc_run redis-benchmark
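The options from the help output can be combined with the application command in the same way. The following sketch wraps this in a small function. The -t and -p options are taken from the netacc_run help output above; the function name run_accelerated, the DRY_RUN variable, and the values used in the example are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: start COMMAND under netacc_run with explicit options.
#   $1 = number of NetACC I/O threads        (netacc_run -t)
#   $2 = maximum number of connections per QP (netacc_run -p)
# If DRY_RUN is set to a non-empty value, the command is printed instead of run.
run_accelerated() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_accelerated <io_threads> <conns_per_qp> COMMAND [ARGS...]" >&2
        return 2
    fi
    io_threads=$1
    conns_per_qp=$2
    shift 2
    if [ -n "${DRY_RUN:-}" ]; then
        echo netacc_run -t "$io_threads" -p "$conns_per_qp" "$@"
        return 0
    fi
    netacc_run -t "$io_threads" -p "$conns_per_qp" "$@"
}
```

For example, DRY_RUN=1 run_accelerated 8 4 redis-server --port 6379 prints the command line that would be run, which is useful for checking the options before you start the application.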

Configure the LD_PRELOAD environment variable

The LD_PRELOAD environment variable specifies the shared libraries that are preloaded when a program starts. To automate the loading of NetACC, specify NetACC in the value of the LD_PRELOAD environment variable in the relevant script.

  1. Run the following command to query the location of the NetACC dynamic library:

    ldconfig -p | grep netacc

    The following command output is returned.


  2. Run the following command to configure the LD_PRELOAD environment variable to specify the preloaded shared libraries:

    LD_PRELOAD=/lib64/ your_application

    Replace your_application with the application that you want to accelerate.

    Examples: In the following examples, Redis applications are used.

    • Run the following command to start the Redis service and load NetACC at the same time:

      LD_PRELOAD=/lib64/ redis-server
    • Run the following command to start the redis-benchmark utility and load NetACC at the same time:

      LD_PRELOAD=/lib64/ redis-benchmark

    Configure the LD_PRELOAD environment variable in a script

    If you frequently use NetACC to accelerate an application or you want to use a script to manage multiple applications and accelerate the applications by using NetACC on application startup, you can configure the LD_PRELOAD environment variable in the script. For example, you can create a script named run_with_netacc.

    LD_PRELOAD=/lib64/ $@

    Run the following command to start an application and load NetACC at the same time:

    ./ your_application

    Examples: In the following examples, Redis applications are used.

    • Run the following command to start the Redis service and load NetACC at the same time:

      ./ redis-server
    • Run the following command to start the redis-benchmark utility and load NetACC at the same time:

      ./ redis-benchmark
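A slightly more defensive variant of such a wrapper is sketched below. It reads the library path from an environment variable named NETACC_LIB, which is an assumption made for this example (set it to the path returned by ldconfig -p | grep netacc), checks that the file exists, and passes all arguments through as "$@" so that arguments that contain spaces are preserved.

```shell
#!/bin/sh
# Sketch of a preload wrapper. NETACC_LIB is a hypothetical variable that must
# hold the full path of the NetACC dynamic library, as reported by:
#   ldconfig -p | grep netacc
preload_run() {
    if [ -z "${NETACC_LIB:-}" ] || [ ! -f "$NETACC_LIB" ]; then
        echo "NETACC_LIB is not set or does not point to a file" >&2
        return 1
    fi
    if [ "$#" -eq 0 ]; then
        echo "usage: preload_run COMMAND [ARGS...]" >&2
        return 2
    fi
    # LD_PRELOAD is set only for the started command, not for the calling shell.
    LD_PRELOAD="$NETACC_LIB" "$@"
}
```

Because LD_PRELOAD is set only in front of the started command, other programs launched from the same shell are not affected.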

Monitor NetACC

netacc_ss is a monitoring tool provided by NetACC. You can run the netacc_ss command to monitor the status of data sent and received by a NetACC-accelerated process. You can run the command on a server and a client to monitor NetACC.

netacc_ss command

netacc_ss -h
 netacc_ss: [-p] <pid> [options]...
 Show monitoring information of specified netacc process

 -c   clear unused sock file
 -h   display this help
 -s   display specified monitoring metric[s]. [all|cfg|cnt|mem|qp|sock]
      all: all monitoring information
      cfg: configuration information
      cnt: counter information[default]
      mem: memory information
      qp : queue pair information
      sock: socket information
 -v   display netacc version

 netacc_ss -p 12345 -s mem,cnt

Run the following command to monitor the status of data sent and received by a NetACC-accelerated process:

netacc_ss -s all -p <Process ID>

To query the ID of a process, run the ps -ef | grep <Process name> command.
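The two steps, looking up the process ID and then querying it, can be combined. The following sketch checks the requested metrics against the list in the help output and then calls netacc_ss. The helper names and the use of pgrep to find the process are assumptions made for illustration.

```shell
#!/bin/sh
# Return 0 if every comma-separated metric is listed in the netacc_ss help output.
valid_metrics() {
    old_ifs=$IFS
    IFS=,
    for m in $1; do
        case $m in
            all|cfg|cnt|mem|qp|sock) ;;
            *) IFS=$old_ifs; return 1 ;;
        esac
    done
    IFS=$old_ifs
    [ -n "$1" ]
}

# Hypothetical helper: monitor the oldest process that has exactly the given name.
#   $1 = process name, $2 = metrics (optional, comma-separated)
monitor_by_name() {
    name=$1
    metrics=${2:-cnt}    # cnt is the default metric according to the help output
    valid_metrics "$metrics" || { echo "unknown metric in: $metrics" >&2; return 2; }
    pid=$(pgrep -o -x "$name") || { echo "no process named $name" >&2; return 1; }
    netacc_ss -p "$pid" -s "$metrics"
}
```

For example, monitor_by_name redis-server mem,cnt shows the memory and counter information of a running redis-server process.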

Use NetACC in Redis applications

Benefits of NetACC for Redis applications

  • Improved system throughput

    NetACC is suitable for scenarios in which Redis processes a large number of requests per second. In these scenarios, NetACC reduces CPU overhead and improves system throughput.

  • Accelerated network responses

    NetACC leverages the low latency benefit of eRDMA to significantly accelerate network responses to Redis applications.

NetACC used in Redis performance benchmarks

redis-benchmark is the benchmark utility that is included with Redis. It measures the performance of the Redis server under various workloads by simulating multiple clients that concurrently send requests to the Redis server.

Test scenario

Use NetACC in the redis-benchmark utility to simulate 100 clients and 4 threads to make 5 million SET requests.

Common parameters used with the redis-server command

The redis-server command is used to start the Redis server. You can run the redis-server -h command to view the parameters that you can use with the redis-server command. Take note of the parameters in the following sample redis-server command:

redis-server --port 6379 --protected-mode no
  • --port 6379: The --port parameter specifies the port on which the Redis server is started. Default value: 6379. If you do not specify the parameter, the default value is used. In this example, the parameter is set to 6379.

  • --protected-mode no: The --protected-mode parameter specifies whether to enable protected mode for the Redis server. Protected mode is a security feature of Redis. When protected mode is enabled, the Redis server accepts connections only from clients that run on the local host (localhost) and rejects all connections from external hosts. A value of no specifies that the Redis server accepts connections from all IP addresses.


    If you disable protected mode in a production environment, the production environment may be exposed to security risks. Proceed with caution in an open network environment.

Common command parameters used with redis-benchmark

redis-benchmark is a stress testing tool provided by Redis to test the performance of Redis by simulating multiple clients to send a large number of requests. You can run the redis-benchmark --help command to view the parameters that you can use with the redis-benchmark command. Take note of the parameters in the following sample redis-benchmark command:

redis-benchmark -h -p 6379 -c 100 -n 5000000 -r 10000 --threads 4 -d 512 -t set
  • -h: The -h parameter specifies the hostname or IP address of the Redis server. In this example, the -h parameter is set to the private IP address of the ECS instance that serves as the Redis server.

  • -p 6379: The -p parameter specifies the port on which Redis is started. Default value: 6379. If Redis is started on port 6379, you do not need to specify this parameter. If Redis is started on a different port, set this parameter to the number of the port.


    You can run the sudo grep "^port" /<Path in which the redis.conf file is stored>/redis.conf command to query the port on which Redis is started. By default, the redis.conf file is stored in the /etc/redis.conf path.

  • -c 100: The -c parameter specifies the number of concurrent connections (clients). In this example, the -c parameter is set to 100.

  • -n 5000000: The -n parameter specifies the total number of requests to make. In this example, the -n parameter is set to 5000000.

  • -r 10000: The -r parameter specifies a range of random keys to use. In this example, the -r parameter is set to 10000, which specifies that the SET command uses random integers from 0 to 9999 as part of the keys in the benchmark.

  • --threads 4: The --threads parameter specifies the number of threads. In this example, the --threads parameter is set to 4. By default, redis-benchmark uses only one thread to run a benchmark. However, specific systems allow redis-benchmark to use multiple threads to simulate concurrency.

  • -d 512: The -d parameter specifies the data size of each SET request in bytes. In this example, the -d parameter is set to 512.

  • -t set: The -t parameter restricts the benchmark to a subset of tests and is followed by the names of the commands to test. In this example, the value set specifies that only the SET command is benchmarked.

The preceding sample command uses four threads and a total of 100 concurrent connections to send 5 million SET requests to the Redis server. Each SET request carries a 512-byte value and uses a random integer from 0 to 9999 as part of the key.
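As a rough check of what this workload amounts to, the following sketch derives two figures from the parameters of the sample command: the total size of the values that are written and the average number of requests per connection. It only does arithmetic on the -n, -c, and -d values and ignores key and protocol overhead.

```shell
#!/bin/sh
requests=5000000   # -n: total number of requests
clients=100        # -c: number of concurrent connections
value_bytes=512    # -d: size of each SET value in bytes

total_payload_bytes=$((requests * value_bytes))
requests_per_client=$((requests / clients))

echo "total SET payload: $total_payload_bytes bytes"
echo "average requests per connection: $requests_per_client"
```

With these values, the benchmark writes 2,560,000,000 bytes of values in total, and each connection sends 50,000 requests on average.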

Common metrics in redis-benchmark benchmark results

  • Throughput Summary:

    rps: the number of requests that the Redis server can process per second during the benchmark. For example, 332933.81 requests per second indicates that the Redis server can process 332,934 requests per second.

  • Latency Summary: Unit: milliseconds.

    • avg: the average latency, which is the average response time across all requests.

    • min: the minimum latency, which is the minimum response time across all requests.

    • p50: the 50th percentile, which indicates that 50% of requests completed with a latency at or below this value.

    • p95: the 95th percentile, which indicates that 95% of requests completed with a latency at or below this value.

    • p99: the 99th percentile, which indicates that 99% of requests completed with a latency at or below this value.

    • max: the maximum latency, which is the maximum response time across all requests.
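To make the percentile definitions concrete, the following sketch computes a percentile from a list of latency samples by using the nearest-rank method. This is an illustration only: redis-benchmark computes its summary internally and may use a different method, and the function name percentile is an assumption.

```shell
#!/bin/sh
# Print the P-th percentile (nearest-rank method) of the numbers read from
# stdin, one number per line. $1 = P, for example 50, 95, or 99.
percentile() {
    LC_ALL=C sort -n | LC_ALL=C awk -v p="$1" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            rank = p * NR / 100
            idx = int(rank)
            if (idx < rank) idx++    # round up to the next whole rank
            if (idx < 1) idx = 1
            print v[idx]
        }'
}
```

For example, for the five samples 0.3, 0.1, 0.2, 0.5, and 0.4 (one per line), percentile 50 returns 0.3, because three of the five sorted samples are at or below that value.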


Create two eRDMA-capable ECS instances on the instance buy page in the ECS console. Select Auto-install eRDMA Driver and then select eRDMA Interface to enable the eRDMA Interface (ERI) feature for the primary elastic network interface (ENI). Use one ECS instance as the Redis server and the other ECS instance as a Redis client.

The ECS instances have the following configurations:

  • Image: Alibaba Cloud Linux 3

  • Instance type: ecs.g8ae.4xlarge

  • Private IP address of the primary ENI: The server and the client each have their own private IP address. In the following benchmark, specify the actual IP addresses of your instances based on your business requirements.

    • In this topic, the ERI feature is enabled for the primary ENIs of the ECS instances to perform the benchmark. The server address used in the benchmark is the private IP address of the primary ENI of the ECS instance that serves as the Redis server.

    • If you enable the ERI feature for the secondary ENIs of the ECS instances, replace the preceding IP addresses with the private IP addresses of the secondary ENIs. For more information, see the Configure eRDMA when you create an ECS instance section of the "Configure eRDMA on an enterprise-level instance" topic.

Example on how to configure specific parameters during ECS instance creation

When you create the ECS instances, take note of the following parameters. For information about how to configure other parameters on the instance buy page, see Create an instance on the Custom Launch tab.

  • Instance and Image: Select an instance type that supports eRDMA and an image. For more information, see the Limits section in this topic.

    Auto-install eRDMA Driver: Select this option to automatically install the eRDMA driver during ECS instance creation.


  • ENI: Select eRDMA Interface to the right of the primary ENI.


    When you create an ECS instance, you can enable the ERI feature only for the primary ENI. You can bind only one ERI to each ECS instance. If you want to use a secondary ENI to configure eRDMA on an ECS instance, create a secondary ENI for which the ERI feature is enabled and bind the ENI to the ECS instance after the ECS instance is created. For more information, see Create an ENI and Bind an ENI.



  1. Connect to the ECS instance that serves as the Redis server and the ECS instance that serves as a Redis client.

    For more information, see Connect to a Linux instance by using a password or key.

  2. Check whether the eRDMA driver is installed on the ECS instances.

    After the ECS instances start, run the ibv_devinfo command to check whether the eRDMA driver is installed.

    • The following command output indicates that the eRDMA driver is installed.


    • The following command output indicates that the eRDMA driver is being installed. The eRDMA driver requires a few minutes to install. Try again later.


  3. Run the following command on the ECS instances to install Redis:

    sudo yum install -y redis

    The following command output indicates that Redis is installed.


  4. Use the redis-benchmark utility to benchmark the performance of Redis.

    Perform a benchmark with NetACC
    1. Run the following command on the ECS instance that serves as the Redis server to start Redis and accelerate Redis by using NetACC:

      netacc_run redis-server --port 6379 --protected-mode no
      • Replace 6379 with the number of the actual port on which you want to start Redis. For more information, see the Common parameters used with the redis-server command section of this topic.

      • In this example, the netacc_run command is run to use NetACC. For other methods of using NetACC, see the Use NetACC section of this topic.

      The following command output indicates that Redis is started as expected.


    2. Run the following command on the ECS instance that serves as a Redis client to start redis-benchmark:

       netacc_run redis-benchmark -h -p 6379 -c 100 -n 5000000 -r 10000 --threads 4 -d 512 -t set
      • Specify the actual IP address of the Redis server as the value of -h, and replace 6379 with the number of the actual port on which Redis is started. For more information, see the Common command parameters used with redis-benchmark section of this topic.

      • The benchmark results may vary based on the network conditions. The benchmark data provided in this topic is only for reference.

      Sample Redis benchmark result

      ====== SET ======                                                      
        5000000 requests completed in 6.52 seconds
        100 parallel clients
        512 bytes payload
        keep alive: 1
        host configuration "save": 3600 1 300 100 60 10000
        host configuration "appendonly": no
        multi-thread: yes
        threads: 4
      Latency by percentile distribution:
      0.000% <= 0.039 milliseconds (cumulative count 3)
      50.000% <= 0.127 milliseconds (cumulative count 2677326)
      75.000% <= 0.143 milliseconds (cumulative count 3873096)
      87.500% <= 0.151 milliseconds (cumulative count 4437348)
      93.750% <= 0.159 milliseconds (cumulative count 4715347)
      96.875% <= 0.175 milliseconds (cumulative count 4890339)
      98.438% <= 0.183 milliseconds (cumulative count 4967042)
      99.609% <= 0.191 milliseconds (cumulative count 4991789)
      99.902% <= 0.207 milliseconds (cumulative count 4995847)
      99.951% <= 0.263 milliseconds (cumulative count 4997733)
      99.976% <= 0.303 milliseconds (cumulative count 4998853)
      99.988% <= 0.343 milliseconds (cumulative count 4999403)
      99.994% <= 0.367 milliseconds (cumulative count 4999704)
      99.997% <= 0.391 milliseconds (cumulative count 4999849)
      99.998% <= 2.407 milliseconds (cumulative count 4999924)
      99.999% <= 5.407 milliseconds (cumulative count 4999962)
      100.000% <= 6.847 milliseconds (cumulative count 4999981)
      100.000% <= 8.423 milliseconds (cumulative count 4999991)
      100.000% <= 8.919 milliseconds (cumulative count 4999996)
      100.000% <= 9.271 milliseconds (cumulative count 4999998)
      100.000% <= 9.471 milliseconds (cumulative count 4999999)
      100.000% <= 9.583 milliseconds (cumulative count 5000000)
      100.000% <= 9.583 milliseconds (cumulative count 5000000)
      Cumulative distribution of latencies:
      18.820% <= 0.103 milliseconds (cumulative count 941003)
      99.917% <= 0.207 milliseconds (cumulative count 4995847)
      99.977% <= 0.303 milliseconds (cumulative count 4998853)
      99.998% <= 0.407 milliseconds (cumulative count 4999879)
      99.998% <= 0.503 milliseconds (cumulative count 4999903)
      99.998% <= 0.703 milliseconds (cumulative count 4999904)
      99.998% <= 0.807 milliseconds (cumulative count 4999905)
      99.998% <= 0.903 milliseconds (cumulative count 4999906)
      99.998% <= 1.007 milliseconds (cumulative count 4999908)
      99.998% <= 1.103 milliseconds (cumulative count 4999909)
      99.998% <= 1.207 milliseconds (cumulative count 4999912)
      99.998% <= 1.407 milliseconds (cumulative count 4999913)
      99.998% <= 1.503 milliseconds (cumulative count 4999915)
      99.998% <= 1.607 milliseconds (cumulative count 4999916)
      99.998% <= 1.703 milliseconds (cumulative count 4999917)
      99.998% <= 1.807 milliseconds (cumulative count 4999918)
      99.998% <= 1.903 milliseconds (cumulative count 4999919)
      99.998% <= 2.103 milliseconds (cumulative count 4999920)
      99.999% <= 3.103 milliseconds (cumulative count 4999931)
      99.999% <= 4.103 milliseconds (cumulative count 4999944)
      99.999% <= 5.103 milliseconds (cumulative count 4999958)
      99.999% <= 6.103 milliseconds (cumulative count 4999971)
      100.000% <= 7.103 milliseconds (cumulative count 4999984)
      100.000% <= 8.103 milliseconds (cumulative count 4999989)
      100.000% <= 9.103 milliseconds (cumulative count 4999996)
      100.000% <= 10.103 milliseconds (cumulative count 5000000)
        throughput summary: 767341.94 requests per second
        latency summary (msec):
                avg       min       p50       p95       p99       max
              0.126     0.032     0.127     0.167     0.183     9.583

      The Summary section at the end of the preceding benchmark result indicates that approximately 770,000 requests can be processed per second. For information about the metrics in Redis benchmark results, see the Common metrics in redis-benchmark benchmark results section of this topic.

    Use netacc_ss to monitor the Redis server during the benchmark

    During the benchmark, you can use netacc_ss on the ECS instance that serves as the Redis server to monitor the server.

    • Run the following command to query the ID of the Redis process (redis-server):

      ps -ef | grep redis-server

      The following command output indicates that the ID of the redis-server process is 114379.


    • Run the following command to query the connection information about Redis and the status of data sent and received by Redis:

      netacc_ss -p 114379 -s all

      Replace 114379 in the preceding command with the actual Redis process ID. For more information, see the netacc_ss command section of this topic.

      The following command output indicates that the socket connection established for Redis is an RDMA connection. The fourth column from the right shows the volumes of data sent and received.


    Perform a benchmark without NetACC
    1. Run the following command on the ECS instance that serves as the Redis server to start Redis:

      redis-server --port 6379 --protected-mode no --save

      Replace 6379 with the number of the actual port on which you want to start Redis. For more information, see the Common parameters used with the redis-server command section of this topic.

      The following command output indicates that Redis is started as expected.


    2. Run the following command on the ECS instance that serves as a Redis client to start redis-benchmark:

       redis-benchmark -h -c 100 -n 5000000 -r 10000 --threads 4 -d 512 -t set
      • Specify the actual IP address of the Redis server as the value of -h. If Redis is started on a port other than the default port 6379, add the -p parameter and set it to the number of that port. For more information, see the Common command parameters used with redis-benchmark section of this topic.

      • The benchmark results may vary based on the network conditions. The benchmark data provided in this topic is only for reference.

      Sample Redis benchmark result

      ====== SET ======                                                         
        5000000 requests completed in 15.02 seconds
        100 parallel clients
        512 bytes payload
        keep alive: 1
        host configuration "save": 
        host configuration "appendonly": no
        multi-thread: yes
        threads: 4
      Latency by percentile distribution:
      0.000% <= 0.055 milliseconds (cumulative count 27)
      50.000% <= 0.287 milliseconds (cumulative count 2635010)
      75.000% <= 0.335 milliseconds (cumulative count 3782931)
      87.500% <= 0.367 milliseconds (cumulative count 4459136)
      93.750% <= 0.391 milliseconds (cumulative count 4720397)
      96.875% <= 0.415 milliseconds (cumulative count 4855130)
      98.438% <= 0.439 milliseconds (cumulative count 4936478)
      99.219% <= 0.455 milliseconds (cumulative count 4965765)
      99.609% <= 0.471 milliseconds (cumulative count 4984031)
      99.805% <= 0.487 milliseconds (cumulative count 4993326)
      99.902% <= 0.495 milliseconds (cumulative count 4995579)
      99.951% <= 0.511 milliseconds (cumulative count 4997659)
      99.976% <= 0.551 milliseconds (cumulative count 4998848)
      99.988% <= 0.599 milliseconds (cumulative count 4999468)
      99.994% <= 0.631 milliseconds (cumulative count 4999722)
      99.997% <= 0.663 milliseconds (cumulative count 4999862)
      99.998% <= 0.695 milliseconds (cumulative count 4999924)
      99.999% <= 0.759 milliseconds (cumulative count 4999964)
      100.000% <= 0.807 milliseconds (cumulative count 4999982)
      100.000% <= 1.935 milliseconds (cumulative count 4999993)
      100.000% <= 2.071 milliseconds (cumulative count 4999996)
      100.000% <= 2.111 milliseconds (cumulative count 4999998)
      100.000% <= 2.119 milliseconds (cumulative count 4999999)
      100.000% <= 2.143 milliseconds (cumulative count 5000000)
      100.000% <= 2.143 milliseconds (cumulative count 5000000)
      Cumulative distribution of latencies:
      0.028% <= 0.103 milliseconds (cumulative count 1377)
      0.985% <= 0.207 milliseconds (cumulative count 49228)
      60.094% <= 0.303 milliseconds (cumulative count 3004705)
      96.325% <= 0.407 milliseconds (cumulative count 4816230)
      99.938% <= 0.503 milliseconds (cumulative count 4996887)
      99.991% <= 0.607 milliseconds (cumulative count 4999546)
      99.999% <= 0.703 milliseconds (cumulative count 4999927)
      100.000% <= 0.807 milliseconds (cumulative count 4999982)
      100.000% <= 0.903 milliseconds (cumulative count 4999987)
      100.000% <= 1.903 milliseconds (cumulative count 4999990)
      100.000% <= 2.007 milliseconds (cumulative count 4999995)
      100.000% <= 2.103 milliseconds (cumulative count 4999997)
      100.000% <= 3.103 milliseconds (cumulative count 5000000)
        throughput summary: 332955.97 requests per second
        latency summary (msec):
                avg       min       p50       p95       p99       max
              0.292     0.048     0.287     0.399     0.447     2.143

      The Summary section at the end of the preceding benchmark result indicates that approximately 330,000 requests can be processed per second. For information about the metrics in Redis benchmark results, see the Common metrics in redis-benchmark benchmark results section of this topic.
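The two sample results can be compared directly. The following sketch extracts the throughput value from redis-benchmark output and computes the throughput ratio and the change in average latency from the numbers reported in this topic. The function names are assumptions made for illustration.

```shell
#!/bin/sh
# Print the requests-per-second value from redis-benchmark output read on stdin.
throughput_of() {
    LC_ALL=C awk '/throughput summary:/ { print $3 }'
}

# Print a / b with two decimal places.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Print the relative reduction from "before" to "after" in percent, one decimal place.
reduction_pct() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Values from the sample results in this topic.
ratio 767341.94 332955.97      # throughput with NetACC / throughput without NetACC
reduction_pct 0.292 0.126      # average latency in msec, without and with NetACC
```

For these sample runs, the throughput with NetACC is about 2.30 times the throughput without NetACC, and the average latency is about 56.8% lower. These figures apply only to the sample results shown here and vary with the environment.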