Wistron Demonstrates PCIe Gen4 on Power9
Published on Wednesday 20 June 2018
By Wistron Corporation
Wistron P9 products with PCIe Gen4
For today’s complex HPC, Enterprise, and Data-Center workloads, the need for high-speed I/O is paramount – which is why PCIe Gen4 is one of the main features of the Wistron POWER9 product portfolio. To demonstrate the impact of PCIe Gen4 on system performance, we compared it to PCIe Gen3 performance on both POWER9 and x86 systems.
Mellanox ConnectX-5 100G/EDR dual-port InfiniBand Adapter
First, we needed an add-in card that supports both PCIe Gen4 and Gen3. Given driver readiness on both the x86 and OpenPOWER platforms, we selected the Mellanox ConnectX-5 (Figure 1) for the test. The theoretical bandwidth is shown below in Table 1:
Interface | PCIe Gen4 x16 | ConnectX-5 100G dual-port | PCIe Gen3 x16 |
Formula | 16 Gb/s * 16 | 100 Gb/s * 2 | 8 Gb/s * 16 |
Bandwidth | 256 Gb/s | 200 Gb/s | 128 Gb/s |
Table 1. Theoretical bandwidth
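The arithmetic behind Table 1 can be reproduced with a short sketch. The function and variable names below are illustrative assumptions, not part of any Wistron or Mellanox tooling, and the per-lane rates are the raw figures quoted in the table, with no allowance for encoding or protocol overhead.

```python
# Raw per-lane rates as quoted in Table 1 (Gb/s per lane, overhead ignored).
PCIE_LANE_RATE_GBPS = {"gen3": 8, "gen4": 16}


def pcie_slot_bandwidth_gbps(generation: str, lanes: int) -> int:
    """Raw slot bandwidth: per-lane rate multiplied by the lane count."""
    return PCIE_LANE_RATE_GBPS[generation] * lanes


def adapter_bandwidth_gbps(port_rate_gbps: int, ports: int) -> int:
    """Raw adapter bandwidth: per-port rate multiplied by the port count."""
    return port_rate_gbps * ports


def link_ceiling_gbps(slot_gbps: int, adapter_gbps: int) -> int:
    """The slower of the slot and the adapter bounds the achievable rate."""
    return min(slot_gbps, adapter_gbps)


gen4_x16 = pcie_slot_bandwidth_gbps("gen4", 16)
gen3_x16 = pcie_slot_bandwidth_gbps("gen3", 16)
cx5_dual = adapter_bandwidth_gbps(100, 2)

for name, slot in (("Gen4 x16", gen4_x16), ("Gen3 x16", gen3_x16)):
    ceiling = link_ceiling_gbps(slot, cx5_dual)
    print(f"{name}: slot {slot} Gb/s, adapter {cx5_dual} Gb/s, ceiling {ceiling} Gb/s")
```

On these raw figures, a Gen3 x16 slot (128 Gb/s) is narrower than the dual-port adapter (200 Gb/s), so the slot is the limit. A Gen4 x16 slot (256 Gb/s) is wider than the adapter, so the adapter becomes the limit.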
Figure 1. Mellanox ConnectX-5 100G/EDR dual-port InfiniBand Adapter
Hardware Setup
We mounted a ConnectX-5 card in the target PCIe slot of each OpenPOWER P9 and x86 system and connected the EDR ports individually with Mellanox 100G cables. The detailed configuration of our test is shown below in Table 2.
System | Wistron P91D2-2P-48 * 2 | Sugon I620-G30 |
CPU | P9 Sforza 20core (160W) * 2 | Intel 8153 16core (125W) * 2 |
Memory | 256GB | 256GB |
IB adapter | Mellanox ConnectX-5 | Mellanox ConnectX-5 |
OS | RHEL 7.5 | RHEL 7.5 |
OFED | MLNX_OFED_LINUX-4.3-3.0.2.1 | MLNX_OFED_LINUX-4.3-3.0.2.1 |
Table 2. Configuration between OpenPOWER P9 and x86 systems
Bandwidth Average Result
Once OFED® is installed, its bundled commands are available in the OS. We ran the “ib_write_bw” command on all links at the same time to measure the average I/O bandwidth of each link, then summarized the results. To reach the upper limit of the clients, we used the P9 Gen4 system as the server and connected different clients to it. The test results are shown below in Figure 2:
Figure 2. Bandwidth Average of P9 Gen4, P9 Gen3 and x86 Gen3
The I/O bandwidth results met our expectations. With both ports connected from the P9 Gen4 slot, throughput reaches 96.6% of the theoretical bandwidth of PCIe Gen4. When the P9 Gen4 system acts as the server and connects to PCIe Gen3 ports on the P9 and x86 platforms, P9 still performs better, around 10% higher than the x86 platform.
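The way per-link averages are combined and compared against a theoretical ceiling can be sketched as follows. This is a minimal illustration under stated assumptions: the per-link averages are simply added together, the helper names are hypothetical, and the sample readings are placeholders rather than values measured in this test. The ceiling is passed in as a parameter because the reference figure depends on which limit is being compared against.

```python
from typing import Sequence


def aggregate_bandwidth_gbps(per_link_avg_gbps: Sequence[float]) -> float:
    """Add the average bandwidth reported for each concurrently tested link."""
    return sum(per_link_avg_gbps)


def utilization(measured_gbps: float, theoretical_gbps: float) -> float:
    """Fraction of a theoretical ceiling that the measured bandwidth reaches."""
    if theoretical_gbps <= 0:
        raise ValueError("theoretical bandwidth must be positive")
    return measured_gbps / theoretical_gbps


# Hypothetical placeholder readings for two links (Gb/s), NOT measured results.
example_links = [90.0, 90.0]
# Assumed ceiling for the example: the dual-port adapter rate from Table 1.
example_ceiling = 200.0

total = aggregate_bandwidth_gbps(example_links)
share = utilization(total, example_ceiling)
print(f"aggregate {total:.1f} Gb/s, {share:.1%} of the {example_ceiling:.0f} Gb/s ceiling")
```

To apply this to real runs, replace the placeholder list with the average bandwidth that each “ib_write_bw” instance reports for its link.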
Latency Result
In the latency portion of our test, since most users still use x86 Gen3 as the client, we set up different servers and re-ran the test on a single link with another command, “ib_write_lat”. The latency for 2-byte messages is shown below in Figure 3:
Figure 3. Latency Result of P9 Gen4 and x86 Gen3
Conclusion
In this test, we set out to give users a picture of how PCIe Gen4 improves performance using a real device on a real system instead of theoretical calculations. Although P9 Gen4 shows no significant improvement in latency, it provides superior overall bandwidth. With nearly double the bandwidth, users can achieve a better ROI and a lower TCO by using a single high-speed Gen4-capable network adapter instead of two Gen3 adapters in each system.
For more information, please contact: EBG_sales@wistron.com
About Wistron
As a long-standing partner of IBM, Wistron draws on more than 10 years of PowerPC design and manufacturing experience to offer robust services across diverse technical platforms. Wistron provides tailored, flexible business models, from barebones to rack integration delivery, to meet various business needs.