Algo-Logic's Low Latency Key-Value Store: Video Presentation

In this seven-part video presentation series, John Lockwood demonstrates our low latency Gateware Defined Networking® (GDN) solutions for the datacenter. He highlights a demonstration at Supercomputing 2016 of our network-attached, in-memory Key-Value Store performing real-time analytics on data from sensors on a drone.

Algo-Logic's Key-Value Store Provides Live Data for Analytics from Drone at Supercomputing

Algo-Logic Systems exhibited at the Supercomputing (SC16) conference in Salt Lake City from Nov. 13-18, 2016. Algo-Logic showcased the capabilities of our hardware accelerated, in-memory Key-Value Store (KVS) running on Nallatech's P385 receiving real-time sensor data from our Black Diamond IoT monitoring system.

There has been a huge change in the world: Sensor networks are now being used for mission-critical real-time data, and new systems need to operate on live data, not stored data from archives. While embedded processors were useful for non-real-time data, they don't work well with tight jitter bounds. Because processors serialize execution, they become backlogged when there is a burst of data to process. Systems that operate on archived data will struggle to implement solutions that need real-time data, such as control loops for flight control. Algo-Logic has a solution for sharing real-time data over a network: We implemented a KVS in FPGAs. The FPGA system uses parallel logic blocks instead of CPU cycles to offload processing of network packets, retrieve data from memory, and send a response over Ethernet.

1. Algo-Logic's Gateware Defined Networking® Solutions

Gateware Defined Networking®:
Gateware is similar to software in that it is fully programmable. However, unlike software, gateware compiles to fully parallel logic, allowing it to compute efficiently like hardware. Gateware solutions achieve high performance with flexibility by running in Field Programmable Gate Array (FPGA) devices.

By leveraging gateware for networking systems, Algo-Logic builds networking solutions that achieve high throughput with minimal power and sub-microsecond latency. These gateware products are deployed in enterprise networks for multiple sectors and products.

2. Key-Value Store on SCinet at Supercomputing Showing Real-time Analytics

By reducing the latency and increasing the throughput, supercomputing centers can speed up data sharing between machines. In this live demonstration, we will show an ultra-low-latency KVS implemented in FPGA hardware. We will run benchmarks showing that key/value searches performed over standard 10 or 40 Gigabit/second Ethernet (GE) achieve a fiber-to-fiber lookup latency of under half a microsecond (0.5 µs).

SCinet and R&E information:
- Just a single 120 V or 240 V outlet with low amperage is needed to run the KVS.
- For research and educational (R&E) purposes, the KVS in FPGA can be benchmarked over SCinet from any machine with 1G, 10G, or 100G Ethernet using standard Layer 2 Gigabit Ethernet switching.
- Fast KVS and Large KVS tables will be accessible over two different static IP addresses.
- KVS in FPGA will also be available over public internet for functional testing.
- Test equipment with a web GUI will display measured latency.

3. Key-Value Store Use Cases and Applications

A Key-Value Store (KVS) provides a simple and scalable means to store and retrieve distributed data. For example, telecom directories, Internet Protocol forwarding tables, and de-duplicating storage systems all need key-value tables to associate data with unique identifiers. In datacenters, high performance KVS tables allow hundreds or thousands of machines to easily share data by simply associating values with keys and allowing client machines to read and write those keys and values over standard high-speed Ethernet.
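
As an illustration of one use case named above, the following is a minimal Python sketch of a de-duplicating store that uses a key-value table to associate each data block with a unique identifier. The class name DedupStore, the choice of SHA-256 as the identifier, and the sample blocks are assumptions made for this example; they do not describe Algo-Logic's implementation.

```python
import hashlib


class DedupStore:
    """Toy de-duplicating store: each unique block is kept once, keyed by its hash."""

    def __init__(self):
        # Key: SHA-256 hex digest of the block. Value: the block bytes.
        self.table = {}

    def put(self, block: bytes) -> str:
        """Store a block and return its key; duplicate content maps to the same key."""
        key = hashlib.sha256(block).hexdigest()
        # setdefault writes the value only when the key is not already present.
        self.table.setdefault(key, block)
        return key

    def get(self, key: str) -> bytes:
        """Return the block stored under a key."""
        return self.table[key]


store = DedupStore()
k1 = store.put(b"sensor frame A")
k2 = store.put(b"sensor frame A")  # same content, so same key and no new entry
k3 = store.put(b"sensor frame B")
print(k1 == k2, len(store.table))
```

Storing the same block twice yields one table entry, which is the property a de-duplicating system relies on.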

Algo-Logic is working to accelerate a wide base of applications that benefit from KVS in FPGA, including low-latency finance (trading, compliance), communications (network status, short messages), database speedup, sensor tracking for location and movement, social status updates, and multimedia.

4. Key-Value Store Comparative Performance Metrics

A Key-Value Store (KVS) can be implemented in software in multiple ways. Traditional software implementations are built as a user-space application program that communicates with the network operating system via standard BSD-style network sockets. Optimized software implementations avoid the overhead of processing the packets in the kernel by dedicating CPU cores to read packets directly from the network interface card (NIC). In the fastest version, rather than using a standard NIC to forward packets to software, both the packet processing datapath and the KVS are implemented in FPGA logic. For the 10-GE version, we ran the KVS on a Nallatech P385 board, and for the 40-GE version, we ran the KVS application on a BittWare S5PHQ board, each with an Altera Stratix V A7 FPGA device.
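
To make the traditional approach concrete, here is a minimal Python sketch of a user-space KVS that serves requests over a standard BSD-style UDP socket. The text protocol ("SET key value" and "GET key"), the reply strings, and the function names are hypothetical and chosen only for this example; they are not the wire format used by Algo-Logic's KVS.

```python
import socket


def handle_request(table: dict, payload: bytes) -> bytes:
    """Apply one request to the table and return the reply bytes.

    Hypothetical text protocol: b"SET <key> <value>" or b"GET <key>".
    """
    parts = payload.split(b" ", 2)
    if len(parts) == 3 and parts[0] == b"SET":
        table[parts[1]] = parts[2]
        return b"OK"
    if len(parts) == 2 and parts[0] == b"GET":
        return table.get(parts[1], b"NOT_FOUND")
    return b"ERROR"


def serve(sock: socket.socket, table: dict, max_requests: int) -> None:
    """Answer max_requests datagrams received on an already-bound UDP socket."""
    for _ in range(max_requests):
        # Every request and every reply passes through the kernel's socket layer.
        payload, client = sock.recvfrom(2048)
        sock.sendto(handle_request(table, payload), client)
```

To try it, create a socket with socket.socket(socket.AF_INET, socket.SOCK_DGRAM), bind it to a local address and port of your choice, and pass it to serve. Each lookup in this design incurs two system calls and a trip through the kernel network stack, which is the overhead that the optimized software and FPGA versions are meant to avoid.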

We evaluated the socket-based software, DPDK (optimized software), and FPGA implementations of the KVS, each using the on-chip table over the 10-GE interface, with a series of tests to determine latency, throughput, and power usage.

5. Key-Value Store Scale Up Benchmarks and Scale Out Analysis

The total latency of the KVS is the sum of the latencies through the lookup IP core, the datapath, and the PHY+MAC, plus the SFP+/QSFP+ module delay and the fiber delay.

The KVS scales out for use in the datacenter. Whereas a single instance of Fast KVS over a 40-GE interface in an FPGA supports a throughput of 150 million searches per second (MSPS), a scaled-out configuration of 40 instances of Fast KVS in FPGA supports up to 6 billion (150 million × 40) I/O operations per second (IOPS) per rack with an aggregate bandwidth of 40 Gbps × 40 = 1.6 Tbps. The latency for any client to access any KVS in this network is estimated as the sum of the round-trip switch and fiber latency plus a 40-GE KVS latency of 388 ns. For nearby machines with low-latency switch hardware, the total round-trip latency is below a single microsecond.
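
The scale-out arithmetic and the latency estimate above can be reproduced with a short Python sketch. The per-instance throughput, link speed, instance count, and 388 ns KVS latency come from the text. The fiber propagation delay of about 5 ns per meter and the 200 ns one-way switch latency are assumptions added for illustration and were not measured by Algo-Logic.

```python
# Figures quoted in the text
OPS_PER_INSTANCE = 150e6   # searches per second for one Fast KVS instance
LINK_GBPS = 40             # one 40-GE interface per instance
INSTANCES_PER_RACK = 40
KVS_LATENCY_NS = 388       # 40-GE KVS latency

# Assumptions for illustration only (not from the source)
FIBER_NS_PER_M = 5.0       # approximate propagation delay in optical fiber
SWITCH_LATENCY_NS = 200.0  # hypothetical one-way latency of a low-latency switch


def rack_totals(instances=INSTANCES_PER_RACK):
    """Return (operations per second, aggregate bandwidth in Tbps) for one rack."""
    return instances * OPS_PER_INSTANCE, instances * LINK_GBPS / 1000


def round_trip_ns(fiber_m_each_way, switch_ns=SWITCH_LATENCY_NS):
    """Estimate client round-trip latency in nanoseconds.

    The request and the reply each cross the switch and the fiber once,
    and the KVS itself adds its lookup latency in between.
    """
    one_way_ns = switch_ns + fiber_m_each_way * FIBER_NS_PER_M
    return 2 * one_way_ns + KVS_LATENCY_NS


ops_per_second, tbps = rack_totals()
print(ops_per_second, tbps)
print(round_trip_ns(10))
```

Under these assumed switch and fiber values, a client 10 m of fiber away sees a round trip of 888 ns, consistent with the sub-microsecond estimate for nearby machines. A slower switch or a longer fiber run would push the total past one microsecond.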

6. Real-time Data Monitoring and Analytics

The Black Diamond (BD) data acquisition system and AlgoCentral provide a complete data monitoring solution. The solution saves time and money by automatically acquiring and analyzing big data from sensors.

The BD system collects data from sensors with high resolution and minimal noise. Leveraging Field Programmable Gate Array (FPGA) technology, the BD deterministically collects and processes data synchronously. The resulting data is archived and streamed over the network to the cloud.

AlgoCentral is a cloud service that executes algorithms to process data, graph results, automate tasks, and send notifications when critical events occur. Data is continuously processed to ensure reports are always up to date.

7. Demonstration of Key-Value Store and Analytics of Live Data at Supercomputing

Algo-Logic Systems exhibited at the Supercomputing (SC16) conference in Salt Lake City from Nov. 13-18, 2016. Algo-Logic showcased the capabilities of our hardware accelerated, in-memory Key-Value Store (KVS) running on Nallatech's P385 fed by data collected by our Black Diamond (a robust, real-time measurement and monitoring system).

For Algo-Logic's demonstration at SC16, our Key-Value Store collected data from a strain gauge, temperature gauge, and a tri-axis accelerometer attached to a drone. We fused data with the KVS, analyzed it, and shared it between network-attached devices in the cloud with sub-microsecond latency.

Although this demonstration of the KVS was for industrial use cases, many other applications benefit from real-time data, such as real-time bidding (RTB) for ad tech, order books for online trading, asset monitoring, network monitoring and control for telecommunications, and NoSQL database acceleration.

Thank you to Filmscape Productions for producing and editing all of these videos!