Purdue University Graduate School

OFFLOADING NVME OVER FABRICS (NVME-OF) TO SMARTNICS ON AN AT-SCALE DISTRIBUTED TESTBED

thesis
posted on 2025-05-01, 20:40 authored by Shoaib Basu
The rapid growth of cloud deployments has made them foundational to modern computing infrastructure, enabling unprecedented scalability, flexibility, and resource sharing. However, traditional general-purpose CPUs are burdened with managing critical infrastructure functions such as storage, security, and networking. SmartNICs have emerged as a transformative technology, integrating specialized processing cores and accelerators directly on NIC hardware to offload and accelerate these essential functions. By handling complex tasks at line speed, SmartNICs enhance throughput, reduce latency, and alleviate CPU load, effectively addressing the limitations of conventional NICs in cloud environments. This work explores the offloading of NVMe over Fabrics (NVMe-oF) functionality to SmartNICs, investigating the performance implications of hardware-accelerated storage management.

The experimental approach of this study leverages SmartNIC technology to measure performance metrics directly indicative of CPU efficiency, including CPU interrupts, context switches, and I/O throughput, which demonstrate improved performance through offloading. To investigate if these metrics indicate real-world performance gains for the CPU, this study benchmarks two database applications, PostgreSQL and MongoDB, and the CPU performance is measured using completed database transactions and average latency per operation. By comparing the offloaded and non-offloaded configurations, the increase in CPU efficiency due to SmartNIC offloading in distributed storage infrastructures is quantified. The number of database operations rises by at least 20% in the least efficient case, with a corresponding decrease of 10% in database transaction latency when the operations are offloaded. These results highlight the potential of SmartNICs in improving CPU performance, reducing transaction latency, and maximizing resource utilization, ultimately demonstrating their transformative impact on scalability and performance in modern cloud-native infrastructure. This study contributes to understanding the feasibility and benefits of SmartNIC deployment for NVMe-oF in cloud infrastructures, providing valuable insights into future cloud optimization strategies.

Degree Type

  • Master of Science

Department

  • Computer and Information Technology

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Deepak Nadig

Additional Committee Member 2

William C Ledbetter

Additional Committee Member 3

Julia Rayz

Additional Committee Member 4

Chad Laux

Additional Committee Member 5

Erik Gough