Exploiting Tradeoff between Parallelism and Synchronizations in Graph Processing
Digital Document
Document
Persons |
Persons
Creator (cre): Shan, Mohsin
Major Advisor (mja): Khan, Omer
Associate Advisor (asa): Ding, Caiwen
Associate Advisor (asa): Chandy, John A.
|
||||||||
---|---|---|---|---|---|---|---|---|---|
Title |
Title
Title
Exploiting Tradeoff between Parallelism and Synchronizations in Graph Processing
|
||||||||
Origin Information |
Origin Information
|
||||||||
Parent Item |
Parent Item
|
||||||||
Resource Type |
Resource Type
|
||||||||
Description |
Description
Graph algorithms face challenges in efficiently exploiting parallelism on multicore processors,
including communication/synchronization overheads and low work efficiency. Algorithms can be divided into tasks and are prioritized for a work-efficient execution. However, searching for high-priority tasks requires communication that affects performance. To address these issues, this thesis proposes a Hardware-assisted Drift-aware Concurrent Priority Scheduler (HD-CPS). HD-CPS uses task priority to optimize task priority drift and communication at runtime. Additionally, compute-intensive task transfer and processing are offloaded to pe r-core lo cal ha rdware, im proving performance. HD-CPS consistently outperforms state-of-the-art CPS designs and achieves near-linear performance scaling with core counts. Graph Neural Networks (GNNs), on the other hand, automate analytic graph algorithms and extract valuable information using machine learning techniques. However, achieving efficient inference under strict latency constraints poses a significant c hallenge. This i s primarily due to the sparse and irregular nature of the graphs themselves, which results in a workload imbalance and poor data locality. This thesis introduces MergePath-SpMM, a software-only algorithm for load-balancing sparse matrix-matrix computations, a fundamental computation in GNNs. MergePath-SpMM enables fine-grain parallelism with controlled synchronizations, resulting in robust performance scaling, and performs better on GPU processors than hardware accelerators and software-only approaches. However, GPUs require a large number of threads to take advantage of latency-hiding capabilities, increasing synchronizations in SpMMs. To overcome this limitation, the thesis proposes a 1000-core RISC-V processor that uses MergePath-SpMM to achieve a loadbalanced execution. This processor enables massive fine-grain p arallelism, ensuring low-latency data access with a single-threaded pipeline, cache hierarchy, and graph processing-aware data prefetcher. |
||||||||
Language |
Language
|
||||||||
Organizations |
Organizations
Degree granting institution (dgg): University of Connecticut
|
||||||||
Held By | |||||||||
Rights Statement |
Rights Statement
|
||||||||
Note |
Note
|
||||||||
Degree Name |
Degree Name
Doctor of Philosophy
|
||||||||
Degree Level |
Degree Level
Ph.D.
|
||||||||
Degree Discipline |
Degree Discipline
Computer Science and Engineering
|
||||||||
Local Identifier |
Local Identifier
S_41647544
|