Accelerating Synchronization on Futuristic 1000-cores Multicore Processor with Moving Compute to Data Model

Metadata

Handle

http://hdl.handle.net/11134/20002:860655934

Persons

Creator (cre): Dogan, Halit

Major Advisor (mja): Khan, Omer

Associate Advisor (asa): Chandy, John

Associate Advisor (asa): Van Dijk, Marten

Title

Accelerating Synchronization on Futuristic 1000-cores Multicore Processor with Moving Compute to Data Model

Origin Information

Event Place	Storrs, CT
Date Created	2018
Publisher	University of Connecticut

Parent Item

Dissertations

Resource Type

Text

Digital Origin

born digital

Description

Single-chip multicore processors are now prevalent, and processors with hundreds of cores are being proposed and explored by both academia and industry. Shared memory cache coherence is the state-of-the-art technology these processors use to enable synchronization and communication between cores. However, synchronizing cores on shared data through hardware cache coherence suffers from instruction retries and cache line ping-pong overheads, which prevents performance from scaling as core counts increase on a chip. This thesis proposes a novel moving computation to data model (MC) to overcome this synchronization bottleneck in a 1000-core-scale shared memory multicore processor. The proposed MC model pins shared data to dedicated cores called service cores. Worker cores explicitly request that critical code sections be executed at the service cores. This prevents cache lines from bouncing between cores and enables data locality optimization. The proposed MC model uses auxiliary in-hardware explicit messaging for the critical section requests to enable efficient fine-grained blocking and non-blocking communication between cores. To show the effectiveness of the proposed model, workloads with a wide range of synchronization requirements from the graph analytics, machine learning, and database domains are implemented. The proposed model is then prototyped and exhaustively evaluated on a 72-core machine, the Tilera Tile-Gx72 multicore platform, because it incorporates in-hardware core-to-core messaging as an auxiliary capability alongside the shared memory cache coherence paradigm. Since the Tile-Gx72 machine includes only 72 cores, it is used for evaluation at core counts from 8 to 64. For further analysis at higher core counts, a simulated RISC-V multicore environment is built, and the performance and dynamic energy scaling advantages of the MC model are evaluated against various baseline synchronization models at up to 1024 cores.
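
The request flow described above can be illustrated with a small software analogy. The following is a minimal sketch only, not the dissertation's implementation: a Python thread stands in for a service core, a queue stands in for the in-hardware core-to-core messaging, and the names ServiceCore, request_blocking, request_nonblocking, increment, and run_workers are all hypothetical. It shows worker threads shipping a critical section to the single thread that owns the shared data, rather than taking a lock and pulling the data toward themselves.

```python
import queue
import threading


class ServiceCore:
    """Hypothetical stand-in for a service core: one thread owns the shared data."""

    def __init__(self):
        self.inbox = queue.Queue()  # assumption: models explicit core-to-core messages
        self.counter = 0            # shared data, touched only by the service thread
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        # Serve requests one at a time, in arrival order, so no lock is needed.
        while True:
            msg = self.inbox.get()
            if msg is None:         # shutdown sentinel
                break
            critical_section, reply = msg
            result = critical_section(self)
            if reply is not None:   # blocking requests carry a reply channel
                reply.put(result)

    def request_blocking(self, critical_section):
        """Send a critical section and wait for its result."""
        reply = queue.Queue(maxsize=1)
        self.inbox.put((critical_section, reply))
        return reply.get()

    def request_nonblocking(self, critical_section):
        """Send a critical section and continue without waiting."""
        self.inbox.put((critical_section, None))

    def shutdown(self):
        self.inbox.put(None)
        self.thread.join()


def increment(core):
    """Example critical section: runs on the service thread only."""
    core.counter += 1
    return core.counter


def run_workers(num_workers=8, requests_per_worker=100):
    """Workers issue non-blocking increments; a final blocking read returns the total."""
    service = ServiceCore()

    def worker():
        for _ in range(requests_per_worker):
            service.request_nonblocking(increment)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Queued after every non-blocking request, so it observes all of them.
    total = service.request_blocking(lambda core: core.counter)
    service.shutdown()
    return total
```

Calling run_workers(8, 100) should return 800, because every increment executes on the one thread that owns the counter. The sketch captures only the ownership and messaging structure; it says nothing about the latency, energy, or scaling behavior that the dissertation measures on hardware and in simulation.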

Genre

doctoral dissertations

Organizations

Degree granting institution (dgg): University of Connecticut

Held By

Archives & Special Collections, University of Connecticut Library

Use and Reproduction

These Materials are provided for educational and research purposes only.

Note

Note	Multicore, Synchronization, Moving Compute to Data, explicit messaging, cache coherence

Degree Name

Doctor of Philosophy

Degree Level

Doctoral

Degree Discipline

Electrical Engineering

Local Identifier

OC_d_2026