Accommodating workload diversity in chip multiprocessors via
The exemplary embodiment presented hereinbelow starts from a CMP substrate with a homogeneous set of small cores.
The embodiment maximizes the core count to exploit high levels of thread-level parallelism (TLP), and retains the modularity advantages of fine-grain CMPs.
The hope is to match the demands of a variety of sequential and parallel workloads by executing them on an appropriate subset of these cores.
Recently, the impact of performance asymmetry on explicitly parallelized applications has been studied, finding that asymmetry hurts parallel application scalability and renders the applications' performance less predictable unless relatively sophisticated software changes are introduced.
Improving the performance of computer or other processing systems generally improves overall throughput and/or provides a better user experience.
One technique of improving the overall quantity of instructions processed in a system is to increase the number of processors in the system.
A subset of these regions is then parallelized, and the rest of the application is left as “future work.” Over time, more effort is spent on portions of the remaining code. As a result of this “pay-as-you-go” approach, the complexity (and cost) associated with software parallelization is amortized over a greater time span.
In the short term, on-chip integration of a modest number of relatively powerful (and relatively complex) cores may yield high utilization when running multiple sequential workloads, temporarily avoiding the complexity of parallelization.
However, although sequential codes are likely to remain important, they alone are not sufficient to sustain long-term performance scalability.
In fact, some of the most common shared-memory programming models in use today (for example, OpenMP) are designed to facilitate the incremental parallelization of sequential codes.
We envision a diverse landscape of software in different stages of parallelization, from purely sequential, to fully parallel, to everything in between.
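By way of illustration only, the incremental "pay-as-you-go" approach described above can be sketched in C with OpenMP. The function names `sum_squares` and `max_value` are hypothetical and do not appear in the embodiment; the sketch assumes a compiler with OpenMP support (for example, one invoked with an OpenMP flag such as `-fopenmp`), and without such support the pragma is ignored and the code runs sequentially with the same result.

```c
#include <stddef.h>

/* Hypothetical hot region: parallelized first.
 * The reduction clause gives each thread a private partial sum
 * and combines the partial sums when the loop ends. */
long sum_squares(const int *v, size_t n) {
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        total += (long)v[i] * v[i];
    }
    return total;
}

/* Hypothetical colder region: left sequential as "future work".
 * Assumes n >= 1. It can be parallelized later without touching
 * the code above. */
int max_value(const int *v, size_t n) {
    int best = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] > best) {
            best = v[i];
        }
    }
    return best;
}
```

In this sketch a single application holds both a parallel region and a sequential region, which is the mixed state of parallelization that the preceding passage anticipates.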
a shows a graphical schematic block diagram of the exemplary embodiment.