Superword-Level Parallelism in the Presence of Control Flow

Author(s):  
Jaewook Shin ◽  
M. Hall ◽  
J. Chame
2012 ◽  
Vol 21 (02) ◽  
pp. 1240006
Author(s):  
RAGAVENDRA NATARAJAN ◽  
VINEETH MEKKAT ◽  
WEI-CHUNG HSU ◽  
ANTONIA ZHAI

For today's increasingly power-constrained multicore systems, integrating simpler and more energy-efficient in-order cores is becoming attractive. However, since in-order processors lack complex hardware support for tolerating long-latency memory accesses, developing compiler technologies to hide such latencies becomes critical. Compiler-directed prefetching has been shown to be effective on some applications. On the application side, a large class of data-centric applications has emerged to explore the underlying properties of explosively growing data. In contrast to traditional benchmarks, these applications are characterized by substantial thread-level parallelism, complex and unpredictable control flow, and intensive, irregular memory access patterns. They are expected to be the dominant workloads on future microprocessors. In this paper, we therefore investigate the effectiveness of compiler-directed prefetching on data mining applications in in-order multicore systems. Our study reveals that although properly inserted prefetch instructions can often reduce memory access latencies for data mining applications, the compiler is not always able to exploit this potential. Compiler-directed prefetching can become inefficient in the presence of complex control flow, complex memory access patterns, and architecture-dependent behaviors. The integration of multithreaded execution onto a single die makes it even more difficult for the compiler to insert prefetch instructions, since optimizations that are effective for single-threaded execution may or may not be effective in multithreaded execution. Thus, compiler-directed prefetching must be deployed judiciously to avoid creating performance bottlenecks that would not otherwise exist. Our experience suggests that dynamic performance tuning techniques that adjust to the behavior of a program can potentially facilitate the deployment of aggressive optimizations in data mining applications.
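As a rough illustration of what an inserted prefetch instruction looks like at the source level, the sketch below adds a software prefetch to a loop with an indirect (irregular) access pattern. It is not taken from the paper: the function name `sum_indirect`, the constant `PREFETCH_DISTANCE`, and its value are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin, assumed here as a stand-in for whatever the compiler under study would emit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookahead distance, in loop iterations.
   An assumption for illustration; a useful value depends on the machine. */
#define PREFETCH_DISTANCE 8

/* Sum data[idx[i]] for i in [0, n): an indirect, irregular access pattern. */
long sum_indirect(const long *data, const size_t *idx, size_t n)
{
    assert(n == 0 || (data != NULL && idx != NULL));

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Hint that the element needed PREFETCH_DISTANCE iterations from
           now will be read soon (second argument 0 = read, third argument
           1 = low temporal locality). The bounds check keeps the read of
           idx[] in range. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 1);

        sum += data[idx[i]];
    }
    return sum;
}
```

The prefetch is only a hint and does not change the result. Whether it helps depends on the lookahead distance, the access pattern, and how many threads share the cache, which is the kind of sensitivity the abstract describes.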


2020 ◽  
Vol 16 (2) ◽  
pp. 214
Author(s):  
Wang Yong ◽  
Liu SanMing ◽  
Li Jun ◽  
Cheng Xiangyu ◽  
Zhou Wan

Author(s):  
Bo Wang ◽  
Yanhui Wu ◽  
Kai Liu

Driven by the need to control flow separations in highly loaded compressors, a numerical investigation is carried out to study the control effect of wavy blades in a linear compressor cascade. Two types of wavy blades are studied: wavy blade-A has a sinusoidal leading edge, while wavy blade-B has a pitchwise sinusoidal variation in the stacking line. The influence of wavy blades on the cascade performance is evaluated at incidences from −1° to +9°. For wavy blade-A with suitable waviness parameters, the cascade diffusion capacity is enhanced and the loss is reduced under high-incidence conditions, where 2D separation is the dominant flow structure on the suction surface of the unmodified blade. For a well-designed wavy blade-B, the cascade performance improves under low-incidence conditions, where 3D corner separation is the dominant flow structure on the suction surface of the baseline blade. The influence of waviness parameters on the control effect is also discussed by comparing the performance of cascades with different wavy blade configurations. Detailed analysis of the predicted flow field shows that both wavy blade-A and wavy blade-B have the capacity to control flow separation in the cascade, but their control mechanisms are different. For wavy blade-A, the wavy leading edge results in the formation of counter-rotating streamwise vortices downstream of the troughs. These streamwise vortices not only enhance momentum exchange between the outer flow and the blade boundary layer, but also act as a suction-surface fence that hampers the upwash of low-momentum fluid driven by the cross flow. For wavy blade-B, the wavy surface on the blade reduces the cross-flow upwash by influencing the spanwise distribution of the suction-surface static pressure and guiding the upwash flow.

