Pen-Chung Yew
Professor
Department of Computer Science and Engineering
University of Minnesota at Twin Cities
4-192 Keller Hall
200 Union Street, SE
Minneapolis, MN 55455, USA

Contact: <yew> {at} umn [dot] edu

Education
Ph.D. 1981 University of Illinois at Urbana-Champaign, Computer Science
M.S. 1977 University of Massachusetts at Amherst, Computer Engineering
B.S. 1972 National Taiwan University, Electrical Engineering

Publications (in chronological order, updated 12/20/2020)
A. Refereed Conference Papers · N.
Namashivayam, S. Mehta, P.C. Yew, Variable-Sized
Blocks for Locality-Aware SpMV, Proc. of the Annual IEEE/ACM Int'l Symp. on Code
Generation and Optimization (CGO), February, 2021 · Z.
Zhao, Y. Chen, X. Gong, W. Wang, P.C. Yew, Enhancing Atomic Instruction for Cross-ISA Dynamic Binary
Translation, Proc.
of the Annual IEEE/ACM Int'l Symp. on Code Generation and Optimization (CGO), February, 2021 · X.
Liu, X. Gong, W. Wang, Z. Zhao, Regaining
Lost Seconds: Efficient Page Preloading for SGX Enclaves, Proc. of ACM/IFIP Middleware Conference (MIDDLEWARE), December 2020 · J. Jiang, R. Dong, Z. Zhou, C. Song, W.
Wang, P.C. Yew, W. Zhang, More with
Less - Deriving More Translation Rules with Less Training Data for DBTs Using
Parameterization, Proc. of the Intn’l Symp. on Microarchitectures
(MICRO), October 2020 · K. Ramkrishnan, A. Zhai, S. McCamant, and P.C. Yew, First
Time Miss: Low Overhead Mitigation for Shared Memory Cache Side Channels, Proc. of the Intn’l Conference on Parallel
Processing (ICPP), August, 2020 · Z.
Zhao, Z. Jiang, X. Liu, X. Gong, W. Wang, P.C. Yew, DQEMU: A Scalable
Emulator with Retargetable DBT on Distributed Platforms, Proc. of the International Conference on
Parallel Processing (ICPP), August, 2020 · (Best Paper Award Finalist) W. Wang,
P.C. Yew, A. Zhai, S. McCamant, Efficient and Scalable Cross-ISA Virtualization of Hardware
Transactional Memory, Proc. of the Annual IEEE/ACM Int'l Symp. on Code
Generation and Optimization (CGO), March, 2020 · C. Song, W. Wang, P.C. Yew, A. Zhai, W. Zhang, Unleashing
the Power of Learning: An Enhanced Learning-based Approach for Dynamic Binary
Translation, Proc. of the
2019 USENIX Annual Technical Conference (ATC), June 2019 · Z. Wang, C. Wu, P.C. Yew, SafeHidden:
An Efficient and Secure Information Hiding Technique Using Re-randomization,
Proc. of 28th USENIX Security Symposium, August 2019 · W.
Wang, S. McCamant, A. Zhai, P.C. Yew, Enhancing
DBT Performance Through Automatically Learned Translation Rules, 23rd ACM
International Conference on Architectural Support for Programming Languages
and Operating Systems (ASPLOS), March 2018 · W.
Wang, J. Wu, T. Li, X. Gong, P.C. Yew, Improving
Dynamically-Generated Code Performance on Dynamic Binary Translator,
Proc. of 14th Int'l Conf. on Virtual Execution Environments (VEE), March
2018. · W.
Wang, P.C. Yew, A. Zhai, S. McCamant, Y. Wu, J. Bobba, Enabling Cross-ISA Offloading for COTS Binaries, The 15th
ACM International Conf. on Mobile Systems, Applications, and Services
(MobiSys), June 2017 · W. Wang, A. Zhai and P.C.
Yew, A General Persistent Code Caching
Framework for Dynamic Binary Translation, Proc. of the 2016 USENIX Annual
Technical Conference (ATC), June 2016 · S. Mehta, R. Garg, N.
Trivedi and P.C. Yew, TurboTiling:
Leveraging Prefetching to Boost Performance of Tiled Codes, Proc.
of the 2016 Int'l Conf. on Supercomputing (ICS), June 2016. · C.J. Chang, Y.C. Peng, C.C.
Chen, T.F. Chen and P.C. Yew, Adaptive Granularity and Coordinated Management
for Timely Prefetching in Multi-core Systems, 2015 International Symposium on VLSI
Design, Automation and Test (VLSI-DAT), May 28 2015 · S. Mehta and P.C. Yew, Improving Compiler Scalability: Optimizing
Programs at Small Price, Proc. of ACM SIGPLAN Int’l Conf. on Programming
Language Design and Implementation (PLDI), June 2015 · X. Yuan, C. Wu, Z. Wang, J.
Li, X. Feng, P.C. Yew, Y. Lan, Y. Chen, J. Huang, Y. Guan, Reproducing Concurrency Bugs using Local
Clocks, Proc. of Int'l Conf. on Software
Engineering (ICSE), May, 2015 · W. Wang, C. Wu, P.C. Yew, X.
Shen, X. Yuan, Z. Wang, J. Li, X. Feng, Localization
of Concurrency Bugs Using Shared Memory Access Pairs, 29th
IEEE/ACM International Conference on Automated Software Engineering (ASE),
September 2014 · S. Mehta, Z. Fang, A. Zhai
and P.C. Yew, Multistage Coordinated
Prefetching for Present-Day Processors, Proc.
of the 2014 Int'l Conf. on Supercomputing (ICS), June 2014 · C.F. Chen, C.C. Chen, et al,
DAPs:
Dynamic Adjustment and Partial Sampling for Multithreaded/Multicore
Simulation, Proc. of 51st International Design
Automation Conference (DAC), June 2014 · Y.H. Lu, D.Y. Hong, T.Y.
Wu, J.J. Wu, P. Liu, W.C. Hsu, and
P.C. Yew, DBILL: An Efficient and
Retargetable Dynamic Binary Instrumentation Framework using LLVM Backend,
Proc. of 10th Int'l Conf. on Virtual Execution
Environments (VEE), March 2014 · C.R. Chang, J.J. Wu, P. Liu,
W.C. Hsu, and P.C. Yew, Efficient
Memory Virtualization for Cross-ISA System Mode Emulation, Proc. of 10th
Int'l Conf. on Virtual Execution Environments (VEE), March 2014 · S. Mehta, P.H. Lin, and P.C.
Yew, Revisiting
Loop Fusion in the Polyhedral Framework, Proc.
of ACM SIGPLAN 19th Annual Symp. on Principles and Practice of
Parallel Programming (PPoPP), February 2014 · S.H. Chen, S.M. Lin, K.Y.
Chen, Y.H. Chang, P.C. Yew, C.C. Ho, A
Systematic Methodology for OS Benchmarks Characterization, Proc. of ACM
Int’l Conf. on Reliable and Convergent Systems (RACS), October 2013 · V. Mekkat, A. Holey, P.C.
Yew and A. Zhai, Managing Last-Level
Cache in a Heterogeneous Multicore Processor, Proc. of Int'l
Conf. on Parallel Architectures and Compiler Techniques (PACT), September 2013. · X. Yuan, C. Wu, P.C. Yew, W.
Wang, Z. Wang, J. Li and D. Xu, Synchronization
Identification through On-the-Fly Test, Proc. of 2013 Euro-Par Conference
(Euro-Par), August 2013 · C.C. Hsu, J.J. Wu, P.C. Yew,
D.Y. Hong, C.M. Wang, and W.C. Hsu, Improving
Dynamic Binary Optimization Through Early-Exit Guided Code Region Formation,
9th Int'l Conf on Virtual Execution
Environments (VEE), March 2013 · P.H. Lin, J. Jayaraj, P.
Woodward, and P.C. Yew, A Study of
Performance Portability Using Piecewise-Parabolic Method (PPM) Gas Dynamics
Applications, Proc. of Int’l
Conf. on Computational Science (ICCS), May 2012 · D.
Xu, C. Wu, P.C. Yew, J. Li, and Z. Wang, Providing Fairness on Shared
Memory Multiprocessors via Process Scheduling, ACM SIGMETRICS
Performance, June 2012 · D.Y.
Hong, C.C. Hsu, P.C. Yew, J.J. Wu, W.C. Hsu, Y.C. Chung, P. Liu and C.M. Wang,
HQEMU: A Multi-Threaded and
Retargetable Dynamic Binary Translator on Multicores, Proc. of the 10th
Annual IEEE/ACM Int'l Symp. on Code Generation and Optimization (CGO), March, 2012 · C.C. Hsu, P. Liu, C.M. Wang, J.J. Wu, D.Y.
Hong, P.C. Yew and W.C. Hsu, LnQ:
Building High Performance Dynamic Binary Translators with Existing Compiler
Backends, Proc. of the 40th
International Conference on Parallel Processing (ICPP), Taipei,
Taiwan, September 2011 · D. Xu, C. Wu and P.C. Yew, On Mitigating Memory Bandwidth Contention
Through Bandwidth-Aware Scheduling, Proc. of Int'l Conf. on Parallel Architectures and Compiler
Techniques (PACT), September 2010. · P. Woodward, et al, Boosting the Performance of Computational Fluid Dynamics Codes for Interactive Supercomputing, Proc. of Int’l Conf. on Computational Science (ICCS), May 2010 · J. Lin and P.C. Yew, A Compiler Framework for General Memory Layout Optimization Targeting
Structures and Arrays, The 12th
Annual Workshop on the Interaction between Compilers and Computer
Architecture (INTERACT), March 2010. · Z. Wang, C. Wu and P.C. Yew, On Improving Heap Memory Layout by Dynamic
Pool Allocation, Proc. of the 8th
Annual IEEE/ACM Int'l Symp. on Code Generation and Optimization (CGO),
April, 2010. · L. Wang, et al., An Adaptive Task Creation Strategy for Work-Stealing Scheduling, Proc. of the 8th Annual
IEEE/ACM Int'l Symp. on Code Generation and Optimization (CGO), April,
2010. · H. Chen, L. Yuan, X. Wu, B. Zang, B.
Huang, P.C. Yew, Control Flow Obfuscation
with Information Flow Tracking, Proc.
of the 42nd Int'l Symp. on Microarchitecture (MICRO-42),
November 2009 · V. Packirisamy, A. Zhai, W.C. Hsu, T.F. Ngai, P.C. Yew, Exploring Speculative Parallelism in SPEC2006, Proc. of IEEE Int’l Symp. on Performance Analysis of Systems and Software (ISPASS), April 2009 · Y.
Duan, X. Feng, P.C. Yew, Detecting and
Eliminating Violation of Sequential Consistency for Concurrent C/C++ Programs,
Proc. of IEEE/ACM Int'l Symp. on Code
Generation and Optimization (CGO), March 2009 · (Best Paper Award) V.
Packirisamy, Y. Luo, W.L. Hung, A. Zhai and P.C. Yew, Efficiency of Thread-Level Speculation in SMT and CMP Architectures – Performance, Power and Thermal Perspective, Proc. of
Int’l Conf. on Computer Design (ICCD), Oct. 2008. · H.
Chen, X. Wu, L. Yuan, B. Zang, P.C. Yew, F.T. Chong, From Speculation to Security: Practical and Efficient Information Flow
Tracking Using Speculative Hardware, Proc. of 35th Int'l Symp. on
Computer Architecture (ISCA-35), June 2008 · G.J.
He, A. Zhai and P.C. Yew, Ex-Mon: An
Architectural Framework for Dynamic Program Monitoring on Multicore
Processors, The 12th Annual Workshop on the Interaction between Compilers
and Computer Architecture (INTERACT), Feb 2008. · (Best
Paper Award) H.
Chen, R. Chen, F. Zhang, B. Zang and P.C. Yew, Mercury: combining performance with dependability using
self-virtualization, Proc. of Int’l Conf. on
Parallel Processing (ICPP), Sept. 2007 · S.V.
Kodakara, J. Kim, W.C. Hsu, D.J. Lilja, P.C. Yew, Analysis of Statistical Sampling in Microarchitecture Simulation:
Metric, Methodology and Program Characterization, Proc. of Int'l Symp. on
Workload Characterization (IISWC),
Sept, 2007 · J.
Kim, W.C. Hsu and P.C. Yew, COBRA: An
Adaptive Runtime Binary Optimization Framework for Multithreaded Applications,
Proc. of Int’l Conf. on Parallel Processing (ICPP),
Sept. 2007 · S.J.
Lee, H.K. Lee, and P.C. Yew, Runtime
Performance Projection Model for Dynamic Power Management, Proc. of
Asia-Pacific Computer Systems Architecture Conference (ACSAC), Aug. 2007 · J.
Kim, S.V. Kodakara, W.C. Hsu, D.J. Lilja, R. Geva, P.C. Yew, Entropy-Based Profile Characterization and
Classification for Automatic Profile Management, Proc. of Asia-Pacific
Computer Systems Architecture Conference (ACSAC), Aug. 2007 · H.
Chen, J. Yu, C. Rong, B.Y. Zang and P.C. Yew, POLUS: A Powerful Live Updating System, Proc. of Int'l Conf. on
Software Engineering (ICSE), May, 2007 · R.
Fu, A. Zhai, P.C. Yew and W.C. Hsu, J. Lu, Reducing Queueing Stalls Caused by Data Prefetching, The 11th
Annual Workshop on the Interaction between Compilers and Computer
Architecture (INTERACT) May 2007. · V.
Packirisamy, S.Y. Wang, A. Zhai, W.C. Hsu, P.C. Yew, Supporting Speculative Multithreading on Simultaneous Multithreaded
Processors, in Proc. of Int'l Conf. on High Performance Computing (HiPC),
Bangalore, India, Dec 2006 · S.Y.
Wang, A. Zhai, P.C. Yew, Exploiting
Speculative Thread-Level Parallelism in Data Compression Applications, in
Proc. of 19th Workshop on Languages and Compilers for Parallel Computing (LCPC), New Orleans, LA,
Nov. 2006 · H.B.
Chen, R. Chen, F.Z. Zhang, B.Y. Zang, P.C. Yew, Live Updating Operating Systems Using
Virtualization, 2nd Int'l Conf on Virtual Execution Environments (VEE), June
2006 · J.
Kim, S.V. Kodakara, W.C. Hsu, D.J. Lilja, P.C. Yew, Dynamic Code Region (DCR) Based Program Phase Tracking and Prediction
for Dynamic Optimizations, Lecture Notes in Computer Science, Volume 3793
(HiPEAC), Oct 2005, pp. 203 - 217. · X.
Dai, A. Zhai, W.C. Hsu and P.C. Yew, A
General Compiler Framework for Speculative Optimizations Using Data
Speculative Code Motion, Proc. of the Third Annual IEEE/ACM Int'l Symp.
on Code Generation and Optimization (CGO), March 2005, pp. 280-290 · A.
Das, J. Lu, H. Chen, J. Kim, P.C. Yew, W.C. Hsu, D.Y. Chen, Performance of Runtime Optimization on
BLAST, Proc. of the Third Annual IEEE/ACM Int'l Symp. on Code Generation
and Optimization (CGO), March 2005, pp. 86-96 · J.
Lin, W.C. Hsu, P.C. Yew, R.D. Ju and T.F. Ngai, A Compiler Framework for Recovery Code Generation in General
Speculative Optimizations, Proc. of Int'l Conf. on Parallel Architectures
and Compiler Techniques (PACT), September 2004, pp. 17-28 · T.
Chen, J. Lin, X. Dai, W.C. Hsu and P.C. Yew, Data Dependence Profiling for Speculative Optimizations, Proc. of
14 Int'l Conf. on Compiler Construction (CC),
March 2004, pp. 57-62 · H.
Chen, J. Lu, W.C. Hsu, P.C. Yew, Continuous
Adaptive Object-Code Re-optimization Framework, Ninth Asia-Pacific Computer
Systems Architecture Conference (ACSAC), pp. 241-255, Sept 2004. · J.
Lu, H. Chen, R. Fu, W.C. Hsu, B. Othmer and P.C. Yew, The Performance of Runtime Data Cache Prefetching in a Dynamic
Optimization System, Proc. of 36th Annual Int'l Symp. on Microarchitecture
(MICRO-36), December 2003 · J.
Lin, T. Chen, W.C. Hsu, P.C. Yew, R.D. Ju and T.F. Ngai, A Compiler Framework for Speculative Analysis and Optimizations,
Proc. of ACM/SIGPLAN Conf. on Programming Language Design and Implementation
(PLDI), June 2003, pp.289-299 · H.
Chen, W.C. Hsu, J. Lu, B. Othmer, D.Y. Chen, and P.C. Yew, Dynamic Trace Selection Using Performance
Monitoring Hardware Sampling, Proc. of the 1st IEEE/ACM Int'l Symp. on
Code Generation and Optimization (CGO), March 2003, pp. 79-90 · J.
Lin, T. Chen, W.C. Hsu and P.C. Yew, Speculative
Register Promotion Using Advanced Load Address Table (ALAT), Proc. of the
1st IEEE/ACM Int'l Symp. on Code Generation and Optimization (CGO), March
2003, pp. 125-134 · T.
Chen, J. Lin, W.C. Hsu and P.C. Yew, An
Empirical Study on the Granularity of Pointer Analysis in C Programs,
Proc of the 15th Workshop on Languages and Compilers for Parallel Computing
(LCPC), Aug. 2002 · W.C.
Hsu, H. Chen, P.C. Yew and D.Y. Chen, On
the Predictability of Program Behavior Using Different Input Data Sets,
Proc. of the 6th Workshop on Interaction Between Compilers and Computer
Architectures (INTERACT-6), Feb 2002. · P.Y.
Tang and P.C. Yew, Interprocedural
Induction Variable Analysis, Proc. of 6th Int'l Symp. on Parallel
Architectures, Algorithms and Networks (I-SPAN), pp. 245-250, May 2002. · T.
Chen, J. Lin, W.C. Hsu and P.C. Yew, An
Empirical Study on the Characteristics of Heap-Oriented Pointers in C
Programs, Proc. of 6th Int'l Symp. on Parallel Architectures, Algorithms
and Networks (I-SPAN), pp.251-256, May 2002 · S.J.
Lee and P.C. Yew, On Some
Implementation Issues for Value Prediction on Wide-Issue ILP Processors,
Proc. of Int'l Conf. on Parallel Architectures and Compiler Techniques
(PACT), Oct. 2000, pp.145-156 · S.J.
Lee, Y. Wang and P.C. Yew, Decoupled
Value Prediction on Trace Processors, Proc. of Int'l Conf on
High-Performance Computer Architecture (HPCA-6), Jan 2000, pp.231-240 · (Best
Paper Award) H.B. Lim and P.C. Yew, Efficient Integration of Compiler-Directed
Cache Coherence and Data Prefetching, Proc. of the 2000 Int'l Parallel
and Distributed Processing Symposium (IPDPS), May 2000,
pp. 331-342 · S.Y.
Cho, P.C. Yew and G.H. Lee, Access
Region Locality for High-Bandwidth Processor Memory System Design, Proc.
of the 32nd Int'l Symp. on Microarchitecture (MICRO-32), Nov. 1999,
pp.136-146 · S.Y.
Cho and P.C. Yew, Decoupling Local
Variable Accesses in a Wide-Issue Superscalar Processor, Proc. of the
26th Intn'l Symp. on Computer Architecture (ISCA-26), May 1999, pp.100-110 · B.
Zheng, et al., Designing the Agassiz
Compiler for Concurrent Multithreaded Architectures, Proc. of the 12th
Workshop on Languages and Compilers for Parallel Computing (LCPC-12), Aug.
1999 · J.Y.
Tsai, Z. Jiang, E. Ness, and P.C. Yew, Performance
of a Concurrent Multithreaded Processor, Proc. of the 4th International
Symposium of High Performance Computer Architectures (HPCA-4), Feb. 1998, pp.
24-34 · S.
Cho, J.Y. Tsai, et al., High-Level
Information - An Approach for Integrating Front-end and Back-end Compilers,
Proc of the 1998 Int'l Conf on Parallel Processing (ICPP), Aug. 1998, pp.
346-355 · H.B.
Lim and P.C. Yew, An Integrated
Framework for Compiler-Directed Cache Coherence and Data Prefetching,
Proc. of the 11th Workshop on Languages and Compilers for Parallel Computing
(LCPC-11), Aug. 1998 · H.B.
Lim, and P.C. Yew, A Compiler-Directed
Cache Coherence Scheme Using Data Prefetching, Proc. of the Int'l Symp.
on Parallel Processing (IPPS), April 1997, pp. 643-649 · J.Y.
Tsai, B. Zheng, and P.C. Yew, Program
Optimization for Concurrent Multithreaded Architectures, Proc. of the
10th Workshop on Languages and Compilers for Parallel Computing (LCPC-10),
Aug. 1997 · L.
Choi and P.C. Yew, Compiler and
Hardware Support for Cache Coherence in Large-Scale Multiprocessors: Design
Considerations and Performance Evaluation, Proc. of the 23rd Int'l Symp
on Computer Architecture (ISCA-23), May 1996, pp. 283-294 · L.
Choi and P.C. Yew, Program Analysis for
Cache Coherence: Beyond Procedural Boundaries, Proc. of the 1996 Int'l
Conf. on Parallel Processing (ICPP), Aug. 1996, Vol. 3, pp. 103-114 · L.
Choi and P.C. Yew, Eliminating Stale
Data References through Array Data-Flow Analysis, Proc. of the 1996 Int'l
Symp. on Parallel Processing (IPPS), April, 1996, pp. 4-13 · J.Y.
Tsai and P.C. Yew, The Superthreaded
Architecture: Thread Pipelining for Run-Time Data Dependence Checking and
Control Speculation, Proc. of the 1996 Int'l Conf. on Parallel
Architectures and Compiler Techniques (PACT), Oct. 1996, pp. 35-46 · W.T.
Hsu and P.C. Yew, Let Us Build
System-Friendly Networks – Build Them Hierarchically,
invited paper for 1996 ICPP Workshop on Challenges for Parallel Processing,
Aug 1996 · Z.
Li, J.Y. Tsai, X. Wang, P.C. Yew and B. Zheng, Compiler Techniques for Concurrent Multithreading with Hardware
Speculation Support, Proc. of the 9th Workshop on Languages and Compilers
for Parallel Computing (LCPC-9), Aug. 1996 · H.B.
Lim, L. Choi and P.C. Yew, On Using
Data Prefetching for Cache Coherence in Multiprocessors, Proc. of the 9th
Workshop on Languages and Compilers for Parallel Computing (LCPC-9), Aug.
1996 · P.
Konas and P.C. Yew, Processor
Self-Scheduling in Parallel Discrete Event Simulation, Proc. of the 1995
Winter Simulation Conference, December 1995. · L.
Choi and P.C. Yew, Interprocedural
Array Data-Flow Analysis for Cache Coherence, Eighth Workshop on
Languages and Compilers for Parallel Computing (LCPC-8), August 1995. · P.
Konas and P.C. Yew, Partitioning for
Synchronous Parallel Simulation, Proc. of the ACM/IEEE/SCS 9th Workshop
on Parallel and Distributed Simulation, 1995 · D.K.
Chen, J. Torrellas and P.C. Yew, An
Efficient Algorithm for the Run-Time Parallelization of Doacross Loops,
Proc. of Supercomputing '94, pp. 518-527 · L.
Choi and P.C. Yew, A Compiler-Directed
Cache Coherence Scheme with Improved Intertask Locality, Proc. of
Supercomputing '94, pp. 773-782 · D.
Poulsen and P.C. Yew, Data Prefetching
and Data Forwarding in Shared-Memory Multiprocessors, Proc. of the Int'l
Conf. on Parallel Processing (ICPP), Vol. II, Aug. 1994, pp. 276-280 · D.K.
Chen and P.C. Yew, Statement Reordering
for Doacross Loops, Proc. of the Int'l Conf. on Parallel Processing
(ICPP), Vol.II, Aug. 1994, pp. 24-28 · D.K.
Chen and P.C. Yew, Redundant
Synchronization Elimination for Doacross Loops, Proc. of 1994 Int'l
Parallel Processing Symp. (IPPS), April 1994, pp. 477-481 · P.
Konas and P.C. Yew, Improved Parallel
Architectural Simulations on Shared-Memory Multiprocessors, Proc. of the
ACM/IEEE/SCS 8th Workshop on Parallel and Distributed Simulation, July 1994. · D.J.
Kuck, et al., The Cedar System and an
Initial Performance Study, Proc. of the 20th Symp. on Computer
Architecture (ISCA-20), May 1993, pp.213-223 · D.K.
Poulsen and P.C. Yew, Execution-Driven
Tools for Parallel Simulation of Parallel Architecture and Applications,
Proc. of Supercomputing '93, Nov. 1993, pp. 860-869 · D.K.
Chen and P.C. Yew, Efficient
Synchronization for Doacross Loops Execution, Proc. of 1992 Int'l Conf.
on Parallel Processing (ICPP), Aug. 1992 · W.T.
Hsu and P.C. Yew, The Impact of Wiring
Constraints on Hierarchical Network Performance, Proc. of the 1992 Int'l
Parallel Processing Symp. (IPPS), March, 1992, pp.580-588 · P.
Konas and P.C. Yew, Synchronous
Parallel Discrete Event Simulation on Shared-Memory Multiprocessors,
Proc. of the 6th Workshop on Parallel and Distributed Simulation, Jan. 1992,
pp.12-21. · H.M.
Su and P.C. Yew, Efficient Doacross
Execution for Distributed Shared-Memory Systems, Proc. of Supercomputing
'91, Nov. 1991, pp.842-853 · D.K.
Chen and P.C. Yew, An Empirical Study on Doacross Loops,
Proc. of Supercomputing '91, Nov. 1991, pp. 620-632 · J.
Konicek, et al, The Organization of the
Cedar System, Proc. of 1991 Int'l Conf. on Parallel Processing (ICPP),
Aug. 1991, pp.49-56 · D.J.
Lilja and P.C. Yew, Combining Hardware
and Software Cache Coherence Strategies, Proc. of the 1991 Int'l Conf. on
Supercomputing (ICS), June 1991, pp. 274-283 · H.B.
Lim and P.C. Yew, Parallel Program
Behavioral Study on a Shared-Memory Multiprocessor, Proc. of the 1991
Int'l Conf. on Supercomputing (ICS), June, 1991, pp. 386-395 · H.M.
Su and P.C. Yew, Efficient
Interprocessor Communication on Distributed Shared-Memory Multiprocessors,
Proc. of the 1991 Int'l Conf. on Parallel Processing (ICPP), Vol.1, Aug.
1991, pp. 45-48 · W.T.
Hsu and P.C. Yew, The Performance of
Hierarchical Systems with Wiring Constraints, Proc. of the 1991 Int'l
Conf. on Parallel Processing (ICPP), Vol. 1, Aug. 1991, pp. 9-16 · P.
Konas, P.C. Yew, Parallel Event Discrete
Event Driven Simulation on Shared-Memory Multiprocessors, Proc. of the
24th Annual Simulation Symp., April, 1991, pp. 134-148. · J.
Bruner, H. Cheong, A. Veidenbaum and P. C. Yew, Chief: A Parallel Simulation Environment for Parallel Systems, Proc.
of the 5th Int'l Parallel Processing Symp (IPPS). April, 1991, pp. 568-575 · W.
T. Hsu and P.C. Yew, An Effective
Synchronization Network for Large Multiprocessor Systems, Proc. of the
5th Int'l Parallel Processing Symp (IPPS). May, 1991, pp. 309-317 · D.
Lilja and P.C. Yew, Comparing
Parallelism Extraction Techniques: Superscalar Processors, Pipelined
Processors and Multiprocessors, Proc. of 1990 Int'l Conf. on Parallel
Processing (ICPP), Aug. 1990, pp. 563-564 · P.
Tang, P.C. Yew and C.Q. Zhu, Compiler
Algorithms for Data Synchronization in Nested Parallel Loops, Proc. of
1990 Int'l Conf. on Supercomputing (ICS), June 1990, pp. 177-186 · D.K.
Chen, H.M. Su and P.C. Yew, The Impact
of Synchronization and Granularity on Parallel Systems, Proc. of 17th
Int'l Symp. on Computer Architecture (ISCA-17), June 1990, pp. 239-249 · P.C.
Yew and J. Bruner, SEE: A System
Evaluation Environment for Studying Parallel Systems, Proc. of the First
Workshop on Parallel Processing, Dec. 1990. · H.M.
Su and P.C. Yew, On Data Synchronization
for Multiprocessors, Proc. of the 16th Int'l Symp. on Computer
Architecture (ISCA-16), 1989, pp
416-423 · Z.
Shen, Z. Li and P.C. Yew, An Empirical
Study on Array Subscripts and Data Dependences, Proc. of the 1989 Int'l
Conf. on Parallel Processing (ICPP), Aug. 1989, pp 145-152 · P.Y.
Tang and P.C. Yew, A Parallel Linked List for Shared-Memory Multiprocessors,
Proc. of the 1989 Computer Software and Application Conf, Oct. 1989,
pp.130-135. · Z.
Li, P.C. Yew and C.Q. Zhu, Data Dependence Analysis on Multi-Dimensional
Array References, Proc. of the 1989 Int'l Conf. on Supercomputing, June 1989,
pp 215-224 · P.A.
Emrath, D.A. Padua and P.C. Yew, Cedar
Architecture and Its Software, 22nd Hawaii Intn'l Conf. on System
Sciences, Jan. 1989, pp 306-315. · Z.
Li and P.C. Yew, Efficient
Interprocedural Analysis for Parallel Programs, ACM SIGPLAN Symp. on
Parallel Programming: Experience with Applications, Languages and Systems,
July 1988, pp. 85-99 · P.
Tang, P.C. Yew and C.Q. Zhu, Impact of
Self-Scheduling Order on Performance of Multiprocessor Systems, Proc. of
the 1988 Int'l Conf. on Supercomputing (ICS), pp. 593-603 · Z.
Li and P.C. Yew, Interprocedural
Analysis for Parallel Computing, Proc. of the 1988 Int'l Conf. on
Parallel Processing (ICPP), pp. 221-228 · W.T.
Hsu and P.C. Yew, A Scheme to Enhance
Binary N-Cube Networks, Proc. of the 1987 Int'l Conf. on Parallel
Processing (ICPP), pp. 820-823 · R.L.
Lee, P.C. Yew and D.H. Lawrie, Data
Prefetching in Shared Memory Multiprocessors, Proc. of the 1987 Int'l
Conf. on Parallel Processing (ICPP), pp. 28-31 · Z.
Fang, P.Y. Tang and P.C. Yew, C.Q. Zhu, Dynamic
Processor Self-Scheduling for General Parallel Nested Loops, Proc. of the
1987 Int'l Conf. on Parallel Processing (ICPP), pp. 1-10 · R.L.
Lee, P.C. Yew and D.H. Lawrie, Multiprocessor
Cache Design Considerations, Proc. of the 14th Int'l Symp. on Computer
Architecture (ISCA-14), pp. 253-262, 1987 · P.Y.
Tang and P.C. Yew, Deadlock Prevention
in Processor Self-Scheduling for Nested Parallel Loops, Proc. of the 1987
Int'l Conf. on Parallel Processing (ICPP),
pp. 11-18, 1987 · N.F.
Tzeng, P.C. Yew and C.Q. Zhu, Fault-Diagnosis
in a Multiple-Path Interconnection Network, Proc. of the 16th Int'l
Symp. on Fault-Tolerant Computing, pp. 98-103, July 1986 · P.Y.
Tang and P.C. Yew, Processor
Self-Scheduling for Multiple-Nested Parallel Loops, Proc. of the 1986
Int'l Conf. on Parallel Processing (ICPP), St. Charles, IL., pp. 528-535,
Aug. 1986 · N.F.
Tzeng, P.C. Yew and C.Q. Zhu, A
Fault-Tolerant Scheme for Multistage Interconnection Networks, Proc. of
the 12th Int'l Symp. on Computer Architecture (ISCA-12), pp. 368-375, June
1985 · N.F.
Tzeng, P.C. Yew and C.Q. Zhu, The
Performance of A Fault-Tolerant Multistage Interconnection Network, Proc.
of the 1985 Int'l Conf. on Parallel Processing (ICPP), pp. 458-465, Aug. 1985
· C.Q.
Zhu and P.C. Yew, A Synchronization
Scheme and Its Applications for Large Multiprocessor Systems, Proc. of
the 4th Int'l Conf. on Distributed Computing Systems, pp. 486-493, May 1984. · Q.X.
Xu and P.C. Yew, Simulations and
Analysis for a Multiprocessor System with Multiprogramming, Proc. of the
First Int'l Conf. on Computers and Applications, June 1984. · P.Y.
Chen, P.C. Yew and D.H. Lawrie, Performance
of Packet Switching in a Buffered Single-Stage Shuffle-Exchange Network,
Proc. of the 3rd Int'l Conf. on Distributed Computing Systems, pp. 622-629,
Oct., 1982. · W.
Abu-Sufah, R. Lee, M. Malkawi and P.C. Yew, Experimental Results on the Paging Behavior of Numerical Programs,
Proc. of the 6th Int'l Conf. on Software Engineering (ICSE), pp. 110-117,
Sept., 1982. · J.E.
Lilienkamp, D.H. Lawrie and P.C. Yew, A
Fault Tolerant Interconnection Network Using Error Correcting Codes,
Proc. of the 1982 Int'l Conf. on Parallel Processing (ICPP), pp.123-125, Aug.
1982

B. Journal Papers · G.
Shi, Y. Zhang, S. Shang, W. Wang, Y. Dong, P.C. Yew, A Formally Verified Transformation to Unify Multiple Nested Clock for
a Lustre-Like Language, Science China Information Sciences, January 2019 · L.
Zhong, W. Hou, X. Feng, Z. Zhang, P.C. Yew, RARE: An Efficient Static Fault Detection Framework for
Definition-Use Faults in Large Programs, IEEE Access, February 2018 · C. Wu, Z. Wang, X. Yuan, Z. Wang, L. Li,
P. C. Yew, J. Huang, X. Feng, Y. Lan, Y. Chen, Y. Lai, Y. Guan, Using Local Clocks to Reproduce
Concurrency Bugs, IEEE Trans. on Software
Engineering (TSE), Vol. 44, Issue 11, November 2018. · W. Zhang, X. Ji, Y. Lu, H. Wang, H. Chen,
P.C. Yew, Prophet: A Parallel Instruction-Oriented
Many-Core Simulator, IEEE Transactions on Parallel and Distributed
Systems (TPDS), Vol. 28, No. 10, October 2017. · W. Zhang, X. Ji, S. Yu, H. Chen, T. Li and
P.C. Yew, VarCatcher: A Framework for
Tackling Performance Variability of Parallel Workloads on Multicores,
IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol 28, No. 4,
April 2017. · S. Mehta and P.C. Yew, Variable Liberalization, ACM
Transactions on Architecture and Code Optimization (TACO), Vol. 13, Issue 3,
September 2016. · Z. Fang, S. Mehta, P.C. Yew, A. Zhai, J. Greensky, G. Beeraka, B. Zang, Measuring Micro-architectural Details of Multi- and Many-core Memory Systems Through Micro-benchmarking, ACM Transactions on Architecture and Code Optimization (TACO), Vol.11, Issue 4, January 2015. · F.
Lv, L. Liu, M.H. Cui, L. Wang, Y. Liu, X. Feng, P.C. Yew, WiseThrottling: A New Asynchronous Task
Scheduler for Mitigating I/O Bottleneck in Large-Scale Datacenter Servers,
J. of Supercomputing, 2015 · A.
Holey, V. Mekkat, P.C. Yew, A. Zhai, Performance-Energy
Considerations for Shared Cache Management in a Heterogeneous Multicore
Processor, ACM Transactions on Architecture and Code Optimization (TACO),
Vol 12, Issue 1, March 2015. · C.
Wu, J. Li, D. Xu, P.C. Yew, J. Li, and Z. Wang, FPS: A
Fair-progress Process Scheduling Policy on Shared-Memory Multiprocessors,
IEEE Transactions on
Parallel and Distributed Systems (TPDS), Vol. 26, No. 2, February 2015, pp.
444-454 · Lv,
H.M. Cui, L. Wang, L. Liu, C.G. Wu, X.B. Feng and P.C. Yew, Dynamic I/O-Aware Scheduling for
Batch-Mode Applications on Chip Multiprocessor Systems of Cluster Platform,
J. of Computer Science and Technology (JCST), 29(1): 21-37, 2014 · D.Y. Hong, J.J. Wu, P.C. Yew, W.C. Hsu,
C.C. Hsu, P. Liu, C.M. Wang and Y.C. Chung, Efficient and Retargetable Dynamic Binary Translation on Multicores,
IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 25, No. 3,
March 2014, pp. 622-632 · S. Mehta, G. Beeraka and P.C. Yew, Tile Size Selection Revisited, ACM Transactions on Architecture and Code Optimization (TACO), Vol. 10, No. 4, December 2013 · L. Gao, L. Li, J.L. Xue and P.C. Yew, SEED: A Statically-Greedy and
Dynamically-Adaptive Approach for Speculative Loop Execution, IEEE
Transactions on Computers (TC), Vol. 62, No. 5, May 2013, pp. 1004-1016 · S.Y. Wang, P.C. Yew and A. Zhai,
Code Transformations for Enhancing the
Performance of Speculatively Parallel Threads, J. of Circuits, Systems
and Computers (JCSC), No. 2, Vol. 21, 2012 · Z. Wang, C. Wu, P.C. Yew, J.J. Li and X.
Di, On-the-Fly Structure Splitting for
Heap Objects, ACM Transactions on Architecture and Code
Optimization (TACO), January 2012 · H. Chen, R. Chen, F. Zhang, B. Zang and P.C. Yew, Mercury: combining performance with dependability using self-virtualization, Journal of Computer Science and Technology (JCST), 2012. · H.
Chen, J. Wu, C. Huang, P.C. Yew and B. Zang, Dynamic Software Updating Using a Relaxed Consistency Model,
IEEE Trans. on Software Engineering (TSE), Vol. 37, No. 5, Sept/Oct 2011, pp.
679-694 · P.
Woodward, J. Jayaraj, P.H. Lin, P.C. Yew, Moving
Scientific Codes to IBM Cell Processor and Other Multicore Microprocessor
CPUs, IEEE Computing in Science and Engineering, Vol.10, No.6, pp.16-25, Nov/Dec. 2008 · S.V.
Kodakara, J. Kim, D.J. Lilja, D. Hawkins, W.C. Hsu, and P.C. Yew, CIM: A Reliable Metric for Evaluating
Program Phase Classifications, IEEE Computer Architecture Newsletter
(CAN), 2007 · J.
Lin, W.C. Hsu, P.C. Yew, R.D.C. Ju, and T.F. Ngai, Recovery Code Generation for General Speculative Optimizations, ACM
Transactions on Architecture and Code Optimization (TACO), Vol.3, No.1, March
2006, pp. 67-89 · J.S.
Kong, P.C. Yew and G.H. Lee, Minimizing
the Directory Size for Large-Scale Shared-Memory Multiprocessors, IEICE
Trans. on Information and Systems, Vol. E88-D, No. 11, November 2005, pp.
2533-2543. · J.
Lin, T. Chen, W.C. Hsu, P.C. Yew, R.D.C. Ju, T.F. Ngai and S. Chan, A Compiler Framework for Speculative
Optimizations, ACM Transactions on Architecture and Code Optimization (TACO),
Vol.1, No.3, September 2004, pp. 247-271 · J.
Lu, H. Chen, P.C. Yew, W.C. Hsu, Design
and Implementation of a Lightweight Dynamic Optimization System, Journal of
Instruction-Level Parallelism, Volume 6, 2004 · P.Y.
Tang and P.C. Yew, Interprocedural Induction
Variable Analysis, International Journal of Foundation of Computer
Science, World Scientific, Vol.14, No.3, June 2003, pp.405-423 · S.J.
Lee and P.C. Yew, On Augmenting Trace
Cache for High-Bandwidth Value Prediction, IEEE Trans. on Computers (TC),
Vol.51, No. 9, September 2002, pp. 1074-1088. · S.J.
Lee and P.C. Yew, On Table Bandwidth
and Its Update Delay for Value Prediction on Wide-Issue ILP Processors,
IEEE Transactions on Computers (TC), Vol. 50,
No.8, August 2001, pp.847-852. · H.B.
Lim and P.C. Yew, Efficient Integration
of Compiler-Directed Cache Coherence and Data Prefetching, Journal of
Parallel and Distributed Computing (JPDC), Vol. 61, No. 12, Dec 2001, pp.
1775-1802 · S.Y.
Cho, P.C. Yew and G. Lee, A
High-Bandwidth Memory Pipeline for Wide-Issue Processors, IEEE Trans. on
Computers (TC), Vol. 50, No. 7, July 2001, pp. 709-723. · L.
Choi and P.C. Yew, Compiler Analysis
for Cache Coherence: Interprocedural Array Data-Flow Analysis and Its Impact
on Cache Performance, IEEE Trans. on Parallel and Distributed Systems
(TPDS), Vol. 11, No. 9, Sept 2000, pp. 879-896. · L.
Choi and P.C. Yew, Hardware and
Compiler-Directed Cache Coherence in Large-Scale Multiprocessors, the
IEEE Trans. on Parallel and Distributed Systems (TPDS), Vol. 11, No. 4, April
2000, pp. 375-394. · I.H.
Kazi, et al., JaViz: A Client/Server
Java Profiling Tool, a special issue on Java technology in IBM Systems
Journal, Vol. 39, No. 1, 2000. · J.Y.
Tsai, et al., The Superthreaded
Architecture, a special issue on multithreaded architectures in the IEEE
Trans. on Computers (TC), Vol 48, No. 9, September 1999, pp. 881-903. · D.K.
Chen and P.C. Yew, Redundant
Synchronization Elimination for Doacross Loops, IEEE Trans. on Parallel
and Distributed Systems (TPDS), Vol.10, No. 5, May 1999. · H.B.
Lim and P.C. Yew, Maintaining Cache
Coherence Through Compiler-Directed Data Prefetching, Journal of Parallel
and Distributed Computing (JPDC), Vol 53, No. 2, pp. 144-173, Sep 1998. · J.Y.
Tsai, Z. Jiang, and P.C. Yew, Compiler
Techniques for the Superthreaded Architectures, a Special Issue on
Languages and Compilers for Parallel Computing, International Journal of
Parallel Programming, June 1998. · J.Y.
Tsai, P.C. Yew, et al, Integrating
Parallelizing Compilation Technology and Processor Architecture for
Cost-Effective Concurrent Multithreading, a special issue in Journal of
Information Science and Eng, No. 14, pp.205-222, March 1998 · S.
Adve, et al, The Interaction of
Architecture and Compilation Technology for High-Performance Processor Design,
IEEE Computer, December 1997 · W.T.
Hsu and P.C. Yew, Performance
Evaluation of Wire-Limited Hierarchical Networks, Journal of Parallel and
Distributed Computing (JPDC), Vol. 41, June 1997, pp. 156-172. · J.Y.
Tsai and P.C. Yew, Enhancing
Multiple-Path Speculative Execution with Predicate Window Shifting, a
special issue on Microprocessor Architecture in Journal of System
Architecture, June 1997 · L.
Choi, H.B. Lim and P.C. Yew, Multiprocessor
Cache Coherence: The Compiler-Directed Approach, IEEE Parallel &
Distributed Technology, Winter 1996, pp.23-35 · D.K.
Poulsen and P.C. Yew, Integrating
Fine-Grained Message Passing in Cache Coherent Shared-Memory Multiprocessors,
Journal of Parallel and Distributed Computing (JPDC), Vol. 33, No. 2, March
1996, pp. 172-188. · D.K.
Chen and P.C. Yew, On Effective
Execution of Non-Uniform Doacross Loops, IEEE Trans. on Parallel and
Distributed Systems (TPDS), Vol. 7, No. 5, May 1996, pp. 463-476. · J.D.
Bruner, C.J. Beckmann, P. Konas, D.K. Poulsen and P.C. Yew, Chief: A Simulation Environment for
Studying Parallel Systems, International Journal of Computer Simulation,
Vol.6, No.1, 1996, pp. 89-100. · D.J.
Lilja and P.C. Yew, Improving Memory
Utilization in Cache Coherence Directories, IEEE Trans. on Parallel and
Distributed Systems (TPDS), Vol. 4, No.10, Oct. 1993, pp. 1130-1146. · W.T.
Hsu and P.C. Yew, An Effective
Synchronization Network for Hot Spot Accesses, ACM Trans. on Computer
Systems (TOCS), Vol. 10, No. 3, Aug. 1992, pp. 167-189. · Z.
Shen, Z. Li and P.C. Yew, An Empirical
Study on Program Characteristics for Parallelizing Compilers, IEEE Trans.
on Parallel and Distributed Systems (TPDS), Vol. 1, No. 3, July 1990, pp.
356-364. · Tim
Davis and P.C. Yew, A Stable
Non-Deterministic Parallel Algorithm for General Unsymmetric Sparse LU
Factorization, SIAM J. on Matrix Analysis and Applications, Vol. 2, No.
3, July 1990, pp. 383-403. · Z.
Li, P.C. Yew and C.Q. Zhu, An Efficient
Data Dependence Analysis for Parallelizing Compiler, IEEE Trans. on
Parallel and Distributed Systems (TPDS), Vol. 1, No. 1, Jan. 1990, pp. 26-34. · Z.
Fang, P. Tang, P.C. Yew and C.Q. Zhu, Dynamic
Processor Self-Scheduling for General Parallel Nested Loops, IEEE Trans.
on Computers (TC), Vol. 39, No. 7, July 1990, pp. 919-929. · P.
Tang, P.C. Yew, Software Combining
Algorithms for Distributing Hot-Spot Addressing, J. of Parallel and
Distributed Computing (JPDC), Vol. 10, No.2, Oct. 1990, pp. 130-139. · N.F.
Tzeng, P.C. Yew and C.Q. Zhu, Realizing
Fault-Tolerant Interconnection Networks via Chaining, IEEE Trans. on
Computers (TC), Vol. 37, No. 4, pp. 458-462, April 1988. · Z.
Li and P.C. Yew, Program
Parallelization with Interprocedural Analysis, J. of Supercomputing,
Kluwer Academic Publishers, 1988, pp. 225-244. · C.Q.
Zhu and P.C. Yew, A Scheme to Enforce
Data Dependence on Large Multiprocessor Systems, IEEE Trans. on Software
Engineering (TSE), Vol. SE-13, No. 6, pp. 726-739, June 1987. · P.C.
Yew, N.F. Tzeng and D.H. Lawrie,
Distributing Hot Spot Addressing in Large Scale Multiprocessors, IEEE
Trans. on Computers (TC), Vol. C-36, No. 4, pp. 388-395, April 1987. · P.C.
Yew, D.A. Padua and D.H. Lawrie, Stochastic
Properties of a Multiple-Layer Single-Stage Shuffle-Exchange Network in a
Message Switching Environment, J. of Digital Systems, Vol. 6, No. 4, pp.
387-410, 1982. · P.Y.
Chen, D.H. Lawrie, P.C. Yew and D.A. Padua, Interconnection Networks Using Shuffle, IEEE Computer, Vol. 14,
No. 12, pp. 55-64, December 1981. · P.C.
Yew and D.H. Lawrie, An Easily
Controlled Network for Frequently Used Permutations, IEEE Trans. on
Computers (TC), Vol. C-30, No. 4, pp. 296-298, April 1981.