2020 |
SIEVE: Speculative Inference on the Edge with Versatile Exportation (paper: pdf; slides: pptx) Babak Zamirai, Salar Latifi, Pedram Zamirai, Scott Mahlke Proc. 57th Design Automation Conference (DAC) July 2020. |
Path Sensitive Signatures for Control Flow Error Detection (paper: pdf; slides: pptx) Ze Zhang, Sunghyun Park, Scott Mahlke 21st International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES) June 2020. |
PolygraphMR: Enhancing the Reliability and Dependability of CNNs (paper: pdf; slides: pptx) Salar Latifi, Babak Zamirai, Scott Mahlke 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) June 2020. |
Low-Cost Prediction-Based Fault Protection Strategy (paper: pdf; slides: pptx) Sunghyun Park, Shikai Li, Ze Zhang, Scott Mahlke Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization (CGO) Feb. 2020. |
2019 |
Multi-objective Exploration for Practical Optimization Decisions in Binary Translation (paper: pdf; slides: pptx) Sunghyun Park, Youfeng Wu, Janghaeng Lee, Amir Aupov, Scott Mahlke ESWEEK-TECS special issue / the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2019. |
TF-Net: Deploying Sub-Byte Deep Neural Networks on Microcontrollers (paper: pdf; slides: pptx) Jiecao Yu, Andrew Lukefahr, Reetuparna Das, Scott Mahlke ESWEEK-TECS special issue / the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2019. |
POSTER: Pairing Up CNNs for High Throughput Deep Learning (paper: pdf; slides: pptx) Babak Zamirai, Salar Latifi, Scott Mahlke 2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2019. |
Characterization of Unnecessary Computations in Web Applications (paper: pdf; slides: pptx) Hossein Golestani, Scott Mahlke, Satish Narayanasamy 2019 Intl. Symposium on Performance Analysis of Systems and Software (ISPASS) Mar. 2019. |
2018 |
Scratch That (But Cache This): A Hybrid Register Cache/Scratchpad for GPUs (paper: pdf; slides: pptx) Jonathan Bailey, John Kloosterman, and Scott Mahlke 2018 Intl. Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES) Oct. 2018. |
Sculptor: Flexible Approximation with Selective Dynamic Loop Perforation (paper: pdf; slides: pptx) Shikai Li, Sunghyun Park, Scott Mahlke Proc. 32nd International Conference on Supercomputing (ICS) Jun. 2018. |
Low Cost Transient Fault Protection Using Loop Output Prediction (paper: pdf; slides: pptx) Sunghyun Park, Shikai Li, Scott Mahlke 48th IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W) Jun. 2018. |
Low Cost Transient Fault Protection Using Loop Output Prediction (paper: pdf; slides: pptx) Sunghyun Park, Shikai Li, Scott Mahlke 14th IEEE workshop on Silicon Errors in Logic - System Effects (SELSE) April 2018. |
In-Memory Data Parallel Processor (paper: pdf; slides: pptx) Daichi Fujiki, Scott Mahlke, Reetuparna Das Proc. 23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2018. |
2017 |
Mirage Cores: The Illusion of Many Out-of-order Cores Using In-order Hardware (paper: pdf; slides: pptx) Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, Scott Mahlke Proc. 50th IEEE/ACM International Symposium on Microarchitecture (MICRO) Oct. 2017. |
RegLess: Just-in-Time Operand Staging for GPUs (paper: pdf; slides: pptx) John Kloosterman, Jonathan Beaumont, D. Anoushe Jamshidi, Jonathan Bailey, Trevor Mudge, Scott Mahlke Proc. 50th IEEE/ACM International Symposium on Microarchitecture (MICRO) Oct. 2017. |
DeftNN: addressing bottlenecks for DNN execution on GPUs via synapse vector elimination and near-compute data fission (paper: pdf) Parker Hill, Animesh Jain, Mason Hill, Babak Zamirai, Chang-Hong Hsu, Michael A. Laurenzano, Scott Mahlke, Lingjia Tang, Jason Mars Proc. 50th IEEE/ACM International Symposium on Microarchitecture (MICRO) Oct. 2017. |
Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism (paper: pdf; slides: pptx) Jiecao Yu, Andrew Lukefahr, David Palframan, Ganesh Dasika, Reetuparna Das, Scott Mahlke Proc. The 44th International Symposium on Computer Architecture (ISCA) Jun. 2017. |
Dynamic Resource Management for Efficient Utilization of Multitasking GPUs (paper: pdf; slides: pptx) Jason Jong Kyu Park, Yongjun Park, and Scott Mahlke Proc. 22nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Apr. 2017. |
2016 |
BugMD: Automatic Mismatch Diagnosis for Bug Triaging (paper: pdf) Biruk Mammo, Milind Furia, Valeria Bertacco, Scott Mahlke, Daya S Khudia Proc. 35th Intl. Conference on Computer-Aided Design (ICCAD) Nov. 2016. |
Concise loads and stores: The case for an asymmetric compute-memory architecture for approximation (paper: pdf) Animesh Jain, Parker Hill, Shih-Chieh Lin, Muneeb Khan, Md E. Haque, Michael A. Laurenzano, Scott Mahlke, Lingjia Tang, Jason Mars Proc. 49th IEEE/ACM Intl. Symposium on Microarchitecture (MICRO) Oct. 2016. |
A Bypass First Policy for Energy-Efficient Last Level Caches (paper: pdf; slides: pptx) Jason Jong Kyu Park, Yongjun Park, and Scott Mahlke Proc. International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) Jul. 2016. |
Input responsiveness: using canary inputs to dynamically steer approximation (paper: pdf) Michael A. Laurenzano, Parker Hill, Mehrzad Samadi, Scott Mahlke, Jason Mars and Lingjia Tang Proc. 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) Jun. 2016, pp. 161-176. |
Statistical Error Bounds for Data Parallel Applications (paper: pdf) Parker Hill, Michael Laurenzano, Babak Zamirai, Mehrzad Samadi, Scott Mahlke, Jason Mars, Lingjia Tang The 2016 Workshop on Approximate Computing Across the Stack (WAX) April 2016. |
Exploring Fine-Grained Heterogeneity with Composite Cores (paper: pdf) Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Faissal M. Sleiman, Ronald G. Dreslinski, Thomas F. Wenisch, and Scott Mahlke IEEE Transactions on Computers (TC) vol. 65, no. 2, Feb. 2016, pp. 535-547. |
Quality Control for Approximate Accelerators by Error Prediction (paper: pdf) Daya S Khudia, Babak Zamirai, Mehrzad Samadi and Scott Mahlke IEEE Design and Test Vol. 33, No. 1, Jan. 2016, pp. 43-50. |
2015 |
WarpPool: Sharing Requests with Inter-Warp Coalescing for Throughput Processors (paper: pdf; slides: pptx) John Kloosterman, Jonathan Beaumont, Mick Wollman, Ankit Sethia, Ron Dreslinski, Trevor Mudge, and Scott Mahlke Proc. 48th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2015. |
DynaMOS: Dynamic Schedule Migration for Heterogeneous Cores (paper: pdf; slides: pptx) Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, and Scott Mahlke Proc. 48th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2015. |
ELF: Maximizing Memory-level Parallelism for GPUs with Coordinated Warp and Fetch Scheduling (paper: pdf; slides: pptx) Jason Jong Kyu Park, Yongjun Park, and Scott Mahlke Proc. International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Nov. 2015. |
Orchestrating Multiple Data-Parallel Kernels on Multiple Devices (paper: pdf; slides: pptx) Janghaeng Lee, Mehrzad Samadi, and Scott Mahlke Proc. 24th Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Oct. 2015. |
Fine Grain Cache Partitioning using Per-Instruction Working Blocks (paper: pdf; slides: pptx) Jason Jong Kyu Park, Yongjun Park, and Scott Mahlke Proc. 24th International Conference on Parallel Architectures and Compilation Techniques (PACT) Oct. 2015. |
SKMD: Single Kernel on Multiple Devices for Transparent CPU-GPU Collaboration (paper: pdf) Janghaeng Lee, Mehrzad Samadi, Yongjun Park, and Scott Mahlke ACM Transactions on Computer Systems (TOCS) Aug. 2015. |
Rumba: An Online Quality Management System for Approximate Computing (paper: pdf; slides: pptx) Daya S Khudia, Babak Zamirai, Mehrzad Samadi and Scott Mahlke Proc. The 42nd International Symposium on Computer Architecture (ISCA) Jun. 2015. |
Colony of NPUs: Scaling the Efficiency of Neural Accelerators (paper: pdf; slides: key) Babak Zamirai, Daya S Khudia, Mehrzad Samadi, and Scott Mahlke Proc. The 2015 Workshop on Approximate Computing Across the Stack (WAX) Jun. 2015. |
Approximating with Input Level Granularity (paper: pdf; slides: pdf) Parker Hill, Michael Laurenzano, Mehrzad Samadi, Scott Mahlke, Jason Mars, and Lingjia Tang Proc. The 2015 Workshop on Approximate Computing Across the Stack (WAX) Jun. 2015. |
Adaptive Cache Partitioning on a Composite Core (paper: pdf; slides: pptx) Jiecao Yu, Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Scott Mahlke The 3rd Annual Workshop on Parallelism in Mobile Platforms (PRISM-3) Jun. 2015. |
Accelerating Asynchronous Programs through Event Sneak Peek (paper: pdf; slides: pptx) Gaurav Chadha, Scott Mahlke, Satish Narayanasamy Proc. The 42nd International Symposium on Computer Architecture (ISCA) Jun. 2015. |
Accelerating Mobile Applications through Flip-Flop Replication (paper: pdf) Mark Gordon, David Ke Hong, Peter M. Chen, Jason Flinn, Scott Mahlke, and Z. Morley Mao Proc. 13th Intl. Conference on Mobile Systems, Applications, and Services May 2015, pp. 137-150. |
Chimera: Collaborative Preemption for Multitasking on a Shared GPU (paper: pdf; slides: pptx) Jason Jong Kyu Park, Yongjun Park, and Scott Mahlke Proc. 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2015. |
Mascar: Speeding up GPU Warps by Reducing Memory Pitstops (paper: pdf; slides: pptx) Ankit Sethia, D. Anoushe Jamshidi and Scott Mahlke The 21st IEEE Symposium on High Performance Computer Architecture (HPCA) Feb. 2015. |
Using Graphics Processing Units in an LTE Base Station (paper: pdf) Qi Zheng, Yajing Chen, Hyunseok Lee, Ronald Dreslinski, Chaitali Chakrabarti, Achilleas Anastasopoulos, Scott Mahlke, Trevor Mudge Journal of Signal Processing Systems vol. 78, no. 1, Jan. 2015, pp. 35-47. |
2014 |
Equalizer: Dynamic Tuning of GPU Resources for Efficient Execution (paper: pdf; slides: pptx) Ankit Sethia and Scott Mahlke The 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2014. |
Harnessing Soft Computations for Low-budget Fault Tolerance (paper: pdf; slides: pptx) Daya Shanker Khudia and Scott Mahlke The 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2014. |
Scaling Performance via Self-Tuning Approximation for Graphics Engines (paper: pdf) Mehrzad Samadi, Janghaeng Lee, D. Anoushe Jamshidi, Amir Hormati, and Scott Mahlke ACM Transactions on Computer Systems (TOCS) Aug. 2014. |
EFetch: Optimizing Instruction Fetch for Event-Driven Web Applications (paper: pdf; slides: pptx) Gaurav Chadha, Scott Mahlke, Satish Narayanasamy Proc. 23rd Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2014. |
D2MA: Accelerating Coarse-Grained Data Transfer for GPUs (paper: pdf; slides: pptx) D. Anoushe Jamshidi, Mehrzad Samadi, and Scott Mahlke Proc. 23rd Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2014. |
VAST: The Illusion of a Large Memory Space for GPUs (paper: pdf; slides: pptx) Janghaeng Lee, Mehrzad Samadi, and Scott Mahlke Proc. 23rd Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2014. |
Heterogeneous Microarchitectures Trump Voltage Scaling for Low-Power Cores (paper: pdf; slides: pptx) Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Ronald Dreslinski Jr., Thomas F. Wenisch, and Scott Mahlke Proc. 23rd Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2014. |
Embracing Heterogeneity with Dynamic Core Boosting (paper: pdf; slides: pptx) Hyoun Kyu Cho and Scott Mahlke Proc. 2014 ACM International Conference on Computing Frontiers (CF) May 2014. |
CPU-GPU Collaboration for Output Quality Monitoring (paper: pdf; slides: pptx) Mehrzad Samadi and Scott Mahlke First Workshop on Approximate Computing Across the System Stack (WACAS) Mar. 2014. |
Paraprox: Pattern-Based Approximation for Data Parallel Applications (paper: pdf; slides: pptx) Mehrzad Samadi, D. Anoushe Jamshidi, Janghaeng Lee, and Scott Mahlke Proc. 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2014. |
Leveraging GPUs Using Cooperative Loop Speculation (paper: pdf) Mehrzad Samadi, Amir Hormati, Janghaeng Lee, and Scott Mahlke ACM Transactions on Architecture and Code Optimization (TACO) Feb. 2014. |
2013 |
Trace Based Phase Prediction For Tightly-Coupled Heterogeneous Cores (paper: pdf; slides: pdf) Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, and Scott Mahlke Proc. 46th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2013. |
SAGE: Self-Tuning Approximation for Graphics Engines (paper: pdf; slides: pptx) Mehrzad Samadi, Janghaeng Lee, D. Anoushe Jamshidi, Amir Hormati, and Scott Mahlke Proc. 46th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2013. |
Efficient Execution of Augmented Reality Applications on Mobile Programmable Accelerators (paper: pdf; slides: pptx) Jason Jong Kyu Park, Yongjun Park, and Scott Mahlke Proc. The 2013 International Conference on Field-Programmable Technology (ICFPT) Dec. 2013. |
APOGEE: Adaptive Prefetching On GPUs for Energy Efficiency (paper: pdf; slides: pptx) Ankit Sethia, Ganesh Dasika, Mehrzad Samadi, and Scott Mahlke Proc. 22nd Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2013. |
Transparent CPU-GPU Collaboration for Data-Parallel Kernels on Heterogeneous Systems (paper: pdf; slides: pptx) Janghaeng Lee, Mehrzad Samadi, Yongjun Park, and Scott Mahlke Proc. 22nd Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2013. |
Low Cost Control Flow Protection Using Abstract Control Signatures (paper: pdf; slides: pptx) Daya Shanker Khudia and Scott Mahlke Proc. ACM SIGPLAN 2013 Conference on Languages, Compilers, Tools and Theory for Embedded Systems (LCTES) Jun. 2013. |
Concurrency Bugs in Multithreaded Software: Modeling and Analysis Using Petri Nets (paper: pdf) Hongwei Liao, Yin Wang, Hyoun Kyu Cho, Jason Stanley, Terence Kelly, Stéphane Lafortune, Scott Mahlke, and Spyros Reveliotis Journal of Discrete Event Dynamic Systems Vol. 23, Issue 2, Jun. 2013, pp. 157-195. |
Practical Lock/Unlock Pairing for Concurrent Programs (paper: pdf; slides: pptx) Hyoun Kyu Cho, Yin Wang, Hongwei Liao, Terence Kelly, Stephane Lafortune, and Scott Mahlke Proc. 2013 Intl. Symposium on Code Generation and Optimization Feb. 2013. |
Instant Profiling: Instrumentation Sampling for Profiling Datacenter Applications (paper: pdf; slides: pptx) Hyoun Kyu Cho, Tipp Moseley, Richard Hank, Derek Bruening, and Scott Mahlke Proc. 2013 Intl. Symposium on Code Generation and Optimization Feb. 2013. |
Illusionist: Transforming Lightweight Cores into Aggressive Cores on Demand (paper: pdf; slides: pdf) Amin Ansari, Shuguang Feng, Shantanu Gupta, Josep Torrellas, and Scott Mahlke Proc. 19th IEEE Intl. Symposium on High Performance Computer Architecture (HPCA) Feb. 2013. |
2012 |
A Customized Processor for Energy Efficient Scientific Computing (paper: pdf) Ankit Sethia, Ganesh Dasika, Trevor Mudge, and Scott Mahlke IEEE Transactions on Computers Vol. 61, No. 12, Dec. 2012, pp. 1711-1723. |
Efficient Performance Scaling of Future CGRAs for Mobile Applications (paper: pdf; slides: ppt) Yongjun Park, Jason Jong Kyu Park, and Scott Mahlke Proc. The 2012 International Conference on Field-Programmable Technology (FPT) Dec. 2012. |
Composite Cores: Pushing Heterogeneity into a Core (paper: pdf; slides: pptx) Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Faissal M. Sleiman, Ronald Dreslinski, Thomas F. Wenisch, and Scott Mahlke Proc. 45th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2012. |
Libra: Tailoring SIMD Execution using Heterogeneous Hardware and Dynamic Configurability (paper: pdf; slides: ppt) Yongjun Park, Jason Jong Kyu Park, Hyunchul Park, and Scott Mahlke Proc. 45th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2012. |
Dynamic Acceleration of Multithreaded Program Critical Paths in Near-Threshold Systems (paper: pdf; slides: ppt) Hyoun Kyu Cho and Scott Mahlke 2012 Workshop on Near-threshold Computing Dec. 2012. |
When Less Is MOre (LIMO): Controlled Parallelism for Improved Efficiency (paper: pdf; slides: pptx) Gaurav Chadha, Scott Mahlke and Satish Narayanasamy Proc. 2012 Intl. Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES) Oct. 2012, pp. 141-150. |
COMET: Code Offload by Migrating Execution Transparently (paper: pdf; slides: ppt) Mark S. Gordon, D. Anoushe Jamshidi, Scott Mahlke, Z. Morley Mao and Xu Chen Proc. 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI) Oct. 2012, pp. 93-106. |
Efficient Soft Error Protection for Commodity Embedded Microprocessors using Profile Information (paper: pdf; slides: pptx) Daya Shanker Khudia, Griffin Wright, and Scott Mahlke Proc. ACM SIGPLAN 2012 Conference on Languages, Compilers, Tools and Theory for Embedded Systems (LCTES) Jun. 2012. |
Adaptive Input-aware Compilation for Graphics Engines (paper: pdf; slides: pptx) Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Janghaeng Lee, and Scott Mahlke Proc. ACM SIGPLAN 2012 Conference on Programming Languages Design and Implementation (PLDI) Jun. 2012. |
Process Variation in Near-Threshold Wide SIMD Architecture (paper: pdf; slides: ppt) Sangwon Seo, Ronald Dreslinski, Mark Woh, Yongjun Park, Scott Mahlke, David Blaauw, Chaitali Chakrabarti, and Trevor Mudge Proc. 49th Design Automation Conference (DAC) Jun. 2012. |
Runtime Asynchronous Fault Tolerance via Speculation (paper: pdf) Yun Zhang, Soumyadeep Ghosh, Jialu Huang, Jae W. Lee, Scott A. Mahlke, and David I. August Proc. 2012 Intl. Symposium on Code Generation and Optimization (CGO) Apr. 2012. |
Automatic Speculative DOALL for Clusters (paper: pdf) Hanjun Kim, Nick P. Johnson, Jae W. Lee, Scott A. Mahlke, and David I. August Proc. 2012 Intl. Symposium on Code Generation and Optimization (CGO) Apr. 2012. |
Reducing the Cost of Protection Against Soft Errors using Profile Based Analysis (paper: pdf; slides: pptx) Daya Shanker Khudia, Griffin Wright, and Scott Mahlke 8th IEEE workshop on Silicon Errors in Logic - System Effects (SELSE) Mar. 2012. |
SIMD Defragmenter: Efficient ILP Realization on Data-parallel Architectures (paper: pdf; slides: ppt) Yongjun Park, Sangwon Seo, Hyunchul Park, Hyoun Kyu Cho, and Scott Mahlke Proc. 17th Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2012. |
Paragon: Collaborative Speculative Loop Execution on GPU and CPU (paper: pdf; slides: ppt) Mehrzad Samadi, Amir Hormati, Janghaeng Lee, and Scott Mahlke Fifth Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU) Mar. 2012. |
2011 |
Encore: Low-Cost, Fine-Grained Transient Fault Recovery (paper: pdf; slides: ppt) Shuguang Feng, Shantanu Gupta, Amin Ansari, Scott Mahlke, and David August Proc. 44th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2011. |
Bundled Execution of Recurring Traces for Energy-Efficient General Purpose Processing (paper: pdf; slides: ppt) Shantanu Gupta, Shuguang Feng, Amin Ansari, Scott Mahlke, and David August Proc. 44th IEEE/ACM International Symposium on Microarchitecture (MICRO) Dec. 2011. |
PEPSC: A Power Efficient Processor for Scientific Computing (paper: pdf ; slides: ppt) Ganesh Dasika, Ankit Sethia, Trevor Mudge, and Scott Mahlke Proc. 20th Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Oct. 2011. |
Dynamically Accelerating Client-side Web Applications through Decoupled Execution (paper: pdf ; slides: ppt) Mojtaba Mehrara and Scott Mahlke Proc. 2011 Intl. Symposium on Code Generation and Optimization (CGO) April 2011. |
Sponge: Portable Stream Programming on Graphics Engines (paper: pdf ; slides: pptx) Amir Hormati, Mehrzad Samadi, Mark Woh, Trevor Mudge and Scott Mahlke Proc. 16th Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2011, pp 381-392. |
Archipelago: A Polymorphic Cache Design for Enabling Robust Near-Threshold Operation (paper: pdf ; slides:pptx) Amin Ansari, Shuguang Feng, Shantanu Gupta, and Scott Mahlke Proc. 17th IEEE Intl. Symposium on High Performance Computer Architecture (HPCA) February 2011. |
Dynamic Parallelization of JavaScript Applications Using an Ultra-lightweight Speculation Mechanism (paper: pdf ; slides: pptx) Mojtaba Mehrara, Po-Chun Hsu, Mehrzad Samadi, and Scott Mahlke Proc. 17th IEEE Intl. Symposium on High Performance Computer Architecture (HPCA) Feb. 2011, pp. 87-98. |
A Power-Efficient 32b ARM ISA Processor Using Timing-error Detection and Correction for Transient-error Tolerance and Adaptation to PVT Variation (paper: pdf) David Bull, Shidhartha Das, Karthik Shivshankar, Ganesh Dasika, Krisztian Flautner, and David Blaauw IEEE Journal of Solid-State Circuits (JSSC) Vol. 46, No. 1, Jan. 2011, pp. 18-31. |
Maximizing Spare Utilization by Virtually Reorganizing Faulty Cache Lines (paper: pdf) Amin Ansari, Shantanu Gupta, Shuguang Feng, and Scott Mahlke IEEE Transactions on Computers Vol. 60, No. 1, Jan. 2011, pp. 35-49. |
StageNet: A Reconfigurable Fabric for Constructing Dependable CMPs (paper: pdf) Shantanu Gupta, Shuguang Feng, Amin Ansari and Scott Mahlke IEEE Transactions on Computers Vol. 60, No. 1, Jan. 2011, pp. 5-19. |
2010 |
Erasing Core Boundaries for Robust and Configurable Performance (paper: pdf ; slides: pptx) Shantanu Gupta, Shuguang Feng, Amin Ansari, and Scott Mahlke Proc. 43rd Intl. Symposium on Microarchitecture (MICRO) December 2010. |
Putting Faulty Cores to Work (paper: pdf) Amin Ansari, Shuguang Feng, Shantanu Gupta, and Scott Mahlke IEEE Micro Vol. 30, No. 6, Nov. 2010, pp. 36-45. |
Mighty Morphing Power-SIMD (paper: ; slides: ) Ganesh Dasika, Mark Woh, Sangwon Seo, Nathan Clark, Trevor Mudge, and Scott Mahlke Proc. 2010 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2010. |
Resource Recycling: Putting Idle Resources to Work on a Composable Accelerator (paper: pdf ; slides:pptx) Yongjun Park, Hyunchul Park, Scott Mahlke and Sukjin Kim Proc. 2010 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2010. |
MEDICS: Ultra-Portable Processing for Medical Image Reconstruction Ganesh Dasika, Ankit Sethia, Vincentius Robby, Trevor Mudge, and Scott Mahlke Proc. 19th Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2010. |
StageWeb: Interweaving Pipeline Stages into a Wearout and Variation Tolerant CMP Fabric (paper: pdf ; slides: pptx) Shantanu Gupta, Amin Ansari, Shuguang Feng, and Scott Mahlke Proc. 40th Intl. Conference on Dependable Systems and Networks (DSN) Jun. 2010. |
Necromancer: Enhancing System Throughput by Animating Dead Cores (paper: pdf ; slides: pptx) Amin Ansari, Shuguang Feng, Shantanu Gupta, and Scott Mahlke Proc. 37th Intl. Symposium on Computer Architecture (ISCA) Jun. 2010. |
Shoestring: Probabilistic Soft Error Reliability on the Cheap (paper: pdf ; slides: pptx) Shuguang Feng, Shantanu Gupta, Amin Ansari, and Scott Mahlke Proc. 15th Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2010, pp 385-396. |
MacroSS: Macro-SIMDization of Streaming Applications (paper: pdf ; slides: pptx) Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Kudlur, Rodric Rabbah, Trevor Mudge and Scott Mahlke Proc. 15th Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) Mar. 2010, pp 285-296. |
A Power-Efficient 32b ARM ISA Processor Using Timing-error Detection and Correction for Transient-error Tolerance and Adaptation to PVT Variation (paper: pdf ; slides: pdf) David Bull, Shidhartha Das, Karthik Shivshankar, Ganesh Dasika, Krisztian Flautner, and David Blaauw Proc. 2010 Intl. Solid-State Circuits Conference (ISSCC) Feb. 2010, pp 284-286. |
Maestro: Orchestrating Lifetime Reliability in Chip Multiprocessors (paper: pdf ; slides: pptx) Shuguang Feng, Shantanu Gupta, Amin Ansari, and Scott Mahlke Proc. 2010 Intl. Conference on High-Performance Embedded Architectures and Compilers (HiPEAC) Jan. 2010, pp 186-200. |
AnySP: Anytime Anywhere Anyway Signal Processing (paper: pdf) Mark Woh, Sangwon Seo, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztian Flautner IEEE Micro (2009 Top Picks in Computer Architecture) Vol. 30, No. 1, Jan. 2010, pp. 81-91. |
Mobile Computers for the Next-Generation Cell Phone (paper: pdf) Mark Woh, Scott Mahlke, Trevor Mudge, and Chaitali Chakrabarti IEEE Computer Vol. 43, No. 1, Jan. 2010, pp. 81-85. |
2009 |
Eliminating Concurrency Bugs with Control Engineering (paper: pdf) Terence Kelly, Yin Wang, Stephane Lafortune, and Scott Mahlke IEEE Computer Vol. 42, No. 12, Dec. 2009, pp. 52-60. |
Polymorphic Pipeline Array: A Flexible Multicore Accelerator with Virtualized Execution for Mobile Multimedia Applications (paper: pdf ; slides: pptx) Hyunchul Park, Yongjun Park, and Scott Mahlke Proc. 42nd Intl. Symposium on Microarchitecture (MICRO) Dec. 2009, pp. 370-380. |
ZerehCache: Armoring Cache Architectures in High Defect Density Technologies (paper: pdf ; slides: pptx) Amin Ansari, Shantanu Gupta, Shuguang Feng, and Scott Mahlke Proc. 42nd Intl. Symposium on Microarchitecture (MICRO) Dec. 2009, pp. 100-110. |
Low-Power Scientific Computing (paper: pdf ; slides: pptx) Ganesh Dasika, Ankit Sethia, Trevor Mudge, and Scott Mahlke 1st Workshop on New Directions in Computer Architecture Dec. 2009. |
Multicore Compilation Strategies and Challenges (paper: pdf) Mojtaba Mehrara, Thomas Jablin, Dan Upton, David August, Kim Hazelwood, and Scott Mahlke IEEE Signal Processing Magazine Vol. 26, No. 6, Nov. 2009, pp. 55-63. |
CGRA Express: Accelerating Execution using Dynamic Operation Fusion (paper: pdf ; slides: ppt) Yongjun Park, Hyunchul Park and Scott Mahlke Proc. 2009 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2009, pp. 271-280. |
Adaptive Online Testing for Efficient Hard Fault Detection (paper: pdf ; slides:ppt) Shantanu Gupta, Amin Ansari, Shuguang Feng and Scott Mahlke Proc. 27th Intl. Conference on Computer Design (ICCD) Oct. 2009, pp. 343-349. |
Flextream: Adaptive Compilation of Streaming Applications for Heterogeneous Architectures (paper: pdf ; slides: pptx) Amir Hormati, Yoonseo Choi, Manjunath Kudlur, Rodric Rabbah, Trevor Mudge and Scott Mahlke Proc. 18th Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sept. 2009, pp. 214-223. |
Enabling Ultra Low Voltage System Operation by Tolerating On-Chip Cache Failures (paper: pdf ; slides: pptx) Amin Ansari, Shuguang Feng, Shantanu Gupta, and Scott Mahlke Proc. 2009 Intl. Symposium on Low Power Electronics and Design (ISLPED) Aug. 2009, pp. 307-310. |
High Performance Mobile Computing Using Flexible Wide SIMD Processors (slides: ppt) Scott Mahlke 9th Intl. Forum on Embedded MPSoC and Multicore (MPSOC) Aug. 2009. |
Parade: A Versatile Parallel Architecture for Accelerating Pulse-Train Clustering (paper: pdf ; slides: pptx) Amin Ansari, Dan Zhang, and Scott Mahlke Proc. 7th IEEE Symposium on Application Specific Processors (SASP) Jul. 2009, pp. 88-93. |
Power-Efficient Medical Image Processing using PUMA (paper: pdf ; slides: pptx) Ganesh Dasika, Kevin Fan and Scott Mahlke Proc. 7th IEEE Symposium on Application Specific Processors (SASP) Jul. 2009, pp. 29-34. |
A Dataflow-centric Approach to Design Low Power Control Paths in CGRAs (paper: pdf ; slides: ppt) Hyunchul Park, Yongjun Park, and Scott Mahlke Proc. 7th IEEE Symposium on Application Specific Processors (SASP) Jul. 2009, pp. 15-20. |
Liquid Metal's OPTIMUS: Synthesis of Efficient Streaming Hardware (slides: pptx) Scott Mahlke and Rodric Rabbah Tutorial at 46th Design Automation Conference, High-Level Synthesis for ESL Design: Fundamentals and Case Studies Jul. 2009. |
Customizing Wide-SIMD Architectures for H.264 (paper: pdf ; slides: ppt) Sangwon Seo, Mark Woh, Scott Mahlke, Trevor Mudge, Sundaram Vijay, and Chaitali Chakrabarti Proc. 9th Intl. Symposium on Systems, Architectures, Modeling and Simulation (SAMOS) Jul. 2009, pp. 172-179. |
AnySP: Anytime Anywhere Anyway Signal Processing (paper: pdf ; slides: ppt) Mark Woh, Sangwon Seo, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztian Flautner Proc. 36th Intl. Symposium on Computer Architecture (ISCA) Jun. 2009, pp. 128-139. |
Parallelizing Sequential Applications on Commodity Hardware using a Low-cost Software Transactional Memory (paper: pdf ; slides: ppt) Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, and Scott Mahlke Proc. ACM SIGPLAN 2009 Conference on Programming Languages Design and Implementation (PLDI) Jun. 2009, pp. 166-176. |
Reducing Control Power in CGRAs with Token Flow (paper: pdf ; slides: ppt) Hyunchul Park, Yongjun Park, and Scott Mahlke Workshop on Optimizations for DSP and Embedded Systems (ODES) Mar. 2009. |
Stream Compilation for Real-time Embedded Multicore Systems (paper: pdf ; slides: ppt) Yoonseo Choi, Yuan Lin, Nathan Chong, Scott Mahlke and Trevor Mudge Proc. 2009 International Symposium on Code Generation and Optimization (CGO) Mar. 2009, pp. 210-220. |
Bridging the Computation Gap Between Programmable Processors and Hardwired Accelerators (paper: pdf; slides: ppt) Kevin Fan, Manjunath Kudlur, Ganesh Dasika, and Scott Mahlke Proc. 15th Intl. Symposium on High-Performance Computer Architecture (HPCA) Feb. 2009, pp. 313-322. |
The Theory of Deadlock Avoidance via Discrete Control (paper: pdf ; slides: ppt) Yin Wang, Stephane Lafortune, Terence Kelly, Manjunath Kudlur, and Scott Mahlke Proc. 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL) Jan. 2009, pp. 252-263. |
2008 |
Gadara: Dynamic Deadlock Avoidance for Multithreaded Programs (paper: pdf ; slides: ppt) Yin Wang, Terence Kelly, Manjunath Kudlur, Stephane Lafortune, and Scott Mahlke Proc. 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI) Dec. 2008, pp. 281-294. |
From SODA to Scotch: The Evolution of a Wireless Baseband Processor (paper: pdf ; slides:ppt) Mark Woh, Yuan Lin, Sangwon Seo, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, Richard Bruce, Danny Kershaw, Alastair Reid, Mladen Wilder, and Krisztian Flautner Proc. 41st Intl. Symposium on Microarchitecture (MICRO) Nov. 2008, pp. 152-163. |
The StageNet Fabric for Constructing Resilient Multicore Systems (paper: pdf ; slides:ppt) Shantanu Gupta, Shuguang Feng, Amin Ansari, Jason Blome, and Scott Mahlke Proc. 41st Intl. Symposium on Microarchitecture (MICRO) Nov. 2008, pp. 141-151. |
Adaptive Streaming for Dealing with Dynamic Heterogeneity (slides: ppt) Amir Hormati and Scott Mahlke Workshop on Streaming Systems: From Web and Enterprise to Multicore Nov. 2008. |
A Reconfigurable Microarchitecture Building Block for Resilient CMP Systems (paper: pdf ; slides:ppt) Shantanu Gupta, Shuguang Feng, Amin Ansari, Jason Blome, and Scott Mahlke Proc. 2008 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2008, pp. 1-10. |
Optimus: Efficient Realization of Streaming Applications on FPGAs (paper: pdf; slides: ppt) Amir Hormati, Manjunath Kudlur, David Bacon, Scott Mahlke, and Rodric Rabbah Proc. 2008 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2008, pp. 41-50. |
Edge-centric Modulo Scheduling for Coarse-Grained Reconfigurable Architectures (paper: pdf ; slides: ppt) Hyunchul Park, Kevin Fan, Scott Mahlke, Taewook Oh, Heeseok Kim, and Hong-seok Kim. Proc. 17th Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Oct. 2008, pp. 166-176. |
Reliable Systems on Unreliable Fabrics (paper: pdf) Todd Austin, Valeria Bertacco, Scott Mahlke, and Yu Cao IEEE Design and Test of Computers Vol. 25, No. 4, Jul. 2008, pp. 322-332. |
A Parameterized Dataflow Language Extension for Embedded Streaming Systems (paper: pdf; slides: ppt) Yuan Lin, Yoonseo Choi, Scott Mahlke, Trevor Mudge, and Chaitali Chakrabarti Proc. Intl. Symposium on Systems, Architectures, Modeling and Simulation (SAMOS) Jul. 2008, pp. 10-17. |
Olay: Combat the Signs of Aging with Introspective Reliability Management (paper: pdf; slides: ppt) Shuguang Feng, Shantanu Gupta, and Scott Mahlke The Workshop on Quality-Aware Design (W-QUAD) Jun. 2008. |
VEAL: Virtualized Execution Accelerator for Loops (paper: pdf ; slides: ppt) Nathan Clark, Amir Hormati, and Scott Mahlke Proc. 35th Intl. Symposium on Computer Architecture (ISCA) Jun. 2008, pp. 389-400. |
DVFS in Loop Accelerators using BLADES (paper: pdf ; slides: ppt) Ganesh Dasika, Shidhartha Das, Kevin Fan, Scott Mahlke and David Bull Proc. 45th Design Automation Conference (DAC) Jun. 2008, pp. 894-897. |
Orchestrating the Execution of Stream Programs on Multicore Platforms (paper: pdf; slides: ppt) Manjunath Kudlur and Scott Mahlke Proc. ACM SIGPLAN 2008 Conference on Programming Languages Design and Implementation (PLDI) Jun. 2008, pp. 114-124. |
Integrating Post-programmability Into the High-level Synthesis Equation (slides: ppt) Scott Mahlke Tutorial at 45th Design Automation Conference, High-level Synthesis: Back to the Future Workshop Jun. 2008. |
The Application of Supervisory Control to Deadlock Avoidance in Concurrent Software (paper: pdf ; slides: ppt) Yin Wang, Terence Kelly, Manjunath Kudlur, Scott Mahlke, and Stephane Lafortune. 9th Intl. Workshop on Discrete Event Systems (WODES) May 2008. |
Modulo Scheduling for Highly Customized Datapaths to Increase Hardware Reusability (paper: pdf; slides: ppt) Kevin Fan, Hyunchul Park, Manjunath Kudlur, and Scott Mahlke. Proc. 2008 Intl. Symposium on Code Generation and Optimization (CGO) Apr. 2008, pp. 124-133. |
Analyzing the Scalability of SIMD for the Next Generation Software Defined Radio (paper: pdf; slides: ppt) Mark Woh, Yuan Lin, Sangwon Seo, Trevor Mudge and Scott Mahlke. Proc. 2008 IEEE Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP) Mar. 2008, pp. 5388-5391. |
Uncovering Hidden Loop Level Parallelism in Sequential Applications (paper: pdf; slides: pdf) Hongtao Zhong, Mojtaba Mehrara, Steve Lieberman, and Scott Mahlke. Proc. 14th Intl. Symposium on High-Performance Computer Architecture (HPCA) Feb. 2008, pp. 290-301. |
2007 |
StageNet: A Reconfigurable CMP Fabric for Resilient Systems (paper: pdf; slides: ppt) Shantanu Gupta, Shuguang Feng, Jason Blome, and Scott Mahlke. 2nd Reconfigurable and Adaptive Architecture Workshop (RAAW) Dec. 2007. |
Self-calibrating Online Wearout Detection (paper: pdf; slides: ppt) Jason Blome, Shuguang Feng, Shantanu Gupta, and Scott Mahlke. Proc. 40th Intl. Symposium on Microarchitecture (MICRO) Dec. 2007, pp. 109-120. |
Data Access Partitioning for Fine-grain Parallelism on Multicore Architectures (paper: pdf; slides: ppt) Michael Chu, Rajiv Ravindran, and Scott Mahlke. Proc. 40th Intl. Symposium on Microarchitecture (MICRO) Dec. 2007, pp. 369-378. |
Hierarchical Coarse-grained Stream Compilation for Software Defined Radio (paper: pdf; slides: ppt) Yuan Lin, Manjunath Kudlur, Scott Mahlke, and Trevor Mudge Proc. 2007 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2007, pp. 115-124. |
The Next Generation Challenge for Software Defined Radio (paper: pdf; slides: ppt) Mark Woh, Sangwon Seo, Hyunseok Lee, Yuan Lin, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztian Flautner Proc. 7th Intl. Workshop on Systems, Architectures, Modeling, and Simulation (SAMOS) Jul. 2007, pp. 343-354. |
Code and Data Partitioning for Fine-grain Parallelism (paper: pdf ; slides: ppt) Michael Chu and Scott Mahlke Proc. ACM SIGPLAN/SIGBED 2007 Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES) Jun. 2007, pp. 161-164. |
Compiler-Managed Partitioned Data Caches for Low Power (paper: pdf ; slides: ppt) Rajiv Ravindran, Michael Chu, and Scott Mahlke Proc. ACM SIGPLAN/SIGBED 2007 Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES) Jun. 2007, pp. 237-247. |
Architecting a Reliable CMP Switch Architecture (paper: pdf) Kypros Constantinides, Stephen Plaza, Jason Blome, Valeria Bertacco, Scott Mahlke, Todd Austin, Bin Zhang, and Michael Orshansky ACM Transactions on Architecture and Code Optimization Vol. 4, No. 1, Mar. 2007, pp. 1-37. |
Exploiting Narrow Accelerators with Data-Centric Subgraph Mapping (paper: pdf ; slides: ppt) Amir Hormati, Nathan Clark, and Scott Mahlke Proc. 2007 Intl. Symposium on Code Generation and Optimization (CGO) Mar. 2007, pp. 147-157. |
Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping (paper: pdf ; slides: ppt) Nathan Clark, Amir Hormati, Sami Yehia, Scott Mahlke, and Krisztian Flautner Proc. 2007 Intl. Symposium on High Performance Computer Architecture (HPCA) Feb. 2007, pp. 216-227. |
Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-thread Applications (paper: pdf ; slides: ppt) Hongtao Zhong, Steven A. Lieberman, and Scott A. Mahlke Proc. 2007 Intl. Symposium on High Performance Computer Architecture (HPCA) Feb. 2007, pp. 25-36. |
SODA: A High-Performance DSP Architecture for Software-Defined Radio (paper: pdf) Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztián Flautner IEEE Micro (Micro's Top Picks in Computer Architecture for 2006) Vol. 27, No. 1, Jan./Feb. 2007, pp. 114-123. |
2006 |
Online Timing Analysis for Wearout Detection (paper: pdf ; slides: ppt) Jason Blome, Shuguang Feng, Shantanu Gupta, Scott Mahlke. 2nd Workshop on Architectural Reliability (WAR) Dec. 2006. |
SPEX: A Programming Language for Software Defined Radio (paper: pdf ; slides: ppt) Yuan Lin, Robert Mullenix, Mark Woh, Scott Mahlke, Trevor Mudge, Alastair Reid, Krisztian Flautner. 2006 Software Defined Radio Technical Conference and Product Exposition Nov. 2006. |
Streamroller: Compiler Orchestrated Synthesis of Accelerator Pipelines (slides: ppt) Manjunath Kudlur, Kevin Fan, Ganesh Dasika, and Scott Mahlke. Workshop on Compiler Assisted SoC Assembly (CASA) Oct. 2006. |
Increasing Hardware Efficiency with Multifunction Loop Accelerators (paper: pdf ; slides: ppt) Kevin Fan, Manjunath Kudlur, Hyunchul Park, and Scott Mahlke. Proc. 2006 Intl. Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Oct. 2006, pp. 276-281. |
Streamroller: Automatic Synthesis of Prescribed Throughput Accelerator Pipelines (paper: pdf ; slides: ppt) Manjunath Kudlur, Kevin Fan, and Scott Mahlke. Proc. 2006 Intl. Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Oct. 2006, pp. 270-275. |
Modulo Graph Embedding: Mapping Applications onto Coarse-Grained Reconfigurable Architectures (paper: pdf ; slides: ppt) Hyunchul Park, Kevin Fan, Manjunath Kudlur, and Scott Mahlke. Proc. 2006 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2006, pp. 136-146. |
Scalable Subgraph Mapping for Acyclic Computation Accelerators (paper: pdf ; slides: ppt) Nathan Clark, Amir Hormati, Scott Mahlke, and Sami Yehia. Proc. 2006 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2006, pp. 147-157. |
Cost-Efficient Soft Error Protection for Embedded Microprocessors (paper: pdf; slides: ppt) Jason A. Blome, Shantanu Gupta, Shuguang Feng, Scott Mahlke, and Daryl Bradley. Proc. 2006 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2006, pp. 421-431. |
Design and Implementation of Turbo Decoders for Software Defined Radio (paper: pdf ; slides: ppt) Yuan Lin, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, Alastair Reid, and Krisztian Flautner. Proc. IEEE 2006 Workshop on Signal Processing Systems (SiPS) Oct. 2006. |
A Scalable Low-power Architecture For Software Radio (slides: ppt) Scott Mahlke 6th Intl. Forum on Application-Specific Multi-Processor SoC (MPSoC) Aug. 2006. |
SODA: A Low-power Architecture For Software Radio (paper: pdf ; slides: ppt) Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztian Flautner Proc. 33rd Intl. Symposium on Computer Architecture (ISCA) Jun. 2006, pp. 89-100. |
Compiler-directed Data Partitioning for Multicluster Processors (paper: pdf ; slides: ppt) Michael Chu and Scott Mahlke Proc. 4th Intl. Symposium on Code Generation and Optimization (CGO) Mar. 2006, pp. 208-218. |
BulletProof: A Defect-Tolerant CMP Switch Architecture (paper: pdf ; slides: ppt) Kypros Constantinides, Stephen Plaza, Jason Blome, Bin Zhang, Valeria Bertacco, Scott Mahlke, Todd Austin, and Michael Orshansky Proc. 12th Intl. Symposium on High-Performance Computer Architecture (HPCA) Feb. 2006, pp. 3-14. |
2005 |
Software Defined Radio - A High Performance Embedded Challenge (paper: pdf ; slides: ppt) Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and Krisztian Flautner Proc. 2005 Intl. Conference on High Performance Embedded Architectures and Compilers (HiPEAC) Nov. 2005, pp. 6-26. |
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System (paper: pdf ; slides: ppt) Kevin Fan, Manjunath Kudlur, Hyunchul Park, and Scott Mahlke. Proc. 38th Intl. Symposium on Microarchitecture (MICRO) Nov. 2005, pp. 219-230. |
A Microarchitectural Analysis of Soft Error Propagation in a Production-level Embedded Microprocessor (paper: pdf ; slides: ppt) Jason Blome, Scott Mahlke, Daryl Bradley, and Krisztian Flautner. 1st Workshop on Architectural Reliability (WAR) Nov. 2005. |
Assessing SEU Vulnerability via Circuit-level Timing Analysis (paper: pdf ; slides: ppt) Kypros Constantinides, Stephen Plaza, Jason Blome, Bin Zhang, Valeria Bertacco, Scott Mahlke, Todd Austin, and Michael Orshansky. 1st Workshop on Architectural Reliability (WAR) Nov. 2005. |
A System Solution for High-Performance, Low Power SDR (paper: pdf ; slides: ppt) Yuan Lin, Hyunseok Lee, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and Krisztian Flautner 2005 Software Defined Radio Technical Conference and Product Exposition Nov. 2005. |
Automated Custom Instruction Generation for Domain-Specific Processor Acceleration (paper: pdf) Nathan Clark, Hongtao Zhong, and Scott Mahlke. IEEE Transactions on Computers Vol. 54, No. 10, Oct. 2005, pp. 1258-1270. |
Exploring the Design Space of LUT-based Transparent Accelerators (paper: pdf ; slides: ppt) Sami Yehia, Nathan Clark, Scott Mahlke, and Krisztian Flautner. Proc. 2005 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Sep. 2005, pp. 11-21. |
Compiler-directed Synthesis of Multifunction Loop Accelerators (paper: pdf; slides: ppt) Kevin Fan, Manjunath Kudlur, Hyunchul Park, and Scott Mahlke. Workshop on Application Specific Processors (WASP) Sep. 2005, pp. 91-98. |
A Distributed Control Path Architecture for VLIW Processors (paper: pdf; slides: ppt) Hongtao Zhong, Kevin Fan, Scott Mahlke, and Michael Schlansker. Proc. 14th Intl. Conference on Parallel Architectures and Compilation Techniques (PACT) Sep. 2005, pp. 197-206. |
Partitioning Variables across Multiple Register Windows to Reduce Spill Code in a Low-power Processor (paper: pdf) Rajiv Ravindran, Robert Senger, Eric Marsman, Ganesh Dasika, Matthew Guthaus, Scott Mahlke, and Richard Brown. IEEE Transactions on Computers Vol. 54, No. 8, Aug. 2005, pp. 998-1012. |
Trimaran: An Infrastructure for Research in Instruction-Level Parallelism (paper: pdf) Lakshmi Chakrapani, John Gyllenhaal, Wen-mei Hwu, Scott Mahlke, Krishna Palem, and Rodric Rabbah. Lecture Notes in Computer Science Springer-Verlag, Vol. 3602, Aug. 2005, pp. 32-41. |
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors (paper: pdf ; slides: ppt) Nathan Clark, Jason Blome, Michael Chu, Scott Mahlke, Stuart Biles, and Krisztian Flautner. Proc. 32nd Intl. Symposium on Computer Architecture (ISCA) Jun. 2005, pp. 272-283. |
A 16-bit, Low-Power Microcontroller with Monolithic MEMS-LC Clocking (paper: pdf ; slides: ppt) Eric Marsman, Robert Senger, Michael McCorquodale, Matthew Guthaus, Rajiv Ravindran, Ganesh Dasika, Scott Mahlke, and Richard Brown. Proc. Intl. Symposium on Circuits and Systems (ISCAS) May 2005, pp. 624-627. |
Compiler Managed Dynamic Instruction Placement in a Low-Power Code Cache (paper: pdf ; slides: ppt) Rajiv Ravindran, Pracheeti Nagarkar, Ganesh Dasika, Eric Marsman, Robert Senger, Scott Mahlke, and Richard Brown. Proc. 3rd Intl. Symposium on Code Generation and Optimization (CGO) Mar. 2005, pp. 179-190. |
2004 |
Application Specific Processing on a General Purpose Core via Transparent Instruction Set Customization (paper: pdf ; slides: ppt) Nathan Clark, Manjunath Kudlur, Hyunchul Park, Scott Mahlke, and Krisztian Flautner. Proc. 37th Intl. Symposium on Microarchitecture (MICRO) Dec. 2004, pp. 30-40. |
Automatic Synthesis of Customized Local Memories for Multicluster Application Accelerators (paper: pdf ; slides: ppt) Manjunath Kudlur, Kevin Fan, Michael Chu, and Scott Mahlke. Proc. IEEE 15th Intl. Conference on Application-Specific Systems, Architectures and Processors (ASAP) Sep. 2004, pp. 304-314. |
Compiler-directed Synthesis of Programmable Loop Accelerators (slides: ppt) Kevin Fan, Hyunchul Park, and Scott Mahlke. 2004 Workshop on Emerging Directions in Electronic Design Automation: Accelerating Time-to-market through Compiler-driven Optimization of Embedded Platforms Sep. 2004. |
Memory System Design Space Exploration for Low-Power, Real-time Speech Recognition (paper: pdf ; slides: ppt) Rajeev Krishna, Scott Mahlke, and Todd Austin. Proc. 2004 Intl. Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) Sep. 2004, pp. 140-145. |
A Programmable Vector Coprocessor Architecture for Wireless Applications (paper: pdf ; slides: ppt) Yuan Lin, Nadav Baron, Hyunseok Lee, Scott Mahlke, and Trevor Mudge. Proc. 3rd Workshop on Application Specific Processors (WASP) Sep. 2004. |
OptimoDE: Programmable Accelerator Engines Through Retargetable Customization (slides: ppt) Nathan Clark, Hongtao Zhong, Kevin Fan, Scott Mahlke, Krisztian Flautner, and Koen Van Nieuwenhove. Proc. Hot Chips 16 Aug. 2004. |
Cost-Sensitive Partitioning in an Architecture Synthesis System for Multicluster Processors (paper: pdf) Michael L. Chu, Kevin C. Fan, Rajiv A. Ravindran, and Scott A. Mahlke. IEEE Micro Vol. 24, No. 3, May/Jun. 2004, pp. 10-20. |
Mobile Supercomputers (paper: pdf) Todd Austin, David Blaauw, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Wayne Wolf. IEEE Computer Vol. 37, No. 5, May 2004, pp. 82-84. |
FLASH: Foresighted Latency-Aware Scheduling Heuristic for Processors with Customized Datapaths (paper: pdf ; slides: ppt) Manjunath Kudlur, Kevin Fan, Michael Chu, Rajiv Ravindran, Nathan Clark, and Scott Mahlke. Proc. 2nd Intl. Symposium on Code Generation and Optimization (CGO) Mar. 2004, pp. 201-212. |
Probabilistic Predicate-Aware Modulo Scheduling (paper: pdf ; slides: ppt) Mikhail Smelyanskiy, Scott Mahlke, and Edward Davidson Proc. 2nd Intl. Symposium on Code Generation and Optimization (CGO) Mar. 2004, pp. 151-162. |
2003 |
Automatic Design of Application Specific Instruction Set Extensions Through Dataflow Graph Exploration Nathan Clark, Hongtao Zhong, Wilkin Tang and Scott Mahlke Intl. Journal of Parallel Programming Vol. 31, No. 6, Dec. 2003, pp. 429-449. |
Cost-Sensitive Operation Partitioning for Synthesizing Custom Multicluster Datapath Architectures (paper: pdf ; slides: ppt) Michael L. Chu, Kevin C. Fan, Rajiv A. Ravindran and Scott A. Mahlke Proc. 2nd Workshop on Application Specific Processors (WASP) Dec. 2003, pp. 40-47. |
Processor Acceleration Through Automated Instruction Set Customization (paper: pdf ; slides: ppt) Nathan Clark, Hongtao Zhong, and Scott Mahlke Proc. 36th Intl. Symposium on Microarchitecture (MICRO) Dec. 2003, pp. 129-140. |
Increasing the Number of Effective Registers in a Low-Power Processor Using a Windowed Register File (paper: pdf ; slides: ppt) Rajiv A. Ravindran, Robert M. Senger, Eric D. Marsman, Ganesh S. Dasika, Matthew R. Guthaus, Scott A. Mahlke, and Richard B. Brown Proc. 2003 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2003, pp. 125-136. |
Architectural Optimizations for Low-Power, Real-Time Speech Recognition (paper: pdf ; slides: ppt) Rajeev Krishna, Scott Mahlke, and Todd Austin Proc. 2003 Intl. Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) Oct. 2003, pp. 220-231. |
Systematic Register Bypass Customization for Application-Specific Processors (paper: pdf ; slides: ppt) Kevin Fan, Nathan Clark, Michael Chu, K.V. Manjunath, Rajiv Ravindran, Mikhail Smelyanskiy, and Scott Mahlke. Proc. IEEE 14th Intl. Conference on Application-Specific Systems, Architectures and Processors (ASAP) Jun. 2003, pp. 64-74. |
Region-based Hierarchical Operation Partitioning for Multicluster Processors (paper: pdf ; slides: ppt) Michael Chu, Kevin Fan, and Scott Mahlke. Proc. ACM SIGPLAN 2003 Conference on Programming Languages Design and Implementation (PLDI) Jun. 2003, pp. 300-311. |
Predicate-Aware Scheduling: A Technique for Reducing Resource Constraints (paper: pdf ; slides: ppt) Mikhail Smelyanskiy, Scott A. Mahlke, Edward S. Davidson, and Hsien-Hsin S. Lee. Proc. 1st Intl. Symposium on Code Generation and Optimization (CGO) Mar. 2003, pp. 169-178. |
2002 |
Automatically Generating Custom Instruction Set Extensions (paper: pdf ; slides: ppt) Nathan Clark, Wilkin Tang, and Scott Mahlke. Proc. 1st Workshop on Application Specific Processors (WASP) Nov. 2002, pp. 94-101. |
Insights into the Memory Demands of Speech Recognition Algorithms (paper: pdf) Rajeev Krishna, Scott Mahlke, and Todd Austin. Proc. ACM/IEEE 2nd Workshop on Memory Performance Issues (WMPI) May 2002. |
Theses |
Efficient Deep Neural Network Computation on Processors (paper: pdf) Jiecao Yu, 2019. |
Data Resource Management in Throughput Processors (paper: pdf) John Kloosterman, 2018. |
Composite Cores: Improving Energy Efficiency Through Fine-Grained Heterogeneity (paper: pdf) Andrew Lukefahr, 2016. |
Exploiting Fine-Grain Heterogeneity to Build Energy-Efficient Processors (paper: pdf) Shruti Padmanabha, 2016. |
Enabling Efficient Resource Utilization on Multitasking Throughput Processors (paper: pdf) Jason Jong Kyu Park, 2016. |
Virtualizing Data Parallel Systems for Portability, Productivity, and Performance (paper: pdf) Janghaeng Lee, 2015. |
Dependable Computing On Inexact Hardware Through Anomaly Detection (paper: pdf) Daya S Khudia, 2015. |
Dynamic Hardware Resource Management for Efficient Throughput Processing (paper: pdf) Ankit Sethia, 2015. |
Intelligent Management of Inter-Thread Synchronization Dependencies for Concurrent Programs (paper: pdf) Hyoun Kyu Cho, 2014. |
Dynamic Orchestration of Massively Data Parallel Execution (paper: pdf) Mehrzad Samadi, 2014. |
Libra: Achieving Efficient Instruction- and Data-Parallel Execution for Mobile Applications (paper: pdf) Yongjun Park, 2013. |
Overcoming Hard-Faults in High-Performance Microprocessors (paper: pdf) Amin Ansari, 2011. |
Power-Efficient Accelerators for High-Performance Applications (paper: pdf) Ganesh Dasika, 2011. |
Delivering Affordable Fault-tolerance to Commodity Computer Systems (paper: pdf) Shuguang Feng, 2011. |
Adaptive Architectures for Robust and Efficient Computing (paper: pdf) Shantanu Gupta, 2011. |
Compiling Stream Applications for Heterogeneous Architectures (paper: pdf) Amir Hormati, 2011. |
Compiler and Runtime Techniques For Automatic Parallelization of Sequential Applications (paper: pdf) Mojtaba Mehrara, 2011. |
Polymorphic Pipeline Array: A Flexible Multicore Accelerator for Mobile Multimedia Applications (paper: pdf) Hyunchul Park, 2009. |
Realizing Software Defined Radio - A Study in Designing Mobile Supercomputers (paper: pdf) Yuan Lin, 2008. |
Automatic Design of Efficient Application-centric Architectures (paper: pdf) Kevin Fan, 2008. |
Streamroller: A Unified Compilation and Synthesis Framework for Streaming Applications (paper: pdf) Manjunath Kudlur, 2008. |
Architectural and Compiler Mechanisms for Accelerating Single Thread Applications on Multicore Processors (paper: pdf) Hongtao Zhong, 2008. |
Cooperative Data and Computation Partitioning for Decentralized Architectures (paper: pdf) Michael Chu, 2007. |
Customizing the Computation Capabilities of Microprocessors (paper: pdf) Nathan Clark, 2007. |
Hardware/Software Techniques for Memory Power Optimizations in Embedded Processors (paper: pdf) Rajiv Ravindran, 2007. |
Hardware/Software Mechanisms for Increasing Resource Utilization on VLIW/EPIC Processors (paper: pdf) Mikhail Smelyanskiy, 2004. |
Disclaimer: The documents contained on this page have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder. |
Page last modified August 13, 2020.