 Todd Carl Mowry

Bibliometrics: publication history
Average citations per article: 27.79
Citation Count: 1,945
Publication count: 70
Publication years: 1991-2016
Available for download: 54
Average downloads per article: 586.61
Downloads (cumulative): 31,677
Downloads (12 Months): 2,036
Downloads (6 Weeks): 266
ACM Fellow


Result 1 – 20 of 70

1 published by ACM
RFVP: Rollback-Free Value Prediction with Safe-to-Approximate Loads
January 2016 ACM Transactions on Architecture and Code Optimization (TACO): Volume 12 Issue 4, January 2016
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 13,   Downloads (12 Months): 117,   Downloads (Overall): 143

Full text available: PDF
This article aims to tackle two fundamental memory bottlenecks: limited off-chip bandwidth (bandwidth wall) and long access latency (memory wall). To achieve this goal, our approach exploits the inherent error resilience of a wide range of applications. We introduce an approximation technique, called Rollback-Free Value Prediction (RFVP). When certain safe-to-approximate ...
Keywords: Load value approximation, memory bandwidth, value prediction, memory latency, GPUs

2 published by ACM
Gather-scatter DRAM: in-DRAM address translation to improve the spatial locality of non-unit strided accesses
Vivek Seshadri, Thomas Mullins, Amirali Boroumand, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
November 2015 MICRO-48: Proceedings of the 48th International Symposium on Microarchitecture
Publisher: ACM
Bibliometrics:
Citation Count: 7
Downloads (6 Weeks): 18,   Downloads (12 Months): 146,   Downloads (Overall): 207

Full text available: PDF
Many data structures (e.g., matrices) are typically accessed with multiple access patterns. Depending on the layout of the data structure in physical address space, some access patterns result in non-unit strides. In existing systems, which are optimized to store and access cache lines, non-unit strided accesses exhibit low spatial locality. ...
Keywords: SIMD, in-memory databases, memory bandwidth, performance, energy, strided accesses, DRAM, caches

3
Tracking and Reducing Uncertainty in Dataflow Analysis-Based Dynamic Parallel Monitoring
October 2015 PACT '15: Proceedings of the 2015 International Conference on Parallel Architecture and Compilation (PACT)
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 0

Dataflow analysis-based dynamic parallel monitoring (DADPM) is a recent approach for identifying bugs in parallel software as it executes, based on the key insight of explicitly modeling a sliding window of uncertainty across parallel threads. While this makes the approach practical and scalable, it also introduces the possibility of false positives in the analysis. In this ...

4
Fast Bulk Bitwise AND and OR in DRAM
Vivek Seshadri, Kevin Hsieh, Amirali Boroumand, Donghyuk Lee, Michael A. Kozuch, Onur Mutlu, Phillip B. Gibbons, Todd C. Mowry
June 2015 IEEE Computer Architecture Letters: Volume 14 Issue 2, July 2015
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 7

Bitwise operations are an important component of modern day programming, and are used in a variety of applications such as databases. In this work, we propose a new and simple mechanism to implement bulk bitwise AND and OR operations in DRAM, which is faster and more efficient than existing mechanisms. ...

5
Toggle-Aware Compression for GPUs
Gennady Pekhimenko, Evgeny Bolotin, Mike O'Connor, Onur Mutlu, Todd C. Mowry, Stephen W. Keckler
June 2015 IEEE Computer Architecture Letters: Volume 14 Issue 2, July 2015
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 0

Memory bandwidth compression can be an effective way to achieve higher system performance and energy efficiency in modern data-intensive applications by exploiting redundancy in data. Prior works studied various data compression techniques to improve both capacity (e.g., of caches and main memory) and bandwidth utilization (e.g., of the on-chip and ...

6 published by ACM
A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps
Nandita Vijaykumar, Gennady Pekhimenko, Adwait Jog, Abhishek Bhowmick, Rachata Ausavarungnirun, Chita Das, Mahmut Kandemir, Todd C. Mowry, Onur Mutlu
June 2015 ISCA '15: Proceedings of the 42nd Annual International Symposium on Computer Architecture
Publisher: ACM
Bibliometrics:
Citation Count: 8
Downloads (6 Weeks): 11,   Downloads (12 Months): 168,   Downloads (Overall): 511

Full text available: PDF
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks during execution and heterogeneous application requirements create imbalances in utilization of resources in the cores. For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational ...
Also published in:
January 2016  ACM SIGARCH Computer Architecture News - ISCA'15: Volume 43 Issue 3, June 2015

7 published by ACM
Page overlays: an enhanced virtual memory framework to enable fine-grained memory management
Vivek Seshadri, Gennady Pekhimenko, Olatunji Ruwase, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry, Trishul Chilimbi
June 2015 ISCA '15: Proceedings of the 42nd Annual International Symposium on Computer Architecture
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 16,   Downloads (12 Months): 162,   Downloads (Overall): 368

Full text available: PDF
Many recent works propose mechanisms demonstrating the potential advantages of managing memory at a fine (e.g., cache line) granularity---e.g., fine-grained deduplication and fine-grained memory protection. Unfortunately, existing virtual memory systems track memory at a larger granularity (e.g., 4 KB pages), inhibiting efficient implementation of such techniques. Simply reducing the page ...
Also published in:
December 2015  ACM SIGARCH Computer Architecture News - ISCA'15: Volume 43 Issue 3, June 2015

8 published by ACM
Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks
Vivek Seshadri, Samihan Yedkar, Hongyi Xin, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
December 2014 ACM Transactions on Architecture and Code Optimization (TACO): Volume 11 Issue 4, January 2015
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 11,   Downloads (12 Months): 76,   Downloads (Overall): 226

Full text available: PDF
Many modern high-performance processors prefetch blocks into the on-chip cache. Prefetched blocks can potentially pollute the cache by evicting more useful blocks. In this work, we observe that both accurate and inaccurate prefetches lead to cache pollution, and propose a comprehensive mechanism to mitigate prefetcher-caused cache pollution. First, we observe ...
Keywords: Prefetching, cache insertion/promotion policy, cache pollution, caches

9 published by ACM
Rollback-free value prediction with approximate loads
August 2014 PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation
Publisher: ACM
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 5,   Downloads (12 Months): 39,   Downloads (Overall): 147

Full text available: PDF
This paper demonstrates how to utilize the inherent error resilience of a wide range of applications to mitigate the memory wall -- the discrepancy between core and memory speed. We define a new microarchitecturally-triggered approximation technique called rollback-free value prediction. This technique predicts the value of safe-to-approximate loads when they ...
Keywords: compilers, rollback-free value prediction, general-purpose approximate computing, memory systems

10
The dirty-block index
Vivek Seshadri, Abhishek Bhowmick, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
June 2014 ISCA '14: Proceeding of the 41st annual international symposium on Computer architecture
Publisher: IEEE Press
Bibliometrics:
Citation Count: 9
Downloads (6 Weeks): 7,   Downloads (12 Months): 50,   Downloads (Overall): 255

Full text available: PDF
On-chip caches maintain multiple pieces of metadata about each cached block---e.g., dirty bit, coherence information, ECC. Traditionally, such metadata for each block is stored in the corresponding tag entry in the tag store. While this approach is simple to implement and scalable, it necessitates a full tag store lookup for ...
Also published in:
October 2014  ACM SIGARCH Computer Architecture News - ISCA '14: Volume 42 Issue 3, June 2014

11 published by ACM
Guardrail: a high fidelity approach to protecting hardware devices from buggy drivers
Olatunji Ruwase, Michael A. Kozuch, Phillip B. Gibbons, Todd C. Mowry
February 2014 ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 4,   Downloads (12 Months): 61,   Downloads (Overall): 298

Full text available: PDF
Device drivers are an Achilles' heel of modern commodity operating systems, accounting for far too many system failures. Previous work on driver reliability has focused on protecting the kernel from unsafe driver side-effects by interposing an invariant-checking layer at the driver interface, but otherwise treating the driver as a black ...
Keywords: device drivers, dynamic analysis
Also published in:
March 2014  ACM SIGPLAN Notices - ASPLOS '14: Volume 49 Issue 4, April 2014
March 2014  ACM SIGARCH Computer Architecture News - ASPLOS '14: Volume 42 Issue 1, March 2014

12 published by ACM
RowClone: fast and energy-efficient in-DRAM bulk data copy and initialization
Vivek Seshadri, Yoongu Kim, Chris Fallin, Donghyuk Lee, Rachata Ausavarungnirun, Gennady Pekhimenko, Yixin Luo, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
November 2013 MICRO-46: Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: ACM
Bibliometrics:
Citation Count: 24
Downloads (6 Weeks): 14,   Downloads (12 Months): 109,   Downloads (Overall): 436

Full text available: PDF
Several system-level operations trigger bulk data copy or initialization. Even though these bulk data operations do not require any computation, current systems transfer a large quantity of data back and forth on the memory channel to perform such operations. As a result, bulk data operations consume high latency, bandwidth, and ...
Keywords: memory bandwidth, bulk operations, in-memory processing, performance, energy, page copy, DRAM, page initialization

13 published by ACM
Linearly compressed pages: a low-complexity, low-latency main memory compression framework
Gennady Pekhimenko, Vivek Seshadri, Yoongu Kim, Hongyi Xin, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
November 2013 MICRO-46: Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Publisher: ACM
Bibliometrics:
Citation Count: 14
Downloads (6 Weeks): 17,   Downloads (12 Months): 114,   Downloads (Overall): 479

Full text available: PDF
Data compression is a promising approach for meeting the increasing memory capacity demands expected in future systems. Unfortunately, existing compression algorithms do not translate well when directly applied to main memory because they require the memory controller to perform non-trivial computation to locate a cache line within a compressed memory ...
Keywords: DRAM, data compression, memory, memory bandwidth, memory controller, memory capacity

14 published by ACM
Base-delta-immediate compression: practical data compression for on-chip caches
Gennady Pekhimenko, Vivek Seshadri, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
September 2012 PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 32
Downloads (6 Weeks): 16,   Downloads (12 Months): 108,   Downloads (Overall): 445

Full text available: PDF
Cache compression is a promising technique to increase on-chip cache capacity and to decrease on-chip and off-chip bandwidth usage. Unfortunately, directly applying well-known compression algorithms (usually implemented in software) leads to high hardware complexity and unacceptable decompression/compression latencies, which in turn can negatively affect performance. Hence, there is a need ...
Keywords: caching, cache compression, memory

15 published by ACM
The evicted-address filter: a unified mechanism to address both cache pollution and thrashing
Vivek Seshadri, Onur Mutlu, Michael A. Kozuch, Todd C. Mowry
September 2012 PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 14
Downloads (6 Weeks): 7,   Downloads (12 Months): 40,   Downloads (Overall): 334

Full text available: PDF
Off-chip main memory has long been a bottleneck for system performance. With increasing memory pressure due to multiple on-chip cores, effective cache utilization is important. In a system with limited cache space, we would ideally like to prevent 1) cache pollution, i.e., blocks with low reuse evicting blocks with high ...
Keywords: insertion policy, memory, caching, pollution, thrashing

16 published by ACM
Linearly compressed pages: a main memory compression framework with low complexity and low latency
September 2012 PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 5,   Downloads (12 Months): 32,   Downloads (Overall): 196

Full text available: PDF
Keywords: cache compression, main memory compression

17 published by ACM
Chrysalis analysis: incorporating synchronization arcs in dataflow-analysis-based parallel monitoring
Michelle L. Goodstein, Shimin Chen, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry
September 2012 PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 4,   Downloads (12 Months): 16,   Downloads (Overall): 118

Full text available: PDF
Software lifeguards, or tools that monitor applications at runtime, are an effective way of identifying program errors and security exploits. Parallel programs are susceptible to a wider range of possible errors than sequential programs, making them even more in need of online monitoring. Unfortunately, monitoring parallel applications is difficult ...
Keywords: high-level synchronization, data flow analysis, vector clocks, dynamic program monitoring, parallel programming

18 published by ACM
Log-based architectures: using multicore to help software behave correctly
February 2011 ACM SIGOPS Operating Systems Review: Volume 45 Issue 1, January 2011
Publisher: ACM
Bibliometrics:
Citation Count: 4
Downloads (6 Weeks): 1,   Downloads (12 Months): 23,   Downloads (Overall): 242

Full text available: PDF
While application performance and power-efficiency are both important, application correctness is even more important. In other words, if the application is misbehaving, it is little consolation that it is doing so quickly or power-efficiently. In the Log-Based Architectures (LBA) project, we are focusing on a challenging source of application misbehavior: ...
Keywords: log-based architectures, parallel monitoring, program monitoring, lifeguards, software bugs

19 published by ACM
Decoupled lifeguards: enabling path optimizations for dynamic correctness checking tools
Olatunji Ruwase, Shimin Chen, Phillip B. Gibbons, Todd C. Mowry
May 2010 PLDI '10: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation
Publisher: ACM
Bibliometrics:
Citation Count: 9
Downloads (6 Weeks): 1,   Downloads (12 Months): 16,   Downloads (Overall): 379

Full text available: PDF
Dynamic correctness checking tools (a.k.a. lifeguards) can detect a wide array of correctness issues, such as memory, security, and concurrency misbehavior, in unmodified executables at run time. However, lifeguards that are implemented using dynamic binary instrumentation (DBI) often slow down the monitored application by 10-50X, while proposals that replace DBI ...
Keywords: dynamic code optimization, dynamic program analysis, dynamic correctness checking
Also published in:
May 2010  ACM SIGPLAN Notices - PLDI '10: Volume 45 Issue 6, June 2010

20 published by ACM
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
Evangelos Vlachos, Michelle L. Goodstein, Michael A. Kozuch, Shimin Chen, Babak Falsafi, Phillip B. Gibbons, Todd C. Mowry
March 2010 ACM SIGARCH Computer Architecture News - ASPLOS '10: Volume 38 Issue 1, March 2010
Publisher: ACM
Bibliometrics:
Citation Count: 19
Downloads (6 Weeks): 1,   Downloads (12 Months): 14,   Downloads (Overall): 438

Full text available: PDF
Instruction-grain lifeguards monitor the events of a running application at the level of individual instructions in order to identify and help mitigate application bugs and security exploits. Because such lifeguards impose a 10-100X slowdown on existing platforms, previous studies have proposed hardware designs to accelerate lifeguard processing. However, these accelerators ...
Keywords: hardware support for debugging, instruction-grain lifeguards, online parallel monitoring
Also published in:
March 2010  ACM SIGPLAN Notices - ASPLOS '10: Volume 45 Issue 3, March 2010
March 2010  ASPLOS XV: Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems


