ACM Transactions on Design Automation of Electronic Systems (TODAES)

Latest Articles

An Approximation Algorithm for Threshold Voltage Optimization

We present a primal-dual approximation algorithm for minimizing the leakage power of an integrated circuit by assigning gate threshold voltages. While... (more)

CASCA: A Design Automation Approach for Designing Hardware Countermeasures Against Side-Channel Attacks

Implementing a cryptographic circuit poses challenges not always acknowledged in the backing mathematical theory. One of them is its vulnerability to side-channel attacks. A side-channel attack is a procedure that uses information leaked by the circuit through, for example, its own power consumption or electromagnetic emissions, to derive... (more)

Detection Mechanisms for Unauthorized Wireless Transmissions

With the increasing diversity of supply chains from design to delivery, there is a growing risk that unauthorized changes can be made within an IC.... (more)

PV-Aware Analog Sizing for Robust Analog Layout Retargeting with Optical Proximity Correction

For analog integrated circuits (ICs) in nanometer technology nodes, process variation (PV) induced by lithography may not only cause serious wafer... (more)

Rapid Triggering Capability Using an Adaptive Overlay during FPGA Debug

Field Programmable Gate Array (FPGA) technology is rapidly gaining traction in a wide range of applications. Nonetheless, FPGAs still require long... (more)

Fault-Tolerant Unicast-Based Multicast for Reliable Network-on-Chip Testing

We present a unified test technique that targets faults in links, routers, and cores of a network-on-chip design based on test sessions. We call an... (more)

UCR: An Unclonable Environmentally Sensitive Chipless RFID Tag For Protecting Supply Chain

Chipless Radio Frequency Identification (RFID) tags that do not include an integrated circuit (IC) in the transponder are more appropriate for supply-chain management of low-cost commodities and have been gaining extensive attention due to their relatively lower price. However, existing chipless RFID tags consume considerable tag area and... (more)

SHAIP: Secure Hamming Distance for Authentication of Intrinsic PUFs

In this article, we present SHAIP, a secure Hamming distance–based mutual authentication protocol. It allows an unlimited number of authentications by employing an intrinsic Physical Unclonable Function (PUF). PUFs are being increasingly employed for remote authentication of devices. Most of these devices have limited resources. Therefore,... (more)

Programmable Gates Using Hybrid CMOS-STT Design to Prevent IC Reverse Engineering

This article presents a rigorous step towards design-for-assurance by introducing a new class of logically reconfigurable design resilient to design... (more)

Learning From Sleeping Experts: Rewarding Informative, Available, and Accurate Experts

We consider a generalized model of learning from expert advice in which experts could abstain from participating at some rounds. Our proposed online algorithm falls into the class of weighted average predictors and uses a time-varying multiplicative weight update rule. This update rule changes the weight of an expert based on his or her relative... (more)
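The weighted-average predictor with a multiplicative weight update described in this abstract can be illustrated with a minimal sketch. The function name, loss function, and learning rate below are illustrative assumptions, not the authors' implementation; the point is that only experts who participate (are "awake") in a round are aggregated and updated:

```python
import math

def sleeping_experts_round(weights, predictions, outcome, eta=0.5):
    """One round of a multiplicative-weights predictor with abstentions.

    weights:     dict expert -> current weight
    predictions: dict expert -> prediction in [0, 1]; abstaining experts
                 simply do not appear in this dict
    outcome:     realized value in [0, 1]
    eta:         learning rate (illustrative choice)
    """
    awake = {e: p for e, p in predictions.items() if e in weights}
    total = sum(weights[e] for e in awake)
    # Weighted-average forecast over the experts who showed up this round.
    forecast = sum(weights[e] * p for e, p in awake.items()) / total
    # Multiplicative update: only awake experts are rewarded or penalized,
    # based on their loss against the realized outcome.
    for e, p in awake.items():
        loss = abs(p - outcome)
        weights[e] *= math.exp(-eta * loss)
    return forecast, weights
```

In this toy version an abstaining expert's weight is untouched, which captures the "sleeping" aspect; the paper's rule additionally makes the update depend on an expert's loss relative to the other participants.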


ACM TODAES new page limit policy: Manuscripts must be formatted in the ACM Transactions format; a 35-page limit applies to the final paper. Rare exceptions are possible if recommended by the reviewers and approved by the Editorial Board.

ORCID is a community-based effort to create a global registry of unique researcher identifiers for the purpose of ensuring proper attribution of works to their creators. When you submit a manuscript for review, you will be presented with the opportunity to register for your ORCID.

Welcome ACM Associate Editors

Forthcoming Articles
Reconfigurable Battery Systems: A Survey on Hardware Architecture and Research Challenges

In a reconfigurable battery pack, the connections among cells can be changed during operation to form different configurations. This can turn a battery, a passive two-terminal device, into a smart battery that reconfigures itself on demand to enhance operational performance. Several hardware architectures with different levels of complexity have been proposed. Some researchers have used existing hardware and demonstrated improved performance on the basis of novel optimization and scheduling algorithms. The potential of software techniques to benefit energy-storage systems is exciting, and the time for such methods is right, as the need for high-performance, long-lasting batteries is on the rise. This novel field requires new understanding, principles, and evaluation metrics for proposed schemes. In this paper, we systematically discuss and critically review the state of the art. This is the first effort to compare the existing hardware topologies in terms of flexibility and functionality. We provide a comprehensive review that encompasses all existing research works, starting from the details of the individual battery, including modeling and properties, as well as fixed-topology traditional battery packs. To stimulate further research in this area, we highlight key challenges and open problems in this domain.

Data-driven Anomaly Detection with Timing Features for Embedded Systems

Malware is a serious threat to network-connected embedded systems, as evidenced by the continued and rapid growth of such devices, commonly referred to as the Internet of Things. Their ubiquitous use in critical applications requires robust protection to ensure user safety and privacy. That protection must be applied to all system aspects, extending beyond protecting the network and external interfaces. Anomaly detection is one of the last lines of defense against malware, and data-driven approaches that require the least domain knowledge are popular. However, embedded systems, particularly edge devices, face several challenges in applying data-driven anomaly detection, including the unpredictability of malware, limited tolerance to long data collection windows, and limited computing/energy resources. In this paper, we utilize subcomponent timing information of software execution, including intrinsic software execution time, instruction cache misses, and data cache misses, as features to detect anomalies based on ranges, multidimensional Euclidean distance, and classification at runtime. Detection methods based on lumped timing ranges are also evaluated and compared. We design several hardware detectors implementing these data-driven detection methods, which non-intrusively measure the lumped/subcomponent timing of all system/function calls of the embedded application. We evaluate the area, power, and detection latency of the presented detector designs. Experimental results demonstrate that the subcomponent timing model provides sufficient features to achieve high detection accuracy with low false-positive rates using a one-class support vector machine, even against sophisticated mimicry malware.
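The Euclidean-distance detection idea mentioned in this abstract can be sketched loosely: learn a profile of (execution time, instruction-cache misses, data-cache misses) vectors from benign runs, then flag calls that fall too far from it. The feature values, threshold policy, and function names below are invented for the example and are not the paper's detectors, which are implemented in hardware:

```python
import math

def train_profile(benign_samples):
    """Learn a per-feature mean and a distance threshold from benign
    (exec_time, icache_misses, dcache_misses) timing vectors."""
    n = len(benign_samples)
    dims = len(benign_samples[0])
    mean = [sum(s[d] for s in benign_samples) / n for d in range(dims)]
    # Threshold = max Euclidean distance of any benign sample to the mean,
    # plus a small margin (illustrative policy, not from the paper).
    dists = [math.dist(s, mean) for s in benign_samples]
    return mean, max(dists) * 1.1

def is_anomalous(sample, profile):
    """Flag a timing vector whose distance to the benign mean exceeds
    the learned threshold."""
    mean, threshold = profile
    return math.dist(sample, mean) > threshold
```

The range-based detectors in the paper would instead keep a per-feature min/max interval, and the one-class SVM replaces this fixed threshold with a learned boundary.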

A Cross-Level Verification Methodology for Digital IPs Augmented with Embedded Timing Monitors

Incomplete Tests for Undetectable Faults to Improve Test Set Quality

The presence of undetectable faults in a set of target faults implies that tests, which may be important for detecting defects, are missing from the test set. This paper suggests an approach for addressing missing tests that fits with the rationale for computing an n-detection test set. The paper defines the concept of an incomplete test that is relevant when a target fault is undetectable. An incomplete test activates the fault, but fails to detect it because of one or more assignments that are missing from the test. The procedure described in this paper improves the quality of a test set by attempting to ensure that every undetectable fault has n incomplete tests with the smallest possible numbers of missing assignments, for a constant n>=1. The incomplete tests are expected to contribute to the detection of detectable defects around the site of the undetectable fault. The computation of missing assignments for a test is performed in linear time by avoiding fault simulation, and considering all the undetectable faults simultaneously. Experimental results demonstrate the extent to which a given test set can be improved without increasing the number of tests.

Enhancing Speculative Execution with Selective Approximate Computing

Speculative execution is an optimization technique in modern processors by which predicted instructions are executed in advance with the objective of overlapping the latencies of slow operations. Branch prediction and load value speculation are examples of speculative execution used in modern pipelined processors to avoid execution stalls. However, speculative execution incurs a performance penalty in the form of an execution roll-back when there is a misprediction. In this work, we propose to aid speculative execution with approximate computing by relaxing the execution roll-back penalty associated with a misprediction. We propose a sensitivity analysis method for data and branches in a program that identifies the load/store and branch instructions which can be executed without any roll-back in the pipeline while still asserting a user-specified quality of service of the application with probabilistic reliability. Our analysis is based on statistical methods, particularly hypothesis testing and Bayesian analysis. We perform an architectural simulation of our proposed approximate execution and report the benefits in terms of CPU cycles and energy utilization on the AxBench, Accept, and Parsec 3.0 benchmarks.

A Novel Resistive Memory based Process-In-Memory Architecture for Efficient Logic and Add Operations

The coming era of big data has revived the Processing-In-Memory (PIM) architecture to relieve the memory wall problem that hampers modern computing systems. However, most existing PIM designs merely put computing units closer to memory rather than fully integrating the two, due to their incompatibility in CMOS manufacturing. Fortunately, the emerging Resistive RAM (ReRAM) offers new hope for this dilemma owing to its inherent memory and computing capability in the same device. In this paper, we propose a ReRAM memory structure with efficient PIM capability for both logic and add operations. It first leverages non-linearity to suppress sneak current and thus sustains high memory density. Using a differential bit cell, it also enables efficient processing of arbitrary logic functions using the same memory cells with non-destructive operations. Then, a novel PIM adder is proposed, which customizes a sneak current path as the carry chain for fast carry propagation and improves adder performance significantly. In our experiments, the proposed PIM demonstrates higher efficiency in both computing area and performance for logic and addition, which greatly increases the applicability of ReRAM PIM for future computable architectures.

Integrated Approach of Airgap Insertion for Circuit Timing Optimization

Airgap technology enables air to be introduced into the inter-metal dielectric (IMD). An airgap between wires reduces their coupling capacitance due to the reduced permittivity; this can be utilized to decrease circuit delay. We propose an integrated approach to airgap insertion with the goal of circuit timing optimization. It consists of three sub-problems. We first select the layers that employ airgaps, called airgap layers, so as to maximize the total negative slack (TNS) improvement; this yields a TNS improvement of 7% to 15% compared to a naive choice of airgap layers. Second, we reassign the layers of wires such that more wires on critical paths can be placed in airgap layers. This is formulated as integer linear programming (ILP), and a more practical heuristic algorithm is also proposed. It provides an additional 17% TNS improvement. Finally, we perform airgap insertion through an ILP formulation in which a number of design rules are modeled as linear constraints. To reduce the heavy runtime of the ILP, a layout partitioning technique is also applied. This produces a feasible airgap mask in a manageable time, where the amount of inserted airgap is close to the optimal solution.

Design Automation for Dilution of a Fluid using Programmable Microfluidic Device based Biochips

Microfluidic lab-on-a-chip devices have emerged as a new technology for implementing biochemical protocols on small portable devices targeting low-cost medical diagnostics. Among various efforts to fabricate such chips, a relatively new technology is the programmable microfluidic device (PMD) for implementing flow-based lab-on-a-chips. A PMD chip is well suited to automation due to its symmetric nature. In order to implement a bioprotocol on such a reconfigurable device, it is crucial to automate sample preparation on-chip as well. In this paper, we propose a dilution algorithm (DPMD) and its architectural mapping scheme (GAMA) for addressing the fluidic cells of such a device to perform on-chip dilution of a reagent fluid. We use an optimization function that first minimizes the number of mixing steps and then reduces waste generation and further reagent requirements. Simulation results show that the proposed DPMD scheme is comparable to existing state-of-the-art dilution algorithms. The proposed design automation using the architectural mapping scheme reduces the required chip area and hence minimizes valve switching, which in turn increases the life span of the PMD chip.

Integrated Latch Placement and Cloning for Timing Optimization

An algorithm for integrated timing-driven latch placement and cloning is presented. Given a circuit placement, the proposed algorithm relocates some latches while circuit timing is improved. Some latches are replicated to further improve the timing; the number of replicated latches, along with their locations, is automatically determined. After latch cloning, each of the replicated latches is set to drive a subset of the fanouts that were driven by the original single latch. The proposed algorithm is then extended such that relocation and cloning are applied to some latches together with their neighboring logic gates. Experimental results demonstrate that the worst negative slack and the total negative slack are improved by 24% and 59%, respectively, on average across the test circuits. The negative impacts on circuit area and power consumption are both marginal: 0.7% and 1.9%, respectively.

Adaptive Test for RF/Analog Circuit Using Higher Order Correlations Among Measurements

As process variations increase and devices become more diverse in their behavior, using the same test list for all devices is increasingly inefficient. Methodologies that adapt the test sequence to the lot, wafer, or even the device's own behavior help contain the test cost while maintaining test quality. In adaptive test selection approaches, the initial test list, a set of tests applied to all devices to learn information, plays a crucial role in the quality outcome. Most adaptive test approaches select this initial list based on the fail probability of each test individually. Such a selection does not take into account the correlations that exist among various measurements and can lead to the selection of correlated tests. In this work, we propose a new adaptive test algorithm that includes a mathematical model for initial test ordering that takes correlations among measurements into account. The proposed method can be integrated within an existing test flow, running in the background, to improve not only the test quality but also the test time. Experimental results using four distinct industry circuits and large amounts of measurement data show that the proposed technique outperforms prior approaches considerably.
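The intuition behind correlation-aware initial test ordering can be illustrated with a deliberately simplified greedy sketch. This is not the paper's mathematical model (which also accounts for fail probabilities); it only shows why considering pairwise measurement correlations defers redundant tests:

```python
def order_tests(corr, k):
    """Greedy correlation-aware ordering over a symmetric correlation
    matrix corr (list of lists, corr[i][j] in [-1, 1]).

    Starting from test 0, repeatedly add the test whose maximum absolute
    correlation with the already-selected tests is smallest, so highly
    correlated (redundant) tests are pushed toward the end of the list.
    """
    n = len(corr)
    selected = [0]
    while len(selected) < k:
        best, best_score = None, float("inf")
        for t in range(n):
            if t in selected:
                continue
            # How redundant is test t with everything chosen so far?
            score = max(abs(corr[t][s]) for s in selected)
            if score < best_score:
                best, best_score = t, score
        selected.append(best)
    return selected
```

With three tests where tests 0 and 1 are strongly correlated, the sketch picks the weakly correlated test 2 second, deferring test 1; a fail-probability-only ranking could instead pick 0 and 1 back to back and learn little new information.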

Formal Modeling and Verification of a Victim DRAM Cache

The emerging die-stacking technology enables DRAM to be used as a cache to break the "Memory Wall" problem. Recent studies have proposed using DRAM as a victim cache in both CPU and GPU memory hierarchies to improve performance. DRAM caches are large in size; hence, when realized as a victim cache, a non-inclusive design is preferred. This non-inclusive design differs significantly from the conventional DRAM cache design in terms of its probe, fill, and writeback policies. The design and verification of a victim DRAM cache can be much more complex than that of a conventional DRAM cache. Hence, without rigorous modeling and formal verification, ensuring the correctness of such a system can be difficult. The major focus of this work is to show how formal modeling is applied to design and verify a victim DRAM cache. In this approach, we identify the agents in the victim DRAM cache design and model them as interacting state machines. We derive a set of properties from the specifications of a victim cache and encode them in Linear Temporal Logic (LTL). The properties are then proven using symbolic and bounded model checking. Finally, we discuss how these properties relate to the data flow paths in a victim DRAM cache.

