Latest Articles

## A Novel Rule Mapping on TCAM for Power Efficient Packet Classification

Packet Classification is the enabling function performed in commodity switches for providing various services such as access control, intrusion...

## Improving Test and Diagnosis Efficiency through Ensemble Reduction and Learning

Machine learning is a powerful lever for developing, improving, and optimizing test methodologies to cope with the demand from the advanced nodes....

## Revealing Cluster Hierarchy in Gate-level ICs Using Block Diagrams and Cluster Estimates of Circuit Embeddings

Contemporary integrated circuits (ICs) are increasingly being constructed using intellectual...

## Stress-Induced Performance Shifts in 3D DRAMs

3D-stacked DRAMs can significantly increase cell density and bandwidth while also lowering power consumption. However, 3D structures experience significant thermomechanical stress due to the differential rate of contraction of the constituent materials, which have different coefficients of thermal expansion. This impacts circuit performance. This...

## Exploring the Role of Large Centralised Caches in Thermal Efficient Chip Design

In the era of short channel length, Dynamic Thermal Management (DTM) has become a challenging task for the architects and designers engineering modern...

## Reducing DRAM Refresh Rate Using Retention Time Aware Universal Hashing Redundancy Repair

As the device capacity of Dynamic Random Access Memory (DRAM) increases, refresh operation becomes a significant contributory factor toward total...

## Time-Multiplexed FPGA Overlay Architectures: A Survey

This article presents a comprehensive survey of time-multiplexed (TM) FPGA overlays from the research literature. These overlays are categorized based on their implementation into two groups: processor-based overlays, whose implementation follows that of conventional silicon-based microprocessors; and CGRA-like overlays, with either an array of...

## Energy Efficient Chip-to-Chip Wireless Interconnection for Heterogeneous Architectures

Heterogeneous multichip architectures have gained significant interest in high-performance computing clusters to cater to a wide range of...

## Approximate Data Reuse-based Accelerator Design for Embedded Processor

Due to increasing diversity and complexity of applications in embedded systems, accelerator designs trading-off area/energy-efficiency and...

## Modeling and Simulation of Dynamic Applications Using Scenario-Aware Dataflow

The tradeoff between analyzability and expressiveness is a key factor when choosing a suitable dataflow model of computation (MoC) for designing,...

##### NEWS

Call for Special Issue on Machine Learning for CAD: This Special Issue focuses on machine learning methods for all aspects of CAD for VLSI and electronic system design. Deadline for submission is June 15th 2019.

ACM TODAES new page limit policy: Manuscripts must be formatted in the ACM Transactions format; a 35-page limit applies to the final paper. Rare exceptions are possible if recommended by the reviewers and approved by the Editorial Board.

ORCID is a community-based effort to create a global registry of unique researcher identifiers for the purpose of ensuring proper attribution of works to their creators. When you submit a manuscript for review, you will be presented with the opportunity to register for your ORCID.

Welcome ACM Associate Editors

##### Forthcoming Articles
Optimization of Threshold Logic Networks with Node Merging and Wire Replacement

In this paper, we present an optimization method for threshold logic networks (TLNs) based on observability don't-care-based node merging. To reduce gate count in a TLN, it iteratively merges two gates that are functionally equivalent or whose differences are never observed at the primary outputs. Furthermore, it is able to identify redundant wires and replace wires to remove more gates. The proposed method is primarily adapted from an ATPG-based node-merging approach that works for conventional Boolean logic networks. To extend the approach to TLNs, we develop a method for computing the mandatory assignments of a stuck-at fault test on a threshold gate and a method for conducting logic implication in a TLN. Additionally, to achieve better optimization quality, we integrate the proposed method with other optimization methods. The experimental results show that the overall optimization method can save an average of approximately 4.7% of threshold gates for a set of TLNs generated using the latest TLN synthesis method. The experimental results also demonstrate the efficiency of the optimization method.
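For readers unfamiliar with threshold logic, the sketch below shows what a threshold gate computes and what "functionally equivalent" means for two gates. It is an illustration only: the function names are hypothetical, and the exhaustive comparison stands in for the ATPG-based, don't-care-aware analysis described in the abstract, which it does not reproduce.

```python
from itertools import product


def threshold_gate(weights, threshold, inputs):
    """Return 1 if the weighted sum of the binary inputs reaches the threshold, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def functionally_equivalent(gate_a, gate_b, n_inputs):
    """Compare two gates, each given as (weights, threshold), over every input vector.

    Exhaustive enumeration is an assumption made for clarity; it is only
    practical for gates with a small number of inputs.
    """
    return all(
        threshold_gate(*gate_a, bits) == threshold_gate(*gate_b, bits)
        for bits in product((0, 1), repeat=n_inputs)
    )


# Two different weight/threshold encodings of a 2-input AND, plus an OR gate.
and_gate = ([1, 1], 2)
scaled_and = ([2, 2], 3)
or_gate = ([1, 1], 1)

print(functionally_equivalent(and_gate, scaled_and, 2))
print(functionally_equivalent(and_gate, or_gate, 2))
```

Two gates that pass this check compute the same function, which is the simplest case in which one of them could be merged into the other.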

IP Protection and Supply Chain Security through Logic Obfuscation: A Systematic Overview

The globalization of the semiconductor supply chain introduces ever-increasing security and privacy risks. Two major concerns are IP theft through reverse engineering and malicious modification of the design. The latter concern in part relies on successful reverse engineering of the design as well. IC camouflaging and logic locking are two research techniques that can thwart reverse engineering by end-users or foundries. However, developing low-overhead locking/camouflaging schemes that can resist the ever-evolving state-of-the-art attacks has been a research challenge for several years. This article provides a comprehensive review of the state of the art in locking/camouflaging techniques. We start by defining a systematic threat model for these techniques and discuss how various real-world scenarios relate to each threat model. We then discuss the evolution of generic algorithmic attacks under each threat model, leading to the strongest existing attacks. The paper then systematizes defences, discussing attacks that are more specific to certain kinds of locking/camouflaging. In conclusion, the paper discusses open problems and future directions.

Impact of Electrostatic Coupling on Monolithic 3D-enabled Network on Chip

Monolithic-3D-integration (M3D) improves the performance and energy efficiency of 3D ICs over conventional TSV-based counterparts. The smaller dimensions of monolithic inter-tier vias (MIVs) offer high-density integration, the flexibility to partition logic blocks across multiple tiers, and significantly reduced total wire length, which together enable high performance and energy efficiency. However, the performance of M3D ICs degrades due to the presence of electrostatic coupling when the inter-layer-dielectric (ILD) thickness between two adjacent tiers is less than 50 nm. In this work, we evaluate the performance of an M3D-enabled Network-on-chip (NoC) architecture in the presence of electrostatic coupling. Electrostatic coupling induces significant delay and energy overheads for the multi-tier NoC routers. This in turn results in considerable performance degradation if the NoC design methodology does not incorporate the effects of electrostatic coupling. We demonstrate that electrostatic coupling degrades the energy-delay-product (EDP) of an M3D NoC by 18.1% averaged over eight different applications from the SPLASH-2 and PARSEC benchmark suites. As a countermeasure, we advocate the adoption of an electrostatic coupling-aware M3D NoC design methodology. Experimental results show that the coupling-aware M3D NoC reduces the performance penalty by significantly lowering the number of multi-tier routers.

Smart-Hop Arbitration Request Propagation: Avoiding Quadratic Arbitration Complexity and False Negatives in SMART NoCs

SMART-based NoC designs achieve ultra-low latencies by enabling flits to traverse multiple hops within a single clock cycle. Notwithstanding the clear performance benefits, SMART-based NoCs suffer from several shortcomings: each router must arbitrate among a quadratic number of requests, which leads to high costs; each router independently makes its own arbitration decisions, which leads to a problem called false negatives that causes throughput loss. In this paper, we propose a new SMART-based NoC design called SHARP that overcomes these shortcomings. Our evaluation demonstrates that SHARP increases throughput by up to 19% and average link utilization by up to 24% by avoiding false negatives. By avoiding quadratic arbitration, our evaluation further demonstrates that SHARP reduces the wiring and area overhead significantly.

JAMS-SG: A Framework for Jitter-Aware Message Scheduling for Time-Triggered Automotive Networks

Time-triggered automotive networks use time-triggered protocols (FlexRay, TT Ethernet, etc.) for periodic message transmissions that often originate from safety and time-critical applications. One of the major challenges with time-triggered transmissions is jitter, which is the unpredictable delay-induced deviation from the actual periodicity of a message. Failure to account for jitter can be catastrophic in time-sensitive systems, such as automotive platforms. In this article, we propose a novel scheduling framework (JAMS-SG) that satisfies timing constraints during message delivery for both jitter-affected time-triggered messages and high-priority event-triggered messages in automotive networks. At design time, JAMS-SG performs jitter-aware frame packing (packing of multiple signals from Electronic Control Units (ECUs) into messages) and schedule synthesis with a hybrid heuristic. At runtime, a Multi-Level Feedback Queue (MLFQ) handles jitter-affected time-triggered messages and high-priority event-triggered messages, which are scheduled using a runtime scheduler. Our simulation results, based on messages and network traffic data from a real vehicle, indicate that JAMS-SG is highly scalable and outperforms the best-known prior work in the area in the presence of jitter.
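The definition of jitter given in the abstract, a deviation from a message's nominal periodicity, can be made concrete with a small sketch. The function name, the microsecond units, and the sample timestamps are all assumptions chosen for illustration; this is not part of JAMS-SG.

```python
def jitter_per_message(arrival_times_us, period_us, first_expected_us=0):
    """Return each arrival's deviation from its nominal periodic slot.

    The k-th message is expected at first_expected_us + k * period_us;
    a positive value means the message arrived late, a negative one early.
    """
    return [
        arrival - (first_expected_us + k * period_us)
        for k, arrival in enumerate(arrival_times_us)
    ]


# Hypothetical arrivals for a message with a nominal 5000 us period.
arrivals = [0, 5100, 9950, 15000]
deviations = jitter_per_message(arrivals, period_us=5000)
worst_case = max(abs(d) for d in deviations)
print(deviations, worst_case)
```

A jitter-aware scheduler needs a bound such as `worst_case` to decide whether a late message can still meet its deadline.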

Hidden in Plaintext: An Obfuscation-based Countermeasure against FPGA Bitstream Tampering Attacks

Field Programmable Gate Arrays (FPGAs) have become an attractive choice for diverse applications due to their reconfigurability and unique security features. However, designs mapped to FPGAs are prone to malicious modifications or tampering of critical functions. Besides, targeted modifications have demonstrably compromised FPGA implementations of various cryptographic primitives. Existing security measures based on encryption and authentication can be bypassed using their side-channel vulnerabilities to execute bitstream tampering attacks. Furthermore, numerous resource-constrained applications are now equipped with low-end FPGAs which may not support power-hungry cryptographic solutions. In this paper, we propose a novel obfuscation-based approach to achieve strong resistance against both random and targeted pre-configuration tampering of critical functions in an FPGA design. Our solution first identifies the unique structural and functional features that separate the critical function from the rest of the design using a machine learning guided framework. The selected features are eliminated by applying appropriate obfuscation techniques, many of which take advantage of "FPGA dark silicon" (unused lookup table resources) to mask the critical functions. Furthermore, following the same obfuscation principle, a redundancy-based technique is proposed to thwart targeted, rule-based, and random tampering. We have developed a complete methodology and custom software toolflow that integrates with commercial tools. By applying the masking technique on a design containing AES, we show the effectiveness of the proposed framework in hiding the critical S-Box function. We implement the redundancy integrated solution in various cryptographic designs to analyze the overhead. In order to protect 16.2% critical component of a design, the proposed approach incurs an average area overhead of only 2.4% over similar redundancy-based approaches, while achieving strong security.

Security-Aware Routing and Scheduling for Control Applications on Ethernet TSN Networks

Today, it is common knowledge in the cyber-physical systems domain that the tight interaction between the cyber and physical elements makes it possible to substantially improve the performance of these systems in ways that would otherwise be impossible. On the downside, however, this tight interaction with cyber elements makes it easier for an adversary to compromise the safety of the system. This becomes particularly important since such systems typically comprise several critical physical components, e.g., adaptive cruise control or engine control, that allow deep intervention in the driving of a vehicle. As a result, it is important to ensure not only the reliability of such systems, e.g., in terms of schedulability and stability of control plants, but also resilience to adversarial attacks. In this article, we propose a security-aware methodology for routing and scheduling for control applications in Ethernet networks. The goal is to maximize the resilience of control applications within these networked control systems to malicious interference, while guaranteeing the stability of all control plants, despite the stringent resource constraints in such cyber-physical systems. Our experimental evaluations demonstrate that careful optimization of available resources can significantly improve the resilience of these networked control systems to attacks.

Architectural Design of Flow-based Microfluidic Biochips for Multi-Target Dilution of Biochemical Fluids

Microfluidic technologies enable replacement of time-consuming and complex steps of biochemical laboratory protocols with a tiny chip. Sample preparation (i.e., dilution or mixing of fluids) is one of the primary tasks of any bioprotocol. In real-life applications where several assays need to be executed for different diagnostic purposes, the same sample fluid is often required with different target concentration factors (CFs). Although several multi-target dilution algorithms have been developed for digital microfluidic (DMF) biochips, they are not efficient for implementation with continuous-flow based microfluidic (CMF) chips, which are preferred in laboratories. In this paper, we present a multi-target dilution algorithm (MTDA) for CMF biochips, which, to the best of our knowledge, is the first of its kind. We design a flow-based rotary mixer with a suitable number of segments depending on the target-CF profile, error tolerance, and optimization criteria. In order to schedule several intermediate fluid-mixing tasks, we develop a multi-target scheduling algorithm (MTSA) aiming to minimize the usage of storage units while producing dilutions with multiple CFs. Furthermore, we propose a storage architecture for efficiently loading (storing) intermediate fluids from (to) the storage units.
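As a rough illustration of how a segmented mixer produces a target CF, the sketch below averages the CFs of the fluids loaded into each segment. It assumes equal-volume segments and ideal mixing, and the function name is hypothetical; it does not implement the MTDA or MTSA algorithms from the abstract.

```python
from fractions import Fraction


def mix(segment_cfs):
    """Return the CF after ideally mixing equal-volume segments.

    Each entry is the concentration factor of the fluid in one segment:
    1 for raw sample, 0 for buffer, or an intermediate CF from an earlier mix.
    """
    cfs = [Fraction(c) for c in segment_cfs]
    return sum(cfs) / len(cfs)


# A 4-segment mixer: one segment of sample, three of buffer.
first = mix([1, 0, 0, 0])
# Reuse that intermediate fluid together with one segment of fresh sample.
second = mix([first, 1, 0, 0])
print(first, second)
```

Exact fractions are used so that chained mixing steps do not accumulate floating-point error when checked against a target CF.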

Investigating the Impact of Image Content On the Energy Efficiency of Hardware Accelerated Digital Spatial Filters

Battery-operated low-power portable computing devices are becoming an inseparable part of human daily life. One of the major goals is to achieve the longest battery life in such a device. Additionally, the need for performance in processing multimedia content is ever increasing. Processing image and video content consumes more power than other applications. A common approach to improving energy efficiency is to implement the computationally intensive functions as digital hardware accelerators. Spatial filtering is one of the most commonly used methods of digital image processing. As per Fourier theory, an image can be considered as a two-dimensional signal that is composed of spatially extended two-dimensional sinusoidal patterns called gratings. Spatial frequency theory states that sinusoidal gratings can be characterised by their spatial frequency, phase, amplitude and orientation. This paper presents results from our investigation into assessing the impact of these characteristics of a digital image on the energy efficiency of hardware accelerated spatial filters employed to process the same image. Two greyscale images, each of size 128×128 pixels, comprising two-dimensional sinusoidal gratings at a maximum spatial frequency of 64 cycles per image, orientated at 0 and 90 degrees respectively, were processed in a hardware-implemented Gaussian smoothing filter. The energy efficiency of the filter was compared with the baseline energy efficiency of processing a featureless plain black image. The results show that the energy efficiency of the filter drops to 12.5% when the gratings are orientated at 0 degrees and rises to 72.38% at 90 degrees.
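The test images described above can be approximated with a short sketch. The function name, the cosine phase, the 8-bit greyscale scaling, and the convention that 0 degrees means intensity varies along the horizontal axis are all assumptions; the abstract does not specify them.

```python
import math


def grating(size, cycles, orientation_deg):
    """Build a size x size greyscale sinusoidal grating as a list of pixel rows.

    `cycles` is the spatial frequency in cycles per image. The orientation
    convention (0 degrees = variation along x) is an assumption.
    """
    theta = math.radians(orientation_deg)
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            # Position along the direction in which the intensity varies.
            pos = x * math.cos(theta) + y * math.sin(theta)
            value = math.cos(2 * math.pi * cycles * pos / size)
            # Map [-1, 1] to the 8-bit range [0, 255].
            row.append(round(127.5 * (1 + value)))
        image.append(row)
    return image


g0 = grating(128, 64, 0)
g90 = grating(128, 64, 90)
print(g0[0][:4], [row[0] for row in g90[:4]])
```

At 64 cycles across 128 pixels the pattern sits at the Nyquist limit, so adjacent pixels alternate between the two extreme grey levels: along each row for one orientation and down each column for the other.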