ACM Transactions on Design Automation of Electronic Systems (TODAES)

Latest Articles

A Cross-level Verification Methodology for Digital IPs Augmented with Embedded Timing Monitors

Smart systems are characterized by the integration in a single device of multi-domain subsystems of...

Thermal-aware 3D Symmetrical Buffered Clock Tree Synthesis

The semiconductor industry has accepted three-dimensional integrated circuits (3D ICs) as a possible solution to address speed and power management...

Compilation of Dataflow Applications for Multi-Cores using Adaptive Multi-Objective Optimization

State-of-the-art system synthesis techniques employ meta-heuristic optimization techniques for Design Space Exploration (DSE) to tailor application...

Augmenting Operating Systems with OpenCL Accelerators

Heterogeneous computing leverages more than one kind of processor to boost the performance of user-space applications with the heterogeneous...

Comparing Platform-aware Control Design Flows for Composable and Predictable TDM-based Execution Platforms

We compare three platform-aware feedback control design flows that are tailored for a composable and...

Data-driven Anomaly Detection with Timing Features for Embedded Systems

Malware is a serious threat to network-connected embedded systems, as evidenced by the continued and rapid growth of such devices, commonly referred...

SSA-AC: Static Significance Analysis for Approximate Computing

Recently, the quest to reduce energy consumption in digital systems has been the subject of a number of ongoing studies. One of the most actively researched directions is approximate computing (AC). AC is a new computing paradigm in both hardware and software design that aims to achieve energy-efficient digital systems. Although a variety of AC techniques have...

An Optimized Cost Flow Algorithm to Spread Cells in Detailed Placement

Placement is an important and challenging step in VLSI physical design. The placement solution can significantly impact timing and routability. In...

Enabling IC Traceability via Blockchain Pegged to Embedded PUF

Globalization of the IC supply chain has increased the risk of counterfeit, tampered, and re-packaged chips in the market. Counterfeit electronics pose a...


ACM TODAES new page limit policy: Manuscripts must be formatted in the ACM Transactions format; a 35-page limit applies to the final paper. Rare exceptions are possible if recommended by the reviewers and approved by the Editorial Board.

ORCID is a community-based effort to create a global registry of unique researcher identifiers for the purpose of ensuring proper attribution of works to their creators. When you submit a manuscript for review, you will be presented with the opportunity to register for your ORCID.

Welcome ACM Associate Editors

Forthcoming Articles
Analysis of Dissipative Losses in Modular Reconfigurable Energy Storage Systems using SystemC TLM and SystemC-AMS

Battery storage systems are becoming more and more popular in the automotive industry as well as in stationary applications. To fulfill the requirements in terms of power and energy, the literature is increasingly discussing electrically reconfigurable interconnection topologies. However, these topologies use switching elements at the cell and module level, which exhibit an electrical resistance due to their design and hence generate undesirable dissipative losses. In this paper, we propose a new analysis and optimization framework to examine and minimize the losses in such topologies. For this purpose, we develop a SystemC model to investigate static and dynamic load scenarios, e.g., from the automotive domain. The model uses SystemC TLM for the digital subsystem, SystemC-AMS for the mixed-signal subsystem, and host-compiled simulation for the microcontroller executing the embedded software. We analyze the impact of the dissipative losses on the system efficiency, which depends on the modularization level, that is, on the number of serial and parallel switching elements. Our analysis clearly shows that in reconfigurable topologies, the modularization level has a significant influence on the losses, which in our automotive example span several orders of magnitude. An optimization shows the highest efficiency when a parallel-only modularization is targeted and the number of serial switching elements is minimized. It is also shown that the losses of the state-of-the-art topology with one battery pack protection switch are almost as high as in a smart cell approach in which each energy storage cell has its own switching element. However, the high number of switching elements in the smart cell approach reduces energy density and increases system costs, showing that this is a multi-criteria optimization problem.

Compiler-Assisted and Profiling-Based Analysis for Fast and Efficient STT-MRAM On-Chip Cache Design

Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM) is a promising candidate for large on-chip memories as a zero-leakage, high-density, and non-volatile alternative to present SRAM technology. Since memories are the dominant component of a System-on-Chip, the overall performance of the system is highly dependent on those memories. Nevertheless, the high write energy and latency of emerging STT-MRAM are the most challenging design issues in a modern computing system. By relaxing the non-volatility of these devices, it is possible to reduce the write energy and latency costs at the expense of reducing the retention time, which in turn may lead to loss of data. In this paper, we propose a hybrid STT-MRAM design for caches with different retention capabilities. Then, based on the application requirements (i.e., execution time and memory access rate), the program data layout is re-arranged at compilation time to achieve a fast and energy-efficient hybrid STT-MRAM on-chip memory design with no reliability degradation. The application requirements are defined at function granularity based on profiling and static code analysis, which estimate the required retention time and memory access rate, respectively. Experimental results show that the proposed hybrid STT-MRAM cache, combined with profiling-based and compiler-level analysis for the data re-arrangement, reduces the write energy per access by 49.7% on average. At the system level, the overall static and dynamic energy of the cache are reduced by 8.1% and 44%, respectively, and system performance is improved by up to 8.1%.

Layout Resynthesis by Applying Design-for-Manufacturability Guidelines to Avoid Low-Coverage Areas of a Cell-Based Design

Design-for-manufacturability (DFM) guidelines are recommended layout design practices intended to capture layout features that are difficult to manufacture correctly. Avoiding such features prevents the occurrence of potential systematic defects. Layout features that result in DFM guideline violations may not be avoided completely due to the design constraints of chip area, performance, and power consumption. A framework for translating DFM guideline violations into potential systematic defects, and faults, was described earlier. In a cell-based design, the translated faults may be internal or external to cells. In this article, we focus on undetectable faults that are external to cells. Using a resynthesis procedure that makes fine changes to the layout while maintaining the design constraints, we target areas of the design where large numbers of external faults related to DFM guideline violations are undetectable. By eliminating the corresponding DFM guideline violations, we ensure that the circuit does not suffer from low-coverage areas in which detectable systematic defects could escape detection and then cause the circuit to fail in the field. The layout resynthesis procedure is applied to benchmark circuits and logic blocks of the OpenSPARC T1 microprocessor. Experimental results indicate that the improvement in the coverage of potential systematic defects is significant.

MEMS-IC Robustness Optimization Considering Electrical and Mechanical Design and Process Parameters

MEMS-based sensor circuits are traditionally designed separately using CAD tools specific to each energy domain (electrical and mechanical). The paper presents a complete approach for combined MEMS-IC robustness optimization. Advanced methods for robustness analysis and optimization considering design, operating and process parameters, developed for integrated circuits, are transferred to MEMS-IC systems. Both electrical and mechanical design and process parameters are included in the optimization. The methodology is exemplified on two demonstrator examples: a MEMS microphone and a MEMS accelerometer, each with an integrated readout circuit. A successful optimization requires the simultaneous inclusion of design parameters and process tolerances from both energy domains. To save CPU time, a reduced-order, circuit-level model is used for the MEMS part and this model is created only when necessary. To integrate the generation of the simplified model into the optimization flow, a simulation-in-a-loop flow based on commercial tools for both the electrical and the mechanical domain has been implemented.

DCW: A Reactive and Predictable Programming Framework for LET-based Distributed Real-time Systems

Real-time systems continuously interact with the physical environment and often have to satisfy stringent timing constraints imposed by their interactions. Those systems involve two main properties: reactivity and predictability. Reactivity allows the system to continuously react to a non-deterministic external environment, while predictability guarantees the deterministic execution of safety-critical parts of applications. However, with the increase in software complexity, traditional approaches to developing real-time systems make temporal behaviors difficult to infer, especially when the system is required to address non-deterministic aperiodic events from the physical environment. In this paper, we propose a reactive and predictable programming framework, Distributed Clockwerk (DCW), for distributed real-time systems. DCW introduces the Servant, which is a non-preemptible execution entity, to implement periodic tasks based on the Logical Execution Time (LET) model. Furthermore, a joint scheduling policy, based on the slack stealing algorithm, is proposed to efficiently address aperiodic events without violating hard timing constraints. To further support predictable communication among distributed nodes, DCW implements the Time-Triggered Controller Area Network (TTCAN) to avoid collisions while accessing the shared communication medium. Moreover, a programming framework is implemented to provide a set of APIs for defining the timing and functional behaviors of concurrent tasks. An example is further implemented to illustrate the DCW design flow. The evaluation results demonstrate that our proposal can improve both periodic and aperiodic reactivity compared with existing work, and that the implemented DCW can also ensure system predictability while incurring extremely low overheads.
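For readers unfamiliar with the LET model mentioned in this abstract, the following is a minimal sketch of its core idea only: a task samples its input at the start of each period, and its result becomes visible to others only at the end of that period, regardless of when the computation actually finishes. It is not DCW's Servant implementation or API; the names `LetTask` and `simulate` and the tick-based timing are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LetTask:
    """Hypothetical periodic task with Logical Execution Time semantics."""
    name: str
    period: int                     # length of one LET interval, in ticks
    compute: Callable[[int], int]   # pure function from sampled input to output
    published: int = 0              # value other tasks are allowed to observe
    _pending: int = 0               # result held back until the interval ends


def simulate(task: LetTask, inputs: List[int], ticks: int) -> List[Tuple[int, int]]:
    """Return (tick, published value) pairs; inputs[t] is the input at tick t."""
    trace = []
    for t in range(ticks):
        if t % task.period == 0:
            if t > 0:
                # End of the previous interval: the held-back result is published.
                task.published = task._pending
            # Start of a new interval: sample the input and compute the result,
            # but keep it invisible until the interval ends.
            task._pending = task.compute(inputs[t])
        trace.append((t, task.published))
    return trace


if __name__ == "__main__":
    doubler = LetTask(name="doubler", period=3, compute=lambda x: 2 * x)
    print(simulate(doubler, inputs=[1, 2, 3, 4, 5, 6, 7], ticks=7))
```

Because the output changes only at period boundaries, the observable timing does not depend on how long `compute` takes, which is the source of the predictability the abstract refers to.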

On Chip Reconfigurable CMOS Analog Circuit Design and Automation Against Aging Phenomena: Sense and React

Adaptive Test for RF/Analog Circuit Using Higher Order Correlations Among Measurements

As process variations increase and devices get more diverse in their behavior, using the same test list for all devices is increasingly inefficient. Methodologies that adapt the test sequence with respect to the lot, the wafer, or even the device's own behavior help contain the test cost while maintaining test quality. In adaptive test selection approaches, the initial test list, a set of tests that are applied to all devices to learn information, plays a crucial role in the quality outcome. Most adaptive test approaches select this initial list based on the fail probability of each test individually. Such a selection approach does not take into account the correlations that exist among various measurements and can lead to the selection of correlated tests. In this work, we propose a new adaptive test algorithm that includes a mathematical model for initial test ordering that takes correlations among measurements into account. The proposed method can be integrated within an existing test flow, running in the background, to improve not only the test quality but also the test time. Experimental results using four distinct industry circuits and large amounts of measurement data show that the proposed technique outperforms prior approaches considerably.

Electronics Supply Chain Integrity Enabled by Blockchain

Electronic systems are ubiquitous today, playing an irreplaceable role in our personal lives as well as in critical infrastructures such as the power grid, satellite communication, and public transportation. In the past few decades, the security of software running on these systems has received significant attention. However, hardware has been assumed to be trustworthy and reliable "by default" without really analyzing the vulnerabilities in the electronics supply chain. With the rapid globalization of the semiconductor industry, it has become challenging to ensure the integrity and security of hardware. In this paper, we discuss the integrity concerns associated with a globalized electronics supply chain. More specifically, we divide the supply chain into six distinct entities: IP owner/foundry (OCM), distributor, assembler, integrator, end user, and electronics recycler, and analyze the vulnerabilities and threats associated with each stage. To address the concerns of supply chain integrity, we propose a blockchain-based certificate authority framework that can be used to manage critical chip information such as electronic chip identification (ECID), chip grade, transaction time, etc. The decentralized nature of the proposed framework can mitigate most threats to the electronics supply chain, such as recycling, remarking, cloning, and overproduction.
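As a rough illustration of why a hash-linked ledger makes chip records tamper-evident, here is a minimal single-party sketch. It is not the certificate authority framework proposed in the paper and has no consensus or decentralization; the record fields `ecid`, `grade`, and `timestamp` follow the chip information named in the abstract, while `owner` and all function names are assumptions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(body: Dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_record(chain: List[Dict], ecid: str, grade: str,
                  owner: str, timestamp: int) -> None:
    """Append a chip transaction that commits to the previous record's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"ecid": ecid, "grade": grade, "owner": owner,
            "timestamp": timestamp, "prev": prev}
    chain.append({**body, "hash": record_hash(body)})


def verify(chain: List[Dict]) -> bool:
    """Check every link and every stored hash; any edit to history fails."""
    prev = GENESIS
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec["prev"] != prev or rec["hash"] != record_hash(body):
            return False
        prev = rec["hash"]
    return True


if __name__ == "__main__":
    ledger: List[Dict] = []
    append_record(ledger, "ECID-0001", "commercial", "foundry", 1000)
    append_record(ledger, "ECID-0001", "commercial", "distributor", 2000)
    print(verify(ledger))            # chain is intact
    ledger[0]["grade"] = "military"  # simulate remarking an old record
    print(verify(ledger))            # tampering is detected
```

In this sketch, changing the grade of an earlier record invalidates its stored hash, so a remarked chip would no longer verify against the ledger.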

A Novel Rule Mapping on TCAM for Power Efficient Packet Classification

Packet classification is the enabling function performed in commodity switches for providing services such as access control, intrusion detection, and load balancing. Ternary Content Addressable Memories (TCAMs) are the de facto standard for performing packet classification at high speeds. However, TCAMs are expensive in terms of both cost and power consumption, forcing switch vendors to invest considerable effort in power management. Hence, power-efficient solutions for TCAM-based packet classification remain highly relevant. In this paper, we propose a novel rule placement algorithm based on the presence of unique field values within the rule databases. We evaluate the total search space that must be inspected under the traditional placement approach and under the proposed placement approach, which is based on the information content within the fields. Simulation results show an average reduction of 30.55% in the search space with the proposed placement approach, resulting in an average reduction of 18.85% in per-search energy on the TCAM. With typical TCAM clock speeds ranging between 200 and 400 MHz, this reduction in per-search energy translates to a large reduction in the total energy consumed by TCAM-based network switches. The proposed solution is plug-and-play, requiring only minimal preprocessing within the Network Processing Unit (NPU) of the switches and edge routers.
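The ternary matching that a TCAM performs can be sketched in software as follows. This shows only the generic value/mask lookup with first-match priority, not the rule placement algorithm proposed in the paper; the `Rule` layout, the 8-bit keys, and the function name are assumptions. A hardware TCAM compares all activated entries in parallel, which is why shrinking the searched region saves energy; the loop below merely emulates the priority order.

```python
from typing import List, Optional, Tuple

# Hypothetical rule layout: (value, mask, action).
# Mask bits set to 1 must match the key; mask bits set to 0 are "don't care".
Rule = Tuple[int, int, str]


def tcam_lookup(rules: List[Rule], key: int) -> Optional[str]:
    """Return the action of the highest-priority (lowest-index) matching rule."""
    for value, mask, action in rules:
        if (key & mask) == (value & mask):
            return action
    return None


if __name__ == "__main__":
    rules: List[Rule] = [
        (0b10100000, 0b11110000, "deny"),     # matches keys of the form 1010xxxx
        (0b10000000, 0b10000000, "permit"),   # matches keys of the form 1xxxxxxx
        (0b00000000, 0b00000000, "default"),  # matches everything
    ]
    for key in (0b10101111, 0b10010000, 0b00000001):
        print(f"{key:08b} -> {tcam_lookup(rules, key)}")
```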

Cross-point Resistive Memory: Nonideal Properties and Solutions

Emerging computational resistive memory is a promising candidate to overcome DRAM challenges and the memory wall bottleneck. However, its cell-level and array-level nonideal properties significantly degrade the reliability, performance, accuracy, and energy efficiency during memory access and analog computation. Cell-level nonidealities include nonlinearity, asymmetry, variability, etc. Array-level nonidealities include interconnect resistance, parasitic capacitance, sneak path, etc. This review summarizes solutions that can mitigate the impact of these nonideal properties. Firstly, we introduce several typical resistive memory devices with a focus on their switching modes and characteristics. Secondly, we review resistive memory cells and memory array structures, including 1T1R, 1R, 1S1R, 1TnR, and CMOL. We also give an overview of 3D cross-point arrays and their structural properties. Thirdly, we analyze the impact of cell-level and array-level nonideal properties during memory access and analog arithmetic operation, with a focus on the dot product operation and matrix-vector multiplication. Fourthly, we discuss how to mitigate these nonideal properties by static physical and geometric parameter optimization and dynamic runtime optimization from the viewpoint of cell-array interaction and codesign. Dynamic runtime operation schemes include line connection, voltage bias, logical-to-physical mapping, state partition, read reference setting, and switching mode reconfiguration. We also highlight challenges for multilevel cell cross-point arrays and 3D cross-point arrays during these operations. Finally, we survey design considerations for peripheral circuits. We also portray a unified reconfigurable computational memory architecture.
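The analog matrix-vector multiplication mentioned in this abstract can be sketched for the ideal case, where each cell contributes a current by Ohm's law and each column sums its cell currents by Kirchhoff's current law. The sketch deliberately ignores every nonideality the review discusses (interconnect resistance, sneak paths, nonlinearity, variability); the function name and the row-input/column-output convention are assumptions.

```python
from typing import List


def crossbar_mvm(conductance: List[List[float]], voltages: List[float]) -> List[float]:
    """Ideal column currents: I[j] = sum over rows i of V[i] * G[i][j]."""
    if len(conductance) != len(voltages):
        raise ValueError("one input voltage is needed per crossbar row")
    n_cols = len(conductance[0])
    return [
        sum(voltages[i] * conductance[i][j] for i in range(len(voltages)))
        for j in range(n_cols)
    ]


if __name__ == "__main__":
    g = [[1.0, 2.0],
         [3.0, 4.0]]   # cell conductances, rows x columns (arbitrary units)
    v = [1.0, 0.5]     # voltages applied to the rows
    print(crossbar_mvm(g, v))
```

Each output current is one dot product between the input voltage vector and a column of stored conductances, so the whole array evaluates a matrix-vector product in a single read step under these ideal assumptions.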
