Logic encryption is an IC protection technique for preventing an IC design from overproduction and unauthorized use. It hides a designs functionality by inserting key gates and key inputs, such that a secret key is required to activate the design for functioning correctly. The security of a logic encryption algorithm is evaluated according to the difficulty of cracking the secret key. The state-of-the-art attack method identifies a secret key with a series of SAT solving calls to prune all the incorrect keys. Although it can break most of the existing logic encryption algorithms within a few hours, we observe that there exist two enhancements for increasing its efficiency. First, we introduce a pre-process to identify and eliminate redundant key inputs for simplifying the SAT problems. Second, we present a key checking process for increasing the pruned incorrect keys in each SAT solving iteration. We conducted the experiments on a set of benchmark circuits encrypted by six different logic encryption algorithms. The results show that the enhanced method can successfully unlock 10 benchmark circuits which originally cannot be cracked within 1 hour. For all the benchmark circuits, the average speedup is approximately 2.2x. Furthermore, a recent logic encryption method locks a design by creating cyclic paths, which can invalidate the SAT-based attack method. We analyze the impact of the cyclic paths and propose an enhancement to break the cyclic logic encryption method.
Ordered escape routing is a critical issue in high-speed PCB routing. Differential pair and thermal-blockage-avoided are useful in PCB design to obtain high noise immunity and low electromagnetic interference. In this paper, a Min-cost Multi-commodity Flow (MMCF) approach is proposed to solve the ordered escape routing. First, the characteristic of grid pin array and staggered pin array is analyzed and then a basic network model is used to convert ordered escape routing to MMCF model. To satisfy the constraints of ordered escape routing, three novel transformations, such as non-crossing transformation, ordering transformation and capacity transformation, are used to convert the basic network model to the final correct MMCF model. After that, the differential pair in ordered escape routing is discussed. Finally, a method to deal with the blockage issue is proposed. Experimental results show that our method achieves 100% routability for all the test cases. The method can get both a feasible solution and an optimal solution for ordered escape routing. Compared to published approaches, our method improves in both wire length and CPU time remarkably. At the same time, the proposed method can effectively avoid the blockage and deal with the differential pair.
As designs continue to grow in size and complexity, EDA paradigm shifts from flat to hierarchical timing analysis. In this paper, we present compact and accurate timing macro modeling, which is the key to efficient and accurate hierarchical timing analysis. Our goal is to contain only a minimal amount of interface logic in our timing macro model. The main idea is to separate the interface logic into variant and constant timing regions. Then, the variant timing region is reserved for accuracy, while the constant timing region is reduced for compactness. For reducing the constant timing region, we propose anchor pin insertion and deletion by generalizing existing timing graph reduction techniques. Furthermore, we devise a lookup table index selection technique to achieve high model accuracy over the possible operating condition range. Compared with two common models used in industry, extracted timing model and interface logic model, our model has high model accuracy and small model size. Based on the TAU 2016 and 2017 timing macro modeling contest benchmark suites, our results show that our algorithm delivers superior efficiency and accuracy: Hierarchical timing analysis using our model can significantly reduce runtime and memory compared with flat timing analysis on the original design. Moreover, our algorithm outperforms TAU 2016 and 2017 contest winners in model accuracy, model size, model generation performance, and model usage performance.
The rise of mixed-critical embedded systems imposes novel challenges on the specification, development, and functional validation in a design flow. In the emerging dynamic scheduling context of mixed-criticality platforms, the system behaviour needs to be estimated in an early step in the design flow to assess the integration impact, especially for quality of service-driven, low-critical subsystems. We provide a modelling and integration flow for specifying, estimating, and evaluating software functions, ranging from an initial executable specification to an implementation candidate on an MPSoC. Based on a data-driven model to evaluate dynamic resource consumption effects of high-critical subsystems and the scheduling overhead, we propose a systematic method for constructing workload models of high-critical software components on the target. Our proxies provide an integration environment for low-critical functions by mimicking the high-critical temporal behaviour on the target. By integrating a low-critical video encoding subsystem with a benchmark suite as the high-critical subsystem we show that the performance model allows for evaluating end-to-end execution times in the low-critical function with an average error of 0.37 % and the application proxy only introduces a maximum error of 1.14 % in a performance evaluation.
Parallel on-chip voltage regulation, where multiple regulators are connected to the same power grid, has recently attracted significant attention with the proliferation of small on-chip voltage regulators. In this paper, the number, size, and location of parallel low-dropout (LDO) regulators and intentional decoupling capacitors are optimized using mixed integer non-linear programming (MINLP) formulation. The proposed optimization function concurrently considers multiple objectives such as area, power noise, and overall power consumption. Certain objectives are optimized by putting constraints to the other objectives with the proposed technique. Additional constraints have been added to avoid the overlap of LDOs and decoupling capacitors in the optimization process. The results of an optimized LDO allocation in the POWER8 chip is compared with the recent LDO allocation in the same IBM chip in a case study where a 20% reduction in the noise is achieved. The results of the proposed multi-criteria objective function under a different area, power, and noise constraints are also evaluated with a sample ISPD'11 benchmark circuits in another case study.
The legalization step is performed after global placement where wire length and routability are optimized or during timing optimization where buffer insertion or gate sizing are applied to meet timing requirements. Therefore, an ideal legalization approach must preserve the quality of the input placement in terms of routability, wire length, and timing constraints. These requirements indirectly impose maximum and average cell movement constraints during legalization. In addition, the legalization step should effectively manage white space availability with a highly efficient runtime in order to be used in an iterative process such as timing optimization. In this article, a robust and fast legalization method called Eh?Legalizer for standard-cell placement is presented. Eh?Legalizer legalizes input placements while minimizing the maximum and average cell movements using a highly efficient novel network flow-based approach. In contrast to the traditional network flow based legalizers, areas with high cell utilizations are effectively legalized by finding several candidate paths and there is no need for a post-process step. The experimental results conducted on several benchmarks show that Eh?Legalizer results in 2.5 times and 3.3 times less the maximum and average cell movement, respectively, while its runtime is significantly (18x) lower compared to traditional legalizers. In addition, the experimental results illustrate the scalability and robustness of Eh?Legalizer with respect to the floorplan complexity. Finally, the detailed-routing results show detailed routing violations are reduced on average by 23% when Eh?Legalizer is used to generate legal solutions.
Owing to existing intellectual properties (IPs), pre-routed nets, and power/ground wires, the routing of a System on Chip (SoC) design demands to detour around multi-layer (ML) obstacles. Maze routing has been applied to solve practical routing constraints; however, the much higher complexity of a multi-layer routing graph than that of a single-layer (SL) routing graph increases the required runtime of conducting maze routing on a multi-layer routing graph. Traditional approaches for the ML obstacle-avoiding rectilinear Steiner tree (ML-OARST) problem are thus non-maze routing based approaches for runtime issue; on the other hand, they cannot be directly applied to deal with additional constraints such as variant edge weights on a routing layer. In this paper, we propose the first maze routing-based methodology with bounded exploration and path-assessed retracing to reduce runtime and routing cost for the constrained ML-OARST construction problem. The exploration of maze routing is bounded to reduce the runtime; the costs of connecting pins are computed to select Steiner points in the retracing phase. To further reduce the routing cost, we develop a Steiner point-based ripping-up and re-building scheme for altering tree topology. Experimental results on industrial and randomly generated benchmarks demonstrate that the proposed methodology can provide a solution with good quality in terms of routing cost and has a significant speed-up compared to the traditional maze routing. A commercial tool is also used to show the effectiveness of the proposed methodology.
As the conventional silicon-based CMOS technology marches towards the sub-10nm region, the problem of high leakage power becomes increasingly serious. Under this circumstance, the carbon-nanotube field effect transistors (CNFETs) emerge as a promising alternative to the conventional CMOS devices. However, they experience much larger variability than the CMOS devices, which results in a large circuit delay variation and hence, a significant timing yield loss. One of the main variation sources is the carbon-nanotube (CNT) density variation. However, it shows a special property not existing for CMOS devices, namely the asymmetric spatial correlation. In this work, we propose a novel global placement algorithm to reduce the timing yield loss caused by the CNT density variation. We apply a statistical timing characterization to the CNFET standard cells for estimating the delay of the circuit under CNT density variation. Then, we apply a segment-based placement strategy to reduce the delays of the statistically critical paths. Experimental results demonstrated that our approach effectively improves the timing yield.
Modern field-programmable gate array (FPGA) devices contain complex clock architectures on top of configurable logics. Unlike application specific integrated circuits (ASICs), the physical structure of clock networks in an FPGA is pre-manufactured and cannot be adjusted to different applications. Furthermore, clock routing resources are typically limited for high-utilization designs. Consequently, clock architectures impose extra clock constraints and further complicate physical synthesis tasks such as placement. Traditional ASIC placement techniques only optimize conventional design metrics such as wirelength, routability, power, and timing without clock legality consideration. It is imperative to have new techniques to honor clock constraints during placement for FPGAs. In this paper, we propose a high-performance FPGA placement engine called UTPlaceF 2.0 that optimizes wirelength and routability while honoring complex clock constraints. Our proposed approaches consist of an iterative minimum-cost-flow-based cell assignment as well as a clock-aware packing for producing clock-legal yet high-quality placement solutions. UTPlaceF 2.0 won the first place in ISPD17 clock-aware FPGA placement contest organized by Xilinx, outperforming the second- and the third-place winners by 4.0% and 10.0%, respectively, in routed wirelength with competitive runtime, on a set of industry benchmarks.
This paper presents a novel approach to overcome the challenges of third-party IP-integration based on software-defined System-on-Chips (SoC) and Graph Grammars. The IP-supplier prepares a HW-accelerated software library (HASL) for the SoC architect. As a key point of our approach, HW integration knowledge is encoded in the library as a set of rules. These rules are defined in the machine-readable standardized IP-XACT IEEE 1685-2009 format. The library preparation step also includes the generation of configurable HW drivers, schedulers and the software library functions. For the SoC architect, we have developed the graph-grammar-based IP-integration (GRIP) tool. The software application is developed using the functions supplied in the HASL. According to the utilized functions, the GRIP tool automatically integrates IP-blocks using the rule information supplied with the library. The SoC architecture and rules are transformed into the graph domain to apply graph rewriting methods. Iterative rule application spans a design space search tree with multiple candidate SoC architectures for a targeted software-defined SoC. The GRIP tool is model-driven and based on the Eclipse Modeling Framework (EMF). Applying code generation techniques, SoC candidate architectures can be transformed to hardware descriptions for the target platform. The HW/SW interfaces between SW library functions and IP blocks can be automatically generated for bare-metal or Linux-based applications. The approach is demonstrated with a case-study on the Xilinx Zynq-based ZedBoard evaluation board using a HASL for computer vision, which yielded 150x performance improvement. The generated drivers overhead is only 0.28% compared to the manually-written drivers.
Two novel layout generation methods of analog modules, a symmetrical twin-row-style for MOS transistors and a twisted common-centriod style for unit capacitors array, are introduced. Based on these limited layout styles and the reasonable algorithms, the symmetry and common-centriod placement patterns for analog devices are realized to guarantee the matching property. On this basis, as the most prominent contribution of this paper, the sensible channel routing based algorithms for two proposed layout styles could achieve the 100% routability due to the limited device placement and the corresponding lower routing complexity. The improved algorithms also bring benefits such as smaller layout area by maximizing the diffusion-sharing of MOS transistors, less routing layer usage for common-centroid devices array. Moreover, we apply our algorithms to layout designs with typical analog modules such as 2-stage operating amplifier and SAR-ADCs, generation results with the circuit simulations demonstrate the effectiveness of our algorithms in terms of the routability and matching property. These feasible and efficient algorithms can be also extended to apply varieties of essential MOS analog circuits simply.
Outsourcing of design and manufacturing processes makes integrated circuits (ICs) vulnerable to adversarial changes and raises concerns about their integrity. Reverse engineering the manufactured netlist helps identify malicious insertions. In this paper, we present an automated approach that, given a reference design description, infers high-level blocks in an untrusted gate-level (test) implementation. Using the structural connectivity of the netlists, we compute a geometric embedding for each wire in the circuits, which, then, is used to compute a bipartite matching between the nodes of the two designs and identify functional units in the test circuit. Experiments to evaluate the efficacy of the proposed technique on various-sized designs, including the multi-core processor OpenSparc T1, show that it can correctly match over 90% of gates in the test circuit to their corresponding block in the reference model.