Due to process variations, every path in a circuit has some probability of being timing-critical; this probability is the criticality of the path. Path criticality computation usually proceeds in two steps: identification of a candidate path set, followed by computation of path criticality. Because criticality computation is expensive, the candidate path set is chosen using simpler metrics. However, these metrics are not directly related to path criticality, so the set often contains low-criticality paths that do not need to be optimized or tested. In this paper, we propose a hierarchical technique that directly yields all paths above a criticality threshold. The circuit is divided into disjoint groups at various levels. We show that the criticality of a group at each level of the hierarchy can be computed from the criticality of the parent group and the local complementary delay within the group. Low-criticality groups are pruned at every level, making the computation efficient. This recursive partitioning and group criticality computation continues until the group criticality falls below a threshold. Beyond this point, path selection within a group is done using a branch-and-bound algorithm with criticality as the metric, which is feasible because our method for criticality computation is very efficient. Unlike other techniques, path selection and criticality computation are integrated, so that when path selection is complete, the path criticalities are also obtained. The proposed algorithm is tested on the ISCAS'85, ISCAS'89 and ITC'99 benchmark circuits, and the results are verified using Monte Carlo simulation. The experimental results suggest that the proposed method gives better accuracy on average with around a 90% reduction in run-time.
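The pruning idea in this abstract can be illustrated with a toy sketch: a group's criticality upper-bounds the criticality of every path inside it, so whole subtrees can be discarded before any path-level work. The `Group` class, its criticality values, and the leaf-level path lists below are illustrative stand-ins, not the paper's actual model.

```python
# Toy sketch of hierarchical criticality pruning (illustrative only).
class Group:
    def __init__(self, criticality, children=None, paths=None):
        self.criticality = criticality   # upper bound for all paths inside
        self.children = children or []
        self.paths = paths or []         # (path_name, criticality) pairs at leaves

def select_critical_paths(group, threshold):
    """Return all paths whose criticality meets `threshold`,
    pruning whole groups whose bound already falls below it."""
    if group.criticality < threshold:
        return []                        # prune: nothing inside can qualify
    if not group.children:               # leaf: examine individual paths
        return [p for p in group.paths if p[1] >= threshold]
    result = []
    for child in group.children:
        result.extend(select_critical_paths(child, threshold))
    return result

# Example: one low-criticality subtree is pruned without visiting its paths.
root = Group(0.9, children=[
    Group(0.05, paths=[("p1", 0.04)]),              # pruned as a whole
    Group(0.8, paths=[("p2", 0.75), ("p3", 0.1)]),  # only p2 survives
])
print(select_critical_paths(root, 0.5))  # → [('p2', 0.75)]
```

In the paper, the leaf-level step is a branch-and-bound search with criticality as the bounding metric rather than the simple filter shown here.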
Accurate timing prediction for real-time embedded software is becoming a problem due to the increasing complexity of computer architectures and the presence of mixed-criticality workloads. Probabilistic caches were proposed to bound Worst-Case Execution Time (WCET) estimates and help designers improve resource use in real-time embedded systems. Static Probabilistic Timing Analysis (SPTA) for probabilistic caches is nevertheless difficult to perform, because cache accesses depend on execution history, and the computational complexity of SPTA makes it intractable as the number of accesses increases. In this paper, we explore and improve SPTA for caches with an evict-on-miss random replacement policy using a state-space modeling technique. A non-homogeneous Markov model is employed for single-path programs in a discrete-time finite state-space representation. To keep this Markov model tractable, we limit the number of states and use an adaptive method for state modification. Experiments show that, compared to the state-of-the-art methodology, the proposed adaptive Markov chain approach provides better results at an occurrence probability of 10^-15: in terms of accuracy, the state-of-the-art SPTA results are more conservative by 11% on average. In terms of computation time, our approach is not significantly different from the state-of-the-art SPTA.
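As background for the SPTA setting, a hedged sketch of the classic per-access hit-probability bound for a fully associative evict-on-miss random-replacement cache with N lines: an access hits with probability at least (1 - 1/N)^k, where k is the number of intervening evictions (conservatively taken here as the number of intervening accesses). This is standard SPTA machinery, not the paper's adaptive Markov model itself.

```python
# Per-access hit-probability lower bound under evict-on-miss random
# replacement: each miss evicts a uniformly random line, so a cached
# line survives k potential evictions with probability (1 - 1/N)^k.
def hit_probability_lower_bound(reuse_distance, n_lines):
    """P(the line survived `reuse_distance` potential evictions)."""
    if reuse_distance is None:          # first access: cold miss
        return 0.0
    return (1.0 - 1.0 / n_lines) ** reuse_distance

def spta_hit_bounds(trace, n_lines):
    """Per-access hit-probability lower bounds for an address trace.
    Counting all intervening accesses (not just misses) is conservative."""
    last_seen = {}
    bounds = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            k = i - last_seen[addr] - 1     # intervening accesses
            bounds.append(hit_probability_lower_bound(k, n_lines))
        else:
            bounds.append(0.0)              # cold miss
        last_seen[addr] = i
    return bounds

print(spta_hit_bounds(["a", "b", "a"], n_lines=4))  # → [0.0, 0.0, 0.75]
```

The paper's contribution replaces this kind of per-access bound with a non-homogeneous Markov chain over cache states, truncated adaptively to remain tractable.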
Kernel code injection is a common behavior of kernel-compromising attacks, in which attackers achieve their goals by manipulating an OS kernel. Several security mechanisms have been proposed to mitigate such threats, but they all suffer from non-negligible performance overhead. This paper introduces a hardware reference monitor, called Kargos, which can detect kernel code injection attacks at nearly zero performance cost. Kargos monitors the behavior of an OS kernel from outside the CPU through the standard bus interconnect and the debug interface available on most major microprocessors. By watching execution traces and memory access events in the monitored target system, Kargos uncovers attempts to execute malicious code with kernel privilege. On top of this, we also applied the architectural supports for Kargos to the detection of ROP attacks: KS-Stack is a hardware component that builds and maintains shadow stacks, using the existing supports, to detect these attacks. According to our experiments, Kargos detected all the kernel code injection attacks we tested while increasing the computational load on the target CPU by less than 1% on average. The performance overhead of KS-Stack was also less than 1%.
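A shadow stack, the generic mechanism KS-Stack implements in hardware, can be modeled in a few lines: calls push the return address onto a protected copy, and every return must match the top entry. The class below is an illustrative software model; the names and structure are ours, not Kargos's actual design.

```python
# Illustrative software model of a shadow stack (the generic mechanism
# behind KS-Stack, modeled here for exposition only).
class ShadowStack:
    def __init__(self):
        self._stack = []

    def on_call(self, return_address):
        """Record the legitimate return address at call time."""
        self._stack.append(return_address)

    def on_return(self, target_address):
        """Return True if the return target matches the shadow copy;
        a mismatch signals a likely ROP-style control-flow hijack."""
        if not self._stack or self._stack.pop() != target_address:
            return False
        return True

ss = ShadowStack()
ss.on_call(0x1000)
ss.on_call(0x2000)
print(ss.on_return(0x2000))  # legitimate return → True
print(ss.on_return(0x9999))  # tampered return address → False
```

In hardware, the push/compare steps are driven by the execution traces Kargos already observes, which is why the reported overhead stays below 1%.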
In real-time mixed-criticality systems, Worst-Case Execution Time (WCET) analysis is required to guarantee that timing constraints are respected, at least for high-criticality tasks. However, the WCET is pessimistic compared to the real execution time, especially on multi-core platforms. Because WCET computation considers the worst-case scenario, whenever a high-criticality task accesses a shared resource on a multi/many-core platform, it is assumed that all cores use that resource concurrently. This pessimism leads to dramatic underutilization of the platform resources, or even to failure to meet the timing constraints. To increase resource utilization while preserving real-time guarantees for high-criticality tasks, previous works proposed a run-time control system that monitors execution and decides when interference from low-criticality tasks can no longer be tolerated. However, in these initial approaches, the points where the controller executes were statically predefined. In this work, we propose a dynamic run-time control that adapts its observations to on-line temporal properties, further increasing the dynamism of the approach and mitigating the unnecessary overhead of existing static approaches. Our dynamic adaptive approach controls the ongoing execution of tasks based on run-time information, and further increases the gains in resource utilization compared with static approaches.
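The control decision such a run-time controller makes can be sketched as a slack check: at an observation point, compare the time left before the high-criticality task's deadline against its remaining isolation WCET; when the margin that absorbs interference is exhausted, suspend low-criticality tasks. The interface and the margin rule below are illustrative assumptions, not the paper's actual controller.

```python
# Hypothetical sketch of the run-time interference-control decision.
# All parameters are in the same time unit (e.g., cycles or ms).
def control_decision(elapsed, remaining_wcet_isolation, deadline,
                     interference_margin):
    """Return 'suspend_low_crit' when interference from low-criticality
    tasks can no longer be tolerated, else 'continue'."""
    # Slack = time still available beyond what the high-criticality
    # task would need even if it ran alone from this point on.
    slack = deadline - elapsed - remaining_wcet_isolation
    return "continue" if slack >= interference_margin else "suspend_low_crit"

# Early on, plenty of slack: low-criticality tasks keep running.
print(control_decision(10, 40, 100, 20))  # → 'continue'
# Later, interference has eaten the slack: they are suspended.
print(control_decision(70, 25, 100, 20))  # → 'suspend_low_crit'
```

The paper's contribution is choosing *when* to evaluate such a check dynamically, from on-line temporal properties, instead of at statically predefined points.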
The speed requirement for routing table lookup and packet classification is rapidly increasing due to the growing number of packets that must be processed per second. Hardware-based packet classification relies on TCAM (ternary content addressable memory) to meet this speed requirement. However, TCAM consumes substantial power, and it supports only longest-prefix match and exact match, whereas classification rules also contain range match fields. Hence, range matches must be encoded as prefix matches to accommodate such rules in TCAM; in the worst case, one rule is encoded into (2W-2)^2 rules, where W is the number of bits used to represent the range. This work proposes a novel range matching architecture, along with a detailed analysis of range fields in a standard data set and in real-life classifier rules. The proposed architecture doesn't require any encoding or conversion, so there is a single TCAM entry for every rule. It is observed that just 4% of the rules in this data set are two-dimensional range rules, yet encoding them would increase the rule set size by four times in the best case and nearly 30 times in the worst case. The proposed range matching circuit operates in parallel with the TCAM without compromising speed, saving around 70% of power and around 61% of area. The proposed architecture is well suited for both IPv4 and IPv6 networks.
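The expansion the architecture avoids can be demonstrated with the standard range-to-prefix splitting: a W-bit range decomposes into at most 2W-2 prefixes, so a rule with two range fields needs up to (2W-2)^2 TCAM entries. The sketch below shows this background expansion, not the proposed circuit.

```python
# Standard range-to-prefix expansion for TCAM (background sketch).
# A W-bit range [lo, hi] splits into at most 2W-2 prefixes.
def range_to_prefixes(lo, hi, width):
    """Greedy minimal set of (value, prefix_len) prefixes covering [lo, hi]."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        bits = size.bit_length() - 1
        prefixes.append((lo, width - bits))
        lo += size
    return prefixes

# Worst case for W=4: the range [1, 14] needs 2*4-2 = 6 prefixes,
# so a rule with two such fields would need 6*6 = 36 TCAM entries.
print(len(range_to_prefixes(1, 14, 4)))   # → 6
print(len(range_to_prefixes(0, 15, 4)))   # full range: a single prefix
```

Storing each such rule as a single entry, with range comparison done in a parallel circuit, is what yields the reported power and area savings.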