Report Number: CSL-TR-92-520
Institution: Stanford University, Computer Systems Laboratory
Title: An Empirical Study of an Abstract Interpretation of Scheme Programs
Author: Kanamori, Atty
Author: Weise, Daniel
Date: April 1992
Abstract: Abstract Interpretation, a powerful and general framework for performing global program analysis, is being applied to problems whose difficulty far surpasses the traditional "bit-vector'' dataflow problems for which many of the high-speed abstract interpretation algorithms worked so well. Our experience has been that current methods of large scale abstract interpretation are unacceptably expensive. We studied a typical large-scale abstract interpretation problem: computing the control flow of a higher order program. Researchers have proposed various solutions that are designed primarily to improve the accuracy of the analysis. The cost of the analyses, and its relationship to accuracy, is addressed only cursorily in the literature. Somewhat paradoxically, one can view these strategies as attempts to simultaneously improve the accuracy and reduce the cost. The less accurate strategies explore many spurious control paths because many flowgraph paths represent illegal execution paths. For example, the less accurate strategies violate the LIFO constraints on procedure call and return. More accurate analyses investigate fewer control paths, and therefore may be more efficient despite their increased overhead. We empirically studied this accuracy versus efficiency tradeoff. We implemented two fixpoint algorithms, and four semantics (baseline, baseline + stack reasoning, baseline + contour reasoning, baseline + stack reasoning + contour reasoning) for a total of eight control flow analyzers. Our benchmarks test various programming constructs in isolation --- hence, if a certain algorithm exhibits poor performance, the experiment also yields insight into what kind of program behavior results in that poor performance. The results suggest that strategies that increase accuracy in order to eliminate spurious paths often generate unacceptable overhead in the parts of the analysis that do not benefit from the increased accuracy. Furthermore, we found little evidence that the extra effort significantly improves the accuracy of the final result. This suggests that increasing the accuracy of the analysis globally is not a good idea, and that future research should╩investigate adaptive algorithms that use different amounts of precision on different parts of the problem.