Thesis
Optimizing a primitive based approach to generic taint analysis
- Abstract:
-
Dynamic taint analysis is a fundamental technique in software security that tracks the flow of interesting or suspicious data during the execution of a target application. However, it is also known to suffer from excessively high runtime overhead which limits its overall practicality. The overhead is especially problematic when the taint analysis performed is generic. While this type of analysis is more powerful than standard taint status tracking, due to its support of user-defined custom taint policies, its versatility naturally implies that it cannot be optimized for a single task. Along these lines, this dissertation addresses the challenging problem of enhancing the performance of generic taint analysis.
Our research focuses on taint engines implemented using dynamic binary instrumentation. Essentially, the performance bottleneck of these systems stems from complex taint propagation code that is dynamically instrumented, at the level of machine instructions, into the target application. In order to facilitate the implementation of generic taint analysis, instrumentation is typically achieved by inserting transparent calls, which conveniently invoke propagation routines without disrupting the execution behaviour of the analysed application. However, these calls perform expensive context-switching, and therefore degrade overall runtime performance significantly.
We hypothesize that the overhead of generic taint analysis can be reduced in comparison to the slowdowns incurred by the analysis implemented with the use of transparent calls. This hypothesis is tested by investigating three optimizations. Firstly, call-free generic taint propagation is evaluated as a means to mitigate expensive context switching. Secondly, the idea of vectorized generic taint analysis is explored to determine the effectiveness of using SIMD features to propagate multiple taint labels simultaneously. Thirdly, dynamic fast path generation is proposed so that the execution of ineffective taint propagation can be adaptively elided. Our optimizations are presented with respect to a generic taint analysis driven by user-defined primitives, which act as building blocks for custom taint propagation. Crucially, this primitive based approach enables the user to focus on optimizing core pieces of functionality regarding the desired policy, without the need to delve into the complex internals of the system.
The optimizations are integrated into a generic taint tracker called the Taint Rabbit, and their effectiveness is evaluated on various real-world software and CPU-bound benchmarks. With average speed-ups of around 42% on SPEC CPU benchmarks, and as high as 99% on data compression tools, our results demonstrate the advancement of optimized and generic taint analysis.
Actions
Authors
Contributors
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Computer Science
- Role:
- Supervisor
- Funding agency for:
- Galea, J
- Programme:
- EPSRC Centre for Doctoral Training in Cyber Security
- DOI:
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
-
English
- Keywords:
- Subjects:
- Pubs id:
-
2043015
- Local pid:
-
pubs:2043015
- Deposit date:
-
2021-05-28
Terms of use
- Copyright holder:
- Galea, J
- Copyright date:
- 2020
If you are the owner of this record, you can report an update to it here: Report update to this record