White-Box Search-Based Software Engineering

There is an ever-increasing tendency to consider software engineering as an optimization problem, from bug detection to fault identification and, more recently, to software design, a perspective known today as SBSE (Search-Based Software Engineering). In that perspective, building software amounts to properly assembling existing components, or cleverly tuning the use of ensembles or portfolios of existing (black-box) software components. However, complex tasks often require different characteristics at different times during execution, and static, off-line approaches rapidly fail to fulfill such dynamic requirements: deciding which algorithm, heuristic, or operator to use from the available ensemble or portfolio is in fact a sequential decision-making problem, and thus amenable to Reinforcement Learning (RL).

In this new project, we propose to address this sequential decision-making problem using Multi-Armed Bandit (MAB) and Monte-Carlo Tree Search (MCTS) techniques, on-line, within existing (white-box) software. Two milestones will allow us to take the first steps on the path to RL-Based Software Engineering.

Within existing Constraint Programming solvers, MCTS will be used to improve the efficiency of successive tree walks, provided that a meaningful reward is designed and a history-preserving mechanism is implemented to accumulate knowledge across restarts.
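To make the idea concrete, here is a minimal sketch of UCT-style tree walks that preserve node statistics between restarts. It is an illustration under assumptions, not the project's implementation: the `Node` statistics, the exploration constant `c`, and the `reward_fn` (e.g. normalized depth reached before failure) are all hypothetical choices.

```python
import math
import random

class Node:
    """Statistics for one branching decision, preserved across restarts."""
    def __init__(self):
        self.visits = 0
        self.total_reward = 0.0
        self.children = {}  # decision -> Node

def ucb(child, parent_visits, c=1.4):
    """UCB score: exploitation (mean reward) plus exploration bonus."""
    if child.visits == 0:
        return float("inf")  # always try an unvisited decision first
    return (child.total_reward / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def tree_walk(root, candidate_decisions, depth, reward_fn):
    """One restart: descend the tree by UCB, then back-propagate the reward."""
    path = [root]
    node = root
    for _ in range(depth):
        for d in candidate_decisions:
            node.children.setdefault(d, Node())
        best = max(node.children,
                   key=lambda k: ucb(node.children[k], node.visits + 1))
        node = node.children[best]
        path.append(node)
    r = reward_fn(path)  # hypothetical: quality of this walk
    for n in path:       # accumulate knowledge for the next restart
        n.visits += 1
        n.total_reward += r
    return r
```

Because `root` outlives each call to `tree_walk`, the statistics gathered by one restart bias the decisions of the next one, which is the history-preserving mechanism mentioned above.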

Within highly distributed SAT solving, MAB can be used to dynamically tune the structure of the communication graph, allowing each node to gather clauses only from nodes that have proved useful in the recent past, yielding a scalable clause-sharing mechanism.
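A minimal sketch of this per-node selection, cast as a standard UCB1 bandit where each arm is a peer; the `PeerBandit` class and its reward signal (how useful a peer's shared clauses were) are hypothetical names introduced for illustration only.

```python
import math

class PeerBandit:
    """UCB1 over neighbour nodes: each arm is a peer solver; the reward
    is a hypothetical measure of how useful that peer's shared clauses
    were in the recent past."""
    def __init__(self, peers):
        self.peers = list(peers)
        self.counts = {p: 0 for p in peers}   # times each peer was chosen
        self.values = {p: 0.0 for p in peers} # running mean reward per peer
        self.t = 0

    def select(self):
        """Pick the peer to listen to in the next communication round."""
        self.t += 1
        for p in self.peers:          # play every arm once before UCB
            if self.counts[p] == 0:
                return p
        return max(self.peers,
                   key=lambda p: self.values[p]
                   + math.sqrt(2 * math.log(self.t) / self.counts[p]))

    def update(self, peer, reward):
        """Fold the observed usefulness of this peer into its running mean."""
        self.counts[peer] += 1
        n = self.counts[peer]
        self.values[peer] += (reward - self.values[peer]) / n
```

Run repeatedly, the bandit concentrates a node's attention on the peers whose clauses have recently paid off, which is exactly the dynamic tuning of the communication graph described above.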