Readings

Since 2022, I have developed the habit of sharing the papers I read. This practice offers a glimpse into my current focus and areas of expertise. By exploring these readings, you can better understand the topics that interest me and the research directions I am pursuing.


Table of Contents


Reinforcement Learning

2023

Book: A Course In Reinforcement Learning

Year: 2023

Authors: Dimitri P. Bertsekas

Link to book: Athena Scientific

Abstract:

This book is based on lecture notes prepared for use in the 2023 ASU research-oriented course on Reinforcement Learning (RL) that I have offered in each of the last five years, as the field was rapidly evolving. The purpose of the book is to give an overview of the RL methodology, particularly as it relates to problems of optimal and suboptimal control, as well as discrete optimization. More broadly, we will aim to provide a framework for structured thinking about RL and its connections to the decision and control methodology, which is couched on, but is not dominated by, mathematics.

Branch and Bound

2024

Learning to Branch with Offline Reinforcement Learning

Year: 2024

Journal: ICLR 2024 Conference

Authors: Shengyu Feng, Yiming Yang

Link to paper: OpenReview

Abstract:

Mixed Integer Linear Program (MILP) solvers are mostly built upon a branch-and-bound (B&B) algorithm, where the efficiency of traditional solvers heavily depends on hand-crafted heuristics for branching. Such a dependency significantly limits the success of those solvers because such heuristics are often difficult to obtain, and not easy to generalize across domains/problems. Recent deep learning approaches aim to automatically learn the branching strategies in a data-driven manner, which removes the dependency on hand-crafted heuristics but introduces a dependency on the availability of high-quality training data. Obtaining the training data that demonstrates near-optimal branching strategies can be a difficult task itself, especially for large problems where accurate solvers have a hard time scaling and producing near-optimal demonstrations. This paper overcomes this obstacle by proposing a new offline reinforcement learning (RL) approach, namely the Ranking-Constrained Actor-Critic algorithm, which can efficiently learn good branching strategies from sub-optimal or inadequate training signals. Our experiments show its advanced performance in both prediction accuracy and computational efficiency over previous methods for different types of MILP problems on multiple evaluation benchmarks.
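
Reading this one, it helps to keep in mind the basic shape of a learned branching policy: a network that scores the branching candidates at a node, trained from logged solver decisions. The sketch below is only an illustration of that shape with a simple ranking-style loss, not the paper's Ranking-Constrained Actor-Critic; the feature width, network, and data format are assumptions.

```python
# Illustrative only: a candidate scorer trained from logged (possibly suboptimal)
# branching decisions with a softmax ranking loss. An offline RL method such as
# the paper's would additionally constrain/weight this with value estimates.
import torch
import torch.nn as nn

class CandidateScorer(nn.Module):
    """Scores each branching candidate from its feature vector."""
    def __init__(self, n_features: int = 19, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, cand_feats: torch.Tensor) -> torch.Tensor:
        # (n_candidates, n_features) -> (n_candidates,) scores
        return self.net(cand_feats).squeeze(-1)

model = CandidateScorer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One gradient step on a single logged branching decision (dummy data).
cand_feats = torch.randn(30, 19)   # 30 candidate variables, 19 features each
chosen_idx = torch.tensor([4])     # index of the variable branched on in the log
loss = nn.functional.cross_entropy(model(cand_feats).unsqueeze(0), chosen_idx)
opt.zero_grad()
loss.backward()
opt.step()
```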

Learning in Spatial Branching: Limitations of Strong Branching Imitation

Year: 2024

Journal: ICLR 2024 Conference

Authors: Brais González-Rodríguez, Ignacio Gómez-Casares, Bissan Ghaddar, Julio González-Díaz, Beatriz Pateiro-López

Link to paper: arxiv

Abstract:

Over the last few years, there has been a surge in the use of learning techniques to improve the performance of optimization algorithms. In particular, the learning of branching rules in mixed integer linear programming has received a lot of attention, with most methodologies based on strong branching imitation. Recently, some advances have been made as well in the context of nonlinear programming, with some methodologies focusing on learning to select the best branching rule among a predefined set of rules leading to promising results. In this paper we explore, in the nonlinear setting, the limits on the improvements that might be achieved by the above two approaches: learning to select the best variable (strong branching) and learning to select the best rule (rule selection).
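
For context, the strong branching expert that these imitation methods target scores each fractional candidate by actually re-solving the two child relaxations and measuring the bound improvement. A rough sketch for a minimisation problem, with the relaxation re-solve left as a hypothetical callback:

```python
def strong_branching(candidates, parent_bound, solve_child, eps=1e-6):
    """Full strong branching sketch for a minimisation problem.

    `parent_bound` is the relaxation value at the current node, and
    `solve_child(var, direction)` is a hypothetical callback that re-solves the
    relaxation with the rounded-down ("down") or rounded-up ("up") bound added
    to `var` and returns the child's bound.
    """
    scores = {}
    for var in candidates:
        gain_down = max(solve_child(var, "down") - parent_bound, eps)
        gain_up = max(solve_child(var, "up") - parent_bound, eps)
        # The usual product rule for combining the two one-sided improvements.
        scores[var] = gain_down * gain_up
    return max(scores, key=scores.get), scores
```

The cost of those extra relaxation solves is exactly why one wants to learn a cheaper approximation of this rule.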

Machine Learning Augmented Branch and Bound for Mixed Integer Linear Programming

Year: 2024

Journal: Optimization and Control

Authors: Lara Scavuzzo, Karen Aardal, Andrea Lodi, Neil Yorke-Smith

Link to paper: arxiv

Abstract:

Mixed Integer Linear Programming (MILP) is a pillar of mathematical optimization that offers a powerful modeling language for a wide range of applications. During the past decades, enormous algorithmic progress has been made in solving MILPs, and many commercial and academic software packages exist. Nevertheless, the availability of data, both from problem instances and from solvers, and the desire to solve new problems and larger (real-life) instances, trigger the need for continuing algorithmic development. MILP solvers use branch and bound as their main component. In recent years, there has been an explosive development in the use of machine learning algorithms for enhancing all main tasks involved in the branch-and-bound algorithm, such as primal heuristics, branching, cutting planes, node selection and solver configuration decisions. This paper presents a survey of such approaches, addressing the vision of integration of machine learning and mathematical optimization as complementary technologies, and how this integration can benefit MILP solving. In particular, we give detailed attention to machine learning algorithms that automatically optimize some metric of branch-and-bound efficiency. We also address how to represent MILPs in the context of applying learning algorithms, MILP benchmarks and software.
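
Since the survey is organised around the decision points inside branch and bound, a bare-bones skeleton of the loop helps locate them. This is a generic sketch, not any particular solver; the node interface (.bound, .solution, .is_integer_feasible(), .children(var)) is a placeholder.

```python
import heapq
import itertools

def branch_and_bound(root, branch_rule, node_score,
                     primal_heuristic=None, add_cuts=None):
    """Skeleton of LP-based branch and bound for a minimisation MILP, with the
    learnable decision points passed in as callbacks: node selection
    (node_score), cutting planes (add_cuts), primal heuristics
    (primal_heuristic), and branching (branch_rule)."""
    incumbent, upper = None, float("inf")
    tie = itertools.count()                        # tie-breaker for the heap
    frontier = [(node_score(root), next(tie), root)]
    while frontier:
        _, _, node = heapq.heappop(frontier)       # node selection
        if node.bound >= upper:
            continue                               # pruned by bound
        if add_cuts is not None:
            node = add_cuts(node)                  # cutting planes
        if primal_heuristic is not None:
            sol, val = primal_heuristic(node)      # primal heuristics
            if sol is not None and val < upper:
                incumbent, upper = sol, val
        if node.is_integer_feasible():
            if node.bound < upper:
                incumbent, upper = node.solution, node.bound
            continue
        var = branch_rule(node)                    # branching decision
        for child in node.children(var):           # child sub-problems
            if child.bound < upper:
                heapq.heappush(frontier, (node_score(child), next(tie), child))
    return incumbent, upper
```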

2022

An Improved Reinforcement Learning Algorithm for Learning to Branch

Year: 2022

Journal: Artificial Intelligence (cs.AI); Optimization and Control

Authors: Q Qu, X Li, Y Zhou, J Zeng, M Yuan, J Wang, J Lv, K Liu, K Mao

Link to paper: arxiv

Abstract:

Most combinatorial optimization problems can be formulated as mixed integer linear programming (MILP), in which branch-and-bound (B&B) is a general and widely used method. Recently, learning to branch has become a hot research topic in the intersection of machine learning and combinatorial optimization. In this paper, we propose a novel reinforcement learning-based B&B algorithm. Similar to offline reinforcement learning, we initially train on the demonstration data to accelerate learning massively. With the improvement of the training effect, the agent starts to interact with the environment with its learned policy gradually. It is critical to improve the performance of the algorithm by determining the mixing ratio between demonstration and self-generated data. Thus, we propose a prioritized storage mechanism to control this ratio automatically. In order to improve the robustness of the training process, a superior network is additionally introduced based on Double DQN, which always serves as a Q-network with competitive performance. We evaluate the performance of the proposed algorithm over three public research benchmarks and compare it against strong baselines, including three classical heuristics and one state-of-the-art imitation learning-based branching algorithm. The results show that the proposed algorithm achieves the best performance among compared algorithms and possesses the potential to improve B&B algorithm performance continuously.
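
Two of the building blocks here are easier to picture in code. The Double DQN target is standard; the buffer below only illustrates mixing demonstration and self-generated transitions with an externally set ratio, whereas the paper's prioritized storage mechanism (and its superior network) adjusts this automatically and is not reproduced.

```python
import random
import torch

def double_dqn_target(q_online, q_target, reward, next_state, done, gamma=0.99):
    """Double DQN bootstrap target for one transition: the online network picks
    the greedy next action, the target network evaluates it."""
    if done:
        return torch.as_tensor(float(reward))
    with torch.no_grad():
        best_action = q_online(next_state).argmax()
        return reward + gamma * q_target(next_state)[best_action]

class MixedReplay:
    """Buffer holding demonstration and agent-generated transitions; `demo_ratio`
    controls how much of each batch comes from demonstrations."""
    def __init__(self, demo_ratio=0.5):
        self.demo, self.agent, self.demo_ratio = [], [], demo_ratio

    def sample(self, batch_size):
        n_demo = min(int(batch_size * self.demo_ratio), len(self.demo))
        n_agent = min(batch_size - n_demo, len(self.agent))
        return random.sample(self.demo, n_demo) + random.sample(self.agent, n_agent)
```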

Reinforcement Learning for Branch-and-Bound Optimisation Using Retrospective Trajectories

Year: May 2022

Journal: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence

Authors: Christopher W. F. Parsonson, Alexandre Laterre, Thomas D. Barrett

Link to paper: arxiv

Abstract:

Combinatorial optimisation problems framed as mixed integer linear programmes (MILPs) are ubiquitous across a range of real-world applications. The canonical branch-and-bound algorithm seeks to exactly solve MILPs by constructing a search tree of increasingly constrained sub-problems. In practice, its solving time performance is dependent on heuristics, such as the choice of the next variable to constrain ('branching'). Recently, machine learning (ML) has emerged as a promising paradigm for branching. However, prior works have struggled to apply reinforcement learning (RL), citing sparse rewards, difficult exploration, and partial observability as significant challenges. Instead, leading ML methodologies resort to approximating high quality handcrafted heuristics with imitation learning (IL), which precludes the discovery of novel policies and requires expensive data labelling. In this work, we propose retro branching; a simple yet effective approach to RL for branching. By retrospectively deconstructing the search tree into multiple paths each contained within a sub-tree, we enable the agent to learn from shorter trajectories with more predictable next states. In experiments on four combinatorial tasks, our approach enables learning-to-branch without any expert guidance or pre-training. We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables, with ablations verifying that our retrospectively constructed trajectories are essential to achieving these results.
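
The core construction is easy to prototype: once the search tree is finished, split it into short paths and treat each one as a training trajectory. The sketch below only extracts root-to-leaf paths from a toy tree representation; the paper's version further partitions the tree into sub-trees and defines the rewards along each path.

```python
def retrospective_trajectories(tree, root=0):
    """Deconstruct a finished B&B tree into root-to-leaf paths.

    `tree` maps node_id -> list of child ids (an assumed toy representation).
    Each returned path can be replayed as a short trajectory with predictable
    next states, instead of the long, jumpy trajectory of the original search.
    """
    trajectories, stack = [], [(root, [root])]
    while stack:
        node, path = stack.pop()
        children = tree.get(node, [])
        if not children:
            trajectories.append(path)       # leaf reached: one trajectory
            continue
        for child in children:
            stack.append((child, path + [child]))
    return trajectories

# Toy tree: root 0 branches into 1 and 2; node 1 branches into 3 and 4.
print(retrospective_trajectories({0: [1, 2], 1: [3, 4]}))
# [[0, 2], [0, 1, 4], [0, 1, 3]]
```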

2020

Reinforcement Learning for Variable Selection in a Branch and Bound Algorithm

Year: May 2020

Journal: Machine Learning (stat.ML)

Authors: Marc Etheve, Zacharie Alès, Côme Bissuel, Olivier Juan, Safia Kedad-Sidhoum

Link to paper: arxiv

Abstract:

Mixed integer linear programs are commonly solved by Branch and Bound algorithms. A key factor of the efficiency of the most successful commercial solvers is their fine-tuned heuristics. In this paper, we leverage patterns in real-world instances to learn from scratch a new branching strategy optimised for a given problem and compare it with a commercial solver. We propose FMSTS, a novel Reinforcement Learning approach specifically designed for this task. The strength of our method lies in the consistency between a local value function and a global metric of interest. In addition, we provide insights for adapting known RL techniques to the Branch and Bound setting, and present a new neural network architecture inspired from the literature. To our knowledge, it is the first time Reinforcement Learning has been used to fully optimise the branching strategy. Computational experiments show that our method is appropriate and able to generalise well to new instances.
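
The abstract's "consistency between a local value function and a global metric" is the part worth unpacking. One natural reading, which is my assumption rather than the paper's exact formulation, is that the global metric is the final tree size and the local target for a node is the size of the subtree solved below it, so that minimising local values minimises the global metric.

```python
def subtree_sizes(tree, root=0):
    """Size of the subtree below every node of a finished B&B tree
    (`tree` maps node_id -> list of child ids). Under the assumption above,
    these are natural regression targets for a node-level value function."""
    sizes = {}
    def visit(n):
        sizes[n] = 1 + sum(visit(c) for c in tree.get(n, []))
        return sizes[n]
    visit(root)
    return sizes

print(subtree_sizes({0: [1, 2], 1: [3, 4]}))
# {3: 1, 4: 1, 1: 3, 2: 1, 0: 5}
```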

Improving Learning to Branch via Reinforcement Learning

Year: 2020

Journal: NA

Authors: H Sun, W Chen, H Li, L Song

Link to paper: OpenReview

Abstract:

Branch-and-Bound (B&B) is a general and widely used algorithm paradigm for solving Mixed Integer Programming (MIP). Recently there has been a surge of interest in designing learning-based branching policies as a fast approximation of strong branching, a human-designed heuristic. In this work, we argue strong branching is not a good expert to imitate for its poor decision quality when turning off its side effects in solving linear programming. To obtain more effective and non-myopic policies than a local heuristic, we formulate the branching process in MIP as reinforcement learning (RL) and design a policy characterization for the B&B process to improve our agent by novelty search evolutionary strategy. Across a range of NP-hard problems, our trained RL agent significantly outperforms expert-designed branching rules and the state-of-the-art learning-based branching methods in terms of both speed and effectiveness. Our results suggest that with carefully designed policy networks and learning algorithms, reinforcement learning has the potential to advance algorithms for solving MIPs.
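
The agent here is improved with a novelty-search evolution strategy rather than gradient-based RL. The sketch below is only the vanilla ES update, to show the general perturb-evaluate-recombine mechanism; the novelty-search part (scoring behavioural novelty instead of, or alongside, fitness) is omitted.

```python
import numpy as np

def es_step(theta, evaluate, sigma=0.1, lr=0.01, population=32):
    """One vanilla evolution-strategies update on policy parameters `theta`:
    sample Gaussian perturbations, evaluate each perturbed policy (for branching
    this could mean solving a batch of MILPs and measuring speed or tree size),
    then move along the fitness-weighted average perturbation."""
    rng = np.random.default_rng()
    noise = rng.standard_normal((population, theta.size))
    fitness = np.array([evaluate(theta + sigma * eps) for eps in noise])
    advantage = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    return theta + lr / (population * sigma) * noise.T @ advantage

# Toy check: maximise -||theta||^2, whose optimum is the origin.
theta = np.ones(5)
for _ in range(200):
    theta = es_step(theta, evaluate=lambda p: -np.sum(p ** 2))
```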

2014

Learning to search in branch and bound algorithms

Year: 2014

Journal: Advances in neural information processing systems, 2014

Authors: H He, H Daume III, JM Eisner

Link to paper: neurips

Abstract:

Branch-and-bound is a widely used method in combinatorial optimization, including mixed integer programming, structured prediction and MAP inference. While most work has been focused on developing problem-specific techniques, little is known about how to systematically design the node searching strategy on a branch-and-bound tree. We address the key challenge of learning an adaptive node searching order for any class of problem solvable by branch-and-bound. Our strategies are learned by imitation learning. We apply our algorithm to linear programming based branch-and-bound for solving mixed integer programs (MIP). We compare our method with one of the fastest open-source solvers, SCIP; and a very efficient commercial solver, Gurobi. We demonstrate that our approach achieves better solutions faster on four MIP libraries.
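
Node selection is the one heuristic this paper learns, and it slots into B&B as nothing more than the ordering of the open-node frontier. A minimal sketch, with the learned scorer as a placeholder for a model trained by imitation of oracle orderings:

```python
import heapq
import itertools

class LearnedNodeSelector:
    """Best-first frontier ordered by a learned score instead of, say, the LP bound.
    `score_fn` is a placeholder for the trained model; lower scores pop first."""
    def __init__(self, score_fn):
        self.score_fn = score_fn
        self.heap = []
        self.tie = itertools.count()   # keeps the heap from comparing node objects

    def push(self, node):
        heapq.heappush(self.heap, (self.score_fn(node), next(self.tie), node))

    def pop(self):
        return heapq.heappop(self.heap)[-1]

# Plain best-bound search is recovered with score_fn=lambda node: node.bound.
```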

Branch and Cut

2024

Learning Cut Generating Functions for Integer Programming

Year: 2024

Journal: Optimization and Control

Authors: Hongyu Cheng, Amitabh Basu

Link to paper: arxiv

Abstract:

The branch-and-cut algorithm is the method of choice to solve large scale integer programming problems in practice. A key ingredient of branch-and-cut is the use of cutting planes which are derived constraints that reduce the search space for an optimal solution. Selecting effective cutting planes to produce small branch-and-cut trees is a critical challenge in the branch-and-cut algorithm. Recent advances have employed a data-driven approach to select optimal cutting planes from a parameterized family, aimed at reducing the branch-and-bound tree size (in expectation) for a given distribution of integer programming instances. We extend this idea to the selection of the best cut generating function (CGF), which is a tool in the integer programming literature for generating a wide variety of cutting planes that generalize the well-known Gomory Mixed-Integer (GMI) cutting planes. We provide rigorous sample complexity bounds for the selection of an effective CGF from certain parameterized families that provably performs well for any specified distribution on the problem instances. Our empirical results show that the selected CGF can outperform the GMI cuts for certain distributions. Additionally, we explore the sample complexity of using neural networks for instance-dependent CGF selection.
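
As a reference point for what a cut generating function generalizes, the GMI cut itself can be written down directly from a simplex tableau row. A sketch of the coefficient formula, assuming the row belongs to a basic integer variable with fractional value and all nonbasic variables sit at lower bound zero:

```python
import math

def gmi_cut(row, rhs, integer_vars):
    """Gomory mixed-integer cut from one simplex tableau row.

    `row[j]` are the tableau coefficients of the nonbasic variables, `rhs` the
    (fractional) value of the basic integer variable, and `integer_vars` the set
    of nonbasic indices required to be integer. Returns `pi` such that the cut
    reads sum_j pi[j] * x_j >= 1.
    """
    f0 = rhs - math.floor(rhs)
    assert 0 < f0 < 1, "the basic variable must take a fractional value"
    pi = {}
    for j, a in row.items():
        if j in integer_vars:
            f = a - math.floor(a)
            pi[j] = f / f0 if f <= f0 else (1 - f) / (1 - f0)
        else:
            pi[j] = a / f0 if a >= 0 else -a / (1 - f0)
    return pi
```

A learned cut generating function replaces this fixed formula with a parameterized family whose parameters are chosen, from data, to keep the branch-and-cut tree small for the instance distribution at hand.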