REINFORCE with Bound-guided Gradient Estimator for the traveling salesman problem toward scale generalization
Abstract: Highlights
• RIDGE introduced: knowledge-inspired REINFORCE for TSP with BHH Theorem.
• RIDGE uses sliding average shortest path as adaptive baseline for stability.
• RIDGE tops small-scale TSP accuracy, boosts large-scale generalization.
• Smaller training sets enhance model generalization.
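
The highlights name two ingredients: the BHH (Beardwood-Halton-Hammersley) theorem, which says that for n points drawn uniformly from a region of area A the optimal tour length grows like β·sqrt(n·A), and a sliding average of shortest tour lengths used as a REINFORCE baseline. The sketch below is only an illustration of how such pieces could fit together, not the paper's implementation. The names `bhh_tour_length`, `SlidingAverageBaseline`, and `advantages` are hypothetical, the constant 0.7124 is a commonly quoted empirical estimate of β, and both the choice to seed the baseline with the BHH estimate and the choice to track the shortest tour of each batch are assumptions made here.

```python
import math
from collections import deque

# Commonly quoted empirical estimate of the BHH constant (assumption, not from the source).
BHH_CONSTANT = 0.7124


def bhh_tour_length(n, area=1.0):
    """Asymptotic BHH estimate of the optimal tour length for n uniform points."""
    return BHH_CONSTANT * math.sqrt(n * area)


class SlidingAverageBaseline:
    """Mean of the most recent `window` shortest tour lengths (hypothetical helper)."""

    def __init__(self, window):
        self.values = deque(maxlen=window)  # old entries drop out automatically

    def update(self, tour_length):
        self.values.append(tour_length)

    def value(self, default):
        # Fall back to `default` until at least one tour length has been recorded.
        if not self.values:
            return default
        return sum(self.values) / len(self.values)


def advantages(tour_lengths, baseline, n):
    """Return (length - baseline) per sampled tour, then record the batch's shortest tour.

    In REINFORCE these advantages would weight the log-probability gradients
    of the sampled tours; a negative value marks a tour shorter than the baseline.
    """
    b = baseline.value(default=bhh_tour_length(n))  # assumption: BHH seeds the baseline
    adv = [length - b for length in tour_lengths]
    baseline.update(min(tour_lengths))
    return adv
```

For example, with `baseline = SlidingAverageBaseline(window=2)` and 100-city instances, the first call to `advantages` compares sampled tour lengths against the BHH estimate, and later calls compare them against the running average of recent best tours.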