Most non-convex optimization algorithms I have come across so far rely essentially on random restarts to find a better solution — for example, Genetic Algorithms, Simulated Annealing, and Metropolis-Hastings MCMC. Even in Stochastic Gradient Descent, we can think of training on each batch as a random perturbation.
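To make the baseline concrete, here is a minimal sketch of random-restart hill climbing on a hypothetical noisy double-well objective (the function, step sizes, and restart count are all illustrative assumptions, not from any particular paper):

```python
import random

random.seed(0)  # for reproducibility of this sketch

def noisy_objective(x, noise=0.1):
    # Hypothetical double-well: local minimum near x = +2,
    # global minimum near x = -2, plus additive Gaussian noise.
    return (x**2 - 4)**2 + x + random.gauss(0, noise)

def hill_climb(x0, step=0.05, iters=500):
    # Greedy local search: accept a random nearby candidate
    # only if its (noisy) objective value is lower.
    x, fx = x0, noisy_objective(x0)
    for _ in range(iters):
        cand = x + random.uniform(-step, step)
        fc = noisy_objective(cand)
        if fc < fx:
            x, fx = cand, fc
    return x, fx

def random_restart(n_restarts=20):
    # Restart the local search from random initial points and
    # keep the best result seen across all restarts.
    best_x, best_f = None, float("inf")
    for _ in range(n_restarts):
        x, fx = hill_climb(random.uniform(-4, 4))
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

x_best, f_best = random_restart()
```

A single hill climb started in the right-hand basin gets stuck near x = +2; the restarts are what give the search a chance to land in the basin of the global minimum near x = -2.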
There are some non-statistical algorithms, such as Dynamic Programming and Backtracking, that can guarantee a global optimum, but I haven't found any literature applying them to noisy data.
What strategies do algorithms use (other than random restarts) to escape local optima and search for the global optimum on noisy data?