Travelling Salesman Problem as Sorting in 2D: Travelling Salesman Problem - dynamic programming and computational geometry approach

Travelling Salesman Problem as Dynamic Programming and Computational Geometry Approach

Sebastian Böhmer, Munich, 2016

Abstract

Travelling Salesman Problem (TSP) is a famous problem in informatics. It is common known and part of many problems, which are solved by matching and decent gradient search, currently.
Now, a new and best method to solve it correctly is announced here in this paper.
The invented solution guarantee a complete and correct result in the fastest way. Other methods have been tested since decades, and no correct solution was found. This is the only way to solve the TSP in best performance. There is no other method to solve it faster and also correctly. Tests have shown the correctness of the solution to 100%. It will be possible to program the solution in different implementations. Although, the basic path of this solution stays by the end of time.

Introduction

Different forms of the most three computer problems, are often kind of orientation, recognition and optimisation. They are all known to solve Travelling Salesman Problems (TSPs). This means, they solve the optimization of the TSP hidden.
Orientation gets its input from environment and decide depending on past movements the current position. The movement is unordered typically. Reducing this unordered movement to essential edges will define the space in this environment. TSP helps importantly to reduce the order of edges.
Recognition is a case of pattern matching typically. TSP helps by detecting edges on captured images. Edges are detected by pre-processing methods of image preparation.
A typical situation for optimisation is in numerical problems as route planning.
There are different kinds of routes. For example, one is to find the shortest round-trip for e.g. circuit tests on circuit's boards. You can test the correct connection of your components on boards by burning one line from start to an endpoint by defining the shortest round-trip, which is not crossing especially.

Description of TSP

The correct description of TSP is found on Wikipedia. www.wikipedia.com
There are different types of TSP. One with weights on their edges, one with edges not bidirectional and another not reaching all vertices by arbitrary other vertex. So TSP is a question on every network.
We assume here those three different kind of TSP are implemented by different length function. This means, if we have e.g. no edge between the vertices, we assume a infinite length between those points.

Here, we concentrate only of one general TSP. This TSP is defined as :

All vertices can be connected by all vertices on direct line.
There are no weights on those lines.
This means, we take only the euclidean length of those lines into account.
The way on those line is bidirectional and equal in length for both directions.

Now the TSP is the question :
What is the shortest round-trip through all vertices?

Round-trip means, we have to end at the point, we started.

In analogous of the metaphor from travelling Salesman, the situation is as following:
A salesman will sale it's product in (e.g.) 5 cities. In the end, he/she will get back home again and should have visited all cities.
Because his/her horses get weak on long journeys, they should take the shortest path.
This lead us to the description shortest round-trip. What means: the path have to end at start point and path should be the shortest.

The TSP is very good solvable by humans. This means they found a relatively good solution in very fast time.

In this paper a new and innovative method helps solving the Travelling Salesman Problem by an computer.

New Method

The basic idea came up for about two years. The experiments were going on since two years, in that time. It was invented only by me.
The basic idea is, that TSP Algorithm are sorting algorithms. This means, TSP is an sorting, and not a search.
Imagine, you have a start point, which is also the end point, you have two directions you can decide and the distances between the points in order are in the sum the smallest ones.

You can say sorting one dimensional is a simplified TSP with y=0 for all points.

Lets take a look on this example.
A sorted list in one dimension has the smallest or equal distance between the points to any other order of those points.
Distance is here in this paper every time the euclidean distance.

E.g.:
unordered
7 3 9 2 5 =
distances
4 + 6 + 7 + 3 + 2 = 22 = D_uno

ordered
2 3 5 7 9 =
distances
   1 + 2 + 2 + 2 + 7 = 14 = D_ord

Theorem 1:

D_ord <= D_uno

The equal case is possible, because there could be several orders with the same length. Later, this means there can be multiple ways with same length in TSP.

It is assumed in Theorem 1, that solving TSP is a Sorting-Algorithm.

Comparison

The book "The Travelling Salesman Problem: A Computational Study" from 2007 contains the current state of the art methods of TSP.
All authors expect searching in space or solving with equations.
Searching is completely off topic. Searching means you can always get into local minima and did not find the best solution. You would find a sub-optimal solution. Then, it is tried to improve this solution. However, a sub-optimal minima are far away from best solution in TSP. It shows, that a small error in TSP implies big costs.
It is far away and have big costs, because a change in the order of two points will have a change of at least three edges. At least means here, that if the order of two points are wrong, then it is typically, that the found minima will lead to at least three edges wrong, and mostly more than that edges. This means, if you interpret the order of two points wrong, your algorithm tends to interpret also other points wrong, because the order of points are depending of other points order.

So, the methods are not suitable for good performance. This means, good results in short time. Due to the fact, that small errors result in big costs and only a tiny speed-up with searching instead of sorting, leads to bad performance compared to sorting.
Further, methods in the survey as equation solving are described. It is known from computer-scientists, that every algorithm is possible to be described in equations.
This method is not very suitable. For many people it is more complicated to describe their Problems in equations than in program-code.

In the end, this complex method of sorting is still relevant. And is not redeemed by other methods.

There are people who believe that an algorithm of solving the TSP is as complex as O(n!). More over, they believe, that a sorting algorithm will have a such a complex merging, that leads to O(n!) again. They don't believe, that sorting is possible to solve TSP.

Merging is a part of sorting, which we will describe later.
The assumption is high, that TSP is in best-case at least O(n!) for complete and correct solutions. Although, it is not clear, if it could go faster.
One opposite argument is: As long it is not clear to solve faster, we could assume, that it behaves as one-dimensional sorting.

For one-dimensional sorting, it is possible to program an algorithm in O(n!) instead of the proved fastest O(n log(n)). It is still possible to program an algorithm which behaves more complex O(n!) than necessary O(n log(n)).

We assume, that met for TSP also. And we just have to find the proved fastest TSP solver.

Motivation

The TSP is a Problem, which is in very many places and times present. In cell growing, in gravity, in biology, in psychology and in engineering. TSP is a problem that is currently tried to solve very often. It is tried to solve in different ways. Some people didn't know that the try to solve a TSP. They have their own domain specific language and use their own specialised features to solve their tasks. Although, they are redundant to a TSP.
This solution, in this paper, is a unique, best and fastest solution. It is comparable with the best algorithm in sorting of one dimensional problems.
It is first described fully in this paper. There will be probably implementations with about 20% faster and 20% fewer memory consuming. Although, those implementations are similar to the concept described here. For this concept here, it exists currently no other description.
Tests show that every tested TSP is solved correctly already. Because the possible TSPs are infinite in infinite space to which they belong, it is not possible to test all TSP constellations.

NP = P

There was long time ago the question in informatics: What are difficult problems. It was defined an O-Notation. E.g. O(n!). It was the answer as we could distinguish between different complexity of programmes.
Now, we distinguished between two kind of programmes. The programmes of problems which are solvable in polynomial time and with one machine (typically O(n^k), n ! = k), and problems, which could be solved in polynomial time also, but with infinite parallel machines simultaneous working. The first kind is called P = polynomial, the second kind is called NP = not deterministic polynomial. (see Wikipedia)
Those NP problems use typically more than polynomial time on single machines, else, if they run in polynomial time on one deterministic machine, they would belong to the class of polynomial time problems P.
The TSP is classified as NP hard. This means it is not possible to reduce the problem to an P problem. There is no polynomial algorithm which transforms in polynomial time the problem to an other polynomial problem.
One of the big questions in informatics is, if the NP hard problems could be solved in polynomial time with appropriate Algorithm. It is still not clear if NP is an own class, or if NP = P.
That two dimensional sorting algorithm in this paper is currently the only example, that there exists an algorithm in P, which solve NP hard problems.
As long as we did not find a TSP constellation, which is not solvable by this two dimensional sorting algorithm presented in this paper, we see that NP = P.

Tamar Algorithm - the new Approach

Tamar is here the name for the basic algorithm describing the here presented and new developed method of solving the TSP. The algorithm is the heart of solving TSP. Many methods were tried to handle the basic principle. This was the master piece. The algorithm is similar to the quick-sort algorithm for one dimension. Of course the divide and merge steps differ for two dimensions accordingly.

Although, the principle is the same: DIVIDE AND CONQUER

Divide

As described in Wikipedia, the quick-sort algorithm divide the input sequence into different parts. In this way, it sorts all elements smaller than an selected pivot element to the left and all bigger to the right. Equal values leave untouched, as for Tamar algorithm also. The selected pivot element is free to choose from the input sequence. The new two lists are divided again accordingly.
The divide step in Tamar is a little bit more complicated. We have a two dimensional space. Therefore we have to divide in two axis. Luckily, there is famous splitting method.
We just split the space by a cross in the centre of all used points. With the help of the cross, we sort the points in the four quadrants. This mean, we check for every point of the point lay in the upper-left, upper-right, lower-right or lower-left corner. For the case one point is the centre, we decide to move it into the upper-left, and so on for points on the axis. This method is similar to the famous binary space partition tree. So, it is called BSP here.

Conqueror

The merging step in the divide and conquer approach is quite difficult. The problem is, that we have no relation-order which we could apply. There is no way to compare the point p1 with p2. In typical relation-orders, as smaller than, we can compare the values directly. It is not difficult to distinguish between q1 and q2, in one dimension. The figure-system is designed to solve this comparison. When we have two dimensions now, we have no way to compare them directly. In the end, we have no relation-order for two dimensional points. This is true for higher dimensional points, also.

Here in this approach, the order is no problem. The presented method does not compare points by an order. We use computational geometrical approaches. This mean, for the first steps of until four points, we use geometrical methods like calculating the mean and decide in which triangular the forth point lay. They will described in the final paper, presented in 2017, completely.

The input to the conqueror step is always two ways. w1 and w2. The count of points in both ways specifies which kind of conquer step we use. If the count is below of four points, we use the mentioned geometric approach.

If the count of points is more than four for both ways together, we concentrate on another computational, geometrical and general merging approach.

The divide and conquer approach is based on the assumtion:

Theorem 2:

Merging two partial correct solutions of TSP, gives a new correct solution of a bigger TSP.

General Merging by Aligning

The problem in shortest round-trip correlates with alignment. Most would expect to use the euclidean length of the way as criteria to value the best way. However, there is a problem with the euclidean length. It does not fit the typical criteria of a best round-trip.

E.g. in following example, it is explained why shortest way is not the most suitable for shortest round-trip.

On the left we have the shortest way, because the diagonal from p1 to p2 implies shorter length than the advantage by p2 to p4.
Although it is clear, that for alignments, the right picture fits better. The right picture brings home the bacon to round-trips. The alignment fits better with p2 aligned to edge p3,p4.
As you see the question in TSP is not the shortest length, it is the best alignment.
Here it is called: What is the best way for to collect another point too?

More over, it is interesting to see, that a detour is shorter, if we go over a long way between two cities/points.

Short way:

Long way:

The short way picture describes a route from Paris to Warshaw over Venedig.
The long way picture shows a route form Paris to Moscow over Venice.
It is clear, that the shorter way is less long than the long way. And probably it is clear that the detour over venedig is for the tour to Moscow shorter.

It is calculated like: detour_length = d(p1,p3) + d(p2,p3) - d(p1,p2)
Describing the length of way, needed to arrive in p3 = Venice also.

It results in following calculation, where |p1,...,pn| describes the euclidean way length reaching all points between p1 to pn:
|Paris, Warsaw| = 1591 km
|Paris, Venice, Warsaw| = 2391 km
→ d(Paris, Warsaw, Venice) = 2391 - 1591 = 800 km - detour of short way

While the tour,
|Paris, Moscow| = 3393 km (Paris, Prague, Kiev, Moskow)
|Paris, Venice, Moskow| = 3883 km
→ d(Paris, Moscow, Venice) = 3883 - 3393 = 490 km - detour of long way

Finally, the detour aligned to long ways is much shorter in euclidean length.

Theorem 3:

TSP is a question of alignment and not shortest way.

Used Metric

To implement the above question in TSP solver shortcut, it is introduced a new metric.

Here we call it SeboehDistance, or short d(p1, p2, p3).

(1) d(p1,p2,p3) = |p1,p3| + |p2,p3| - |p1,p2|

E.g.:

Example of detour - c1 is better than c2.

If we go from a to b and want to visit also c1 and c2, because we search the shortest round-trip. It is suitable to align c1 in that case, because the seboehDistance results in shorter detour-length. d(a,b,c1) < d(a,b,c2).
This is the method of alignment and results in an other interpretation of point-space. Another interpretation of space which the points belong to.
It is another metric based on euclidean distance.
If anybody know the exact name of this metric, please post it in the comments below.

Final Merging Technique

To solve the alignment we assume:

Theorem 4:

A local optimal alignment to convex hull edges, will result in global optimal alignment for TSP.

.... details soon coming!

Difficulties

One of the main trouble is the different order of alignment on other edges. The same points in order p1,p2,p3,p4 at edge a. Can align on edge b in order p2, p3, p1, p4. Currently that problem is solved by searching the best alignment between the points. That results in O(n²). For every point we have to visit the other points exactly one time. This will be done for all edges (here it is assumed we have also n edges). Then the complexity results in O(n³). This is done in divide-and-conquer practice. So, we reach a total complexity of O(n³ log(n)).

Another problem was to find the closest edge, to reduce the above complexity.
This is solved perfectly by an BSP (Binary Space Partition) tree.
The tree will result in fastest detection of closest edges.

This will influence the complexity by O(log(n)).

So, currently it is assumed that the complexity of shortcut is O(n^k log²(n)).

Experiments

There is an application to handle tests on the shortcut algorithm.
It is called TSP-Ackermann. Only in relation to the fast growing Ackermann-Function.

See: https://bitbucket.org/gruenblaues/tspseboeh
as link to the shortcut algorithm and the TSP-Ackermann test program.
Project name is: TSPSeboeh.

Points of one test (no.22)

Result on the left is shortcut and on the right is naive algorithm.
Both have same result. (test22)

The shortcut failes on that concave hull, currently. (test23)

The needed time is five times faster (1s) as the naive method (5s). (test41)

Detail of above image. Naive version needs five times longer than shortcut.

Results

It will be tested tsplib database. tsplib is a database of well-known TSP examples. They contain TSP constellations, which are already solved or have a close result only.
This is done, if the crowdfunding and Service page will work.

Conclusion

As long as not proofed by mathematicians, we just assume that shortcut will solve the TSP. It is similar to the question: Did we test all one-dimensional input sequences in sorting already?

Outlook

The nature is solving TSP all moment. They get best and close results all time.
The astronomy behaves with gravity as alignment. The biological cells behave as alignment. The enzymes structures build alignments according to their molecules. And last but not least, the economic participants behaves as alignment.

So the influence in the future is high for every simulation or calculation according those topics.

Do you see it also?

Kind regards, Sebastian

Donnerstag, 8. September 2016

Travelling Salesman Problem - dynamic programming and computational geometry approach - draft