Publication Entry

[Back]

Diploma and Master Theses (authored and supervised):

M. Schwengerer:
"Algorithm Selection for the Graph Coloring Problem";
Supervisor: N. Musliu; Institut für Informationssysteme, 2012; final examination: 11-20-2012.

English abstract:

The graph coloring problem (GCP) is one of the most-studied NP-hard problems in computer science. Given a graph, the task is to assign a color to all vertices such that no vertices sharing an edge receive the same color and that the number of used colors is minimal. In the recent years, various heuristic and exact approaches for this problem have been developed. However, all of them seem to have advantages and disadvantages, which highly depend on the concrete instance on which they are applied. Consequently, designing an algorithm which finds on each graph the best coloring is hard or, by analogy to the No Free Lunch theorems, even impossible.
One possibility to achieve a better performance is to predict for each instance the algorithm which achieves the best performance. This task is known as algorithm selection problem: Given a set of algorithms and a set of intrinsic features of a particular instance, select the algorithm which is predicted to show the best performance on that instance.
This thesis investigates the application of machine learning techniques to automatic algorithm selection for the GCP. For this purpose, we first present several specific features of a graph, which can be calculated in polynomial time. Then, we evaluate the performance of 7 state-of-the-art (meta)heuristic algorithms for the GCP based on experimental results on 1265 graphs of 3 public available instance sets. The results clearly show that none of the algorithms is superior to all others. In addition, we analyze the behavior of these algorithms on classes of instances with certain attributes. The experiments show that for each of these classes, there exists at least one heuristic which performs clearly better than the rest. In a subsequent step, we use the knowledge about the best-suited algorithm per instance in combination with intrinsic graph features to train 6 classification algorithms. These supervised learning methods are then used to predict for an unseen instance the most appropriate algorithm. For each classifier, we test multiple parameter settings. We further identify relevant subsets of features and investigate the impact of different data-preparation techniques on the performance of the classifiers. In addition, we study the effect of considering only a subset of heuristics on the overall quality of the prediction.
For a meaningful comparison with the underlying heuristics, we evaluate our proposed approach on a new generated set of instances. Our experiments show that algorithm selection based on machine learning is able to outperform all considered solvers regarding several performance criteria.

German abstract:

Das Graphenfärbeproblem ist eines der bekanntesten NP-schweren Probleme in der Informatik. Ziel dabei ist es, für einen gegebenen Graphen jedem Knoten eine Farbe zuzuweisen, sodass keine zwei Knoten, welche mittels einer Kante verbunden sind, die gleiche Farbe erhalten und dass die Anzahl der verwendeten Farben minimal ist. Da das Berechnen eine exakte Lösung dieses Problems im schlimmsten Fall eine exponentielle Laufzeit benötigt, wurde im Laufe der Jahre eine Vielzahl an verschiedenen (Meta)Heuristiken für das GCP entwickelt. Viele dieser Methoden weisen gute Erfolge auf, allerdings scheint es, als würden die Ergebnisse sehr oft von der konkreten Instanz abhängen. Dementsprechend ist es schwierig, wenn (in Analogie zu den No Free Lunch Theoremen) nicht sogar unmöglich, einen Algorithmus zu finden, welcher auf allen Graphen optimal ist.
Ein Lösungsansatz für dieses Problem wäre, nicht nur einen Algorithmus zu verwenden, sondern, abhängig von der konkreten Instanz, immer den geeignetsten auszuwählen. Bei dieser Herangehensweise, auch bekannt als Algorithm Selection, wird aus einer Menge von Algorithmen anhand bestimmter Attribute einer Instanz derjenige ausgewählt, von welchem auf dieser Instanz das beste Ergebnis prognostiziert wird.
Die vorliegende Arbeit befasst sich mit der Anwendung von Techniken des überwachten Lernens als Algorithm Selection für das GCP. Für diesen Zweck stellen wir verschiedene relevante Attribute eines Graphen vor, welche in polynomieller Zeit berechnet werden können. Des Weiteren evaluieren wir die Performance von 7 modernen (Meta)heuristiken auf 1265 öffentlich verfügbaren Instanzen. Die Ergebnisse dieser Experimente zeigen deutlich, dass keine der Heuristiken im Allgemeinen besser als jede andere ist. Zudem beweisen die Experimente, dass auf der einzelnen Untergruppen von Instanzen jeweils ein oder mehrere Algorithmen deutlich bessere Leistung als der Rest erzielen.
Im zweiten Teil dieser Arbeit wird die Information über den besten Algorithmus je Instanz mit ihren charakteristischen Attributen kombiniert, um damit 6 verschiedene Klassifikationsalgorithmen zu trainieren. Im Zuge dieser Experimente identifizieren wir erfolgreiche Attributkombinationen und evaluieren, welchen Einfluss verschiedene Attributtransformationstechniken ausüben. Darüber hinaus untersuchen wir, wie eine verringerte Anzahl von Auswahlmöglichkeiten (d.h. das Entfernen von Algorithmen aus der Menge an Lösungsalgorithmen) die Qualität der Vorhersagen verändert.
Im letzten Teil vergleichen wir die Performance eines Systems basierend auf Algorithm Selection mit den zugrundeliegenden Heuristiken auf einer Menge eigens erstellter Instanzen.
Diese Experimente zeigen eindeutig, dass Algorithm Selection in allen betrachteten Kriterien bessere Ergebnisse als die einzelnen Algorithmen erzielen kann.

Created from the Publication Database of the Vienna University of Technology.