Contributions to Proceedings:
I. Kordomatis, C. Herzog, R. Fayzrakhmanov, B. Krüpl-Sypien, W. Holzinger, R. Baumgartner:
"Web object identification for web automation and meta-search";
in: "Proceedings of the 3rd International Conference on Web Intelligence, Mining and Semantics",
C. David, D. Moreno, R. Akerkar (ed.);
Web object identification plays an important role in research fields such as information extraction, web automation, and web form understanding for building meta-search engines. In contrast to other works, we approach this problem by analyzing various spatial, visual, functional, and textual characteristics of web pages. We compute 49 unique features for all visible web page elements and pass them to machine learning classifiers in order to identify similar elements on other, previously unexamined web pages. We evaluate our approach in several scenarios, analyzing the relevance of the chosen features and the classification rate of the applied classifiers. These scenarios focus on understanding search forms from the transportation domain, particularly flight, train, and bus connections. The results of the evaluation are very promising.
web object identification; web automation; web accessibility; machine learning; big data; web page visual representation
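
The idea in the abstract, describing each visible element by a feature vector and letting a classifier label elements on unseen pages, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: this record does not list the 49 features or name the classifiers, so the five features, the class labels, the sample values, and the choice of a 1-nearest-neighbour rule are all hypothetical and are not the authors' method.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Element:
    """A visible page element described by a few hypothetical features.

    The paper computes 49 features; these five are placeholders only.
    """
    x: float               # spatial: left offset, normalized to page width
    y: float               # spatial: top offset, normalized to page height
    width: float           # visual: rendered width, normalized to page width
    is_input: float        # functional: 1.0 if the element accepts user input
    has_place_word: float  # textual: 1.0 if the label mentions "from" or "to"

    def vector(self):
        return [self.x, self.y, self.width, self.is_input, self.has_place_word]


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify(element, labelled):
    """Return the label of the closest labelled example (1-nearest-neighbour)."""
    target = element.vector()
    nearest = min(labelled, key=lambda pair: distance(pair[0].vector(), target))
    return nearest[1]


# Invented training examples, as if taken from already annotated search forms.
TRAINING = [
    (Element(0.10, 0.20, 0.30, 1.0, 1.0), "location_field"),
    (Element(0.45, 0.20, 0.30, 1.0, 1.0), "location_field"),
    (Element(0.10, 0.35, 0.15, 1.0, 0.0), "date_field"),
    (Element(0.80, 0.50, 0.10, 0.0, 0.0), "submit_button"),
]

# An element from a page that was not part of the training examples.
unseen = Element(0.12, 0.22, 0.28, 1.0, 1.0)
predicted = classify(unseen, TRAINING)
```

Here `predicted` holds the label of the training element whose features lie closest to those of `unseen`. A real system would normalize all features to comparable ranges and compare several classifiers, as the evaluation described in the abstract does.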
Project Head Reinhard Pichler:
TASK MINING from CROWD BEHAVIOUR
Created from the Publication Database of the Vienna University of Technology.