Contributions to Proceedings:

M. Goebel, M. Ceresna:
"Interactive Wrapper Learning for Web Documents Using Tree Alignment";
in: "Proceedings of IADIS International Conference Applied Computing 2007", N. Guimarães, P. Isaías (eds.); IADIS Press, 2007, ISBN: 978-972-8924-30-0, 363 - 370.

English abstract:
This paper proposes an interactive wrapper learning approach to Web information extraction for semi-automatic wrapper generation. In particular, we present an algorithm that uses tree alignment techniques to learn patterns from the structure of training instances. This is achieved by generating structural template models for both positive and negative examples. We evaluate our system on standard benchmarks; the results show considerable potential for structure learning across a variety of extraction tasks.

Keywords:
Information Extraction, Data Mining, Tree Alignment, Classification

Created from the Publication Database of the Vienna University of Technology.