Publications
2009
Thomas Gottron
Content Extraction: Bestimmung des Hauptinhaltes in HTML Dokumenten
Ausgezeichnete Informatikdissertationen 2008, Dorothea Wagner et al. (eds.), Lecture Notes in Informatics, 2009, 101–110. [PDF]
Yves Weißig, Thomas Gottron
Combinations of Content Extraction Algorithms
LWA'09: Workshop Information Retrieval,
2009. [PDF] [BibTeX]
Constanze Lipowsky, Egor Dranischnikow, Herbert Göttler, Thomas Gottron, Mathias Kemeter, Elmar Schömer
Alignment of Noisy and Uniformly Scaled Time Series
DEXA'09:
Proceedings of the 20th International Conference on Database and Expert Systems Applications,
2009, 675–688. [PDF] [BibTeX]
Thomas Gottron
Document Word Clouds: Visualising Web Documents as Tag Clouds to Aid Users in Relevance Decisions
ECDL'09:
Proceedings of the 13th European Conference on Digital Libraries,
2009, 94–105. [PDF] [BibTeX]
Thomas Gottron, Roman Schneider
A Hybrid Approach to Statistical and Semantical Analysis of Web Documents
EuroIMSA'09:
Proceedings of the 5th European Conference on Internet and Multimedia Systems and Applications,
2009, 115–120. [PDF] [BibTeX]
Thomas Gottron
An Evolutionary Approach to Automatically Optimise Web Content Extraction
IIS'09:
Proceedings of the 17th International Conference Intelligent Information Systems,
2009, 331–343. [PDF] [BibTeX]
Thomas Gottron
Detecting Website Redesigns via Template Similarity on Streams of Documents
ITA'09:
Proceedings of the 3rd International Conference on Internet Technologies and Applications,
2009, 35–43. [PDF] [BibTeX]
ITA'09 Best Paper Award
Thomas Gottron, Ludger Martin
Estimating Web Site Readability Using Content Extraction
WWW'09:
Proceedings of the 18th International Conference on World Wide Web; Posters Track,
2009, 1169–1170. [PDF] [BibTeX]
2008
Thomas Gottron
Content Extraction: Identifying the Main Content in HTML Documents
Dissertation, Johannes Gutenberg-Universität Mainz,
2008. [PDF] [BibTeX]
Thomas Gottron
Combining Content Extraction Heuristics: The CombinE System
iiWAS'08:
Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services; Special Track: Emerging Research Projects, Applications and Services (ERPAS),
2008, 591–595. [PDF] [BibTeX]
Thomas Gottron
Content Code Blurring: A New Approach to Content Extraction
TIR'08:
Proceedings of the 5th International Workshop on Text Information Retrieval,
2008, 29–33. [PDF] [BibTeX]
Thomas Gottron
Clustering Template Based Web Documents
ECIR'08:
Proceedings of the 30th European Conference on Information Retrieval,
2008, 40–51. [PDF] [BibTeX]
Thomas Gottron
Bridging the Gap: From Multi Document Template Detection to Single Document Content Extraction
EuroIMSA'08:
Proceedings of the IASTED Conference on Internet and Multimedia Systems and Applications,
2008, 66–71. [PDF] [BibTeX]
2007
Thomas Gottron
Evaluating Content Extraction on HTML Documents
ITA'07:
Proceedings of the 2nd International Conference on Internet Technologies and Applications,
2007, 123–132. [PDF] [BibTeX]