Web Data Extraction and Reconditioning


The task of this Bachelor project was to investigate tools that support Screen Scraping, also known as Web Data Extraction. The first part of the thesis describes the research carried out on tools that roughly represent the main types of Screen Scraping software, namely graphical tools and library tools. The theoretical part focuses on the technical possibilities of Web Data Extraction, the advantages and disadvantages of tools available on the Internet, descriptions of some of these tools, and the Screen Scraping process in general. The second part describes the implementation task and further explains the practical scope of Screen Scraping. The use case illustrates a Screen Scraping program, built on selected tools, that extracts information from routing sites (train, car and flight) and thereby minimizes the effort of comparing prices and durations.

Bachelor thesis, Digital Enterprise Research Institute, University of Innsbruck, Austria