Extraction of relational schema from deep web sources: a form driven approach

Publication Type: Conference Paper
Year of Publication: 2014
Authors: Saissi, Y, Zellou, A, Idri, A
Publisher: Ibn Zohr Univ; Moroccan Soc of Complex Syst; IEEE Morocco; Int Acad for Syst and Cybernet Sci IASCYS
ISBN Number: 978-1-4799-4647-1

The deep web is the largest unexplored part of the web, and we need direct access to all of its data sources without using any crawling or surfacing method. For this, we choose to use a virtual web integration system. However, the deep web virtual integration methods existing today focus only on integrating the query interfaces that give access to the deep web. These query interfaces are integrated to build a global query interface able to query all the deep web sources. The objective of our work is to propose another vision of a deep web virtual integration system, one that uses a mediated schema built from relational schemas, each describing one deep web source. This paper presents our approach for extracting a relational schema that describes a deep web source. The key idea underlying our approach is to analyze two pieces of structured information, the HTML form and the HTML table extracted from the deep web source, to discover its data structure and build a relational schema describing it. We also use a knowledge table to take advantage of our learning experience in extracting relational schemas from deep web sources.
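
To make the key idea concrete, the following is a minimal sketch of deriving candidate relation attributes from an HTML form and an HTML result table. It is not the authors' algorithm: the abstract does not give the extraction rules, so the class `FormTableParser`, the helpers `normalize` and `sketch_relation`, the sample markup, and the rule "form field names plus table headers become attributes" are all illustrative assumptions. The knowledge table is left out because its contents are not described here.

```python
import re
from html.parser import HTMLParser


class FormTableParser(HTMLParser):
    """Collects form field names and table header labels (hypothetical helper)."""

    SKIPPED_INPUT_TYPES = {"submit", "button", "reset", "hidden"}

    def __init__(self):
        super().__init__()
        self.form_fields = []    # values of the name attribute of query fields
        self.table_headers = []  # text content of <th> cells
        self._in_th = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("input", "select", "textarea") and attrs.get("name"):
            # Assumption: buttons and hidden inputs do not describe data.
            if attrs.get("type") not in self.SKIPPED_INPUT_TYPES:
                self.form_fields.append(attrs["name"])
        elif tag == "th":
            self._in_th = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "th" and self._in_th:
            self.table_headers.append("".join(self._buffer).strip())
            self._in_th = False

    def handle_data(self, data):
        if self._in_th:
            self._buffer.append(data)


def normalize(label):
    """Turn a field name or header label into a lowercase attribute name."""
    return re.sub(r"\W+", "_", label.strip().lower()).strip("_")


def sketch_relation(name, form_html, result_html):
    """Merge form fields and table headers into one relation signature."""
    parser = FormTableParser()
    parser.feed(form_html)
    parser.feed(result_html)
    parser.close()
    attributes = []
    for label in parser.form_fields + parser.table_headers:
        attribute = normalize(label)
        if attribute and attribute not in attributes:
            attributes.append(attribute)
    return f"{name}({', '.join(attributes)})"


# Hypothetical markup standing in for a deep web source.
FORM_HTML = (
    '<form><input type="text" name="Title">'
    '<select name="author"></select>'
    '<input type="submit" name="go"></form>'
)
RESULT_HTML = (
    "<table><tr><th>Title</th><th>Year</th></tr>"
    "<tr><td>X</td><td>2014</td></tr></table>"
)

if __name__ == "__main__":
    print(sketch_relation("book", FORM_HTML, RESULT_HTML))
```

Run on the sample markup, this prints `book(title, author, year)`: the form contributes `title` and `author`, the result table contributes `year`, and the duplicate `Title` header is merged with the form field of the same name. Matching attributes only by normalized name is a deliberate simplification for this sketch.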




