Thesis

Easing information extraction on the web through automated rules discovery

Abstract:

The advent of big data on the Web has made automatic web information extraction an essential tool in data acquisition processes. Unfortunately, automated solutions are in most cases more error-prone than those created by humans, resulting in dirty and erroneous data. Automatic repair and cleaning of the extracted data is thus a necessary complement to information extraction on the Web.

This thesis investigates the problem of inducing cleaning rules on web extracted d...

Authors


Department: University of Oxford

Contributors

Role: Supervisor
Grant: OUCL/2013/SO
Funding agency for: Project
Type of award: DPhil
Level of award: Doctoral
Awarding institution: University of Oxford
