Journal article

OXPath: A language for scalable data extraction, automation, and crawling on the deep web.

Abstract:

The evolution of the web has outpaced itself: A growing wealth of information and increasingly sophisticated interfaces necessitate automated processing, yet existing automation and data extraction technologies have been overwhelmed by this very growth. To address this trend, we identify four key requirements for web data extraction, automation, and (focused) web crawling: (1) interact with sophisticated web application interfaces, (2) precisely capture the relevant data to be extracted, (3) ...

Publication status:
Published


Publisher copy:
10.1007/s00778-012-0286-6

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Author
Journal:
VLDB J.
Volume:
22
Issue:
1
Pages:
47-72
Publication date:
2013-01-01
DOI:
10.1007/s00778-012-0286-6
EISSN:
0949-877X
ISSN:
1066-8888
Language:
English
Keywords:
Pubs id:
pubs:379778
UUID:
uuid:d3a36771-e283-450b-9874-f4759ff46698
Local pid:
pubs:379778
Source identifiers:
379778
Deposit date:
2013-11-17
