Conference item

Browserless web data extraction: challenges and opportunities

Abstract:

Most modern web scrapers use an embedded browser to render web pages and to simulate user actions. Such scrapers (or wrappers) are therefore expensive to execute in terms of time and network traffic. In contrast, it is orders of magnitude more resource-efficient to use a “browserless” wrapper, which accesses a web server directly through HTTP requests and takes the desired data from the raw replies. However, creating and maintaining browserless wrappers of high precision requires specialists...

Publication status:
Published
Peer review status:
Peer reviewed
Version:
Publisher's version

Access Document


Files:
Publisher copy:
10.1145/3178876.3186008

Authors


Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science

Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science

Spencer, B

Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science

Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science
Publisher:
Association for Computing Machinery
Pages:
1095-1104
Publication date:
2018-04-10
Acceptance date:
2017-12-22
DOI:
10.1145/3178876.3186008
Pubs id:
pubs:820391
URN:
uri:c72a7aac-b296-419c-bf5b-3d1661c305e4
UUID:
uuid:c72a7aac-b296-419c-bf5b-3d1661c305e4
Local pid:
pubs:820391
ISBN:
978-1-4503-5639-8