Conference item

You Need Only One Clue for Effective Record Segmentation

Abstract:

Record segmentation is a core problem in data extraction. Previous approaches have focused on increasingly sophisticated heuristics that use no knowledge of the concrete domain. In this work, we demonstrate that with only a single clue about mandatory attributes in a given domain, straightforward rules for record segmentation suffice to achieve 100% precise record extraction from the vast majority of web sites in that domain. These results are the first outcomes of the just-launched ERC project DIAD...



Host title:
Proc. of 1st Intl Conf. on Web Intelligence, Mining and Semantics (WIMS)
Publication date:
2011-01-01
UUID:
uuid:6833bb7e-df67-42c1-94de-860e9f08bd68
Local pid:
cs:6424
Deposit date:
2015-03-31
