Publication date: 05-02-2015
Number: US20150039611A1
Methods and arrangements for discovering entity types for a set of records. A set of records is input, with each record comprising attributes with associated attribute values. The records are grouped into candidate entity types in view of at least one of: the attribute values of the records, at least one domain ontology and at least one dimension hierarchy. An interestingness measure of each candidate entity type is calculated, via estimating interestingness based on at least one factor selected from the group consisting of: a correlation between attribute values of records, a number of attributes, a log of queries issued to a server, and an average group size for candidate entity types. At least one candidate entity type is validated based on the calculated interestingness measures. Other variants and embodiments are broadly contemplated herein.

1. A method of discovering entity types for a set of records, said method comprising:
inputting a set of records, each record comprising attributes with associated attribute values;
grouping the records into candidate entity types in view of at least one of: the attribute values of the records, at least one domain ontology and at least one dimension hierarchy;
calculating an interestingness measure of each candidate entity type, via estimating interestingness based on at least one factor selected from the group consisting of: a correlation between attribute values of records, a number of attributes, a log of queries issued to a server, and an average group size for candidate entity types; and
validating at least one candidate entity type based on the calculated interestingness measures.

2. The method according to claim 1, wherein:
said validating comprises assisting a user in validating at least one candidate entity type; and
assisting a user in creating at least one new candidate entity type.

3. The method according to claim 2, wherein said validating comprises determining a relevance of each candidate entity type to at least one ...
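
The steps recited in claim 1 (input, grouping, interestingness calculation, validation) can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the scoring formula, and the threshold-based validation are all hypothetical choices made for illustration. The sketch groups records only by shared attribute values (no domain ontology or dimension hierarchy) and scores candidates using only two of the listed factors, the number of attributes and the average group size.

```python
from collections import defaultdict
from itertools import combinations


def group_records(records, attrs):
    """Group records that share the same values on the chosen attributes."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec.get(a) for a in attrs)
        groups[key].append(rec)
    return dict(groups)


def interestingness(groups, attrs, n_records):
    """Hypothetical score (an assumption, not from the source):
    (average group size / number of records) * number of attributes."""
    if not groups or n_records == 0:
        return 0.0
    avg_size = sum(len(g) for g in groups.values()) / len(groups)
    return (avg_size / n_records) * len(attrs)


def discover_entity_types(records, attributes, threshold, max_attrs=2):
    """Enumerate attribute subsets as candidate entity types, score each,
    and keep those whose score reaches the threshold (assumed validation rule)."""
    candidates = []
    for k in range(1, max_attrs + 1):
        for attrs in combinations(attributes, k):
            groups = group_records(records, attrs)
            score = interestingness(groups, attrs, len(records))
            candidates.append((attrs, score))
    validated = [c for c in candidates if c[1] >= threshold]
    # Highest-scoring candidates first
    return sorted(validated, key=lambda c: c[1], reverse=True)


if __name__ == "__main__":
    sample = [
        {"type": "server", "os": "linux"},
        {"type": "server", "os": "linux"},
        {"type": "laptop", "os": "windows"},
        {"type": "laptop", "os": "linux"},
    ]
    for attrs, score in discover_entity_types(sample, ["type", "os"], 0.6):
        print(attrs, round(score, 3))
```

In a fuller treatment, the validation step would likely involve a user reviewing the ranked candidates, as claim 2 suggests, rather than a fixed threshold.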