DBpedia Release Evaluation: Difference between revisions

From DBpedia Mappings
== DBpedia Release Evaluation ==


The [http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/dbpedia/raw-file/946fc9d51783/qualityAssessmentFramework/Kreis2011__Design_of_a_Quality_Assessment_Framework.pdf Quality Assessment Framework (QAF)] was developed to document the quality of the knowledge base and to track the progress of DBpedia's extraction framework. Its main idea is to compare a manually created best-case dataset (the Gold Standard) with the output of DBpedia's ontology-based extraction. From this comparison, the QAF estimates the precision of the extraction framework and the completeness (recall) of DBpedia relative to its source, Wikipedia.
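
The comparison described above can be illustrated with a minimal sketch. It assumes that both datasets are sets of (subject, predicate, object) triples and that a triple counts as correct only on an exact match; the QAF itself may apply different matching rules. The <code>Triple</code> type, the <code>precision_recall</code> function and the sample data are hypothetical and are not part of the QAF.

```python
from typing import NamedTuple, Set, Tuple


class Triple(NamedTuple):
    """One RDF-style statement; the field names are illustrative."""
    subject: str
    predicate: str
    obj: str


def precision_recall(gold: Set[Triple], extracted: Set[Triple]) -> Tuple[float, float]:
    """Return (precision, recall) of `extracted` measured against `gold`.

    Assumption: a triple is correct only if it matches a gold triple exactly.
    """
    correct = gold & extracted
    # Precision: share of extracted triples that are also in the gold set.
    precision = len(correct) / len(extracted) if extracted else float("nan")
    # Recall: share of gold triples that were actually extracted.
    recall = len(correct) / len(gold) if gold else float("nan")
    return precision, recall


# Hypothetical toy data, not taken from the Gold Standard.
gold = {
    Triple("Berlin", "country", "Germany"),
    Triple("Berlin", "populationTotal", "3450000"),
    Triple("Berlin", "areaTotal", "891.85"),
    Triple("Berlin", "leaderName", "Klaus Wowereit"),
}
extracted = {
    Triple("Berlin", "country", "Germany"),
    Triple("Berlin", "populationTotal", "3450000"),
    Triple("Berlin", "areaTotal", "892"),  # differs from gold, so counted as wrong
}

precision, recall = precision_recall(gold, extracted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three extracted triples are correct, and two of the four gold triples are found.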
 
== Sample Data / Gold Standard ==
 
For a meaningful evaluation, only potentially extractable triples are considered. A triple can be extracted only if it arises from a mapped property; such triples are called mapped triples here.
The following table shows, for each category, the number of mapped triples, the total number of triples in the Gold Standard, and the percentage of mapped triples. The categories correspond to the different patterns in which information is given in Wikipedia infoboxes. The number of cases varies widely between categories, so results for categories with few cases should be treated with caution.
 
{| class="wikitable"
|+ Mapped triples in the Gold Standard (fixed mappings from 3.5.1)
|-
! Category
! Mapped Triples
! Triples
! Mapped (%)
|-
| Total
| 1504
| 3221
| 46.7
|-
| Plain Property
| 514
| 893
| 57.6
|-
| Number-Unit
| 51
| 76
| 67.1
|-
| Coordinate
| 36
| 54
| 66.7
|-
| Interval
| 22
| 31
| 71.0
|-
| List
| 478
| 801
| 59.7
|-
| One-Property-Table
| 242
| 447
| 54.1
|-
| Multi-Property-Table
| 83
| 625
| 13.3
|-
| Open Property
| 13
| 139
| 9.4
|-
| Open Property Table
| 0
| 26
| 0.0
|-
| Internal Template
| 58
| 116
| 50.0
|-
| Merged Properties
| 7
| 13
| 53.8
|}
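
As a quick consistency check, the percentage column can be recomputed from the two count columns. The sketch below copies the counts from the table above; the names <code>GOLD_STANDARD</code> and <code>mapped_share</code> are illustrative only.

```python
# Counts copied from the table above: category -> (mapped triples, all triples).
GOLD_STANDARD = {
    "Plain Property": (514, 893),
    "Number-Unit": (51, 76),
    "Coordinate": (36, 54),
    "Interval": (22, 31),
    "List": (478, 801),
    "One-Property-Table": (242, 447),
    "Multi-Property-Table": (83, 625),
    "Open Property": (13, 139),
    "Open Property Table": (0, 26),
    "Internal Template": (58, 116),
    "Merged Properties": (7, 13),
}


def mapped_share(mapped: int, total: int) -> float:
    """Percentage of mapped triples, rounded to one decimal place."""
    return round(100 * mapped / total, 1)


# The "Total" row is the column-wise sum over all categories.
total_mapped = sum(mapped for mapped, _ in GOLD_STANDARD.values())
total_triples = sum(total for _, total in GOLD_STANDARD.values())

for category, (mapped, total) in GOLD_STANDARD.items():
    print(f"{category}: {mapped_share(mapped, total)}")
print(f"Total: {total_mapped} / {total_triples} = {mapped_share(total_mapped, total_triples)}")
```

The column sums reproduce the Total row (1504 of 3221 triples, 46.7%).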


Revision as of 15:50, 14 September 2011

== Evaluation Results ==

{| class="wikitable"
|+ Completeness / Recall (fixed mappings from 3.5.1)
|-
! Category
! Cases
! DBpedia 3.5.1
! DBpedia 3.6
! DBpedia 3.7
|-
| Total
| 1504
| 45.7%
| 60.2%
| 61.8%
|-
| Plain Property
| 514
| 80.5%
| 83.7%
| 86.0%
|-
| Number-Unit
| 51
| 68.6%
| 68.6%
| 66.7%
|-
| Coordinate
| 36
| 100.0%
| 100.0%
| 100.0%
|-
| Interval
| 22
| 72.7%
| 68.2%
| 72.7%
|-
| List
| 478
| 33.9%
| 75.7%
| 77.8%
|-
| One-Property-Table
| 242
| 5.4%
| 5.8%
| 5.8%
|-
| Multi-Property-Table
| 83
| 0.0%
| 0.0%
| 0.0%
|-
| Open Property
| 13
| 23.1%
| 23.1%
| 30.8%
|-
| Open Property Table
| 0
| n/a
| n/a
| n/a
|-
| Internal Template
| 58
| 8.6%
| 10.3%
| 10.3%
|-
| Merged Properties
| 7
| 57.1%
| 57.1%
| 71.4%
|}

{| class="wikitable"
|+ Precision (fixed mappings from 3.5.1)
|-
! Category
! DBpedia 3.5.1
! DBpedia 3.6
! DBpedia 3.7
|-
| Total
| 91.2%
| 92.3%
| 92.4%
|-
| Plain Property
| 96.3%
| 96.6%
| 97.4%
|-
| Number-Unit
| 85.4%
| 85.4%
| 85.0%
|-
| Coordinate
| 100.0%
| 100.0%
| 100.0%
|-
| Interval
| 100.0%
| 100.0%
| 88.9%
|-
| List
| 91.5%
| 92.8%
| 93.2%
|-
| One-Property-Table
| 32.5%
| 36.8%
| 34.1%
|-
| Multi-Property-Table
| n/a
| n/a
| n/a
|-
| Open Property
| 100.0%
| 100.0%
| 100.0%
|-
| Open Property Table
| n/a
| n/a
| n/a
|-
| Internal Template
| 83.3%
| 75.0%
| 75.0%
|-
| Merged Properties
| 80.0%
| 80.0%
| 100.0%
|}