Main Page: Difference between revisions

From DBpedia Mappings

Revision as of 15:49, 8 March 2010

About DBpedia

DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. The DBpedia knowledge base, which has been created by extracting structured information from Wikipedia, currently describes more than 2.9 million things, including at least 282,000 persons, 339,000 places (including 241,000 populated places), 88,000 music albums, 44,000 films, 15,000 video games, 119,000 organizations (including 20,000 companies and 29,000 educational institutions), 130,000 species and 4,400 diseases.

About this wiki

DBpedia Mappings

The types of Wikipedia content that are most valuable for the DBpedia extraction are infoboxes and tables. Infoboxes display an article's most relevant facts as a table of attribute-value pairs at the top right-hand side of the Wikipedia page.

Because Wikipedia's infobox template system has evolved in a decentralized way over time, different communities of Wikipedia editors use different templates to describe the same type of thing (e.g. infobox_city_japan, infobox_swiss_town and infobox_town_de). Different templates also use different names for the same attribute (e.g. birthplace and placeofbirth). Because many Wikipedia editors do not strictly follow the recommendations given on the page that describes a template, attribute values are expressed in a wide range of formats and units of measurement.

To overcome the problems of synonymous attribute names and of multiple templates being used for the same type of thing, the DBpedia project maps Wikipedia templates, as well as tables within an article, to the DBpedia ontology (http://wiki.dbpedia.org/Ontology). These mappings establish a basis for improving the quality of the infobox extraction and for enabling table extraction.
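The idea of mapping several templates and several attribute names onto one ontology term can be sketched in a few lines of Python. This is only an illustration, not the DBpedia Mapping Language or the extraction framework: the lookup tables, the class label "Settlement", the property label "birthPlace", and the function names are assumptions made up for this example. Only the template and attribute names come from the text above.

```python
# Illustrative sketch only. The ontology labels ("Settlement", "birthPlace")
# and all function names are hypothetical, not taken from DBpedia itself.

# Several community-specific templates map to one ontology class.
TEMPLATE_TO_CLASS = {
    "infobox_city_japan": "Settlement",
    "infobox_swiss_town": "Settlement",
    "infobox_town_de": "Settlement",
}

# Synonymous attribute names map to one ontology property.
ATTRIBUTE_TO_PROPERTY = {
    "birthplace": "birthPlace",
    "placeofbirth": "birthPlace",
}


def normalize_attribute(name):
    """Return the ontology property for an attribute name, or None if unmapped."""
    return ATTRIBUTE_TO_PROPERTY.get(name.strip().lower())


def map_infobox(template_name, attributes):
    """Map one infobox (template name plus attribute dict) to ontology terms.

    Returns None when the template has no mapping; unmapped attributes
    are skipped rather than guessed.
    """
    key = template_name.strip().lower().replace(" ", "_")
    ontology_class = TEMPLATE_TO_CLASS.get(key)
    if ontology_class is None:
        return None
    properties = {}
    for name, value in attributes.items():
        prop = normalize_attribute(name)
        if prop is not None:
            properties[prop] = value
    return {"class": ontology_class, "properties": properties}
```

With these tables, `map_infobox("infobox_city_japan", {})` and `map_infobox("infobox_town_de", {})` yield the same class, and both `birthplace` and `placeofbirth` resolve to the same property, which is the effect the mappings are meant to achieve.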

These mappings are specified using the DBpedia Mapping Language. The mapping language makes use of MediaWiki templates that define DBpedia ontology classes and properties as well as template/table to ontology mappings. The DBpedia extraction framework parses and validates the templates defined in this MediaWiki instance and extracts the Wikipedia content according to them.
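As a rough illustration of what validating a mapping against the ontology could involve, the following Python sketch checks that every class and property a mapping refers to is actually defined. The data layout, the names `ONTOLOGY_CLASSES`, `ONTOLOGY_PROPERTIES` and `validate_mapping`, and the example ontology terms are all assumptions for this example; the actual extraction framework may validate mappings quite differently.

```python
# Illustrative sketch only. The ontology contents and the mapping layout
# below are hypothetical and do not reflect the real framework's data model.

ONTOLOGY_CLASSES = {"Settlement", "Person"}
ONTOLOGY_PROPERTIES = {"birthPlace", "populationTotal"}


def validate_mapping(mapping):
    """Return a list of error messages; an empty list means the mapping is valid.

    `mapping` is a dict with a "map_to_class" string and a "properties" list
    of (template_property, ontology_property) pairs.
    """
    errors = []
    ontology_class = mapping.get("map_to_class")
    if ontology_class not in ONTOLOGY_CLASSES:
        errors.append(f"unknown ontology class: {ontology_class}")
    for template_property, ontology_property in mapping.get("properties", []):
        if ontology_property not in ONTOLOGY_PROPERTIES:
            errors.append(
                f"template property {template_property} maps to "
                f"unknown ontology property: {ontology_property}"
            )
    return errors
```

A mapping that passes this check can then be used to drive extraction; one that fails is reported with the offending class or property so an editor can correct it.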

mappings.dbpedia.org

The mappings.dbpedia.org ...

This wiki is read-only. If you would like to edit the mappings or the ontology schema, please register and the DBpedia team will add you to the editors list.

Tutorials

The full documentation on writing mappings can be found via the DBpedia SVN Repository.