DBpedia domains and URIs: Difference between revisions

From DBpedia Mappings
No edit summary
(→‎Deviations from these rules: Some international chapters use URIs, not IRIs (more than en and fr))
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Long-term goals ==
== Long-term goals ==


* All languages use IRIs, not URIs
* '''All''' languages use IRIs, not URIs


{| class="wikitable"
{| class="wikitable"
|-
|-
! width = 300 | Items
! Items
! width = 300 | IRI pattern
! IRI pattern
|-
|-
| Ontology classes and properties
| Ontology classes and properties
Line 14: Line 14:
| <nowiki>http://dbpedia.org/datatype/xyz</nowiki>
| <nowiki>http://dbpedia.org/datatype/xyz</nowiki>
|-
|-
| Resource IRI for <nowiki>http://xx.wikipedia.org/wiki/Xyz</nowiki>
| Resource IRI for http://'''xx'''.wikipedia.org/wiki/Xyz
| <nowiki>http://xx.dbpedia.org/resource/Xyz</nowiki>
| http://'''xx'''.dbpedia.org/resource/Xyz
|-
|-
| Properties extracted by generic template extractor
| Properties extracted by generic template extractor from http://'''xx'''.wikipedia.org/ pages
| <nowiki>http://xx.dbpedia.org/property/xyz</nowiki>
| http://'''xx'''.dbpedia.org/property/xyz
|}
|}


== Current deviations from these rules  ==
== Deviations from these rules  ==


In the past, we did not follow these rules. For backwards compatibility, we will allow some deviations for a while.
In the past, we did not follow these rules. For backwards compatibility, we will allow some deviations for a while. Might be a long while. :-)


* English and French use URIs, not IRIs. DBpedia Berlin will publish additional datasets with IRIs, but the main datasets will use URIs.
* Some international chapters use URIs, not IRIs. See [http://wiki.dbpedia.org/Internationalization/Chapters the list of DBpedia chapters] for details. The main DBpedia release will offer additional datasets with IRIs for download, but the main datasets will use URIs.


{| class="wikitable"
{| class="wikitable"
|-
|-
! width = 300 | Items
! Items
! width = 300 | deviating IRI pattern
! deviating IRI pattern
|-
|-
| Resource URI for <nowiki>http://en.wikipedia.org/wiki/Xyz</nowiki>
| Resource URI for http://'''en'''.wikipedia.org/wiki/Xyz
| <nowiki>http://dbpedia.org/resource/Xyz</nowiki>
| <nowiki>http://dbpedia.org/resource/Xyz</nowiki>
|-
|-
| Properties extracted by generic template extractor from <nowiki>http://en.wikipedia.org/</nowiki> pages
| Properties extracted by generic template extractor from http://'''en'''.wikipedia.org/ pages
| <nowiki>http://dbpedia.org/property/xyz</nowiki>
| <nowiki>http://dbpedia.org/property/xyz</nowiki>
|}
|}
Line 43: Line 43:
The main DBpedia release and a DBpedia chapter must use the same syntax for equivalent IRIs.
The main DBpedia release and a DBpedia chapter must use the same syntax for equivalent IRIs.


We used to map page titles from non-English Wikipedias to IRIs using the inter-language link to the English Wikipedia. If there was no such inter-language link, we did not extract any data from the non-English page. That was because we used URIs like <nowiki>http://dbpedia.org/resource/Xyz</nowiki> for ''all'' languages and had to 'normalize' the URIs. We will ''not'' do that anymore - we will use <nowiki>http://xx.dbpedia.org/resource/Xyz</nowiki> IRIs.
We used to map page titles from non-English Wikipedias to IRIs using the inter-language link to the English Wikipedia. If there was no such inter-language link, we did not extract any data from the non-English page. That was because we used URIs like <nowiki>http://dbpedia.org/resource/Xyz</nowiki> for '''all''' languages and had to 'normalize' the URIs. We will '''not''' do that anymore - we will use <nowiki>http://xx.dbpedia.org/resource/Xyz</nowiki> IRIs.


== Implementation details ==
== Implementation details ==


During the extraction, the framework will use http://'''xx'''.dbpedia.org/ '''IRI'''s for '''all''' languages, even English. Different serializers will serializes them differently, according to some rules:
During the extraction, the framework will use http://'''xx'''.dbpedia.org/ '''IRI'''s for '''all''' languages, even English. Different serializers will serialize them differently, according to some rules, for example:
* convert en.dbpedia.org to dbpedia.org
* convert en.dbpedia.org to dbpedia.org
* convert IRIs to URIs for some languages (in subjects, predicates and objects)
* convert IRIs to URIs for some languages (in subjects, predicates and objects)

Latest revision as of 00:37, 24 May 2012

Long-term goals

  • All languages use IRIs, not URIs
{| class="wikitable"
|-
! Items
! IRI pattern
|-
| Ontology classes and properties
| http://dbpedia.org/ontology/Xyz
|-
| Datatypes
| http://dbpedia.org/datatype/xyz
|-
| Resource IRI for http://xx.wikipedia.org/wiki/Xyz
| http://xx.dbpedia.org/resource/Xyz
|-
| Properties extracted by generic template extractor from http://xx.wikipedia.org/ pages
| http://xx.dbpedia.org/property/xyz
|}
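
The resource and property patterns in this table can be sketched in a few lines of Python. This is only an illustration, not the extraction framework's code: the function names resource_iri and property_iri are hypothetical, and the sketch assumes the page title is copied into the IRI unchanged (a real extractor may treat some characters differently).

```python
from urllib.parse import urlsplit

WIKIPEDIA_SUFFIX = ".wikipedia.org"
WIKI_PATH_PREFIX = "/wiki/"


def resource_iri(wikipedia_url: str) -> str:
    """Map http://xx.wikipedia.org/wiki/Xyz to http://xx.dbpedia.org/resource/Xyz.

    Hypothetical helper: the title is copied as-is, so non-ASCII
    characters stay unescaped (an IRI, not a URI).
    """
    parts = urlsplit(wikipedia_url)
    host = parts.hostname or ""
    if not host.endswith(WIKIPEDIA_SUFFIX) or not parts.path.startswith(WIKI_PATH_PREFIX):
        raise ValueError(f"not a Wikipedia page URL: {wikipedia_url}")
    lang = host[: -len(WIKIPEDIA_SUFFIX)]        # the 'xx' language code
    title = parts.path[len(WIKI_PATH_PREFIX):]   # the 'Xyz' page title
    return f"http://{lang}.dbpedia.org/resource/{title}"


def property_iri(lang: str, name: str) -> str:
    """Build the IRI of a property found by the generic template extractor."""
    return f"http://{lang}.dbpedia.org/property/{name}"


print(resource_iri("http://de.wikipedia.org/wiki/Berlin"))
print(property_iri("de", "name"))
```

Note that under the long-term rules English is treated like any other language, so http://en.wikipedia.org/wiki/Xyz would map to http://en.dbpedia.org/resource/Xyz in this sketch.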

Deviations from these rules

In the past, we did not follow these rules. For backwards compatibility, we will allow some deviations for a while, possibly a long while. :-)

  • Some international chapters use URIs, not IRIs. See the list of DBpedia chapters for details. The main DBpedia release will offer additional datasets with IRIs for download, but the main datasets will use URIs.
{| class="wikitable"
|-
! Items
! deviating IRI pattern
|-
| Resource URI for http://en.wikipedia.org/wiki/Xyz
| http://dbpedia.org/resource/Xyz
|-
| Properties extracted by generic template extractor from http://en.wikipedia.org/ pages
| http://dbpedia.org/property/xyz
|}

Notes

The main DBpedia release and a DBpedia chapter must use the same syntax for equivalent IRIs.

We used to map page titles from non-English Wikipedias to IRIs using the inter-language link to the English Wikipedia. If there was no such inter-language link, we did not extract any data from the non-English page. That was because we used URIs like http://dbpedia.org/resource/Xyz for all languages and had to 'normalize' the URIs. We will not do that anymore; instead, we will use http://xx.dbpedia.org/resource/Xyz IRIs.

Implementation details

During the extraction, the framework will use http://xx.dbpedia.org/ IRIs for all languages, even English. Different serializers will serialize them differently, according to some rules, for example:

  • convert en.dbpedia.org to dbpedia.org
  • convert IRIs to URIs for some languages (in subjects, predicates and objects)
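
The two example rules above could look roughly like this in a serializer. This is a minimal sketch under stated assumptions, not the framework's actual implementation: the names URI_LANGUAGES, to_uri, serialize_term and serialize_triple are hypothetical, the set of languages that get URIs is a placeholder (the real list depends on the chapter), and all three triple positions are assumed to hold IRIs rather than literals.

```python
from urllib.parse import quote

# Assumption: languages whose datasets are serialized with URIs instead of IRIs.
URI_LANGUAGES = {"en"}

EN_PREFIX = "http://en.dbpedia.org/"
MAIN_PREFIX = "http://dbpedia.org/"


def to_uri(iri: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8 bytes.

    Reserved characters and existing percent escapes are left untouched.
    Hosts are assumed to be ASCII, so no IDN handling is done here.
    """
    return quote(iri, safe=":/?#[]@!$&'()*+,;=%")


def serialize_term(iri: str, lang: str) -> str:
    """Apply the two example rules to a single IRI."""
    # Rule 1: convert en.dbpedia.org to dbpedia.org.
    if iri.startswith(EN_PREFIX):
        iri = MAIN_PREFIX + iri[len(EN_PREFIX):]
    # Rule 2: convert IRIs to URIs for some languages.
    if lang in URI_LANGUAGES:
        iri = to_uri(iri)
    return iri


def serialize_triple(subject: str, predicate: str, obj: str, lang: str) -> tuple:
    """Apply the rules to subject, predicate and object alike."""
    return tuple(serialize_term(term, lang) for term in (subject, predicate, obj))


print(serialize_term("http://en.dbpedia.org/resource/Zürich", "en"))
# prints http://dbpedia.org/resource/Z%C3%BCrich
print(serialize_term("http://de.dbpedia.org/resource/Zürich", "de"))
# prints http://de.dbpedia.org/resource/Zürich
```

Keeping these rules in the serializer rather than in the extractors means the extraction itself can work with one uniform http://xx.dbpedia.org/ IRI form for every language.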