What's in a Name: Difference between revisions

From DBpedia Mappings
Latest revision as of 16:25, 16 February 2015

Analysis of Name Properties in DBpedia.

Note: this page was first edited in Emacs org-mode, then converted with

pandoc prop-names.org -w mediawiki >prop-names.mw

--VladimirAlexiev 18:49, 11 January 2015 (UTC)

Intro

Believe it or not, DBO has 86 properties called "name". Isn't that a bit too many? Yes it is, and we need to fix this situation to avoid confusion.

If you ponder the prop names below without reading my explanations, you'll appreciate the importance of documenting every property and class. I don't mean something complicated: just explain the purpose and when it's used.

Basic Name

foaf:name Use for Person & Organisation ok
dbo:name Use for everything except Person & Organisation check
dbo:names Bug, came from some Greek Astronomy props #7

Name Forms

dbo:officialName
dbo:alternativeName
dbo:otherName
dbo:longName
dbo:commonName Eg "cat"
dbo:scientificName Eg "Felis catus" (biology)

Name Lifecycle

People's names change during their lifetime. Eg Cranach was born Lucas Maler (after the profession of his father), was renamed Lucas Cranach when he became famous (after his birthplace Cronach), then art historians started calling him Lucas Cranach the Elder after his son (also Lucas Cranach) became an artist.

In some cultural heritage documentation systems (eg CIDOC CRM), name allocation and usage across time and space can be tracked with separate nodes. In DBpedia the situation is simpler, but we still need several name properties. But not as many as we find here!

"Birth, former, historical, old, original, previous, same, present" name: in what situations should each one be used?

dbo:birthName TODO
dbo:formerName TODO
dbo:historicalName TODO
dbo:oldName TODO
dbo:originalName TODO
dbo:previousName TODO
dbo:sameName TODO
dbo:presentName Duplicate of name?

This list is relatively short but crucially important, since these props are used numerous times: everything has a name.

Language-specific Names

There are thousands upon thousands of languages in the world.

IANA has defined lang tags for a lot of them (following the ISO 639 2-letter and 3-letter codes, extending them, and allowing custom extensions). See eg Getty LOD documentation for lang tag examples, and a script to fetch the IANA registry to a table. Open it in Excel and search: iana-lang-tags.xlsx
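The "registry to a table" step could look roughly like the following. This is a minimal sketch, not the Getty script: it assumes the registry's plain-text layout (records separated by `%%` lines, `Field: value` pairs, indented continuation lines) and parses a small hand-written sample instead of downloading the real file. The names `SAMPLE_REGISTRY`, `parse_registry` and `to_csv` are made up for illustration.

```python
import csv
import io

# Hand-written sample in the registry's record layout. It is NOT the real
# file; the values are illustrative only.
SAMPLE_REGISTRY = """\
File-Date: 2015-01-01
%%
Type: language
Subtag: fr
Description: French
Suppress-Script: Latn
%%
Type: language
Subtag: sr
Description: Serbian
%%
Type: script
Subtag: Cyrl
Description: Cyrillic
%%
Type: language
Subtag: gag
Description: Gagauz
Comments: a long comment that
  continues on the next line
"""


def parse_registry(text):
    """Return a list of records, each a dict of field name -> value."""
    records, record, last_key = [], {}, None
    # The trailing "%%" sentinel flushes the final record.
    for line in text.splitlines() + ["%%"]:
        if line.strip() == "%%":
            if "Type" in record:  # skips the File-Date header record
                records.append(record)
            record, last_key = {}, None
        elif not line.strip():
            continue
        elif line[0].isspace() and last_key:
            # Continuation line: append to the previous field.
            record[last_key] += " " + line.strip()
        else:
            key, _, value = line.partition(":")
            last_key = key.strip()
            value = value.strip()
            # A field such as Description may repeat; join the values.
            if last_key in record:
                record[last_key] += "; " + value
            else:
                record[last_key] = value
    return records


def to_csv(records, columns=("Type", "Subtag", "Description")):
    """Render the chosen columns as CSV text (openable in a spreadsheet)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for record in records:
        writer.writerow([record.get(c, "") for c in columns])
    return out.getvalue()


rows = parse_registry(SAMPLE_REGISTRY)
table = to_csv(rows)
print(table)
```

Feeding the real registry text to `parse_registry` instead of the sample should work the same way, as long as the file keeps this layout.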

Rather than making up a new property for each language in the world, we must use *one* property, with the proper lang tag.

Eg instead of

dbr_fr:belgrade name "Belgrade"; cyrilliqueName "Белград".

We should do:

dbr_fr:belgrade name "Belgrade"@fr, "Белград"@sr-Cyrl.

The Template:PropertyMapping has a parameter "language" just for that purpose.
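As a rough illustration of the rewrite (this is not how the mappings or the extractor are implemented), the sketch below replaces a few language-specific properties with one property plus a lang tag and prints the result in the same Turtle-like notation as the example above. The table `LEGACY_NAME_PROPS` and the function names are hypothetical. The tags come from this page, except the `fr` tag for frenchName, which is an assumption.

```python
# Old property -> (replacement property, lang tag).
# Only a handful of entries; the others would be added the same way.
LEGACY_NAME_PROPS = {
    "cyrilliqueName": ("name", "sr-Cyrl"),  # as in the Belgrade example
    "kanjiName": ("name", "ja-Hani"),
    "russianName": ("name", "ru"),
    "gagaouze": ("name", "gag"),
    "frenchNickname": ("foaf:nick", "fr"),
    "frenchName": ("name", "fr"),  # assumption: tag not stated on the page
}


def literal(value, lang=None):
    """Quote a string as a Turtle literal, with an optional lang tag."""
    quoted = '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return quoted + "@" + lang if lang else quoted


def to_turtle(subject, pairs):
    """Rewrite (property, value) pairs and group values per property."""
    grouped = {}  # dicts keep insertion order, so output order is stable
    for prop, value in pairs:
        new_prop, lang = LEGACY_NAME_PROPS.get(prop, (prop, None))
        grouped.setdefault(new_prop, []).append(literal(value, lang))
    body = "; ".join(f"{p} {', '.join(objs)}" for p, objs in grouped.items())
    return f"{subject} {body}."


print(to_turtle("dbr_fr:belgrade",
                [("frenchName", "Belgrade"), ("cyrilliqueName", "Белград")]))
```

Properties that are not in the table pass through unchanged and without a lang tag, so the sketch can be applied to mixed input.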

We have to investigate how each of the properties below is used. Eg:

  • germanName should be fixed but alemmanicName might have some cultural significance
  • algerianName should be fixed but algerianSettlementName should be investigated
  • frenchName should be fixed to "name" but frenchNickname might become otherName@fr or something like this
  • some of them might map to originalName (plus language), depending on how they're used

All these are tracked under #15. Unfortunately the extractor currently doesn't handle lang tags like "sr-Cyrl" (#303).
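To see what a tag like "sr-Cyrl" is made of, here is a small sketch that splits a lang tag into its parts. It assumes a simplified shape (language, optional 4-letter script, optional region, or a private-use tag starting with "x-") and ignores variants and extensions. It says nothing about how the extraction framework itself parses tags; `split_lang_tag` is a made-up helper.

```python
import re

# Simplified tag shape: language[-Script][-REGION].
TAG_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_lang_tag(tag):
    """Return the subtags of a lang tag as a dict; raise on other shapes."""
    if tag.lower().startswith("x-"):
        # Wholly private-use tag, eg "x-calabria".
        return {"private_use": tag[2:]}
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported lang tag shape: {tag!r}")
    return {k: v for k, v in match.groupdict().items() if v is not None}


for tag in ["fr", "sr-Cyrl", "ja-Hani", "qqq-DZ", "x-calabria"]:
    print(tag, split_lang_tag(tag))
```

A tag with a script subtag has two parts where a plain tag like "fr" has one, so code that expects only a bare language code needs to allow for the extra part.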

dbo:alemmanicName #15 fixed
dbo:algerianName #15 qqq-DZ. Not a single language, so we use "Private language used in specific region" Algeria fixed
dbo:algerianSettlementName #15 fixed
dbo:arabicName #15 fixed
dbo:arberishtName #15 fixed
dbo:calabrianName #15: x-calabria (custom tag). IANA doesn't have a code and https://en.wikipedia.org/wiki/Languages_of_Calabria says a mix of Neapolitan, Sicilian; even Greek, Occitan and Albanian fixed
dbo:chaouiName #15 fixed
dbo:cornishName #15 fixed
dbo:cyrilliqueName #15 fixed
dbo:dutchName #15 fixed
dbo:englishName #15 fixed
dbo:finnishName #15 fixed
dbo:frenchName #15 fixed
dbo:frenchNickname #15, replaced by foaf:nick "nickname"@fr fixed
dbo:frioulanName #15 fixed
dbo:gaelicName #15 fixed
dbo:gagaouze #15, @gag fixed
dbo:germanName #15 fixed
dbo:greekName #15 fixed
dbo:irishName #15 fixed
dbo:italianName #15 fixed
dbo:japanName #15 fixed
dbo:kabyleName #15 fixed
dbo:kanjiName #15, @ja-Hani fixed
dbo:ladinName #15 fixed
dbo:luxembourgishName #15 fixed
dbo:manxName #15 fixed
dbo:maoriName #15 fixed
dbo:moldavianName #15: mo fixed
dbo:mozabiteName #15 fixed
dbo:occitanName #15 fixed
dbo:russianName #15, @ru fixed
dbo:sardinianName #15 fixed
dbo:scotishName #15 fixed
dbo:scotsName #15 fixed
dbo:scottishName #15 fixed
dbo:sicilianName #15 fixed
dbo:tamazightName #15 fixed
dbo:tamazightSettlementName #15 fixed
dbo:touaregName #15 fixed
dbo:touaregSettlementName #15 fixed
dbo:welshName #15 fixed

To Investigate

dbo:colonialName
dbo:informationName
dbo:leaderName
dbo:legislativePeriodName
dbo:meshName
dbo:municipalityRenamedTo Anything can be renamed, so why "municipality"? Is that the current name?
dbo:namedByLanguage used with IntermediateNodeMapping, to be removed #41
dbo:personName
dbo:phonePrefixName Say again? deleted
dbo:reignName Likely incorrect, it's used in http://mappings.dbpedia.org/index.php/Mapping_fr:Infobox_Rôle_monarchique for monarchic servants, eg https://fr.wikipedia.org/wiki/Henri_d'Orléans_(1822-1897)
dbo:sharingOutName dc:type of a part Place #8
dbo:sharingOutPopulationName unused deleted
dbo:signName
dbo:statName
dbo:subdivisionName

Is Ok

dbo:nameAsOf Date when "name" was first assigned
dbo:filename Filename of a Sound see #19
dbo:iupacName IUPAC name of a Chemical
dbo:ngcName NGC name of a CelestialBody
dbo:fuelTypeName fuelType of PowerStation as literal
dbo:messierName Astronomical object, as classified by Charles Messier
dbo:peopleName Name for the people of a certain place, eg Bulgaria->Bulgarian
dbo:spouseName Spouse of someone as literal
dbo:teamName Name of a School's athletic teams
dbo:nameDay Name-day of a saint (a xsd:gMonthDay)
dbo:namedAfter Person after whom something is named (eg School, Disease, Theorem etc)
dbo:colourName The colors of a party, school, taxon(?)
dbo:policeName The police detachment serving a UK place, eg Wakefield -> "West Yorkshire Police" ok

Not Ok

dbo:genereviewsname Bad capitalization (should be camelCase) #18
dbo:circuitName replace with raceTrack bug