Difference between revisions of "Internationalization" - PCGen

(added link to Localization)
(Application Packaging)
 
Line 60: Line 60:
  
 
Cons
* file could get quite big
+
* file could get quite big, but if it is text, it usually compresses really well.
  
 
== Genre class names ==
Line 148: Line 148:
  
 
Apparently there are already unique identifiers in use for items, so the idea would be to have a file with that ID as the key and the translation as the value.
 +
 +
== Output Sheet and Data Language ==
 +
 +
The question is: should output sheets be locale dependent?
 +
If they are, then whenever a sheet is corrected for one locale, the same correction has to be made in every language. Taking several languages into account within a single sheet makes that sheet more complicated.
 +
 +
=== Gender, race and level example ===
 +
 +
There are also other problems, as illustrated by gender, race and level.
 +
 +
In English, the gender choices are Male, Female, Neuter and Unknown.
 +
In an English statblock, it is usually a line like: Gender Race Level N, as in “Female drow cleric 3” for a drow noble [http://www.d20pfsrd.com/bestiary/monster-listings/humanoids/drow-common/drow-noble]. As of this writing, the software outputs “Female Drow Noble Cleric3” or “Female Drow Noble Cleric 3” instead in most statblock sheets.
 +
The Unknown gender, and maybe the Neuter one too, should probably not be output as is in statblock sheets, while not
 +
On my system, with the preferences set to use the system language, the translation of Female is used instead.
 +
 +
In French, it is Mâle, Femelle, Neuter, Inconnu. The same noble drow is “Drow, prêtre 3” (prêtre is the translation of cleric) [http://www.pathfinder-fr.org/Wiki/Pathfinder-RPG.Drow%20noble.ashx]. I remember reading “Drow (f), prêtre 3”, where the sex is indicated by a single letter in parentheses. It could also have been “Drow, guerrière 3”, where guerrière is the feminine form of fighter.
 +
As you can see, the order is different.
 +
 +
In Japanese, it looks like “ドラウ(女性)の貴族の3レベル・クレリック” [http://www29.atwiki.jp/prdj/pages/306.html#drow]. ドラウ is drow, 女性 means woman, の is the possessive, 貴族 is noble, レベル is level, クレリック is cleric. Literally, that reads as “drow (female)’s noble’s 3rd level cleric”. When the gender is unknown, it is not mentioned; the goblin is one such example [http://www29.atwiki.jp/prdj/pages/244.html#goblin].
 +
 +
In Italian, it seems to be “Drow nobile femmina Chierico 3”, i.e. Drow noble female Cleric 3. [http://prd.5clone.com/mostri/1982-drow]
 +
 +
I think that in German, “Female” used as a standalone choice and “female” used as a modifier (as in “female something”) are not written the same way.
 +
 +
Note that my examples use Pathfinder because I do not know of any online translations of the SRD/RSRD.
 +
 +
This is only the first element of the stat block, and it already seems complicated. And there was no example of right-to-left languages, like Arabic or Hebrew. One thing that seems common is that the gender value used for display and the one used in stat blocks are different, whatever the language, and the same holds for race. For a standard character sheet, it seems that the UI value can be reused as is. In the case of gender, there are already two (three?) tokens, GENDER.SHORT and GENDER.LONG (and GENDER?). GENDER.SHORT, which has the same value as GENDER.LONG, could be changed to be the output value. A new value could be introduced, maybe something like GENDER.STATBLOCK.
 +
That still doesn’t handle the problem of the order of gender/race/class levels.
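To make the ordering problem concrete, here is a minimal sketch in Python, assuming a hypothetical per-locale pattern with named placeholders. The pattern table, the placeholder names and the function name are all invented for this illustration; nothing like this exists in the output sheets today, and the patterns are only guesses based on the examples above.

```python
# Hypothetical per-locale patterns; keys and placeholder names are illustrative only.
STATBLOCK_PATTERNS = {
    "en": "{gender} {race} {cls} {level}",
    "fr": "{race} ({gender}), {cls} {level}",
    "it": "{race} {gender} {cls} {level}",
}


def format_statblock_header(locale, gender, race, cls, level):
    """Fill the locale's pattern, falling back to the English one if none is defined."""
    pattern = STATBLOCK_PATTERNS.get(locale, STATBLOCK_PATTERNS["en"])
    return pattern.format(gender=gender, race=race, cls=cls, level=level)


print(format_statblock_header("en", "Female", "drow", "cleric", 3))
# prints: Female drow cleric 3
print(format_statblock_header("fr", "f", "Drow", "prêtre", 3))
# prints: Drow (f), prêtre 3
```

The point is only that the order lives in the localized pattern rather than in the sheet logic. The sketch does not cover the cases above where the gender is omitted or where the class name itself carries the gender.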
 +
 +
=== Data translation and output sheet values locale ===
 +
 +
The issue is which language to use when outputting the gender or other localized fields. At the moment, the data is in English, but the output is a mix of English and whatever language the UI is set to use. My primary example is gender.
 +
Once a data language is introduced, it might make sense to use this language in the output sheets rather than the UI one (if they differ).
  
 
== Tom’s technical proposal ==
Line 154: Line 187:
  
 
First a few base facts as background:
# For [almost] every item (Class, Skill, etc.) there is a unique identifier (generally referred to as a Key - note that if the "KEY" token is not used, then the Key is the name (first entry on the line in the data)). (There is an exception to this we'll call problem #1.)
+
# For [almost] every item (Class, Skill, etc.) there is a unique identifier (generally referred to as a Key - note that if the "KEY" token is not used, then the Key is the name (first entry on the line in the data)). (There is an exception to this we'll call [[#problem1|problem #1]].)
 
# There are basically 4 types of things that need translation:
 
## (2a) item names
Line 161: Line 194:
 
## (2d) Strings (like descriptions)
 
## [if anyone can think of more, let me know]
# Most (but not all) tokens are "unique" or otherwise "addressable" in that they can only occur once per object. (Anyone notice that the tokens have started to be specific in the test code & docs about whether they overwrite, add, etc. - this is one reason why - and yes, I've been slowly trying to make progress on this even in the 2007-2009 work.) (The exception to this is problem #2.)
+
# Most (but not all) tokens are "unique" or otherwise "addressable" in that they can only occur once per object. (Anyone notice that the tokens have started to be specific in the test code & docs about whether they overwrite, add, etc. - this is one reason why - and yes, I've been slowly trying to make progress on this even in the 2007-2009 work.) (The exception to this is [[#problem2|problem #2]].)
  
 
A few principles about l10n:
Line 168: Line 201:
  
 
With that:
* Following from #1, almost everything we have in the data today is "addressable". By that, I mean that the OUTPUTNAME for a Skill called "FooBar" can be uniquely called out. The name hierarchy is something like: SKILL//FooBar//OUTPUTNAME (exceptions are problems #1,2 to be addressed later)
+
* Following from #1, almost everything we have in the data today is "addressable". By that, I mean that the OUTPUTNAME for a Skill called "FooBar" can be uniquely called out. The name hierarchy is something like: SKILL//FooBar//OUTPUTNAME (exceptions are problems #[[#problem1|1]],[[#problem2|2]] to be addressed later)
  
Given that, we can actually set principle #1 to "The data remains unchanged" (again, except for problem #2). The entire data set is produced (assume US English for a moment). We then have a unique file for l10n that has things like:
+
Given that, we can actually set principle #1 to "The data remains unchanged" (again, except for [[#problem2|problem #2]]). The entire data set is produced (assume US English for a moment). We then have a unique file for l10n that has things like:
 
* SKILL:FooBar|Oobarf-ay
 
* SKILL:FooBar:OUTPUTNAME|ooBarOutF-ay
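As a rough illustration of how entries of this shape could be read, here is a minimal Python sketch. It assumes one entry per line, a single "|" between the address and the translation, and keys that contain neither ":" nor "|". The function names (parse_l10n_line, load_l10n, translate) are hypothetical and are not part of PCGen.

```python
def parse_l10n_line(line):
    """Split 'TYPE:Key[:TOKEN]|translation' into ((type, key, token), translation)."""
    address, sep, translation = line.partition("|")
    if not sep:
        raise ValueError(f"missing '|' separator: {line!r}")
    parts = address.split(":")
    if len(parts) == 2:
        item_type, key = parts
        token = None  # no token: the entry translates the item name itself
    elif len(parts) == 3:
        item_type, key, token = parts
    else:
        raise ValueError(f"unexpected address: {address!r}")
    return (item_type, key, token), translation


def load_l10n(lines):
    """Build a lookup table from an iterable of l10n lines, skipping blank ones."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if line:
            address, text = parse_l10n_line(line)
            table[address] = text
    return table


def translate(table, item_type, key, token=None, default=None):
    """Return the translation for an address, or the default if there is none."""
    return table.get((item_type, key, token), default)


table = load_l10n(["SKILL:FooBar|Oobarf-ay", "SKILL:FooBar:OUTPUTNAME|ooBarOutF-ay"])
print(translate(table, "SKILL", "FooBar"))                # prints: Oobarf-ay
print(translate(table, "SKILL", "FooBar", "OUTPUTNAME"))  # prints: ooBarOutF-ay
```

Because the data itself stays untouched, a missing entry can simply fall back to the original string by passing it as the default.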
Line 197: Line 230:
 
This brings us to problems #1 and #2:
  
Problem #1: Non-unique names
+
<a id="problem1">Problem #1</a>: Non-unique names
 
* We glossed over this in 6.x, but the truth is that Spell names are not unique. Some of the *RDs have duplicate names (not all, but I forget which). Same is true for languages.
 
* (1a) Spells can theoretically be differentiated by evaluating the "TYPE" token for Divine, Arcane, or Psionic (those are "magical" items in our code).
Line 203: Line 236:
 
I believe both of those forms of magic are things on our backlog of FREQs to clean up... and the reason they are on the cleanup list is really L10N (as much as it is to just clean up the overuse of TYPE).
  
Problem #2: Non-unique tokens
+
<a id="problem2">Problem #2</a>: Non-unique tokens
 
* There are only a few tokens that are not unique. DESC is one of them, if I recall correctly. The probable solution here is to simply give an identifier to each token. This might require a change to LST, something like:
 
* DESC*Overall:x
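A minimal Python sketch of how such an identified token could be split apart, assuming the form NAME*Identifier:value shown above and treating a token without "*" as having no identifier. The function name is hypothetical, and the syntax itself is only a proposal at this point.

```python
def parse_identified_token(entry):
    """Split 'NAME*Identifier:value' into (name, identifier, value).

    The identifier is None when the token name carries no '*' part.
    """
    head, sep, value = entry.partition(":")
    if not sep:
        raise ValueError(f"missing ':' separator: {entry!r}")
    name, star, identifier = head.partition("*")
    return name, (identifier if star else None), value


print(parse_identified_token("DESC*Overall:x"))  # prints: ('DESC', 'Overall', 'x')
print(parse_identified_token("DESC:x"))          # prints: ('DESC', None, 'x')
```

With an identifier in place, each DESC on an object becomes addressable on its own, which is what the translation file needs.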
Line 266: Line 299:
  
 
That is also an error (just like the "unconstructed reference" items are errors). By capturing both type 1 and type 2 errors (things that aren't translated as well as things that shouldn't have been translated) we are capturing the vast majority of the simple problems.
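Both checks come down to comparing two sets of addresses. The Python sketch below assumes that a type 1 error is a data address with no translation entry and that a type 2 error is a translation entry whose address matches nothing in the data; that reading of "shouldn't have been translated" is my assumption. The function name and the address strings are hypothetical.

```python
def check_translations(data_addresses, translated_addresses):
    """Return (missing, orphaned) address lists, each sorted for stable reporting.

    missing:  addresses present in the data but absent from the translation (type 1)
    orphaned: addresses present in the translation but absent from the data (type 2)
    """
    data = set(data_addresses)
    translated = set(translated_addresses)
    missing = sorted(data - translated)
    orphaned = sorted(translated - data)
    return missing, orphaned


data = ["SKILL:FooBar", "SKILL:FooBar:OUTPUTNAME", "SKILL:Baz"]
translation = ["SKILL:FooBar", "SKILL:FooBar:OUTPUTNAME", "SKILL:Qux"]
missing, orphaned = check_translations(data, translation)
print(missing)   # prints: ['SKILL:Baz']
print(orphaned)  # prints: ['SKILL:Qux']
```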
 +
 +
=== Also need translation ===
 +
 +
Page number: either the original book is referenced, which doesn’t help people who have the translation, where the page is not the same, or the translated book is referenced and that number needs to be localized.
 +
Usually a data set represents an individual book, but compilations of several books are sometimes published. That means that several data sets would correspond to a single book. I know the French publisher of Pathfinder does this. I have no idea what the content is or how it is organized, but it might work by changing the description to include something like “chapter X of book Z”.
 +
 +
The data sets point to the English books. Should the translation point to the localized one, if it exists?
  
 
== See also ==
  
 
* [[Localization]]

Latest revision as of 22:16, 15 September 2012
