STRG (Metroid Prime 3)

Latest revision as of 04:04, 29 August 2016

See STRG (File Format) for the other revisions of this format.

The STRG format in Metroid Prime 3 is another update to the STRG format. It is used in both Prime 3 and Donkey Kong Country Returns; the only difference between the two is that DKCR supports more languages. The most significant change from the Echoes STRG format is that the string encoding was changed from UTF-16 to UTF-8. A more robust system for string offsets was also introduced, which allows a single copy of a string to be reused by multiple languages when the text is identical in each of them.

Format

The initial header is identical to Prime 1/2. The differences start after that; the name table now precedes the language table, and the language tables and string tables are structured differently.

Offset	Type	Count	Name	Notes
0x0	u32	1	Magic	Always `0x87654321`.
0x4	u32	1	Version	Always 3. See hub article for a list of possible version numbers.
0x8	u32	1	Language Count	Number of languages that this table has strings for.
0xC	u32	1	String Count	Number of strings contained in the file per language.
0x10	Name Table	1	Name Table	Associates each string in the file with a name.
	char	4 × Language Count	Language ID Array	Array of fourCCs that defines which languages have strings included in the file. See below for a list of possible language codes.
	Language	Language Count	Language Table	Table that defines the languages that are present in the file. Each element in the array corresponds to the language in the Language ID Array at the same index.
	String	Varies	String Array	Contains the actual string data. The count varies because a string that is identical in multiple languages is stored once and shared between them, so no field in the file gives the real string count directly.
End of file
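
As a rough illustration of the layout above, the fixed 16-byte header could be read as in the following sketch. The big-endian byte order is an assumption (the table does not state it), and `parse_strg_header` and its returned keys are hypothetical names chosen for this example.

```python
import struct

STRG_MAGIC = 0x87654321


def parse_strg_header(data: bytes) -> dict:
    """Read the four u32 fields at the start of a STRG file."""
    # ">" selects big-endian. This byte order is an assumption, not
    # something the format table above states.
    magic, version, language_count, string_count = struct.unpack_from(">4I", data, 0)
    if magic != STRG_MAGIC:
        raise ValueError(f"bad magic: {magic:#010x}")
    if version != 3:
        raise ValueError(f"unexpected version: {version}")
    return {
        "language_count": language_count,
        "string_count": string_count,
        # The name table starts immediately after the header.
        "name_table_offset": 0x10,
    }


# Synthetic header for demonstration: 7 languages, 12 strings per language.
header = parse_strg_header(struct.pack(">4I", STRG_MAGIC, 3, 7, 12))
```

The name table that follows has a variable size, so the position of the Language ID Array can only be known after the name table has been read or skipped.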

Possible language codes:

ID	Language	MP3	DKCR
`ENGL`	English	✔	✔
`GERM`	German	✔	✔
`FREN`	French	✔	✔
`SPAN`	Spanish	✔	✔
`ITAL`	Italian	✔	✔
`DUTC`	Dutch	✔	✔
`JAPN`	Japanese	✔	✔
`SCHN`	Simplified Chinese	✖	✔
`TCHN`	Traditional Chinese	✖	✔
`UKEN`	U.K. English	✖	✔
`KORE`	Korean	✖	✔
`NAFR`	North American French	✖	✔
`NASP`	North American Spanish	✖	✔

Notes:

In DKCR, Japanese appears after English instead of after Italian.
The languages DUTC, SCHN, and TCHN are unused and don't actually appear in any STRG file. However, their fourCCs can be found in the dol alongside the other language codes, so presumably the game supports them.
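
The Language ID Array can be decoded four bytes at a time, as in this minimal sketch. The function name `read_language_ids` and the `offset` parameter (the position of the array, which depends on the size of the name table) are hypothetical and exist only for illustration.

```python
# fourCC -> language name, taken from the table above.
KNOWN_LANGUAGES = {
    "ENGL": "English",
    "GERM": "German",
    "FREN": "French",
    "SPAN": "Spanish",
    "ITAL": "Italian",
    "DUTC": "Dutch",
    "JAPN": "Japanese",
    "SCHN": "Simplified Chinese",
    "TCHN": "Traditional Chinese",
    "UKEN": "U.K. English",
    "KORE": "Korean",
    "NAFR": "North American French",
    "NASP": "North American Spanish",
}


def read_language_ids(data: bytes, offset: int, language_count: int) -> list:
    """Return the fourCC of every language, in file order."""
    ids = []
    for index in range(language_count):
        start = offset + 4 * index
        raw = data[start:start + 4]
        if len(raw) != 4:
            raise ValueError("language ID array is truncated")
        ids.append(raw.decode("ascii"))
    return ids


# Synthetic array using the DKCR ordering, where Japanese follows English.
codes = read_language_ids(b"ENGLJAPNGERM", 0, 3)
names = [KNOWN_LANGUAGES.get(code, "unknown") for code in codes]
```

Keeping the codes in file order matters, because each element of the Language Table corresponds to the fourCC at the same index.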

Name Table

This part of the file is a table that allows names to be assigned to strings. It's identical to the structure from Echoes, so check the Echoes documentation for details.

Language

This is a small structure that defines where the strings for a particular language are located. It appears once per language.

Offset	Type	Count	Name	Notes
0x0	u32	1	Strings Size	Combined size of all strings this language uses.
0x4	u32	String Count	String offsets	Relative to the start of the first string.
End of language definition
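
Since every language definition holds one size field plus String Count offsets, each entry occupies 4 + 4 × String Count bytes. The sketch below reads the whole table under that reading of the structure; big-endian byte order is assumed, and `read_language_table` with its parameters is a hypothetical helper rather than part of any existing tool.

```python
import struct


def read_language_table(data: bytes, offset: int, language_ids: list, string_count: int):
    """Read one language definition per language ID.

    Returns the parsed table and the offset just past it. According to the
    file layout, that is where the string array (the first string) begins.
    """
    # One u32 for Strings Size, then string_count u32 offsets.
    # Big-endian is an assumption.
    entry_format = f">{1 + string_count}I"
    entry_size = struct.calcsize(entry_format)
    table = {}
    for index, code in enumerate(language_ids):
        fields = struct.unpack_from(entry_format, data, offset + index * entry_size)
        table[code] = {"strings_size": fields[0], "offsets": list(fields[1:])}
    return table, offset + len(language_ids) * entry_size


# Synthetic table: two languages, two strings each. Both languages point
# their first string at offset 0, i.e. they share the same string data.
blob = struct.pack(">3I", 20, 0, 9) + struct.pack(">3I", 22, 0, 18)
table, strings_base = read_language_table(blob, 0, ["ENGL", "FREN"], 2)
```

Two languages holding the same offset for a given string index is how identical text ends up shared between them.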

String

Offset	Type	Name	Notes
0x0	u32	String Size	Size of the string data in bytes.
0x4	string	String	Zero-terminated string encoded as UTF-8.
End of string
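
Putting the string structure together with the language offsets, a single string could be fetched as sketched below. Several points are assumptions made for the example: big-endian byte order, and `String Size` counting the zero terminator when test data is built (the reader itself tolerates either case). `read_string` and `pack_string` are hypothetical helper names.

```python
import struct


def read_string(data: bytes, strings_base: int, offset: int) -> str:
    """Read one string, given an offset relative to the first string."""
    pos = strings_base + offset
    (size,) = struct.unpack_from(">I", data, pos)  # big-endian assumed
    raw = data[pos + 4:pos + 4 + size]
    # Cut at the first zero byte, so this works whether or not the size
    # field includes the terminator.
    return raw.split(b"\x00", 1)[0].decode("utf-8")


def pack_string(text: str) -> bytes:
    """Build test data. Assumes the size field counts the terminator."""
    payload = text.encode("utf-8") + b"\x00"
    return struct.pack(">I", len(payload)) + payload


# Synthetic string array with two entries.
first = pack_string("Samus")
strings = first + pack_string("Énergie")
english = read_string(strings, 0, 0)
french = read_string(strings, 0, len(first))
```

Because the text is UTF-8, byte length and character count differ for non-ASCII text such as "Énergie", so offsets should always be computed in bytes.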