STRG (Metroid Prime 3)

Revision as of 18:09, 29 May 2015

See STRG (File Format) for the other revisions of this format.

The STRG format in Metroid Prime 3 is another update to the STRG format, used in both Prime 3 and Donkey Kong Country Returns. The most significant change from the Echoes STRG format is that the string encoding was changed from UTF-16 to UTF-8. A more robust system for string offsets was also introduced, which allows a single copy of a string to be shared by multiple languages when the text is identical in each of them.

Format

The header is unchanged from the Prime 1/2 STRG format, so it should be familiar if you've worked with that format in the past.

Offset	Type	Size	Description
0x0	u32	4	Magic; always 0x87654321
0x4	u32	4	Version; see hub article
0x8	u32	4	Language count
0xC	u32	4	String count
0x10	End of header
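
As a rough illustration, the header could be read like this in Python. The big-endian byte order is an assumption (based on the GameCube/Wii target hardware), and parse_strg_header and the dictionary keys are names made up for this sketch, not part of any existing tool.

```python
import struct

STRG_MAGIC = 0x87654321  # magic value from the header table above


def parse_strg_header(data: bytes) -> dict:
    """Read the 0x10-byte STRG header from the start of `data`.

    Assumption: all fields are big-endian.
    """
    if len(data) < 0x10:
        raise ValueError("STRG header needs at least 16 bytes")
    # Four consecutive u32 fields: magic, version, language count, string count.
    magic, version, language_count, string_count = struct.unpack_from(">4I", data, 0)
    if magic != STRG_MAGIC:
        raise ValueError(f"bad STRG magic: 0x{magic:08X}")
    return {
        "version": version,
        "language_count": language_count,
        "string_count": string_count,
    }
```

The returned language and string counts are needed later to walk the language table.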

String Names

The next part of the file is a table that allows names to be assigned to strings. It's identical to the name table structure from Echoes.

Offset	Size	Description
0x0	4	Name count
0x4	4	Name table size
0x8	Name entries begin

Each entry is structured as follows:

Offset	Size	Description
0x0	4	Name offset (relative to the end of the name table size value, i.e. the start of the name entries)
0x4	4	String index - this is the string number that the name is associated with
0x8	End of entry

After the last name entry come the names themselves, stored as one large array of UTF-8 strings. The names are zero-terminated and sorted in alphabetical order; the sorting is case-sensitive, so 'Z' will appear before 'a'.
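
The name table layout described above could be read along these lines. This is only a sketch: it assumes big-endian fields and that the table starts right after the 0x10-byte header, and parse_name_table is a hypothetical helper name.

```python
import struct


def parse_name_table(data: bytes, table_start: int = 0x10) -> dict:
    """Return a {name: string index} mapping from the name table.

    Assumptions: big-endian fields; the table begins at `table_start`.
    """
    name_count, _table_size = struct.unpack_from(">2I", data, table_start)
    # Name offsets are relative to the byte right after the table size value.
    base = table_start + 8
    names = {}
    for i in range(name_count):
        name_offset, string_index = struct.unpack_from(">2I", data, base + i * 8)
        start = base + name_offset
        end = data.index(b"\x00", start)  # names are zero-terminated
        names[data[start:end].decode("utf-8")] = string_index
    return names
```

Python dictionaries keep insertion order, so the keys come back in file order. Python's default string sort compares code points, which also puts 'Z' before 'a', so sorted(names) can serve as a quick sanity check against the file order for ASCII names.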

Languages

Next is a segment that defines which languages are included in the file and where their strings are. This section of the file starts with one fourCC for each supported language. After the fourCCs, the following small structure repeats once per language:

Offset	Type	Size	Description
0x0	u32	4	Language size; this is the combined size of all the strings this language uses
0x4	u32[]	4 × string count	String offsets; relative to the start of the first string
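
A possible way to walk this section is sketched below. It assumes big-endian fields and ASCII fourCCs; parse_language_table and its parameters are illustrative names, and the caller is expected to supply the offset where the section begins along with the two counts from the header.

```python
import struct


def parse_language_table(data: bytes, offset: int,
                         language_count: int, string_count: int):
    """Return ({fourCC: {"size", "offsets"}}, position after the table).

    Assumptions: big-endian fields; fourCCs are plain ASCII.
    """
    # One fourCC per language comes first.
    codes = [
        data[offset + 4 * i: offset + 4 * i + 4].decode("ascii")
        for i in range(language_count)
    ]
    pos = offset + 4 * language_count
    languages = {}
    for code in codes:
        # Per language: a u32 size, then one u32 offset per string.
        (size,) = struct.unpack_from(">I", data, pos)
        offsets = struct.unpack_from(f">{string_count}I", data, pos + 4)
        languages[code] = {"size": size, "offsets": list(offsets)}
        pos += 4 + 4 * string_count
    return languages, pos
```

If the string data follows the table directly (an assumption, though it matches the order in which the sections are described), the returned position is the start of the first string, which is the base the string offsets are relative to. Two languages that list the same offset for a given string index are sharing one copy of that string.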

Strings

Finally, the actual string data. Each string is composed of a 32-bit size followed by UTF-8 Unicode string data.
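
Reading a single string might then look like the sketch below. It assumes a big-endian size field. Whether the size counts a trailing terminator is not stated here, so the sketch simply strips any trailing NUL bytes. read_string, strings_start and string_offset are names invented for the example.

```python
import struct


def read_string(data: bytes, strings_start: int, string_offset: int) -> str:
    """Decode one string given the start of the string data and its offset.

    Assumptions: big-endian u32 size; trailing NUL bytes, if any, are padding
    or a terminator and can be dropped.
    """
    pos = strings_start + string_offset
    (size,) = struct.unpack_from(">I", data, pos)
    raw = data[pos + 4: pos + 4 + size]
    if len(raw) != size:
        raise ValueError("string data runs past the end of the file")
    return raw.rstrip(b"\x00").decode("utf-8")
```

Here strings_start is the position of the first string and string_offset is one of the per-language string offsets, so the same call works for every language, including languages that share an offset.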