STRG (Metroid Prime 3)

From Retro Modding Wiki
Revision as of 18:09, 29 May 2015 by Aruki

See STRG (File Format) for the other revisions of this format.

The STRG format in Metroid Prime 3 is another update to the STRG format, used in both Prime 3 and Donkey Kong Country Returns. The most significant change from the Echoes STRG format is that the string encoding was changed from UTF-16 to UTF-8. A more robust system for string offsets was also introduced: each language has its own table of string offsets, so a string of text that is identical in several languages can be stored once and shared between them.

Format

The header should be familiar to you if you've worked with the Prime 1/2 STRG format in the past. It remains unchanged.

Offset  Type  Size  Description
0x0     u32   4     Magic; always 0x87654321
0x4     u32   4     Version; see hub article
0x8     u32   4     Language count
0xC     u32   4     String count
0x10                End of header
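
As an illustration of the layout above, here is a minimal Python sketch that decodes the 16-byte header. The big-endian byte order is an assumption (it matches the GameCube and Wii hardware, but this page does not state it), and the names `StrgHeader` and `read_strg_header` are hypothetical, not taken from any existing tool.

```python
import struct
from typing import NamedTuple

STRG_MAGIC = 0x87654321


class StrgHeader(NamedTuple):
    magic: int
    version: int
    language_count: int
    string_count: int


def read_strg_header(data: bytes) -> StrgHeader:
    """Decode the 16-byte STRG header found at the start of `data`."""
    if len(data) < 0x10:
        raise ValueError("need at least 16 bytes for the header")
    # ">4I" = four unsigned 32-bit integers; big-endian is an assumption.
    header = StrgHeader(*struct.unpack_from(">4I", data, 0))
    if header.magic != STRG_MAGIC:
        raise ValueError(f"bad magic 0x{header.magic:08X}")
    return header
```

The language count and string count returned here are needed later to size the language table.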

String Names

The next part of the file is a table that allows names to be assigned to strings. It's identical to the name table structure from Echoes.

Offset  Size  Description
0x0     4     Name count
0x4     4     Name table size
0x8           Name entries begin

Each entry is structured as follows:

Offset  Size  Description
0x0     4     Name offset (relative to the end of the name table size field, i.e. to the start of the name entries)
0x4     4     String index; this is the number of the string that the name is associated with
0x8           End of entry

After the last name entry come the names themselves, stored as one large array of zero-terminated UTF-8 strings. The names are sorted in alphabetical order; the sorting is case-sensitive, so 'Z' will appear before 'a'.
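
The name table described above could be read roughly as follows. This is a sketch under several assumptions: values are big-endian, the table starts right after the header at 0x10, and the name table size counts the bytes that follow the size field. `read_name_table` is a hypothetical helper name.

```python
import struct


def read_name_table(data: bytes, offset: int = 0x10):
    """Return ([(name, string_index), ...], offset just past the table)."""
    name_count, table_size = struct.unpack_from(">2I", data, offset)
    # Name offsets are relative to the point right after the size field.
    entries_start = offset + 8
    names = []
    for i in range(name_count):
        name_offset, string_index = struct.unpack_from(
            ">2I", data, entries_start + i * 8
        )
        start = entries_start + name_offset
        end = data.index(b"\x00", start)  # names are zero-terminated
        names.append((data[start:end].decode("utf-8"), string_index))
    # Assumption: the table size is measured from the end of the size field.
    return names, entries_start + table_size
```

Because the names are stored sorted, a tool that looks strings up by name could binary-search the returned list instead of scanning it.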

Languages

Next is a segment that defines which languages are included in the file and where their strings are. This section of the file starts with one fourCC for each supported language. The following small structure is then repeated once per language:

Offset  Type   Size              Description
0x0     u32    4                 Language size; this is the combined size of all strings this language uses
0x4     u32[]  4 × string count  String offsets; relative to the start of the first string
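
A hedged sketch of reading this section, assuming big-endian values and ASCII fourCCs. The caller supplies the language count and string count from the header and the offset where the section begins; `read_language_table` is a hypothetical name.

```python
import struct


def read_language_table(data: bytes, offset: int,
                        language_count: int, string_count: int):
    """Return ({fourcc: (size, [string offsets])}, offset past the section)."""
    # The section opens with one 4-byte language ID per language.
    ids = [
        data[offset + 4 * i: offset + 4 * i + 4].decode("ascii")
        for i in range(language_count)
    ]
    pos = offset + 4 * language_count
    languages = {}
    for lang_id in ids:
        (size,) = struct.unpack_from(">I", data, pos)
        offsets = struct.unpack_from(f">{string_count}I", data, pos + 4)
        languages[lang_id] = (size, list(offsets))
        pos += 4 + 4 * string_count
    return languages, pos
```

When two languages list the same offset for a given string index, they share one copy of that text, which is how identical strings are reused across languages.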

Strings

Finally, the actual string data. Each string consists of a 32-bit size followed by that many bytes of UTF-8 string data.
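
To show how a string offset from the language table resolves to text, here is a minimal sketch. It assumes a big-endian size field and strips any trailing zero bytes, since this page does not say whether the size includes a terminator. `read_string` and `strings_start` (the offset of the first string in the file) are hypothetical names.

```python
import struct


def read_string(data: bytes, strings_start: int, string_offset: int) -> str:
    """Decode the string stored `string_offset` bytes into the string data."""
    pos = strings_start + string_offset
    (size,) = struct.unpack_from(">I", data, pos)
    raw = data[pos + 4: pos + 4 + size]
    # Assumption: a trailing NUL, if present, is padding and not text.
    return raw.rstrip(b"\x00").decode("utf-8")
```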