STRG (Metroid Prime 3)
Revision as of 18:09, 29 May 2015, by Aruki
See STRG (File Format) for the other revisions of this format.
The STRG format in Metroid Prime 3 is another update to the STRG format, used in both Prime 3 and Donkey Kong Country Returns. The most significant change from the Echoes STRG format is that the string encoding was changed from UTF-16 to UTF-8. A more robust system for string offsets was also implemented, allowing the same string of text to be shared by multiple languages when it is identical in each of them.
Format
The header should be familiar to you if you've worked with the Prime 1/2 STRG format in the past. It remains unchanged.
Offset | Type | Size | Description |
---|---|---|---|
0x0 | u32 | 4 | Magic; always 0x87654321 |
0x4 | u32 | 4 | Version; see hub article |
0x8 | u32 | 4 | Language count |
0xC | u32 | 4 | String count |
0x10 | End of header |
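As a minimal sketch, the header table above could be read like this in Python. The function and class names are hypothetical, and big-endian byte order is an assumption (it matches the GameCube and Wii hardware, but the table itself does not state it).

```python
import struct
from typing import NamedTuple

STRG_MAGIC = 0x87654321  # magic value from the header table


class StrgHeader(NamedTuple):
    magic: int
    version: int
    language_count: int
    string_count: int


def parse_strg_header(data: bytes) -> StrgHeader:
    """Read the 0x10-byte STRG header (assumed big-endian)."""
    if len(data) < 0x10:
        raise ValueError("STRG header needs at least 16 bytes")
    # Four consecutive u32 fields starting at offset 0x0.
    header = StrgHeader(*struct.unpack_from(">4I", data, 0))
    if header.magic != STRG_MAGIC:
        raise ValueError(f"bad STRG magic: {header.magic:#010x}")
    return header
```

The version field is returned unchecked here, since its valid values are listed in the hub article rather than on this page.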
String Names
The next part of the file is a table that allows names to be assigned to strings. It's identical to the name table structure from Echoes.
Offset | Size | Description |
---|---|---|
0x0 | 4 | Name count |
0x4 | 4 | Name table size |
0x8 | Name entries begin |
Each entry is structured as follows:
Offset | Size | Description |
---|---|---|
0x0 | 4 | Name offset (relative to after the name table size value) |
0x4 | 4 | String index - this is the string number that the name is associated with |
0x8 | End of entry |
After the last name entry come all of the names, stored as one large array of UTF-8 strings. The names are zero-terminated and sorted in alphabetical order; the sorting is case-sensitive, so 'Z' will appear before 'a'.
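The name table layout can be sketched in Python as follows. The function name is hypothetical, and two things are assumptions rather than facts from this page: values are big-endian, and the name table size counts the bytes that follow the size field (which is what lets the sketch compute where the table ends).

```python
import struct


def parse_name_table(data: bytes, table_start: int = 0x10):
    """Return ({name: string_index}, offset just past the name table)."""
    name_count, table_size = struct.unpack_from(">2I", data, table_start)
    # Name offsets are relative to the point right after the size value.
    base = table_start + 8
    names = {}
    for i in range(name_count):
        # Each entry is 8 bytes: name offset, then string index.
        name_offset, string_index = struct.unpack_from(">2I", data, base + i * 8)
        start = base + name_offset
        end = data.index(b"\x00", start)  # names are zero-terminated
        names[data[start:end].decode("utf-8")] = string_index
    # Assumption: table_size covers everything after the size field.
    return names, base + table_size
```

Because the names are sorted case-sensitively, a tool that writes this table could likely reproduce the order with Python's default `sorted()` on the names, which also places 'Z' before 'a'.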
Languages
Next is a segment that defines which languages are included in the file and where their strings are. This section of the file starts with one fourCC for each supported language. Then this small structure repeats once per language:
Offset | Type | Size | Description |
---|---|---|---|
0x0 | u32 | 4 | Language size; this is the combined size of all the strings this language uses |
0x4 | u32[] | 4 × string count | String offsets; relative to the start of the first string |
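A rough Python sketch of reading this section is shown below. The function name is hypothetical, big-endian byte order is assumed, and `offset` is expected to be the position right after the name table. The language and string counts come from the header.

```python
import struct


def parse_language_table(data: bytes, offset: int,
                         language_count: int, string_count: int):
    """Return ({fourCC: (language_size, [string offsets])}, end offset)."""
    # One 4-byte fourCC per language comes first.
    codes = [
        data[offset + 4 * i: offset + 4 * i + 4].decode("ascii")
        for i in range(language_count)
    ]
    pos = offset + 4 * language_count
    languages = {}
    for code in codes:
        (language_size,) = struct.unpack_from(">I", data, pos)
        string_offsets = struct.unpack_from(f">{string_count}I", data, pos + 4)
        languages[code] = (language_size, list(string_offsets))
        pos += 4 + 4 * string_count
    # pos now points just past the last per-language structure.
    return languages, pos
```

Two languages that share a string simply carry the same offset for that string index, which is how identical text can be stored once.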
Strings
Finally, the actual string data. Each string is composed of a 32-bit size followed by UTF-8 Unicode string data.
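Reading a single string could then look like the sketch below. The function name is hypothetical and big-endian sizes are assumed. `strings_start` is the position of the first string, and `string_offset` is one of the per-language offsets. This page does not say whether the stored size includes a terminating zero byte, so the sketch strips trailing zero bytes as a precaution.

```python
import struct


def read_string(data: bytes, strings_start: int, string_offset: int) -> str:
    """Decode one size-prefixed UTF-8 string from the string data."""
    pos = strings_start + string_offset
    (size,) = struct.unpack_from(">I", data, pos)  # 32-bit size prefix
    raw = data[pos + 4: pos + 4 + size]
    # Assumption: a trailing NUL may or may not be counted in the size.
    return raw.rstrip(b"\x00").decode("utf-8")
```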