This post outlines some of the information I've learned while writing EorSeisa for FF14. Specifically, this explains how Kitanai14 extracts files from FF14.
A file path in FF14 has 3 main components (category, expansion, and chunk), followed by the rest of the file path.
A path can look something like this: bg/ex4/05_zon_z5/shared/for_bg/sgbg_z5c1_u1_str01.sgb
A breakdown of that path is as follows:
| Category | Expansion | Chunk | File path |
|---|---|---|---|
| bg | ex4 | 05 | zon_z5/shared/for_bg/sgbg_z5c1_u1_str01.sgb |
Once you have the category, expansion, and chunk number, you must parse the correct index file to determine where in the package your file is.
The index file you need to parse is at: sqpack/$EXPANSION/$BG_ID$EXPANSION_NUMBER$CHUNK.win32.index{,2}. Some chunks have only a .index or only a .index2, and some have both. For chunks that have both, it doesn't seem to matter which one you use.
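As a rough illustration of the steps above, here is a minimal sketch that splits a game path and builds the index file path. The names `PathParts`, `SplitGamePath`, and `IndexFilePath` are hypothetical. It assumes that the category ID and expansion number are each written as two lowercase hex digits and that the chunk is copied from the path unchanged; neither assumption is confirmed above, so adjust them if your files are named differently. Paths that don't have the category/exN/chunk_ shape are simply rejected.

```cpp
#include <cctype>
#include <cstdio>
#include <map>
#include <optional>
#include <string>

// The three leading components of a game path, kept as strings.
struct PathParts {
    std::string category;   // e.g. "bg"
    std::string expansion;  // e.g. "ex4"
    std::string chunk;      // e.g. "05"
};

// Splits "category/expansion/chunk_rest" into its leading components.
// Returns std::nullopt when the path does not have that shape.
std::optional<PathParts> SplitGamePath(const std::string& path) {
    const auto first = path.find('/');
    if (first == std::string::npos) return std::nullopt;
    const auto second = path.find('/', first + 1);
    if (second == std::string::npos) return std::nullopt;
    // The chunk ends at the first '_' of the third path segment.
    const auto end = path.find_first_of("_/", second + 1);
    if (end == std::string::npos || path[end] != '_') return std::nullopt;
    return PathParts{path.substr(0, first),
                     path.substr(first + 1, second - first - 1),
                     path.substr(second + 1, end - second - 1)};
}

// Builds the path of the .index file (append "2" for the .index2 variant).
// ASSUMPTION: both IDs are formatted as two lowercase hex digits.
std::optional<std::string> IndexFilePath(const PathParts& parts) {
    // Category IDs taken from the table at the end of this post.
    static const std::map<std::string, unsigned> kCategoryIds = {
        {"common", 0x00}, {"bgcommon", 0x01}, {"bg", 0x02}, {"cut", 0x03},
        {"chara", 0x04}, {"shader", 0x05}, {"ui", 0x06}, {"sound", 0x07},
        {"vfx", 0x08}, {"exd", 0x0A}, {"game_script", 0x0B},
        {"music", 0x0C}, {"_sqpack_test", 0x12}, {"_debug", 0x13}};
    const auto category = kCategoryIds.find(parts.category);
    if (category == kCategoryIds.end()) return std::nullopt;

    // Expect "ex" followed by one or two decimal digits, e.g. "ex4".
    const std::string& exp = parts.expansion;
    if (exp.size() < 3 || exp.size() > 4 || exp.compare(0, 2, "ex") != 0) {
        return std::nullopt;
    }
    unsigned number = 0;
    for (std::size_t i = 2; i < exp.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(exp[i]))) return std::nullopt;
        number = number * 10 + static_cast<unsigned>(exp[i] - '0');
    }

    char ids[8];
    std::snprintf(ids, sizeof(ids), "%02x%02x", category->second, number);
    return "sqpack/" + exp + "/" + ids + parts.chunk + ".win32.index";
}
```

Under those assumptions, the example path from earlier maps to sqpack/ex4/020405.win32.index.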
The index file has 2 headers:
struct SqPackHeader {
    char Magic[8];
    uint8_t PlatformId;
    uint8_t _unk01[3];
    uint32_t Size;
    uint32_t Version;
    uint32_t Type;
};
struct SqPackIndexHeader {
    uint32_t Size;
    uint32_t Version;
    uint32_t IndexDataOffset;
    uint32_t IndexDataSize;
    uint8_t IndexDataHash[64];
    uint32_t DataFileCount;
    uint32_t SynonymDataOffset;
    uint32_t SynonymDataSize;
    uint8_t SynonymDataHash[64];
    uint32_t DirIndexDataOffset;
    uint32_t DirIndexDataSize;
    uint8_t DirIndexDataHash[64];
    uint32_t IndexType;
    uint8_t _unk01[646];
    uint8_t SelfHash[64];
};
All of the hashes are SHA-1 hashes of the relevant data (a SHA-1 digest is only 20 bytes), but padded to 64 bytes for some unknown reason.
The data at IndexDataOffset is a list of one of these two structs, depending on whether you're reading a .index file (IndexHashTableEntry) or a .index2 file (Index2HashTableEntry).
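struct IndexHashTableEntry {
    uint64_t Hash;
    uint32_t Data;
    uint32_t _padding;
    /** Bit 0 of Data: synonym flag. */
    bool IsSynonym() const { return (Data & 0x1) == 1; }
    /** Bits 1-3 of Data: the N in the .datN file name. */
    uint8_t DataFieldId() const { return (Data & 0xE) >> 1; }
    /** Same value as DataFieldId(), computed by division. */
    uint8_t DatFile() const { return (Data & 0xF) / 2; }
    /** Widen before multiplying so the result cannot wrap at 32 bits. */
    int64_t Offset() const { return static_cast<int64_t>(Data & ~0xFu) * 0x08; }
};
struct Index2HashTableEntry {
    uint32_t Hash;
    uint32_t Data;
    /** Bit 0 of Data: synonym flag. */
    bool IsSynonym() const { return (Data & 0x1) == 1; }
    /** Bits 1-3 of Data: the N in the .datN file name. */
    uint8_t DataFieldId() const { return (Data & 0xE) >> 1; }
    /** Same value as DataFieldId(), computed by division. */
    uint8_t DatFile() const { return (Data & 0xF) / 2; }
    /** Widen before multiplying so the result cannot wrap at 32 bits. */
    int64_t Offset() const { return static_cast<int64_t>(Data & ~0xFu) * 0x08; }
};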
Once the index file has been found and parsed, look for the entry whose hash matches your file. Once you have the entry, the data file can finally be read. The data file has the same path as the index file, but with the .index{,2} extension replaced with .dat$DataFieldId.
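The lookup and the extension swap might be sketched as follows for a .index2 table. `FindEntry` and `DataFilePath` are hypothetical helpers, the table is assumed to be already loaded into a vector, and the hash is taken as an input because computing it is not covered here. Synonym entries are not treated specially in this sketch.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct Index2HashTableEntry {
    std::uint32_t Hash;
    std::uint32_t Data;
    bool IsSynonym() const { return (Data & 0x1) == 1; }
    std::uint8_t DataFieldId() const {
        return static_cast<std::uint8_t>((Data & 0xE) >> 1);
    }
    // Widened to 64 bits before the multiply so large offsets cannot wrap.
    std::int64_t Offset() const {
        return static_cast<std::int64_t>(Data & ~0xFu) * 0x08;
    }
};

// Linear scan for the entry with the given hash; nullptr if absent.
const Index2HashTableEntry* FindEntry(
        const std::vector<Index2HashTableEntry>& table, std::uint32_t hash) {
    for (const auto& entry : table) {
        if (entry.Hash == hash) return &entry;
    }
    return nullptr;
}

// Replaces a trailing ".index" or ".index2" with ".dat<N>".
std::optional<std::string> DataFilePath(const std::string& indexPath,
                                        std::uint8_t dataFieldId) {
    const auto dot = indexPath.rfind(".index");
    if (dot == std::string::npos) return std::nullopt;
    const std::string ext = indexPath.substr(dot);
    if (ext != ".index" && ext != ".index2") return std::nullopt;
    return indexPath.substr(0, dot) + ".dat" +
           std::to_string(static_cast<unsigned>(dataFieldId));
}
```

For example, an entry with Data = 0x123 has the synonym bit set, a DataFieldId of 1, and an Offset of 0x120 * 8 = 2304 bytes, so it would be read from the matching .dat1 file.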
The data files do have headers, but I have yet to find a compelling reason to bother trying to parse them. You can just seek to the offset you got from the index entry, and read the file header from there.
Now we can finally start reading the file itself :) The file header has the following format:
enum class FileType : uint32_t {
    Empty = 0x01,
    Standard = 0x02,
    Model = 0x03,
    Texture = 0x04,
};
struct SqPackFileInfo {
    uint32_t Size; /**< Size of the compressed file in bytes */
    FileType Type; /**< Type of the file, stored as 4 bytes for some reason */
    uint32_t RawFileSize; /**< Size of the uncompressed file in bytes */
    uint32_t _unk01[2];
    uint32_t BlockCount; /**< Number of blocks the file is made up of */
};
The most basic file type is Standard (0x02), which is made up of BlockCount blocks. The block info headers are as follows, and come immediately after the file header.
struct DataStandardFileBlockInfo {
    uint32_t Offset;
    uint16_t CompressedSize;
    uint16_t UncompressedSize;
};
Note that the Offset in the block info is relative to the end of the file header.
TODO
TODO
| Name | ID | Description |
|---|---|---|
| common | 0x00 | |
| bgcommon | 0x01 | |
| bg | 0x02 | |
| cut | 0x03 | |
| chara | 0x04 | |
| shader | 0x05 | |
| ui | 0x06 | |
| sound | 0x07 | |
| vfx | 0x08 | |
| exd | 0x0A | |
| game_script | 0x0B | |
| music | 0x0C | |
| _sqpack_test | 0x12 | |
| _debug | 0x13 | |