rRES - raylib resources custom fileformat

Ray

#10275

January 8, 2017

By default, raylib supports the following file formats for resources:

- IMAGE (Uncompressed): PNG, BMP, TGA, JPG, GIF, HDR (stb_image.h)
- IMAGE (Compressed): DDS, PKM, KTX, PVR, ASTC
- AUDIO (Sound): WAV, OGG (stb_vorbis.c), FLAC (dr_flac.h)
- AUDIO (Streaming): OGG, FLAC, XM (jar_xm.h), MOD (jar_mod.h)
- FONTS: BMFont (FNT), TTF (stb_truetype.h), IMAGE-based
- MODEL (Mesh + Material): OBJ, MTL

Since raylib 1.0, I have also supported a custom file format: RRES

At that moment (3 years ago) I didn't have the same experience as I do now, so
I decided to redesign it to be more generic, versatile and useful (while keeping it simple):

I'm not an expert designing file formats so I'd appreciate any feedback about it.

Edited by Ray on December 10, 2017, 9:47am

Mārtiņš Možeiko

#10276

January 8, 2017

What does "RSA" mean for the crypto type? You cannot encrypt arbitrarily large data with RSA. You typically encrypt a symmetric cipher key (e.g. an AES key) with RSA and then use the symmetric cipher to encrypt the rest of the data. So you need a place to store the encrypted AES key as part of the data.

Using just an unauthenticated crypto algorithm (AES/XOR/Blowfish) is a bad idea. Always use authentication - something like HMAC or GCM block mode. So you'll need a place for the signature. Alternatively, don't roll your own crypto and take a good high-level implementation - NaCl (or libsodium), which is proven to be resistant to crypto attacks (for example, its code doesn't use data-dependent branches). Instead of the ancient Blowfish I would take much more modern crypto primitives - Salsa20 or ChaCha20 for the stream cipher and Poly1305 for authentication; both of them can be implemented very efficiently in software.
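To make the storage requirement concrete, here is a minimal layout sketch. It assumes the 24-byte nonce and 16-byte authenticator of the NaCl/libsodium secretbox construction (XSalsa20-Poly1305); the struct and function names are hypothetical and not part of rRES, and no actual encryption is performed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes used by the XSalsa20-Poly1305 "secretbox" construction. */
enum { NONCE_BYTES = 24, TAG_BYTES = 16 };

/* Hypothetical per-resource envelope stored in front of the ciphertext. */
typedef struct {
    uint8_t nonce[NONCE_BYTES]; /* must never repeat for the same key */
    uint8_t tag[TAG_BYTES];     /* authenticator, verified before decrypting */
} CryptoEnvelope;

/* Byte arrays only, so the compiler should not add padding. */
static_assert(sizeof(CryptoEnvelope) == NONCE_BYTES + TAG_BYTES,
              "envelope must be tightly packed");

/* Bytes an encrypted resource would occupy on disk.
   A stream cipher keeps the length: ciphertext size == plaintext size. */
size_t encrypted_size(size_t plainSize)
{
    return sizeof(CryptoEnvelope) + plainSize;
}
```

With this layout every encrypted resource grows by a fixed 40 bytes, which the format would have to account for in its size fields.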

I would also include zstd for compression. It is from the same author as lz4 and offers better compression than lz4, and both faster performance and better compression than zlib.

For vertex formats - does each component (Normal, Position) need to be in a separate resource block? Is vertexType an enum or a bitset?
What about custom attributes?

Edited by Mārtiņš Možeiko on January 8, 2017, 11:49am

Ray

#10278

January 8, 2017

Hi mmozeiko! Thank you very much for your feedback! :D

Sorry, I don't know much about cryptography; actually, the current rRES implementation doesn't consider any kind of encryption, I just added that field on the recommendation of some gamedev friends. Thanks for the libsodium reference, I've just been checking libtomcrypt.

About compression modes, I just added the most common ones for reference; currently only DEFLATE compression is supported. Again, I'm not an expert in that field either.

About vertex data: at first I designed it as a full mesh (one single resource) with bit fields defining the provided attributes, but I realized it was simpler to just store every vertex array independently, which gives the user more control over the stored vertex data. Additionally, I added the partsCount field to the InfoHeader; it's used to define resources that consist of multiple parts (one after the other, same resource ID).

DataType, CompressionType, EncryptionType, ImageFormat and VertexFormat are just enums; that way rRES can be extended by simply adding the required formats, for example RRES_VERT_CUSTOM1 in VertexType (actually, the enum naming has been adapted to be more descriptive).
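A rough sketch of how the enum-based extension and the fixed header could look in C. The field order, the dataSize field, the enum values other than the names mentioned in this thread, and the use of params[0]/params[1] for vertex data are all assumptions made for illustration; the real layout is defined by the rRES diagram, not by this snippet.

```c
#include <assert.h>
#include <stdint.h>

/* New kinds are appended at the end so existing values never change. */
typedef enum { RRES_RAW = 0, RRES_IMAGE, RRES_TEXT, RRES_VERTEX } DataType;

typedef enum {
    RRES_VERT_POSITION = 0,
    RRES_VERT_TEXCOORD1,
    RRES_VERT_NORMAL,
    RRES_VERT_CUSTOM1 /* extension added without touching older values */
} VertexType;

/* Assumed header: id, four 1-byte fields, payload size, 4 int parameters. */
typedef struct {
    uint32_t id;         /* shared by all parts of one resource */
    uint8_t  dataType;   /* DataType */
    uint8_t  compType;   /* CompressionType */
    uint8_t  cryptoType; /* EncryptionType */
    uint8_t  partsCount; /* number of consecutive blocks with this id */
    uint32_t dataSize;   /* assumed: payload bytes following the header */
    uint32_t params[4];  /* meaning depends on dataType */
} InfoHeader;

static_assert(sizeof(InfoHeader) == 28, "header is expected to have no padding");

/* Build the header for one independently stored vertex array. */
InfoHeader make_vertex_header(uint32_t id, VertexType type, uint32_t vertexCount,
                              uint32_t dataSize, uint8_t partsCount)
{
    InfoHeader h = {0};
    h.id = id;
    h.dataType = RRES_VERTEX;
    h.partsCount = partsCount;
    h.dataSize = dataSize;
    h.params[0] = vertexCount;    /* assumed parameter layout */
    h.params[1] = (uint32_t)type; /* assumed parameter layout */
    return h;
}
```

Storing the enum values in single bytes keeps the header compact while leaving room for up to 256 values per field.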

Jeroen van Rijn

#10280

January 8, 2017

Let me preface this by saying it's not a bad design, but I think improvements can be had.

While I agree with Mārtiņš Možeiko about the encryption part of the format, my main concern with the file format as proposed is perhaps a bit more fundamental still: rRES seems to be structured in a way that's perhaps too specific.

While it's certainly a good thing to specify the various types of resources, maybe having 4 parameters - which can be used or reserved for future use - isn't the best idea. What if you want to add a new type of resource that needs 5 or more parameters? What if the majority of the objects you're packing into this resource file need no parameters at all?

I don't know what the 'part' field does in type, comp, crypt, part. Maybe it's there to split up a given resource into more than 1 part? Why would you want to split up an image into more than 1 part? Or is this intended for a font with at most 256 glyphs, where each glyph is its own part?

This is one of those things that feels very specific, but where it's unclear that you're actually going to use this option field in question. Might this byte instead be used as a 'param count' or 'param bytes'? The latter would give you a bit of flexibility, where each resource can have between 0 and 255 bytes worth of params, laid out as makes sense for that resource type.

(Edit: I see now in your reply above that it stands for parts count. Might this not be a parameter? Some types of resources wouldn't be split up like this.)

You can still use the width, height, format, mipmaps for RRES_IMAGE, you'd just set param_bytes to 16.

Alternatively, let the type of resource determine how many bytes of parameters directly follow the header, preceding the actual payload. If you see it's a type RRES_IMAGE, you know the next 16 bytes are parameters. If you see it's RRES_TEXT, there's only 8 bytes worth of params.
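The type-determined variant could be sketched as below. The 16 and 8 byte counts come from the paragraphs above; the numeric enum values and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RRES_IMAGE = 1, RRES_TEXT = 2 }; /* numeric values are hypothetical */

/* Parameter bytes that directly follow the header, or -1 for an unknown type. */
int param_bytes_for_type(uint8_t type)
{
    switch (type) {
        case RRES_IMAGE: return 16; /* width, height, format, mipmaps */
        case RRES_TEXT:  return 8;
        default:         return -1;
    }
}

/* Offset of the payload from the start of a resource block,
   or -1 when the type (and therefore the parameter size) is unknown. */
long payload_offset(size_t headerBytes, uint8_t type)
{
    assert(headerBytes > 0); /* a block always starts with a header */
    int paramBytes = param_bytes_for_type(type);
    if (paramBytes < 0) return -1;
    return (long)headerBytes + paramBytes;
}
```

One consequence of this variant: a reader that meets a type it does not know cannot tell how many parameter bytes to skip, whereas an explicit param_bytes field in the header would let it step over them.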

Additionally, as a resource pack file format, one thing I seem to be missing is a central directory. It seems you need to know the ID of the resource you need and keep this information elsewhere. The resource blocks themselves don't appear to have a place to store a filename or other identifier, and there's no directory in the diagram.

Perhaps a good addition would be to have the final entry be an RRES_DIRECTORY, which allows you to identify the resources by a name or some other identifier, and which has an offset into the file for the resource in question.

The last 32 bits of the file could be the directory length, so you could open the file and verify the rRES header, then seek to the end, read the directory length and jump back that many bytes and confirm you've landed on RRES_DIRECTORY. Alternatively, add a 32 bit DirectoryOffset to the FileHeader after `count`? If that DirectoryOffset == 0, it means you know what each of the resources are and can identify them by their ID in another way, with the directory omitted from the file. This would give you an optional directory.
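The "seek to the end and jump back" lookup might look like this on a file already loaded into memory. It assumes that the trailing length is little-endian and counts only the directory block (not the 4 trailing bytes), and that the block starts with a one-byte type tag; the tag value and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RRES_DIRECTORY 0xD1u /* hypothetical type tag, first byte of the block */

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the directory block, or -1 if it is absent or the
   trailing length does not lead to an RRES_DIRECTORY tag. */
long find_directory(const uint8_t *file, size_t size)
{
    assert(file != NULL);
    if (size < 4) return -1;

    uint32_t dirLength = read_u32le(file + size - 4); /* last 32 bits */
    if (dirLength == 0 || dirLength > size - 4) return -1;

    size_t start = size - 4 - dirLength;              /* jump back */
    if (file[start] != RRES_DIRECTORY) return -1;     /* confirm landing */
    return (long)start;
}
```

A length of 0 is treated as "no directory" here, mirroring the optional-directory idea of DirectoryOffset == 0.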

Some ideas to ponder :)
It's not a bad design, but I think there's probably a few lessons to be learned from the RIFF format and how extensible it is, the PKZIP format and its directory structure, and Per Vognsen's GOB (which riffs on RIFF and allows zero-copy use of assets).

Edited by Jeroen van Rijn on January 8, 2017, 1:49pm Reason: parts

Ray

#10294

January 8, 2017

Thanks for the reply Jeroen! Very interesting! Still thinking about your points! :D

Kelimion
While it's certainly a good thing to specify the various types of resources, maybe having 4 parameters - which can be used or reserved for future use - isn't the best idea. What if you want to add a new type of resource that needs 5 or more parameters? What if the majority of the objects you're packing into this resource file need no parameters at all?

Actually, the current implementation (3 years old) uses a custom number and size of parameters per type; I just changed that now for simplicity. My reasoning: every resource has 4 int parameters, use them as you like; if a resource type needs more than that, make them fit or divide the resource type into multiple resource types, using partsCount to link the multiple resource parts with the same ID.

For example:

Mesh (3 vertex arrays with position, texcoords, normal) --> Can be packed as 3 resources of type RRES_VERTEX and vertexType POSITION, TEXCOORD1, NORMAL; partsCount for the 3 resources would be 3, same ID, one after the other.

SpriteFont (spritefont image, chars data) --> Can be packed as 2 resources of type RRES_IMAGE and RRES_FONT_INFO (baseSize, charsCount, reserved, reserved) - just an int array with the required font data (value, rectangle, offsets, xadvance); partsCount is 2, same ID.
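The two examples above can be sketched as a consistency check over consecutive blocks: all parts of a resource share one ID and carry the same part count. The PartInfo struct is a trimmed-down, hypothetical stand-in for the InfoHeader, and the enum values are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RRES_IMAGE = 1, RRES_VERTEX = 3, RRES_FONT_INFO = 4 }; /* values assumed */

/* Only the header fields needed to link parts together. */
typedef struct {
    uint32_t id;
    uint8_t  dataType;
    uint8_t  partsCount;
} PartInfo;

/* Returns how many parts belong to the resource starting at index `first`,
   or 0 if the following blocks do not form a consistent group. */
int count_linked_parts(const PartInfo *blocks, int blockCount, int first)
{
    assert(blocks != NULL && first >= 0 && first < blockCount);

    int expected = blocks[first].partsCount;
    if (expected == 0 || first + expected > blockCount) return 0;

    for (int i = 1; i < expected; i++) {
        /* Every part must repeat the same id and the same part count. */
        if (blocks[first + i].id != blocks[first].id ||
            blocks[first + i].partsCount != expected) return 0;
    }
    return expected;
}
```

For a mesh stored as three RRES_VERTEX blocks followed by a sprite font stored as RRES_IMAGE plus RRES_FONT_INFO, the function would report 3 for the first group and 2 for the second.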

Well, it seemed simpler to me to just set a fixed InfoHeader size with a fixed number of parameters, even if it wastes some bytes (and I have to say that I come from microcontroller programming, where every bit counts!). I'll think a bit more about it...

Kelimion
Perhaps a good addition would be to have the final entry be an RRES_DIRECTORY, which allows you to identify the resources by a name or some other identifier, and which has an offset into the file for the resource in question.

The idea was to keep the resource ID as the unique identifier for every resource (or resource parts). In the current implementation, when the .rres is generated from files, a .h file is also generated containing a bunch of lines like:

#define RRES_background_png 0x0204F75A

The .h file can be included in the project and just load every resource like:

RRESData data = LoadResource("resources.rres", RRES_background_png);

Exposing only the ID adds an extra level of security over the assets (resource crypto keys could also belong to the program in the same way...). That's how it works now but I liked the idea of the RRES_DIRECTORY to store that table data in the same rres, just in case.

I'll check the RIFF format and GOB, thanks for the link and the extensive review! :)

Edited by Ray on October 6, 2018, 8:22am

Jeroen van Rijn

#10296

January 9, 2017

raysan5
That's how it works now but I liked the idea of the RRES_DIRECTORY to store that table data in the same rres, just in case.

I'll check the RIFF format and GOB, thanks for the link and the extensive review! :)

No problem :)

What I like about the central directory is that you can more quickly jump to a specific resource within the pack without having to read each resource's info struct, determine its length and jump to the next one until you land on the resource you're interested in. Perhaps you do want to load all of them in some cases, but there will also be cases where you'd want to page them in.
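The difference can be illustrated with two lookups over a pack held in memory. Everything here is a simplified assumption: an 8-byte header holding only a little-endian id and dataSize, and a directory that is just an array of id/offset pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_BYTES 8u /* assumed minimal header: id (4) + dataSize (4) */

typedef struct {
    uint32_t id;
    uint32_t offset; /* byte offset of the block inside the pack */
} DirEntry;

uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Without a directory: read each header and hop over its payload.
   Returns the block offset or -1; *headersRead counts the headers touched. */
long find_by_scan(const uint8_t *pack, size_t size, uint32_t wantedId,
                  int *headersRead)
{
    assert(pack != NULL && headersRead != NULL);
    size_t pos = 0;
    *headersRead = 0;
    while (size - pos >= HEADER_BYTES) {
        uint32_t id = read_u32le(pack + pos);
        uint32_t dataSize = read_u32le(pack + pos + 4);
        (*headersRead)++;
        if (id == wantedId) return (long)pos;
        if (dataSize > size - pos - HEADER_BYTES) return -1; /* truncated pack */
        pos += HEADER_BYTES + dataSize;
    }
    return -1;
}

/* With a directory: the lookup never touches the resource blocks. */
long find_by_directory(const DirEntry *dir, int count, uint32_t wantedId)
{
    assert(dir != NULL);
    for (int i = 0; i < count; i++)
        if (dir[i].id == wantedId) return (long)dir[i].offset;
    return -1;
}
```

When the pack lives on disk, each step of the scan is a seek plus a read, while the directory can be read once and kept in memory.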

Of course you might say that this .h file includes not only the ID but also the offset into the pack file (or just the offset for that matter, because your asset loader can then read its info struct), but then you'd need to recompile your program every time you want to ship a new version of your assets, which feels less than ideal.

There's a trade-off to be had there, for sure. If you wanted to make it less obvious what's in the pack even when you have this central directory, you could consider encrypting the identifiers as well, so this directory would have an 'encrypted' field just like the other resources.

Edited by Jeroen van Rijn on January 9, 2017, 12:24am

Ray

#13712

December 10, 2017

This weekend I've been working again on my custom rRES file format.

Last week Milan Nikolic (github: gen2brain), creator of raylib-go, implemented the initial design in Go and also created a command line tool.

Using the advice provided in this forum and carefully reviewing the RIFF and ZIP file formats, I improved my previous design:

Now every resource can be divided into several chunks (useful for resources like SpriteFonts or Meshes that consist of several sets of data); every chunk is defined by a chunkType, compressionType and cryptoType and can contain a variable number of parameters (4 bytes each). I also added CRC32 support for every chunk.
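A sketch of what such a chunk could look like. The CRC-32 routine is the standard reflected one (polynomial 0xEDB88320, as used by ZIP and PNG); the header field order, the paramsCount and dataSize fields, and the choice to checksum only the payload are assumptions, since the actual layout is given by the design diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320. */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical chunk header; paramsCount 4-byte parameters follow it,
   then dataSize payload bytes. */
typedef struct {
    uint8_t  chunkType;
    uint8_t  compType;
    uint8_t  cryptoType;
    uint8_t  paramsCount;
    uint32_t dataSize;
    uint32_t crc32; /* assumed to cover the payload bytes only */
} ChunkHeader;

static_assert(sizeof(ChunkHeader) == 12, "header is expected to have no padding");

/* Total bytes occupied by header + parameters + payload. */
size_t chunk_total_size(const ChunkHeader *h)
{
    return sizeof *h + 4u * (size_t)h->paramsCount + h->dataSize;
}

/* Returns 1 when the stored checksum matches the payload. */
int chunk_payload_is_valid(const ChunkHeader *h, const uint8_t *payload)
{
    return crc32_compute(payload, h->dataSize) == h->crc32;
}
```

A loader would typically verify the checksum before decompressing or decrypting the chunk, so corrupted data is rejected early.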

A Central Directory is also supported as an additional Resource type (which may or may not be present).

As usual, I tried to keep it simple, just adding the most relevant information... chunk support complicates the design a bit but makes it more versatile and customizable.

Any feedback is very welcome! :)