Internals Last updated on 05/18/16
Karazeh Internals
Current dependencies
- boost 1.46+ for the filesystem library
- cURL
- CMake 2.8+ for building
Embedded Dependencies
Identifying the application version
Karazeh determines which version of the application the user is currently running by calculating checksums of a number of files referred to as the identity files. Those checksums are joined together, and the resulting "blob" is digested into another checksum known as the identity checksum. To Karazeh, the identity checksum is the version of the running application.
Identity files could be a single binary file (the application's executable, for example) or a list of files, such as the executable joined with the core data archive or a critical script component. Using identity files guarantees that if any of them is tampered with, the identity checksum will no longer match the expected one, and the version will be reported as unidentified or corrupt.
Finally, each release can define its own list of identity files. For example, a game might use its binary as the identity file until an expansion introduces another binary, at which point the expansion releases can define both binaries as the identity files.
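To make the scheme above concrete, here is a minimal sketch of computing an identity checksum. It is not Karazeh's actual code: the hash function is not named in this document, so SHA-256 is an assumption, and `file_checksum` and `identity_checksum` are hypothetical names.

```python
import hashlib


def file_checksum(path, chunk_size=64 * 1024):
    """Hex digest of one file, read in chunks so large files stay cheap."""
    digest = hashlib.sha256()  # assumption: the real hash function is not stated here
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identity_checksum(identity_files):
    """Join the per-file checksums in list order, then digest the joined blob."""
    blob = "".join(file_checksum(path) for path in identity_files)
    return hashlib.sha256(blob.encode("ascii")).hexdigest()
```

Because the blob is built in list order, the same identity files must be listed in the same order on both the release side and the client side, otherwise the two digests will not match.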
Versioning schemes
Because the application version is identified through identity files, Karazeh does not force any particular versioning scheme on you. In fact, a "version", or tag as it is referred to internally, is nothing but a string that carries no meaning to Karazeh. You, the application developer, can use the tag to label your versions in whatever way you see fit. The tag is also what is commonly displayed to the user, since it is much friendlier than a hex digest (the identity checksum).
Examples of some popular versioning schemes: 1.0.3-rc1, 10.6, b01cd.0, 10.7 Lion, etc.
Notice how you can use "codenames" as well as numbered schemes. Define the tag the way you want and parse it the way you want; Karazeh will not interfere.
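As a rough illustration of the tag being an opaque label, the sketch below resolves an identity checksum to a display tag without ever inspecting the tag itself. The table, the shortened digests, and the name `tag_for` are all hypothetical.

```python
# Hypothetical table: identity checksum -> tag. Digests are shortened for readability.
KNOWN_RELEASES = {
    "3f2a9c": "1.0.3-rc1",
    "b01cd0": "10.7 Lion",
}


def tag_for(identity, releases=KNOWN_RELEASES):
    """Return the display tag for an identity checksum, or None if unidentified."""
    return releases.get(identity)
```

Nothing here parses or compares tags; ordering and meaning are left entirely to the application.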
Generating binary diffs
Binary files and data archives are usually very large, and it's inefficient to force the user to re-download them every time they're updated. Karazeh can help you transmit only the parts that change (to an extent) in those binary files via a delta encoding solution, librsync.
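For a feel of the delta workflow, the rdiff command-line tool that ships with librsync exposes it as three steps: signature, delta, and patch. The sketch below only builds and optionally runs those commands; the helper names and file names are hypothetical, and this is not how Karazeh itself drives librsync.

```python
import shutil
import subprocess


def rdiff_commands(old_file, new_file, signature, delta, rebuilt):
    """Return the three rdiff invocations: signature, delta, patch."""
    return [
        # 1. summarize the basis (old) file into a signature
        ["rdiff", "signature", old_file, signature],
        # 2. encode the new file against that signature
        ["rdiff", "delta", signature, new_file, delta],
        # 3. rebuild the new file from the old file plus the delta
        ["rdiff", "patch", old_file, delta, rebuilt],
    ]


def run_rdiff(commands):
    """Run the commands in order; raises if rdiff is missing or a step fails."""
    if shutil.which("rdiff") is None:
        raise FileNotFoundError("rdiff executable not found on PATH")
    for argv in commands:
        subprocess.run(argv, check=True)
```

Only the delta file needs to travel to the user; the signature and delta steps happen on the release side, the patch step on the client.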
The hunt for a delta encoding solution
The difficulty lay in the number of requirements the library had to meet to be usable in Karazeh:
- memory efficiency in both encoding and decoding routines
- licensing compatibility (not proprietary, nor GPL)
- support for basis files large enough to accommodate today's binary file requirements (some games have data archives as large as 20 GB)
- delta filesize efficiency
- cross-platform operability
So in my hunt for a solution, I came across the following:
- bsdiff: unforgivably fast and efficient, but just as much memory-hungry; it didn't satisfy requirement #1
- xdelta3: the most memory efficient solution I've tried, and the fastest, with a bit higher patch file sizes but that's a trivial price to its speed - but it's GPL licensed
- open-vcdiff: maybe I didn't know how to use it, but it ate all my memory while patching a 1GByte archive just like bsdiff did
Finally, I met rdiff/librsync and it won on all grounds: license compatibility, memory requirements, speed, and patch file size.
Operations
The following breakdown satisfies the operation requirements: Creation, Renaming, Updating, and Deletion of files (or directories, when applicable.)
create
Arguments:
- the source file: this is the file that will be fetched from the server and "copied" to a final destination
- the destination: the fully qualified path the file should be placed at when the patch is committed
Staging
- verify that no file exists at the destination
- verify that the running user has write permissions
- verify that there's enough space to hold the file
- fetch the file and store it in the staging repository
- validate the file's integrity, and redownload if necessary
Deployment
- move the file from the staged source to the destination
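The staging checks for create can be sketched roughly as follows. This is an illustration only: `StagingError`, `stage_create_checks`, and `staged_file_is_valid` are hypothetical names, and SHA-256 is an assumption since the checksum algorithm is not named here.

```python
import hashlib
import os
import shutil


class StagingError(Exception):
    """Raised when a staging check fails (hypothetical name)."""


def stage_create_checks(destination, expected_size):
    """Run the checks that precede the download of a created file."""
    if os.path.exists(destination):
        raise StagingError(f"destination already exists: {destination}")
    parent = os.path.dirname(os.path.abspath(destination))
    if not os.access(parent, os.W_OK):
        raise StagingError(f"cannot write to: {parent}")
    if shutil.disk_usage(parent).free < expected_size:
        raise StagingError(f"not enough free space for {expected_size} bytes")


def staged_file_is_valid(staged_path, expected_checksum, expected_size):
    """Compare the fetched file's size and digest with the manifest values."""
    if os.path.getsize(staged_path) != expected_size:
        return False
    digest = hashlib.sha256()  # assumption: the real hash function is not stated here
    with open(staged_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_checksum
```

A caller would re-download the file whenever `staged_file_is_valid` returns False, and only move it to the destination once it passes.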
create in the release manifest:
<create>
<source checksum="1234" size="500"><![CDATA[/path/to/source]]></source>
<destination><![CDATA[/path/to/destination]]></destination>
</create>
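A small sketch of reading such an entry, using Python's standard ElementTree parser. The function name `parse_create` and the shape of the returned dictionary are hypothetical; only the element and attribute names come from this document.

```python
import xml.etree.ElementTree as ET

ENTRY = """<create>
<source checksum="1234" size="500"><![CDATA[/path/to/source]]></source>
<destination><![CDATA[/path/to/destination]]></destination>
</create>"""


def parse_create(xml_text):
    """Extract the arguments of a single <create> entry."""
    node = ET.fromstring(xml_text)
    source = node.find("source")
    return {
        "source": source.text.strip(),           # CDATA content arrives as plain text
        "checksum": source.get("checksum"),
        "size": int(source.get("size")),
        "destination": node.find("destination").text.strip(),
        "executable": node.get("executable") == "true",
    }
```

Calling `parse_create(ENTRY)` yields the source path, checksum, size, and destination from the fragment above, with `executable` false because the attribute is absent.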
update
Arguments:
- the source file; the file to be patched
- the source file's checksum
- the patch file
- the patch file's checksum
- the patch file's size
Staging
- verify that the source exists
- validate the integrity of the source
- verify that there's enough space to hold the patch file and the backup of the source file
- fetch the patch file
- validate the integrity of the patch file
- create a backup of the source file
Deployment
- apply the patch on the clone (aka backup)
- validate the integrity of the patched file:
  - if the integrity test fails, announce a rollback
- remove the source file
- move the patched file to the source's destination
update in the release manifest:
<update>
<target pre-checksum="1234" post-checksum="5678"><![CDATA[/path/to/file]]></target>
<patch checksum="5678" size="100"><![CDATA[/path/to/patch]]></patch>
</update>
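The deployment half of update might look something like this sketch. The patching step is passed in as `apply_patch` because the real call into librsync is outside the scope of this example; the `.backup` and `.patched` suffixes, the function names, and the use of SHA-256 are all assumptions.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=64 * 1024):
    """Hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def deploy_update(target, patch, post_checksum, apply_patch):
    """Patch a backup of target, verify the result, then swap it into place.

    apply_patch(basis_path, patch_path, output_path) is supplied by the caller.
    Returns True on success and False when a rollback should be announced.
    """
    backup = target + ".backup"    # hypothetical naming
    patched = target + ".patched"  # hypothetical naming
    shutil.copy2(target, backup)           # create a backup of the source file
    apply_patch(backup, patch, patched)    # apply the patch on the backup
    if sha256_of(patched) != post_checksum:
        os.remove(patched)                 # the original target is left untouched
        return False
    os.remove(target)                      # remove the source file
    os.replace(patched, target)            # move the patched file into its place
    return True
```

The original file is only removed after the patched copy has passed its integrity check, so a failed patch never leaves the application without a working file.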
rename
Arguments:
- the fully qualified source path
- the fully qualified destination path
Staging
- verify that the source exists
- verify that the destination is clear
Deployment
- move the source to the destination
rename in the release manifest:
<rename>
<from checksum="[1234]"><![CDATA[/path/to/file]]></from>
<to><![CDATA[/new/path/to/file]]></to>
</rename>
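A minimal sketch of the two rename phases, with hypothetical function names:

```python
import os
import shutil


def stage_rename(source, destination):
    """Verify the source exists and the destination is clear."""
    if not os.path.exists(source):
        raise FileNotFoundError(f"source does not exist: {source}")
    if os.path.exists(destination):
        raise FileExistsError(f"destination is not clear: {destination}")


def deploy_rename(source, destination):
    """Move the source to the destination (works across filesystems)."""
    shutil.move(source, destination)
```

Keeping the checks in the staging phase means a bad rename is caught before any file in the release has been touched.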
delete
Arguments:
- the fully qualified source path
Staging
- verify that the source exists
Deployment
- if the source is a directory, recursively empty its contents
- remove the source
delete in the release manifest:
<delete>
<target checksum="[1234]"><![CDATA[/path/to/file-or-directory]]></target>
</delete>
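The delete steps can be sketched in a few lines; the function names are hypothetical.

```python
import os
import shutil


def stage_delete(source):
    """Verify that the source exists."""
    if not os.path.lexists(source):
        raise FileNotFoundError(f"source does not exist: {source}")


def deploy_delete(source):
    """Remove a file, or recursively empty and remove a directory."""
    if os.path.isdir(source) and not os.path.islink(source):
        shutil.rmtree(source)
    else:
        os.remove(source)
```

Symbolic links are removed as links rather than followed, so deleting a link to a directory does not empty the directory it points at.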
Notes
- <create executable="true"> will cause the patcher to mark the created file as executable (chmod 0711)
- <delete> entries will internally mark those paths as "to be deleted", so any subsequent <create> entry with one of those paths will know that they will be deleted, and will not cause a staging error; effectively, we achieve the effect of <replace> without having to implement any!
- running with -v will cause the resource_manager to print out the content of all downloaded files
- <delete> recursively removes directories as well as files
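The interplay between delete and create entries described in the notes could be tracked along these lines. `StagingArea` and its methods are hypothetical names for illustration, and `mark_executable` simply applies the 0711 mode mentioned above.

```python
import os


class StagingArea:
    """Remembers paths that earlier <delete> entries will remove (hypothetical helper)."""

    def __init__(self):
        self.pending_deletions = set()

    def mark_for_deletion(self, path):
        self.pending_deletions.add(os.path.normpath(path))

    def create_is_allowed(self, destination):
        """A create may target a free path, or one that is already marked for deletion."""
        path = os.path.normpath(destination)
        return path in self.pending_deletions or not os.path.exists(path)


def mark_executable(path):
    """Apply the mode used for <create executable="true"> entries."""
    os.chmod(path, 0o711)
```

A delete followed by a create on the same path therefore passes staging, which gives the effect of a replace without a dedicated operation.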
The Version Manifest
Template: [https://raw.github.com/amireh/karazeh_v2/master/doc/version_manifest.template.xml](https://raw.github.com/amireh/karazeh_v2/master/doc/version_manifest.template.xml)
The Release Manifest
Template: [https://raw.github.com/amireh/karazeh_v2/master/doc/release_manifest.template.xml](https://raw.github.com/amireh/karazeh_v2/master/doc/release_manifest.template.xml)