[Scummvm-devel] kMD5FileSizeLimit = 1024 * 1024

Max Horn max at quendi.de
Fri Aug 4 22:07:19 CEST 2006


Yo,

to get matters back on topic and back to a more neutral tone (sorry  
if the end of my previous mail was a bit harsh, but I felt quite  
irritated), let me recap my proposal, the reasoning behind it, and  
the single problem with it that I am aware of. If you are aware of
further issues, please speak up. And if you are one of the maintainers
of Kyra, Lure, or Gob, feel free to add your view regarding the
"ISSUES" entry below, and state whether it is a problem for your
engine or not.


PROPOSAL: The constant kMD5FileSizeLimit is currently set to
1024*1024 (= 1 MB) in four engines (SCUMM, Kyra, Lure, Gob). I propose
that we reduce this to a lower number, something between 4 and 128 KB,
depending on the engine.
In particular, for reasons recently discussed on this list, the SCUMM
engine would be changed to use 128 KB. I am not sure what would be
suitable limits for the other engines.


BACKGROUND: Some game engines (SCUMM, Kyra, Lure, Gob) use a constant  
named kMD5FileSizeLimit. Each has its own "copy" of that constant,
i.e. the value can be adjusted on a per-engine basis. They use that
constant to control how much of a file is used when computing a
"fingerprint" of that file. (Note that the "simon" and "saga" engines
have a similar constant named FILE_MD5_BYTES.) If a file is smaller
than the limit, all of it is used to compute the fingerprint;
otherwise only the first kMD5FileSizeLimit bytes are used and the
rest is ignored. The
engines use this fingerprint to recognize and distinguish (variants  
of) games they support. The fingerprint is *not* meant to prevent  
users from tampering with the file or provide any other security  
features. A different hash function could be used, but for historical  
reasons we are using MD5.
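
To make the above concrete, here is a minimal sketch in Python (not the
actual engine code, which is C++). The names fingerprint and
MD5_FILE_SIZE_LIMIT are made up for illustration, and the sketch assumes
the bytes are taken from the start of the file:

```python
import hashlib

# Hypothetical stand-in for the per-engine constant; 1024 * 1024 is the
# value currently used by the engines named above.
MD5_FILE_SIZE_LIMIT = 1024 * 1024


def fingerprint(path, limit=MD5_FILE_SIZE_LIMIT):
    """Return the MD5 hex digest of at most the first `limit` bytes of `path`."""
    with open(path, "rb") as f:
        # read(limit) returns the whole file if it is smaller than `limit`,
        # otherwise only the first `limit` bytes; the rest is never read.
        data = f.read(limit)
    return hashlib.md5(data).hexdigest()
```

With a lower limit, less of each large file has to be read from disk,
while files smaller than the limit keep exactly the same fingerprint.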


JUSTIFICATION: In most cases it's really sufficient to use the first  
couple KB of a file to get a fingerprint suitable for our purposes.  
At the same time, on many systems, disk I/O is a bottleneck compared  
to CPU and memory speeds. Hence, using less data to compute the  
fingerprint has a positive effect on performance. Especially on older
systems this becomes noticeable when scanning many files. On newer
systems no speedup (and hence no benefit) may be visible; however, the
change still has absolutely no negative implications.


ISSUES: We have to recompute some MD5 checksums, namely whenever the
file used for detection is larger than the new value of
kMD5FileSizeLimit. If only very
few game variants are affected, this shouldn't be an actual problem.  
However, if many game variants are affected, and especially if those  
game variants are not all in the possession of team members, it might  
become infeasible to recompute all these fingerprints. I rely on
engine maintainers to find out whether this is the case for Gob,
Kyra, or Lure. In each case, if this turns out to be a problem, we
can either abandon this proposal for that engine, or implement a
transition approach where, for a time, we allow both fingerprints to
coexist while we wait for external contributors to provide the
missing "new" fingerprints.
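
A rough sketch of how such a transition could look, again in Python for
brevity. The tables known_new and known_old and all function names are
hypothetical, the 128 KB value is the one proposed for SCUMM above, and
the sketch works on in-memory bytes rather than files to stay short:

```python
import hashlib

OLD_LIMIT = 1024 * 1024  # current value of the limit
NEW_LIMIT = 128 * 1024   # value proposed above for SCUMM


def md5_prefix(data, limit):
    """MD5 hex digest of at most the first `limit` bytes of `data`."""
    return hashlib.md5(data[:limit]).hexdigest()


def detect(data, known_new, known_old):
    """Look up a game variant, preferring the new fingerprint.

    `known_new` and `known_old` map fingerprints to variant names.
    Returns None if neither fingerprint is known.
    """
    new_fp = md5_prefix(data, NEW_LIMIT)
    if new_fp in known_new:
        return known_new[new_fp]
    # Fall back to the old fingerprint until a new one has been contributed.
    return known_old.get(md5_prefix(data, OLD_LIMIT))
```

Note that for files no larger than the new limit both fingerprints are
identical, so only larger files ever need the fallback.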

No other issues with the proposal are currently known to me.





Bye,
Max



