[Scummvm-devel] How file handling works

Max Horn max at quendi.de
Mon Oct 13 14:25:16 CEST 2008


(Somebody might want to extend & correct this and put it into the Wiki)

Hi all,

today I was quite surprised to read this statement: "The file handling  
in ScummVM has evolved to the point where I no longer
understand it." I was surprised because I feel that it now is a lot  
simpler than it used to be. But maybe things are not self-explanatory  
enough for everybody else to share that view. Hence I'll try to  
briefly sketch how things work.

Super brief quick reference
===========================
* Use File::open to open files (for reading) as before.
* However, do *not* use (absolute) paths with File::open() !
* an FSNode is kind of a "portable path". If you need to process a  
path (e.g. coming from the config file), first create an FSNode from  
it, then use that for whatever you need to (e.g. pass the FSNode to  
File::open)


How to open a file
==================
Here is just a quick guide on the various ways to open a file, without  
explaining much background.

* Most of you can simply keep using class Common::File as before. The  
main changes in it:
  - you must not pass absolute paths to File::open() ! If you must  
open a file using a path, the correct way is to first create an FSNode  
from the path, then pass that to File::open.
  - you can pass relative paths in a limited fashion; you must use the  
"/" character as separator. I am too lazy to provide details on this  
right now; but check out the doxygen docs of class FSDirectory.
  - there is a new File::open method: you can pass an instance of an
"Archive" subclass (e.g. a ZIPArchive) to it, and it will search for
that file in the Archive.
  - in particular, the new file SearchManager (short: SearchMan) is
such an Archive subclass, and can wrap arbitrary paths, ZIP archives,
etc. By providing your own Archive subclasses, you can extend this
arbitrarily.

* If you have an FSNode and want to read from the corresponding file,  
you can use File::open(FSNode). Internally, this calls  
FSNode::openForReading(). Alternatively, you can use this method  
directly (but then you also must delete the ReadStream returned by it  
later on). This can be useful if you need to keep a pointer to the  
file stream anyway. E.g. assume you used to do this:
   Common::File *f = new Common::File;
   if (f && f->open(node))
     return f;
   else {
     delete f;
     return 0;
   }
you can now do this instead:
   return node.openForReading();


* You can also invoke the SearchManager directly to open a file with a
specific name, while looking for it in various places (the game path,
extrapath, DATADIR, current dir, etc.): SearchMan.openFile(filename).
This is almost exactly what File::open(String) does, except that the
latter also tries to open the filename with a dot appended. I.e. if you do
   file.open("foo")
then if it can't find "foo", it also tries to open "foo." (this is to
work around problems on some systems, where an old DOS file without an
extension may incorrectly get a dot appended when copying it).
If you don't need this extra lookup, and if you may want to use the  
Stream directly, instead of wrapping it in a Common::File instance,  
you can use the SearchMan directly. To continue my example from above,  
instead of
   Common::File *f = new Common::File;
   if (f && f->open(filename))
     return f;
   else {
     delete f;
     return 0;
   }
you could now do this instead (with almost identical meaning, except  
for the "trailing dot" hack):
   return SearchMan.openFile(filename);
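
To make the difference between the two lookups concrete, here is a toy
sketch. Nothing in it is ScummVM code: ToyStream, ToySearchSet and
openWithDotFallback are hypothetical stand-ins, and an in-memory map
takes the place of the real search locations. It only models the two
points made above: the direct lookup hands you a stream you must delete,
and the File::open(String) style lookup additionally retries with a
trailing dot.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a read stream; not ScummVM's ReadStream.
struct ToyStream {
    std::string data;
};

// Hypothetical stand-in for the search manager: file name -> contents.
class ToySearchSet {
public:
    void add(const std::string &name, const std::string &data) {
        _files[name] = data;
    }

    // Same contract as described above: returns a new stream that the
    // caller must delete, or a null pointer if the name is unknown.
    ToyStream *openFile(const std::string &name) const {
        auto it = _files.find(name);
        if (it == _files.end())
            return nullptr;
        return new ToyStream{it->second};
    }

private:
    std::map<std::string, std::string> _files;
};

// Models the extra step attributed to File::open(String): try the plain
// name first, then the same name with a dot appended.
inline ToyStream *openWithDotFallback(const ToySearchSet &set,
                                      const std::string &name) {
    ToyStream *s = set.openFile(name);
    if (!s)
        s = set.openFile(name + ".");
    return s;
}

// Usage: std::unique_ptr takes care of the delete the caller owes.
inline void demoDotFallback() {
    ToySearchSet set;
    set.add("foo.", "old DOS file");

    std::unique_ptr<ToyStream> direct(set.openFile("foo"));
    assert(!direct);  // the plain lookup does not find "foo."

    std::unique_ptr<ToyStream> viaFile(openWithDotFallback(set, "foo"));
    assert(viaFile && viaFile->data == "old DOS file");
}
```

Wrapping the returned pointer in a smart pointer is just one way to
honour the "you must delete it" rule; it is a choice made for this
sketch, not something the ScummVM API does for you.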




The parts of the system
=======================

FSNode (from common/fs.h)
------
Represents one specific file or directory in the filesystem (which may
or may not exist). Think of it as a generalization of a path. Example:
An FSNode could refer to "/home/you/foo.txt" (for the Windows folks,
"C:\Documents\foo.txt"). This might be a file or a directory, or it might
not exist at all. The FSNode provides methods for checking this, though.

Ways to create FSNodes:
* from a simple filename (assumes that the file resides in the current  
directory)
* from another FSNode:
  - by making a copy of an existing FSNode
  - by asking a node for the node of its parent dir
  - by asking a directory FSNode for a child node

Finally, you can create FSNodes from "paths". Caveat: You may not
assume anything about the path format, like what the separator char  
is; in fact, there may not even exist the *concept* of a path  
separator on the target system. Hence, the only valid way to do that  
is to feed a "path" created by another FSNode to this FSNode. I.e. you  
can "serialize" an FSNode to a path, via the FSNode::getPath() method,  
then write that to a config file. Later, you read it back in, and  
create a new FSNode from it. That works fine, as long as you stay on  
the same OS / platform.

Warning: a "path" as returned by FSNode::getPath should *not* be  
passed to File::open(). If you want to open a file at a specific path,  
first create an FSNode from it, then use that to open the file.
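
The "treat paths as opaque" rule can be illustrated with a toy node
class. This is only a sketch under loud assumptions: ToyNode is
hypothetical and is not ScummVM's FSNode, and the ':' separator of the
imaginary platform was picked precisely because it is not what most
code would guess. The caller only ever navigates via getChild() and
getParent(), and only feeds back strings that getPath() produced.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Separator of this imaginary platform. Deliberately not '/' or '\\'.
const char kToySeparator = ':';

// Hypothetical toy node; not ScummVM's FSNode.
class ToyNode {
public:
    ToyNode() {}

    // Rebuild a node from a string previously produced by getPath().
    explicit ToyNode(const std::string &path) {
        std::string part;
        for (char c : path) {
            if (c == kToySeparator) {
                _parts.push_back(part);
                part.clear();
            } else {
                part += c;
            }
        }
        if (!part.empty())
            _parts.push_back(part);
    }

    ToyNode getChild(const std::string &name) const {
        ToyNode n(*this);
        n._parts.push_back(name);
        return n;
    }

    ToyNode getParent() const {
        ToyNode n(*this);
        if (!n._parts.empty())
            n._parts.pop_back();
        return n;
    }

    // Serialize to a string whose format only ToyNode needs to know.
    std::string getPath() const {
        std::string out;
        for (std::size_t i = 0; i < _parts.size(); ++i) {
            if (i > 0)
                out += kToySeparator;
            out += _parts[i];
        }
        return out;
    }

    std::string getName() const {
        return _parts.empty() ? std::string() : _parts.back();
    }

private:
    std::vector<std::string> _parts;
};

// Usage: serialize, store, and restore without ever parsing the string.
inline void demoRoundTrip() {
    ToyNode game = ToyNode("Disk").getChild("Games").getChild("sky.cpt");
    std::string saved = game.getPath();  // e.g. written to a config file
    ToyNode restored(saved);             // read back on the same platform
    assert(restored.getPath() == saved);
    assert(restored.getName() == "sky.cpt");
    assert(restored.getParent().getName() == "Games");
    assert(saved.find('/') == std::string::npos);
}
```

Code that built the string by hand with "/" would silently produce a
different node on this imaginary platform, which is the reason for the
warning above.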


SearchMan (from common/archive.h)
---------
Think of this class as the analog of the "PATH" environment variable:
It manages a list of locations (usually, just directories in your  
filesystem, but more elaborate things are possible, more on that  
later). When you want to open a file using class File, all these  
locations are searched.

This replaces the old "File::addDefaultDirectory()" system (these  
methods currently still exist for backwards compatibility, but  
directly call through to the SearchManager; they will be scrapped  
eventually).


Archive (from common/archive.h)
-------
This class represents a collection of "files". It can be a file
system directory (implemented by FSDirectory), and then "contains" all
the files in that directory (and possibly also files nested some
levels deep in that dir). It can be a ZIP archive (see
common/unzip.h), and then the "files" in it are members of that ZIP
file. It can even be an accumulation of multiple other archives, and
then it contains all files contained in all of these archives (see
class SearchSet). The SearchManager is in fact just an instance of this
class, too.

This concept is simple yet powerful. By using SearchSet, you can
group together multiple directories and ZIP files, and then with a  
single call, search for files in all of them simultaneously.

You can ask an Archive whether it contains a given file; open a file  
(either via the openFile() method, which returns a ReadStream pointer  
you have to delete later on; or via the new File::open(name, archive)  
method); get a list of all files in the archive; or get a list of  
files matching a specific pattern.
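
The "archive of archives" idea is a classic composite, and a miniature
version may help to see how the pieces fit. The sketch below is not
ScummVM code: ToyArchive, ToyMemoryArchive and ToyCompositeArchive are
hypothetical, a "file" is just a name mapped to a string, and the
method names merely echo the operations listed above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical miniature of the Archive idea; not ScummVM's class.
class ToyArchive {
public:
    virtual ~ToyArchive() {}
    virtual bool hasFile(const std::string &name) const = 0;
    // Returns a new string the caller owns, or nullptr if not found.
    virtual std::string *openFile(const std::string &name) const = 0;
    virtual void listMembers(std::vector<std::string> &out) const = 0;
};

// Stands in for a directory or a ZIP file.
class ToyMemoryArchive : public ToyArchive {
public:
    void add(const std::string &name, const std::string &data) {
        _files[name] = data;
    }
    bool hasFile(const std::string &name) const override {
        return _files.count(name) != 0;
    }
    std::string *openFile(const std::string &name) const override {
        auto it = _files.find(name);
        if (it == _files.end())
            return nullptr;
        return new std::string(it->second);
    }
    void listMembers(std::vector<std::string> &out) const override {
        for (const auto &entry : _files)
            out.push_back(entry.first);
    }

private:
    std::map<std::string, std::string> _files;
};

// An archive made of other archives, searched in the order added.
class ToyCompositeArchive : public ToyArchive {
public:
    void add(std::shared_ptr<ToyArchive> archive) {
        _archives.push_back(archive);
    }
    bool hasFile(const std::string &name) const override {
        for (const auto &a : _archives)
            if (a->hasFile(name))
                return true;
        return false;
    }
    std::string *openFile(const std::string &name) const override {
        for (const auto &a : _archives)
            if (std::string *s = a->openFile(name))
                return s;
        return nullptr;
    }
    void listMembers(std::vector<std::string> &out) const override {
        for (const auto &a : _archives)
            a->listMembers(out);
    }

private:
    std::vector<std::shared_ptr<ToyArchive> > _archives;
};

// Usage: one call searches the "directory" and the "ZIP" together.
inline void demoCompositeArchive() {
    auto gameDir = std::make_shared<ToyMemoryArchive>();
    gameDir->add("monkey.000", "index");
    auto dataZip = std::make_shared<ToyMemoryArchive>();
    dataZip->add("kyra.dat", "engine data");

    ToyCompositeArchive all;
    all.add(gameDir);
    all.add(dataZip);

    assert(all.hasFile("kyra.dat"));
    std::unique_ptr<std::string> s(all.openFile("monkey.000"));
    assert(s && *s == "index");
}
```

Because the composite is itself a ToyArchive, it can be added to another
composite, which mirrors the remark that the SearchManager is just
another Archive.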

Important note: Archives look up files in a case *in*sensitive  
fashion. This is very convenient for most engines, but it can of  
course in some cases lead to clashes. These are normally resolved by  
just taking the first match, and then assuming that normal games do  
not have multiple files with the same name, only differing in case.
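
A small sketch of what "case-insensitive, first match wins" means in
practice. ToyCaselessArchive and toLowerCopy are hypothetical names made
up for this illustration, and lowercasing with std::tolower is an
assumption of the sketch, not a statement about how ScummVM compares
names internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Lowercased copy of a name, used as the comparison key (ASCII only).
inline std::string toLowerCopy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical toy archive; entries are kept in insertion order.
class ToyCaselessArchive {
public:
    void add(const std::string &name, const std::string &data) {
        _entries.push_back(std::make_pair(name, data));
    }

    // Contents of the first entry whose name matches when case is
    // ignored, or nullptr if there is none.
    const std::string *find(const std::string &name) const {
        const std::string key = toLowerCopy(name);
        for (const auto &e : _entries)
            if (toLowerCopy(e.first) == key)
                return &e.second;
        return nullptr;
    }

private:
    std::vector<std::pair<std::string, std::string> > _entries;
};

// Usage: two names that differ only in case clash; the first one wins.
inline void demoCaseInsensitive() {
    ToyCaselessArchive archive;
    archive.add("README.TXT", "upper");
    archive.add("readme.txt", "lower");

    const std::string *hit = archive.find("ReadMe.txt");
    assert(hit && *hit == "upper");
}
```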

Also, if a file simply occurs multiple times (say, an engine-data
file like "kyra.dat" may live in the current dir, extrapath and
gamepath, in multiple versions), then the basic API of Archive will
just give you any one of these. If you need to distinguish multiple
files with the same name (e.g. because you want to pick the "kyra.dat"
with the correct version), there is also an advanced API which lets you
deal with that.
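
The advanced API is not spelled out here, so the following is only a
sketch of the idea, with every name invented for the purpose
(ToyMember, ToyMemberList, findFirst, listMatches, pickVersion). It
assumes that each candidate can report a version number, and shows the
difference between "give me any match" and "give me all matches and let
me choose".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one candidate file; not a ScummVM type.
struct ToyMember {
    std::string name;
    std::string location;  // e.g. "gamepath", "extrapath", "current dir"
    int version;           // version number stored inside the data file
};

// Hypothetical search set that remembers every candidate it was given.
class ToyMemberList {
public:
    void add(const ToyMember &m) { _members.push_back(m); }

    // Basic lookup: some candidate with that name (here: the first).
    const ToyMember *findFirst(const std::string &name) const {
        for (const ToyMember &m : _members)
            if (m.name == name)
                return &m;
        return nullptr;
    }

    // "Advanced" lookup: every candidate with that name, in search order.
    std::vector<ToyMember> listMatches(const std::string &name) const {
        std::vector<ToyMember> out;
        for (const ToyMember &m : _members)
            if (m.name == name)
                out.push_back(m);
        return out;
    }

private:
    std::vector<ToyMember> _members;
};

// Pick the candidate with the wanted version, or nullptr if none has it.
// The returned pointer refers into 'candidates'.
inline const ToyMember *pickVersion(const std::vector<ToyMember> &candidates,
                                    int wanted) {
    for (const ToyMember &m : candidates)
        if (m.version == wanted)
            return &m;
    return nullptr;
}

// Usage: the basic lookup finds an outdated copy, the advanced one lets
// the engine choose the copy it actually supports.
inline void demoPickVersion() {
    ToyMemberList list;
    list.add(ToyMember{"kyra.dat", "current dir", 27});
    list.add(ToyMember{"kyra.dat", "extrapath", 28});

    assert(list.findFirst("kyra.dat")->version == 27);

    std::vector<ToyMember> all = list.listMatches("kyra.dat");
    const ToyMember *good = pickVersion(all, 28);
    assert(good && good->location == "extrapath");
}
```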



Notes to implementors / porters
===============================
Porters can fully customize file I/O by overriding certain methods:
* Provide a custom (Abstract)FSNode subclass if your system handles  
paths / the filesystem differently from one of the existing FSNode  
subclasses.
* If you need custom file I/O code (e.g. because the standard C I/O
stuff, like fopen/fread, is not available/buggy/too limited), you can
override AbstractFSNode::openForReading() and
AbstractFSNode::openForWriting(). The Symbian port already makes use of
that, so look there and/or ask me if you need details. Note that
supporting writable FSNodes is optional, so your subclass can simply
return 0 in its openForWriting() method.

* Savestates are by default handled via FSNodes, too. If you need to
handle savestates differently (like many console ports do), you can do
so by providing a custom SavefileManager implementation (many ports do
that already).

* If your port has the config file in a special location (be it in a
special path, or you want to store it in the Windows registry, or in
some special NVRAM, or whatever -- go nuts with ideas), you can
override the OSystem::openConfigFileForReading() and
OSystem::openConfigFileForWriting() methods.
  (NOTE: Various ports already do it, but I wish the iPhone, PS2,
PSP and PalmOS ports would finally clean out their cruft from
common/system.cpp ;)

* If you want ScummVM to look for files in additional dirs (like, on
OSX, we store the engine-data files like sky.cpt and kyra.dat inside
the application bundle), your port can override the
OSystem::addSysArchivesToSearchSet() method to hook these extra
locations into the SearchMan -- this way, the location is checked
whenever the default File::open method is used, for example. You can
hook up arbitrary Archive classes here (even .zip files), not just
directories, so it's quite flexible.
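
The hook from the last item can be pictured roughly like this. It is a
sketch with made-up types: ToyLocationList, ToySystem and
ToyBundleSystem are hypothetical, the bundle path is invented, and the
real OSystem::addSysArchivesToSearchSet() may take different or
additional parameters. Only the shape of the mechanism is meant to
carry over: common start-up code calls a virtual method, and a port
overrides it to contribute its own locations.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the search set: an ordered list of places.
struct ToyLocationList {
    std::vector<std::string> locations;  // searched front to back
    void add(const std::string &where) { locations.push_back(where); }
};

// Hypothetical stand-in for the backend base class.
class ToySystem {
public:
    virtual ~ToySystem() {}
    // Default backend: nothing extra to search.
    virtual void addSysArchivesToSearchSet(ToyLocationList &) {}
};

// A port that ships its engine data inside an application bundle.
class ToyBundleSystem : public ToySystem {
public:
    void addSysArchivesToSearchSet(ToyLocationList &s) override {
        s.add("Toy.app/Contents/Resources");  // invented location
    }
};

// Common start-up code: add the game path, then ask the active backend
// for whatever extra locations it wants searched.
inline ToyLocationList buildSearchSet(ToySystem &system,
                                      const std::string &gamePath) {
    ToyLocationList s;
    s.add(gamePath);
    system.addSysArchivesToSearchSet(s);
    return s;
}

// Usage: the same start-up code yields a longer list on the bundle port.
inline void demoSysArchives() {
    ToySystem plain;
    ToyBundleSystem bundled;
    assert(buildSearchSet(plain, "games/sky").locations.size() == 1);
    assert(buildSearchSet(bundled, "games/sky").locations.size() == 2);
}
```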


TODO:
=====
* I think SearchMan is badly named, I'd like to rename it, e.g.:
   ArchiveSearchManager, SearchPathManager, FileSearchManager, or
   DefaultSearchSet, MainSearchSet, GlobalSearchSet
* Also, I am considering renaming the various openForFOO methods to
something like makeReadStream / makeWriteStream. Rationale: If you do
"node.openForReading()", this suggests that you open the *node* for
reading. But you don't; rather, a completely new Stream object is
created and returned to you, allowing you to read from the file
*referred to* by the node. The new name would represent that more
clearly, and also indicate that the caller is responsible for deleting
the returned object.



Bye,
Max



