[Scummvm-devel] File handling revamp

sunmax at libero.it sunmax at libero.it
Fri Aug 8 07:33:54 CEST 2008

Hi there ScummVM Team!

Ahhh files! (it sounds like a porn ;-)

I hope I am not too late to drop my 2 cents...

... I am just catching up on two weeks of interesting e-mails about
the FSNode redesign. I copied and pasted the parts I thought were most
important, added some notes and dropped in some ideas.

Here they come!

>> This approach makes it easy to dynamically add/remove dirs from the
>> search list (just dump the SearchDirectory object); or to search for
>> files only in a subset of all search dirs.

In how many dirs do we need to search for a file? ;-)

I thought it was just one (or at least one per kind: gamedata,
savedata, config), and its subdirs of course.

Why do we provide dedicated code for "savedata" and "config files"?

>> easier to implement read buffering. If desired, we could even try
>> to provide some generic read buffering code.

I will check the current implementation on the PS2 backend which I
inherited and get back on this point later.

>> opening files in directories: case (in-)sensitivity.

I know it may sound radical and fascist, but since we have to "Add"
the game anyhow, why don't we just normalize the filename case at
that stage? I mean rename all the files to the one case we allow,
let's say all lowercase (or whatever the platform dictates).

The only issue would be for games played from CD where we cannot
rename the media, but how many people out there are doing this?
(well, I actually am on PS2 ;-).

We could even provide game data on the website already in the
correct case.
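
A minimal sketch of the "rename at Add time" idea, assuming the backend
has already listed the file names of one directory. Everything here
(toLowerAscii, RenamePlan, planLowercaseRenames) is a made-up name for
illustration, not existing ScummVM code, and it only plans the renames
instead of touching the disk:

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: lowercase an ASCII file name.
std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical result of planning the renames for one directory.
struct RenamePlan {
    // (old name, new lowercase name) for every file that needs renaming
    std::vector<std::pair<std::string, std::string>> renames;
    // names whose lowercase form collides with an earlier entry
    std::vector<std::string> conflicts;
};

RenamePlan planLowercaseRenames(const std::vector<std::string> &names) {
    RenamePlan plan;
    std::set<std::string> taken;
    for (const std::string &name : names) {
        const std::string lower = toLowerAscii(name);
        if (!taken.insert(lower).second) {
            // Two files differ only by case: renaming would clobber one.
            plan.conflicts.push_back(name);
            continue;
        }
        if (lower != name)
            plan.renames.emplace_back(name, lower);
    }
    return plan;
}
```

A read-only medium such as a CD would simply skip the rename step, which
is why the lookup side still has to cope with mixed case.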

>> FSList

I think we should go for a FSMap instead: have a key used by the
engines to locate the data (e.g. "res0.dat") and match it to a

That's probably what engine designers want (I am not one, so it's
honestly just a hunch):

 - specify a data token (let's say always lowercase, e.g. again
   "res0.dat") and get a handle for it. The only place we should
   be looking once the game is started is "gamedata", which should
   be a cached TOC Map:

   you look for "res0.dat" and you get -> a handle for

   (or "C:GAMES\COMI\RES0.DAT" - the way it's stored is actually
   platform dependent cause it's still a FSNode at the end)

   We should cache the nodes recursively starting from "root"
   ("mass:comi/" or "C:GAMES\COMI\" in the above example). The "key"
   used by the engine is always going to be lowercase; the "fullpath"
   is whatever the actual full path of the file was, so we know it
   is going to work. If somebody removes the files while ScummVM is
   running, or renames them to a different case, well, we can't
   help it.

   If that's already the way it's implemented, forgive me for being

>> A SearchDirectory object sounds like a good idea

What would be the advantage of a SearchDirectory object over
a simple Map?
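
To make the "simple Map" concrete, here is a rough sketch, assuming the
backend hands over the full paths it found while walking the game
directory. Node is a stand-in for FSNode, and FSMap, baseName,
buildFSMap and findNode are hypothetical names, not the current API:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an FSNode: it only remembers the real path.
struct Node {
    std::string fullPath; // case preserved, exactly as the backend reported it
};

// Lowercase resource key -> node.
using FSMap = std::unordered_map<std::string, Node>;

std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Last path component; '/', '\\' and ':' are all treated as separators
// so that both path styles mentioned in this mail are handled.
std::string baseName(const std::string &path) {
    const std::string::size_type pos = path.find_last_of("/\\:");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Build the table of contents once, before the game starts.
FSMap buildFSMap(const std::vector<std::string> &fullPaths) {
    FSMap toc;
    for (const std::string &p : fullPaths)
        toc.emplace(toLowerAscii(baseName(p)), Node{p}); // first entry wins
    return toc;
}

// Returns nullptr when the key is unknown, so a non-null result
// already means "the file exists".
const Node *findNode(const FSMap &toc, const std::string &key) {
    const auto it = toc.find(toLowerAscii(key));
    return it == toc.end() ? nullptr : &it->second;
}
```

The engine only ever sees the key; the stored path keeps whatever
separators and case the platform uses.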

>> So if you create a FSNode("foo"), and do ->exists() in it, should it
>> check whether there exists a "foo" in the namespace of Files, in the
>> namespace of ConfigFiles, or the namespace of SaveFiles?

Good point.

But in my vision we always start from a "root" to populate a list/map
(or SearchDir, whatever), then you pick a node using its "key". If you
get one then you already know it exists (otherwise you would get NULL)
and in which namespace (or topology) it lives.

>> Btw, if we are going to redo the fs, I'd again like to propose
>> having different calls for creating a file node and a directory
>> node

Why do we need a "directory node" at all, when we have a full list of
cached nodes contained in "root" (i.e. the gamedata folder)?

You just want to access the data by keyword; whether it is a dir
or not should be hidden from the engines, or am I getting it wrong?

Of course I assume that there won't be something like "comi/res0.dat",
"comi/data/res0.dat", etc. But is there really a game with that issue?

>> If you just want to open (during play time) a single file "bar.dat"
>> inside some directory "foo", then of course you should not be forced
>> to walk an FSList, agreed. You just want to do something like
>> f.open("foo/bar.dat");
>> or
>> f.open(FSNode("foo/bar.dat"));
>> or
>> f.open(FSNode("foo").getChild("bar.dat"));
>> or
>> f.open("bar.dat", mySearchPaths);
>> or even just
>> f.open("bar.dat"); // Search in the global list of default search
>> paths

>> The first two simply represent convenience accessors, easy to
>> implement by writing a simple parser (relative) for POSIX style
>> passes, which would then internally create the corresponding
>> FSNodes, in a case insensitive fashion.

An FSNode is a pretty small entity, and there is a limited number of
files in the few games I played, so I would consider TOC-caching them
all at start (I mean all the ones in the gamedata folder), so that we
don't have to bother later. Everything in -the list- exists
and is reachable with a lowercase key.

I know, I am repeating myself ;-)

>> Search in the global list of default search paths

We should have only one search path active at a time. When we play the
game it is the "gamedata dir", when we save a game it is the "savedata
dir", and so on. I don't think we need to search all of them all the
time, right?

>> though (note: If dir "foo" does not exist, FSNode("foo") represents
>> an invalid node,

Not an issue if we let the recursion from root build the list of
FSNodes for us at start. In this case everything that is there exists,
and there is no need to add more FSNodes, no?

>> it's better to not jumble *all* the files in possibly a dozen dirs
>> into one big list, but rather try to be a more bit selective about
>> it :).

I agree: we need a list per category: game, save, config, etc.
But only one of them should be active at a time.
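
A small sketch of "one list per category, only one active", assuming
keys are already lowercase when they are added. Category and
ResourceIndex are hypothetical names invented for this example:

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical categories; Count is only used to size the array.
enum class Category { GameData = 0, SaveData, Config, Count };

class ResourceIndex {
public:
    // Register a file under a category. Key is assumed to be lowercase.
    void add(Category c, const std::string &key, const std::string &fullPath) {
        _maps[index(c)][key] = fullPath;
    }

    // Switch which category lookups go to (playing, saving, ...).
    void setActive(Category c) { _active = c; }

    // Looks only in the active category; nullptr if the key is not there.
    const std::string *find(const std::string &key) const {
        const auto &m = _maps[index(_active)];
        const auto it = m.find(key);
        return it == m.end() ? nullptr : &it->second;
    }

private:
    static std::size_t index(Category c) { return static_cast<std::size_t>(c); }

    std::array<std::unordered_map<std::string, std::string>,
               static_cast<std::size_t>(Category::Count)> _maps;
    Category _active = Category::GameData;
};
```

With this shape a save file can never shadow a game resource of the same
name, because the two maps are never searched together.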

>> 1) f.open("foo/bar.dat");

Is there any reason why we allow that? Are there multiple files with
the same name at different levels in the same game?
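
Whether any game really has this problem could be checked
mechanically. A sketch, assuming we have the list of full paths found
under the game directory (findAmbiguousKeys and the helpers are
hypothetical names, nothing that exists today):

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Last path component; '/', '\\' and ':' are all treated as separators.
std::string baseName(const std::string &path) {
    const std::string::size_type pos = path.find_last_of("/\\:");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Group the paths by lowercase file name and keep only the names
// that are claimed by more than one file.
std::map<std::string, std::vector<std::string>>
findAmbiguousKeys(const std::vector<std::string> &fullPaths) {
    std::map<std::string, std::vector<std::string>> byKey;
    for (const std::string &p : fullPaths)
        byKey[toLowerAscii(baseName(p))].push_back(p);

    for (auto it = byKey.begin(); it != byKey.end();) {
        if (it->second.size() < 2)
            it = byKey.erase(it); // unique name, not interesting
        else
            ++it;
    }
    return byKey;
}
```

An empty result for a given game would mean a flat lowercase key is
enough for it; a non-empty one lists exactly the files that would still
need a path like "foo/bar.dat" to tell them apart.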

>> FSNodes are only for data files (and maybe dump files, for writing),
>> they are *never* to be used for SaveFiles and ConfigFiles

I know it sounds like a dumb question: why do we handle SaveFiles and
ConfigFiles as special cases (besides the fact that they are rw)?

>> So, FSLists probably should store a ref to the FSNode they are
>> coming from..

;-) My penny goes to FSMap : "lowercase key" -> FSNode

>> When reading files, we want to search them everywhere.

How could they be outside gamedata when you are playing the game?

>> just open resources from GameDirectory, which would implement the
>> searching criteria (maybe leveraging on trait classes), and then 
>> return a SeekableReadStream*. Searching can be requested in
>> sub-directories, with patterns... (paste your fantastic ideas here)
>> We can even inhibit Common::File usage from engines!

Now, I do like this!

>> games can be safely considered to have non-ambiguous file names
>> (case-wise), and because the implementational burden is moved to the
>> backends.

Yep. That's the way it should be. You should be able to access any
resource just using one case (I favor lowercase) and the backend
will handle all the dirty work!

>> The concept is to model directories instead of a game file, and
>> give the engine an instance of the directory where the game is
>> stored. This could be asked to open a file, for which it would
>> return a readable stream, or a directory, for which it would return
>> another directory object.

I agree.

Only one point again, and since I am not an engine writer I am probably
missing the point: why do engines need to ask for a directory? Don't
they just need data? How does the location make a real difference?

>> The key is an object that manages a set of the dirs introduced
>> above, together with an array of settings that specify how searches
>> should be performed (per directory!), for example how deep a certain
>> directory can be searched, if hidden files should be considered,...

Here my solution differs slightly, because I envisage passing the
engines a black-box object where you enter a resource key and you get
a handle to access it, skipping the whole directories thing.

>> Also, these objects should cache the directory contents, making
>> lookups faster and relieving backends from the burden of doing such
>> caching

I agree with Max (both of them ;-)

The backend could then do "data caching" (besides the TOC) or "read
ahead" in its file handling implementation behind the scenes, if this
is not supported or sucks on that platform.

>> But of course we could easily flush the file/dir caches at certain
>> points, too (like when returning to the launcher).

Agreed. I feel like you read my mind!

>> * Caching the contents of the given dir is mandatory. Just keep a
>> HashMap of the files. Easy to do (case insensitive) lookups against
>> that, too. I think I would lazily create it, the first time somebody
>> tries to access a file in the dir.

I would always do it on the "gamedata path" before starting the game;
we are going to use those files pretty soon ;-)

>> We could use the full path provided by FSNode as the key into the
>> HashMap.

I would just use the keyword (i.e. the resource name) that the engine
is going to access, and map it to a full FSNode, which then contains
the full path of the file.

 HashMap<String, FSNode>
           ---- lowercase resource key used by engines

>> we might want to also search for files in the "resource" or
>> "RESOURCE" subdir

I would say skip it. Or let the entity that creates the HashMap
decide which ones to include, but the generated key should always
be lowercase, so as to give engine developers a universal rule.

>> After all, we would retain the case on the strings when storing
>> them in the hashmap.

I would keep the case in the path inside the FSNode, but drop it in
the key.

>> I would be glad if engine authors would drop their considerations
>> about it.

Yes, please do.

Looking forward to your feedback and ideas guys!

