[Scummvm-devel] [RFC] ScummVM website re-write

Fri Jun 27 21:58:25 CEST 2008

Hi,

Here's my request for comments regarding the ScummVM website re-write. 
Everything should hopefully be covered below. I've attached a tarball (bzipped) 
with the new files (and some old ones, like the news items). In case the
attachment gets removed, I have also mirrored it at [1].

[1] http://web.telia.com/~u85920559/RFC-ScummVM_website_rewrite.tar.bz2

Best regards,
Fredrik

                             REQUEST FOR COMMENTS

                           ScummVM website re-write

+---====---+
| CONTENTS |
+---====---+
I ..... Pros / Cons
II .... Design decisions
III ... Current project layout
IV .... What's done / missing
V ..... How to handle...
VI .... Some thoughts
VII ... Future work

+-=--        --=-+
| I. Pros / Cons |
+-=--        --=-+

Pros:
- New code base, rewritten from scratch keeping logic, data and presentation
   separated.
- Should adhere to the ScummVM Code Formatting Conventions (let me know if I
   missed something).
- Uses XML for storing data; being a standard format, it can be reused more
   easily than data stored in PHP variables.
- A bonus from separating out presentation is that the template engine (Smarty)
   supports caching, which should speed things up a bit. [1]
- The new codebase will work under PHP5 (there are some minor things in the old
   codebase that will not work under PHP5), which is more future-proof as PHP4
   will reach its end of support on 8 August 2008.
- PHPDoc-commented code (similar to JavaDoc), which should make the code easier
   to follow.

Cons:
- Old code that has proved to work is "lost".
- For a while the internal knowledge about the website is removed and the team
   must depend on external knowledge.
- URLs have changed (example: from /downloads.php to /?p=downloads), old forum
   posts and such will have outdated links (I will fix all the links in the news
   files and other internal data). [2]

[1] Currently only a few things are cached, one would be the FAQ-page (faq.inc).

[2] I would really like to use mod_rewrite and make them even nicer so that you
     can type "http://scummvm.org/downloads" (see section VII).
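
To illustrate the link fix-up mentioned under Cons, here is a minimal sketch.
It is written in Python only to keep it short (the site itself is PHP), and the
function name, the regular expression and the set of known pages are all
assumptions of mine, not code from the tarball:

```python
import re

# Hypothetical pattern: matches old-style links such as /downloads.php.
# Query strings after ".php" are not handled by this sketch.
OLD_LINK = re.compile(r"/(\w+)\.php\b")


def rewrite_links(text, known_pages):
    """Rewrite /<page>.php to /?p=<page> for pages the new site knows about."""
    def repl(match):
        page = match.group(1)
        if page in known_pages:
            return "/?p=" + page
        # Leave links to unknown scripts untouched.
        return match.group(0)
    return OLD_LINK.sub(repl, text)
```

Running something like this over the news files would only touch links whose
page name is in the lookup table, so unrelated .php links are left alone.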

+-=--              --=-+
| II. Design decisions |
+-=--              --=-+

I've chosen to keep the code separated into a backend and a frontend, which
should allow for greater reuse as well. If some future subproject (in PHP)
wishes to use the data on the website it will be easy to do so without
rendering HTML. The frontend would be the Controller and all Page-classes,
while the backend would be the rest (roughly).

I've opted for moving away from keeping the data in PHP (as arrays/variables)
and in pseudo-XML, and using real XML instead. I believe it's better to keep
the data and logic separated, both because it makes the code easier to read and
because it allows for better reuse of the data. Choosing XML was a small step
from the pseudo-XML already used for parts of the website: just adding the
<?xml [...] ?> tag and a root element makes it well-formed XML. You'll notice,
though, that unlike the pseudo-XML I use lowercase for all the tags (as I find
it nicer and I'm XHTML biased).

And since PHP has the basic XML-parser functions enabled by default (and as far
as I can tell they are enabled on SourceForge as well), it seemed a lot better
than making up a new text-based format and writing a parser for that.
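
As a rough illustration of why a stock XML parser is enough, here is a minimal
sketch. It uses Python's standard library rather than PHP's parser functions,
and the element names (projects, project, name, info) and the sample entry are
made up for the example; they are not the actual layout of the data files:

```python
import xml.etree.ElementTree as ET

# Made-up sample data: an XML declaration plus a root element is all it
# takes to turn a pseudo-XML fragment into well-formed XML.
SAMPLE = """<?xml version="1.0"?>
<projects>
  <project>
    <name>Example tool</name>
    <info>Made-up entry, only for illustration.</info>
  </project>
</projects>
"""


def load_projects(xml_text):
    """Return one dict per <project>, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in project}
        for project in root.findall("project")
    ]
```

The result is plain data (a list of dictionaries here), so nothing in it knows
or cares about HTML.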

Separating out the presentation and putting it in templates was something that
really needed to be done. The current codebase is one heck of a mix between
PHP and HTML that's not particularly easy to follow. The template engine of
choice is Smarty [2] as I've got previous experience using it, and it supports
caching. Being able to cache the rendered result should speed things up a bit:
with caching enabled, serving a page should cost mostly I/O rather than CPU
processing.

Those of you who have read about design patterns should hopefully see that this
is me trying to follow the MVC (Model-View-Controller) [1] pattern.

Since I'm using XML, and the simple XML-parser doesn't speak HTML, I have had to
use something else for markup inside the data. I've opted for using BBCode
since I think it's easy enough to use, and most people should be familiar with
it already.
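
To give an idea of what BBCode handling could look like, here is a minimal
sketch in Python (again only for brevity; the site is PHP). The three supported
tags and the function name are my assumptions; a real parser would need more
tags and proper handling of nesting:

```python
import html
import re

# Hypothetical rule table: (pattern, replacement) pairs for a few tags.
RULES = [
    (re.compile(r"\[b\](.*?)\[/b\]", re.S), r"<strong>\1</strong>"),
    (re.compile(r"\[i\](.*?)\[/i\]", re.S), r"<em>\1</em>"),
    (re.compile(r"\[url=(https?://[^\]\s]+)\](.*?)\[/url\]", re.S),
     r'<a href="\1">\2</a>'),
]


def bbcode_to_html(text):
    """Escape any raw HTML first, then expand the known BBCode tags."""
    out = html.escape(text, quote=True)
    for pattern, replacement in RULES:
        out = pattern.sub(replacement, out)
    return out
```

Escaping before expanding means that any stray HTML in the data is shown as
text instead of being interpreted by the browser.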

[1] http://en.wikipedia.org/wiki/Model-view-controller

[2] http://smarty.net

+-=--                     --=-+
| III. Current project layout |
+-=--                     --=-+

Currently the code works as follows:

1. index.php will load the 'config.inc' file, the Utils class and then the
    Controller/MenuFactory classes. It will do a trivial check that there is
    some config and then call Utils::switchPage().

2. This static function will look up what page was requested and then load the
    PHP-file, create a new instance of the class and call on that class'
    index()-function. If the requested page is not part of the lookup table of
    valid pages it will default to the mainpage.

    All *Page-classes are subclasses of the Controller.php class which contains
    the needed set up steps and wrappers for Smarty.

3. The *Page-class will load the needed *Factory-class to get the data that
    should be presented on the page. Each *Factory.php file contains not only
    the Factory class definition, but also the object class definitions for the
    objects being created by that Factory. (This is just so I don't have to add
    even more require_once()-calls all over the place. Could be replaced with
    an autoloader function in PHP5.)

4. The Factory classes use the XMLParser.php to read the data out of the
    XML-files.

As you can see there isn't much to it; it's mostly the Factory-classes that do
something. The Page-classes pretty much only fetch data from the
Factory-classes and set up the variables for Smarty, which then renders the
HTML. All functions should return false in case they aren't able to complete
their operation (e.g. file not found, or wrong permissions to read a file).
But at the moment most of the code just assumes that everything went fine.
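
The dispatch in steps 1 and 2 can be sketched roughly as follows. This is a
Python illustration, not the PHP from the tarball; the page names and class
bodies are placeholders I made up:

```python
class Controller:
    """Stand-in for Controller.php: shared set-up lives in the base class."""

    def index(self):
        raise NotImplementedError


class MainPage(Controller):
    def index(self):
        return "main page"


class DownloadsPage(Controller):
    def index(self):
        return "downloads page"


# Hypothetical lookup table of valid pages.
PAGES = {"main": MainPage, "downloads": DownloadsPage}
DEFAULT_PAGE = "main"


def switch_page(requested):
    """Instantiate the requested page class, or the main page if unknown."""
    page_class = PAGES.get(requested, PAGES[DEFAULT_PAGE])
    return page_class().index()
```

Because only names in the lookup table are ever resolved, an arbitrary ?p=
value can never cause an unexpected file to be loaded.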

The file structure is currently:

/ (project root)
--/css
--/data
   --/compatibility
   --/media
   --/news
   --/screenshots
--/downloads
--/include
   --/smarty
--/templates

The css and templates directories contain all the stylesheets and templates,
respectively. The data directory and its subdirectories contain all the data
used (XML-files, screenshots and media files). The downloads directory should
contain the downloads not available as "normal" SourceForge files, i.e. daily
SVN builds and subproject SVN builds. The include directory contains the
'config.inc' file, the Factory classes and the Utils class. It also contains
a small PHP-file called 'emulate_php5.php' with simulated versions of functions
that are available in PHP5 but not PHP4. Its smarty subdirectory contains all
the Smarty code. The root directory contains the index.php file and all the
*Page.php files.

+-=--                   --=-+
| IV. What's done / missing |
+-=--                   --=-+

What's done:

Currently all the items under 'Misc. Menu' are done. Under the 'Documentation'
menu group I've got Compatibility done, as well as FAQ/Credits but since those
just read the already generated HTML they don't really count. The documentation
is also pretty much done, except for the 'SCUMM Data File MD5 Checksums' which
also reads some pre-generated content. I'd rather write a parser for each of
these that reads from the source directly. That way data/presentation separation
can be reached for those pages as well. For the 'Main Menu' I've got the home
page and all the news reading done, though it still reads the old format as
I've not converted the news items yet.

What's missing:

Well, plenty of stuff really. I haven't touched the screenshots page since the
data is at the moment stored in two different places. The screenshots category
data is stored in PHP, while the information about the images and their captions
are stored in a CSV-file. I'm not sure what to do with this, but personally I
don't think it's nice to have half the data stored in one place and the other
half in another. I could make it all XML if that is desired, or if there are
some really good reasons to keep the CSV-file I could just put the category
information into XML.

The downloads page is something I've avoided as well. The download entries seem
to be hardcoded and only a few of them (like the sources) use the
$current_release variable. These can easily be converted to XML, though I'm not
sure if Smarty will be able to expand variables stored that way. If not, I could
always have the factory that parses the XML take care of that. The XML-parser
should also ignore commented blocks, so if some build is late you should only
need to comment it out.
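
Here is a minimal sketch of letting the factory expand the release variable
itself. It is Python rather than PHP, and the element names, file names and the
version number are invented for the example:

```python
import xml.etree.ElementTree as ET
from string import Template

# Made-up sample: the second entry is commented out, as a late build would be.
SAMPLE = """<downloads>
  <file><name>Source tarball</name><url>scummvm-$current_release.tar.bz2</url></file>
  <!-- <file><name>Late build</name><url>late-$current_release.zip</url></file> -->
</downloads>"""


def load_downloads(xml_text, current_release):
    """Parse the download list and substitute $current_release in each URL."""
    root = ET.fromstring(xml_text)  # the default parser drops XML comments
    downloads = []
    for node in root.findall("file"):
        url = Template(node.findtext("url", default=""))
        downloads.append({
            "name": node.findtext("name", default=""),
            "url": url.safe_substitute(current_release=current_release),
        })
    return downloads
```

With this approach the template only ever sees finished strings, so it does not
matter whether Smarty can expand variables that come from the data.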

On the documents page there is a document called 'SCUMM Data File MD5 Checksums'
that is some prerendered HTML. As I've already mentioned, I'd rather parse the
source directly.

The news is still handled by reading the old format. I would however like to
change the news items to real XML as well. That would mean changing all the
HTML-tags to BBCode (or something else) at the same time, which is something I
don't want to do until everyone has voiced their opinion.

BBCode parsing isn't enabled at the moment. I'm thinking of creating an
abstract class which all objects should inherit from. In that class I'd put a
function that, given an array, checks the keys and sets the class variables
where there is a matching key. That function would also make sure to parse the
data (BBCode) before storing it.
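
A minimal sketch of that base class idea, in Python for brevity. The class
names, the field list and the one-tag BBCode stand-in are all hypothetical:

```python
import re


def parse_bbcode(value):
    # Stand-in for a real BBCode parser: only handles [b]...[/b].
    return re.sub(r"\[b\](.*?)\[/b\]", r"<strong>\1</strong>", value)


class BasicObject:
    """Base class: copies matching keys of a dict onto the instance."""

    fields = ()  # subclasses list the attribute names they accept

    def __init__(self, data):
        for key in self.fields:
            setattr(self, key, None)
        self.set_from(data)

    def set_from(self, data):
        for key, value in data.items():
            if key not in self.fields:
                continue  # ignore keys the class does not declare
            if isinstance(value, str):
                value = parse_bbcode(value)  # parse before storing
            setattr(self, key, value)


class NewsItem(BasicObject):
    fields = ("title", "body")
```

Each object class then only has to declare its fields; the parsing happens in
one place.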

+-=--             --=-+
| V. How to handle... |
+-=--             --=-+

All the pre-rendered content is something I'd like to get rid of. I've looked
at the credits.pl script and wouldn't mind hacking it to make XML for the
website instead of HTML. The --html switch could become --xml-website (and the
--xml one could be changed to --xml-docbook) to reflect the change.

I found a faq.xml in docs/trunk/docbook, as well as a faq-inc.xsl. Is that what
is used to generate the faq.inc file? That one could be difficult to use, at
least with the XML-parser, as it would pick up all the docbook tags as XML-tags.
I could write some code that makes the XML-parser ignore a given list of tags
and that should hopefully be enough.

I haven't found the md5table tool, nor the file it uses as source. I could write
a script that includes the md5.inc and produces XML. But unless the source for
md5table is difficult to parse with PHP it seems like a waste of time to convert
the already converted data in yet another step.

These are of course low priority; getting everything to work is the first and
only priority at the moment.

+-=--           --=-+
| VI. Some thoughts |
+-=--           --=-+

This is a small collection of some of my thoughts on the current code and how
some small things could change.

The code should work fine on PHP4. I have PHP5 installed and that's what I've
tested against, but I've made sure to only use what's available in PHP4. There
might of course be something that I've missed that prevents it from working.
Report all such issues so I can take care of them.

At the moment all the different pages have their own *Page class, which is
unnecessary. The reason for this is that at the beginning I was so eager to
start coding that I didn't go through all the code. Unless anyone has a strong
opinion in favour of keeping them separated, I'm going to merge them into one
Page.php file as different functions. The Utils::switchPage() function will
just call on those functions instead.

On the news page, the date is formatted using PHP's date() function, however
Smarty only supports the strftime() options. That's fine for everything except
where the English ordinal suffix (-st, -nd, -rd, -th) is used. Currently I'm
using a third-party Smarty plugin, but I'd rather generate the correct output
in the Factory and add it to the object. I think that's better than depending
on the plugin, which might stop working with future Smarty versions, or whose
file might get deleted when updating Smarty.
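
Generating the suffix in the Factory is not much code. A minimal sketch, in
Python rather than PHP, with function names and the output format chosen by me
for the example:

```python
import datetime

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def ordinal_suffix(day):
    """Return the English ordinal suffix for a day of the month."""
    if 11 <= day % 100 <= 13:
        return "th"  # 11th, 12th and 13th are the irregular cases
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def format_news_date(date):
    """Format a date like 'June 27th, 2008' without any template plugin."""
    return "{} {}{}, {}".format(
        MONTHS[date.month - 1], date.day, ordinal_suffix(date.day), date.year)
```

For example, format_news_date(datetime.date(2008, 6, 27)) gives
"June 27th, 2008", and the template only has to print the finished string.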

I might have taken the OO-part a bit to the edge by making objects out of pretty
much everything. Just look at the SubprojectsFactory, the Subprojects-class
only contains an info string and an array with Project-objects. It could easily
have been represented with an array containing the same information instead.

+-=--          --=-+
| VII. Future work |
+-=--          --=-+

Something that really should be done is to update all the templates to move
away from the current design, which is built with tables. Instead it would be
nicer to use CSS/divs.

I also would like to change how the URLs are handled by using mod_rewrite [1].
That would mean instead of http://scummvm.org/?p=downloads you would get
something like http://scummvm.org/downloads. It's not really necessary, I just
think it looks nicer, and it might also help for search engine optimization.

[1] http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html or
     http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RFC-ScummVM_website_rewrite.tar.bz2
Type: application/octet-stream
Size: 476501 bytes
Desc: not available
URL: <http://lists.scummvm.org/pipermail/scummvm-devel/attachments/20080627/57f81bf1/attachment.obj>