[Scummvm-devel] README
Max Horn
max at quendi.de
Mon Feb 9 01:36:04 CET 2004
On 08.02.2004 at 18:51, Marcus Comstedt wrote:
>
> Max Horn <max at quendi.de> writes:
>
>> Well, all this still leaves my question open:
>>
>> How do you output useable PLAIN TEXT from a (La)TeX source?
>
>
> A working, but slightly hackish, way to do it is
>
> latex2html -no_navigation -info 0 -no_subdir foo.tex
> lynx -dump foo.html > foo.txt
>
And if you add "-split 0", it'll put everything into a single HTML
file, yeah. I ended up trying this:
latex2html -no_navigation -split 0 -info 0 -dir html \
    -show_section_numbers readme.tex
At least to me, the latex2html output already doesn't look that great.
But when I then use lynx to dump it into a text file, it's far, far
worse than our manually formatted README. Granted, I always expected
some loss compared to the hand-tuned plain text layout in there.
I also tried html2text (from
http://userpage.fu-berlin.de/~mbayer/tools/html2text.html). In many
cases the output was better; but in some places, lynx "wins". I used
this command for it:
html2text -o readme.txt -nobs -style pretty readme.html
If you wonder what I am talking about, here are some sensitive spots:
* "7.5 Using MP3 files for CD audio": Look at the example command (and
whether it sticks to the paragraph before it or is nicely distinct)
* "1 About": Are paragraphs separated by a blank line, or do they all
stick together?
* "2.1 Reporting Bugs": look at the text "Please include the following
information": does it stick to the paragraph before it, or the table
after it, or is it separated from both with blank lines?
* "5.1 Command Line Options": The list of command options is unreadable
in the lynx output, but fine with the other tools
* In the same section, the Examples are much better in html2text than
in all the others
* The credits are quite sensitive (esp. the 'headlines' of the
subsections)
* html2text does funny things with the underline style
("This_is_underlined_text")
* the slight indentation lynx uses everywhere makes it a bit easier to
visually navigate to certain sections
Furthermore I tried elinks and w3m. It seems elinks produces output
which is strictly better than lynx. With w3m, I noticed that it
produces "bad" lists (a blank line between each list item).
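The blank lines between list items could probably be removed again with a small filter. What follows is only a rough sketch: it assumes that list items start with "*" or "-" after optional indentation and that each item fits on one line. Both are assumptions, not something checked against what w3m emits in every setup, and the name tighten_lists is made up for illustration.

```sh
# Sketch: drop blank lines that sit directly between two list items.
# Assumption: an item is a line starting with "*" or "-" after optional
# indentation; multi-line items are not handled.
tighten_lists() {
    awk '
        function is_item(s) { return s ~ /^[ \t]*[*-] / }
        /^[ \t]*$/ { pending++; next }     # hold back blank lines for now
        {
            # Re-emit the held blank lines unless both neighbours are items.
            if (pending && !(prev_item && is_item($0)))
                for (i = 0; i < pending; i++) print ""
            pending = 0
            prev_item = is_item($0)
            print
        }
        END { for (i = 0; i < pending; i++) print "" }
    ' "$@"
}
```

It reads from the named files or from standard input, so something like "w3m -dump readme.html | tighten_lists > readme.txt" should work. Blank lines between a list and ordinary paragraphs are kept.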
My next thought was that maybe some improvements could be achieved by
using another converter from LaTeX to HTML. The only one I know is
"hevea" (http://pauillac.inria.fr/hevea), which is written in OCaml.
The first thing to notice: it's *much* faster than latex2html. I am
talking about an order of magnitude at least.
Running the hevea created HTML through w3m, elinks and html2text gives
results which are pretty good, I think. Definitely better than
converting the latex2html output. Forget about lynx, though, it's still
giving crap output.
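To make this comparison repeatable, the steps could be wrapped in a small script. The following is a sketch under some assumptions: the .tex file lives in the current directory, hevea writes its HTML next to it under the same base name, and elinks and w3m both accept "-dump" followed by a file name. The function name compare_dumps and the output file names are invented for illustration; the html2text flags are the ones quoted earlier in this mail.

```sh
# Sketch: convert readme.tex to HTML with hevea, then dump that HTML to
# plain text with every available tool, one output file per tool.
compare_dumps() {
    base=${1%.tex}
    hevea "$base.tex" || return 1          # assumed to write $base.html
    for tool in elinks w3m html2text; do
        # Skip tools that are not installed instead of aborting.
        command -v "$tool" >/dev/null 2>&1 || {
            echo "skipping $tool (not found)" >&2
            continue
        }
        case $tool in
            elinks)    elinks -dump "$base.html" > "$base.elinks.txt" ;;
            w3m)       w3m -dump "$base.html" > "$base.w3m.txt" ;;
            html2text) html2text -o "$base.html2text.txt" -nobs \
                           -style pretty "$base.html" ;;
        esac
    done
}
```

After "compare_dumps readme.tex", the resulting readme.*.txt files can be compared side by side or with diff against the hand-formatted README.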
Hevea actually has a text output mode! Very nice, since it does fancy
things like "ASCII underline", e.g.:
2.1 Reporting Bugs
===================
At first some of the tables looked *really* bad, though. In the credits:
Hannes Readme Conversion
Niederhause
n
However, that turned out to be caused by the fixed (4cm) table column
width. We could either change that back, or maybe there is a switch for
hevea to make it ignore that. Anyway, after reverting "p{4cm}" back to
"l" in said tables, it worked fine. Still, I have some gripes with this
direct text mode.
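If the fixed-width columns are to stay in the LaTeX source for the PDF output, the substitution could be applied on the fly, just for the text conversion. A minimal sketch, assuming "p{4cm}" appears only in the column specifications of the affected tables; the function name relax_columns and the file name readme-text.tex are made up here:

```sh
# Sketch: replace fixed-width "p{4cm}" columns with plain "l" columns.
# Assumption: the string "p{4cm}" occurs nowhere else in the source.
relax_columns() {
    # In a basic regular expression the braces are literal characters.
    sed 's/p{4cm}/l/g' "$@"
}
```

Usage would be along the lines of "relax_columns readme.tex > readme-text.tex", after which the copy is fed to hevea's text mode instead of the original.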
Hevea also allows embedding HTML commands in the LaTeX source, for
custom formatting, which I think would be useful.
Summary: To me, the only adequate output was generated by hevea +
elinks/w3m/html2text. But to let you draw your own conclusions, I
included the generated files in an attachment to the mail. [I had to
remove the attachment; apparently the first time I sent this mail, 12
hours ago, it got filtered out because of it.]
The HTML output of hevea and latex2html is OK, too. As a result, I
don't see any advantage in writing the docs in HTML (converting them to
text would still have the same problem described here; but making nice
PDF output would be harder, and we would lose all the structural
information which LaTeX has about a text).
Cheers,
Max