CS99I HTML INFO

Abstract by Gio Wiederhold.

HTML briefly

We describe only a few basic commands of the HyperText Markup Language (HTML). The current common version is HTML 2.0, but 3.0 is often available. In a browser you can inspect or save the source file to learn about the formatting that was used. Not all browsers handle all formats, and they certainly don't treat them the same way.

Conventions

HTML is an application conforming to ISO 8879 (Standard Generalized Markup Language or SGML). SGML uses embedded directives to indicate formatting, while leaving the interpretation to the client's display program and its knowledge about the screen, paper, user preferences, etc. These directives are bracketed by Less-Than (<) and Greater-Than (>) symbols. To enable this note to show the directives we use square brackets ( [ , ] ) in their place. Browsers may ignore directives in these brackets that they don't recognize.
There are also special characters, which start with an ampersand (&).
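
Because this note writes directives with square brackets, an example has to be converted back before a browser can interpret it. The following is a minimal Python sketch of that conversion. The function name brackets_to_tags is hypothetical, and the pattern assumes that a directive starts with a letter, "/" or "!" directly after the opening bracket, so bracket pairs that begin with a space, as in "( [ , ] )", are left alone.

```python
import re

# Assumption: a directive is "[" immediately followed by a letter, "/" or "!",
# then any characters except brackets, then "]".
DIRECTIVE = re.compile(r"\[([A-Za-z/!][^\[\]]*)\]")


def brackets_to_tags(text: str) -> str:
    """Rewrite this note's [TAG] notation as real <TAG> markup."""
    return DIRECTIVE.sub(r"<\1>", text)


print(brackets_to_tags("[H1]HTML briefly[/H1]"))
```

Any bracketed word that starts with a letter is converted as well, so the sketch is only suitable for text that uses brackets for directives alone.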

General layout

Each document should start with
[!Doctype html public "-//W3O//DTD/ W3 HTML 2.0//EN"]
and
[HTML].
Most commands have a corresponding closure; for [HTML] it is [/HTML] at the end of the document.

 

A document is split into a HEAD and a BODY.

The HEAD is for external information, such as the TITLE, which the browser uses for its frame, and the BASE, which gives the browser the external name of the page, e.g.,
[HEAD][TITLE]HTML information for CS99I book[/TITLE]
[BASE HREF="http://www-db.stanford.edu/pub/gio/CS99I/html-info.html"]
[/HEAD]

The BODY is
[BODY] followed by everything in the document, until the closing [/BODY],
except for [!-- comments not to be displayed --]
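
The BASE HREF in the HEAD also gives the browser the address against which relative links in the document are resolved. The sketch below imitates that resolution with urljoin from Python's standard library. The helper name resolve and the relative names intro.html and ../slides/atarpa.ps are chosen only for illustration.

```python
from urllib.parse import urljoin

# The BASE HREF given in the HEAD example of this note.
BASE = "http://www-db.stanford.edu/pub/gio/CS99I/html-info.html"


def resolve(href: str) -> str:
    """Resolve a link target the way a browser would against BASE."""
    return urljoin(BASE, href)


print(resolve("intro.html"))           # a file in the same directory
print(resolve("../slides/atarpa.ps"))  # a file in a sibling directory
print(resolve("#SecSix"))              # an entry point inside this page
```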

Headers and paragraph breaks

There are six levels of section headers:
[Hx]heading text[/Hx] x = 1..6
We use [H1] for the chapter headings, [H2] for the major sections, and [H3] for subsections.

[P] starts a paragraph
and

[BR] forces a linebreak (used liberally in this document).

Lists are of three types:
[yL] list: [UL] unnumbered; [OL] numbered; [DL] definition
Each entry of a [UL] or [OL] list starts with [LI]; a [DL] list uses [DT] for each term and [DD] for its definition.
The list is terminated by [/yL].
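
As an illustration of the list directives, here is a small Python sketch that writes out an unnumbered or numbered list. The function name make_list is hypothetical, the output uses the real < and > symbols, and the entries are assumed to contain no special characters.

```python
def make_list(items, kind="UL"):
    """Return the markup for a UL (unnumbered) or OL (numbered) list."""
    if kind not in ("UL", "OL"):
        raise ValueError("kind must be 'UL' or 'OL'")
    lines = [f"<{kind}>"]
    # Each entry starts with LI; no closing directive is written for entries.
    lines += [f"<LI>{item}" for item in items]
    lines.append(f"</{kind}>")
    return "\n".join(lines)


print(make_list(["gif", "jpeg"], kind="OL"))
```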

Normally you want to leave as much formatting as possible to the browser, since it will adjust itself to the available page size and user preferences, but formatting can be disabled by bracketing
[PRE] preformatted as is [/PRE].

Cross References

The ability to go to other documents is the main innovation of HTML.
[A HREF="filename"] mousearea [/A] as in
[A HREF="http://db.stanford.edu/pub/gio/CS99I/intro.html"]CS99I Introductory Chapter[/A]
This also works for files that are in other formats, if your browser has the appropriate plugin, say Ghostscript for
[A HREF="http://db.stanford.edu/pub/gio/slides/atarpa.ps"]ARPA PostScript slides[/A].
One can also go into the middle of a document, if a name has been given to the entry point:
[A HREF="#SecSix"]Section 6[/A] goes to the place marked by [A NAME="SecSix"]text[/A]
(Note: The NAME= definition appears not to work inside of TABLEs.)
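
A link into the middle of a document only works if the matching NAME exists. The following Python sketch, built on the standard html.parser module, lists local "#" links that have no entry point. The class and function names are hypothetical, and the sample page uses the real < and > symbols.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the NAME entry points and the local "#..." links of a page."""

    def __init__(self):
        super().__init__()
        self.names = set()
        self.local_links = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "a":
            return
        for key, value in attrs:
            if key == "name" and value:
                self.names.add(value)
            elif key == "href" and value and value.startswith("#"):
                self.local_links.add(value[1:])


def dangling_links(page: str) -> set:
    """Return local link targets that have no matching NAME in the page."""
    collector = AnchorCollector()
    collector.feed(page)
    collector.close()
    return collector.local_links - collector.names


page = '<A HREF="#SecSix">Section 6</A> <A NAME="SecSix">Images</A>'
print(dangling_links(page))
```

An empty result means every local link has its entry point.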

Images

There are many image formats; in general an image is included with
[IMG ALIGN=top SRC="imagefilename.format"] (or ALIGN=middle)
Standard formats are
  1. .gif, the most common graphic image format used with HTML;
  2. .tiff (Tagged Image File Format), which is often available as well;
  3. perhaps .xbm for XBitmaps, a UNIX format;
  4. .jpg or .jpeg, which is becoming more popular.
Which formats can be handled depends on the browser's plugins.
One can also create clickable areas within an image.
In UNIX use xv to edit images.

email addresses

Other commands:
[BLOCKQUOTE] for quotations [/BLOCKQUOTE]
[ADDRESS] for addresses [/ADDRESS]
[CENTER] text [/CENTER]

Special characters

Some of the symbols starting with & are:
  1. &lt for <
  2. &gt for >
  3. &amp for &
  4. &quot for "
  5. &nbsp for a non-breaking space
  6. &shy for a soft hyphen, shown only where a word is broken at the end of a line
  7. &ouml for o-umlaut (ö)
  8. &#169 for the copyright symbol (©)
  9. &trade for the trademark symbol (™)
There are many others. A semicolon after a symbol terminates it; the semicolon will not show.
[NULL] creates an invisible break, useful when combining special and ordinary characters.
An [HR] creates a horizontal rule.
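
The special characters can be produced and decoded mechanically. As a hedged illustration, Python's standard html module converts in both directions; the sample strings below are made up for this note, and the symbols are written with their terminating semicolons.

```python
import html

# Replace the characters that would otherwise be read as markup.
source = 'Use <B> & "quotes"'
escaped = html.escape(source)
print(escaped)

# Turn named and numeric symbols back into the characters they stand for.
decoded = html.unescape("&ouml; &#169; &trade;")
print(decoded)
```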

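The following table shows the characters for the codes 0 to 255. A character's code is the number at the start of its row plus its position in the row: 0 to 9 in the first group, 10 to 19 in the second. For example, A is in row 60 at position 5, so its code is 65. Blank positions are spaces or codes for which no character is displayed.

+   |0123456789|0123456789|
0   |          |          |
20  |          |   !"#$%&'|
40  |()*+,-./01|23456789:;|
60  |<=>?@ABCDE|FGHIJKLMNO|
80  |PQRSTUVWXY|Z[\]^_`abc|
100 |defghijklm|nopqrstuvw|
120 |xyz{|}~   | ƒ    ˆ Š |
140 |Œ Ž       |  ˜ š œ žŸ|
160 | ¡¢£¤¥¦§¨©|ª«¬ ®¯°±²³|
180 |´µ ·¸¹º»¼½|¾¿ÀÁÂÃÄÅÆÇ|
200 |ÈÉÊËÌÍÎÏÐÑ|ÒÓÔÕÖ×ØÙÚÛ|
220 |ÜÝÞßàáâãäå|æçèéêëìíîï|
240 |ðñòóôõö÷øù|úûüýþÿ    |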
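
The numeric form of a symbol, such as &#169, follows directly from the code of the character in the table above. The Python sketch below computes the numeric symbol and the table position of a character. Both function names are hypothetical, and the sketch assumes that Python's ord() agrees with the table, which holds for codes 32 to 126 and 160 to 255.

```python
def numeric_reference(char: str) -> str:
    """Return the numeric symbol for one character, e.g. &#169; for the copyright sign."""
    return f"&#{ord(char)};"


def table_position(char: str) -> tuple:
    """Return (row label, position in row) for a table with 20 codes per row."""
    code = ord(char)
    return code - code % 20, code % 20


print(numeric_reference("\u00a9"))
print(table_position("A"))
```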

Font Styles

Styles, relative sizes, and colors can be indicated, but your browser chooses the actual representation.
[FONT SIZE=+1] text [/FONT] increases the size of the text, and
[FONT COLOR=BLUE] text [/FONT] sets its color; both options can be given in one directive, as in [FONT SIZE=+1 COLOR=BLUE].

Logical styles


[EM] Emphasis italics [/EM]; we use these for words cited in the glossary.
[STRONG] Strong emphasis, usually bold [/STRONG]
[CITE] book, journal citation italics [/CITE]
[KBD] typing font [/KBD]; we use these for examples of type-ins.
[VAR] substitution example font [/VAR]

Physical styles

[B] bold [/B]
[I] italic [/I]
[TT] typewriter [/TT]

Tables

Just a summary example.
[TABLE] or [TABLE BORDER=3] or [TABLE CELLSPACING=2] (2 is the standard spacing)
[CAPTION] one line only, centered, plain, last line wins[/CAPTION]
[TR][TH]a row of centered (default) header items [TH] more [TH] for as many columns as wanted
[TH WIDTH=pixels or WIDTH=percent%]
[TR][TD]a row of left-aligned (default) data fields [TD] more data [TD]
[TR] more rows, joint field width automatic, multi line automatic[TD] [TD]
[TR]more rows
[TD or TH options include

[/TABLE]
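
To make the summary concrete, here is a small Python sketch that writes a table from a list of header items and rows of data fields. The function name make_table and the sample data are hypothetical, the output uses the real < and > symbols, and the fields are assumed to contain no special characters.

```python
def make_table(headers, rows, border=3):
    """Return the markup for a table with one header row and several data rows."""
    lines = [f"<TABLE BORDER={border}>"]
    # One TR per row; TH starts a header item, TD starts a data field.
    lines.append("<TR>" + "".join(f"<TH>{item}" for item in headers))
    for row in rows:
        lines.append("<TR>" + "".join(f"<TD>{field}" for field in row))
    lines.append("</TABLE>")
    return "\n".join(lines)


print(make_table(["Format", "Use"], [["gif", "common"], ["xbm", "UNIX"]]))
```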

Counters

The counter we are using for the Web-book is installed on the server amberjack.stanford.edu. An example would be:
<img src="http://amberjack.stanford.edu/cgi-bin/Counter/Count.cgi?df=sample.dat">

More information can be found at the counter's home page.

HTML Checkers

One possible HTML checker is the Web Site Garage.

Notes

See Chris Hector: "rtftohtml", to convert Word files to HTML; Cray Research Tech. Report, 1995; ftp://ftp.cray.com/src/wwwstuff/RTF/rtftohtml_overview.html.
See also the references.