CS73N Meeting 11 Notes: XML and B2B for 14 Feb 03

By Gio Wiederhold, 19 Jan 2000, rearranged 13 Feb 2003

Topics Covered briefly

How to write

On the web, your writing will be read by a wide variety of people, who must be convinced that they will benefit from the reading effort. In traditional publishing the publisher may provide an editor to help you. If you publish on the web and don't have an editor, be extra careful. Keep in mind that you must help the readers and customers. Writing for business and science is not the same as writing novels, where you can expose the truth gradually.

  1. Introduce the specific topic and your objective early
  2. Identify yourself and give the date of writing - stuff will stick around
  3. Identify -- politely -- the expected level of the audience
  4. Use words appropriate to that level
  5. Use the same word throughout for the same concept
  6. Use different words, even if you have to grope for them, for concepts that differ
  7. Minimize pronominal references. If there is any doubt about what an instance of `it, they, that, previous, following', etc. refers to, repeat the term.
  8. Avoid terms that introduce vagueness: `may, would, could', etc. If there are exceptions, state what they are or how frequently they occur. Saying `the sky may be blue' conveys no information; saying `the sky is blue when there is no overcast' conveys information.
  9. Use chapters for very distinct topics
  10. Use a separate section or paragraph for each specific argument
  11. Say only one thing in a sentence
  12. Write `Gender neutral' -- use plurals or repeat nouns and adjectives
  13. Recapitulate the points made at the end.


B2B needs: automation

Earlier time: HTML: for document transmittal, varied presentation, hierarchically structured + links to other HTML, IMAGES, etc.; ordered
Tags provide metadata for presentation (HTML intro). Problem: The nice-for-people presentation doesn't really define what is being represented. For business use we want web pages that can be processed automatically.

To the rescue: XML: for document processing, hierarchically structured + links, more; ordered (except for attributes)

Read more in XML intro.

Whereas the HTML tags are common to all HTML documents, the XML tags are domain dependent. Domains might be:

For each domain the allowable tags, and the structure in which they appear, have to be defined. That is done in a Document Type Definition (DTD). Elements that are optional or that can be repeated are labeled with characters used in regular expressions.
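As a rough illustration of how DTD occurrence markers behave like regular-expression markers, the sketch below checks the order of child elements against a content model. The content model for `Product` and the helper name `conforms` are assumptions made up for this example; Python's standard library does not validate DTDs, so the check is done by translating the model into a regular expression by hand.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD-style content model (an assumption for illustration):
#   <!ELEMENT Product (Name, Quantity?, Price, Color*)>
# Hand-translated into a regular expression over space-terminated tag names.
PRODUCT_MODEL = re.compile(r"(Name )(Quantity )?(Price )(Color )*")

def conforms(product_xml):
    """Return True if the children of <Product> follow the content model."""
    product = ET.fromstring(product_xml)
    # Build a string such as "Name Price Color " from the child tags, in order.
    child_names = "".join(child.tag + " " for child in product)
    return PRODUCT_MODEL.fullmatch(child_names) is not None

ok = "<Product><Name>Pencil</Name><Price>1.50</Price><Color>yellow</Color></Product>"
bad = "<Product><Price>1.50</Price><Name>Pencil</Name></Product>"

print(conforms(ok))   # Quantity is optional, Color may repeat
print(conforms(bad))  # Name must come before Price
```

A real application would hand the DTD to a validating parser instead of translating it manually.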

Regular expression syntax

Important for formulating

  1. Representation grammars
  2. queries (getting some subset of the representation)

    sequence: (a,b,c)
    alternatives: (x|y), in combination ((x|y),b,c) {x,b,c or y,b,c}
    optional: q? {q | nothing}
    any: r* {nothing | r | rr | rrr | ... }
    repeats: s+ { s | ss | sss | ... }
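The operators listed above can be tried out with Python's `re` module. This is only a sketch: Python writes a sequence by placing items next to each other rather than separating them with commas, and it writes the optional marker as `?`. The table name `CASES` and the helper `check` are assumptions for this example.

```python
import re

# Each entry: (operator, pattern, strings that match, strings that do not match)
CASES = [
    ("sequence",     r"abc",     ["abc"],           ["ab", "acb"]),
    ("alternatives", r"(x|y)bc", ["xbc", "ybc"],    ["bc", "xybc"]),
    ("optional",     r"q?",      ["", "q"],         ["qq"]),
    ("any",          r"r*",      ["", "r", "rrr"],  ["rs"]),
    ("repeats",      r"s+",      ["s", "sss"],      [""]),
]

def check(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

for name, pattern, good, bad in CASES:
    for text in good + bad:
        print(f"{name:12} {pattern:8} {text!r:6} -> {check(pattern, text)}")
```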

Example:
(((S|s)ection|paragraph(s?) )*.)
matches all citations looking like
Section xx., section xx., paragraph xx., paragraphs xx.
By setting a marker for xx, that text can be retrieved for display or processing. Regular expressions are powerful, but not really user-friendly.
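One possible rendering of this citation pattern in Python's `re` syntax is sketched below. It assumes that xx stands for a short label containing no period, and it uses a named group as the marker for xx; the names `CITATION`, `find_citations`, and the sample sentence are made up for the example.

```python
import re

# "Section", "section", "paragraph" or "paragraphs", a space,
# then the marked text xx (anything up to the next period), then the period.
CITATION = re.compile(r"((S|s)ection|paragraphs?) (?P<xx>[^.]+)\.")

def find_citations(text):
    """Return the marked xx part of every citation found in text."""
    return [match.group("xx") for match in CITATION.finditer(text)]

sample = "See Section 12. Compare paragraphs 3-5. Also section A."
print(find_citations(sample))
```

Labels that contain a period themselves, such as 4.2, would need a more specific pattern for xx.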

XSL

To look at an XML file it must be transformed, preferably to HTML. Examples are given in the XML description.
For instance, an XML catalog with entries such as <Product> <Name>Pencil </Name> <Quantity>12 </Quantity> <Price>1.50 </Price> <Weight>60 </Weight> <Color> yellow </Color> </Product>
etc.

would be instructed through an XSL program to

  1. Put in a heading: "ITEM (boxed), Quantity/box, Price per box, ..."
  2. Start a new line whenever a <Product> tag appears
  3. Put a dollar sign in front of the price
  4. Put gram behind </Weight>; for a U.S. customer divide the number by 28.35 and put oz. behind </Weight>
  5. Perhaps add a Java routine to compute a total at the end, when the customer clicks [done]
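The instructions above would normally be written as an XSL stylesheet. The sketch below is not XSL; it only mimics the same transformation in Python to show what the output looks like. The `<Catalog>` root element, the function name `catalog_to_html`, the table layout, and the extra Weight and Color headings are assumptions; the conversion uses 28.35 grams per ounce. The total computed on [done] is left out.

```python
import xml.etree.ElementTree as ET

# Sample catalog; the <Catalog> wrapper is an assumption for this sketch.
CATALOG = """<Catalog>
  <Product> <Name>Pencil </Name> <Quantity>12 </Quantity> <Price>1.50 </Price>
            <Weight>60 </Weight> <Color> yellow </Color> </Product>
</Catalog>"""

GRAMS_PER_OUNCE = 28.35

def catalog_to_html(xml_text, us_customer=False):
    """Render the catalog as an HTML table, one row per <Product>."""
    root = ET.fromstring(xml_text)
    headings = ["ITEM (boxed)", "Quantity/box", "Price per box", "Weight", "Color"]
    rows = ["<tr>" + "".join(f"<th>{h}</th>" for h in headings) + "</tr>"]
    for product in root.iter("Product"):       # a new line for every <Product>
        grams = float(product.findtext("Weight"))
        if us_customer:
            weight = f"{grams / GRAMS_PER_OUNCE:.1f} oz."
        else:
            weight = f"{grams:g} gram"
        cells = [
            product.findtext("Name").strip(),
            product.findtext("Quantity").strip(),
            "$" + product.findtext("Price").strip(),   # dollar sign before price
            weight,                                    # unit behind the weight
            product.findtext("Color").strip(),
        ]
        rows.append("<tr>" + "".join(f"<td>{c}</td>" for c in cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"

print(catalog_to_html(CATALOG))
print(catalog_to_html(CATALOG, us_customer=True))
```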

References

Brief intro to XML.
Brief intro to RDS ADO [ASP, 25 Feb 2000].
XSL information
See also the references.