CS73N Meeting 01 Notes Protocols, Software 10Jan03

Updated by Gio Wiederhold, 10, 16, 23, 24 Jan, 8 Feb 2003.

Topics Covered briefly

Do: Suggestions for topics, glossary, ...

Technology Origin:

Arpanet, Internet growth, business.

Resource sharing for ARPA (the Defense Department's Advanced Research Projects Agency) and its Contractors, academic and industrial.

(o ARPA: more focus on benefits, influence on industry and technology; populist; dominant in Democratic administrations

 o DARPA: more isolation, more science; elitist; dominant in Republican administrations)

Long development cycle:

Reliability required

Robustness: redundant linking over multiple paths

Limited loss on failure, load balancing: packetizing of messages

Motivation for changes in the communication infrastructure: reliability in the presence of threats through redundant paths, nodes; then access for all academics, commercial operation and participation for viability.

Scalability (dealing with millions versus thousands of nodes) required

 1. three level naming

 2. mapping from names to IP addresses, performed by Domain Name Servers (DNSs)

Transmission Control Protocol (TCP) specifies:

Packets with headers: from, to, number
Nodes with forwarding information tables
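
A minimal sketch of packetizing, assuming a toy header with the three fields named above (from, to, number). The "total" field, the packet size, and the addresses are assumptions made for this illustration; real TCP/IP headers carry more fields and split them between the two protocols.

```python
import random

def packetize(message, sender, receiver, size=8):
    """Split a message into numbered packets, each with a small header."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"from": sender, "to": receiver, "number": n,
         "total": len(chunks), "data": chunk}   # "total" is an assumed extra field
        for n, chunk in enumerate(chunks)
    ]

def reassemble(packets):
    """Rebuild the message from packets that may arrive in any order."""
    if not packets:
        return ""
    ordered = sorted(packets, key=lambda p: p["number"])
    if len(ordered) != ordered[0]["total"]:
        # A real protocol would ask for retransmission here.
        raise ValueError("missing packets")
    return "".join(p["data"] for p in ordered)

packets = packetize("Resource sharing over redundant paths",
                    "171.64.64.64", "10.0.0.1")
random.shuffle(packets)          # packets may take different paths
print(reassemble(packets))
```

Because each packet carries its own number, the receiver can restore the order no matter which path each packet took.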

Later: four-level addresses: 171.64.64.64 (what is yours?)
Because address tables got too large, they are now cached at designated nodes: Domain Name Servers (DNS).
Three-level names: cs.stanford.edu, translated by Domain Name Servers (DNS).
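
A small sketch of the two kinds of addresses mentioned above. The lookup table is a stand-in for a name server, and pairing the two examples from these notes in it is an assumption for illustration only; a real lookup asks DNS.

```python
import ipaddress

def ip_levels(dotted):
    """Return the four numeric levels of an IPv4 address and its 32-bit value."""
    addr = ipaddress.IPv4Address(dotted)      # raises ValueError if malformed
    return [int(part) for part in dotted.split(".")], int(addr)

def name_levels(hostname):
    """Split a domain name into its levels, most general first."""
    return list(reversed(hostname.rstrip(".").split(".")))

# Hypothetical table standing in for a Domain Name Server.
NAME_TABLE = {"cs.stanford.edu": "171.64.64.64"}

def resolve(hostname):
    """Map a name to an address using the stand-in table; None if unknown."""
    return NAME_TABLE.get(hostname.lower().rstrip("."))

print(name_levels("cs.stanford.edu"))   # ['edu', 'stanford', 'cs']
print(ip_levels("171.64.64.64")[0])     # [171, 64, 64, 64]
print(resolve("CS.Stanford.EDU"))       # 171.64.64.64
```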

Transmission Services

Shared resource use required:

  1. Remote use of computers: TELNET protocol
  2. File transfer for remote execution: FTP protocol

These services provided a real user population on which the network could be tested; without realistic use and feedback the net would have been a pipe dream.

Email was not thought of initially; it was first accomplished via multiple FTP and TELNET operations. Later came a simplified protocol for email: SMTP.

HTTP: HyperText Transfer Protocol

HTML: HyperText Markup Language -- an instance of SGML

SGML: Standard Generalized Markup Language

                        from the 1970s, for paper document formatting on computers; also motivated by the military, designed at IBM

            SGML is a metalanguage. HTML is one choice; others could be the format for the Journal of Psychology, for Air Force reference manuals, for McGraw-Hill college books, etc.

Markup: tags for printer or computer presentation

Hypertext: a document with links to further relevant material. Look at a document: <A HREF="something else, somewhere else">what</A>

Current Technologies

Sharing among major backbone providers {MCI, SPRINT, Worldcom, UUnet, AT&T } of bandwidth resources by informal balancing of contributed capacity versus load experienced.

The Internet Protocol (IP) provides for interconnection of local area networks (LANs), connected by internetwork routers.
LANs often use the Ethernet protocol for locally wired networks with

  1. Carrier Sense Multiple Access (CSMA) and
  2. Collision Detection (CD) and
  3. exponential backoff

building on techniques developed originally for radio and satellite networks (ALOHAnet in Hawaii).

The Ethernet protocol is a broadcast protocol, versus a point-to-point protocol.
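
A rough sketch of exponential backoff after collisions on a shared (broadcast) medium. It assumes the limits used by classic Ethernet (the random range stops doubling after 10 collisions, and a station gives up after 16 attempts), and it simplifies by giving every station the same collision count. The function names are made up for this illustration.

```python
import random

def backoff_slots(collisions, rng=random, max_exponent=10):
    """Pick a random wait, in slot times, from a range that doubles per collision."""
    exponent = min(collisions, max_exponent)
    return rng.randrange(2 ** exponent)       # 0 .. 2**exponent - 1

def contend(stations=2, seed=1, max_attempts=16):
    """Stations that just collided retry until one picks the earliest slot alone.

    Returns (collisions so far, index of the winning station), or None if
    every attempt collided.
    """
    rng = random.Random(seed)
    collisions = 1
    while collisions <= max_attempts:
        waits = [backoff_slots(collisions, rng) for _ in range(stations)]
        earliest = min(waits)
        if waits.count(earliest) == 1:        # exactly one station goes first
            return collisions, waits.index(earliest)
        collisions += 1                       # tie: another collision, wider range
    return None

print(contend(stations=3))
```

The wider the random range, the less likely two stations pick the same slot again, which is why repeated collisions die out quickly under light load.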

 

Software development and testing

How did we get into this topic?

Unit testing: programmers test their own code. It is easy to miss errors, because the testing assumptions will match the authoring assumptions.

Alpha testing: components (or units) are integrated to become a product, and tested by a specialized group, unaware of the assumptions.

Beta testing: software is released to selected customers. Now actual practice, not assumptions, dominates.

After delivery: error reports come in and have to be filtered, because they are likely redundant or trivial. What is distilled from them can be used to fix the software, maybe in a later version.
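
A tiny illustration of why a programmer's own unit test can miss an error. The function and both tests are invented for this sketch; the point is only that the author's test shares the author's assumption (the list is never empty), while an outside tester does not.

```python
def average(values):
    """Author's unstated assumption: the list is never empty."""
    return sum(values) / len(values)

def unit_test():
    # Written by the author, so it exercises only the assumed case and passes.
    return average([2, 4, 6]) == 4

def outside_test():
    # A tester unaware of the assumption tries an empty list.
    try:
        average([])
    except ZeroDivisionError:
        return "fails on empty input"
    return "ok"

print(unit_test())      # True
print(outside_test())   # fails on empty input
```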

Markup  (Back to text)

Specify layout, type of print, bold, italic, size, headers, paragraph boundaries, tables, etc.

with otherwise invisible commands such as <B>boldface stuff</B>.

HTML

Hyper (multi-linked) Text (documents) Markup (with format annotations) Language. Used to mark up documents so they can be easily shown on a variety of computer devices, and to reference (HREF) local and remote documents and images. Remote documents require a computer address (http://www.somewhere.xxx) so they can be found.
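
A short sketch of how a program can read markup, using Python's standard html.parser module. The sample page and the class name are made up for illustration; a browser does much more, but the idea of reacting to tags is the same.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect HREF targets and the text marked as bold."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.bold = []
        self._in_bold = False

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "b":
            self._in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        if self._in_bold:
            self.bold.append(data)

page = ('<P>Some <B>boldface stuff</B> and a '
        '<A HREF="http://www.somewhere.xxx/doc.html">link</A>.</P>')
collector = LinkCollector()
collector.feed(page)
print(collector.links)   # ['http://www.somewhere.xxx/doc.html']
print(collector.bold)    # ['boldface stuff']
```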

Document Formats

Paper: arbitrarily structured/unstructured; physical order.
Books: somewhat structured/unstructured; layout order; metadata: ToC, index.
Tables: very structured. Exceptions are awkward (footnotes).
Databases: very structured. Machine processable, queryable. Exceptions awkward.

relational: table-based, links by references, join operator; unordered. Example: student |><| course-info
object-oriented: tree-based, structural (and optional reference) links; ordered (often)

SGML: for document printing, hierarchically structured; ordered
HTML: for document transmittal, varied presentation, hierarchically structured + links; ordered
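
A minimal sketch of the relational join (student |><| course-info) mentioned under Databases above. The rows, the field names, and the rooms are invented for this illustration.

```python
# Two unordered tables; rows are linked by a shared reference (the course code).
students = [
    {"student": "Ann", "course": "CS73N"},
    {"student": "Bob", "course": "CS99I"},
    {"student": "Cal", "course": "CS73N"},
]
course_info = [
    {"course": "CS73N", "room": "Room A"},   # hypothetical rooms
    {"course": "CS99I", "room": "Room B"},
]

def join(left, right, key):
    """Natural join on one shared column: pair every matching left and right row."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

for row in join(students, course_info, "course"):
    print(row)
```

Rows without a partner in the other table simply drop out of the result, which is one reason exceptions are awkward in very structured formats.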

Components

Three older inventions combined:

    1. Document markup for typesetting: SGML [IBM, Air Force, about 1975]. Markups are metadata for presentation (see HTML intro).
    2. Hypertext linkages to create a hierarchical document [Nelson, about 1960]. Uses hyperlinks: http://computer/directory/file+/entrypoint$ (see Regular expression syntax)
    3. Simplified FTP, with embedded site address (http://cs.stanford.edu/account/...), avoiding having to log in [Berners-Lee at CERN]; uses Internet-based addressing for remote documents
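
A sketch of taking a hyperlink apart into the pieces named above (computer, directory, file, entry point), using Python's standard urllib.parse. The sample address and the dictionary keys are made up for illustration.

```python
from urllib.parse import urlsplit

def describe(url):
    """Break a hyperlink into protocol, computer, path steps, and entry point."""
    parts = urlsplit(url)
    steps = [step for step in parts.path.split("/") if step]
    return {
        "protocol": parts.scheme,        # e.g. http
        "computer": parts.hostname,      # the site address embedded in the link
        "path": steps,                   # directories, then the file
        "entrypoint": parts.fragment,    # place within the document, if any
    }

print(describe("http://cs.stanford.edu/account/notes.html#growth"))
```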

Two Technologies:

    1. Ability to access documents remotely. FTP extension: Hypertext Transfer Protocol (HTTP), an FTP-like service that responds to markup entries.
    2. A browser [Mosaic, by Andreessen and Bina at the Univ. of Illinois NCSA center]. A browser program interprets HTML, fetched with HTTP, and integrates text, images, and remote references (hyperlinks).

and a business requisite

A community of high-energy physicists who

    1. benefited from rapid access to complex documents [at CERN], and
    2. had the computers on which the (free) browsers could be installed. [Marc Andreessen & Eric Bina at the Univ. of Illinois at Urbana-Champaign]

Browser competition [Clark-Netscape] [Gates-Microsoft]

Learn by reading and doing

Reading: Bring in a simple HTML web document (like this one), and see what it looks like

    1. in Netscape [View] [Page Source]
    2. in MS Internet Explorer [View] [Source]

If you look at a `commercial' web page you will find many markups that we won't have to care about. Make notes about the ones that puzzle you and discuss them in class. The essential ones are listed in our CS99I HTML notes.
Doing, indirectly: Create a document with, say, Microsoft Word, save it as HTML, and look at it.
Doing, directly: Create a document with HTML markups yourself, as shown in the notes, and then save it as text. It may be easiest to use a plain-text editor, such as WordPad or Notepad on PCs, or vi or Emacs on UNIX.

Change (rename) the file extension from .txt to .html, and then look at what you have created.

Role of HTML in e-commerce?

Advantages and Limits

Reliability
Readability
Processability
Granularity
-- (structure: word, line, paragraph, chapter, book )
-- (object: value, name-value pair, item, person, group, community ) with alternatives (family vs dorm)

Internet Service Providers (ISPs)

Multiple levels of providers:

  1. Local servers for LANs. Connect to enterprise networks or ISPs. Buy bandwidth from ISPs for external use. 
  2. Enterprise networks (in large companies, universities) or small ISPs provide shared servers, DNS, routers and linkages among their LANs. Buy bandwidth from major ISPs for external use.
  3. Local Internet service providers (ISPs). Buy bandwidth from regional providers for regional - wide area  (WAN) use.
  4. Regional Internet services {BARRNet, Los Nettos, (what was yours in your home town?) }. Buy bandwidth from backbone providers for long-range use.

Backbone linkages: wide, transcontinental links (leased from the phone companies, such as MCI etc.). Trade bandwidth, portal demands informally. {MCI, SPRINT, Worldcom, UUnet, AT&T }

Reading: CS99 chapter about the Internet.

Research project requirement

Heilmeier's Catechism:

What is the problem, why is it hard? How is it solved today? What is the new technical idea; why can we succeed now? What is the impact if successful? How will the program be organized? How will intermediate results be generated? How will you measure progress? What will it cost?

[George Heilmeier, ex ARPA director, then TI, GE Aero, then at Bellcore, retired?]

Growth

Expectation versus reality: Powerpoint Figures or HTM Figures, some slides from 1998.

Excessive expectations at the point of visibility.

Insufficient realization of the concomitant speed of social change: a 10-20 year, generational, cycle.

Primary effect - many high risk-taking startups -- much upside promise, little downside risk of failure.

Secondary effect (not recognized in 1998): orders brought to infrastructure companies were in the aggregate in excess of what a realistic market could absorb, but it's very hard to say no to your salesmen.

Now we have much available infrastructure, some ready to be used, such as fiber in the ground.