This part of the course introduces Web science. The course has no prerequisites and it is targeted to non-technical undergraduates.

Web science, an emerging interdisciplinary field, takes the Web as its primary object of study. This study incorporates both the social interactions enabled by the Web's design and the applications that support them. The Web is often studied at the micro scale, as an infrastructure of protocols, programming languages, and applications. However, it is the interaction of human beings creating, linking, and consuming information that generates the Web's behavior as emergent properties at the macro scale. These properties often generate surprising properties that require new analytic methods to be understood.

For example, when Mosaic, the first popular Web browser, was released publicly in 1992, the number of users quickly grew by several orders of magnitude, with more than a million downloads in the first year. The wide deployment of Mosaic led to a need for a way to find relevant material on the growing Web, and thus search became an important application, and later an industry, in its own right. The enormous success of search engines has inevitably yielded techniques to game the algorithms (an unexpected result) to improve search rank, leading, in turn, to the development of better search technologies to defeat the gaming. More recent macro-scale examples include photo-sharing on Flickr, video-uploading on YouTube, and social-networking sites like Instagram, Facebook and Bonfyre (a growing social network startup based in St.Louis, MO).

The essence of Web science is to understand (see e.g., this video) and to design (see this other video) systems to produce the effects we want.

Given the breadth of the Web and its inherently multi-user (social) nature, its science is necessarily interdisciplinary, involving at least mathematics, computer science, sociology, psychology, and economics.

Micro vs. Macro: Themes of Web science

We focus on these themes as they apply to Web builders – people who contribute links and other content to the Web:

Some Web builders consider themselves Web developers; others consider themselves bloggers; others merely post an occasional comment on someone else's blog or discussion forum. We say “Web builder” to encompass the full spectrum of people who contribute links and other content to the Web.

Our lab curriculum provides an informal hands-on approach to the task of building a Web site. The web science material (which you are now reading) explain large scale Web phenomena and other interesting web algorithms and dynamics; for example, the Amazon recommendation algorithm and the Google PageRank algorithm.

The sociology, psychology, and economics of this course follow Duncan Watts’ Six Degrees, which we recommend as a narrative companion to our own material. Our complete suggested reading list is below.

Online safety

Networks

Sets Explicit Notation for Sets rightarrow

Cardinality rightarrow

Subsets

Venn Diagrams rightarrow

Union and Intersection rightarrow

Ordered Lists rightarrow

Implicit Notation for Sets rightarrow

Logical Expressions rightarrow

Compound expressions with “or” rightarrow

Compound expressions with “and” rightarrow

Union and intersection defined formally rightarrow

Similarity of Sets

Graphs Undirected and Directed rightarrow

Neighborhood and Degree rightarrow

Density and Average Degree rightarrow

Paths in undirected graphs defined formally rightarrow

Paths in directed graphs rightarrow

Length

Distance

Network Structure

Connected: a word of many meanings rightarrow

Induced Subgraphs rightarrow

“Connected” defined formally rightarrow

Connected graphs and connected components rightarrow

Hubs

Clusters

Defining clusters rightarrow

Connected components and Cliques

Network Dynamics

Limitations of traditional graph theory rightarrow

Introduction to network dynamics rightarrow

Three models of dynamic graphs rightarrow

Random graphs rightarrow

Demonstration of random graph dynamics rightarrow

Random graph algorithm rightarrow

Clusters and homophily rightarrow

Triadic closure rightarrow

Triadic closure algorithm rightarrow

Hubs and cumulative advantage rightarrow

Preferential attachment algorithm

Variables, Probability, and Scale-Free Networks

Understanding that the Web is a scale-free network requires some probability theory:

Variables in mathematics rightarrow

Variables in algorithms rightarrow

Random variables rightarrow

Discrete vs. continuous variables rightarrow

Probability distributions rightarrow

Degree distributions

Information and Computation

Information, computation, and algorithms rightarrow

Summation: an example of what computation is rightarrow

HTML: an example of what computation is not rightarrow

Computing distance, part one: Information diffusion rightarrow

Computing distance, part two: Example rightarrow

Computing distance, part three: Algorithm

Collaborative Filtering

“Expert opinions” without the experts rightarrow

Tuples: content of CF rightarrow

Bipartite graphs: structure of CF rightarrow

Structural equivalence: computation of CF rightarrow

The four steps of collaborative filtering

The Long Tail

Macro-analytic view of collaborative filtering rightarrow

Power law revisited rightarrow

Niches, megahits, and the neglected middle rightarrow

Macro-analytic view of the long tail rightarrow

Macro view of Web programming

Influence in Networks

Popularity, influence, and centrality rightarrow

Introduction to PageRank rightarrow

NetRank: a simplified version of PageRank rightarrow

Normalization and convergence rightarrow

The NetRank algorithm rightarrow

Dividing by outdegree: the NR* formula rightarrow

The PageRank formula rightarrow

The damping factor: PageRank as probability

Competition and Cooperation

Dynamics of popularity and influence rightarrow

PageRank competition rightarrow

Doing the right thing rightarrow

Mutually assured construction rightarrow

Authority, reciprocity, reputation rightarrow

Game theory rightarrow

Winners’ dilemma

CSCI-1080 Intro To CS: World Wide Web

Saint Louis University
Department of Computer Science

Web Science Preface

Micro vs. Macro: Themes of Web science

Online safety

Networks

Network Structure

Network Dynamics

Variables, Probability, and Scale-Free Networks

Information and Computation

Collaborative Filtering

The Long Tail

Influence in Networks

Competition and Cooperation

CSCI-1080 Intro To CS: World Wide Web

Saint Louis University Department of Computer Science

Web Science Preface

Micro vs. Macro: Themes of Web science

Online safety

Networks

Network Structure

Network Dynamics

Variables, Probability, and Scale-Free Networks

Information and Computation

Collaborative Filtering

The Long Tail

Influence in Networks

Competition and Cooperation

Saint Louis University
Department of Computer Science