CSCI-1080 Intro To CS: World Wide Web

Saint Louis University, Department of Computer Science

Graph Theory

A graph is a formal mathematical representation of a network (“a collection of objects connected in some fashion”).

6 nodes graph 

Each object in a graph is called a node (or vertex). Corresponding to the connections (or lack thereof) in a network are edges (or links) in a graph. Each edge in a graph joins two distinct nodes.

More formally, we define a graph G as an ordered pair G = (V,E) where

  • V is a set of nodes (vertices).

  • E is a set of edges (links).

  • Each edge is a pair of vertices.

In other words, each element of E is a pair of elements of V. Example: The picture above represents the following graph:

  • V = {1,2,3,4,5,6}

  • E = {{1,2},{1,5},{2,3},{2,5},{3,4},{4,5},{4,6}}

  • G = (V,E)

Loops: An edge with identical endpoints is a loop: We disallow loops: any graph G=(V,E) by definition has no loops in its set of edges E.

Undirected and Directed

Undirected graph: The edges of a graph are assumed to be unordered pairs of nodes. Sometimes we say undirected graph to emphasize this point. In an undirected graph, we write edges using curly braces to denote unordered pairs. For example, an undirected edge {2,3} from vertex 2 to vertex 3 is the same thing as an undirected edge {3,2} from vertex 3 to vertex 2.

Directed graph: In a directed graph, the two directions are counted as being distinct directed edges. In an directed graph, we write edges using parentheses to denote ordered pairs. For example, edge (2,3) is directed from 2 to 3 , which is different than the directed edge (3,2) from 3 to 2. Directed graphs are drawn with arrowheads on the links, as shown below:

3 nodes network 

Neighborhood and Degree

Two vertices are called adjacent if they share a common edge, in which case the common edge is said to join the two vertices. An edge and a vertex on that edge are called incident.

See the 6-node graph for examples of adjacent and incident:

6 nodes graph 
  • Nodes 4 and 6 are adjacent (as well as many other pairs of nodes)

  • Nodes 1 and 3 are not adjacent (as well as many other pairs of nodes)

  • Edge {2,5} is incident to node 2 and node 5.

  • The neighborhood of a vertex v in a graph G is the set of vertices adjacent to v. The neighborhood is denoted N(v). The neighborhood does not include v itself. For example, in the graph below N(5) = {4,2,1} and N(6) = {4}.

The degree of a vertex is the total number of vertices adjacent to the vertex. The degree of a vertex v is denoted deg(v). We can equivalently define the degree of a vertex as the cardinality of its neighborhood and say that for any vertex v, deg(v) = |N(v)|.

The following is a vertex degree table of the graph above.

Vertex ID Vertex Degree
1 2
2 3
3 2
4 3
5 3
6 1

In a directed graph, we define degree exactly the same as above (and note that “adjacent” does not imply any direction or lack of direction).

In a directed graph it is important to distinguish between indegree and outdegree. Recall that any directed edge has two distinct ends: a head (the end with an arrowhead) and a tail. Each end is counted separately. The sum of head endpoints count toward the indegree of a vertex and the sum of tail endpoints count toward the outdegree of a vertex.

directed graph 

The directed graph above has the following degrees, indegrees, and outdegrees:

Vertex ID Vertex Degree Indegree Outdegree
1 2 0 2
2 4 3 1
3 1 0 1
4 2 1 1
5 1 1 0

Density and Average Degree

The density of a graph G = (V,E) measures how many edges are in set E compared to the maximum possible number of edges between vertices in set V.

Density is calculated as follows: An undirected graph has no loops and can have at most |V| * (|V| - 1) / 2 edges, so the density of an undirected graph is 2 * |E| / (|V| * (|V| - 1)). A directed graph has no loops and can have at most |V| * (|V| - 1) edges, so the density of a directed graph is |E| / (|V| * (|V| - 1)).

The average degree of a graph G is another measure of how many edges are in set E compared to number of vertices in set V. Because each edge is incident to two vertices and counts in the degree of both vertices, the average degree of an undirected graph is 2*|E|/|V|.

Paths

A path in a graph represents a way to get from an origin to a destination by traversing edges in the graph. For example, in the undirected graph G=(V,E) drawn below, there are many paths from node 6 to node 1. One such path is highlighted with red:

path1 

We write a path P as an ordered list of directed edges: P = ((v1,v2),(v2,v3),...,(vk,vk+1)). The first vertex of the first edge of a path is the origin and the second vertex of the last edge is the destination. Both origin and destination are called endpoints of the path.

Examples:

  • The red path above is ((6,4), (4,3), (3,2), (2,5), (5,1)); it is a path in G from node 6 to node 1

  • The blue path below is ((6,4), (4,5), (5,1)); it is also a path in G from node 6 to node 1

path2 

Paths in undirected graphs defined formally

Even in an undirected graph (such as the 6-node drawing above), we define a path as an ordered list of directed edges: P = ((v_1,v_2),(v_2,v_3),dots,(v_k,v_{k+1})). The strict ordering of nodes in the definition of a path corresponds to the order in which nodes and edges are traversed to get from the origin to the destination.

Let us suppose we have an undirected graph G=(V,E) and an ordered list of directed edges P=(e_1, e_2,ldots, e_k). P is a path in G only if the following are true of every directed edge e_i=(v_i,v_{i+1}) in P:

Directed edge e_i in P corresponds to an undirected edge in E.

In other words, {v_i,v_{i+1}} in E.

If directed edge e_i is not the last edge in P, then let e_{i+1} be the next edge in P. Directed edge e_{i+1} starts with a tail node identical to the head node of e_i.

This means that e_i=(v_i,v_{i+1}) and e_{i+1}=(v_{i+1},v_{i+2}).

The final requirement for P to be a path in G is that P cannot revisit any nodes. An ordered list of directed edges that meets the above two requirements and that does revisit nodes is called a walk. Examples of walks that are not paths, based on the 6-node graph drawn above:

  • W = ((6,4), (4,3), (3,2), (2,5), (5,4), (4,3), (3,2), (2,5), (5,1)) is a walk from node 6 to node 1

  • W = ((6,4), (4,6), (6,4), (4,5), (5,1), (1,5), (5,1)) is a walk from node 6 to node 1

A walk does not have to revisit nodes. Therefore, any path by definition is also a walk.

Paths in directed graphs

A directed path is a path in a directed graph where the directions of edges in the path match the directions of edges in the directed graph. For example, suppose we consider this directed graph:

path4 

The blue lines highlight P = ((6,4), (4,5), (5,1)) which is a directed path from 6 to 1 in directed graph G.

As a counter-example, consider the same directed graph with a different ordered list of edges that is not a directed path:

path4 

In this case, the red lines highlight P = ((6,4), (4,3), (3,2), (2,5), (5,1)); the 4^{th} edge of P – (2,5) – goes the “wrong way”; and so P is not a directed path in G. We instead refer to P as an undirected path in a directed graph. Information Today's “Bow Tie Structure of the Web” article puts it this way: “In an undirected path, hyperlinks can be followed forward or backward, a technique available to search engine spiders but not to a person using a Web browser.”

Length

The length of a path is the number of edges that it uses. Note that this definition is merely a restatement of the definition of the length of an ordered list.

Examples:

  • The length of the red path below is 5

  • The length of the blue path below is 3

red and blue paths 

Distance

Given a graph G, The distance d(x,y) between two vertices x and y is the length of the shortest path from x to y, considering all possible paths in G from x to y.

The distance between any node and itself is 0. If there is no path from x to y then d(x,y) is infinity.

Examples:

  • In the preceding graph, the distance from 6 to 1 is 3.

  • There is no path from 6 to 1 shorter than 3.

  • There is only one path from 6 to 1 with length 3, and that path is ((6,4), (4,5), (5,1)).

  • There are many longer paths from 6 to 1, none of which has any bearing on the distance from 6 to 1.

  • In the graph below, the distance from b to c is 2.

  • There is no path from b to c shorter than 2.

  • There are multiple paths from b to c of length 2. One such path is ((b,f), (f,c)). There are also many longer paths from b to c.

  • In the graph below, the distance from a to b is infinity.

  • There is no path from a to b.

red and blue paths 

The above examples all rely on small graphs where a simple visual inspection reveals what is the shortest path between any specified pair of nodes. For larger graphs, it is generally not possible to discern shortest paths so easily. Computer science provides algorithms, such as the breadth-first search algorithm, to compute shortest paths (and hence distances) in a graph of any size.

For now, we will limit our calculations of shortest paths to small graphs where visual inspection is easy and a formal algorithmic approach is not necessary.


The contect of this site is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.