CSCI-1080 Intro To CS: World Wide Web

Saint Louis University, Department of Computer Science

Network Structure

“Connected”: a word of many meanings

The word “connected” speaks to the most basic structural properties of networks. It is arguably both the most important and the most overused term in the network vocabulary.

Important and proper uses of “connected”:

Connected has several distinct formal definitions, each of which is important. We introduce three of them here and elaborate the details later.

A path connects its origin and destination nodes
Two nodes are connected if there is at least one path that connects them
A graph is connected if each of its nodes is connected to all the others

Examples:

Nodes 6 and 2 are connected in graph G1 below
Nodes a and b are not connected in graph G2 below
Graph G1 is connected; graph G2 is not connected

Graph G1 = (V1,E1)

Graph G2 = (V2,E2)

Common sloppy uses of “connected”:

Sometimes “connected” is used informally when another term would be more clear. For example, avoid using “connected” when one of these unambiguous words would apply:

Joined
Adjacent
Incident

Induced Subgraphs

It is often useful to consider only part of a graph. Induced subgraphs are one particularly convenient way to define a specific sub-part of a larger graph.

Given a graph G=(V,E) , an induced subgraph is defined by a subset of vertices (i.e., S subseteq V ). Then

The nodes of the subgraph induced by are simply the nodes in , and
The edges of the subgraph induced by are all edges in that have both endpoints in .
We can write the induced subgraph formally as , where is the set of edges defined above (edges in that have both endpoints in ).

Example 1: Consider the graph G=(V,E) drawn below.

Original graph

The subgraph induced by the subset of nodes S={1, 2, 3, 4} can then be drawn:

Induced subgraph

Example 2: Let G=(V,E) be the Facebook friends network:

V = {x: x has a Facebook account}
E = {{x,y}: x and y are Facebook friends}

This is a big network: if we want to draw it, we must consider a small subgraph.

“Connected” defined formally

Earlier we stated that:

A path connects its origin and destination nodes
Two nodes are connected if there is at least one path that connects them
A graph is connected if each of its nodes is connected to all the others

“Connected nodes” defined formally:

Given an undirected graph G=(V,E) , two nodes x and y (i.e., x in V and y in V ) are connected if and only if one of the following is true:

x=y, or
there is at least one path in G with origin x and destination y

The most important clarification in our formal definition of connected nodes is that any node is by definition connected to itself.

Connected graphs and connected components

A connected graph (as defined above) is said to consist of a single connected component. If a graph is not connected, then it consists of two or more connected components.

The book Six Degrees offers this intuitive metaphor for connected component:

“Suppose the nodes of a graph are buttons and the edges of the graph are threads joining the buttons. The buttons and threads are all on the floor. Grab one button and lift it. If that button has any threads, then other buttons will start to rise in addition to the one button you are holding. Each additional button that rises off the floor may have even more threads that cause even more buttons to rise. Keep lifting until every button connected to the one in your hand (by any combination of threads) is off the floor. The buttons and threads that have risen off the floor are a connected component.”

The following examples illustrate the idea of connected component.

Graph G1 is connected and therefore has 1 connected component:

Graph G2 has 2 connected components:

Graph G3 has 3 connected components:

A connected component can be defined as an induced subgraph. For example, G_3 (above) consists of three connected components:

the subgraph induced by {1,2,5,6}
the subgraph induced by {3,4}
the subgraph induced by {7,8}

Hubs

A hub is a node in a graph with much higher degree than average. For example, nodes 1 and 2 are both hubs in the figure below:

Clusters

Informally speaking, a cluster is a group of nodes in a graph that are more connected to each other than they are to the rest of the graph. For example, the red and yellow regions below are clusters:

Defining clusters, part one: connected components

The following sections present a “semi-formal” definition of clusters in three parts:

First, we formally define connected component
Second, we introduce and formally define clique
Third, we informally define cluster based on the above two formal definitions

Given an undirected graph G=(V,E) , we define a connected component to be a subgraph induced by node set Ssubset V i.e., G_S = (S,E_S) , with the following two properties:

is connected
There is no edge in E that joins a node in S to a node not in S

The above 2 mathematical properties of a connected component translate into the button-and-thread metaphor as follows:

G_S is connected: The only buttons that rise off the floor do so because of the one button you are lifting and the threads that ultimately connect that button to other buttons (i.e., paths)

There is no edge in E that joins a node in S to a node not in S: You have lifted the one button in your hand high enough so that no more buttons will come off the floor no matter how much higher you lift the button you are holding.

For example:

The graph G_2 (drawn above and again below) is not a connected component because it violates property : it is not connected.

Graph G2 = (V2,E2)

The subgraph induced by S={c,d,e,f} (drawn with dark nodes and edges below) is not a connected component because it violates property . There are edges in E that join nodes in S to node b, which is a node not in S.

Defining clusters, part two: cliques

In an undirected graph G=(V,E) a clique is a subgraph induced by node set Ssubseteq V i.e., G_S = (S,E_S ) such that the density of G_S is 1. Put another way, in a clique, every pair of nodes is adjacent.

Example: Consider undirected graph G=(V,E) drawn below.

The following are cliques:

The subgraph induced by {c,d,e,f}
The subgraph induced by {b,f}

The following are not cliques:

The subgraph induced by {a,b,c}
The subgraph induced by {b,c,d,e}, which is drawn below. The subgraph is not a clique since its density is less than 1.

The subgraph induced by {c,d,e,f} is the largest clique in G: the clique with the most nodes. Usually the largest clique in a graph is the most interesting clique in that graph.

Defining clusters, part three

Our definition of cluster informally draws upon our formal definitions of connected components and cliques.

In an undirected graph G=(V,E) a cluster is a subgraph induced by node set S subseteq V i.e., G_S = (S,E_S) with the following two properties:

The average degree of is “relatively high”; (a relaxed adaptation of clique-ness)
There are “relatively few” edges in E that join a node in S to a node not in S; (a relaxed adaptation of connected component-ness)

Example: In the graph G=(V,E) drawn below, the following subsets of nodes induce subgraphs that can fairly be called clusters:

Subsets of nodes:

{1,2,3,4,5,7}
{20,21,22,23,24,25}
{14,15,16}

Corresponding clusters:

Not all connected components are clusters. For example:

In the graph above, the subgraph induced by {9,10,11,12,13} is a connected component, but it is arguably not a cluster, because the average degree of that induced subgraph is relatively low.

The subgraph induced by {6} is a connected component, but it is definitely not a cluster, because the average degree of the induced subgraph is zero.

Not all cliques are clusters – even relatively large cliques. For example:

The subgraph induced by {1,2,3,4} is a clique, but it is not a cluster because every single one of those nodes is adjacent to node 5; there are too many edges joining nodes in {1,2,3,4} to nodes not in {1,2,3,4}.

Even the largest clique in the above graph –the subgraph induced by {1,2,3,4,5}– is arguably still not a cluster because node 7 is adjacent to so many nodes in {1,2,3,4,5}.

The contect of this site is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.