CSCI-1080 Intro To CS: World Wide WebSaint Louis University, Department of Computer Science
|
Example: The result of rolling a 6-sided die is a random variable. Each time the die is rolled, the observed value may change. Any time we roll the die, any of the values 1, 2, 3, 4, 5, or 6 is equally likely to be the result.
Example: Suppose we choose one person from all the grown adults in the United States (each person being equally likely) and measure his or her height. The result is a random variable. Unlike the previous examples, some values are much more likely to occur than others. We would expect fairly often to observe a person's height feet 8 inches. By comparison, we would expect very rarely to observe a person's height feet.
Any variable may be discrete or continuous. In simplified terms the distinction is
A discrete variable has a finite set of possible values.
A continuous variable has an infinite set of possible values.
Examples:
Discrete: x is a positive integer no more than 10.
Discrete: x is a Web page.
Continuous: x is a decimal between 0 and 1. (There are infinitely many of these, e.g., 0.4901352…)
The set of all possible values for a variable is often called its domain. Our primary domains of interest are finite (e.g., people, Web pages). Hence we focus on discrete variables, using continuous variables when they allow us to simplify things.
(Readers wanting an unsimplified distinction of discrete and continuous should look up “countable” and “uncountable.”)
Any random variable has a corresponding probability distribution that describes the likelihood of observing different values of that random variable.
Example: The most famous probability distribution is the normal distribution (or “bell curve”), which is continuous:
Normal distribution |
The height of grown adults follows a normal distribution. The x-axis (or “k”) in this figure above indicates possible height values; the y-axis (or “p(k)”) indicates the likelihood of observing a person of height k. The curve indicates that most people are average height; relatively few are short or tall; and no one is radically shorter or taller than average. (E.g., no one is 6 times taller than average.)
Example: Another common probability distribution is the uniform distribution, in which all possible values are equally likely. The uniform distribution below describes the process of rolling one die, which is discrete:
Uniform distribution. |
The set of all possible outcomes is {1,2,3,4,5,6}; each outcome has an equal probability of occurring, and the total probability of all possible outcomes must equal 1. Therefore:
p(k) = 1/6 if
p(k) = 0 if
Example: The indegree of Web pages follows the power law distribution:
Power law distribution. |
Most pages have low indegree, hence p(k) is large when k is small. A small but significant number of Web pages have extremely high indegree. These are the hubs of the Web graph. Sites such as Amazon and Wikipedia are notable hubs on the Web; each has indegree that is many thousands of times higher than average.
Note that the power law distribution is continuous, but Web pages and indegrees are discrete. Using the continuous power law distribution simplifies our analysis of the indegree of Web pages.
Suppose we choose one node from an undirected graph (all nodes being equally likely) and measure the degree of that chosen node. The result is a random variable, and the probability distribution that describes that random variable is called the degree distribution of the graph.
Example: Consider graph drawn below. We calculate the degree distribution for this graph with the following 5 steps:
Measure the degree of each node in the graph–the “census.”
Find all the distinct degree values that result from the census.
For each distinct degree value, count how often it occurs in the census.
Divide the total number of occurrences of each degree value by the total number of nodes.
The resulting numbers represent the likelihood of each possible degree value: Draw the probability distribution.
Let us compute the degree distribution of this graph |
Step 1:
Vertex ID | Vertex Degree |
1 | 3 |
2 | 3 |
3 | 3 |
4 | 2 |
5 | 2 |
6 | 0 |
7 | 1 |
8 | 5 |
9 | 2 |
10 | 1 |
Step 2, 3 and 4:
Degree value | How oftern occurs | Divided by |V| |
0 | 1 | 1/10 |
1 | 2 | 2/10 |
2 | 3 | 3/10 |
3 | 3 | 3/10 |
4 | 0 | 0/10 |
5 | 1 | 1/10 |
Step 5:
Example: We use the same 5 steps to calculate the degree distribution of the graph drawn below. Notice that (1) the graph has a hub-and-spoke structure, and (2) the degree distribution of the graph is a rough discrete approximation of the power law distribution.
Step 1:
Vertex ID | Vertex Degree |
1 | 3 |
2 | 6 |
3 | 1 |
4 | 2 |
5 | 10 |
6 | 2 |
7 | 7 |
8 | 1 |
9 | 1 |
10 | 2 |
11 | 0 |
12 | 1 |
13 | 1 |
14 | 1 |
15 | 1 |
16 | 1 |
17 | 1 |
18 | 1 |
19 | 1 |
20 | 1 |
Step 2, 3 and 4:
Degree value | How oftern occurs | Divided by |V| |
0 | 1 | 1/20 |
1 | 12 | 12/20 |
2 | 3 | 3/20 |
3 | 1 | 1/20 |
4 | 0 | 0/20 |
5 | 1 | 1/20 |
6 | 1 | 1/20 |
7 | 0 | 0/20 |
8 | 0 | 0/20 |
9 | 0 | 0/20 |
10 | 1 | 1/20 |
Step 5:
The contect of this site is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.