Assignment #7: Pipelining and Hazards, and Cache and Memory Hierarchies

Overview

Topics: Pipelining and Data Dependencies and Hazards, Memory Hierarchies, Average Memory Access Time, and Cache Organization
Related Reading: Sections 4.5, 6.2-6.4, and class notes


Practice Problems

Practice problems from the textbook (answers are at the end of the chapter):


Problems to be Submitted (25 points)

When you turn in your assignment, you must include a signed cover sheet (PDF version). Your assignment will not be graded without a completed cover sheet.

You may submit your assignment via email, but if you do, you must bring a hard copy of your assignment, along with a completed cover sheet, to the instructor at the next class. (Note: Do not email the instructor any .zip file attachments. SLU's email may not accept these emails, so the instructor may not receive your message.)

  1. (6 points)
    The following table gives the parameters for a number of different caches. For each 
    cache, fill in the missing fields in the table. Recall that m is the number of
    physical address bits, C is the cache size (number of data bytes), B is the block size 
    in bytes, E is the associativity, S is the number of cache sets, t is the number of
    tag bits, s is the number of set index bits, and b is the number of block offset bits.
    
    Cache  m   C     B   E          S              t              s              b
    
    1.     32  1024  4   4    _____________  _____________  _____________  _____________
    
    2.     32  1024  4   256  _____________  _____________  _____________  _____________
    
    3.     32  1024  8   1    _____________  _____________  _____________  _____________
    
    4.     32  1024  8   128  _____________  _____________  _____________  _____________
    
    5.     32  1024  32  1    _____________  _____________  _____________  _____________
    
    6.     32  1024  32  4    _____________  _____________  _____________  _____________
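
The relationships among the parameters in Problem 1 can be sketched in a few lines of Python. This is only an illustrative helper, not part of the assignment: the function name cache_params is hypothetical, it assumes every size is a power of two, and the sample cache is deliberately not one of the rows in the table.

```python
def cache_params(m, C, B, E):
    """Return (S, t, s, b) for a cache with m address bits, C data bytes,
    B-byte blocks, and E lines per set (all sizes assumed powers of two)."""
    S = C // (B * E)           # number of sets, from C = S * E * B
    s = S.bit_length() - 1     # set index bits: log2(S)
    b = B.bit_length() - 1     # block offset bits: log2(B)
    t = m - (s + b)            # the remaining address bits form the tag
    return S, t, s, b

# Hypothetical cache (not a row of the table): m=32, C=2048, B=16, E=2
print(cache_params(m=32, C=2048, B=16, E=2))   # (64, 22, 6, 4)
```

Checking a hand-computed row this way is a quick sanity test: s + b + t should always equal m.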
    
  2. (12 points)
    Suppose we have a system with the following properties:
    • The memory is byte addressable.
    • Memory accesses are to 1-byte words (not to 4-byte words).
    • Addresses are 13 bits wide.
    • The cache is four-way set associative (E = 4), with a 4-byte block size (B = 4)
      and eight sets (S = 8).
    
    Consider the following cache state. All addresses, tags, and values are given in
    hexadecimal format. The Index column contains the set index for each set of four
    lines. The Tag columns contain the tag value for each line. The V columns contain
    the valid bit for each line. The Bytes 0–3 columns contain the data for each line,
    numbered left-to-right starting with byte 0 on the left.
    

    Part 1:

    A. What is the size (C) of this cache in bytes?

    B. The box that follows shows the format of an address (one bit per box).
    Indicate (by labeling the diagram) the fields that would be used to determine
    the following:

    CO The cache block offset
    CI The cache set index
    CT The cache tag

    12 11 10 9 8 7 6 5 4 3 2 1 0

    Part 2:
    Suppose that a program using the cache above references the 1-byte
    word at address 0x071A. Indicate the cache entry accessed and the cache byte
    value returned in hex. Indicate whether a cache miss occurs. If there is a cache
    miss, enter "MISS" for "Cache byte returned". Hint: Pay attention to those valid bits!

    A. Address in binary (one bit per box):

    12 11 10 9 8 7 6 5 4 3 2 1 0

    B. Memory Reference

    Parameter Value
    Cache Offset (CO)  0x
    Cache Index (CI)  0x
    Cache Tag (CT)  0x
    Cache Hit? (Y/N)
    Cache Byte Returned    0x

    Part 3:
    Suppose that a program using the cache above references the 1-byte
    word at address 0x13DE. Indicate the cache entry accessed and the cache byte
    value returned in hex. Indicate whether a cache miss occurs. If there is a cache
    miss, enter "MISS" for "Cache byte returned". Hint: Pay attention to those valid bits!

    A. Address in binary (one bit per box):

    12 11 10 9 8 7 6 5 4 3 2 1 0

    B. Memory Reference

    Parameter Value
    Cache Offset (CO)  0x
    Cache Index (CI)  0x
    Cache Tag (CT)  0x
    Cache Hit? (Y/N)
    Cache Byte Returned    0x

    Part 4:
    Suppose that a program using the cache above references the 1-byte
    word at address 0x16C9. Indicate the cache entry accessed and the cache byte
    value returned in hex. Indicate whether a cache miss occurs. If there is a cache
    miss, enter "MISS" for "Cache byte returned". Hint: Pay attention to those valid bits!

    A. Address in binary (one bit per box):

    12 11 10 9 8 7 6 5 4 3 2 1 0

    B. Memory Reference

    Parameter Value
    Cache Offset (CO)  0x
    Cache Index (CI)  0x
    Cache Tag (CT)  0x
    Cache Hit? (Y/N)
    Cache Byte Returned    0x
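
The address breakdown used in Parts 2 through 4 can be sketched as follows. This is a minimal illustration under stated assumptions: split_address is a hypothetical helper, B and S are assumed to be powers of two, and the sample address 0x0E34 is made up rather than taken from the problem.

```python
def split_address(addr, B, S):
    """Split an address into (CT, CI, CO) for B-byte blocks and S sets.
    Assumes B and S are powers of two."""
    b = B.bit_length() - 1          # block offset bits
    s = S.bit_length() - 1          # set index bits
    co = addr & (B - 1)             # low b bits: block offset
    ci = (addr >> b) & (S - 1)      # next s bits: set index
    ct = addr >> (b + s)            # remaining high bits: tag
    return ct, ci, co

# Hypothetical address, using this problem's geometry (B = 4, S = 8)
ct, ci, co = split_address(0x0E34, B=4, S=8)
print(f"CT=0x{ct:X} CI=0x{ci:X} CO=0x{co:X}")   # CT=0x71 CI=0x5 CO=0x0
```

A reference hits only if some line in set CI holds tag CT and has its valid bit set. The byte returned is then the one at position CO within that line.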
  3. (6 points)

    A given computer system has the following cache access times:

    1. Program A has the following miss rates:

      • L1 cache: 4%
      • L2 cache: 30%
      • L3 cache: 50%
      • Main memory: (assume it always hits in main memory; i.e. miss rate is 0%)

      1. What are the hit rates for each level?

      2. What is the average memory access time (AMAT) for Program A? Define the AMAT to be:

        AMAT = (Hit Time) + (Miss Rate in L1)*(Average Miss Penalty)

        Where the Average Miss Penalty is defined for each level to be:

        AMP = (Hit Time at that level) + (Miss Rate at that level)*(Average Miss Penalty at the previously computed level, i.e. the next level down the hierarchy)

        Hint: Start by computing the AMP for memory.

    2. Program B has the following hit rates:

      • L1 cache: 93%
      • L2 cache: 80%
      • L3 cache: 65%
      • Main memory: (assume it always hits in main memory; i.e. hit rate is 100%)

      1. What are the miss rates for each cache level?

      2. What is the average memory access time (AMAT) for Program B?

    3. Oftentimes, the program with the lower L1 cache miss rate has the better (lower) average memory access time (AMAT). Is that true in this case? If not, why?

    4. For Program A, what speedup would be achieved if you were able to optimize the program to achieve an 85% hit rate in L2 cache?
           Note: "Speedup" refers to the ratio of the original AMAT (with a 30% L2 miss rate) to the new AMAT (with an 85% L2 hit rate).

    5. For Program B, if the L1 access time were increased from 3 cycles to 4 cycles (during architecture design), what minimum hit rate would be needed in the L1 cache to achieve the same AMAT?
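
The AMAT recursion defined in Problem 3 can be sketched as a short loop that starts at main memory and works back up to L1, as the hint suggests. This is only an illustration: the function name amat is hypothetical, and the hit times and miss rates below are made-up values, not the access times or program statistics from this assignment.

```python
def amat(hit_times, miss_rates):
    """Average memory access time for a hierarchy listed from L1 to memory.
    hit_times and miss_rates are in the same order; memory's miss rate is 0."""
    amp = 0.0
    # Walk from main memory back up to L1:
    # AMP(level) = hit time + miss rate * AMP(next level down)
    for hit_time, miss_rate in zip(reversed(hit_times), reversed(miss_rates)):
        amp = hit_time + miss_rate * amp
    return amp   # the value computed for L1 is the AMAT

# Hypothetical hierarchy (cycles): L1, L2, L3, main memory
hit_times = [2, 10, 40, 200]
original = amat(hit_times, [0.10, 0.50, 0.25, 0.0])
improved = amat(hit_times, [0.10, 0.25, 0.25, 0.0])   # lower L2 miss rate
speedup = original / improved                          # old AMAT / new AMAT
print(original, improved, round(speedup, 3))
```

With these made-up numbers the penalties work out to 200 cycles for memory, 90 for L3, and 55 for L2, giving an AMAT of about 7.5 cycles. Halving the L2 miss rate brings it to about 5.25 cycles, a speedup of roughly 1.43.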