Homework 7
comp 125-609, Goldwasser
Due: 6:00pm Tuesday, November 23, 1999 (worth 5 points)
Late Option: if submitted between 6:00pm November 23 and 6:00pm November 30, you can receive up to 3 of 5 points. No late homework will be accepted after 6:00pm November 30.
Purpose: practice writing and debugging more intricate methods.
Overview: Yet another sorting method! At this point, we have already seen four different methods for sorting an array of numbers; in this homework we will develop a fifth method called shellsort. (Why was this name chosen? Answer: the sorting method was suggested in 1959 by Donald Shell.)

This method can be thought of as an improvement over bubblesort; we will nevertheless describe it in full detail. During a bubblesort, the small (resp. large) values get pushed towards the beginning (resp. end) of the array, but the method is too slow because items only get to move one spot at a time. The shellsort method will instead start by allowing exchanges which jump over rather large distances. Following this, the method will check over slightly smaller distances, and then still smaller distances, until eventually it checks each element against its immediate neighbors.


The Details: Shellsort is based on an underlying procedure called h-sorting an array. Given an integer value h, the goal of h-sorting an array is to guarantee that every element of the array is less than (or equal to) the element which is h spaces to the right of it (if the array goes that far).
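To make the goal of an h-sort concrete, here is a minimal sketch of a check for the h-sorted property. It is written in Python rather than Visual Basic purely for illustration; the name is_h_sorted is hypothetical, the list is indexed from 0 (unlike TheArray, which starts at 1), and equal neighbors are treated as being in order, which is an assumption. The sample values are those the worked example in this handout reaches once its 4-sort has finished.

```python
def is_h_sorted(values, h):
    """Return True if every element is <= the element h positions to its right."""
    # Only positions that actually have a partner h places to the right are checked.
    return all(values[i] <= values[i + h] for i in range(len(values) - h))


# The array from the worked example after its 4-sort has finished.
after_4_sort = [0.073, 0.068, 0.327, 0.394, 0.189,
                0.537, 0.594, 0.766, 0.334, 0.832]

print(is_h_sorted(after_4_sort, 4))  # True: 4-sorted
print(is_h_sorted(after_4_sort, 1))  # False: not yet fully sorted
```

A check like this is only a debugging aid; it does not do any sorting itself.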

The method for h-sorting an array is relatively simple. Make a pass through the array from beginning to end, and at each spot check whether the array element is greater than the element which is h spaces farther along in the array. If it is, swap those two elements. Either way, continue making your pass through the array until you reach the end (actually, until you are less than h spaces away from the end).

If you complete the entire pass without ever swapping an element, then the array must indeed be correctly h-sorted and you are done with the h-sort. However, if one or more swaps take place, you must make another pass through the entire array. Again, if no swaps are made during this new pass, the h-sort is done, but otherwise you must make yet another pass. Eventually, there will come a time when the array is h-sorted and this process ends.
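The pass-and-repeat procedure described above can be sketched as follows. This is an illustration in Python, not the Visual Basic code the assignment asks for: the function name h_sort, the returned pass count, and the 0-based indexing are all assumptions made for the sketch, and the list is passed as an argument rather than stored in a global array.

```python
def h_sort(values, h):
    """h-sort `values` in place with repeated left-to-right passes.

    Returns the number of passes made, counting the final swap-free pass.
    """
    passes = 0
    swapped = True
    while swapped:
        swapped = False
        passes += 1
        # Stop h positions before the end so values[i + h] always exists.
        for i in range(len(values) - h):
            if values[i] > values[i + h]:
                values[i], values[i + h] = values[i + h], values[i]
                swapped = True
    return passes


data = [5, 1, 4, 2, 3]
print(h_sort(data, 2))  # 3
print(data)             # [3, 1, 4, 2, 5]
```

Note that the loop bound, not a separate test inside the loop, is what keeps the comparison from running off the end of the array.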

So this is how you h-sort an array. Notice that if you 1-sort an array, you have guaranteed that every value is less than (or equal to) the value to its right, and so the entire array will be sorted. For other values of h, an h-sorted array will not be perfectly sorted, but it turns out to be useful to do these h-sorts anyway.

The overall shellsort works as follows. A sequence of decreasing h-values is chosen, and h-sort is called for each of these values, ending with h=1 (thus perfectly sorting the array by the time we are done).
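As a rough sketch of how the pieces fit together, the Python below runs an h-sort for each value in a decreasing sequence that ends at 1. The handout does not say which sequence the provided program uses; the sequence 1, 4, 13, 40, ... (each value is three times the previous plus one) is an assumption chosen for illustration, and the names h_sort, gap_sequence and shellsort are hypothetical.

```python
def h_sort(values, h):
    """h-sort `values` in place by repeating passes until one makes no swap."""
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(values) - h):
            if values[i] > values[i + h]:
                values[i], values[i + h] = values[i + h], values[i]
                swapped = True


def gap_sequence(n):
    """Assumed h-values 1, 4, 13, 40, ... that are below n, largest first."""
    gaps = [1]
    while 3 * gaps[-1] + 1 < n:
        gaps.append(3 * gaps[-1] + 1)
    return gaps[::-1]


def shellsort(values):
    """Sort `values` in place by h-sorting for each h, ending with h = 1."""
    for h in gap_sequence(len(values)):
        h_sort(values, h)


print(gap_sequence(10))  # [4, 1]
numbers = [9, 3, 7, 1, 8, 2, 6, 0, 5, 4]
shellsort(numbers)
print(numbers)           # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because the final h-sort always uses h = 1, the result is fully sorted whatever the earlier h-values were; the earlier values only affect how much work that last stage has left to do.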

Note: The particular sequence of h-values chosen is unimportant for the assignment. Various sequences have been studied by computer scientists; we have chosen a reasonably good one and written the code for it.


Requirements: Most of the program is already written. The only thing you must do is write the code for the h-sort procedure. Do not modify any other part of the program we provided. In fact, you probably do not even want to read the rest of the program which we provided.

There are three files to download:
www.cs.luc.edu/~mhg/comp125/homework/shellsort.frm
www.cs.luc.edu/~mhg/comp125/homework/shellsort.frx
www.cs.luc.edu/~mhg/comp125/homework/shellsort.vbp

All you need to know is the following: The data is stored in a global array which was created with a line such as

    Dim TheArray(1 to arraysize) As Single
where arraysize is also a global variable. The actual data consists of floating point numbers between 0 and 1 which are chosen at random.

The procedure which you must write is at the bottom of the code, and is declared as:

    Private Sub hsort(h As Long)
  
    End Sub

Additional Requirements: When your program is working, compare the timing for shellsort versus mergesort on the following array sizes: 1000, 2000, 4000, 8000, 16000, 32000. Place a table of your results on a separate sheet of paper to be turned in with your homework (no graph necessary). Your shellsort should be faster than mergesort.
Advice: Although the general idea is simple, you must convert the prose description of the method into working Visual Basic code. You will probably find that getting the program to work properly takes a bit of finessing. You will have to write nested loops while being very careful about the exact bounds and conditions for repeating the loops.

Some specific suggestions:

  • Run your program with the "Verify Correctness" mode selected on the form when you are writing your code. This will give you a way to check whether your program seems to be working for various array sizes.
  • When you do find that your program is not working correctly, begin the detective work to discover why. Use the VB debugger to monitor your program as it runs. Try to run your program on some relatively small arrays (try 10 items or 40 items), and then try the sort on paper with the same array values and see if your code behaves the same way you think it does.
  • Use the "Fix Random Seed" option to run on the same data over and over when debugging.
  • Use the graphics animation as a tool if that helps you.

An Example: Just to make sure that we both understand the method description, let's look at how the sort proceeds for a very specific example. If we run the program using the "Fix Random Seed" option, setting the seed equal to 1, we get an initial array of the following values (rounded):

    A(1) A(2) A(3) A(4) A(5) A(6) A(7) A(8) A(9) A(10)
    0.334 0.068 0.594 0.766 0.189 0.537 0.327 0.394 0.073 0.832

    Shellsort first attempts to 4-sort the array. During the first pass of the 4-sort, it exchanges the pairs A(1):A(5), then A(3):A(7), then A(4):A(8), then A(5):A(9). Since exchanges took place, it makes a second pass while 4-sorting, which exchanges A(1):A(5) but no other pair. This necessitates a third pass, during which no exchanges take place, and thus the array is correctly 4-sorted at this point. The 4-sort is animated as follows (one row per exchange):

    A(1) A(2) A(3) A(4) A(5) A(6) A(7) A(8) A(9) A(10)
    0.334 0.068 0.594 0.766 0.189 0.537 0.327 0.394 0.073 0.832
    0.189 0.068 0.594 0.766 0.334 0.537 0.327 0.394 0.073 0.832
    0.189 0.068 0.327 0.766 0.334 0.537 0.594 0.394 0.073 0.832
    0.189 0.068 0.327 0.394 0.334 0.537 0.594 0.766 0.073 0.832
    0.189 0.068 0.327 0.394 0.073 0.537 0.594 0.766 0.334 0.832
    0.073 0.068 0.327 0.394 0.189 0.537 0.594 0.766 0.334 0.832

    At this point, the overall shellsort algorithm calls for a 1-sort, which causes swaps A(1):A(2), A(4):A(5), A(8):A(9) during the first pass; A(3):A(4) and A(7):A(8) during the second pass; A(6):A(7) during the third pass; A(5):A(6) during the fourth pass; and no exchanges during the fifth pass, thereby ending the 1-sort.

    A(1) A(2) A(3) A(4) A(5) A(6) A(7) A(8) A(9) A(10)
    0.073 0.068 0.327 0.394 0.189 0.537 0.594 0.766 0.334 0.832
    0.068 0.073 0.327 0.394 0.189 0.537 0.594 0.766 0.334 0.832
    0.068 0.073 0.327 0.189 0.394 0.537 0.594 0.766 0.334 0.832
    0.068 0.073 0.327 0.189 0.394 0.537 0.594 0.334 0.766 0.832
    0.068 0.073 0.189 0.327 0.394 0.537 0.594 0.334 0.766 0.832
    0.068 0.073 0.189 0.327 0.394 0.537 0.334 0.594 0.766 0.832
    0.068 0.073 0.189 0.327 0.394 0.334 0.537 0.594 0.766 0.832
    0.068 0.073 0.189 0.327 0.334 0.394 0.537 0.594 0.766 0.832
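One way to check your reading of this example is to record which pairs get swapped on each pass. The Python sketch below does that for the ten values above, first with h = 4 and then with h = 1. It is an illustration only: the name h_sort_trace is hypothetical, the list is 0-based internally, and swaps are reported with 1-based positions so they can be compared with the A(1):A(5) notation used here.

```python
def h_sort_trace(values, h):
    """h-sort `values` in place; return one list of swapped pairs per pass.

    Pairs are reported with 1-based positions, e.g. (1, 5) for A(1):A(5).
    """
    passes = []
    while True:
        swaps = []
        for i in range(len(values) - h):
            if values[i] > values[i + h]:
                values[i], values[i + h] = values[i + h], values[i]
                swaps.append((i + 1, i + h + 1))
        passes.append(swaps)
        if not swaps:  # a pass with no swaps ends the h-sort
            return passes


a = [0.334, 0.068, 0.594, 0.766, 0.189, 0.537, 0.327, 0.394, 0.073, 0.832]
for h in (4, 1):
    for number, swaps in enumerate(h_sort_trace(a, h), start=1):
        print(f"{h}-sort pass {number}: {swaps}")
```

The 4-sort should report three passes and the 1-sort five, with the last pass of each containing no swaps.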


    Extra Credit (1 point): During an h-sort, notice that each pass moves from left to right. It turns out that you can speed up the performance further by alternating the direction of successive passes. That is, have the first pass go from left to right, then have the second pass go from right to left. Continue alternating throughout each h-sort.

    If you do the extra credit, please report timing for three sets of experiments: mergesort, the original shellsort, and this modified shellsort.
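A minimal sketch of the alternating-direction idea, again in Python for illustration only (the name h_sort_alternating and the returned pass count are assumptions, and the list is 0-based): the set of positions visited is the same on every pass, and only the order in which they are visited flips.

```python
def h_sort_alternating(values, h):
    """h-sort `values` in place, reversing the scan direction after each pass.

    Returns the number of passes made, counting the final swap-free pass.
    """
    left_to_right = True
    passes = 0
    while True:
        passes += 1
        positions = range(len(values) - h)
        if not left_to_right:
            positions = reversed(positions)
        swapped = False
        for i in positions:
            if values[i] > values[i + h]:
                values[i], values[i + h] = values[i + h], values[i]
                swapped = True
        if not swapped:
            return passes
        left_to_right = not left_to_right


# The worked example's array as it stands after the 4-sort.
a = [0.073, 0.068, 0.327, 0.394, 0.189, 0.537, 0.594, 0.766, 0.334, 0.832]
print(h_sort_alternating(a, 1))  # 3
print(a == sorted(a))            # True
```

On this particular array the alternating 1-sort finishes in 3 passes, where the left-to-right version in the example needed 5. A single small case says nothing about timing in general, which is what the experiments are for.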



    To submit your homework:
  • Print out your project as follows. Click "File". Click "Print". In the Range box, select "Current project". In the Print What box, click only on "Code".
  • Save all relevant files to a floppy disk.
  • Create a table of timing results as discussed above.
  • Place the printouts, the table of results, and the floppy disk inside a large manila envelope. Please make sure that your name appears on the envelope as well as on the disk and the printouts.