A Practical Guide to Data Structures and Algorithms Using Java
Sally A. Goldman and Kenneth J. Goldman
Washington University in St. Louis
Course Lectures
The lectures are provided in two formats: video and pdf. The
videos are avi files that include full audio and high resolution
screen capture of all activity on the tablet PC. The pdf files show
the complete content of each "page" of the lecture. Used in
conjunction with the videos, the pdf files make it easy to quickly identify
sections of the lectures that you may want to view in detail.
These lectures can be used in several ways:
 as ideas for course instructors
 as reinforcement or review for students
 as optional enrichment for students on topics not covered in class
 as student preparation for active learning class sessions, so time in
class is more engaging
 Lecture 1: Introduction

Course Introduction (7:54)
[lecture notes]
 We describe the goals of this course.

Motivating asymptotic complexity (37:48)
[lecture notes]
 We use the problem of finding the closest pair of points in the
plane to help motivate the standard asymptotic time complexity
analysis that is used by computer scientists. One goal is to
demonstrate that the asymptotic value of the number of lines of code
executed is a good "back-of-the-envelope" calculation that can be used
to evaluate which algorithms will run fastest in practice. To help
make this point, we report on empirical tests comparing execution time
to the number of statements executed. We stress the value of being
able to quickly rule out inferior algorithms, so that the time spent
on coding and empirical testing can be used for the most promising
algorithms.

Introduction of the closest-pair problem (15:39)
[lecture notes]
 We present the closest-pair problem, which is used at the start of
the course to motivate the theoretical foundations that are going to
be used throughout the course. This problem has been selected since
it helps students to appreciate that even though they may be able to
solve a problem from first principles, this course can help them
learn to develop more efficient solutions. This lecture component also
covers the brute force algorithm that computes the
distance between all pairs of points, and gives a high-level overview of
the divide-and-conquer algorithm, built up by combining ideas
suggested by the class.

Analysis of the brute force closest-pair algorithm (6:35)
[lecture notes]
 We analyze the brute force closest-pair algorithm that
computes the distance between all pairs of points, including a brief discussion of
ways to compute the number of pairs of points from first principles for anyone
without a combinatorics background.
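The brute force computation described above can be sketched as follows (a minimal illustration with our own class and method names, not the book's code); note that the nested loops examine exactly n(n-1)/2, i.e. "n choose 2", pairs:

```java
// Brute-force closest pair: examine all n(n-1)/2 pairs of points and keep
// the minimum distance seen. Runs in O(n^2) time.
public class ClosestPairBrute {

    static double distance(double[] p, double[] q) {
        double dx = p[0] - q[0], dy = p[1] - q[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static double closestPair(double[][] points) {
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < points.length; i++)          // first point of the pair
            for (int j = i + 1; j < points.length; j++)  // j > i, so each pair is seen once
                best = Math.min(best, distance(points[i], points[j]));
        return best;
    }

    public static void main(String[] args) {
        double[][] pts = {{0, 0}, {5, 5}, {1, 0}, {9, 9}};
        System.out.println(closestPair(pts)); // closest pair is (0,0)-(1,0), distance 1.0
    }
}
```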

Empirical comparisons of running times (5:19)
[lecture notes]
 We present actual running times (in seconds) for the brute
force and divide-and-conquer algorithms for the closest-pair problem for differing input
sizes on two different computers that vary significantly in processor speed. The goal is to
help students appreciate the difference in real time between an O(n log n) and O(n^{2})
algorithm.
 Lecture 2: Divide-and-Conquer Algorithms

Divide-and-conquer algorithm design technique (6:45)
[lecture notes]
[handout]
 We describe the divide-and-conquer technique for algorithm design.

Merge sort algorithm (14:12)
[lecture notes]
 We present merge sort as an example of the divide-and-conquer technique.

Detailed discussion of the divide-and-conquer closest pair algorithm (31:34)
[lecture notes]
 We expand upon the high-level ideas introduced in the first lecture
for an in-depth discussion of the divide-and-conquer closest pair algorithm.

Handling degeneracies in the divide-and-conquer closest-pair algorithm (14:13)
[lecture notes]
 We discuss the problem that could occur when two or more
points share the same x-coordinate along the line that
divides the left and right halves. In particular, we describe how
a method (called leftOf) can be used to define a unique
ordering of the points that is sorted with
respect to the x-coordinate to resolve this potential difficulty.
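One way such a leftOf ordering can be realized (a sketch with our own naming; the book's actual method may differ) is to break x-coordinate ties using the y-coordinate, so that no two distinct points ever compare as equal:

```java
// A total order on points that is consistent with the x-coordinate:
// ties on x are broken by y, so distinct points never compare equal.
// This resolves degeneracies when splitting points into left/right halves.
public class LeftOfOrder {

    public static int compare(double[] p, double[] q) {
        if (p[0] != q[0]) return Double.compare(p[0], q[0]); // primary key: x-coordinate
        return Double.compare(p[1], q[1]);                   // tie-break: y-coordinate
    }

    // true when p is strictly "left of" q in this unique ordering
    public static boolean leftOf(double[] p, double[] q) {
        return compare(p, q) < 0;
    }

    public static void main(String[] args) {
        double[][] pts = {{2, 3}, {2, 1}, {0, 5}};
        java.util.Arrays.sort(pts, LeftOfOrder::compare);
        System.out.println(java.util.Arrays.deepToString(pts));
        // sorted by x, ties broken by y: [[0.0, 5.0], [2.0, 1.0], [2.0, 3.0]]
    }
}
```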
 Lecture 3: Divide-and-Conquer Algorithms (cont.)

Pseudocode for the divide-and-conquer closest pair algorithm (8:02)
[lecture notes]
 We wrap up our discussion of the divide-and-conquer
closest pair algorithm by overviewing it through pseudocode.

Correctness of the divideandconquer closest pair algorithm (6:51)
[lecture notes]
 We argue that the divide-and-conquer algorithm is guaranteed to return the correct
answer for all possible inputs. While the discussion uses this problem as an example,
it addresses more generally how induction can be used to prove the correctness
of any divide-and-conquer algorithm.

Analysis of the divide-and-conquer closest pair algorithm (42:43)
[lecture notes]
 We analyze the asymptotic time complexity of the
closest pair algorithm. While this lecture motivates the benefits of
the master method (a "cookbook" approach to solve recurrences of a specified form),
it uses the recursion tree technique to solve the recurrence.
 Lecture 4: Asymptotic Notation

Asymptotic Notation (38:19)
[lecture notes]
 We formally define asymptotic notation
(big-Oh, big-Omega, big-Theta, little-oh, little-omega), and discuss the
limiting behavior as n approaches infinity. We build upon relationships already
understood in working with inequalities to develop intuition
about asymptotic notation.

Working with asymptotic notation (31:10)
[lecture notes]
 We use example problems to develop a better understanding of asymptotic notation. It's
important to be able to determine relationships between pairs of
functions. As part of this lecture, L'Hopital's rule for computing
limits is reviewed.

Analyzing code fragments that are nested loops (6:17)
[lecture notes]
 We analyze the asymptotic time complexity of example code fragments with nested loops.
 Lecture 5: Analyzing Divide-and-Conquer Algorithms

Analyzing divide-and-conquer algorithms (12:36)
[lecture notes]
 We discuss how to define a recurrence relation to express
the asymptotic time complexity of a divide-and-conquer algorithm. Next we
present the master method, which quickly gives an asymptotic solution, and
show how to use a recursion tree to obtain an exact solution.

Master method to asymptotically solve a recurrence relation (19:04)
[lecture notes]
 We present the master method, a "cookbook" method
to asymptotically solve typical
recurrence relations that occur for divide-and-conquer algorithms when all subproblems are
roughly the same size. In particular, it can be used to solve recurrences of the form
T(n) = a T(n/b) + f(n) where f(n) is of the form n^{l}(log n)^{k} for
constants l ≥ 0 and k ≥ 0.
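For recurrences of this form, the three cases of the master method can be summarized as follows (using the same ℓ and k as above):

```latex
T(n) = a\,T(n/b) + \Theta\bigl(n^{\ell}(\log n)^{k}\bigr)
\;\Longrightarrow\;
T(n) = \begin{cases}
\Theta\bigl(n^{\ell}(\log n)^{k}\bigr) & \text{if } \ell > \log_b a,\\[2pt]
\Theta\bigl(n^{\ell}(\log n)^{k+1}\bigr) & \text{if } \ell = \log_b a,\\[2pt]
\Theta\bigl(n^{\log_b a}\bigr) & \text{if } \ell < \log_b a.
\end{cases}
```

For example, merge sort's recurrence T(n) = 2T(n/2) + Θ(n) has a = b = 2, ℓ = 1 = log_2 2, and k = 0, so the middle case gives T(n) = Θ(n log n).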

Exactly solving a recurrence relation (29:50)
[lecture notes]
 We illustrate how to use a recursion tree to exactly solve
recurrence relations that are typical of divide-and-conquer algorithms.
 Lecture 6: ADT Taxonomy Part I and Positional Collection ADT

Overview of the ADT taxonomy (11:57)
[lecture notes]
 We overview the ADT taxonomy used throughout the book.

Introduction of the manually positioned collection (2:42)
[lecture notes]
 We provide an introduction to manually positioned collections.

A discussion of the positional collection data structures (62:29)
[lecture notes]
 We present positional collection data
structures, including the singly-linked list, doubly-linked list, array, dynamic array,
circular array, and tracked array.
 Lecture 7: Quicksort, Expected time complexity

Quicksort (59:31)
[lecture notes]
 We present quicksort with median-of-three partitioning and also randomized
quicksort. We also discuss
the common optimization of terminating when the input size is a small value (around
30), and then using insertion sort to complete the sort.
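These ideas can be sketched as follows (an illustrative implementation with our own names, not the book's code): median-of-three pivot selection, recursion that stops below a small cutoff, and a single insertion sort pass over the nearly sorted array at the end:

```java
import java.util.Arrays;

// Quicksort with median-of-three pivot selection and a small-subarray
// cutoff: subarrays smaller than CUTOFF are left unsorted and finished
// by one insertion sort pass at the end (cheap on nearly sorted input).
public class Quicksort {
    static final int CUTOFF = 30;

    public static void sort(int[] a) {
        quicksort(a, 0, a.length - 1);
        insertionSort(a);                  // finish the small unsorted ranges
    }

    static void quicksort(int[] a, int lo, int hi) {
        if (hi - lo + 1 < CUTOFF) return;  // leave small ranges for insertion sort
        int p = partition(a, lo, hi);
        quicksort(a, lo, p - 1);
        quicksort(a, p + 1, hi);
    }

    // Median-of-three: order a[lo], a[mid], a[hi], then use the median as pivot.
    static int partition(int[] a, int lo, int hi) {
        int mid = (lo + hi) >>> 1;
        if (a[mid] < a[lo]) swap(a, lo, mid);
        if (a[hi] < a[lo]) swap(a, lo, hi);
        if (a[hi] < a[mid]) swap(a, mid, hi);
        swap(a, mid, hi - 1);              // stash the pivot next to the end
        int pivot = a[hi - 1];
        int i = lo, j = hi - 1;
        while (true) {
            while (a[++i] < pivot) { }     // a[hi-1] == pivot acts as a sentinel
            while (a[--j] > pivot) { }     // a[lo] <= pivot acts as a sentinel
            if (i >= j) break;
            swap(a, i, j);
        }
        swap(a, i, hi - 1);                // move the pivot to its final position
        return i;
    }

    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] a = {9, 3, 7, 1, 8, 2, 6, 5, 4, 0};
        sort(a);
        System.out.println(Arrays.toString(a)); // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```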

Expected time complexity definition (9:00)
[lecture notes]
 We formally define expected time complexity.

Analysis of randomized quicksort (10:11)
[lecture notes]
 We analyze the expected time complexity of randomized quicksort.
 Lecture 8: Adversary Lower Bound Technique

Adversary lower bound technique (40:47)
[lecture notes]
[handout]
 Using "20 questions" as an example problem, we describe the adversary lower bound
technique. In particular, we focus on the basic adversary strategy of making
a list of all possible solutions, and then answering to maximize the number of remaining
solutions consistent with all past answers. This adversary
strategy gives the same bounds as one obtains using the decision tree technique.

Comparison-based sorting lower bound (16:17)
[lecture notes]
 We apply the adversary lower bound technique to prove an
Ω(n log n) lower bound for any
comparison-based sorting algorithm.
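The heart of the bound is a counting step worth making explicit: there are n! possible orderings of the input, and each comparison has only two outcomes, so any comparison-based sort must perform at least log_2(n!) comparisons in the worst case. Since at least n/2 of the factors of n! are at least n/2,

```latex
\log_2(n!) \;\ge\; \log_2\!\left(\left(\tfrac{n}{2}\right)^{n/2}\right)
\;=\; \tfrac{n}{2}\,\log_2\tfrac{n}{2} \;=\; \Omega(n \log n).
```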

Lower bound for finding the minimum (13:08)
[lecture notes]
 We use the problem of finding the minimum element in an array
to illustrate that for some problems alternate adversary strategies
can yield a better lower bound.
 Lecture 9: Linear Time Sorting Algorithms

Counting sort (24:39)
[lecture notes]
 We present counting sort.
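The idea can be sketched as follows (a minimal version for integer keys in {0, ..., k-1}, with our own names): count occurrences of each key, turn the counts into starting positions with a prefix sum, then place each element.

```java
import java.util.Arrays;

// Counting sort for keys in {0, ..., k-1}. Runs in O(n + k) time and is
// stable: equal keys are emitted in their original relative order.
public class CountingSort {

    public static int[] sort(int[] a, int k) {
        int[] count = new int[k + 1];
        for (int x : a) count[x + 1]++;                         // count[x+1] = occurrences of key x
        for (int i = 1; i <= k; i++) count[i] += count[i - 1];  // prefix sums: first index for each key
        int[] out = new int[a.length];
        for (int x : a) out[count[x]++] = x;                    // stable placement into output
        return out;
    }

    public static void main(String[] args) {
        int[] a = {4, 1, 3, 4, 0, 2, 1};
        System.out.println(Arrays.toString(sort(a, 5))); // [0, 1, 1, 2, 3, 4, 4]
    }
}
```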

Digitizer interface (9:03)
[lecture notes]
 We describe the digitizer interface. Similar to the way in which Java's
Comparator interface allows a comparison-based sorting algorithm to be
applied to any data elements that have a comparator, our Digitizer interface allows any
sorting algorithm that treats each element as a sequence of digits to be applied to any data elements
for which a Digitizer is defined.
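As a rough illustration only (the book's actual Digitizer interface may declare different methods and signatures), such an interface and a base-10 instance for non-negative integers might look like:

```java
// A sketch of a digitizer abstraction: it views an element as a sequence
// of digits in some base, which is all a digit-based sort like radix sort
// needs to know about the element type.
interface Digitizer<T> {
    int getBase();                       // number of distinct digit values
    int numDigits(T element);            // how many digits the element has
    int getDigit(T element, int place);  // digit at 'place'; 0 = least significant
}

public class IntDigitizerDemo {

    // Treats a non-negative int as a sequence of base-10 digits.
    static final Digitizer<Integer> BASE10 = new Digitizer<Integer>() {
        public int getBase() { return 10; }
        public int numDigits(Integer e) { return Integer.toString(e).length(); }
        public int getDigit(Integer e, int place) {
            return (int) (e / Math.pow(10, place)) % 10;
        }
    };

    public static void main(String[] args) {
        System.out.println(BASE10.getDigit(3407, 0)); // 7 (ones digit)
        System.out.println(BASE10.getDigit(3407, 2)); // 4 (hundreds digit)
    }
}
```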

Radix sort (50:19)
[lecture notes]
 We present radix sort.

Bucket sort (7:19)
[lecture notes]
 We present bucket sort.
 Lecture 10: ADT Taxonomy Part II, Set ADT

ADT taxonomy for algorithmically positioned collections (38:28)
[lecture notes]
 We return to our earlier discussion of the ADT taxonomy used in the book,
focusing on algorithmically positioned collections.

Set ADT (6:58)
[lecture notes]
 We describe the Set abstract data type.
 Lecture 11: Direct Addressing and Open Addressing

Direct addressing (18:56)
[lecture notes]
 We present the direct addressing data structure.

Introduction to hashing-based Set data structures (27:39)
[lecture notes]
 We introduce hashing, and provide an overview of
separate chaining and open addressing.

Open addressing (44:32)
[lecture notes]
 We present the open addressing data structure.
 Lecture 12: Separate Chaining

Separate Chaining (10:18)
[lecture notes]
 We present the separate chaining data structure.

Comparison of the tradeoffs between Set data structures (11:08)
[lecture notes]
 We systematically compare the advantages and disadvantages of
direct addressing, open addressing, and separate chaining.
 Lecture 13: Priority Queue ADT and the Binary Heap Data Structure

Priority Queue ADT (7:00)
[lecture notes]
 We describe the priority queue abstract data type.

Binary Heap (56:34)
[lecture notes]
 We present the binary heap data structure.

Tracked Binary Heap (5:25)
[lecture notes]
 We illustrate how the elements stored in a binary heap can be tracked.

Comparison of the tradeoffs between priority queue data structures (6:14)
[lecture notes]
 We systematically compare the advantages and disadvantages of
a binary heap, leftist heap, pairing heap, and Fibonacci heap.
 Lecture 14: Ordered Collection ADT, Balanced Search Trees

Ordered collection ADT (5:31)
[lecture notes]
 We describe the ordered collection abstract data type.

Abstract search tree (14:24)
[lecture notes]
 We discuss the generalization of a binary search tree in which
each node can hold any number of elements. As part of this discussion,
we present the properties of an abstract
search tree that allow efficient search, and discuss how an inorder traversal
can be used to iterate over the elements in sorted order.

Binary search tree review (17:23)
[lecture notes]
 We review the binary search tree data structure with a focus on
how to find the successor (and symmetrically predecessor) of an element, and how
to remove an element.

Balanced search tree (27:03)
[lecture notes]
 We describe the rotations that are used to help create balanced binary
search trees. This discussion includes a high-level overview of the ways
in which data structures that maintain a balanced binary search tree
decide when to perform rotations.
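A single rotation is a local pointer rearrangement that preserves the in-order sequence of keys while shifting height between subtrees. A minimal sketch of a right rotation (our own node class, not the book's; the left rotation is its mirror image):

```java
// A right rotation at node y promotes y's left child x to the subtree root.
// The in-order sequence of keys is unchanged, so the search tree property
// is preserved while the left subtree moves one level closer to the root.
public class Rotation {

    static class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) { this.key = key; this.left = left; this.right = right; }
    }

    // Returns the new subtree root (x).
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;   // keys between x and y move under y
        x.right = y;        // y becomes x's right child
        return x;
    }

    static void inorder(Node t, StringBuilder out) {
        if (t == null) return;
        inorder(t.left, out);
        out.append(t.key).append(' ');
        inorder(t.right, out);
    }

    public static void main(String[] args) {
        Node y = new Node(5,
                new Node(3, new Node(2, null, null), new Node(4, null, null)),
                new Node(7, null, null));
        Node root = rotateRight(y);
        StringBuilder after = new StringBuilder();
        inorder(root, after);
        System.out.println(after); // 2 3 4 5 7 (same in-order sequence as before)
    }
}
```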
 Review for Midterm

Midterm topics (15:07)
[lecture notes]
 A review of the topics that will be covered on the midterm.
 Lecture 15: Red-Black Trees

Red-black tree properties (12:24)
[lecture notes]
 We introduce the red-black tree by describing its properties. We also
derive an upper bound on the depth of a red-black tree holding n elements.
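The depth bound rests on two of the properties: no red node has a red child, so the black-height bh of the root is at least h/2; and every subtree whose root has black-height bh contains at least 2^bh - 1 elements. Combining these,

```latex
n \;\ge\; 2^{\mathrm{bh}} - 1 \;\ge\; 2^{h/2} - 1
\quad\Longrightarrow\quad
h \;\le\; 2\log_2(n+1) \;=\; O(\log n).
```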

Red-black tree methods (54:49)
[lecture notes]
 We describe the red-black tree methods, starting with a
brief discussion of how the non-mutating methods are exactly as for a
standard binary search tree. Then we present the mutators. Insertion
is presented in depth, followed by a discussion of the high-level
approach used by deletion, but without the details of the specific
cases that occur.
 Lecture 16: B-Trees

Intuition behind the B-tree design (33:19)
[lecture notes]
 We discuss secondary storage and use it to motivate the design of a B-tree in terms of
its goal of reducing the number of page faults during a search.

B-tree properties (7:00)
[lecture notes]
 We describe the B-tree properties.

B-tree insertion (19:49)
[lecture notes]
 We describe both the bottom-up and top-down B-tree insertion methods.

2-3-4 trees and their relation to red-black trees (17:08)
[lecture notes]
 We describe the 2-3-4 tree, which is simply a B-tree of order 2 (t=2), and its
relationship to the red-black tree.
 Lecture 17: B-Trees (cont.) and B+-Trees

B-tree deletion (18:47)
[lecture notes]
 We describe B-tree deletion with a focus on the high-level ideas to develop
a good intuition behind the design of this method.

B-tree analysis (16:27)
[lecture notes]
 We derive an upper bound on the height of the B-tree, and use it to analyze
the asymptotic time complexity of the B-tree methods.

B+-tree (17:31)
[lecture notes]
 We briefly overview the B+-tree data structure.
 Lecture 18: Skip Lists

Intuition behind the skip list design (18:54)
[lecture notes]
 We motivate the design of a skip list in terms of additional navigation support
for a doubly-linked list.

Skip list representation (5:24)
[lecture notes]
 We describe the internal representation of the skip list.

Skip list methods (17:17)
[lecture notes]
 We describe the skip list methods, focusing on searching, insertion, and deletion.
 Lecture 19: Skip List Analysis, and Comparisons of Ordered Collection Data Structures

Skip list analysis (30:55)
[lecture notes]
 We analyze the time complexities of the skip list methods.

Relationship between a skip list and B+-tree (3:50)
[lecture notes]
 We describe the relationship between a skip list and B+-tree.

Comparison and tradeoffs among ordered collection data structures (8:39)
[lecture notes]
 We systematically compare the ordered collection
data structures.
 Lecture 20: Digitized Ordered Collections, Spatial Collections/k-d Tree

Building a trie (16:02)
 This video shows a short visualization of building a trie.

Digitized ordered collection ADT and data structures (33:43)
[lecture notes]
 We describe the digitized ordered collection abstract data type, and
illustrate by example the trie, compact trie, compressed trie, suffix
tree, and indexing trie data structures.

Spatial collection ADT (9:37)
[lecture notes]
 We describe the spatial collection abstract data type.

k-d tree data structure (14:34)
[lecture notes]
 We describe the k-d tree data structure for the spatial collection.

Quad tree data structure overview (3:26)
[lecture notes]
 We briefly describe the quad tree data structure for the spatial collection.
 Lecture 21: Graph Problems and Graph Representations

Introduction to graphs (23:50)
[lecture notes]
 We introduce graphs and describe a variety of problems that are best modeled by
graphs.

Graph representations (64:38)
[lecture notes]
 After reviewing the different types of graphs, we discuss the adjacency list, adjacency set,
and adjacency matrix representations and the tradeoffs between them.
 Lecture 22: Breadth-First Search (BFS)

Breadth-first search (51:17)
[lecture notes]
 We consider the problem of finding the shortest path in an unweighted graph using
breadth-first search (BFS). We also introduce the shortest path tree as a way to represent
the output from breadth-first search.
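The algorithm can be sketched as follows (adjacency-list representation, with our own names): each vertex's distance is final the first time it is discovered, because the queue processes vertices in nondecreasing distance order.

```java
import java.util.*;

// Breadth-first search from a source vertex on an unweighted graph given
// as adjacency lists: dist[v] is the number of edges on a shortest path
// from the source to v, or -1 if v is unreachable. O(V + E) time.
public class BFS {

    public static int[] shortestDistances(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);             // -1 marks "not yet reached"
        dist[source] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adj.get(u)) {
                if (dist[v] == -1) {       // first discovery = shortest distance
                    dist[v] = dist[u] + 1; // (recording u here would build the shortest path tree)
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // undirected edges: 0-1, 0-2, 1-3, 2-3, 3-4
        List<List<Integer>> adj = List.of(
            List.of(1, 2), List.of(0, 3), List.of(0, 3), List.of(1, 2, 4), List.of(3));
        System.out.println(Arrays.toString(shortestDistances(adj, 0))); // [0, 1, 1, 2, 3]
    }
}
```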
 Lecture 23: Dijkstra's Shortest Path Algorithm

Dijkstra's single-source shortest path algorithm (38:55)
[lecture notes]
 We present Dijkstra's algorithm to find the shortest path in a weighted graph (with nonnegative
edge weights) to all vertices from a specified source vertex.
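A common way to implement the algorithm (a sketch with our own names, using java.util.PriorityQueue; the course's own implementation may differ) keeps a priority queue of (distance, vertex) pairs and skips stale entries when they are removed:

```java
import java.util.*;

// Dijkstra's single-source shortest path algorithm for a directed graph
// with non-negative edge weights, given as adjacency lists of
// {neighbor, weight} pairs. O((V + E) log V) with a binary heap.
public class Dijkstra {

    public static long[] shortestPaths(List<List<int[]>> adj, int source) {
        long[] dist = new long[adj.size()];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[source] = 0;
        PriorityQueue<long[]> pq = new PriorityQueue<>(Comparator.comparingLong(e -> e[0]));
        pq.add(new long[]{0, source});                // (distance, vertex)
        while (!pq.isEmpty()) {
            long[] top = pq.remove();
            long d = top[0];
            int u = (int) top[1];
            if (d > dist[u]) continue;                // stale entry: u already finalized
            for (int[] edge : adj.get(u)) {
                int v = edge[0], w = edge[1];
                if (dist[u] + w < dist[v]) {          // relax edge (u, v)
                    dist[v] = dist[u] + w;
                    pq.add(new long[]{dist[v], v});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // directed edges: 0->1 (4), 0->2 (1), 2->1 (2), 1->3 (1)
        List<List<int[]>> adj = List.of(
            List.of(new int[]{1, 4}, new int[]{2, 1}),
            List.of(new int[]{3, 1}),
            List.of(new int[]{1, 2}),
            List.<int[]>of());
        System.out.println(Arrays.toString(shortestPaths(adj, 0))); // [0, 3, 1, 4]
    }
}
```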

Short overview of Dijkstra's algorithm (11:32)
[lecture notes]
 A closing overview of Dijkstra's single-source shortest path algorithm.

Guidance on the design for Project 4: Travel Agent Application of Dijkstra's Shortest Path Algorithm (22:37)
[lecture notes]
 An overview of the internals of the implementation for the last project.
 Lecture 24: Greedy Tree Builder

Minimum spanning tree problem definition (4:15)
[lecture notes]
 We define the minimum spanning tree problem.

Maximum bottleneck problem definition (3:11)
[lecture notes]
 We define the maximum bottleneck problem.

Greedy tree builder (55:41)
[lecture notes]
 We define a generalization of Dijkstra's single-source shortest path problem and also
Prim's minimum spanning tree problem. We discuss how the greedy tree builder can be used
to solve either of these two problems, and also to solve the maximum bottleneck problem.

Time complexity analysis of the greedy tree builder (15:45)
[lecture notes]
 We analyze the time complexity of the greedy tree builder (and hence of both Dijkstra's
single-source shortest path algorithm and Prim's minimum spanning tree algorithm) in terms
of the time complexity of the priority queue ADT methods that dominate the time complexity of
the greedy tree builder.
 Lecture 25: Prim's and Kruskal's Minimum Spanning Tree (MST) Algorithm

Correctness argument for Dijkstra's algorithm (23:35)
[lecture notes]
 We argue that Dijkstra's single-source shortest path algorithm is correct.

Correctness argument for Prim's minimum spanning tree algorithm (23:44)
[lecture notes]
 We argue that Prim's minimum spanning tree algorithm is correct.

Kruskal's minimum spanning tree algorithm (11:48)
[lecture notes]
 We present Kruskal's minimum spanning tree algorithm and prove its correctness. We also briefly
introduce the union-find data structure as a way to efficiently implement Kruskal's algorithm. Finally,
we analyze the time complexity of Kruskal's algorithm and briefly discuss it in relation to Prim's
algorithm.
 Lecture 26: DFS and Topological Sort

Depth-first search (33:43)
[lecture notes]
 We present depth-first search (DFS).

Topological sort (30:14)
[lecture notes]
 We present how depth-first search can be used to order a set of processes in a precedence graph,
and prove that the algorithm is correct. This problem is called topological sort since it is trying to
order (or sort) the processes in a way that adheres to the given precedence constraints.
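The depth-first approach can be sketched as follows (with our own names): a vertex finishes only after everything reachable from it has finished, so listing vertices in reverse finish order respects every precedence edge of a directed acyclic graph.

```java
import java.util.*;

// Topological sort of a DAG by depth-first search: push each vertex onto
// a stack when DFS finishes it; popping the stack yields an order in
// which every edge goes from an earlier vertex to a later one.
public class TopologicalSort {

    public static List<Integer> order(List<List<Integer>> adj) {
        int n = adj.size();
        boolean[] visited = new boolean[n];
        Deque<Integer> finished = new ArrayDeque<>();       // used as a stack
        for (int v = 0; v < n; v++)
            if (!visited[v]) dfs(v, adj, visited, finished);
        List<Integer> order = new ArrayList<>();
        while (!finished.isEmpty()) order.add(finished.pop()); // reverse finish order
        return order;
    }

    static void dfs(int u, List<List<Integer>> adj, boolean[] visited, Deque<Integer> finished) {
        visited[u] = true;
        for (int v : adj.get(u))
            if (!visited[v]) dfs(v, adj, visited, finished);
        finished.push(u);   // u finishes after everything it can reach
    }

    public static void main(String[] args) {
        // precedence edges: 0->1, 0->2, 1->3, 2->3
        List<List<Integer>> adj = List.of(List.of(1, 2), List.of(3), List.of(3), List.<Integer>of());
        System.out.println(order(adj)); // a valid order, e.g. [0, 2, 1, 3]
    }
}
```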

Strongly connected components (5:51)
[lecture notes]
 We define the strongly connected components of a directed graph, and then briefly describe how
depth-first search can be used to compute them.
 Lecture 27: Garbage Collection Algorithms

What is garbage collection? (11:56)
[lecture notes]
 We discuss memory management and how one defines which memory cells are garbage.

Mark-and-sweep garbage collection algorithm (51:02)
[lecture notes]
 We describe the mark-and-sweep garbage collection algorithm, including
in-place depth-first search and why that is so important for this
application. We close with a brief analysis of the mark-and-sweep algorithm and an example
where novice Java programmers often unnecessarily create garbage.

Copying garbage collection algorithm (14:16)
[lecture notes]
 We briefly overview the copying garbage collection algorithm and
discuss the tradeoffs between mark-and-sweep and copying collection.
 Lecture 28: Course Review

Graphs (15:23)
[lecture notes]
 We review the portion of the course on graphs and graph algorithms.

ADT taxonomy (11:59)
[lecture notes]
 We review the ADT taxonomy introduced and used throughout this course to select the best
ADT(s) for a given application.

Data structures (13:12)
[lecture notes]
 We review the data structures covered in this course.

Other topics (7:05)
[lecture notes]
 We review the remaining topics in this course.