-
Notifications
You must be signed in to change notification settings - Fork 0
/
readme
337 lines (286 loc) · 18.9 KB
/
readme
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
CS 61B Project 3
Weighted Undirected Graphs and Minimum Spanning Trees
Due 5:30 pm Friday, May 3, 2013
This is a team project. Form a team of 2 or 3 people. No teams of 1 or teams
of 4 or more are allowed.
Copy the Project 3 directory by doing the following, starting from your home
directory.
cp -r ~cs61b/hw/pj3 .
A figure accompanies this "readme" as the files pj3graph.ps (PostScript) or
pj3graph.pdf (PDF). Both files are the same figure.
The purpose of this project is to become comfortable with applications that
combine a variety of data structures and algorithms in a big ol' mess of
references pointing just every which way. I'm not even kidding.
Part I: Implement a Weighted Undirected Graph
==============================================
Implement a well-encapsulated ADT called WUGraph in a package called graph.
A WUGraph represents a weighted, undirected graph in which self-edges are
allowed. Any object whatsoever can serve as a vertex of a WUGraph.
For maximum speed, you must store edges in two data structures: unordered
doubly-linked adjacency lists and a hash table. You are expected to support
the following public methods in the running times specified. (You may ignore
hash table resizing time when trying to achieve a specified running time--but
your hash table should resize itself when necessary to keep the load factor
constant.) Below, |V| is the number of vertices in the graph, and d is the
degree of the vertex in question.
O(1) WUGraph(); construct a graph having no vertices or edges.
O(1) int vertexCount(); return the number of vertices in the graph.
O(1) int edgeCount(); return the number of edges in the graph.
O(|V|) Object[] getVertices(); return an array of all the vertices.
O(1) void addVertex(Object); add a vertex to the graph.
O(d) void removeVertex(Object); remove a vertex from the graph.
O(1) boolean isVertex(Object); is this object a vertex of the graph?
O(1) int degree(Object); return the degree of a vertex.
O(d) Neighbors getNeighbors(Object); return the neighbors of a vertex.
O(1) void addEdge(Object, Object, int); add an edge of specified weight.
O(1) void removeEdge(Object, Object); remove an edge from the graph.
O(1) boolean isEdge(Object, Object); is this edge in the graph?
O(1) int weight(Object, Object); return the weight of this edge.
A "neighbor" of a vertex is any vertex connected to it by an edge. See the
file graph/WUGraph.java for details of exactly how each of these methods should
behave.
Here are some of the design elements that will help achieve these goals.
[1] A calling application can use any object whatsoever to be a "vertex" from
its point of view. You will also need to have an internal object that
represents a vertex in a WUGraph and maintains its adjacency list; this
object is HIDDEN from the application. Therefore, you will need a fast
way to map the application's vertex objects to your internal vertex
objects. The best way to do this is to use the hash table you implemented
for Homework 6--modified so it resizes itself to keep the load factor
constant as |V| changes. (You may NOT use Java's built-in hash tables.)
The hash table also makes it possible to support isVertex() in O(1) time.
In Java, every object has a hashCode() method. The default hashCode()
is defined in the Object class, and hashes the _reference_ to the Object.
(This is not something you could do yourself, because Java does not give
you direct access to memory addresses.) This means that for some classes,
two distinct objects can act as different keys, even if their fields are
identical. However, many object classes such as Integer and String
override hashCode() so that items with the same fields are equals().
Recall that Java has a convention that if two objects are equals(), they
have the same hash code; defying this convention tends to break hash
tables badly. For the purposes of this project, two objects provided by
the calling application represent the same vertex if they are equals().
[2] To support getVertices() in O(|V|) time, you will need to maintain a list
of vertices. To support removeVertex() in O(d) time, the list of vertices
should be doubly-linked. getVertices() returns the objects that were
provided by the calling application in calls to addVertex(), NOT the
WUGraph's internal vertex data structure(s), which should always be
hidden. Hence, each internal vertex representation must include
a reference to the corresponding object that the calling application is
using as a vertex.
Alternatively, you could implement getVertices() by traversing your hash
table. However, this runs in O(|V|) time ONLY if your hash table resizes
in both directions--specifically, it must shrink when the load factor
drops below a constant. Otherwise, it will run too slowly if we add many
vertices to a graph (causing your table to grow very large) then remove
most of them.
[3] To support getNeighbors() in O(d) time, you will need to maintain an
adjacency list of edges for each vertex. To support removeEdge() in O(1)
time, each list of edges must be doubly-linked.
[4] Because a WUGraph is undirected, each edge (u, v) must appear in two
adjacency lists (unless u == v): u's and v's. If we remove u from the
graph, we must remove every edge incident on u from the adjacency lists
of u's neighbors. To support removeVertex() in O(d) time, we cannot walk
through all these adjacency lists. There are several ways you could
obtain O(d) time, and you may use any of these options:
[i] Since (u, v) appears in two lists, we could use two nodes to
represent (u, v); one in u's list, and one in v's list.
Each of these nodes might be called a "half-edge," and each is
the other's "partner." Each half-edge has forward and backward
references to link it into an adjacency list. Each half-edge
also maintains a reference to its partner. That way, when we
remove u from the graph, we can traverse u's adjacency list and
use the partner references to find and remove each half-edge's
partner from the adjacency lists of u's neighbors in O(1) time
per edge. This option is illustrated in the accompanying
figure, pj3graph.ps or pj3graph.pdf (both figures are the same).
[ii] You could use just one object to represent (u, v), but equip it
with two forward and two backward references. However, you must
be careful to follow the right references as you traverse
a node's adjacency list.
[iii] If you want to use an encapsulated DList class, you could use
just a single object to represent an edge, and put this object
into both adjacency lists. The edge object contains two
DListNode references (signifying its position in each DList), so
it can extract itself from both adjacency lists in O(1) time.
[5] To support removeEdge(), isEdge(), and weight() in O(1) time, you will
need a _second_ hash table for edges. The second hash table maps an
unordered pair of objects (both representing application-supplied vertices
in the graph) to your internal edge data structure. (If you are using
half-edges, following suggestion [4i] above, you could use the reference
from one half-edge to find the other.) To help you hash an edge in a
manner that does not depend on the order of the two vertices, I have
provided a class VertexPair.java designed for use as a key in hash tables.
The methods VertexPair.hashCode and VertexPair.equals are written so that
(u, v) and (v, u) are considered to be equal keys with the same hash code.
Read them, but don't change them unless you know what you're doing.
We recommend you use the VertexPair class as the key for your edge hash
table. However, you are not required to do so, and you may change
VertexPair.java freely to suit your needs.
(Technically, you don't need a second hash table; you could store vertices
and edges in the same table. However, you risk confusing yourself; having
two separate hash tables eases debugging and reduces the likelihood of
human error. But it's your decision.)
To support removeVertex() in O(d) time, you will need to remove the edges
incident on a vertex from the hash table as well as the adjacency lists.
You will also need to adjust the vertex degrees. Hence, each edge or
half-edge should have references to the vertices it is incident on.
[6] To support vertexCount(), edgeCount(), and degree() in O(1) time, you will
need to maintain counts of the vertices, the edges, and the degree of each
vertex, and keep these counts updated with every operation.
For those of you who are keeping score, my own Part I solution is 350 lines
long, not counting the hash table code.
Interfaces
----------
You may NOT change Neighbors.java or the signatures and behavior of
WUGraph.java. We will test that your WUGraph class correctly implements the
interface we have specified.
Neighbors.java is a class provided so the method WUGraph.getNeighbors() can
return two arrays at once. That is its only purpose. It is not appropriate to
use the Neighbors class for any other purpose. You CANNOT change it, because
it is part of the interface of getNeighbors(), and calling programs (including
the autograder) are relying on you to return Neighbors objects according to
spec. It appears as follows:
public class Neighbors {
public Object[] neighborList;
public int[] weightList;
}
Given an input vertex, getNeighbors() returns a Neighbors object. neighborList
is a list of all the vertices (application-provided objects, not internal
vertex representations) connected by an edge to the input vertex (including the
input vertex itself if it has a self-edge). weightList lists the weight of
each edge. The length of both lists is the degree of the input vertex.
getNeighbors() should construct and return a _new_ Neighbors object each time
it is called.
Your WUGraph should be well-encapsulated: no internal field or class used to
represent your graph should be public. The Neighbors class is public because
it's part of the interface of the WUGraph ADT, and it's not part of the
internal representation of your graph.
Part II: Kruskal's Algorithm for Minimum Spanning Trees
========================================================
Implement Kruskal's algorithm for finding the minimum spanning tree of a graph.
Minimum spanning trees, and Kruskal's algorithm for constructing them, are
discussed by Goodrich and Tamassia, Sections 13.6-13.6.1. Your algorithm
should be embodied in a static method called minSpanTree() in a class called
Kruskal, which is NOT in the graph package. Your minSpanTree() method should
not violate the encapsulation of the WUGraph ADT, and should only access a
WUGraph by calling the methods listed in Part I. You may NOT add any public
methods to the WUGraph class to make Part II easier (e.g., a method that
returns all the edges in a WUGraph). Remember the encapsulation rule: your
Kruskal code should work correctly with the WUGraph code of any other group
taking CS 61B, and your WUGraph code should work with their Kruskal code.
The signature of minSpanTree() is:
public static WUGraph minSpanTree(WUGraph g);
This method takes a WUGraph g and returns another WUGraph that represents the
minimum spanning tree of g. The original WUGraph g is NOT changed! Let G be
the graph represented by the WUGraph g. Your implementation should run in
O(|V| + |E| log |E|) time, where |V| is the number of vertices in G, and |E| is
the number of edges in G.
Kruskal's algorithm works as follows.
[1] Create a new graph T having the same vertices as G, but no edges (yet).
Upon completion, T will be the minimum spanning tree of G.
[2] Make a list (not necessarily linked) of all the edges in G. You cannot
build this list by calling isEdge() on every pair of vertices, because
that would take O(|V|^2) time. You will need to use multiple calls to
getNeighbors() to obtain the complete list of edges.
Note that your edge data structure should be defined separately from any
edge data structure you use in WUGraph.java (Part I). Encapsulation
requires that the internal data structures of the WUGraph class not be
exposed to applications (including Kruskal).
[3] Sort the edges by weight in O(|E| log |E|) time. You may write the
sorting algorithm yourself or use one from lab, but do not use a Java
library sorting method. (You can instead use a priority queue, as
Goodrich and Tamassia suggest, but sorting in advance is more
straightforward and is probably faster.)
[4] Finally, find the edges of T using disjoint sets, as described in Lecture
33 and Goodrich & Tamassia Section 11.4. The disjoint sets code from
Lecture 33 is included in DisjointSets.java in the "set" package/
directory. To use the disjoint sets code, you will need a way to map the
objects that serve as vertices to unique integers. Again, hash tables
are a good way to accomplish this. (You cannot use the same hash table as
the WUGraph; that should be encapsulated so Kruskal can't see it.)
Be forewarned that the DisjointSets class has no error checking, and will
fail catastrophically if you union() two vertices that are not roots of
their respective sets, or if you union() a vertex with itself. If you
add simple error checking, it might save you a lot of debugging time (here
and in Homework 9).
My own Part II solution is 100 lines long, not counting the sorting method or
hash table code.
Since Parts I and II are on opposite sides of the WUGraph interface, a partner
can easily begin Part II before Part I is working.
Style Rules
===========
You will be graded on style, documentation, efficiency, and the use of
encapsulation.
1) Each method must be preceded by a comment describing its behavior
unambiguously. These comments must include descriptions of what each
parameter is for, and what the method returns (if anything).
They must also include a description of what the method does (though
not how it does it) detailed enough that somebody else could implement
a method that does the same thing from scratch.
2) All classes, fields, and methods must have the proper public/private/
protected/package qualifier. We will deduct points if you make things
public that could conceivably allow a user to corrupt the data structure.
3) We will deduct points for code that does not match the following style
guidelines.
- Classes that contain extraneous debugging code, print statements, or
meaningless comments that make the code hard to read will be penalized.
- Your file should be indented in the manner enforced by Emacs (e.g., a
two-space or four-space indentation inside braces), and used in the lecture
notes throughout the semester. The indentation should clearly show the
structure of nested statements like loops and if statements.
- All if, else, while, do, and for statements should use braces.
- All classes start with a capital letter, all methods and (non-final) data
fields start with a lower case letter, and in both cases, each new word
within the name starts with a capital letter. Constants (final fields) are
in all capital letters.
- Numerical constants with special meaning should always be represented by
all-caps "final static" constants.
- All class, method, field, and variable names should be meaningful to a
human reader.
- Methods should not exceed about 100 lines; any method that long can
probably be broken up into logical pieces. The same is probably true for
any method that needs more than 7 levels of indentation.
- Avoid unnecessary duplicated code; if you use the same (or very similar)
fifteen lines of code in two different places, those lines should probably
be a separate method call.
- Programs should be easy to read.
The Autograders
===============
If possible, make sure that your program passes both of the test programs
provided, WUGTest.java and KruskalTest.java. IMPORTANT NOTE: If you attempt
to cheat and thwart the test code by writing code that looks for specific tests
and provides canned answers, rather than by writing code that correctly
implements a weighted undirected graph data structure and Kruskal's algorithm,
the graders will notice, and you will receive a score of -20 and a letter at
the Office of Student Conduct. Please don't try it.
Submitting your Solution
========================
Write a file called GRADER that briefly documents your data structures and the
design decisions you made in WUGraph.java and Kruskal.java that extend or
depart from those discussed here. In particular, tell us what choices you made
in your implementation to ensure that removeVertex() runs in O(d) time (as
described in Part I, design element [4]) and getVertices() runs in O(|V|) time
(design element [2]).
Designate one member of your team to submit the project. If you resubmit, the
project should always be submitted by the same student. If for some reason a
different partner must submit (because the designated member is out of town,
for instance), you must send [email protected] a listing of your team members,
explaining which of them have submitted the project and why. Let us know which
submission you want graded.
The designated teammate only: make sure your project compiles and runs on the
_lab_ machines (with WUGTest and KruskalTest) just before you submit.
Change (cd) to your pj3 directory, which should contain your GRADER, your
Kruskal.java, the graph directory (package), the set directory (package),
the dict directory (package) containing your hash table, and possibly a list
directory (package) if you choose to use an encapsulated list ADT. The graph
directory should contain your WUGraph.java and (if you use it) VertexPair.java.
The set directory should contain whatever code you are using for disjoint sets.
If you are using VertexPair.java and/or DisjointSets.java, you must submit them
because you're allowed to change them; the autograder won't supply the original
files. Make sure any other files your project needs, possibly including a list
ADT, are present as well. You won't be able to submit Neighbors.java, because
you're not allowed to change it.
Type "submit pj3". You may submit as often as you like. Only the last version
you submit will be graded, unless you send email to [email protected] asking that
an earlier version be graded.