Title: CS187 Programming with Data Structures Applications from James Allen
1. CS187 Programming with Data Structures Applications (from James Allen)
- Oliver Brock
- UMass Amherst
- Spring 2009
2. Topics/structures we have covered
- Lists
- Iterators
- Singly or doubly linked
- Stack
- Queue
- Recursion
- Trees
- Self-balancing
- Heaps
- Sets
- Maps
- Hashes
3. This class
- Will present a problem
- Design data structure aspects of solution
- Won't worry about specifics of coding
- Presented solutions are one possibility only
- Numerous other possibilities may be valid
- Assumptions
- No database system available
- All structures in Java (this is a Java class, after all)
- Might be useful to extend a Java structure
- Memory/disk issue should be considered
- Where should data be stored, while used and for the future?
4. Problem one
Lists Stack Queue Recursion Trees Heaps Sets Maps Hashes
- User authentication component
- Needed capabilities
- Add new users
- Userid, name, email address, password, ...
- Modify their information
- Look up for validation of password
5. Solution to problem one
- Individual data, stored in a userRecord object
- String userid
- String name
- String password
- String emailAddress
- Needs
- Rapid access by userid
- Userid is unique
- Data structure
- Hashing by userid
- HashMap<userid, userRecord>
- .hashCode() would use userid only
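
A minimal sketch of this design, assuming Java's `HashMap` keyed by the userid string. The `UserStore` wrapper and its method names are hypothetical, and the plain-text password comparison is for illustration only. Because the key is the userid `String`, hashing uses the userid alone, as the slide suggests.

```java
import java.util.HashMap;
import java.util.Map;

/** One user's data; field names follow the slide. */
class UserRecord {
    final String userid;
    String name;
    String password;
    String emailAddress;

    UserRecord(String userid, String name, String password, String emailAddress) {
        this.userid = userid;
        this.name = name;
        this.password = password;
        this.emailAddress = emailAddress;
    }
}

/** Hypothetical wrapper around a HashMap keyed by userid. */
class UserStore {
    private final Map<String, UserRecord> byUserid = new HashMap<>();

    /** Adds a user; returns false if the userid is already taken. */
    boolean addUser(UserRecord r) {
        return byUserid.putIfAbsent(r.userid, r) == null;
    }

    /** Expected constant-time lookup; returns null for an unknown userid. */
    UserRecord lookup(String userid) {
        return byUserid.get(userid);
    }

    /** Validates a password (plain-text comparison, illustration only). */
    boolean validate(String userid, String password) {
        UserRecord r = byUserid.get(userid);
        return r != null && r.password.equals(password);
    }
}
```

Modifying a user's information is then a `lookup` followed by field updates on the returned record.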
6. Problem two
- Same as before
- User authentication component
- Needed capabilities
- Add new users
- Userid, name, email address, password, ...
- Modify their information
- Look up for validation of password
- But
- Need to suggest alternate userids
- For registration
- Need to look up userid by email address
- Forgot password?
- Need to list users in name order (not userid order)
7. Solution to problem two
- Hash structure still needed
- How to look up by email address?
- Iterate through HashMap
- Second structure hashed by email address
- Probably just email address and userid
- Don't want to duplicate other information
- (Called normal forms; see CS445)
- HashMap sort of like userid -> password
- Need sorted order by name
- Again, need another (third) structure
- Since just listing (not lookup), could use
- LinkedList, PriorityQueue
- Trees probably overkill, unless lists are very long
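
One way to sketch the extra structures, under stated assumptions: the `UserDirectory` class, its `User` record, the duplicate-email check, and the `base1`, `base2`, ... suggestion policy are all hypothetical. For brevity the name-ordered listing sorts a copy on demand instead of maintaining a third structure, which is a simplification of the slide's design.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UserDirectory {
    /** Minimal user record for this sketch. */
    static class User {
        final String userid, name, email;
        User(String userid, String name, String email) {
            this.userid = userid;
            this.name = name;
            this.email = email;
        }
    }

    // Primary structure: userid -> full record.
    private final Map<String, User> byUserid = new HashMap<>();
    // Secondary structure: email -> userid only, so other data is not duplicated.
    private final Map<String, String> useridByEmail = new HashMap<>();

    /** Adds a user; rejects a duplicate userid or email (assumed policy). */
    boolean add(User u) {
        if (byUserid.containsKey(u.userid) || useridByEmail.containsKey(u.email)) {
            return false;
        }
        byUserid.put(u.userid, u);
        useridByEmail.put(u.email, u.userid);
        return true;
    }

    /** "Forgot password?" path: email -> userid -> full record. */
    User findByEmail(String email) {
        String userid = useridByEmail.get(email);
        return userid == null ? null : byUserid.get(userid);
    }

    /** Suggests the first free userid among base, base1, base2, ... */
    String suggestUserid(String base) {
        if (!byUserid.containsKey(base)) return base;
        int n = 1;
        while (byUserid.containsKey(base + n)) n++;
        return base + n;
    }

    /** Lists users in name order by sorting a copy of the records. */
    List<User> listByName() {
        List<User> users = new ArrayList<>(byUserid.values());
        users.sort(Comparator.comparing((User u) -> u.name));
        return users;
    }
}
```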
8. Problem three
- You are helping to support the implementation of a new programming language's compiler
- Source code -> something that runs
- E.g., X.java -> X.class
- You are charged with keeping track of variables
- Name, type, scope
- What else?
- Lookup is by variable name
9. Solution to problem three
- No need for disk access
- Functionality
- Rapid storage of variables and scope
- Rapid lookup
- By variable name and current scope
- Ability to purge variables when a scope is exited
- Possibilities
- Hash by <variable, scope> allows storage and lookup
- Sorted linked lists by scope allows easy deletion by scope
- Sorted ArrayList type of structure if not too large
- Perhaps secondary store to list all variables in a particular scope
- HashMap scope -> set of variable names
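
The hash-plus-secondary-store option could look roughly like the sketch below. It assumes scopes are identified by integers and that only a type string is stored per variable; the `SymbolTable` class and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class SymbolTable {
    /** Composite key <variable, scope>; integer scope ids are an assumption. */
    private static final class Key {
        final String name;
        final int scope;

        Key(String name, int scope) {
            this.name = name;
            this.scope = scope;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return scope == k.scope && name.equals(k.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, scope);
        }
    }

    // Primary store: <variable, scope> -> type.
    private final Map<Key, String> typeByKey = new HashMap<>();
    // Secondary store: scope -> set of variable names declared in it.
    private final Map<Integer, Set<String>> namesByScope = new HashMap<>();

    void declare(String name, int scope, String type) {
        typeByKey.put(new Key(name, scope), type);
        namesByScope.computeIfAbsent(scope, s -> new HashSet<>()).add(name);
    }

    /** Returns the declared type, or null if the name is not in that scope. */
    String lookup(String name, int scope) {
        return typeByKey.get(new Key(name, scope));
    }

    /** Purges every variable declared in the scope being exited. */
    void exitScope(int scope) {
        Set<String> names = namesByScope.remove(scope);
        if (names == null) return;
        for (String n : names) {
            typeByKey.remove(new Key(n, scope));
        }
    }
}
```

The secondary map is what makes `exitScope` cheap: without it, purging a scope would require scanning every entry in the primary map.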
10. Problem four
- Extract all words in a very large file and find the 20 most frequent
11. Solution to problem four
- Functionality
- Map from word to a count (integer)
- Sort by the count
- Note that sorting is only needed at the end
- Possibility
- HashMap<String, Integer>
- When done, put into a max-Heap sorted by count
- Pull out first 20 entries
- Could also keep running track of top 20
- Will be faster if file is sufficiently large
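
The count-then-heap possibility can be sketched as follows, assuming the words have already been extracted from the file (tokenizing is left out). The `WordFrequency` class and its method names are hypothetical; the max-heap is a `PriorityQueue` with a reversed comparator on the count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class WordFrequency {
    /** Builds the word -> count map in one pass over the words. */
    static Map<String, Integer> countWords(Iterable<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    /** Loads all entries into a max-heap by count and pulls out the first k. */
    static List<Map.Entry<String, Integer>> topK(Map<String, Integer> counts, int k) {
        PriorityQueue<Map.Entry<String, Integer>> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        heap.addAll(counts.entrySet());

        List<Map.Entry<String, Integer>> top = new ArrayList<>();
        while (top.size() < k && !heap.isEmpty()) {
            top.add(heap.poll());
        }
        return top;
    }
}
```

Calling `topK(counts, 20)` gives the 20 most frequent words; the order among words with equal counts is unspecified.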
12. Problem five
- You are implementing a spreadsheet and have to keep track of dependencies
- Cell A15 is the sum of cell B3 and C13
- If B3 changes, you know to update A15
- Given a cell ID that has changed
- Return list of cells to be recalculated
- Return them in order
- If F6 is twice the value in A15, then the order above after changing B3 would be A15 then F6
- Simplification
- This spreadsheet does not allow a cell to be referenced by more than one other cell
- (We'll provide structures to remove this in two weeks)
13. Solution to problem five
- Observations
- A cell's value can depend on any number of other cells
- It can have multiple children
- A cell's value can be used by at most one other cell
- It can have only one parent
- Loops are not possible
- Cannot have A15 depend on A14 and A14 depend on A15
- Spreadsheets disallow this
- Sounds like a tree (really a set of trees)
- Functionality
- Given a cell, find its location in the tree
- Find all ancestors of the cell in the tree
- Its parent, grandparent, etc.: the path to the root
- Implementation is a variation of P3 addEdge()
- Also need pathToRoot()
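
A minimal sketch of the set-of-trees idea, assuming each cell stores only its single parent in a map. The method names `addEdge` and `pathToRoot` come from the slide, but this is not the course's P3 code; the `DependencyForest` class and the signatures are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DependencyForest {
    // Each cell has at most one parent: the single cell that uses its value.
    // The map also locates a cell in its tree in expected constant time.
    private final Map<String, String> parentOf = new HashMap<>();

    /** Records that 'parent' is computed from 'child' (e.g. A15 from B3). */
    void addEdge(String child, String parent) {
        parentOf.put(child, parent);
    }

    /** Cells to recalculate after 'cell' changes, nearest ancestor first. */
    List<String> pathToRoot(String cell) {
        List<String> path = new ArrayList<>();
        for (String p = parentOf.get(cell); p != null; p = parentOf.get(p)) {
            path.add(p);
        }
        return path;
    }
}
```

With edges B3 to A15, C13 to A15, and A15 to F6, `pathToRoot("B3")` returns A15 then F6, matching the order in the problem statement. The loop terminates only because cycles are disallowed.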
14. Problem six
- You need to assign seats to audience members as they buy tickets
- Seats are assigned first-come first-served
- Groups always placed together in a row
- Cannot buy tickets if no span that big
- Patrons can select group of seats from list
- Most groups prefer seats near the front and in the center
15. Solution to problem six
- Functionality
- Keep track of which seats taken
- Find all spans of N seats
- Sort them in front/center order
- Keeping track
- Perhaps a simple array of arrays
- Each row is a boolean array
- The auditorium is an array of rows
- Finding spans
- Scan each row and generate <span, row, middleSeat>
- Sorting spans
- Convert row and middleSeat to a number (somehow)
- Store spans in a sorted list, a heap, a sorted set, ...
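
The scan-and-sort steps might be sketched as below. The slide leaves the conversion of row and middleSeat to a number open ("somehow"), so the `score` formula here is purely an assumption: one row farther back costs the same as one seat farther from the center. The `Auditorium` and `Span` names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class Auditorium {
    /** A run of n adjacent free seats: its row and the index of its first seat. */
    static class Span {
        final int row, start, length;

        Span(int row, int start, int length) {
            this.row = row;
            this.start = start;
            this.length = length;
        }

        /** Twice the middle seat position, kept as an int to avoid halves. */
        int middleTimesTwo() {
            return 2 * start + length - 1;
        }
    }

    // taken[row][seat]; row 0 is assumed to be the front row.
    private final boolean[][] taken;

    Auditorium(int rows, int seatsPerRow) {
        taken = new boolean[rows][seatsPerRow];
    }

    void take(int row, int seat) {
        taken[row][seat] = true;
    }

    /** Finds every window of n adjacent free seats, best-scoring first. */
    List<Span> findSpans(int n) {
        List<Span> spans = new ArrayList<>();
        for (int r = 0; r < taken.length; r++) {
            int free = 0; // length of the run of free seats ending at seat s
            for (int s = 0; s < taken[r].length; s++) {
                free = taken[r][s] ? 0 : free + 1;
                if (free >= n) {
                    spans.add(new Span(r, s - n + 1, n));
                }
            }
        }
        spans.sort(Comparator.comparingInt((Span sp) -> score(sp)));
        return spans;
    }

    /** Assumed score (lower is better): distance from front plus distance from center. */
    private int score(Span sp) {
        int centerTimesTwo = taken[sp.row].length - 1;
        return 2 * sp.row + Math.abs(sp.middleTimesTwo() - centerTimesTwo);
    }
}
```

An empty result from `findSpans(n)` corresponds to the "cannot buy tickets if no span that big" case.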
16. Problem seven
- You're implementing part of a search engine
- Given a word, you need to store a list of every document that contains the word
- The list will only be accessed sequentially
- It will be sorted by document
- The list will be huge and will not fit in memory
17. Solution to problem seven
- Issues
- Need to have list sorted by document
- Need to minimize disk access
- But cannot have it all in memory
- Reminder
- Computers fetch blocks efficiently
- Want big but appropriately sized blocks
- Implementation
- Unspecified access to list by word
- Linked list of blocks
- Sorted list (ArrayList<docid>) in each block
- Note that constructing list might not store it sorted this way
- Compare to the word count problem from earlier
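
The block layout can be illustrated with an in-memory sketch. It assumes integer docids appended in increasing order and a hypothetical `blockSize` parameter; a real implementation would read and write each block on disk, which is omitted here. The `PostingList` name is also an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.NoSuchElementException;

class PostingList implements Iterable<Integer> {
    private final int blockSize; // hypothetical: number of docids per block
    private final LinkedList<ArrayList<Integer>> blocks = new LinkedList<>();

    PostingList(int blockSize) {
        this.blockSize = blockSize;
    }

    /** Appends a docid; callers are assumed to add docids in increasing order. */
    void append(int docid) {
        if (blocks.isEmpty() || blocks.getLast().size() == blockSize) {
            blocks.add(new ArrayList<>(blockSize));
        }
        blocks.getLast().add(docid);
    }

    int blockCount() {
        return blocks.size();
    }

    /** Sequential access: walks one block at a time, in docid order. */
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private final Iterator<ArrayList<Integer>> blockIt = blocks.iterator();
            private Iterator<Integer> inBlock = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Move to the next block only when the current one is exhausted.
                while (!inBlock.hasNext() && blockIt.hasNext()) {
                    inBlock = blockIt.next().iterator();
                }
                return inBlock.hasNext();
            }

            @Override
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return inBlock.next();
            }
        };
    }
}
```

Only one block needs to be resident at a time during a scan, which is the point of choosing an appropriately sized block.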
18. Problem eight
- You are compiling a list of bird sightings
- Each birder has some device for noting sightings
- Bird's name
- When done, all lists need to be combined
- Want to know
- Which birds were seen
- Which birds (from a larger group) were not seen
19. Solution to problem eight
- Just need to know which birds were seen
- Don't need to know where, when, etc.
- Don't care if seen more than once
- Great use of Set
- When combining, do the union
- To compare, subtract seenSet from expectedSet
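
The union and subtraction steps map directly onto `java.util.Set` operations. A short sketch, with a hypothetical `BirdSightings` class and method names:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BirdSightings {
    /** Union of every birder's list; duplicate sightings collapse automatically. */
    static Set<String> combine(List<List<String>> lists) {
        Set<String> seen = new HashSet<>();
        for (List<String> list : lists) {
            seen.addAll(list);
        }
        return seen;
    }

    /** expectedSet minus seenSet: birds from the larger group that nobody reported. */
    static Set<String> notSeen(Set<String> expected, Set<String> seen) {
        Set<String> missing = new HashSet<>(expected); // copy so 'expected' is unchanged
        missing.removeAll(seen);
        return missing;
    }
}
```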
20. Problem nine
- You are helping host a web server
- You need to keep track of accesses
- URL, who loaded it, and when they did so
- This is just a log
- Will be occasionally scanned
21. Solution to problem nine
- Easiest solution is to dump the data to a text file
- (Essentially a list ordered by time added)
- No fancier data structures needed
22. Problem ten
- You need a program that plays tic-tac-toe (or some other game)
- Assume a game without too many possible moves
- You solve it using a brute force approach
- Given current state of the board
- Try all possible moves you can take
- From there, all of your opponent's possible moves
- For every one of your possible moves
- And so on until all games are won by someone
- You need to find your best move
23. Solution to problem ten
- General idea: tree search problem
- Nodes in tree represent board states
- Root is current state
- Children of root are all of your possible moves
- Grandchildren of root are all possibilities of opponent
- Functionality
- Construct the tree given the current state
- Recognize a completed/winning board (and who won)
- Score a node based on how likely it is that you won
- Many other possibilities could be used here
- Implementation
- Board data structure
- Recursion to generate tree of possible moves
- Recursion to search tree and score nodes
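
As one concrete possibility, the sketch below uses minimax scoring (+1 win, 0 draw, -1 loss), which is a specific choice and not necessarily the scoring the slide has in mind. It assumes a 9-character board of 'X', 'O', and ' ', and it explores the tree through recursion without storing nodes explicitly. The `TicTacToe` class and method names are hypothetical.

```java
class TicTacToe {
    private static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}               // diagonals
    };

    /** Returns 'X' or 'O' if that player has three in a row, otherwise ' '. */
    static char winner(char[] b) {
        for (int[] l : LINES) {
            if (b[l[0]] != ' ' && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]]) {
                return b[l[0]];
            }
        }
        return ' ';
    }

    static char other(char player) {
        return player == 'X' ? 'O' : 'X';
    }

    /** Minimax score of the board for 'me' when 'toMove' plays next. */
    static int score(char[] b, char me, char toMove) {
        char w = winner(b);
        if (w == me) return 1;
        if (w != ' ') return -1;

        boolean moved = false;
        int best = (toMove == me) ? -2 : 2; // sentinel outside the range [-1, 1]
        for (int i = 0; i < 9; i++) {
            if (b[i] != ' ') continue;
            moved = true;
            b[i] = toMove;                       // try the move
            int s = score(b, me, other(toMove)); // recurse on the child state
            b[i] = ' ';                          // undo the move
            best = (toMove == me) ? Math.max(best, s) : Math.min(best, s);
        }
        return moved ? best : 0; // full board with no winner is a draw
    }

    /** Index (0..8) of the best move for 'me', or -1 if the board is full. */
    static int bestMove(char[] b, char me) {
        int bestIdx = -1;
        int bestScore = -2;
        for (int i = 0; i < 9; i++) {
            if (b[i] != ' ') continue;
            b[i] = me;
            int s = score(b, me, other(me));
            b[i] = ' ';
            if (s > bestScore) {
                bestScore = s;
                bestIdx = i;
            }
        }
        return bestIdx;
    }
}
```

Exhaustive search like this is practical only because tic-tac-toe has few possible moves, as the problem statement assumes.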
24. Problem eleven
- You are helping to implement a path planning system for a robot
- It needs to move from point A to point B
- The algorithm knows paths between some points but not all of them
- To get a path from C to D it estimates a point E between them and then finds the path from C to E then E to D
- Your job is to store the sets of paths that need resolution
- Want to get moving as quickly as possible
25. Solution to problem eleven
- Observations
- This is a recursive problem and you could solve it entirely recursively
- Without recursion, break problem into smaller pieces and store those pieces somewhere
- Data structures
- A stack or deque is sufficient for this
- A traditional queue won't work because need to put sub-problems at the start of the work list
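
A toy sketch of the work-list idea, using `ArrayDeque` as the stack. Everything about the geometry is an assumption made for illustration: points are integers on a line, a path is "known" only between neighboring points, and the estimated in-between point is the midpoint. The `PathPlanner` class and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PathPlanner {
    /** Hypothetical stand-in: a direct path is known only between neighbors. */
    static boolean isKnown(int from, int to) {
        return Math.abs(to - from) <= 1;
    }

    /** Returns the known segments from 'from' to 'to', in travel order. */
    static List<int[]> plan(int from, int to) {
        List<int[]> route = new ArrayList<>();
        Deque<int[]> work = new ArrayDeque<>(); // used as a stack
        work.push(new int[] {from, to});

        while (!work.isEmpty()) {
            int[] seg = work.pop();
            if (isKnown(seg[0], seg[1])) {
                route.add(seg); // resolved: the robot can traverse this piece
                continue;
            }
            int mid = (seg[0] + seg[1]) / 2; // estimated point E between C and D
            // Push the second half first so the first half is resolved next.
            work.push(new int[] {mid, seg[1]});
            work.push(new int[] {seg[0], mid});
        }
        return route;
    }
}
```

Because sub-problems go on the front of the work list, the earliest part of the route is resolved first, which fits the goal of getting moving as quickly as possible.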