Title: Chapter 4: Linked Lists
1. Chapter 4: Linked Lists
2. The problem with arrays
- An array has a fixed size
- Its size is determined at compile time
- Dynamically allocating an array is very cumbersome
- Reserving memory that we aren't using is not an efficient solution to the problem
3. Linked Lists
- We could alter the way that we look at an array to make it more efficient
- Rather than having a static block of memory allocated to an array, what if we only set aside what we needed?
- In a manner of speaking, we could make an array of one element and then link it somehow to other elements as they are added
- We would link it to the other elements via a pointer
4. How to do a linked list
- We start with the primitive type of our list; for the sake of argument, we'll say that we are dealing with integers
- We create a new data type that contains both an integer and a pointer
  - The integer is the data we want to keep
  - The pointer points to the next item in the list
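The data type described above can be sketched as follows. This is a minimal sketch that assumes integer data; the member names item and next match the code on later slides, while linkAfter and countNodes are hypothetical helpers added only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// One element of the list: the data plus a pointer to the next element.
struct Node {
    int item;    // the integer we want to keep
    Node* next;  // the next item in the list, or NULL at the end
};

// Hypothetical helper: links added after last, which must be the final node.
void linkAfter(Node& last, Node& added) {
    assert(last.next == NULL);  // precondition: last really is the last node
    last.next = &added;
    added.next = NULL;
}

// Hypothetical helper: counts the nodes reachable from first.
int countNodes(const Node* first) {
    int count = 0;
    for (const Node* cur = first; cur != NULL; cur = cur->next)
        ++count;
    return count;
}
```

Each node is "an array of one element"; the list grows by linking one more node to the end instead of resizing a block of memory.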
5. Inserting items into the linked list
- If we want to insert an item, we simply instantiate our data type
- If the list is unsorted, we set a pointer in the list to point to the newly created element
- If the list is sorted, we find where it goes and change two pointers
  - The neophyte element copies the pointer value of the element before it
  - The element before our neophyte has its pointer changed to point to our new element
6. Deleting items from the linked list
- To delete an item, we look at the element before the one we want to delete
- Change that element's pointer to the pointer value of the one we're deleting
- We then return the deleted item's memory to the system ourselves to ensure that we do not create a memory leak!
- We can leave the deleted element around for a bit if we think we're going to use it again, but be sure that you remember to free it at some point
7. A holistic look at the linked list
8. Why we call it a linked list
- The items in the data structure are linked to one another, i.e. the first item points to the second, which in turn points to the third, which in turn points to the fourth
9. A few pointers
- A pointer variable, more commonly referred to as simply a pointer, contains the location or address in memory where the item in question resides
- By using pointers, we can quickly locate data that the operating system moves around
10. The Pointer sisters
- Let's say that Jennifer Pointer moves about quite a bit
  - She shops at all the trendiest places
  - She has many friends that she likes to visit
- If I want to find Jennifer Pointer, I need to know some constant about her that will allow me to locate her
- Unfortunately, Jennifer is a bit of a technophobe and does not like cellular telephones
- However, I do know where she lives
11. Where in the world is Jennifer Pointer?
- Since I know where she lives and that she is not given to moving her bivouac willy-nilly, I can drive to her abode and talk to her more homebody sibling, Melody Pointer
- Melody always knows where I can find Jennifer
- Melody tells me where Jennifer is and then I drive to that location to see her
- In this case, Melody becomes a pointer to Jennifer: she is a static (non-moving) reference to a dynamic (moving) resource. I can always find where Jennifer is because I know where to find Melody
12. A picture of what I mean
13. All this sounds great, but...
- How do I get a pointer variable p to point to a memory cell (location)?
- How do you use p to get to the content of the memory cell to which p points?
- Whoa there, partner, you're a bit ahead of yourself! First, let us declare p as a pointer
    int *p;
14. Be careful with your syntax!
- int *p, q; and int* p, q;
- are the same as saying
    int *p;
    int q;
- To declare both as pointers, we do the following
    int *p, *q;
15. When is this memory allocated?
- This memory is allocated statically, which means that it is done at compile-time
- We therefore refer to these variables as statically allocated variables.
16. Working with pointers
- In addition to being careful with our syntax, we also need to be a bit careful with how we handle pointers
- If we declare an integer x, we cannot just set p = x; this statement will be rejected by the compiler because p and x are different fundamental types
- We can, however, use the address or address-of operator, &
    p = &x;
- which places the address of x into p.
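The declaration, address-of, and dereference steps can be tried together in a few lines. This is only a sketch: storeThrough and addressOfDemo are hypothetical names, and the values are arbitrary.

```cpp
#include <cassert>

// Hypothetical helper: stores value in whatever int target points to.
void storeThrough(int* target, int value) {
    assert(target != 0);  // a null pointer must never be dereferenced
    *target = value;      // * follows the pointer to the memory cell
}

// Hypothetical demo; returns the final value of x.
int addressOfDemo() {
    int x = 5;
    int *p, q;   // p is a pointer to int; q is a plain int
    // p = x;    // rejected by the compiler: int* and int are different types
    p = &x;      // & yields the address of x
    q = *p;      // reads x through p, so q is now 5
    storeThrough(p, q + 37);  // writes 42 into x via the pointer
    return x;
}
```

The answer to "how do you use p to get to the content" is the dereference operator: *p names the cell that p points to, for both reading and writing.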
17. Dynamic memory allocation
- We can have a variable allocated at run-time; this is called dynamic memory allocation.
- A variable of this type is said to be a dynamically allocated variable (real shocker, eh?)
- We accomplish this by use of the C++ keyword new
    p = new int;
- It is important to note that after executing this statement, the value of the new memory cell, *p, is indeterminate; it is not initialized to some particular value
- You therefore need to initialize it to some value to prevent weird things from happening to your code!
18. Unused memory
- Suppose that we no longer require the services of a pointer
- We could set the pointer to NULL, which makes the pointer point to the language-default never-land, i.e. a pointer that is NULL should never be dereferenced until its value is reassigned to something more tangible such as 0xfcde0895 (just an example).
- But what if we know we're not going to use that pointer again?
19. Deleting a variable to recover memory
- In this case, we want to delete the variable so that the memory is recovered by the operating system
- If we do not do this, the memory cell remains allocated to the program, thus producing the much-feared memory leak
- We could delete p in the following manner
    delete p;
20. A few caveats on delete
- When we delete p, we do not de-allocate the pointer variable p itself
- We simply leave its contents undefined
- The memory cell it pointed to is no longer protected by the program or the OS
- p still holds the old address even though that cell no longer belongs to us, so referencing p after we have done a delete p can be disastrous!
- We stave off this problem by assigning
    p = NULL;
21. Why delete doesn't do this
- So why doesn't delete automatically set the value of the pointer to NULL?
- The system cannot always clearly determine which pointers should be set to NULL.
- You may have more than one pointer that points to that location
- It therefore remains the responsibility of the developer to set each such pointer to NULL.
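The aliasing problem is easy to see in a small sketch. The helper release and the demo function aliasDemo are hypothetical names; the point is that a routine can reset only the pointer it is handed, not any other copies of the same address.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: frees the cell that p points to and resets p itself.
// It cannot know about, or reset, any other pointer to the same cell.
void release(int*& p) {
    delete p;   // deleting a NULL pointer is harmless
    p = NULL;
}

// Hypothetical demo: returns true if both pointers end up NULL.
bool aliasDemo() {
    int* p = new int;
    *p = 7;          // initialize the new cell before reading it
    int* q = p;      // q is a second pointer to the same cell
    assert(*q == 7);
    release(p);      // p is now NULL, but q still holds the stale address
    q = NULL;        // the developer must reset every remaining alias by hand
    return p == NULL && q == NULL;
}
```

Between release(p) and q = NULL, q is a dangling pointer: it must not be dereferenced, and nothing in the language will warn you if you do.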
22. An incorrect pointer to a non-protected node
23. An example
24. Example continued
25. End of the example
26. Dynamic array allocation
- We can allocate arrays dynamically with a bit of chicanery
    int arraySize = 50;
    int *anArray = new int[arraySize];
- The pointer variable anArray will point to the first item in the array.
- Since arraySize is a variable, we could change the size of the array
  - We create a new array of the new size
  - Copy the old array to the new
  - Free the old array and point anArray at the new one
- This can be inefficient if anArray is sufficiently large!
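The create, copy, and free steps can be sketched as a single function. resizeArray is a hypothetical helper, not code from the slides; it assumes the caller tracks the old size, since a raw pointer does not carry it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: replaces anArray (oldSize items) with a new array of
// newSize items, copying as many old values as fit. Every item is copied,
// which is why this gets slow for large arrays.
void resizeArray(int*& anArray, int oldSize, int newSize) {
    assert(oldSize >= 0 && newSize > 0);
    int* bigger = new int[newSize]();   // () zero-initializes the items
    int toCopy = (oldSize < newSize) ? oldSize : newSize;
    for (int i = 0; i < toCopy; ++i)
        bigger[i] = anArray[i];
    delete [] anArray;                  // release the old block
    anArray = bigger;                   // anArray now names the new block
}
```

The cost is proportional to the number of items copied on every resize, which is the inefficiency the last bullet warns about.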
27. Pointer arithmetic
- C++ treats the name of an array as a pointer to the first element in the array, e.g.
  - *anArray is equivalent to anArray[0]
  - *(anArray+1) is the same as anArray[1]
- This is called pointer arithmetic.
- Be careful! If a pointer points to an array of integers, the next value is sizeof(int) bytes away, but you add 1, not sizeof(int), to get to it
- The compiler scales the arithmetic by the element size for you; this is part of the language, so every conforming compiler does it
- This type of arithmetic can really haunt you if you cast the pointer to a different type (such as char*), because the step size changes with it!
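A short sketch makes the scaling visible. Both helpers are hypothetical; the second one casts to const char* purely to measure the distance in bytes between neighboring elements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums count ints using pointer arithmetic only.
int sumWithPointers(const int* first, int count) {
    assert(count >= 0);
    int total = 0;
    for (const int* p = first; p != first + count; ++p)  // ++p moves one int
        total += *p;
    return total;
}

// Hypothetical helper: byte distance between two neighboring elements.
std::size_t bytesBetweenNeighbors(const int* anArray) {
    const char* a = reinterpret_cast<const char*>(anArray);
    const char* b = reinterpret_cast<const char*>(anArray + 1);
    return static_cast<std::size_t>(b - a);  // equals sizeof(int)
}
```

Adding 1 to an int pointer moves it by sizeof(int) bytes, so adding sizeof(int) by hand would skip several elements instead of one.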
28. Deleting the array
- To effectively de-allocate the array, use the following notation
    delete [] anArray;
- Remember that this memory is returned to the system for future use
- The values it contained may still appear to be there, but they are no longer yours to read
- Set the pointer to NULL so that others (and you) will not be tempted to use it!
29. Pointer-based linked lists
- Each component is called a node
- Each node has two components
  - The data itself
  - A pointer to the next item in the list
- Since each node in the linked list contains just two plain data members, it is natural to conclude that a node should be a struct instead of a class.
30. Some pointers on pointers
- A pointer can point to almost anything
  - Integers
  - Chars
  - Arrays
  - Floats
  - Structs
- A pointer cannot, however, point to a file (there are special file pointers to do this)
- Therefore a pointer in our node structure can point to another node structure
31. More on linked lists
- We have all the elements pointing to one another, but what about the beginning and end of the list?
- We have an additional pointer that points to the beginning of the list: the head pointer or head of the list
- The head is usually set to NULL when the list is initialized
- The head is also set to NULL if the list becomes empty
- The next pointer of the last item in the linked list is also set to NULL to indicate that it is indeed the last thing on the list.
32. A picture of the linked list
33. Displaying the contents of a linked list
- If we want to show all the elements in a linked list, we can simply employ a loop to iterate through the entire list, displaying each one of the elements
- This solution requires that we keep around an additional pointer, cur, which keeps track of the current node to which we are pointing
34. Some code
    // Display the data in a linked list.
    // Loop invariant: cur points to the next
    // node to be displayed.
    for (Node *cur = head; cur != NULL; cur = cur->next)
        cout << cur->item << endl;
35. Some N.B.s on the code
- A common error is to compare cur->next with NULL instead of cur.
- When cur points to the last node of a non-empty linked list, cur->next == NULL.
- This means you won't display the last item in the list!
- Displaying a linked list is an example of a common operation called list traversal, which sequentially visits each node in the list until it reaches the end of the list.
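The traversal and the common error can be compared side by side. This sketch assumes integer items and writes to any std::ostream so the result is easy to inspect; displayList and countWithWrongTest are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>

struct Node {
    int item;
    Node* next;
};

// Writes each item on its own line and returns how many nodes were visited.
// Loop invariant: cur points to the next node to be displayed.
int displayList(const Node* head, std::ostream& out) {
    int visited = 0;
    for (const Node* cur = head; cur != NULL; cur = cur->next) {
        out << cur->item << '\n';
        ++visited;
    }
    return visited;
}

// The common error: testing cur->next instead of cur skips the last node
// (and would dereference NULL on an empty list, hence the assert).
int countWithWrongTest(const Node* head) {
    assert(head != NULL);
    int visited = 0;
    for (const Node* cur = head; cur->next != NULL; cur = cur->next)
        ++visited;
    return visited;
}
```

Passing std::cout as the second argument reproduces the slide's loop; passing a std::ostringstream captures the output in a string instead.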
36. Deleting a node from the linked list
- We simply take the next pointer of the previous entry in the list and set it equal to the next pointer of the item we wish to delete
- The node we have removed from the list still remains in existence! It must be returned to the system.
    prev->next = cur->next;
37. Does this work for every node in the list?
- Unfortunately, the answer is no.
- If we try to delete the first element in the list, there is no previous node, so prev->next is undefined.
- Fortunately, there is a simple solution
    head = cur->next;
38. Avoiding a memory leak
- After we remove a node from the list, the node still exists
- We must return the node to the system ourselves
    cur->next = NULL;
    delete cur;
    cur = NULL;
39. To delete, we perform 3 tasks
- Locate the node we wish to delete by list traversal
  - We can delete the ith node
  - We can delete a node that has a particular data item
- Disconnect the node from the list by changing pointers
- Return the disconnected node to the system
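The three tasks fit in one short function. This is a sketch that deletes by data item and assumes integer items and nodes allocated with new; removeValue is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int item;
    Node* next;
};

// Removes the first node holding target. Returns true if a node was removed.
bool removeValue(Node*& head, int target) {
    // Task 1: locate the node, keeping prev one step behind cur.
    Node* prev = NULL;
    Node* cur = head;
    while (cur != NULL && cur->item != target) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == NULL)
        return false;            // not found; the list is unchanged

    // Task 2: disconnect the node; the first node is the special case.
    if (prev == NULL)
        head = cur->next;
    else
        prev->next = cur->next;

    // Task 3: return the node to the system.
    cur->next = NULL;
    delete cur;
    return true;
}
```

head is passed by reference because deleting the first node has to change the caller's head pointer.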
40. Inserting a node in the linked list
- We do just the opposite of the deletion code
- We create our new node
    newPtr = new Node;
- We initialize our new node with our data
- We traverse the list until we find where we wish to insert the node
    newPtr->next = cur;
    prev->next = newPtr;
41. Inserting at the beginning of the linked list
- As you might have suspected, this is a special case
- We just point head to the newly minted node and let the neophyte's next item be the old head
    newPtr->next = head;
    head = newPtr;
42. To insert, we perform 3 tasks
- Traverse the list to determine the point of insertion
- Create a new node and store the new data in it
- Connect the new node to the linked list by changing pointers
43. More on inserting
- In order to insert an item in our list, we need to keep a trailing pointer, prev, that points to the previous item in the list
- In this manner, we can look at the value of the current node and see if the new node has to go before it
- We can then back up a node and do the insertion
44. Determining the point of insertion/deletion
    // Determine the point of insertion/deletion
    // for a sorted linked list
    for (prev = NULL, cur = head;
         (cur != NULL) && (newValue > cur->item);
         prev = cur, cur = cur->next)
        ;  // empty body: the loop header does all the work
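Combining that search loop with the two insertion cases gives a complete sorted insert. This is a minimal sketch assuming integer items in ascending order; sortedInsert is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int item;
    Node* next;
};

// Inserts newValue into an ascending sorted list, keeping it sorted.
void sortedInsert(Node*& head, int newValue) {
    Node* prev = NULL;
    Node* cur = head;
    // Stop at the first node whose item is not smaller than newValue.
    for (; cur != NULL && newValue > cur->item; prev = cur, cur = cur->next) {
    }
    Node* newPtr = new Node;
    newPtr->item = newValue;
    newPtr->next = cur;          // also correct when cur is NULL (insert at end)
    if (prev == NULL)
        head = newPtr;           // special case: new first node
    else
        prev->next = newPtr;     // general case: link in after prev
}
```

When the loop ends with prev still NULL, the new value belongs at the front, which is exactly the special case for inserting at the beginning of the list.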
45. Pointer-based ADT List
- Unlike the array-based implementation, there is no shifting of items necessary during insertion/deletion
- Insertion and deletion are therefore much more efficient for larger data sets
- It can also cut down on the memory footprint, since no memory is reserved for unused elements
- It also does not impose a strict minimum/maximum for the size of the list other than the amount of memory available to the system
46. ADT List redefined
- Suppose that we define a function find(i) that finds the ith node of the list
- To insert/delete at this node, we also need to know the location of the previous node
- Since we've taken 2315, we think smartly and realize that we could have find locate the (i-1)th node instead, which leaves cur pointing to the (i-1)th node and cur->next pointing to the ith item
47. More on find()
- It isn't a specified ADT operation
- Moreover, find() returns a pointer
- Recall that pointers are powerful and we don't want just anyone to have them
- We would therefore not want any client of the class to call it
48. General observation on ADTs
- In general, it is perfectly reasonable for an ADT to define variables and functions that the rest of the program should not access.
- Many ADTs require a special constructor called a copy constructor so that your code can correctly handle
    List yourList = myList;
49. Shallow or deep copy?
- If we only need a shallow copy of a data structure, we do not need to provide a copy constructor, as the compiler's rendition will suffice
- If we need a deep copy (as we do for the ADT List), we must provide our own copy constructor
50. What's the difference?
(Figure with two panels: Shallow copy and Deep copy)
51. Destructors
- Classes that only use statically allocated memory can rely upon the compiler-generated destructor to free up memory
- However, classes that use dynamically allocated memory need to have their own custom-written destructor that returns all used resources to the system
- A destructor for List would be ~List()
52. Header file of ADT List
    //
    // Header file ListP.h for the ADT list.
    // Pointer-based implementation.
    //
    #include "ListException.h"
    #include "ListIndexOutOfRangeException.h"
    typedef desired-type-of-list-item ListItemType;
    class List
    {
    public:
        // constructors and destructor:
        List();                   // default constructor
        List(const List& aList);  // copy constructor
        ~List();                  // destructor
        // list operations:
        bool isEmpty() const;
        int getLength() const;
        void insert(int index, ListItemType newItem)
            throw(ListIndexOutOfRangeException, ListException);
        void remove(int index)
            throw(ListIndexOutOfRangeException);
        // ... (the slide ends here)
53. The implementation file
- Since we can't put everything on a single page, we will do the implementation file piecemeal
- The default constructor is simple
    List::List() : size(0), head(NULL)
    {
        // Nothing needed here
    }  // end default constructor
54. Copy constructor
    List::List(const List& aList) : size(aList.size)
    {
        if (aList.head == NULL)
            head = NULL;  // original list is empty
        else
        {
            // Copy first element
            head = new ListNode;
            assert(head != NULL);  // check allocation
            head->item = aList.head->item;
            // Copy rest of list
            ListNode *newPtr = head;  // last node in new list
            for (ListNode *origPtr = aList.head->next;
                 origPtr != NULL;
                 origPtr = origPtr->next)
            {
                newPtr->next = new ListNode;
                assert(newPtr->next != NULL);
                newPtr = newPtr->next;
                newPtr->item = origPtr->item;
            }  // end for
            newPtr->next = NULL;
        }  // end if
    }  // end copy constructor
55. Destructor
- We can de-allocate the entire list by continually removing an element until the list is empty
    List::~List()
    {
        while (!isEmpty())
            remove(1);
    }  // end destructor
56. List operations
    bool List::isEmpty() const
    {
        return bool(size == 0);
    }

    int List::getLength() const
    {
        return size;
    }
57. More list operations
- Because the list doesn't allow direct access to elements, the retrieval, insertion and deletion operations must all traverse the list from the beginning until the specified point is reached
- Because of this, we define the operation find(i).
58. find(i)
    List::ListNode *List::find(int index) const
    // Locates a node in the list.
    // Precondition: index is the number of the node desired.
    // Postcondition: Returns a pointer to the desired node.
    // If the node is not located, NULL is returned.
    {
        if ((index < 1) || (index > getLength()))
            return NULL;
        else
        {
            ListNode *cur = head;
            for (int skip = 1; skip < index; ++skip)
                cur = cur->next;
            return cur;
        }
    }
59. retrieve(i)
    void List::retrieve(int index, ListItemType& dataItem) const
    {
        if ((index < 1) || (index > getLength()))
            throw ListIndexOutOfRangeException("Index out of range.");
        else
        {
            ListNode *cur = find(index);
            dataItem = cur->item;
        }
    }
60. insert(i, newItem)
    void List::insert(int index, ListItemType newItem)
    {
        int newLength = getLength() + 1;
        if ((index < 1) || (index > newLength))
            throw ListIndexOutOfRangeException(
                "ListOutOfRangeException: insert index out of range");
        else
        {
            // create new node and place newItem in it
            ListNode *newPtr = new ListNode;
            if (newPtr == NULL)
                throw ListException(
                    "ListException: insert cannot allocate memory");
            else
            {
                size = newLength;
                newPtr->item = newItem;
                // attach new node to list
                if (index == 1)
                {
                    // insert new node at beginning of list
                    newPtr->next = head;
                    head = newPtr;
                }
                else
                    // ... (the general case is cut off on the slide)
61. remove(i)
    void List::remove(int index)
    {
        ListNode *cur;
        if ((index < 1) || (index > getLength()))
            throw ListIndexOutOfRangeException(
                "ListOutOfRangeException: remove index out of range");
        else
        {
            --size;
            if (index == 1)
            {
                // delete the first node from the list
                cur = head;  // save pointer to node
                head = head->next;
            }
            else
            {
                ListNode *prev = find(index - 1);
                cur = prev->next;  // save pointer to node
                prev->next = cur->next;
            }  // end if
            // ... (the rest of the function is cut off on the slide)
62. Comparing the array-based and pointer-based implementations
- As usual, there are pros and cons to each implementation strategy
- You should carefully weigh these pros and cons before selecting a strategy
- Arrays have a fixed size
- Arrays have direct access because their elements are stored one after the other
  - This is called an implicit address
  - A pointer-based implementation has to explicitly specify the next address
- Because an array-based implementation doesn't need address information for the next element, it requires less memory per item
- Arrays don't require you to traverse the list to find your element
- Arrays require you to shift the data anytime you insert or delete elements
63. Saving and restoring a linked list from a file
- The algorithm that restores a linked list also demonstrates how you can build a linked list from scratch
- Writing the pointers to a file serves no purpose because those pointers become invalid once the program terminates
- Therefore, writing out the entire node to a file is not an elegant solution
- All you really need to save in the file is the data portion of each node (easy to do if each item has a fixed size, but a bit trickier if you're storing strings or other variable-length data)
64. More restoring from a file
- You can use the native insert() code to keep adding items to your list
- However, each insertion at the end requires a traversal to the end of the list
- We could save the file in reverse order of the list so we always insert at the head of the list
- We could make a tail pointer that points to the end of the list
  - tail could be local and destroyed after the list is created
  - Or it could exist as long as head exists; it's up to you!
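The tail-pointer approach can be sketched with streams standing in for the file. This assumes integer items stored as whitespace-separated text, which is only one possible file format; saveList and restoreList are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

struct Node {
    int item;
    Node* next;
};

// Writes only the data portion of each node, never the pointers.
void saveList(const Node* head, std::ostream& out) {
    for (const Node* cur = head; cur != NULL; cur = cur->next)
        out << cur->item << '\n';
}

// Reads integers from in and appends each one at the tail, so the list ends
// up in file order. Returns the head pointer (NULL if nothing was read).
Node* restoreList(std::istream& in) {
    Node* head = NULL;
    Node* tail = NULL;           // local: only needed while building
    int value;
    while (in >> value) {
        Node* newPtr = new Node;
        newPtr->item = value;
        newPtr->next = NULL;
        if (head == NULL)
            head = newPtr;       // first node becomes the head
        else
            tail->next = newPtr; // append without traversing the list
        tail = newPtr;
    }
    return head;
}
```

With a real file, the same functions would take a std::ofstream or std::ifstream, since both derive from the stream types used here.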
65. Passing a linked list to a function
- It is sufficient to pass the head pointer to the function
- This should not be done if the function is outside of the class scope (remember, pointers are powerful and this would violate the wall of the ADT!)
- Recursive functions might need the head pointer as an argument
  - These must not be in the public section of the class
  - This keeps our ADT safe from others
- Pass the head pointer by reference if the function may need to change it
- Passing the head pointer by value makes a shallow copy: only the pointer is copied, not the nodes
- Passing a whole List object by value causes a deep copy by the copy constructor
66. Recursively processing linked lists
- If the recursive functions are members of a class, they should not be public because they require the linked list's head pointer as an argument
- One such recursive function would be list traversal for writeBackward()
- Another example would be repeated insertion, which eliminates the need for both a trailing pointer and a special case for inserting at the beginning of a list
67. Repeated insertion
- Suppose we want to insert into a sorted linked list with a recursive function
- The linked list is sorted if
  - head == NULL, or
  - head->next == NULL, or
  - head->item < head->next->item and the pointer head->next points to a sorted linked list
- The first two of those cases become our base cases
68. Some code to do just so!
    void linkedListInsert(Node *&headPtr, ItemType newItem)
    {
        if ((headPtr == NULL) || (newItem < headPtr->item))
        {
            // base case: insert newItem at beginning
            Node *newPtr = new Node;
            if (newPtr == NULL)  // allocation failed
                throw ListException(
                    "ListException: insert cannot allocate memory");
            else
            {
                newPtr->item = newItem;
                newPtr->next = headPtr;
                headPtr = newPtr;
            }  // end if
        }
        else
            linkedListInsert(headPtr->next, newItem);
    }  // end linkedListInsert
69. Some N.B.s
- The function inserts at one of the base cases
  - Either when the list is empty
  - Or when the data item is smaller than all the other data items in the list
- In either one of these cases, you need to insert the item at the front of the list
70. General Insert (yes, sir!)
- The general case, in which the item is inserted somewhere in the innards of the list, is very similar
- When the base case is reached, the next pointer of the preceding node is the argument that corresponds to headPtr in our recursive definition, so assigning to headPtr relinks that node
71. The general insert case
72. Variations on a theme
- There are different flavors of a linked list
  - Circular(ly) linked lists
  - Dummy head nodes
  - Doubly linked lists
  - Circular(ly) doubly linked lists
- Which one you should use depends on what you are trying to do within your design
73. Circular(ly) linked lists
- If we make the next pointer of the last element point to the first element in the list, we have created a circularly linked list
- No node contains NULL in its next pointer
- We must be careful when traversing to the end of the list or we will create an infinite loop
- We can save the current pointer and keep traversing until we hit that pointer again
- We don't have to keep track of the head pointer, just the current pointer
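Traversing until we reach the saved pointer again looks like this. The sketch assumes integer items and a correctly formed circular list; sumCircular is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int item;
    Node* next;
};

// Sums every item in a circular list, starting from any node in it.
int sumCircular(const Node* start) {
    if (start == NULL)
        return 0;                // empty list: nothing to visit
    int total = 0;
    const Node* cur = start;     // start is the saved pointer
    do {
        total += cur->item;
        cur = cur->next;
        assert(cur != NULL);     // a circular list has no NULL next pointers
    } while (cur != start);      // back at the saved pointer: stop
    return total;
}
```

A do-while loop is used because the test cur != start is false before the first node has been visited; a plain while loop would visit nothing.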
74. Dummy Head Nodes
- Both the insertion and deletion algorithms require a special case for the head node
- The Dummy Head Node method eliminates the need for this special case
- The Dummy Head Node is present even when the list is empty
- In this implementation, the first item in the list is actually the second node in the linked list
- The insertion and deletion algorithms initialize prev to point to the dummy head node instead of to NULL.
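With a dummy head node, insertion and deletion each reduce to a single code path. This is a sketch assuming integer items kept in ascending order; insertAfterDummy and removeAfterDummy are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int item;
    Node* next;
};

// Sorted insert into a list that always begins with a dummy head node.
// dummy->item is never used; the real items start at dummy->next.
void insertAfterDummy(Node* dummy, int newValue) {
    assert(dummy != NULL);       // the dummy exists even for an empty list
    Node* prev = dummy;          // prev starts at the dummy, not at NULL
    while (prev->next != NULL && newValue > prev->next->item)
        prev = prev->next;
    Node* newPtr = new Node;
    newPtr->item = newValue;
    newPtr->next = prev->next;
    prev->next = newPtr;         // the same two assignments in every case
}

// Removes the first node holding target; returns true if one was removed.
bool removeAfterDummy(Node* dummy, int target) {
    assert(dummy != NULL);
    Node* prev = dummy;
    while (prev->next != NULL && prev->next->item != target)
        prev = prev->next;
    if (prev->next == NULL)
        return false;            // not found
    Node* cur = prev->next;
    prev->next = cur->next;
    delete cur;
    return true;
}
```

Neither function needs to modify a head pointer, so neither needs a reference parameter or a first-node branch; the price is one extra node per list.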
75. Doubly linked lists
- When deleting a node from a list, it would be handy to not have to remember a prev pointer or to have to re-traverse the list to find the previous node
- With a doubly linked list, we have two pointers packaged with the data item
  - A next pointer which points to the next node
  - A prev pointer which points to the previous node
- Because there are more pointers, the mechanics of doing an insert or delete are a bit more involved, especially at the head or tail of the list
- It is common to use a dummy head node with a doubly linked list to eliminate some of these special cases
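Deleting from a doubly linked list without a trailing pointer can be sketched as follows. No dummy head node is used here, so the head and tail checks that the slide mentions are visible; DNode, removeNode, and pushFront are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct DNode {
    int item;
    DNode* prev;
    DNode* next;
};

// Unlinks and frees cur using only cur's own pointers; no traversal needed.
void removeNode(DNode*& head, DNode* cur) {
    assert(cur != NULL);
    if (cur->prev != NULL)
        cur->prev->next = cur->next;
    else
        head = cur->next;            // cur was the first node
    if (cur->next != NULL)
        cur->next->prev = cur->prev; // skipped when cur was the last node
    delete cur;
}

// Hypothetical helper for building lists: adds a new node at the front.
void pushFront(DNode*& head, int value) {
    DNode* n = new DNode;
    n->item = value;
    n->prev = NULL;
    n->next = head;
    if (head != NULL)
        head->prev = n;
    head = n;
}
```

The two NULL checks in removeNode are the head and tail special cases; a dummy head node would remove the first of them.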
76. Circular doubly linked list
- You can take a doubly linked list and change the next pointer of the last item to make it a circular doubly linked list
- The pointer will now point to the head node/dummy head node of the linked list