Title: Data Abstractions
1. Data Abstractions
- EECE 310 Software Engineering
2. Learning Objectives
- Define data abstractions and list their elements
- Write the abstraction function (AF) and representation invariant (RI) of a data abstraction
- Prove that the RI is maintained and that the implementation matches the abstraction (i.e., AF)
- Enumerate common mistakes in data abstractions and learn how to avoid them
- Design equality methods for mutable and immutable data types
3. Data Abstraction
- Introduction of a new type in the language
- Type can be abstract or concrete
- Has one or more constructors and operations
- Type can be used like a language type
- Both the code and the data associated with the type are encapsulated in the type definition
- No need to expose the representation to clients
- Prevents clients from depending on the implementation
4. Isn't this OOP?
- No, though OOP is a way to implement ADTs
- OOP is a way of organizing programs into classes and objects. Data abstraction is a way of introducing new types (ADTs) with meanings.
- Encapsulation is a goal shared by both. But data abstraction is more than just creating classes.
- In Java, every data abstraction can be implemented by a class declaration. But not every class declaration is a data abstraction.
5. Elements of a Data Abstraction
- The abstraction specification should:
  - Name the data type
  - List its operations
  - Describe the data abstraction in English
  - Specify a procedural abstraction for each operation
- Public vs. private
  - The abstraction only lists the public operations
  - There may be other private procedures inside
6. Example: IntSet
- Consider an IntSet data type that we wish to introduce in the language. It needs to have:
  - Constructors to create the data type from scratch or from other data types (e.g., lists, IntSets)
  - Operations, including insert, remove, size and isIn
  - A specification of what the data type represents
  - An internal representation of the data type
7. IntSet Abstraction
    public class IntSet {
        // OVERVIEW: IntSets are mutable, unbounded sets of integers.
        // A typical IntSet is {x1, ..., xn}, where the xi are all integers.

        // Constructors
        public IntSet()
        // EFFECTS: Initializes this to be the empty set

        // Mutators
        public void insert(int x)
        // MODIFIES: this
        // EFFECTS: Adds x to the set this, i.e., this_post = this ∪ {x}

        public void remove(int x)
        // MODIFIES: this
        // EFFECTS: this_post = this - {x}

        // Observers
        public boolean isIn(int x)
        // EFFECTS: Returns true if x ∈ this, false otherwise

        public int size()
        // EFFECTS: Returns the cardinality of this
    }
8. Group Activity
- Consider the Polynomial data type below. Write the specifications for its methods.
    public class Poly {
        public Poly(int c, int n) throws NegException
        public Poly add(Poly p) throws NPException
        public Poly mul(Poly p) throws NPException
        public Poly minus()
        public int degree()
    }
10. Abstraction Versus Representation
- Abstraction: External view of a data type
- Representation: Internal variables to represent the data within a type (e.g., arrays, vectors, lists)
[Figure: Abstraction and Representation]
11. Example Representation
Vector<Integer> elems of size N to represent an IntSet
- Vector directly holds the set elements
  - If integer e is in the set, there exists 0 <= i < N such that elems[i] = e
- Vector is a bitmap for denoting set elements
  - If integer i is in the set, then elems[i] = True, else elems[i] = False
Can you tell how the representation maps to the abstraction?
12. Abstraction Function
- Mathematical function to map the representation to the abstraction
- Captures the designer's intent in choosing the rep
- How do the instance variables relate to the abstract object that they represent?
- Makes this mapping explicit in the code
- Advantages: Code maintenance, debugging
13. IntSet Abstraction Function
- Vector that directly holds the set elements:
  AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
- Vector used as a bitmap:
  AF(c) = { j | 0 <= j < 100 && c.elems[j] }
- The abstraction function is defined for concrete instances c of the class, and only includes the instance variables of the class. Further, it maps the elements of the representation to the abstraction.
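The two abstraction functions above can be sketched as ordinary Java methods that map a rep to a java.util.Set<Integer>. This is only an illustration: the class name AfSketch, the method names, and the use of a boolean[] for the bitmap are assumptions, not part of the course code.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.Vector;

// Hypothetical helper class: each method computes AF(c) for one representation.
class AfSketch {
    // AF for the vector rep: { elems[i].intValue | 0 <= i < elems.size }
    static Set<Integer> afVector(Vector<Integer> elems) {
        Set<Integer> abs = new TreeSet<>();
        for (int i = 0; i < elems.size(); i++) {
            abs.add(elems.get(i).intValue());
        }
        return abs;
    }

    // AF for the bitmap rep: { j | 0 <= j < elems.length && elems[j] }
    static Set<Integer> afBitmap(boolean[] elems) {
        Set<Integer> abs = new TreeSet<>();
        for (int j = 0; j < elems.length; j++) {
            if (elems[j]) {
                abs.add(j);
            }
        }
        return abs;
    }
}
```

A vector holding 3 and 1, and a bitmap with only bits 1 and 3 set, are different reps that both map to the abstract set {1, 3}.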
14. Abstraction Function: Valid Rep
- The abstraction function implicitly assumes that the representation is valid for the class
- What happens if the vector contains duplicate entries in the first scenario?
- What happens in the second scenario if the bitmap contains values other than 0 or 1?
- The AF holds only for valid representations. How do we know whether a representation is valid?
15. Representation Invariant
- Captures formally the assumptions on which the abstraction function is based
- Representation must satisfy this at all times (except when executing the ADT's methods)
- Defines whether a particular representation is valid: the invariant is satisfied only by valid reps.
16. IntSet Representation Invariant
Vector that directly holds the set elements:
- 1. c.elems != null
- 2. c.elems has no null elements
- 3. There are no duplicates in c.elems, i.e., for 0 <= i, j < N, c.elems[i].intValue = c.elems[j].intValue => i = j
Vector used as a bitmap:
- 1. c.elements != null
- 2. c.elements.size = maxValue
NOTE: The types of the instance variables are NOT a part of the rep invariant. So there is no need to repeat what is already in the type signature.
17. Rep Invariant: Important Points
- Rep invariant always holds before and after the execution of the ADT's operations
- Can be violated while executing the ADT's operations
- Can be violated by private methods of the ADT
- How much should the rep invariant constrain?
  - Just enough for different developers to implement different operations AND not talk to each other
  - Enough so that the AF makes sense for the representation
18. AF and RI: How to implement?
- Public method to check if the rep invariant holds
- Useful for testing/debugging
    public boolean repOK()
    // EFFECTS: Returns true if the rep invariant holds,
    //          returns false otherwise
- Public method to convert a valid rep to a String form
- Useful for debugging/printing
    public String toString()
    // EFFECTS: Returns a string containing the abstraction
    //          represented by the rep.
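For the unsorted vector representation, repOK and toString might look like the sketch below. The class name IntSetRep and the output format of toString are assumptions made for illustration; the field is left package-private only so that an invalid rep can be constructed from outside to exercise repOK.

```java
import java.util.Vector;

// Hypothetical class holding only the rep plus the two checking methods.
class IntSetRep {
    // Package-private ONLY for demonstration; a real rep should be private.
    Vector<Integer> elems = new Vector<>();

    // RI: elems != null, elems has no null elements, elems has no duplicates
    public boolean repOK() {
        if (elems == null) return false;
        for (int i = 0; i < elems.size(); i++) {
            Integer e = elems.get(i);
            if (e == null) return false;
            for (int j = i + 1; j < elems.size(); j++) {
                if (e.equals(elems.get(j))) return false; // duplicate found
            }
        }
        return true;
    }

    // Renders the abstract value, e.g. "IntSet: {1, 2}"
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("IntSet: {");
        for (int i = 0; i < elems.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(elems.get(i));
        }
        return sb.append("}").toString();
    }
}
```

The duplicate check is quadratic in the size of the vector, which is one reason repOK is usually called only while debugging.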
19. Uses of RI and AF
- Documentation of the programmer's thinking
- repOK method can be called before and after every public method invocation in the ADT
  - Typically during debugging only
- toString method can be used both during debugging and in production
- Both the RI and AF can be used to formally prove the correctness of the ADT
20. Group Activity
- Assume that the Polynomial data type is represented as an array trms and a variable deg. The coefficient of the term x^i is stored in the ith element of the trms array, and the variable deg represents the degree of the polynomial (i.e., its highest exponent).
- Write its abstraction function
- Write its rep invariant
22. Reasoning about ADTs - 1
- ADTs have state in the form of representation
- Need to consider what happens over a sequence of operations on the abstraction
- Correctness of one operation depends on correctness of previous operations
- We need to reason inductively over the operations of the ADT
  - Show that the constructor is correct
  - Show that each operation is correct
23. Reasoning about ADTs - 2
- First, need to show that the rep invariant is maintained by the constructor and the operations
- Then, show that the implementation of the abstraction matches the specification
  - Assume that the rep invariant is maintained
  - Use the abstraction function to map the representation to the abstraction
24. Why show that Rep Invariant is maintained?
- Consider the implementation of the IntSet using the unsorted vector representation. We wish to compute the size of the set (i.e., its cardinality).
    public int size() {
        return elems.size();
    }
- Is the above implementation correct?
25. Why show that Rep Invariant is maintained?
- Yes, but only if the rep invariant holds!
- c.elems != null && c.elems has no null elements && c.elems has no duplicates
- Otherwise, size can return a value > cardinality
    public int size() {
        return elems.size();
    }
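A small experiment makes the dependence on the RI concrete. In the sketch below (the class SizeDemo and its helper names are hypothetical), the same size() body is applied to a valid rep and to a rep containing a duplicate, and the result is compared against the true cardinality computed through a HashSet.

```java
import java.util.HashSet;
import java.util.Vector;

class SizeDemo {
    // Same body as size() on the slide, applied to an arbitrary rep
    static int size(Vector<Integer> elems) {
        return elems.size();
    }

    // Cardinality of the abstract set, computed without relying on the RI
    static int cardinality(Vector<Integer> elems) {
        return new HashSet<>(elems).size();
    }

    public static void main(String[] args) {
        Vector<Integer> valid = new Vector<>();
        valid.add(1);
        valid.add(2);

        Vector<Integer> invalid = new Vector<>(valid);
        invalid.add(2); // duplicate: violates the no-duplicates clause of the RI

        System.out.println(size(valid) + " vs " + cardinality(valid));     // 2 vs 2
        System.out.println(size(invalid) + " vs " + cardinality(invalid)); // 3 vs 2
    }
}
```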
26. Showing Rep Invariant is maintained: Data Type Induction
- Show that the constructor establishes the rep invariant
- For all other operations,
  - Assume at the time of the call the invariant holds for
    - this and
    - all argument objects of the type
  - Demonstrate that the invariant holds on return for
    - this
    - all argument objects of the type
    - returned objects of the type
[Figure: A Valid Rep -> Function Body -> Another Valid Rep]
27. IntSet: getIndex
Assume that IntSet has the following private function. Note that private methods do not need to preserve the RI.
    private int getIndex(int x) {
        // EFFECTS: If x is in this, returns the index where x appears
        //          in the Vector elems, else returns -1 (does NOT throw an exception)
        for (int i = 0; i < elems.size(); i++)
            if (x == elems.get(i).intValue())
                return i;
        return -1;
    }
28. IntSet: Constructor
Show that the RI is true at the end of the constructor.
    public IntSet() {
        // EFFECTS: Initializes this to be empty
        elems = new Vector<Integer>();
    }
RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates
Proof: When the constructor terminates,
- Clause 1 is satisfied because the elems vector is initialized by the constructor
- Clause 2 is satisfied because elems has no elements (and hence no null elements)
- Clause 3 is satisfied because elems has no elements (and hence no duplicates)
29. IntSet: Insert
Show that if the RI holds at the beginning, it holds at the end.
    public void insert(int x) {
        // MODIFIES: this
        // EFFECTS: Adds x to the set such that this_post = this ∪ {x}
        if (getIndex(x) < 0)
            elems.add(new Integer(x));
    }
RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates
Proof:
- If clause 1 holds at the beginning, it holds at the end of the procedure, because the reference c.elems is not reassigned by the procedure.
- If clause 2 holds at the beginning, it holds at the end of the procedure, because it only adds a non-null reference to c.elems.
- If clause 3 holds at the beginning, it holds at the end of the procedure, because the getIndex() check prevents duplicate elements from being added to the vector.
30. IntSet: Remove
Show that if the RI holds at the beginning, it holds at the end.
    public void remove(int x) {
        // MODIFIES: this
        // EFFECTS: this_post = this - {x}
        int i = getIndex(x);
        if (i < 0) return; // Not found
        // Overwrite slot i with the last element, then drop the last slot
        elems.set(i, elems.lastElement());
        elems.remove(elems.size() - 1);
    }
RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates
31. IntSet: Observers
Show that if the RI holds at the beginning, it holds at the end.
    public int size() {
        return elems.size();
    }
    public boolean isIn(int x) {
        return getIndex(x) >= 0;
    }
RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates
This completes the proof that the RI holds in the ADT. In other words, given any sequence of operations in the ADT, the RI always holds at the beginning and end of this sequence.
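The induction argument can also be exercised empirically: establish the RI with the constructor, then check repOK after every operation in a random sequence. The sketch below assumes the vector representation from the slides; the names CheckedIntSet and RepInvariantCheck, the seed, and the value range are arbitrary choices for illustration. A passing run is evidence, not a substitute for the proof.

```java
import java.util.Random;
import java.util.Vector;

// The IntSet from the slides, renamed here, with a repOK method added.
class CheckedIntSet {
    private Vector<Integer> elems;

    public CheckedIntSet() { elems = new Vector<>(); }

    private int getIndex(int x) {
        for (int i = 0; i < elems.size(); i++)
            if (x == elems.get(i).intValue()) return i;
        return -1;
    }

    public void insert(int x) {
        if (getIndex(x) < 0) elems.add(x);
    }

    public void remove(int x) {
        int i = getIndex(x);
        if (i < 0) return;                 // not found
        elems.set(i, elems.lastElement()); // overwrite with last element
        elems.remove(elems.size() - 1);    // drop the last slot
    }

    public boolean isIn(int x) { return getIndex(x) >= 0; }

    public int size() { return elems.size(); }

    // RI: elems != null, no null elements, no duplicates
    public boolean repOK() {
        if (elems == null) return false;
        for (int i = 0; i < elems.size(); i++) {
            if (elems.get(i) == null) return false;
            for (int j = i + 1; j < elems.size(); j++)
                if (elems.get(i).equals(elems.get(j))) return false;
        }
        return true;
    }
}

class RepInvariantCheck {
    // Returns true if the RI held after construction and after each of n random operations.
    static boolean run(long seed, int n) {
        Random rnd = new Random(seed);
        CheckedIntSet s = new CheckedIntSet();
        if (!s.repOK()) return false;      // base case: constructor
        for (int k = 0; k < n; k++) {      // inductive step: each operation
            int x = rnd.nextInt(10);
            if (rnd.nextBoolean()) s.insert(x); else s.remove(x);
            if (!s.repOK()) return false;
        }
        return true;
    }
}
```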
32. Group Activity
- Consider the implementation of the Polynomial data type described earlier (also on the code handout sheet)
- Show using data type induction that the rep invariant is preserved
33. Are we done?
- Thus, we have shown that the RI is established by the constructor and holds for each operation (i.e., if the RI is true at the beginning, it is true at the end). Can we stop here?
No. To see why not, consider an implementation of the operations that does nothing. Such an implementation will satisfy the rep invariant, but is clearly wrong!
To complete the proof, we need to show that the abstraction provided by the ADT is correct. For this, we use the (now proven) fact that the RI holds and use the AF to show that the rep satisfies the ADT's abstraction after each operation.
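34. Abstraction Function: IntSet
- Show that the implementation matches the ADT's specification (i.e., its abstraction)
[Figure: Given that the abstraction function maps the Pre-Rep to the Pre-Abstraction, that the function implementation takes the Pre-Rep to the Post-Rep, and that the function spec takes the Pre-Abstraction to the Post-Abstraction, prove that the abstraction function maps the Post-Rep to the Post-Abstraction.]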
35. Abstraction Function: Constructor
- AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    public IntSet() {
        // EFFECTS: Initializes this to be empty
        elems = new Vector<Integer>();
    }
Empty vector --AF--> Empty set
Proof: The constructor creates an empty vector, which the AF maps to the empty set, so it is correct.
36. Abstraction Function: Size
- AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    public int size() {
        // EFFECTS: Returns the cardinality of this
        return elems.size();
    }
Number of elements in vector --AF--> Cardinality of the set (Why?)
Proof: Because the rep invariant guarantees that there are no duplicates in the vector, the number of elements in the vector denotes the cardinality of the set.
37. Abstraction Function: Insert
- AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    public void insert(int x) {
        // MODIFIES: this
        // EFFECTS: Adds x to the set such that this_post = this ∪ {x}
        if (getIndex(x) < 0)
            elems.add(new Integer(x));
    }
Before: Vector --AF--> this
After (implementation): Vector with element added if and only if it did not already exist --AF--> this_post = this ∪ {x}
38. Abstraction Function: Remove
- AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    public void remove(int x) {
        // MODIFIES: this
        // EFFECTS: this_post = this - {x}
        int i = getIndex(x);
        if (i < 0) return; // Not found
        // Move last element to the index i
        elems.set(i, elems.lastElement());
        elems.remove(elems.size() - 1);
    }
Before: Vector --AF--> this
After: Vector with first instance of element removed if it exists --AF--> this_post = this - {x}
39. Abstraction Function: isIn
- AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    public boolean isIn(int x) {
        // EFFECTS: Returns true if x belongs to this, false otherwise
        return getIndex(x) >= 0;
    }
Before: Vector --AF--> this
Result: True if and only if x is present in the vector --AF--> True if x belongs to this, False otherwise
40. Proof Summary
- This completes the proof. Thus, we've shown that the ADT implements its spec correctly. This method is called data type induction, because it proceeds using induction.
- Step 0: Write the implementation of the ADT
- Step 1: Show that the RI is maintained by the ADT
- Step 2: Assuming that the RI is maintained, show using the AF that the translation from the rep to the abstraction matches the method's spec.
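Step 2 can be mirrored in a test harness: apply each operation both to the implementation and to an abstract set evolved directly by the spec, then compare AF(rep) with the abstract set. The sketch below is one way to do that under stated assumptions: ModelledIntSet, abstractValue, and AfCheck are hypothetical names, and a java.util.HashSet stands in for the mathematical set.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.Vector;

// The vector-based IntSet from the slides, plus a method that computes AF(this).
class ModelledIntSet {
    private final Vector<Integer> elems = new Vector<>();

    private int getIndex(int x) {
        for (int i = 0; i < elems.size(); i++)
            if (x == elems.get(i).intValue()) return i;
        return -1;
    }

    public void insert(int x) { if (getIndex(x) < 0) elems.add(x); }

    public void remove(int x) {
        int i = getIndex(x);
        if (i < 0) return;
        elems.set(i, elems.lastElement());
        elems.remove(elems.size() - 1);
    }

    public boolean isIn(int x) { return getIndex(x) >= 0; }

    public int size() { return elems.size(); }

    // AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    Set<Integer> abstractValue() { return new HashSet<>(elems); }
}

class AfCheck {
    // Returns true if, after every operation, AF(rep) equals the set the spec prescribes.
    static boolean run(long seed, int n) {
        Random rnd = new Random(seed);
        ModelledIntSet s = new ModelledIntSet();
        Set<Integer> spec = new HashSet<>();   // constructor spec: empty set
        if (!s.abstractValue().equals(spec)) return false;
        for (int k = 0; k < n; k++) {
            int x = rnd.nextInt(10);
            if (rnd.nextBoolean()) {
                s.insert(x);
                spec.add(x);                   // this_post = this U {x}
            } else {
                s.remove(x);
                spec.remove(x);                // this_post = this - {x}
            }
            if (!s.abstractValue().equals(spec)) return false;
            if (s.size() != spec.size()) return false;        // cardinality
            if (s.isIn(x) != spec.contains(x)) return false;  // membership
        }
        return true;
    }
}
```

An implementation whose operations do nothing would pass a rep invariant check but fail this comparison on the first successful insert.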
41. Group Activity
- Consider the implementation of the Polynomial data type described earlier (also on the code handout sheet)
- Show that the ADT's implementation matches its specification, assuming that the RI holds.
43. Exposing the Rep
- Note that the proof we just wrote assumes that the only way you can modify the representation is through its operations
- Otherwise the rep invariant can be violated
- Is this always true?
- What if you expose the representation outside the class, so that any outside entity can change it?
44. Mistakes that lead to exposing the rep - 1
- Making rep components public
    public class IntSet {
        public Vector<Integer> elements;
        ...
    }
- Your rep must always be private. Otherwise, all bets are off.
- Hopefully, your code will not have this bug.
45. Mistakes that lead to exposing the rep - 2
    public class IntSet {
        // OVERVIEW: IntSets are mutable, unbounded sets of integers.
        // A typical IntSet is {x1, ..., xn}
        private Vector<Integer> elems; // no duplicates in vector

        public Vector<Integer> allElements() {
            // EFFECTS: Returns a vector containing the elements of this,
            //          each exactly once, in arbitrary order
            return elems;
        }
    }

    IntSet intSet = new IntSet();
    intSet.allElements().add(new Integer(5));
    intSet.allElements().add(new Integer(5)); // RI violated: duplicates!
46. Mistakes that lead to exposing the rep - 3
    public class IntSet {
        // OVERVIEW: IntSets are mutable, unbounded sets of integers.
        // A typical IntSet is {x1, ..., xn}
        private Vector<Integer> elems;

        // Constructors
        public IntSet(Vector<Integer> els) throws NullPointerException {
            // EFFECTS: If els is null, throws NullPointerException, else
            //          initializes this to contain as elements all the ints in els.
            if (els == null) throw new NullPointerException();
            elems = els;
        }
    }

    Vector<Integer> someVector = new Vector<Integer>();
    IntSet intSet = new IntSet(someVector);
    someVector.add(new Integer(5));
    someVector.add(new Integer(5)); // RI violated: duplicates!
47. Summary of mistakes that expose the Rep
- NOT making rep components private
- Returning a reference to the rep's mutable components
- Initializing rep components with a reference to an outside mutable object
- NOT performing deep copy of rep elements
  - Use clone method instead
  - Perform manual copies
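One way to avoid the second and third mistakes is to copy at both boundaries: copy the caller's vector in the constructor and return a fresh vector from allElements. The sketch below uses a hypothetical class SafeIntSet. Dropping duplicates and null entries while copying is an assumption made here so that the RI holds even for arbitrary input; the original specification does not say how such input should be treated. Because Integer objects are immutable, copying the vector itself is enough and the elements need not be cloned.

```java
import java.util.Vector;

class SafeIntSet {
    private final Vector<Integer> elems; // private and never shared with callers

    public SafeIntSet(Vector<Integer> els) {
        if (els == null) throw new NullPointerException();
        elems = new Vector<>();
        // Manual copy: later changes to els cannot reach the rep.
        for (Integer e : els) {
            // Assumption: nulls and duplicates in els are skipped to establish the RI.
            if (e != null && !elems.contains(e)) elems.add(e);
        }
    }

    public Vector<Integer> allElements() {
        // Return a copy, so callers cannot modify the rep through the result.
        return new Vector<>(elems);
    }

    public int size() { return elems.size(); }
}
```

With this version, the two attack sequences from the preceding slides modify only the caller's own vectors, and the set's size is unaffected.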
48. Group Activity
- For the polynomial example, how many mistakes of exposing the rep can you find? How will you fix them? (Refer to the code handout sheet.)
50. Mutable objects
- Objects whose abstract state can be modified
- Applies to the abstraction, not the representation
- Mutable objects: Can be modified once they are created, e.g., IntSet, IntList, etc.
- Immutable objects: Cannot be modified
  - Examples: Polynomials, Strings
51. Equality: Equals Method
- All classes inherit from Object, which has a method boolean equals(Object o)
- Returns true if object o is the same object as the current one
- Returns false otherwise
- Note that the default equals tests whether two references denote the same object, not whether two objects have the same state
- If a and b are different objects, a.equals(b) will return false even if they are functionally identical
52. Equality: IntSet Example
    IntSet a = new IntSet();
    a.insert(1); a.insert(2); a.insert(3);
    IntSet b = new IntSet();
    b.insert(1); b.insert(2); b.insert(3);
    if (a.equals(b)) {
        System.out.println("Equal");
    }
- What is printed by the above code?
53. Equality: IntSet Example
- It prints nothing. Why?
- Because the IntSets are different objects and the Object.equals method only compares their references (object identity)
- Therefore, a.equals(b) returns false
- But this is in fact the correct behavior!
- To see this, suppose equals compared state and returned true, and you then added an element to a but not b after the equals comparison
- a.equals(b) would no longer be true, even though you have not changed the references a or b
54. Rule of Object Equality
- Two objects should be equal if it is impossible to distinguish between them using any sequence of calls to the objects' methods
- Corollary: Once two objects are equal, they should always be equal. Otherwise it is possible to distinguish between them using some combination of the objects' methods.
55. Mutability and the Equals Method
- For mutable objects, you can distinguish between two objects by mutating them after the comparison. Therefore, they are NOT equal. The default equals method does the right thing, i.e., returns false.
- If the objects are immutable AND have the same state, then the equals method should return true. So we need to override equals for immutable objects to do the right thing.
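The behavior of the default equals on a mutable type can be observed directly. The sketch below uses a hypothetical, pared-down class PlainIntSet that deliberately does not override equals, so the inherited Object.equals (reference comparison) is used.

```java
import java.util.Vector;

// A minimal mutable set that inherits equals from Object unchanged.
class PlainIntSet {
    private final Vector<Integer> elems = new Vector<>();

    public void insert(int x) {
        if (!elems.contains(x)) elems.add(x);
    }

    public boolean isIn(int x) {
        return elems.contains(x);
    }
}

class MutableEqualityDemo {
    public static void main(String[] args) {
        PlainIntSet a = new PlainIntSet();
        PlainIntSet b = new PlainIntSet();
        a.insert(1);
        b.insert(1);

        System.out.println(a.equals(b)); // false: distinct objects
        System.out.println(a.equals(a)); // true: same object

        a.insert(2); // a later mutation makes a and b observably different
        System.out.println(a.isIn(2) == b.isIn(2)); // false
    }
}
```

The last line is the distinguishing sequence of calls that the rule of object equality refers to: had equals returned true earlier, that answer would now be stale.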
56. Immutable Abstractions
- ADT does not change once created
- No mutator methods
- Producer methods to create new objects
- Appropriate for modeling objects that do not change during their existence
  - Mathematical entities such as rational numbers
- Certain objects may be implemented more efficiently, e.g., Strings
57. Why use immutable ADTs?
- Safety
  - Don't need to worry about accidental changes
  - Can be assured that rep doesn't change
- Efficiency
  - May hurt efficiency if you need to copy the object
  - In some cases, it may be more efficient by sharing representations across objects, e.g., Strings
- Ease of Implementation
  - May be easier for concurrency control
58. Equality: Immutable objects
- Immutable objects should define their own equals method
- Return true if the abstract state matches, even if the internal state (i.e., rep) is different
- Therefore, methods of an immutable object can modify its rep, but not the abstraction
- Such methods are said to have benevolent side effects
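As an illustration of equality on abstract state, consider an immutable rational number whose rep is not required to be in lowest terms, so 1/2 and 2/4 have different reps but the same abstract value. The class below is a sketch under assumptions: the name Rational, the choice of long fields, and the decision not to reduce in the constructor are made here for illustration, and the cross-multiplication in equals assumes values small enough not to overflow.

```java
import java.util.Objects;

final class Rational {
    private final long num;
    private final long den; // RI: den != 0; the rep need NOT be in lowest terms

    public Rational(long num, long den) {
        if (den == 0) throw new IllegalArgumentException("zero denominator");
        this.num = num;
        this.den = den;
    }

    private static long gcd(long a, long b) {
        a = Math.abs(a);
        b = Math.abs(b);
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Equal iff the abstract values match: n1/d1 = n2/d2  <=>  n1*d2 = n2*d1
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Rational)) return false;
        Rational r = (Rational) o;
        return num * r.den == r.num * den; // assumes no overflow
    }

    // Hash the canonical (reduced, positive-denominator) form,
    // so that equal abstract values always produce equal hash codes.
    @Override
    public int hashCode() {
        long g = gcd(num, den); // g > 0 because den != 0
        long sign = den < 0 ? -1 : 1;
        return Objects.hash(sign * num / g, sign * den / g);
    }
}
```

Overriding hashCode together with equals keeps the class usable in hash-based collections. A variant could cache the reduced form inside the object the first time it is computed; that would change the rep without changing the abstraction, which is the kind of benevolent side effect mentioned above.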
59. Group Activity
- Design an equals method for two polynomials. What will you do if the polynomials are not in their canonical forms?
61. To do before next class
- Submit assignment 2 in the lab
- Start working on assignment 3
- Prepare for the midterm exam
- Portions include everything covered so far
- In class on Feb 28th