Title: Data Collections Zelle - Chapter 11
2. Data Collections: Zelle - Chapter 11
- Charles Severance - www.dr-chuck.com
Textbook: Python Programming: An Introduction to Computer Science, John Zelle
3. What is not a Collection
- Most of our variables have one value in them. When we put a new value in the variable, the old value is overwritten.
$ python
Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
>>> x = 2
>>> x = 4
>>> print x
4
4. What is a Collection?
- A collection is nice because we can put more than one value in it and carry them all around in one convenient package.
- We have a bunch of values in a single variable.
- We do this by having more than one place in the variable.
- We have ways of finding the different places in the variable.
(Luggage) CC BY-SA xajondee (Flickr) http://creativecommons.org/licenses/by-sa/2.0/deed.en
5. A Story of Two Collections
- List
- A linear collection of values that stay in order
- Dictionary
- A bag of values, each with its own label
(Pringle's Can) CCBY-NC Roadsidepictures (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
(Pringles) CCBY-NC Cartel82 (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
(Chips) CCBY-NC-SA Bunchofpants (flickr) http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
(Bag) CCBY-NC-SA Monkeyc.net (flickr) http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
6. The Python List Object
>>> grades = list()           # The grades variable will hold a list of values.
>>> grades.append(100)        # Append some values to the list.
>>> grades.append(97)
>>> grades.append(100)
>>> print sum(grades)         # Add up the values in the list using the sum() function.
297
>>> print grades              # What is in the list?
[100, 97, 100]
>>> print sum(grades)/3.0     # Figure the average.
99.0
>>>
>>> print grades              # What is in grades?
[100, 97, 100]
>>> newgr = list(grades)      # Make a copy of the entire grades list.
>>> print newgr
[100, 97, 100]
>>> newgr[1] = 85             # Change the second new grade (positions start at 0).
>>> print newgr
[100, 85, 100]
>>> print grades              # The original grades are unchanged.
[100, 97, 100]
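A minimal sketch of the copy behavior described here, written with print() calls so that it also runs under Python 3 (an assumption; the slides use Python 2.5). The name alias is hypothetical and is included only to contrast a second name for the same list with a real copy made by list().

```python
grades = [100, 97, 100]

newgr = list(grades)   # a new list holding the same values
alias = grades         # hypothetical name: a second label for the SAME list

newgr[1] = 85          # changes only the copy
print(grades)          # [100, 97, 100]
print(newgr)           # [100, 85, 100]

alias[0] = 50          # changes the original list, seen through both names
print(grades)          # [50, 97, 100]
```

Plain assignment does not copy a list; list(grades) does, which is why the original grades stay unchanged when newgr is modified.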
9. Looking in Lists...
>>> print grades
[100, 97, 100]
>>> print grades[0]
100
>>> print grades[1]
97
>>> print grades[2]
100
- We use square brackets to look up which element in the list we are interested in.
- grades[2] translates to "grades sub 2"
- Kind of like x₂ in math
10. Why do lists start at zero?
- Initially it does not make sense that the first element of a list is stored at the zeroth position: grades[0]
- Math convention: the number line
- Computer performance: the computer doesn't have to subtract 1 all the time
Elevators in Europe!
(elevator) CCBY marstheinfomage (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
11. Fun With Lists
- Python has many features that allow us to do things to an entire list in a single statement
- Lists are powerful objects
>>> lst = [21, 14, 4, 3, 12, 18]
>>> print lst
[21, 14, 4, 3, 12, 18]
>>> print 18 in lst
True
>>> print 24 in lst
False
>>> lst.append(50)
>>> print lst
[21, 14, 4, 3, 12, 18, 50]
>>> lst.remove(4)
>>> print lst
[21, 14, 3, 12, 18, 50]

>>> print lst
[21, 14, 3, 12, 18, 50]
>>> print lst.index(18)
4
>>> lst.reverse()
>>> print lst
[50, 18, 12, 3, 14, 21]
>>> lst.sort()
>>> print lst
[3, 12, 14, 18, 21, 50]
>>> del lst[2]
>>> print lst
[3, 12, 18, 21, 50]
(Zelle, p. 343)
13. More functions for lists
>>> a = [1, 2, 3]
>>> print max(a)
3
>>> print min(a)
1
>>> print len(a)
3
>>> print sum(a)
6
>>>
http://docs.python.org/lib/built-in-funcs.html
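These built-in functions combine naturally. A small sketch, assuming the same grades values used earlier in the slides, that computes the average with len() instead of a hard-coded 3.0; it is written with print() so it also runs under Python 3.

```python
grades = [100, 97, 100]

# len() counts the elements, so the divisor follows the list automatically.
# float() keeps the division from truncating under Python 2.
average = sum(grades) / float(len(grades))

print(max(grades))   # 100
print(min(grades))   # 97
print(average)       # 99.0
```

Using len() means the calculation stays correct when more grades are appended.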
14. Looping through Lists
>>> print lst
[3, 12, 14, 18, 21, 33]
>>> for xval in lst:
...     print xval
...
3
12
14
18
21
33
>>>
(Zelle, p. 343)
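A loop over a list can also accumulate a result as it goes. The following is only a sketch: the names total and largest are hypothetical, and the built-in sum() and max() functions give the same answers directly.

```python
lst = [3, 12, 14, 18, 21, 33]

total = 0
largest = None              # no value seen yet
for xval in lst:
    total = total + xval
    if largest is None or xval > largest:
        largest = xval

print(total)     # 101
print(largest)   # 33
```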
15. List Operations
(Zelle, p. 343)
16. Quick Peek: Object Oriented
<nerd-alert>
17. What is a List Anyways?
- A list is a special kind of variable
- Regular variables - integer
- Contain some data
- Smart variables - string, list
- Contain some data and capabilities
>>> i = 2
>>> i = i + 1
>>> x = [1, 2, 3]
>>> print x
[1, 2, 3]
>>> x.reverse()
>>> print x
[3, 2, 1]
When we combine data + capabilities - we call this an object
18. One way to find out Capabilities
Buy a book and read it and carry it around with
you.
19. Let's Ask Python...
- The dir() command lists capabilities
- Ignore the ones with underscores - these are used by Python itself
- The rest are real operations that the object can perform
- It is like type() - it tells us something about a variable
>>> x = list()
>>> type(x)
<type 'list'>
>>> dir(x)
['__add__', '__class__', '__contains__',
'__delattr__', '__delitem__', '__delslice__',
'__doc__', '__eq__', ... '__setitem__', '__setslice__',
'__str__', 'append', 'count', 'extend', 'index',
'insert', 'pop', 'remove', 'reverse', 'sort']
>>>
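The advice to ignore the names with underscores can be applied in code. A minimal sketch, assuming a hypothetical variable name capabilities; the exact names printed depend on the Python version, so no output is shown.

```python
x = list()

# Keep only the names that do not start with an underscore.
capabilities = [name for name in dir(x) if not name.startswith('_')]
print(capabilities)
```

The same filter works for any object, for example a string or a dictionary.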
20. Try dir() with a String
>>> y = "Hello there"
>>> dir(y)
['__add__', '__class__', '__contains__', '__delattr__',
'__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__getnewargs__', '__getslice__',
'__gt__', '__hash__', '__init__', '__le__',
'__len__', '__lt__', '__repr__', '__rmod__',
'__rmul__', '__setattr__', '__str__',
'capitalize', 'center', 'count', 'decode',
'encode', 'endswith', 'expandtabs', 'find',
'index', 'isalnum', 'isalpha', 'isdigit',
'islower', 'isspace', 'istitle', 'isupper',
'join', 'ljust', 'lower', 'lstrip', 'partition',
'replace', 'rfind', 'rindex', 'rjust',
'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase',
'title', 'translate', 'upper', 'zfill']
21. What does x = list() mean?
>>> a = list()
>>> print a
[]
>>> print type(a)
<type 'list'>
>>> b = dict()
>>> print b
{}
>>> print type(b)
<type 'dict'>
>>> a.append("fred")
>>> print a
['fred']
>>> c = str()
>>> d = int()
>>> print d
0
- These are called constructors - they make an empty list, str, or dictionary
- We can make a fully formed empty object and then add data to it using capabilities (aka methods)
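A short sketch of the same idea that also runs under Python 3: construct empty objects, then add data to them. The value 42 is an arbitrary example, not taken from the slide.

```python
a = list()            # empty list
b = dict()            # empty dictionary
c = str()             # empty string
d = int()             # zero

a.append("fred")      # add data with a list method
b["fred"] = 42        # add data by key (42 is an arbitrary example value)

print(a)              # ['fred']
print(b)              # {'fred': 42}
print(len(c))         # 0
print(d)              # 0
```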
22. Object Oriented Summary
- Variables (Objects) contain data and capabilities
- The dir() function asks Python to list capabilities
- We call object capabilities methods
- We can construct fresh, empty objects using constructors like list()
- Everything in Python (even constants) is an object
23. Python Dictionaries
(Figure: a bag of labeled items: tissue, calculator, perfume, money, candy)
http://en.wikipedia.org/wiki/Associative_array
24. Dictionaries
- Dictionaries are Python's most powerful data collection
- Dictionaries allow us to do fast database-like operations in Python
- Dictionaries have different names in different languages
- Associative Arrays - Perl / PHP
- Properties or Map or HashMap - Java
- Property Bag - C# / .Net
http://en.wikipedia.org/wiki/Associative_array
25. Dictionaries
- Lists label their entries based on the position in the list
- Dictionaries are like bags - no order
- So we mark the things we put in the dictionary with a tag
>>> purse = dict()
>>> purse['money'] = 12
>>> purse['candy'] = 3
>>> purse['tissues'] = 75
>>> print purse
{'money': 12, 'tissues': 75, 'candy': 3}
>>> print purse['candy']
3
>>> purse['candy'] = purse['candy'] + 2
>>> print purse
{'money': 12, 'tissues': 75, 'candy': 5}
(Purse) CCBY Monkeyc.net Stimpson/monstershaq2000's photostream (flickr) http://creativecommons.org/licenses/by/2.0/deed.en
27. Lookup in Lists and Dictionaries
- Dictionaries are like lists except that they use keys instead of numbers to look up values
>>> lst = list()
>>> lst.append(21)
>>> lst.append(183)
>>> print lst
[21, 183]
>>> lst[0] = 23
>>> print lst
[23, 183]

>>> ddd = dict()
>>> ddd["age"] = 21
>>> ddd["course"] = 182
>>> print ddd
{'course': 182, 'age': 21}
>>> ddd["age"] = 23
>>> print ddd
{'course': 182, 'age': 23}
(Figure: what each collection from the previous slide holds once 21 has been replaced by 23.)
List (lst)
Key       Value
0         23
1         183
Dictionary (ddd)
Key       Value
course    182
age       23
29. Dictionary Operations
(Zelle, p. 369)
30. Dictionary Literals (Constants)
- Dictionary literals use curly braces and have a list of key : value pairs
- You can make an empty dictionary using empty curly braces
>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan' : 100}
>>> print jjj
{'jan': 100, 'chuck': 1, 'fred': 42}
>>> ooo = { }
>>> print ooo
{}
>>>
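A literal and the dict() constructor produce the same kind of object. A brief sketch, assuming a hypothetical second variable kkk built one key at a time, and written to run under Python 3 as well:

```python
jjj = {'chuck': 1, 'fred': 42, 'jan': 100}   # literal

kkk = dict()                                  # same contents, built step by step
kkk['chuck'] = 1
kkk['fred'] = 42
kkk['jan'] = 100

print(jjj == kkk)     # True
print({} == dict())   # True
```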
31. Dictionary Patterns
- One common use of dictionaries is counting how often we see something
>>> ccc = dict()
>>> ccc["csev"] = 1
>>> ccc["cwen"] = 1
>>> print ccc
{'csev': 1, 'cwen': 1}
>>> ccc["cwen"] = ccc["cwen"] + 1
>>> print ccc
{'csev': 1, 'cwen': 2}
32. Dictionary Patterns
- It is an error to reference a key which is not in the dictionary
- We can use the in operator to see if a key is in the dictionary
>>> ccc = dict()
>>> print ccc["csev"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'csev'
>>> print "csev" in ccc
False
ccc = dict()
if "csev" in ccc :
    print "Yes"
else :
    print "No"

ccc["csev"] = 20

if "csev" in ccc :
    print "Yes"
else :
    print "No"
34. Dictionary Counting
>>> ccc = dict()
>>> print ccc.get("csev", 0)
0
>>> ccc["csev"] = ccc.get("csev",0) + 1
>>> print ccc
{'csev': 1}
>>> print ccc.get("csev", 0)
1
>>> ccc["csev"] = ccc.get("csev",0) + 1
>>> print ccc
{'csev': 2}
- Since it is an error to reference a key which is not in the dictionary, we can use the dictionary get() operation and supply a default value if the key does not exist, to avoid the error and get our count started.
dict.get(key, defaultvalue)
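The get() pattern extends naturally to counting every entry in a list. A minimal sketch, assuming a hypothetical list of names (the list and the name "zqian" are made up for illustration) and written to run under Python 3 as well:

```python
names = ["csev", "cwen", "csev", "zqian", "cwen", "csev"]   # hypothetical input

counts = dict()
for name in names:
    # First sighting: get() returns 0, so the count starts at 1.
    counts[name] = counts.get(name, 0) + 1

print(counts["csev"])    # 3
print(counts["cwen"])    # 2
print(counts["zqian"])   # 1
```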
35. What get() effectively does...
- The get() method basically does an implicit if: it checks whether the key exists in the dictionary and, if the key is not there, returns the default value
- The main purpose of get() is to save typing this four line pattern over and over
d = dict()
x = d.get("fred", 0)

d = dict()
if "fred" in d :
    x = d["fred"]
else :
    x = 0
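To check that the two forms agree, the four line pattern can be wrapped in a function and compared with get(). This is only a sketch; get_or_default is a hypothetical helper name, not part of Python.

```python
def get_or_default(d, key, default):
    """Hypothetical helper: the four line pattern wrapped in a function."""
    if key in d:
        return d[key]
    else:
        return default

d = dict()
print(get_or_default(d, "fred", 0) == d.get("fred", 0))   # True (key missing)

d["fred"] = 42
print(get_or_default(d, "fred", 0) == d.get("fred", 0))   # True (key present)
```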
36. Retrieving lists of Keys and Values
- You can get a list of keys, values or items (both) from a dictionary
>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan' : 100}
>>> print jjj.keys()
['jan', 'chuck', 'fred']
>>> print jjj.values()
[100, 1, 42]
>>> print jjj.items()
[('jan', 100), ('chuck', 1), ('fred', 42)]
>>>
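In Python 2.5, as used in these slides, keys(), values() and items() return lists. Under Python 3 the same methods return view objects instead, so a sketch that behaves the same in both wraps each call in sorted(), which always returns a plain list. Sorting is used here only to make the printed order predictable.

```python
jjj = {'chuck': 1, 'fred': 42, 'jan': 100}

keys = sorted(jjj.keys())
values = sorted(jjj.values())
items = sorted(jjj.items())

print(keys)     # ['chuck', 'fred', 'jan']
print(values)   # [1, 42, 100]
print(items)    # [('chuck', 1), ('fred', 42), ('jan', 100)]
```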
37. Looping Through Dictionaries
- We loop through the key-value pairs in a dictionary using two iteration variables
- Each iteration, the first variable is the key and the second variable is the corresponding value
>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan' : 100}
>>> for aaa,bbb in jjj.items() :
...     print aaa, bbb
...
jan 100
chuck 1
fred 42
>>>
aaa       bbb
jan       100
chuck     1
fred      42
38. Dictionary Maximum Loop
$ cat dictmax.py
jjj = { 'chuck' : 1 , 'fred' : 42, 'jan' : 100}
print jjj

maxcount = None
for person, count in jjj.items() :
    if maxcount is None or count > maxcount :
        maxcount = count
        maxperson = person

print maxperson, maxcount

$ python dictmax.py
{'jan': 100, 'chuck': 1, 'fred': 42}
jan 100
None is a special value in Python. It is like the absence of a value. Like "nothing" or "empty".
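The same maximum loop can be packaged as a function so it works on any dictionary of counts. A sketch only: max_entry is a hypothetical name, and returning (None, None) for an empty dictionary is an assumption made here, not something the slide specifies.

```python
def max_entry(counts):
    """Hypothetical helper: return the (key, value) pair with the largest value."""
    maxkey = None
    maxcount = None
    for key, count in counts.items():
        if maxcount is None or count > maxcount:
            maxkey = key
            maxcount = count
    return maxkey, maxcount

jjj = {'chuck': 1, 'fred': 42, 'jan': 100}
person, count = max_entry(jjj)
print(person)   # jan
print(count)    # 100

print(max_entry(dict()))   # (None, None)
```

Starting from None rather than 0 keeps the loop correct even when every value is negative.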
39. Dictionaries are not Ordered
- Dictionaries use a Computer Science technique called hashing to make them very fast and efficient
- However hashing makes it so that dictionaries are not sorted and they are not sortable
- Lists and sequences maintain their order and a list can be sorted - but not a dictionary
http://en.wikipedia.org/wiki/Hash_function
40. Dictionaries are not Ordered
>>> lst = list()
>>> lst.append("one")
>>> lst.append("and")
>>> lst.append("two")
>>> print lst
['one', 'and', 'two']
>>> lst.sort()
>>> print lst
['and', 'one', 'two']
>>>
>>> ddd = { "a" : 123, "b" : 400, "c" : 50 }
>>> print ddd
{'a': 123, 'c': 50, 'b': 400}
Dictionaries have no order and cannot be sorted.
Lists have order and can be sorted.
http://en.wikipedia.org/wiki/Hash_function
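Although a dictionary has no sort() method, its keys can be copied into a list and that list can be sorted. A minimal sketch using the built-in sorted() function with the same three entries; the variable names ddd and pairs are assumptions chosen for illustration.

```python
ddd = {"a": 123, "b": 400, "c": 50}

for key in sorted(ddd):            # sorted() returns a new list of the keys
    print(key + " " + str(ddd[key]))
# a 123
# b 400
# c 50

# Sort by value instead: build (value, key) pairs and sort those.
pairs = sorted([(value, key) for key, value in ddd.items()])
print(pairs)   # [(50, 'c'), (123, 'a'), (400, 'b')]
```

The dictionary itself is left untouched; only the lists produced from it are ordered.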
41. Summary: Two Collections
- List
- A linear collection of values that stay in order
- Dictionary
- A bag of values, each with its own label / tag
42. What do we use these for?
- Lists - Like a spreadsheet - with columns of stuff to be summed, sorted - Also when pulling strings apart - like string.split()
- Dictionaries - For keeping track of (keyword, value) pairs in memory with very fast lookup. It is like a small in-memory database. Also used to communicate with databases and web content.
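The two collections often work together: split() pulls a string apart into a list, and a dictionary keeps a count for each word. A small sketch under stated assumptions: the sentence is made-up example text, and the code is written to run under Python 3 as well as Python 2.

```python
line = "the clown ran after the car and the car ran into the tent"   # hypothetical text

words = line.split()          # list: pull the string apart on whitespace
counts = dict()               # dictionary: word -> number of times seen
for word in words:
    counts[word] = counts.get(word, 0) + 1

print(len(words))       # 13
print(counts["the"])    # 4
print(counts["car"])    # 2
```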