Python Data Structures - PowerPoint PPT Presentation

Slides: 71
Provided by: verbsCo

Transcript and Presenter's Notes

1
Python Data Structures
  • LING 5200
  • Computational Corpus Linguistics
  • Martha Palmer

2
An Overview of Python
3
Basic Datatypes
  • Integers (default for numbers)
  • z = 5 / 2    # Answer is 2, integer division.
    (This is Python 2 behavior; in Python 3, 5 / 2
    is 2.5 and 5 // 2 is 2.)
  • Floats
  • x = 3.456
  • Strings
  • Can use "" or '' to specify. "abc" 'abc'
    (Same thing.)
  • Unmatched ones can occur within the string.
    "matt's"
  • Use triple double-quotes for multi-line strings
    or strings that contain both ' and " inside of
    them: """a'b"c"""
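
A minimal sketch of the literals on this slide. It assumes Python 3 (the slides themselves use Python 2 syntax), so it uses // for integer division and print() as a function. The names same, name, and both are illustrative only.

```python
# Integer division: // floors in both Python 2 and Python 3.
z = 5 // 2
x = 3.456                 # a float

# Single and double quotes build the same string.
same = ("abc" == 'abc')

# The other quote character can appear unescaped inside the string.
name = "matt's"

# Triple quotes allow both quote characters (and line breaks).
both = """a'b"c"""

print(z, x, same, name, both)
```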

4
Whitespace
  • Whitespace is meaningful in Python, especially
    indentation and placement of newlines.
  • Use a newline to end a line of code. (Not a
    semicolon like in C or Java.) (Use \ when you must
    go to the next line prematurely.)
  • No braces to mark blocks of code in Python.
    Use consistent indentation instead. The first
    line with less indentation is considered outside
    of the block.
  • Often a colon appears at the start of a new
    block. (We'll see this later for function and
    class definitions.)

5
Comments
  • Start comments with #; the rest of the line is
    ignored.
  • Can include a "documentation string" as the first
    line of any new function or class that you
    define.
  • The development environment, debugger, and other
    tools use it; it's good style to include one.
  • def my_function(x, y):
  •     """This is the docstring. This function does
        blah blah blah."""
  •     # The code would go here...
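
As a hedged illustration of the docstring convention above, the sketch below gives my_function a placeholder body (the sum is an assumption, the slide leaves the body open) and reads the docstring back through the standard __doc__ attribute. Python 3 is assumed.

```python
def my_function(x, y):
    """Return the sum of x and y (placeholder body for illustration)."""
    # Everything after the hash mark on this line is ignored.
    return x + y

# Tools such as help() read the docstring from the __doc__ attribute.
print(my_function.__doc__)
```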

6
Defining Functions
  • No header file or declaration of types of
    function or arguments.

def get_final_answer(filename):
    """Documentation String"""
    line1
    line2
    return total_counter
7
Python and Types
  • Python determines the data types in a program
    automatically. ("Dynamic Typing")
  • But Python's not casual about types; it enforces
    them after it figures them out. ("Strong
    Typing")
  • So, for example, you can't just append an integer
    to a string. You must first convert the integer
    to a string itself.
  • x = "the answer is "   # Decides x is string.
  • y = 23                 # Decides y is integer.
  • print x + y            # Python will complain about this.
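
The complaint mentioned above is a TypeError. A small sketch, assuming Python 3, that triggers the error and then applies the conversion the slide recommends (the names result and fixed are illustrative):

```python
x = "the answer is "   # x refers to a string
y = 23                 # y refers to an integer

try:
    result = x + y     # mixing str and int raises TypeError
except TypeError as err:
    result = None
    print("Python complained:", err)

# Convert the integer explicitly, then concatenate.
fixed = x + str(y)
print(fixed)           # the answer is 23
```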

8
Calling a Function
  • The syntax for a function call is:
  • >>> def myfun(x, y):
  • ...     return x * y
  • >>> myfun(3, 4)
  • 12
  • Parameters in Python are "Call by Assignment."
  • Sometimes acts like "call by reference" and
    sometimes like "call by value" in C++. Depends
    on the data type.
  • We'll discuss mutability of data types later;
    this will specify more precisely how function
    calls behave.

9
Functions without returns
  • All functions in Python have a return value, even
    ones without a specific return line inside the
    code.
  • Functions without a return will give the
    special value None as their return value.
  • None is a special constant in the language.
  • None is used like NULL, void, or nil in other
    languages.
  • None is also treated as false in a Boolean test
    (although None == False is itself False).
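
A short sketch of the implicit None return, assuming Python 3. The function name no_return is hypothetical.

```python
def no_return(x):
    x + 1                  # computed, but nothing is returned

value = no_return(3)
print(value is None)       # True
print(bool(None))          # False: None counts as false in a condition
print(None == False)       # False: falsy, but not equal to False
```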

10
Names and References 1
  • Python has no pointers like C or C++. Instead,
    it has "names" and "references". (Works a lot
    like Lisp or Java.)
  • You create a name the first time it appears on
    the left side of an assignment expression:
    x = 3
  • Names store references which are like pointers
    to locations in memory that store a constant or
    some object.
  • Python determines the type of the reference
    automatically based on what data is assigned to
    it.
  • It also decides when to delete it via garbage
    collection after any names for the reference have
    passed out of scope.

11
Names and References 2
  • There is a lot going on when we type: x = 3
  • First, an integer 3 is created and stored in
    memory.
  • A name x is created.
  • A reference to the memory location storing the 3
    is then assigned to the name x.

[Diagram: the name list holds "Name: x, Ref: <address1>";
memory at <address1> holds "Type: Integer, Data: 3".]
12
Names and References 3
  • The data 3 we created is of type integer. In
    Python, the basic datatypes integer, float, and
    string are "immutable."
  • This doesn't mean we can't change the value of x.
    For example, we could increment x.
  • >>> x = 3
  • >>> x = x + 1
  • >>> print x
  • 4

13
Names and References 4
  • If we increment x, then what's really happening
    is:
  • The reference of name x is looked up.
  • The value at that reference is retrieved.
  • The 3+1 calculation occurs, producing a new data
    element 4 which is assigned to a fresh memory
    location with a new reference.
  • The name x is changed to point to this new
    reference.
  • The old data 3 is garbage collected if no name
    still refers to it.

[Diagram: name x first holds Ref <address1> (Type: Integer,
Data: 3); after the increment it holds Ref <address2>
(Type: Integer, Data: 4).]
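
The rebinding steps above can be observed with the is operator. This is a sketch assuming Python 3; the extra name old is an assumption added so that the original 3 stays alive for the comparison.

```python
x = 3
old = x                # a second name keeps the 3 object alive

x = x + 1              # builds a new int object and rebinds x to it

print(x, old)          # 4 3
print(x is old)        # False: x now refers to a different object
```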
17
Assignment 1
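  • So, for simple built-in datatypes (integers,
    floats, strings), assignment behaves as you would
    expect:
  • >>> x = 3      # Creates 3, name x refers to 3.
  • >>> y = x      # Creates name y, refers to 3.
  • >>> y = 4      # Creates ref for 4. Changes y.
  • >>> print x    # No effect on x, still ref 3.
  • 3

[Diagram, final state: name x holds Ref <address1> (Type:
Integer, Data: 3); name y holds Ref <address2> (Type:
Integer, Data: 4).]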

23
Assignment 2
  • But we'll see that for other more complex data
    types assignment seems to work differently.
  • We're talking about lists, dictionaries,
    user-defined classes.
  • We will learn details about all of these types
    later.
  • The important thing is that they are "mutable."
  • This means we can make changes to their data
    without having to copy it into a new memory
    reference address each time.

    immutable          mutable
    >>> x = 3          x = some mutable object
    >>> y = x          y = x
    >>> y = 4          make a change to y
    >>> print x        look at x
    3                  x will be changed as well
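
A sketch of the two columns side by side, assuming Python 3 and using a list as the "some mutable object" (the list and the names a and b are assumptions for illustration):

```python
# Immutable: rebinding y leaves x alone.
x = 3
y = x
y = 4
print(x)            # 3

# Mutable: a and b name the same list, so an in-place change shows in both.
a = [1, 2, 3]
b = a
b.append(4)
print(a)            # [1, 2, 3, 4]
print(a is b)       # True
```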
24
Assignment 3
  • Assume we have a name x that refers to a mutable
    object of some user-defined class. This class
    has a "set" and a "get" function for some value.
  • >>> x.getSomeValue()
  • 4
  • We now create a new name y and set y = x.
  • >>> y = x
  • This creates a new name y which points to the
    same memory reference as the name x. Now, if we
    make some change to y, then x will be affected as
    well.
  • >>> y.setSomeValue(3)
  • >>> y.getSomeValue()
  • 3
  • >>> x.getSomeValue()
  • 3
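
The slide does not show the class itself. The sketch below supplies a hypothetical Holder class (the class name and its _value attribute are assumptions) with the getSomeValue and setSomeValue methods used above, assuming Python 3.

```python
class Holder:
    """Hypothetical class with a getter and a setter for one value."""

    def __init__(self, value):
        self._value = value

    def getSomeValue(self):
        return self._value

    def setSomeValue(self, value):
        self._value = value


x = Holder(4)
y = x                      # y refers to the same object as x
y.setSomeValue(3)
print(y.getSomeValue())    # 3
print(x.getSomeValue())    # 3
```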

25
Assignment 4
  • Because mutable data types can be changed in
    place without producing a new reference every
    time there is a modification, changes made through
    one name for a reference will seem to affect all
    other names for that same reference. This leads
    to the behavior on the previous slide.
  • Passing Parameters to Functions
  • When passing parameters, immutable data types
    appear to be "call by value" while mutable data
    types appear to be "call by reference."
  • (Mutable data can be changed inside a function to
    which they are passed as a parameter. Immutable
    data seems unaffected when passed to functions.)
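
A sketch of the parameter-passing behavior described above, assuming Python 3. The functions rebind and mutate are hypothetical names chosen for illustration.

```python
def rebind(n):
    n = n + 1          # rebinds the local name only

def mutate(items):
    items.append(99)   # changes the caller's list in place

count = 3
rebind(count)
data = [1, 2]
mutate(data)

print(count)   # 3
print(data)    # [1, 2, 99]
```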

26
Naming and Assignment Details
27
Naming Rules
  • Names are case sensitive and cannot start with a
    number. They can contain letters, numbers, and
    underscores.
  • bob Bob _bob _2_bob_ bob_2 BoB
  • There are some reserved words
  • and, assert, break, class, continue, def, del,
    elif, else, except, exec, finally, for, from,
    global, if, import, in, is, lambda, not, or,
    pass, print, raise, return, try, while

28
Accessing Non-existent Name
  • If you try to access a name before it's been
    properly created (by placing it on the left side
    of an assignment), you'll get an error.
  • >>> y
  • Traceback (most recent call last):
  •   File "<pyshell#16>", line 1, in -toplevel-
  •     y
  • NameError: name 'y' is not defined
  • >>> y = 3
  • >>> y
  • 3

29
Multiple Assignment
  • You can also assign to multiple names at the same
    time.
  • >>> x, y = 2, 3
  • >>> x
  • 2
  • >>> y
  • 3

30
String Operations
31
String Operations
  • We can use some methods built in to the string
    data type to perform some formatting operations
    on strings:
  • >>> "hello".upper()
  • 'HELLO'
  • There are many other handy string operations
    available. Check the Python documentation for
    more.

32
String Formatting Operator
  • The operator % allows us to build a string out of
    many data items in a "fill in the blanks"
    fashion.
  • Also allows us to control how the final string
    output will appear.
  • For example, we could force a number to display
    with a specific number of digits after the
    decimal point.
  • It is very similar to the sprintf command of C.

33
Formatting Strings with %
  • >>> x = "abc"
  • >>> y = 34
  • >>> "%s xyz %d" % (x, y)
  • 'abc xyz 34'
  • The tuple following the % operator is used to
    fill in the blanks in the original string marked
    with %s or %d.
  • Check Python documentation for whether to use %s,
    %d, or some other formatting code inside the
    string.
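
A sketch of the % operator, assuming Python 3. The %.2f line illustrates the "specific number of digits after the decimal point" case mentioned earlier; the value 3.14159 and the name price are assumptions.

```python
x = "abc"
y = 34
s = "%s xyz %d" % (x, y)        # %s takes a string, %d an integer
price = "%.2f" % 3.14159        # two digits after the decimal point

print(s)       # abc xyz 34
print(price)   # 3.14
```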

34
Printing with Python
  • You can print a string to the screen using
    "print."
  • Using the % string operator in combination with
    the print command, we can format our output text.
  • >>> print "%s xyz %d" % ("abc", 34)
  • abc xyz 34
  • print automatically adds a newline to the end
    of the string. If you include a list of strings,
    it will concatenate them with a space between
    them.
  • >>> print "abc"
  • abc
  • >>> print "abc", "def"
  • abc def

35
Container Types in Python
36
Container Types
  • Last time, we saw the basic data types in Python:
    integers, floats, and strings.
  • Containers are other built-in data types in
    Python.
  • Can hold objects of any type (including their own
    type).
  • There are three kinds of containers:
  • Tuples: a simple immutable ordered sequence of
    items.
  • Lists: a sequence with more powerful manipulations
    possible.
  • Dictionaries: a look-up table of key-value pairs.

37
Tuples, Lists, and Strings: Similarities
38
Similar Syntax
  • Tuples and lists are sequential containers that
    share much of the same syntax and functionality.
  • For conciseness, they will be introduced
    together.
  • The operations shown in this section can be
    applied to both tuples and lists, but most
    examples will just show the operation performed
    on one or the other.
  • While strings aren't exactly a container data
    type, they also happen to share a lot of their
    syntax with lists and tuples; so, the operations
    you see in this section can apply to them as well.

39
Tuples, Lists, and Strings 1
  • Tuples are defined using parentheses (and
    commas).
  • >>> tu = (23, 'abc', 4.56, (2,3), 'def')
  • Lists are defined using square brackets (and
    commas).
  • >>> li = ["abc", 34, 4.34, 23]
  • Strings are defined using quotes (", ', or """).
  • >>> st = "Hello World"
  • >>> st = 'Hello World'
  • >>> st = """This is a multi-line
  • string that uses triple quotes."""

40
Tuples, Lists, and Strings 2
  • We can access individual members of a tuple,
    list, or string using square bracket "array"
    notation.
  • >>> tu[1]    # Second item in the tuple.
  • 'abc'
  • >>> li[1]    # Second item in the list.
  • 34
  • >>> st[1]    # Second character in string.
  • 'e'

41
Looking up an Item
  • >>> t = (23, 'abc', 4.56, (2,3), 'def')
  • Positive index: count from the left, starting
    with 0.
  • >>> t[1]
  • 'abc'
  • Negative lookup: count from the right, starting
    with -1.
  • >>> t[-3]
  • 4.56

42
Slicing: Return Copy of a Subset 1
  • >>> t = (23, 'abc', 4.56, (2,3), 'def')
  • Return a copy of the container with a subset of
    the original members. Start copying at the first
    index, and stop copying before the second index.
  • >>> t[1:4]
  • ('abc', 4.56, (2,3))
  • You can also use negative indices when slicing.
  • >>> t[1:-1]
  • ('abc', 4.56, (2,3))

43
Slicing: Return Copy of a Subset 2
  • >>> t = (23, 'abc', 4.56, (2,3), 'def')
  • Omit the first index to make a copy starting from
    the beginning of the container.
  • >>> t[:2]
  • (23, 'abc')
  • Omit the second index to make a copy starting at
    the first index and going to the end of the
    container.
  • >>> t[2:]
  • (4.56, (2,3), 'def')
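
The indexing and slicing rules from the last few slides, collected in one runnable sketch (Python 3 assumed, using the same tuple t as the slides):

```python
t = (23, 'abc', 4.56, (2, 3), 'def')

print(t[1])      # abc
print(t[-3])     # 4.56
print(t[1:4])    # ('abc', 4.56, (2, 3))
print(t[1:-1])   # ('abc', 4.56, (2, 3))
print(t[:2])     # (23, 'abc')
print(t[2:])     # (4.56, (2, 3), 'def')
```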

44
Copying the Whole Container
  • You can make a copy of the whole tuple using [:].
  • >>> t[:]
  • (23, 'abc', 4.56, (2,3), 'def')
  • So, there's a difference between these two lines:
  • >>> list2 = list1      # 2 names refer to 1 ref
  •                        # Changing one affects both
  • >>> list2 = list1[:]   # Two copies, two refs
  •                        # They're independent
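
A sketch of the difference between the two assignments, assuming Python 3. The names alias and clone are illustrative. Note that [:] makes a shallow copy: the new list is independent, but any nested objects are still shared.

```python
list1 = [1, 2, 3]
alias = list1        # two names, one list
clone = list1[:]     # a new list with the same members

list1.append(4)

print(alias)         # [1, 2, 3, 4]
print(clone)         # [1, 2, 3]
```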

45
The 'in' Operator
  • Boolean test whether a value is inside a
    container:
  • >>> t = [1, 2, 4, 5]
  • >>> 3 in t
  • False
  • >>> 4 in t
  • True
  • >>> 4 not in t
  • False
  • Be careful: the in keyword is also used in the
    syntax of other unrelated Python constructions:
    "for loops" and "list comprehensions."

46
The + Operator
  • The + operator produces a new tuple, list, or
    string whose value is the concatenation of its
    arguments.
  • >>> (1, 2, 3) + (4, 5, 6)
  • (1, 2, 3, 4, 5, 6)
  • >>> [1, 2, 3] + [4, 5, 6]
  • [1, 2, 3, 4, 5, 6]
  • >>> "Hello" + " " + "World"
  • 'Hello World'

47
The * Operator
  • The * operator produces a new tuple, list, or
    string that repeats the original content.
  • >>> (1, 2, 3) * 3
  • (1, 2, 3, 1, 2, 3, 1, 2, 3)
  • >>> [1, 2, 3] * 3
  • [1, 2, 3, 1, 2, 3, 1, 2, 3]
  • >>> "Hello" * 3
  • 'HelloHelloHello'

48
Mutability: Tuples vs. Lists
49
Tuples: Immutable
  • >>> t = (23, 'abc', 4.56, (2,3), 'def')
  • >>> t[2] = 3.14
  • Traceback (most recent call last):
  •   File "<pyshell#75>", line 1, in -toplevel-
  •     t[2] = 3.14
  • TypeError: object doesn't support item assignment
  • You're not allowed to change a tuple in place in
    memory; so, you can't just change one element of
    it.
  • But it's always OK to make a fresh tuple and
    assign its reference to a previously used name.
  • >>> t = (1, 2, 3, 4, 5)

50
Lists: Mutable
  • >>> li = ['abc', 23, 4.34, 23]
  • >>> li[1] = 45
  • >>> li
  • ['abc', 45, 4.34, 23]
  • We can change lists in place. So, it's OK to
    change just one element of a list. Name li still
    points to the same memory reference when we're
    done.

51
Slicing with mutable lists
  • >>> L = ['spam', 'Spam', 'SPAM']
  • >>> L[1] = 'eggs'
  • >>> L
  • ['spam', 'eggs', 'SPAM']
  • >>> L[0:2] = ['eat', 'more']
  • >>> L
  • ['eat', 'more', 'SPAM']

52
Operations on Lists Only 1
  • Since lists are mutable (they can be changed in
    place in memory), there are many more operations
    we can perform on lists than on tuples.
  • The mutability of lists also makes managing them
    in memory more complicated. So, they aren't as
    fast as tuples. It's a tradeoff.

53
Operations on Lists Only 2
  • >>> li = [1, 2, 3, 4, 5]
  • >>> li.append('a')
  • >>> li
  • [1, 2, 3, 4, 5, 'a']
  • >>> li.insert(2, 'i')
  • >>> li
  • [1, 2, 'i', 3, 4, 5, 'a']
  • NOTE: li = li.insert(2, 'i') loses the list!
    (insert works in place and returns None.)

54
Operations on Lists Only 3
  • The extend operation is similar to
    concatenation with the + operator. But while the
    + creates a fresh list (with a new memory
    reference) containing the members from
    the two inputs, the extend operates on list li in
    place.
  • >>> li.extend([9, 8, 7])
  • >>> li
  • [1, 2, 'i', 3, 4, 5, 'a', 9, 8, 7]
  • Extend takes a list as an argument. Append takes
    a singleton.
  • >>> li.append([9, 8, 7])
  • >>> li
  • [1, 2, 'i', 3, 4, 5, 'a', 9, 8, 7, [9, 8, 7]]
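
A compact sketch contrasting extend, append, and +, assuming Python 3. The shorter list and the name joined are assumptions made to keep the output easy to read.

```python
li = [1, 2, 3]
li.extend([9, 8])     # adds each member of the argument
li.append([9, 8])     # adds the argument itself as one element
print(li)             # [1, 2, 3, 9, 8, [9, 8]]

joined = li + [0]     # + builds a new list; li is unchanged
print(joined is li)   # False
print(len(li), len(joined))   # 6 7
```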

55
Operations on Lists Only 4
  • >>> li = ['a', 'b', 'c', 'b']
  • >>> li.index('b')    # index of first occurrence
  • 1
  • >>> li.count('b')    # number of occurrences
  • 2
  • >>> li.remove('b')   # remove first occurrence
  • >>> li
  • ['a', 'c', 'b']

56
Operations on Lists Only 5
  • >>> li = [5, 2, 6, 8]
  • >>> li.reverse()     # reverse the list in place
  • >>> li
  • [8, 6, 2, 5]
  • >>> li.sort()        # sort the list in place
  • >>> li
  • [2, 5, 6, 8]
  • >>> li.sort(some_function)
  •     # sort in place using user-defined comparison
        # (Python 2; Python 3 takes key=some_function instead)
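
A sketch of the in-place list operations, assuming Python 3, where sort accepts a key function rather than a comparison function. The words list and the choice of len as the key are assumptions for illustration.

```python
li = [5, 2, 6, 8]
li.reverse()                    # reverse in place
print(li)                       # [8, 6, 2, 5]
li.sort()                       # sort in place
print(li)                       # [2, 5, 6, 8]

words = ['pear', 'fig', 'banana']
words.sort(key=len)             # order by a user-supplied key function
print(words)                    # ['fig', 'pear', 'banana']
```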

57
Tuples vs. Lists
  • Lists are slower but more powerful than tuples.
  • Lists can be modified, and they have lots of
    handy operations we can perform on them.
  • Tuples are immutable and have fewer features.
  • We can always convert between tuples and lists
    using the list() and tuple() functions.
  • li = list(tu)
  • tu = tuple(li)

58
String Conversions
59
String to List to String
  • Join turns a list of strings into one
    string. <separator_string>.join( <some_list> )
  • >>> ";".join( ["abc", "def", "ghi"] )
  • 'abc;def;ghi'
  • Split turns one string into a list of
    strings. <some_string>.split(
    <separator_string> )
  • >>> "abc;def;ghi".split( ";" )
  • ['abc', 'def', 'ghi']
  • >>> "I love New York".split()
  • ['I', 'love', 'New', 'York']
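
A round-trip sketch of join and split, assuming Python 3. The semicolon separator is an assumption chosen for illustration.

```python
parts = ["abc", "def", "ghi"]
joined = ";".join(parts)

print(joined)                       # abc;def;ghi
print(joined.split(";"))            # ['abc', 'def', 'ghi']
print("I love New York".split())    # ['I', 'love', 'New', 'York']
```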

60
Convert Anything to a String
  • The built-in str() function can convert an
    instance of any data type into a string.
  • You can define how this function behaves for
    user-created data types. You can also redefine
    the behavior of this function for many types.
  • >>> "Hello " + str(2)
  • 'Hello 2'

61
Dictionaries
62
Basic Syntax for Dictionaries 1
  • Dictionaries store a mapping between a set of
    keys and a set of values.
  • Keys can be any immutable type.
  • Values can be any type, and you can have
    different types of values in the same dictionary.
  • You can define, modify, view, lookup, and delete
    the key-value pairs in the dictionary.

63
Basic Syntax for Dictionaries 2
  • >>> d = {'user':'bozo', 'pswd':1234}
  • >>> d['user']
  • 'bozo'
  • >>> d['pswd']
  • 1234
  • >>> d['bozo']
  • Traceback (innermost last):
  •   File '<interactive input>' line 1, in ?
  • KeyError: bozo

64
Basic Syntax for Dictionaries 3
  • >>> d = {'user':'bozo', 'pswd':1234}
  • >>> d['user'] = 'clown'
  • >>> d
  • {'user':'clown', 'pswd':1234}
  • Note: Keys are unique. Assigning to an
    existing key just replaces its value.
  • >>> d['id'] = 45
  • >>> d
  • {'user':'clown', 'id':45, 'pswd':1234}
  • Note: Dictionaries are unordered. New
    entry might appear anywhere in the output.
    (As of Python 3.7, dictionaries preserve
    insertion order.)

65
Basic Syntax for Dictionaries 4
  • >>> d = {'user':'bozo', 'p':1234, 'i':34}
  • >>> del d['user']    # Remove one.
  • >>> d
  • {'p':1234, 'i':34}
  • >>> d.clear()        # Remove all.
  • >>> d
  • {}

66
Basic Syntax for Dictionaries 5
  • >>> d = {'user':'bozo', 'p':1234, 'i':34}
  • >>> d.keys()      # List of keys.
  • ['user', 'p', 'i']
  • >>> d.values()    # List of values.
  • ['bozo', 1234, 34]
  • >>> d.items()     # List of item tuples.
  • [('user','bozo'), ('p',1234), ('i',34)]
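
The dictionary operations from the last few slides in one sketch, assuming Python 3 (where keys(), values(), and items() return views rather than lists, so the sketch sorts the keys before printing). The call to d.get is an addition not shown on the slides; it is the standard way to look up a key without risking a KeyError.

```python
d = {'user': 'bozo', 'p': 1234, 'i': 34}
d['user'] = 'clown'          # existing key: value replaced
d['id'] = 45                 # new key: pair added
del d['p']                   # remove one pair

print(sorted(d.keys()))      # ['i', 'id', 'user']
print(d.get('missing'))      # None instead of a KeyError

try:
    d['bozo']                # 'bozo' was a value, never a key
except KeyError as err:
    print("KeyError:", err)
```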

67
Assignment and Containers
68
Multiple Assignment with Container Classes
  • We've seen multiple assignment before:
  • >>> x, y = 2, 3
  • But you can also do it with containers.
  • The type and shape just have to match.
  • >>> (x, y, (w, z)) = (2, 3, (4, 5))
  • >>> [x, y] = [4, 5]
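
A sketch of multiple assignment with nested containers, assuming Python 3. The extra names a, b, p, and q and the swap on the last assignment are additions for illustration.

```python
x, y = 2, 3
(a, b, (w, z)) = (2, 3, (4, 5))   # shapes on both sides must match
[p, q] = [4, 5]

x, y = y, x                       # the same mechanism swaps two names
print(x, y, w, z, p, q)           # 3 2 4 5 4 5
```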

69
Empty Containers 1
  • We know that assignment is how to create a name.
  • x = 3    # Creates name x of type integer.
  • Assignment is also what creates named references
    to containers.
  • >>> d = {'a':3, 'b':4}
  • We can also create empty containers:
  • >>> li = []
  • >>> tu = ()
  • >>> di = {}

Note: an empty container is logically equivalent
to False. (Just like None.)
70
Empty Containers 2
  • Why create a named reference to an empty container?
    You might want to use append or some other list
    operation before you really have any data in your
    list. This could cause an unknown name error if
    you don't properly create your named reference
    first.
  • >>> g.append(3)
  • Python complains here about the unknown name
    'g'!
  • >>> g = []
  • >>> g.append(3)
  • >>> g
  • [3]
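
A sketch of the unknown-name error and its fix, assuming Python 3. The last line checks the earlier note that empty containers count as false.

```python
try:
    g.append(3)          # g has never been assigned
except NameError as err:
    print("NameError:", err)

g = []                   # create the named reference first
g.append(3)
print(g)                 # [3]

print(bool([]), bool(()), bool({}))   # False False False
```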