LIS 559: Week 6 - PowerPoint PPT Presentation

1
LIS 559 Week 6
  • Summer 2006
  • Margaret Kipp
  • mkipp_at_uwo.ca
  • x88687
  • NCB 235

2
Concepts
  • Review
  • Dictionaries
  • String methods (e.g. find(), index())
  • Internet Access with Python
  • urllib

3
Dictionaries
  • A dictionary is a group of key, value pairs.
  • e.g. A = {'jane': '123-4567', 'joe': '234-5678'}
  • print len(A)
  • 2
  • print A.keys()
  • ['jane', 'joe']
  • the keys() method returns a list of all the keys
  • print A.values()
  • ['123-4567', '234-5678']
  • the values() method returns a list of all the values
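
The calls on this slide can be tried as a small script. This is a minimal sketch that assumes Python 3, where print is a function and keys() and values() return views that are wrapped in list() to get the lists the slide describes; the names count, names and numbers are illustrative only.

```python
# Python 3 sketch of the phone-book dictionary from the slide.
A = {'jane': '123-4567', 'joe': '234-5678'}

count = len(A)               # number of key, value pairs
names = list(A.keys())       # keys() returns a view in Python 3, so wrap it in list()
numbers = list(A.values())   # same for values()

print(count)
print(names)
print(numbers)
```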

4
Dictionaries (cont.)
  • A.items()
  • returns a list of key, value pairs
  • e.g. [('jane', '123-4567'), ('joe', '234-5678')]
  • A.has_key("jane")
  • the has_key(k) method returns True if the key k
    exists in the dictionary
  • A.get("jane")
  • returns the value associated with the given key,
    in this case "jane"; if the key does not exist,
    get() returns None (or a default passed as a second
    argument) rather than throwing an exception, so no
    has_key() check is needed before calling it
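
A minimal sketch of these lookups, assuming Python 3: has_key() no longer exists there, so the in operator stands in for it, and items() is wrapped in list(). The key 'bob' and the default 'unknown' are made up for illustration.

```python
A = {'jane': '123-4567', 'joe': '234-5678'}

pairs = list(A.items())               # list of (key, value) tuples
has_jane = 'jane' in A                # Python 3 replacement for A.has_key('jane')
jane_number = A.get('jane')           # value stored under an existing key
missing = A.get('bob')                # absent key: get() returns None, no exception
fallback = A.get('bob', 'unknown')    # absent key with an explicit default

print(pairs, has_jane, jane_number, missing, fallback)
```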

5
Dictionaries (cont.)
  • b = A["jane"]
  • this sets b equal to the value associated with
    "jane" (a KeyError is raised if the key does not exist)
  • del A["jane"]
  • this would delete the key, value pair with key
    "jane"
  • Note: a dictionary can be nested in a list or
    another dictionary (or vice versa)
  • e.g. {'cabbage': 1, 'carrot': 3, 'onion': 1,
    'spices': ['basil', 'pepper', 'dill', 'salt']}
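
The indexing, deletion and nesting described above can be sketched as follows. The variable names soup, lookup_failed and second_spice are hypothetical, chosen only for this illustration.

```python
A = {'jane': '123-4567', 'joe': '234-5678'}

b = A['jane']        # look up the value stored under 'jane'
del A['jane']        # remove the 'jane' key, value pair

# Indexing a key that is no longer present raises KeyError.
try:
    A['jane']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# A list nested inside a dictionary, as in the slide's example.
soup = {'cabbage': 1, 'carrot': 3, 'onion': 1,
        'spices': ['basil', 'pepper', 'dill', 'salt']}
second_spice = soup['spices'][1]   # index into the nested list

print(b, lookup_failed, second_spice)
```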

6
String
  • if S is a string... (S = "This is a string.")
  • S.find(substring, start, end)
  • returns the index of the first occurrence of
    substring in S, or -1 if it is not found
  • S.index(substring, start, end)
  • returns the index of the first occurrence of
    substring in S, or throws a ValueError exception if
    it is not found
  • start and end are optional, but useful if you
    only want to search part of the string
  • Remember that string positions are counted from 0
    to len(S)-1!
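
A short sketch of find() and index() on the slide's example string. The substrings searched for here are arbitrary choices for illustration.

```python
S = "This is a string."

first_is = S.find("is")        # first match is inside "This"
later_is = S.find("is", 3)     # start searching at position 3
not_found = S.find("xyz")      # find() signals failure with -1
where = S.index("string")      # same result as find() when the substring exists

# index() raises ValueError instead of returning -1.
try:
    S.index("xyz")
    index_failed = False
except ValueError:
    index_failed = True

last_char = S[len(S) - 1]      # the last valid position is len(S)-1

print(first_is, later_is, not_found, where, index_failed, last_char)
```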

7
Internet
  • Python can be used to access web pages and to
    generate dynamic web pages
  • Some Python Internet Modules
  • urllib
  • urllib2
  • access URLs and web pages
  • urlparse
  • works with URLs
  • Cookie
  • handle cookies from web sites
  • cgi
  • functions for interactive website scripts

8
urllib module
  • urllib stands for URL library; in programming, a
    library is a collection of functions that can be
    reused
  • urllib fetches data from the web
  • this data can be a website, a text file, an image
    or any other file on the web
  • functions
  • urlopen()
  • urlretrieve()
  • urlcleanup()

9
urllib.urlopen()
  • urllib.urlopen(url)
  • this function opens a URL and returns an object
    that acts like a file object
  • e.g. the following code will "open" the given
    URL and then print its contents to the screen
  • import urllib
  • website = urllib.urlopen("http://publish.uwo.ca/mkipp/teaching/559.html")
  • data = website.read()
  • print data

10
using urllib
  • website = urllib.urlopen("http://publish.uwo.ca/mkipp/teaching/559.html")
  • data = website.read()
  • stores the entire page in a string
  • data = website.readline()
  • stores the next unread line of the page (the first
    line, if nothing has been read yet)
  • data = website.readlines()
  • stores the entire page in a list of strings, one
    per line
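
The three read methods can be compared side by side. This is a minimal sketch under two assumptions: it targets Python 3, where urlopen() lives in urllib.request and returns bytes rather than strings, and it opens a small temporary local file through a file:// URL as a stand-in for the course page so that it runs without network access. The file name page.html and its contents are made up.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen   # Python 3 location of urllib.urlopen

# Hypothetical stand-in for a web page: a local file reached via a file:// URL.
page = Path(tempfile.mkdtemp()) / "page.html"
page.write_bytes(b"<html>\n<body>Hello</body>\n</html>\n")
url = page.as_uri()

website = urlopen(url)
data = website.read()             # the whole page as one bytes object
website.close()

website = urlopen(url)
first_line = website.readline()   # the next unread line; here, the first one
website.close()

website = urlopen(url)
lines = website.readlines()       # a list with one bytes object per line
website.close()

print(data)
print(first_line)
print(lines)
```

Each call reopens the URL because the reads consume the stream: calling readline() after read() on the same object would return an empty result.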

11
using urllib (cont.)
  • website = urllib.urlopen("http://publish.uwo.ca/mkipp/teaching/559.html")
  • website.info()
  • returns the HTTP headers sent by the server
  • website.geturl()
  • returns the URL actually retrieved (this can
    differ from the one requested if the server
    redirected the request)
  • website.close()
  • closes the connection to the website
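
A minimal sketch of info(), geturl() and close(), again assuming Python 3 (urllib.request) and a temporary local file opened through a file:// URL in place of the course page. For a local file the headers are synthesized by the library rather than sent by a server; the file name page.html is hypothetical.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen   # Python 3 location of urllib.urlopen

# Hypothetical stand-in for a web page.
content = b"<html><body>Hello</body></html>\n"
page = Path(tempfile.mkdtemp()) / "page.html"
page.write_bytes(content)

website = urlopen(page.as_uri())
headers = website.info()        # header block describing the response
real_url = website.geturl()     # URL of the resource actually opened
website.close()                 # release the connection (here, the file handle)

print(headers["Content-length"])
print(real_url)
```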

12
urllib.urlretrieve()
  • urllib.urlretrieve(url, filename)
  • opens a URL and copies the web page or image to a
    local file (the filename is optional)
  • returns the name of the file and other
    information in a tuple (filename, headers)
  • e.g. the following code will save the web page
    and print the filename
  • import urllib
  • (filename, headers) = urllib.urlretrieve("http://publish.uwo.ca/mkipp/teaching/559.html", "559description.html")
  • print filename
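
The same pattern can be sketched without network access. This assumes Python 3, where urlretrieve() lives in urllib.request, and copies from a temporary local file through a file:// URL; the directory, source file and its contents are made up for the example.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve   # Python 3 location of urllib.urlretrieve

# Hypothetical source document standing in for the course page.
workdir = Path(tempfile.mkdtemp())
source = workdir / "559.html"
source.write_bytes(b"<html>LIS 559</html>\n")
target = str(workdir / "559description.html")

# Copy the resource behind the URL into the target file.
filename, headers = urlretrieve(source.as_uri(), target)

print(filename)
```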

13
other urllib functions
  • urllib.urlcleanup()
  • this function cleans up any temporary files
    created by urlretrieve()
  • urllib.quote(string)
  • replaces special characters in a string with
    %xx escapes (e.g. replaces spaces with %20)
  • e.g. "this is a string" becomes
    "this%20is%20a%20string"
  • urllib.unquote(string)
  • reverses quote(), replacing each %xx escape with
    the character it stands for
  • urllib.urlencode(dictionary)
  • converts a dictionary of key, value pairs into a
    string of parameters for a URL
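
A short sketch of the three string helpers, assuming Python 3, where they live in urllib.parse instead of urllib. The parameter dictionary passed to urlencode() is invented for illustration.

```python
from urllib.parse import quote, unquote, urlencode   # Python 3 locations

quoted = quote("this is a string")     # spaces become %20
restored = unquote(quoted)             # undo the escaping

# Hypothetical query parameters; urlencode() encodes spaces in values as '+'.
params = urlencode({'course': 'LIS 559', 'week': 6})

print(quoted)
print(restored)
print(params)
```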

14
urlparse
  • urlparse stands for URL parser; this module
    splits and joins URLs
  • urlparse('http://www.uwo.ca/index.shtml')
  • ('http', 'www.uwo.ca', '/index.shtml', '', '',
    '')
  • the last three sections are for the parameters,
    query string and fragment, which this URL doesn't
    have
  • urlunparse(('http', 'www.uwo.ca', '/index.shtml',
    '', '', ''))
  • 'http://www.uwo.ca/index.shtml'
  • urljoin('http://www.uwo.ca/index.shtml',
    'help.html')
  • 'http://www.uwo.ca/help.html'
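
The split and join calls above can be run as a script. This sketch assumes Python 3, where the urlparse module was merged into urllib.parse; the variable names parts, rebuilt and joined are illustrative.

```python
from urllib.parse import urlparse, urlunparse, urljoin   # Python 3 locations

parts = urlparse('http://www.uwo.ca/index.shtml')   # six-part named tuple
rebuilt = urlunparse(parts)                         # reassemble the same URL
joined = urljoin('http://www.uwo.ca/index.shtml', 'help.html')  # resolve a relative link

print(tuple(parts))
print(rebuilt)
print(joined)
```

The result of urlparse() also exposes the parts by name (for example parts.netloc), which can be easier to read than indexing by position.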

15
webbrowser
  • webbrowser: this module provides functions for
    opening documents in a web browser
  • import webbrowser
  • webbrowser.open("http://www.uwo.ca/")
  • opens the main UWO website in a browser window
  • webbrowser.open_new("http://www.uwo.ca/")
  • opens the main UWO website in a new browser window