Introduction to Using Web Data - PowerPoint PPT Presentation

Slides: 11
Provided by: BenSc

Transcript and Presenter's Notes


1
Session 33
  • Introduction to Using Web Data

2
Questions from Lab13?
3
PA08 now posted
  • Create a program to translate simple text files
    into Pig Latin
  • Four score and seven years ago our fathers
    brought forth on this continent a new nation
    conceived in liberty and dedicated to the
    proposition that all men are created equal.
  • Becomes
  • Ourfay orescay andyay evensay earsyay agoyay
    ouryay athersfay oughtbray orthfay onyay isthay
    ontinentcay ayay ewnay ationnay onceivedcay inyay
    ibertylay andyay edicatedday otay ethay
    opositionpray atthay allyay enmay areyay
    eatedcray equalyay.
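
One possible reading of the example above: a word that starts with a vowel gets "yay" appended, and any other word moves its leading consonants to the end and adds "ay". The sketch below (Python 3) assumes exactly those two rules, treats "y" as a consonant, and works on a string rather than a text file. The function names are illustrative and are not part of the assignment.

```python
import re

VOWELS = "aeiou"  # assumption: "y" counts as a consonant, as in "years" -> "earsyay"

def pig_latin_word(word):
    """Translate one alphabetic word using the two rules inferred from the slide."""
    lower = word.lower()
    # Index of the first vowel; a word with no vowel is treated as all consonants.
    first_vowel = next((i for i, ch in enumerate(lower) if ch in VOWELS), len(lower))
    if first_vowel == 0:
        result = lower + "yay"
    else:
        result = lower[first_vowel:] + lower[:first_vowel] + "ay"
    # Carry a leading capital over to the translated word ("Four" -> "Ourfay").
    if word[0].isupper():
        result = result.capitalize()
    return result

def pig_latin(text):
    """Translate every run of letters; spaces and punctuation stay where they are."""
    return re.sub(r"[A-Za-z]+", lambda m: pig_latin_word(m.group()), text)

print(pig_latin("Four score and seven years ago"))
```

Run on the opening words of the slide's sentence, this prints "Ourfay orescay andyay evensay earsyay agoyay", which matches the translation shown above. Capitals inside a word are not preserved.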

4
Suppose we want to read a web page
  • www.weather.com
  • www.pro-football-reference.com
  • http://www.commonplacebook.com/journal/what_am_i/whats_your_prof.shtm

5
Simple extraction of data
  • our version of recipe 82
  • def findTemperatureLive():
  •     import urllib
  •     web = urllib.urlopen("http://www.kwwl.com")
  •     contents = web.read()
  •     web.close()
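
The slide uses urllib.urlopen, which is the Python 2 spelling. Under Python 3 the same open, read, close sequence might look like the sketch below. The helper name fetch_page and the choice to decode as UTF-8 are assumptions made for illustration, not something the slides specify.

```python
from urllib.request import urlopen

def fetch_page(url, encoding="utf-8"):
    """Return the text found at url (hypothetical helper, not from the slides)."""
    # The with-block closes the connection even if read() raises an error.
    with urlopen(url) as web:
        raw = web.read()  # Python 3 returns bytes here, not a str
    # Assumption: the page is UTF-8; bytes that cannot be decoded are replaced.
    return raw.decode(encoding, errors="replace")
```

A call such as fetch_page("http://www.kwwl.com") would then stand in for the three lines on the slide, returning a string that the find calls can search.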

6
Simple extraction of data
  • our version of recipe 82
  • def findTemperatureLive():
  •     import urllib
  •     web = urllib.urlopen("http://www.kwwl.com")
  •     contents = web.read()
  •     web.close()
  •     startTemp = contents.find("Temperature")
  •     endTemp = contents.find("deg", startTemp)
  •     print contents[startTemp:endTemp]
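
The find-and-slice step can be tried without a network connection by running it on a string. The sketch below (Python 3) also checks for the -1 that find returns when a marker is missing, which the slide version does not do. The function name and the sample fragment are invented for illustration and are not taken from the real page.

```python
def find_temperature(contents):
    """Return the text from "Temperature" up to (not including) "deg", or None."""
    start = contents.find("Temperature")
    if start == -1:
        return None  # marker not on the page
    end = contents.find("deg", start)  # search only after the first marker
    if end == -1:
        return None
    return contents[start:end]

# Hypothetical page fragment, invented for illustration only.
sample = "<p>Currently</p><b>Temperature: 41 deg F</b>"
print(find_temperature(sample))
```

Because the slice stops at the index where "deg" begins, the result ends just before that word. Without the -1 checks, a missing marker would silently produce a wrong slice instead of an error.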

7
More involved
  • def extractTeamCodes(filename):
  •     import urllib
  •     web = urllib.urlopen("http://www.pro-football-reference.com")
  •     contents = web.readlines()
  •     web.close()
  •     for line in contents:
  •         if line.find("F.E.") > -1:
  •             print line

8
More involved
  • def extractTeamCodes(filename):
  •     import urllib
  •     web = urllib.urlopen("http://www.pro-football-reference.com")
  •     contents = web.readlines()
  •     web.close()
  •     for line in contents:
  •         if line.find("F.E.") > -1 and line.find("teams") > -1:
  •             print line
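
The line filter can be written with Python's in operator, which tests for a substring directly and avoids comparing find results against -1. A minimal Python 3 sketch follows. The two marker strings are the ones on the slide, while the function name and the sample lines are hypothetical.

```python
def matching_lines(lines, required=("F.E.", "teams")):
    """Keep only the lines that contain every substring listed in required."""
    return [line for line in lines if all(s in line for s in required)]

# Hypothetical lines, invented for illustration; the real page markup may differ.
sample_lines = [
    '<a href="/teams/nwe/">F.E. example</a>',
    '<a href="/players/">F.E. example</a>',
    '<a href="/teams/buf/">other text</a>',
]
for line in matching_lines(sample_lines):
    print(line)
```

Only the first sample line contains both markers, so it is the only one printed.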

9
More involved
  • def extractTeamCodes(filename):
  •     import urllib
  •     web = urllib.urlopen("http://www.pro-football-reference.com")
  •     contents = web.readlines()
  •     web.close()
  •     output = open(filename, 'wt')
  •     for line in contents:
  •         if line.find("F.E.") > -1 and line.find("teams") > -1:
  •             tokens = line.split("/")
  •             output.write(tokens[2] + "\n")
  •     output.close()
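
The split-and-write step can be exercised on a list of lines instead of a live page. The Python 3 sketch below takes the lines as a parameter, guards against lines with fewer than three "/"-separated pieces, and uses a with-block so the file is closed automatically. The function signature and the sample markup are assumptions, and the real page may be laid out differently.

```python
def extract_team_codes(lines, filename):
    """Write the third "/"-separated token of each matching line to filename."""
    codes = []
    for line in lines:
        if "F.E." in line and "teams" in line:
            tokens = line.split("/")
            if len(tokens) > 2:  # skip lines too short to hold a code
                codes.append(tokens[2])
    # One code per line, as in the slide's output.write(tokens[2] + "\n").
    with open(filename, "wt") as output:
        for code in codes:
            output.write(code + "\n")
    return codes
```

For a hypothetical line such as '<a href="/teams/nwe/">F.E. example</a>', splitting on "/" gives '<a href="', 'teams', 'nwe', and so on, so tokens[2] is 'nwe'. If the link were an absolute URL, the code would sit at a different index.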
