Title: Python Training for HP OSO
1 Python Training for HP OSO
- Guido van Rossum, CNRI
- 7/23/1999, 9am - 1pm
2 Plug
- The Practice of Programming
- Brian W. Kernighan and Rob Pike
- Addison-Wesley, 1999
Mostly about C, but very useful! http://cm.bell-labs.com/cm/cs/tpop/
3 CODE STRUCTURE
4 The importance of readability
- Most time is spent on maintenance
- Think about the human reader
- Can you still read your own code...
- next month?
- next year?
5 Writing readable code
- Be consistent
- (but not too consistent!)
- Use whitespace judiciously
- Write appropriate comments
- Write helpful doc strings
- not novels
- Indicate unfinished business
6 Modifying existing code
- Conform to the existing style
- even if it's not your favorite style!
- local consistency overrides global
- Update the comments!!
- and the doc strings!!!
7 Organizing code clearly
- Top-down or bottom-up?
- Pick one style, stick to it
- Alternative: group by functionality
- e.g.:
- constructor, destructor
- housekeeping
- low level methods
- high level methods
8 When to use classes (...and when not!)
- Use a class:
- when multiple copies of state needed
- e.g. client connections, drawing objects
- Use a module:
- when one copy of state always suffices
- e.g. logger, cache
- Use functions:
- when no state needed, e.g. sin()
9 Class hierarchies
- Avoid deep class hierarchies
- inefficient
- multi-level lookup
- hard to read
- find method definitions
- easy to make mistakes
- name clashes between attributes
10 Modules and packages
- Modules collect classes, functions
- Packages collect modules
- For a group of related modules
- consider using a package
- minimizes chance of namespace clashes
11 Naming conventions (my preferred style)
- Modules, packages: lowercase
- except when one module corresponds to one class
- Classes: CapitalizedWords
- also for exceptions
- Methods, attrs: lowercase_words
- Local variables: i, j, sum, x0, etc.
- Globals: long_descriptive_names
12 The main program
- In script or program:
- def main(): ...
- if __name__ == "__main__":
-     main()
- In module:
- def _test(): ...
- if __name__ == "__main__":
-     _test()
- Always define a function!
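A minimal sketch of the script form, in Python 3 syntax. The `argv` parameter and the greeting are assumptions added for illustration; the point is that the work lives in `main()`, so it can be imported and called from tests.

```python
import sys

def main(argv=None):
    """Entry point; takes an argument list so it can be called from tests."""
    if argv is None:
        argv = sys.argv[1:]
    names = argv or ["world"]
    for name in names:
        print("Hello, %s!" % name)
    return 0

if __name__ == "__main__":
    main()
```

Importing the file runs nothing; running it as a script calls `main()`.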
13 DOCUMENTATION
14 Writing comments
- Explain salient points (only)
- n = n+1   # include end point
- Note dependencies, refs, bugs
- # Assume reader() handles I/O errors
- # See Knuth, vol.3, page 410
- # XXX doesn't handle x<0 yet
15 Writing doc strings
- """Brief one-line description.
- Longer description, documenting
- argument values, defaults,
- return values, and exceptions.
- """
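A short sketch of a doc string in that shape, attached to a hypothetical `clamp()` function (the function itself is an assumption made up for this example).

```python
def clamp(value, low=0, high=100):
    """Return value limited to the range [low, high].

    value -- the number to limit
    low   -- lower bound (default 0)
    high  -- upper bound (default 100)

    Returns low if value < low, high if value > high, else value.
    Raises ValueError if low > high.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(high, value))
```

The first line stands alone as a summary; the rest covers arguments, defaults, the return value, and the exception.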
16 When NOT to use comments
- Don't comment what's obvious
- n = n+1   # increment n
- Don't put a comment on every line
- Don't draw boxes, lines, etc.
- #------------------------
- def remove_bias(self):
- #------------------------
-     self.bias = 0
17 One more thing...
- UPDATE THE COMMENTS WHEN UPDATING THE CODE!
- (dammit!)
18 THE LIBRARY
19 The library is your friend!
- Know what's there
- Study the library manual
- especially the early chapters
- Python, string, misc, os services
- Notice platform dependencies
- Avoid obsolete modules
20 Stupid os.path tricks
- os.path.exists(p), isdir(p), islink(p)
- os.path.isabs(p)
- os.path.join(p, q, ...), split(p)
- os.path.basename(p), dirname(p)
- os.path.splitdrive(p), splitext(p)
- os.path.normcase(p), normpath(p)
- os.path.expanduser(p)
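A small sketch of a few of these calls working together, in Python 3 syntax. The `describe()` helper and the example path are hypothetical; only `os.path` functions from the list above are used.

```python
import os.path

def describe(path):
    """Split a path into its parts using os.path helpers only."""
    directory, filename = os.path.split(path)
    stem, extension = os.path.splitext(filename)
    return {
        "directory": directory,
        "filename": filename,
        "stem": stem,
        "extension": extension,
        "absolute": os.path.isabs(path),
    }

# Build the path with join() so the separator suits the platform.
example = os.path.join("reports", "1999", "summary.txt")
info = describe(example)
```

Because the path is built with `join()` and taken apart with `split()`, the code does not hard-code a separator.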
21 PORTING YOUR BRAIN
22 Class or module?
- Stateless operations, factory funcs
- Java: static methods
- Python: functions in module
- Singleton state
- Java: static members, methods
- Python: module globals, functions
23 Private, protected, public?
- Java:
- private, protected, public
- enforced by compiler (and JVM?)
- Python:
- __private
- enforced by compiler
- loophole: _Class__private
- _private, _protected, public
- used by convention
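The name-mangling rule and its loophole can be seen in a few lines. This is a sketch in Python 3 syntax; the `Account` class and its attributes are hypothetical.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance   # stored as _Account__balance
        self._audit_note = "ok"    # "internal" by convention only

    def balance(self):
        return self.__balance      # mangled the same way inside the class

acct = Account(100)
```

Outside the class, `acct.__balance` raises `AttributeError`, but `acct._Account__balance` still reaches the value, which is the loophole named on the slide.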
24 Method/constr. overloading
- Java:
- class C {
-     int f() { ... }
-     int f(int i) { ... }
-     int f(int i, int arg) { ... }
- }
- Python:
- class C:
-     def f(self, i=0, arg=None):
-         ...
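A runnable sketch of the Python side, assuming (hypothetically) that `f` just reports how it was called. One method with default arguments stands in for the three Java overloads.

```python
class C:
    def f(self, i=0, arg=None):
        """One method covers f(), f(i) and f(i, arg)."""
        if arg is None:
            return "f(%d)" % i
        return "f(%d, %r)" % (i, arg)
```

Using `None` as the default for `arg` lets the method tell "not passed" apart from an ordinary value.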
25 Java interfaces
- In Python, interfaces often implied
- class File:
-     def read(self, n): ...
- class CompressedFile:
-     def read(self, n): ...
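A sketch of an implied interface in Python 3 syntax. The in-memory buffers, the use of `zlib`, and the `first_word()` helper are assumptions made to keep the example self-contained.

```python
import io
import zlib

class File:
    """Reads raw bytes from an in-memory buffer (stand-in for a real file)."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, n):
        return self._buffer.read(n)

class CompressedFile:
    """Same read(n) method, but the stored bytes are zlib-compressed."""

    def __init__(self, compressed):
        self._buffer = io.BytesIO(zlib.decompress(compressed))

    def read(self, n):
        return self._buffer.read(n)

def first_word(f):
    """Works with any object that has read(n); no shared base class needed."""
    return f.read(5)
```

Neither class declares an interface; `first_word()` relies only on the presence of `read(n)`.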
26 Abstract classes
- Not used much in Python
- Possible:
- class GraphicalObject:
-     def draw(self, display):
-         raise NotImplementedError
-     def move(self, dx, dy):
-         raise NotImplementedError
-     ....
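A sketch of how such a base class behaves once subclassed. The `Point` subclass is hypothetical and deliberately leaves `draw()` unimplemented to show what the caller sees.

```python
class GraphicalObject:
    def draw(self, display):
        raise NotImplementedError

    def move(self, dx, dy):
        raise NotImplementedError

class Point(GraphicalObject):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def move(self, dx, dy):
        self.x += dx
        self.y += dy
    # draw() is intentionally not overridden here
```

Calling `move()` works; calling `draw()` on a `Point` raises `NotImplementedError` at call time rather than at class definition time.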
27 ERROR HANDLING
28 When to catch exceptions
- When there's an alternative option
- try:
-     f = open(".startup")
- except IOError:
-     f = None   # No startup file; use defaults
- To exit with nice error message
- try:
-     f = open("data")
- except IOError, msg:
-     print "I/O Error:", msg; sys.exit(1)
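A sketch of the "alternative option" case in Python 3 syntax, where `IOError` is an alias of `OSError`. The `load_startup()` function and its `key = value` file format are assumptions for illustration.

```python
def load_startup(path, defaults):
    """Return settings read from path, or a copy of defaults if it can't be opened."""
    try:
        f = open(path)
    except OSError:
        # No startup file: fall back to the defaults.
        return dict(defaults)
    with f:
        settings = dict(defaults)
        for line in f:
            key, sep, value = line.partition("=")
            if sep:
                settings[key.strip()] = value.strip()
        return settings
```

Only the `open()` call sits inside the `try`, so a problem while parsing is not mistaken for a missing file.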
29 When NOT to catch them
- When the cause is likely a bug
- need the traceback to find the cause!
- When the caller can catch it
- keep exception handling in outer layers
- When you don't know what to do
- try:
-     receive_message()
- except:
-     print "An error occurred!"
30 Exception handling style
- Bad:
- try:
-     parse_args()
-     f = open(file)
-     read_input()
-     make_report()
- except IOError:
-     print file, "not found"
- (what if read_input() raises IOError?)
- Good:
- parse_args()
- try:
-     f = open(file)
- except IOError, msg:
-     print file, msg
-     sys.exit(1)
- read_input()
- make_report()
31 Error reporting/logging
- Decide where errors should go:
- sys.stdout - okay for small scripts
- sys.stderr - for larger programs
- raise exception - in library modules
- let caller decide how to report!
- log function - not recommended
- better: redirect sys.stderr to log object!
32 The danger of except:
- What's wrong with this code?
- try:
-     return self.children[0]   # first child
- except:
-     return None   # no children
- Solution:
- except IndexError:
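One way a bare `except:` goes wrong is by hiding an unrelated bug. In this sketch the `Node` class and the deliberate `childern` typo are hypothetical, added to make the failure visible.

```python
class Node:
    def __init__(self, children=None):
        self.children = children or []

    def first_child_sloppy(self):
        try:
            return self.childern[0]   # typo: raises AttributeError
        except:                       # ...which the bare except swallows
            return None

    def first_child(self):
        try:
            return self.children[0]
        except IndexError:            # only the expected failure is caught
            return None
```

With the bare `except:`, a node that does have children still reports `None`, and no traceback points at the typo.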
33 PYTHON PITFALLS
34 Sharing mutable objects
- through variables
- a = [1,2]; b = a; a.append(3); print b
- as default arguments
- def add(a, list=[]):
-     list.append(a); return list
- as class attributes
- class TreeNode:
-     children = []
-     ...
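A sketch of the default-argument and class-attribute cases with common workarounds, in Python 3 syntax. The names `add_shared` and `add_fresh` are hypothetical.

```python
def add_shared(item, items=[]):      # the default list is created only once
    items.append(item)
    return items

def add_fresh(item, items=None):     # common workaround: None as a sentinel
    if items is None:
        items = []
    items.append(item)
    return items

class TreeNode:
    def __init__(self):
        self.children = []           # per-instance list, not a class attribute
```

Successive calls to `add_shared()` keep growing the same list, while `add_fresh()` starts from an empty list each time.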
35 Lurking bugs
- bugs in exception handlers
- try:
-     f = open(file)
- except IOError, err:
-     print "I/O Error", file, msg   # bug: msg is undefined here
- misspelled names in assignments
- self.done = 0
- while not done:
-     if self.did_it(): self.Done = 1   # bug: sets Done, not done
36 Global variables
- # logging module
- log = []
- def addlog(x):
-     log.append(x)
- def resetlog():
-     log = []   # doesn't work!
- # logging module
- # corrected version
- log = []
- def addlog(x):
-     log.append(x)
- def resetlog():
-     global log
-     log = []
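A sketch contrasting rebinding with in-place mutation, in Python 3 syntax. The in-place variant and the `current_log()` helper are additions not found on the slide, offered as one alternative to `global`.

```python
log = []

def addlog(x):
    log.append(x)        # mutates the existing list; no "global" needed

def resetlog_rebind():
    global log
    log = []             # rebinds the module-level name to a new list

def resetlog_in_place():
    del log[:]           # empties the same list object; no rebinding

def current_log():
    return log
```

After `resetlog_rebind()`, any other reference to the old list still holds the old entries; `resetlog_in_place()` clears the list for every holder.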
37 kjpylint
- Detects many lurking bugs
- http://www.chordate.com/kwParsing/
38 PERFORMANCE
39 When to worry about speed
- Only worry about speed when...
- your code works (!)
- and its overall speed is too slow
- and it must run many times
- and you can't buy faster hardware
40 Using the profile module
>>> import profile
>>> import xmlini
>>> data = open("test.xml").read()
>>> profile.run("xmlini.fromxml(data)")
>>> profile.run("for i in range(100): xmlini.fromxml(data)")
         10702 function calls in 1.155 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.013    0.013    1.154    1.154 <string>:1(?)
        1    0.001    0.001    1.155    1.155 profile:0(for i in range(100): xmlini.fromxml(data))
        0    0.000             0.000          profile:0(profiler)
      500    0.018    0.000    0.018    0.000 xmlini.py:105(end_group)
      700    0.032    0.000    0.032    0.000 xmlini.py:109(start_item)
      700    0.050    0.000    0.050    0.000 xmlini.py:115(end_item)
      200    0.007    0.000    0.007    0.000 xmlini.py:125(start_val)
      200    0.014    0.000    0.014    0.000 xmlini.py:129(end_val)
     1600    0.190    0.000    0.270    0.000 xmlini.py:134(finish_starttag)
     1600    0.163    0.000    0.258    0.000 xmlini.py:143(finish_endtag)
      100    0.004    0.000    0.004    0.000 xmlini.py:152(handle_proc)
      100    0.007    0.000    0.007    0.000 xmlini.py:162(handle_charref)
      100    0.007    0.000    0.007    0.000 xmlini.py:167(handle_entityref)
     3600    0.161    0.000    0.161    0.000 xmlini.py:172(handle_data)
      100    0.003    0.000    0.003    0.000 xmlini.py:182(handle_comment)
      100    0.420    0.004    1.141    0.011 xmlini.py:60(fromxml)
      100    0.007    0.000    0.007    0.000 xmlini.py:70(__init__)
      100    0.004    0.000    0.004    0.000 xmlini.py:80(getdict)
      200    0.012    0.000    0.012    0.000 xmlini.py:86(start_top)
      200    0.014    0.000    0.014    0.000 xmlini.py:92(end_top)
      500    0.029    0.000    0.029    0.000 xmlini.py:99(start_group)
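For readers on a current interpreter, a comparable report can be produced with `cProfile` and `pstats`. This is a sketch under assumptions: the `work()` function is a hypothetical stand-in for `xmlini.fromxml`, and the report is captured in a string rather than printed.

```python
import cProfile
import io
import pstats

def work(n):
    """Hypothetical workload standing in for the real function under test."""
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    work(1000)
profiler.disable()

# Send the report to a string buffer, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

The `report` string has the same columns as the listing above (ncalls, tottime, percall, cumtime).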
41 Measuring raw speed
- # Here's one way
- import time
- def timing(func, arg, ncalls=100):
-     r = range(ncalls)
-     t0 = time.clock()
-     for i in r:
-         func(arg)
-     t1 = time.clock()
-     dt = t1-t0
-     print "%s: %.3f ms/call (%.3f seconds / %d calls)" % (
-         func.__name__, 1000*dt/ncalls, dt, ncalls)
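The same idea in Python 3 syntax, as a sketch. `time.clock()` no longer exists in current Python, so this version assumes `time.perf_counter()` as the replacement, and it returns the elapsed time in addition to printing it (the return value is an addition for convenience).

```python
import time

def timing(func, arg, ncalls=100):
    """Call func(arg) ncalls times and report the average time per call."""
    t0 = time.perf_counter()
    for _ in range(ncalls):
        func(arg)
    dt = time.perf_counter() - t0
    print("%s: %.3f ms/call (%.3f seconds / %d calls)" % (
        func.__name__, 1000 * dt / ncalls, dt, ncalls))
    return dt
```

A call such as `timing(sorted, list(range(1000)))` prints one summary line for that function.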
42 How to hand-optimize code
- import string, types
- def dictser(dict, ListType=types.ListType, isinstance=isinstance):
-     L = []
-     group = dict.get("main")
-     if group:
-         for key in group.keys():
-             value = group[key]
-             if isinstance(value, ListType):
-                 for item in value:
-                     L.extend([" ", key, " ", item, "\n"])
-             else:
-                 L.extend([" ", key, " ", value, "\n"])
-     ...
-     return string.join(L, "")
43 When NOT to optimize code
- Usually
- When it's not yet working
- If you care about maintainability!
- "Premature optimization is the root of all evil" (well, almost)
44 THREAD PROGRAMMING
45 Which API?
- thread - traditional Python API
- import thread
- thread.start_new(doit, (5,))
- (can't easily wait for its completion)
- threading - resembles Java API
- from threading import Thread   # and much more...
- t = Thread(target=doit, args=(5,))
- t.start()
- t.join()
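A runnable sketch of the `threading` form in Python 3 syntax. The body of `doit()` and the `results` list are hypothetical; the slide does not say what `doit` does.

```python
from threading import Thread

results = []

def doit(n):
    # Hypothetical work: record the square of the argument.
    results.append(n * n)

t = Thread(target=doit, args=(5,))
t.start()
t.join()    # blocks until doit() has returned
```

After `join()` returns, the thread has finished and its result is safe to read.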
46 Atomic operations
- Atomic:
- i = None
- a.extend([x, y, z])
- x = a.pop()
- v = dict[k]
- Not atomic:
- i = i+1
- if not dict.has_key(k): dict[k] = 0
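A non-atomic update such as `i = i+1` can be protected with a lock. This is a sketch in Python 3 syntax; the counter, the thread count, and the iteration count are arbitrary choices for illustration.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        with counter_lock:       # make the read-modify-write indivisible
            counter += 1

threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held around each increment, the final count equals the total number of increments requested.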
47 Python lock objects
- Not reentrant:
- lock.acquire(); lock.acquire()   # i.e. twice!
- blocks until another thread calls
- lock.release()
- No "lock owner"
- Solution:
- threading.RLock class
- (more expensive)
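The difference can be observed without deadlocking by asking for the lock in non-blocking mode. A sketch in Python 3 syntax; the variable names are hypothetical.

```python
import threading

plain = threading.Lock()
plain.acquire()
# A second blocking acquire() here would hang; ask without blocking instead.
second_try = plain.acquire(blocking=False)
plain.release()

reentrant = threading.RLock()
reentrant.acquire()
# The owning thread may acquire an RLock again.
nested = reentrant.acquire(blocking=False)
reentrant.release()
reentrant.release()   # one release per successful acquire
```

The plain lock refuses the second acquire, while the `RLock` grants it to the thread that already holds it.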
48 Critical sections
- lock.acquire()
- try:
-     "this is the critical section"
-     "it may raise an exception..."
- finally:
-     lock.release()
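In current Python the same acquire/try/finally pattern is usually written with a `with` statement, which did not exist when these slides were written. A sketch; the `risky()` function is hypothetical.

```python
import threading

lock = threading.Lock()

def risky():
    with lock:   # acquire on entry, release on exit, even on an exception
        raise ValueError("failed inside the critical section")

try:
    risky()
except ValueError:
    pass

released = not lock.locked()
```

The lock is released even though the critical section raised, which is exactly what the `finally` clause guarantees.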
49 "Synchronized" methods
- class MyObject:
-     def __init__(self):
-         self._lock = threading.RLock()
-         # or threading.Lock(), if no reentrancy needed
-     def some_method(self):
-         self._lock.acquire()
-         try:
-             "go about your business"
-         finally:
-             self._lock.release()
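A sketch of why the `RLock` matters when one locked method calls another, in Python 3 syntax. The `Counter` class is hypothetical.

```python
import threading

class Counter:
    def __init__(self):
        self._lock = threading.RLock()
        self._value = 0

    def increment(self):
        with self._lock:
            self._value += 1
            return self._value

    def increment_twice(self):
        with self._lock:              # re-entered by increment() below
            self.increment()
            return self.increment()
```

With a plain `Lock`, `increment_twice()` would block forever on the nested acquire.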
50 Worker threads
- Setup:
- def consumer(): ...
- def producer(): ...
- for i in range(NCONSUMERS):
-     thread.start_new(consumer, ())
- for i in range(NPRODUCERS):
-     thread.start_new(producer, ())
- "now wait until all threads done"
51 Shared work queue
- Shared:
- import Queue
- Q = Queue.Queue(0)   # or maxQsize
- Producers:
- while 1:
-     job = make_job()
-     Q.put(job)
- Consumers:
- while 1:
-     job = Q.get()
-     finish_job(job)
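A complete, terminating sketch of the worker-thread setup with a shared queue, in Python 3 syntax (the module is now named `queue`). The sentinel object used for shutdown, the doubling "job", and the `results` list are assumptions; the slides leave shutdown and job contents open.

```python
import queue
import threading

NCONSUMERS = 3
jobs = queue.Queue()
results = []
results_lock = threading.Lock()
_DONE = object()   # hypothetical sentinel telling a consumer to stop

def consumer():
    while True:
        job = jobs.get()
        if job is _DONE:
            break
        with results_lock:
            results.append(job * 2)   # stand-in for finish_job(job)

workers = [threading.Thread(target=consumer) for _ in range(NCONSUMERS)]
for w in workers:
    w.start()
for job in range(10):                 # the main thread acts as producer
    jobs.put(job)
for _ in workers:                     # one sentinel per consumer
    jobs.put(_DONE)
for w in workers:
    w.join()                          # wait until all threads are done
```

The queue handles its own locking; the extra lock only protects the shared `results` list.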
52 Using a list as a queue
- Shared:
- Q = []
- Producers:
- while 1:
-     job = make_job()
-     Q.append(job)
- Consumers:
- while 1:
-     try:
-         job = Q.pop()
-     except IndexError:
-         time.sleep(...)
-         continue
-     finish_job(job)
53 Using a condition variable
- Shared:
- Q = []
- cv = Condition()
- Producers:
- while 1:
-     job = make_job()
-     cv.acquire()
-     Q.append(job)
-     cv.notify()
-     cv.release()
- Consumers:
- while 1:
-     cv.acquire()
-     while not Q:
-         cv.wait()
-     job = Q.pop()
-     cv.release()
-     finish_job(job)
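A bounded, runnable sketch of the condition-variable pattern in Python 3 syntax. The fixed job count and the `finished` list are assumptions added so the example terminates; the slide's loops run forever.

```python
import threading

Q = []
cv = threading.Condition()
finished = []

def producer(njobs):
    for job in range(njobs):
        with cv:                 # acquire, append, notify, release
            Q.append(job)
            cv.notify()

def consumer(njobs):
    for _ in range(njobs):
        with cv:
            while not Q:         # re-check after every wakeup
                cv.wait()
            job = Q.pop()
        finished.append(job)     # stand-in for finish_job(job), done unlocked

c = threading.Thread(target=consumer, args=(5,))
p = threading.Thread(target=producer, args=(5,))
c.start()
p.start()
p.join()
c.join()
```

The `while not Q` loop, rather than a single `if`, guards against waking up when another consumer has already taken the job.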
54 TIME FOR DISCUSSION