Lecture 1: Introduction

Provided by: ryersonCa

1
Lecture 1: Introduction
  • Bits and pieces

2
Course Bureaucracy
  • Main Website: http://cps125.scs.ryerson.ca/
  • Outline, schedule and rules are whatever is said
    there.
  • My Course Website, including these slides:
    http://www.scs.ryerson.ca/ikokkari/cps125.html
  • But download and read through the official slides
    also
  • My name: Ilkka Kokkarinen
  • My email: ilkka.kokkarinen_at_gmail.com
  • Office hours: Tuesday 10AM at ENG242
  • Do not send me your weekly labs. I am not your
    TA. I don't know your TA. I don't have the power
    to order your TA to do anything.

3
Grading
  • Your course grade is the minimum of two numbers,
    the exam grade and the achievement grade
  • The exam grade is simply the sum of your midterm
    and final exam marks (both on scale 0 to 50)
  • The achievement grade is determined by the labs
    and programming projects that you successfully
    submit
  • See CMF for precise thresholds
  • In effect your AG works as a cutter for the total
    grade that you get for this course
  • There is no gain in copying or buying your labs
    and projects, if you can't pass the exams

4
Achievement Grades
  • During the course, you can unlock up to four
    achievements
  • Each programming project is one achievement
  • Getting 50% of weekly lab marks is one
    achievement
  • Getting 75% of weekly lab marks is one
    achievement
  • Your baseline achievement grade is 0, and
    unlocking achievements increases that
  • To be an A student, you need three achievements
  • To be an A+ student, you need all four
  • To be a C student, you need just one

5
Computational Problems
  • A computational problem (CP) consists of a set of
    possible inputs and their expected correct
    results
  • Example: Is the integer x a prime number?
  • The CP itself does not have a correct answer;
    giving its placeholders actual values gives an
    instance of that problem, and each instance has a
    definite answer
  • Every interesting CP has an infinite number of
    possible instances (otherwise just tabulate and
    look up results)
  • In a decision problem, the answer for each
    instance is always either true or false
  • Generally, a CP can be a function from any set of
    inputs to any set of possible results (typically
    integers or some finite subset thereof)

6
Algorithms
  • An algorithm is a well-defined series of
    operations whose execution solves some
    computational problem
  • An algorithm consists of finite and deterministic
    individual operations that the entity executing
    the algorithm can perform
  • For any instance of the CP solved by the
    algorithm, the same algorithm produces the
    correct answer in finite time
  • The decisions and exact steps taken during
    executing the algorithm depend on the instance
  • An infinite CP is essentially compressed into a
    finite representation as an algorithm
  • CPs are uncountable, algorithms are countable;
    therefore most CPs can't have algorithms to
    solve them

7
Example: Euclid's GCD
  • Problem: Given two integers a and b, their
    gcd(a, b) is the largest number that exactly
    divides both
  • gcd(45, 20) = 5, gcd(45, 18) = 9, gcd(50, 25) =
    25
  • Important in number theory, interesting
    applications
  • Oldest nontrivial algorithm in history discovered
    by the Greek mathematician Euclid (around 300 BC)
  • Assume a > b. To compute gcd(a, b), if b is 0,
    you can stop, and return a as the answer to the
    problem.
  • Otherwise, compute gcd(b, a mod b) and return
    that.
  • Example: gcd(45, 20) = gcd(20, 5) = gcd(5, 0)
  • Actually, Euclid originally used the equivalent
    but far less efficient formula gcd(a, b) = gcd(a
    - b, b)
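The two steps above translate almost word for word into C. This is a minimal sketch rather than the official course solution; the function name and the choice of unsigned int are assumptions made for illustration.

```c
/* Euclid's algorithm in its "mod" form, as described on the slide.
   The name gcd and the unsigned int type are illustrative choices. */
unsigned int gcd(unsigned int a, unsigned int b)
{
    if (b == 0) {
        return a;            /* base case: gcd(a, 0) is a */
    }
    return gcd(b, a % b);    /* otherwise solve the smaller instance */
}
```

Calling gcd(45, 20) follows the same chain as the example: gcd(20, 5), then gcd(5, 0), which returns 5.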

8
Executing the Algorithm
  • Why does Euclid's algorithm work? (Number Theory
    101.)
  • But you don't need to understand or prove
    that the algorithm works correctly to be able to
    execute it!
  • Executing the individual steps of the algorithm
    exactly as they are given is enough to produce
    the correct answer
  • Algorithms can be executed by computers designed
    to execute instructions at tremendous speed, but
    have no understanding of why they are doing so
  • Algorithms are self-contained all the
    "intelligence" required to solve the problem
    resides in the algorithm
  • Additional understanding is gravy, but does not
    and cannot affect the execution, as the algorithm
    never relies on its executor to make any
    decisions for it

9
Example: Primality testing
  • Problem: Is the positive integer x a prime
    number?
  • Brute force algorithm: loop through all of the
    potential divisors 2, 3,..., x - 1 and check if
    any one of them divides x
  • If you find even one divisor, stop and answer
    false, no need to check any other potential
    divisors since they can't change the answer
  • Having checked all divisors, stop and answer true
  • Speed up this algorithm by checking only odd
    divisors after 2 (and even more by checking only
    prime divisors)
  • Also, only need to check up to the square root of
    x
  • For the same CP, many different algorithms can
    exist, and their execution times can vary by
    orders of magnitude
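A possible C sketch of the sped-up version (odd divisors only, stopping at the square root) is shown here. The function name is an assumption, and treating 0 and 1 as non-prime is a convention the slide does not spell out.

```c
#include <stdbool.h>

/* Trial division with the two speed-ups from the slide.
   is_prime is a hypothetical name chosen for this sketch. */
bool is_prime(unsigned int x)
{
    if (x < 2) {
        return false;               /* 0 and 1 are not prime */
    }
    if (x == 2) {
        return true;
    }
    if (x % 2 == 0) {
        return false;               /* even numbers above 2 */
    }
    /* d <= x / d means d * d <= x, written to avoid overflow */
    for (unsigned int d = 3; d <= x / d; d += 2) {
        if (x % d == 0) {
            return false;           /* one divisor is enough to stop */
        }
    }
    return true;                    /* no divisor found */
}
```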

10
Operations That We Need
  • Turns out it doesn't matter much which specific
    operations we are given to build our algorithms
    from
  • We don't need very much to achieve computational
    universality and compute any computable function 
  • Some way to pass data back and forth, some
    arithmetic
  • Decisions (if-else)
  • Loops (do-while, repeat-until)
  • Church-Turing Thesis: Every computational device,
    physical or mathematical, can solve only problems
    that an extremely primitive Turing machine is
    able to solve
  • As long as they don't run out of physical memory,
    all computers are equivalent, except for their
    speed
  • Don't need different machines for different tasks

11
Generalized Notions of Algorithm 
  • Some algorithms may be probabilistic in that they
    occasionally flip a coin to decide which of the
    possible ways to proceed, but can still be
    guaranteed to return the correct answer with high
    probability
  • Running a probabilistic algorithm several times
    in a row (or in parallel) and combining the
    results can be used to further improve its
    accuracy
  • A heuristic algorithm works correctly and
    efficiently for common inputs, but may give a
    wrong answer for some rare and pathological
    inputs
  • Distributed algorithms require a large number of
    processors that occasionally communicate with
    each other, and the algorithm produces global
    result from local knowledge

12
Recursion
  • Recursion: solve a self-similar problem by
    reducing it into a simpler version of the very
    same problem, until the problem has become simple
    enough to solve on the spot
  • Example: gcd(a, b) = gcd(b, a mod b) when a > b
  • Example: factorial function n! = 1 × 2 × ... ×
    (n - 1) × n
  • Can be solved without recursion by looping from 2
    to n and multiplying each number into the result
  • Alternatively, realize that n! = (n - 1)! × n
  • To compute n!, first compute (n - 1)! and after
    you have that, multiply it by n, and there is
    your result
  • To avoid infinite regress, when a problem is
    simple enough, just look up and return its simple
    answer
  • For the factorial, these base cases are 1! = 0! =
    1
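The recursive definition can be sketched in C as follows. The function name and the unsigned long long result type are assumptions; with a 64-bit result the function is only correct up to 20!.

```c
/* Recursive factorial: n! = (n - 1)! * n, with base cases 0! = 1! = 1.
   The result overflows a 64-bit unsigned long long for n > 20. */
unsigned long long factorial(unsigned int n)
{
    if (n <= 1) {
        return 1;                    /* base case, looked up directly */
    }
    return factorial(n - 1) * n;     /* simpler instance first, then multiply */
}
```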

13
Computer languages
  • Processor is a device designed to execute
    instructions of machine code at very high speed
  • Each machine code instruction is extremely
    simple, but together they are computationally
    universal
  • Programming in assembly is possible (and still
    often done), but too complicated for most
    practical problems
  • Solution: artificial high-level languages whose
    concepts reside at far higher level of
    abstraction, and are more convenient for us
    humans to think and solve problems in
  • C, Java, Basic, VB, C, JavaScript, Python, Lisp
    etc. etc. 
  • Here we learn and use C, the lowest-level
    high-level programming language in widespread use

14
Compilers
  • Processor can't execute high-level languages
    directly
  • A compiler is a special program that reads in a
    source code program in high-level language, and
    emits an equivalent program that is written in
    machine code
  • Executing the machine code program produces the
    same results as executing the high-level
    program would have produced according to
    the semantics of that language 
  • The compiler program that the processor executes
    is also a series of machine code instructions,
    and the processor doesn't "know" that it is
    specifically executing a compiler
  • Where did this compiler come from?
  • Where did the very first compiler come from?

15
When Things Go Wrong
  • Syntax error: The program does not conform to the
    syntactic rules of the language, and is rejected
  • Type error: The program tries to use some data in
    violation of what is possible for the type of
    that data
  • Runtime error: When run, the program tries to do
    something that is logically or physically
    impossible, and is forcefully terminated
    (crashes)
  • Logic error: The program is legal, it runs and
    returns a result... it's just that this result is
    incorrect
  • No compiler can possibly detect, let alone
    prevent or silently correct, your logic errors
  • To be able to do that, the compiler would have to
    be able to read your mind to determine what you
    meant to say, as opposed to what you actually did
    say
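As an illustration of a logic error, consider this made-up pair of C functions (both names are hypothetical). Both compile without complaint and run without crashing, but only one computes what the programmer presumably meant.

```c
/* Intended: the integer average of a and b.
   Logic error: division binds tighter than addition,
   so this computes a + (b / 2). */
int buggy_average(int a, int b)
{
    return a + b / 2;
}

/* What the programmer meant to say. */
int average(int a, int b)
{
    return (a + b) / 2;
}
```

For the inputs 4 and 6, buggy_average returns 7 while average returns 5; the compiler has no way of knowing which one was intended.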

16
Dynamic vs. Static
  • While converting the program into machine code,
    the compiler also tries to ensure that whatever
    the program does is legal, logically possible and
    safe
  • Static checks happen at compile time, and are
    designed to guarantee that certain runtime errors
    cannot occur
  • Static checking also makes actual execution
    faster
  • No amount of static checking can possibly prevent
    all runtime errors, so some runtime errors can
    remain
  • We must ultimately execute the program to find
    out what it actually does (for a particular input
    instance)
  • Dynamic checks happen while the program is
    actually being executed for some concrete
    instance of the CP

17
Interpreters
  • Some high-level languages (far above C) are
    infeasible to fully compile into machine code
  • Programs written in such languages are executed
    in an interpreter that executes the program in
    its source form, a lot more like a human would
    mentally execute it
  • Dynamic interpretation allows for high-level
    language features that can't be set in stone at
    the compile time, for example dynamic code
    generation
  • Interpreter for a low-level language (or even
    sort of machine code) is called a virtual machine
  • A program, no matter in what language, can't
    "know" whether it is being executed in an
    interpreter, or whether it has been translated
    into machine code (or even some other high-level
    language) and executed there

18
RAM
  • Computer's random access memory essentially
    consists of a long row of bytes, each byte stored
    in its own address that the processor uses to
    access it
  • Each byte can store a small integer from 0 to 255
  • What each byte "means" depends on the context of
    what it is currently being used as; the byte
    itself doesn't "know" this
  • The exact same number 17 stored inside a single
    byte can be either a data value, or a part of a
    value, a machine code instruction, a character...
  • In a von Neumann machine, programs are not
    hardcoded into the hardware, but all program code
    is also data
  • Universal processor can execute any series of
    bytes as machine code instructions

19
RAM vs. Secondary Storage
  • Secondary storage (hard disks) used to extend RAM
  • RAM is volatile, hard disks are persistent
  • Secondary storage also orders of magnitude
    cheaper
  • So why have RAM at all?
  • RAM is faster for the processor to read and write
  • RAM allows random access needed to execute
    programs efficiently, since both data and
    instructions tend not to be accessed in
    sequential order during execution
  • "Random access" really should be called
    "arbitrary access", but the term is historically
    stuck as is
  • Secondary storage works sequentially, in that it
    takes long time to jump to an arbitrary point,
    but reading from it is fast
  • For us humans, even the slowest storage is
    screaming fast

20
Arrays of Data
  • All interesting programs operate on lots of data,
    since there is only so much you can ask about one
    number
  • An array is a row of uniform data elements stored
    sequentially in memory
  • Random access memory allows us to read or write
    each element in same constant time
  • Location of the ith element in memory can be
    quickly computed as a + e × i from the start
    address of array a and the size e of an
    individual element
  • Contrast with linked list, where each element
    contains a memory address of the element
    following it in the chain
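The address formula a + e × i can be checked by hand in C. The helper below is a hypothetical illustration; in ordinary C code the compiler performs this computation for you whenever you write a[i].

```c
#include <stddef.h>

/* Computes the address of element i by hand: start + elem_size * i.
   element_address is an illustrative name, not a standard function. */
const void *element_address(const void *start, size_t elem_size, size_t i)
{
    /* char pointers advance one byte at a time */
    return (const char *)start + elem_size * i;
}
```

For an int array a, element_address(a, sizeof a[0], 3) gives the same address as &a[3].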

21
Classic Problem: Sorting an Array
  • Problem: given an array of elements, rearrange
    them to be in ascending sorted order
  • Interesting problem with multitude of algorithms
  • Sorted data makes many other algorithms easier
  • Simplest sorting algorithm is selection sort 
  • Repeatedly find the smallest element of the
    remaining elements, and swap it to the first
    place of these elements that remain
  • Note the self-similarity of this algorithm, when
    having placed the minimum of n elements, you are
    left with the problem of sorting an array of n -
    1 elements
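A minimal C sketch of selection sort as described above, written with a loop instead of recursion. The function name and the choice of int elements are assumptions.

```c
#include <stddef.h>

/* Selection sort: repeatedly swap the smallest remaining element
   to the front of the part that is still unsorted. */
void selection_sort(int a[], size_t n)
{
    for (size_t start = 0; start + 1 < n; start++) {
        size_t min = start;                 /* index of smallest so far */
        for (size_t j = start + 1; j < n; j++) {
            if (a[j] < a[min]) {
                min = j;
            }
        }
        int tmp = a[start];                 /* swap it into first place */
        a[start] = a[min];
        a[min] = tmp;
        /* what remains is the same problem with one element fewer */
    }
}
```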

22
Operating Systems
  • All programs tend to have plenty of functionality
    in common: I/O, file handling, Internet access,
    GUI...
  • The role of operating system is to provide these
    services for programs for free, so that every
    programmer is not forced to always reinvent the
    wheel
  • Multitasking of several processes, memory
    management
  • Important purpose of OS and its device drivers is
    to hide the differences of the underlying
    hardware
  • OS provides a unified interface (kind of a
    "virtual machine") that programs "see",
    regardless of the actual hardware that is working
    underneath
  • Encapsulation: hide all implementation details of
    your system under a unified interface

23
Positional Number Systems
  • In daily life, we represent numbers in base 10
    so that the position of the digit determines its
    magnitude
  • The same digit 7 can stand for "seven" or "seven
    trillion"
  • For example, $705 = 7 \times 10^2 + 0 \times 10^1 + 5 \times 10^0$
  • However, there is nothing magical about base ten,
    other than we humans have ten fingers, so any
    other integer n would work as base just as well
  • With base n, the possible digits allowed are 0,
    ..., n - 1
  • Binary base of two uses bits 0 and 1 
  • $00011011_2 = 2^4 + 2^3 + 2^1 + 2^0 = 16 + 8 + 2 + 1 = 27$
  • It's still the same integers, but they are far
    easier to store and represent in an electronic
    computer than in base ten
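One way to see that the base is arbitrary is to evaluate a digit string in any base with the same loop. This is a sketch under the assumption that only the digit characters '0' to '9' are used, so the base must be between 2 and 10; the function name is made up for illustration.

```c
/* Evaluates a string of digits in the given base (2 to 10).
   Each step shifts the value so far one position left
   (multiply by base) and adds the next digit. */
unsigned long parse_digits(const char *digits, unsigned int base)
{
    unsigned long value = 0;
    for (; *digits != '\0'; digits++) {
        value = value * base + (unsigned long)(*digits - '0');
    }
    return value;
}
```

With this, parse_digits("705", 10) gives 705 and parse_digits("00011011", 2) gives 27, matching the two worked examples.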

24
More on Positional Number Systems
  • Arithmetic algorithms such as addition,
    subtraction, multiplication work the exact same
    way in all bases
  • Dividing and multiplying by base shift digits
    right and left
  • The larger the base, the shorter the numbers get,
    but basic arithmetic operations get more complex
  • The smaller the base, the longer the numbers get
  • Base ten is a good compromise for human brain
    (actually, if we could reboot humanity, base 4
    might be optimal)
  • In a mixed-radix system, base depends on position
  • Example: "3 days, 6 hours, 22 minutes, and 17
    seconds" uses positional bases (30, 24, 60, 60)
  • Negative numbers, real numbers or even complex
    numbers could also work as (pretty weird) bases
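The mixed-radix example can be flattened into a plain count of seconds with the same shift-and-add idea, except that the multiplier changes with the position. The function below is a hypothetical sketch of that conversion.

```c
/* Converts a (days, hours, minutes, seconds) mixed-radix value
   into seconds; the bases 24, 60 and 60 apply position by position. */
unsigned long to_seconds(unsigned long days, unsigned long hours,
                         unsigned long minutes, unsigned long seconds)
{
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}
```

The duration from the slide, 3 days, 6 hours, 22 minutes and 17 seconds, comes out as 282137 seconds.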

25
Bits and Bytes
  • In computer memory, one byte consists of eight
    bits
  • Unsigned byte can store values from 0 to 255
  • A half-byte (nibble) can store values from 0 to
    15
  • To represent larger numbers too big to fit into a
    single byte, pool together two, four or eight
    bytes
  • Once again, a byte doesn't "know" what its value
    represents, or whether that byte currently
    happens to be a part of some larger cluster of
    bytes that together represent something
  • In computing, all meaning is always imposed from
    outside / above, things themselves have no
    semantics
  • Modern processors can address memory several
    bytes at a time, although they usually require
    words of four (eight) bytes to start at an
    address divisible by four (eight)

26
Hexadecimal Numbers
  • For low-level representation of integers,
    hexadecimal system of base 16 is often more
    convenient than binary
  • Need six more digits than in base ten, for which
    we use letters A to F, so that digits are
    0123456789ABCDEF
  • Convenient to convert from binary to hex and vice
    versa, since each hex digit depends on and
    represents four particular bits, and is not
    affected by other bits in number
  • This works because $2^4 = 16$
  • For example, the famous hex number 0xDEADBEEF =
    1101 1110 1010 1101 1011 1110 1110 1111
  • RGB colours are often given in hex, with one byte
    for each of the red, green and blue components
    (sometimes fourth byte for transparency)
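The digit-by-digit correspondence between hex and binary can be made visible with a small C helper that writes out a 32-bit value in groups of four bits. The function name and the fixed 40-character buffer are assumptions of this sketch.

```c
#include <stdint.h>

/* Writes the 32 bits of value into out as eight groups of four bits
   separated by spaces. out needs room for 32 bits + 7 spaces + '\0'. */
void to_binary_nibbles(uint32_t value, char out[40])
{
    int pos = 0;
    for (int bit = 31; bit >= 0; bit--) {
        out[pos++] = ((value >> bit) & 1u) ? '1' : '0';
        if (bit % 4 == 0 && bit != 0) {
            out[pos++] = ' ';        /* one group per hex digit */
        }
    }
    out[pos] = '\0';
}
```

Applied to 0xDEADBEEF, each group of four bits lines up with one hex digit, starting with 1101 for D.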

27
Negative Integers
  • In the computer memory, each bit can be 0 or 1,
    but not a minus sign or some magical "here the
    number ends" mark
  • The simplest way to represent negative integers
    of known fixed size is to treat the highest-order
    bit as its sign
  • In one byte, -7 would be 1000 0111
  • In two bytes, -7 would be 1000 0000 0000 0111
  • However, this representation has two zeros: 0 and
    -0
  • Better to store negative integers in two's
    complement
  • To negate a positive integer, flip all its bits
    to their opposite values, and then add 1,
    carrying as far as needed
  • For example, since 7 is 0000 0111, negating its
    bits gives 1111 1000, and adding one gives 1111
    1001 for -7
  • Hardware does all this automatically
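The flip-and-add-one rule can be tried out on a single byte in C. This sketch uses the fixed-width type uint8_t from <stdint.h>; the function name is an assumption.

```c
#include <stdint.h>

/* Two's complement negation of one byte:
   flip all eight bits, then add one (any carry out is discarded). */
uint8_t negate8(uint8_t x)
{
    return (uint8_t)((x ^ 0xFFu) + 1u);
}
```

For 7 (0000 0111) this gives 0xF9 (1111 1001), the same byte the hardware produces when -7 is stored in eight bits, and negating 0xF9 again gives back 7.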

28
Fixed Point Decimal
  • In base ten, the decimal part of the number is
    represented as negative powers of ten, listed
    positionally after a special decimal point
    character
  • For example, $12.74 = 1 \times 10^1 + 2 \times 10^0 + 7 \times 10^{-1} + 4 \times 10^{-2}$
  • Same idea can be used to represent decimal
    numbers with bits by assigning an implicit
    decimal point at some location, and treating bits
    right of that point as negative powers of 2
  • For example, $1110.1010_2 = 8 + 4 + 2 + 1/2 + 1/8 = 14.625$
  • Fixed point representation gives us uniform
    precision through its range of possible values,
    but we need more precision for small numbers than
    for large numbers
  • 0.001 and 0.00000001 are vastly different
    numbers, but if you add a billion to both, they
    are pretty much the same
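A fixed point value is just an integer with an agreed-upon position for the point. The sketch below assumes a one-byte format with four bits on each side of the point, matching the 1110.1010 example; the format and the function name are illustrative choices, not a standard.

```c
#include <stdint.h>

/* Interprets a byte as fixed point with 4 integer bits and
   4 fraction bits: the stored integer divided by 2^4. */
double fixed44_to_double(uint8_t raw)
{
    return raw / 16.0;
}
```

The bit pattern 1110 1010 is the integer 234 (0xEA), and 234 / 16 is 14.625.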

29
Scientific notation
  • In scientific notation, numbers are represented
    in the format $s \times m \times b^e$, where s is
    the sign, m is the mantissa, b is the base, and e
    is the exponent that determines the magnitude of
    the number
  • Sign is either 1 or -1 (when writing out the
    number, the sign is usually coalesced into the
    mantissa), and mantissa is normalized between 1
    and b to make the representation of each number
    unique
  • For example, $-1.655 \times 10^{23}$ (but not $-16.55 \times 10^{22}$)
  • In raw text where superscripts are not possible,
    the letter e is used to denote the power of 10,
    so the previous number would be written out as
    -1.655e23

30
IEEE 754 Floating Point Decimal
  • Using base 2 instead of base 10 again changes
    nothing
  • IEEE 754 standard defines encoding of decimal
    numbers into either 32 bits (single precision) or
    64 bits (double)
  • Numbers are still of the form $s \times m \times b^e$,
    but now b = 2
  • Mantissa m must be a decimal number that is at
    least 1 but strictly less than 2
  • First bit gives the sign (0 for +, 1 for -)
  • Next 8 bits are the exponent biased by 127 to
    allow us to represent it as positive integer in
    the usual fashion
  • The last 23 bits encode the mantissa in fixed
    point
  • Since mantissa is 1 point something, there is no
    need to store the 1, as we can just store the
    decimal part

31
Example
  • Let us encode the number 9.0 in IEEE single
    precision.
  • The sign is +, so the first bit is 0. (That was
    easy.)
  • Splitting 9.0 into the form $m \times 2^e$ under
    the constraints that e has to be an integer and m
    has to be at least 1 but strictly less than 2, we
    get e to be 3 (since $2^4$ would be 16, which
    would logically entail m < 1, violating the
    constraint)
  • Biasing this by 127, the encoded exponent will
    be 130
  • Solving for m, we get $m = 9/2^3 = 9/8 = 1 + 1/8$
  • $130 = 128 + 2 = 1000\,0010_2$
  • $8 = 2^3$, so 1/8 is the third bit from the
    beginning of the mantissa
  • Now we can encode this number in 32 bits (sign,
    exponent, mantissa): 0100 0001 0001 0000 0000 0000
    0000 0000
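The hand encoding above can be compared with what the machine actually stores. This sketch assumes that float is the 32-bit IEEE 754 single precision type, which is the case on practically all current platforms but is not required by the C standard. The helper names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Copies the raw bits of a float into a 32-bit integer.
   Assumes float is IEEE 754 single precision. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The three fields: 1 sign bit, 8 exponent bits, 23 mantissa bits. */
unsigned int sign_field(uint32_t bits)     { return bits >> 31; }
unsigned int exponent_field(uint32_t bits) { return (bits >> 23) & 0xFFu; }
uint32_t     mantissa_field(uint32_t bits) { return bits & 0x7FFFFFu; }
```

For 9.0f the raw bits are 0x41100000, which is the pattern 0100 0001 0001 0000 ... derived above: sign 0, biased exponent 130, and a mantissa field whose only set bit is the third one from the beginning.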

32
A Systematic Way
  • Example: massage the number 11.5 into a form that
    is easy to convert to either single or double
    precision
  • First, write this number as a sum of powers of
    two, giving us the sum
    $11.5 = 8 + 2 + 1 + 1/2 = 2^3 + 2^1 + 2^0 + 2^{-1}$
  • As written, this sum is not in the form
    $m \times 2^e$, where e is an integer and m is
    between 1 and 2
  • However, dividing this by its highest term $2^3$
    and then multiplying it by $2^3$ does not change
    its value
  • $11.5 = (2^3 + 2^1 + 2^0 + 2^{-1}) / 2^3 \times 2^3 =$
  • $(1 + 2^{-2} + 2^{-3} + 2^{-4}) \times 2^3$ is in
    the required form $m \times 2^e$, and even allows
    us to directly read the bits of the mantissa
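The same normalization can be done mechanically by halving or doubling until the mantissa lands between 1 and 2. The function below is a simple sketch for positive inputs only; its name and interface are assumptions.

```c
/* Splits a positive x into m * 2^e with 1 <= m < 2.
   Returns m and stores e through the pointer.
   Dividing or multiplying by 2 is exact in binary floating point. */
double normalize(double x, int *e)
{
    *e = 0;
    while (x >= 2.0) {      /* too big: move one power of 2 into e */
        x /= 2.0;
        (*e)++;
    }
    while (x < 1.0) {       /* too small: take one power of 2 out of e */
        x *= 2.0;
        (*e)--;
    }
    return x;
}
```

For 11.5 this yields m = 1.4375 and e = 3, and 1.4375 is exactly 1 + 1/4 + 1/8 + 1/16, as in the derivation above.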

33
Range of Floating Point
  • Unlike fixed point encoding, floating point can
    represent a far wider range of possible values,
    and gives us more precision where it is needed
    the most (around the zero)
  • Double precision works the exact same way but
    uses a total of 64 bits, with a bias 1023 to
    store the exponent in 11 bits, leaving 52 bits
    for the mantissa
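  • After removing the bias, the exponent field spans
    -1023 to 1024; the two extremes are reserved (for
    zero and subnormals, and for infinities and NaN),
    so normal numbers use exponents -1022 to 1023,
    allowing us to represent pretty huge numbers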
  • Modern processors execute floating point
    arithmetic directly on hardware, making it as
    fast as integer arithmetic
  • Quadruple precision uses 128 bits, with a
    whopping 15 bits for exponent and 112 bits for
    mantissa

34
Special Numbers
  • Just like we can't represent 1/3 as an exact
    finite-length decimal number, even some simple
    decimal numbers (for example, 0.1) have no exact
    representation as float
  • In fact, the normalized form can't even represent
    zero, since its mantissa is always at least 1!
  • Zero is simply defined as the special case so
    that all bits of both mantissa and exponent are
    set to 0
  • The standard thus features both 0 and -0 (these
    are meaningful to distinguish as limits)
  • The standard also defines special values for both
    positive and negative infinities, and two kinds
    of NaN values (quiet and signaling) resulting
    from operations that have no meaningful result
    otherwise

35
Limitations of Floating Point
  • Integer arithmetic is exact: two plus two equals
    exactly four, not even one infinitesimal fluxion
    or iota more or less
  • However, integer operations can overflow or
    underflow if the result doesn't fit in the fixed
    size
  • No matter what the encoding, you can represent at
    most $2^{64}$ different numbers in eight bytes (64
    bits) of memory
  • This astronomical number is minuscule compared to
    how many values even simple arithmetic could
    produce
  • When using floating point, intermediate results
    may be a little bit off from what they would be
    in infinite precision
  • Always round decimal numbers when you output
    them, and always test them for equality within a
    suitable tolerance
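A tolerance comparison might look like the following sketch. The function name and the idea of passing an absolute tolerance are assumptions; a suitable tolerance depends on the magnitudes involved, and real programs often use a relative tolerance instead.

```c
#include <stdbool.h>

/* True when a and b differ by at most the given absolute tolerance. */
bool nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;
    if (diff < 0) {
        diff = -diff;           /* absolute value of the difference */
    }
    return diff <= tolerance;
}
```

On typical IEEE 754 hardware 0.1 + 0.2 does not compare equal to 0.3 with ==, but nearly_equal(0.1 + 0.2, 0.3, 1e-9) is true.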

36
Subnormal Numbers
  • Subnormal numbers use the exponent 0 to represent
    really small (as in, close to zero) numbers
  • Making also mantissa 0 then represents the 0
  • Thanks to subnormal numbers, the standard
    guarantees that for any two different finite
    numbers x and y, their difference x - y is
    nonzero
  • Without subnormal numbers, the difference of two
    tiny normal numbers could be too small to
    represent in this system, and would then be 0
  • The machine epsilon is the smallest positive
    number x so that 1 + x and 1 are not equal
  • Unit in the last place (ulp) is the gap between a
    floating point number and the next representable
    number of that magnitude
  • Each individual arithmetic operation is
    guaranteed to have a maximum error of half ulp
    from its true value
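Machine epsilon can be estimated with the classic halving loop. This sketch finds the smallest power of two that still changes 1 when added to it, which is the value the C library calls DBL_EPSILON. It assumes IEEE 754 double precision with the default rounding mode; the volatile variable is there to keep the sum from being held in a wider register.

```c
#include <float.h>   /* DBL_EPSILON, the library's value, for comparison */

/* Halves eps until adding the next half to 1.0 no longer changes it. */
double machine_epsilon(void)
{
    double eps = 1.0;
    volatile double sum = 1.0 + eps / 2.0;   /* forced to double precision */
    while (sum != 1.0) {
        eps /= 2.0;
        sum = 1.0 + eps / 2.0;
    }
    return eps;
}
```

Under the stated assumptions the result equals DBL_EPSILON, which is 2 to the power -52 for doubles.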

37
Integers in Floating Point
  • Single precision IEEE 754 can represent all
    integers up to magnitude $2^{24}$ (double
    precision, up to $2^{53}$)
  • It can also represent all larger integers created
    by multiplying any of these integers by $2^e$,
    where e is a positive integer up to 104
  • Amazingly enough, when the bits of a positive
    floating point number are viewed as an unsigned
    integer, adding one using the ordinary integer
    addition gives you the next representable
    floating point number
  • Puzzle does there exist a pattern of 32 bits so
    that it represents the exact same integer when
    viewed as an integer as it does viewed as a
    floating point number?
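The "add one to the bits" trick can be tried directly. This sketch assumes that float is IEEE 754 single precision and that the input is positive and finite; the function name is made up for illustration. It does not answer the puzzle.

```c
#include <stdint.h>
#include <string.h>

/* Returns the next representable float above a positive finite f
   by adding 1 to its bit pattern viewed as an unsigned integer. */
float next_float_up(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* view the float as an integer */
    bits += 1;                        /* ordinary integer addition */
    memcpy(&f, &bits, sizeof f);      /* view the result as a float again */
    return f;
}
```

Starting from 16777216.0f, which is 2 to the power 24, the next float up is 16777218.0f; the integer 16777217 in between has no single precision representation.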

38
Encoding Characters and Text
  • As we have seen, the computer memory can only
    store a sequence of bits that we then interpret
    as integers, decimal numbers, machine code
    instructions etc.
  • To store text, characters must be encoded to
    integers, so the text becomes a sequence of
    integers
  • A character encoding tells what character each
    number represents, and the font gives it a
    visible glyph to render on screen or print for us
    humans to read
  • Originally some variation of ASCII encoding (one
    byte per character, special characters wherever)
  • In the networked global world, use the Unicode
    standard
  • Indicate somehow where text begins and ends
  • In programming, a piece of text is called a string
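In C, a string is an array of small integers (type char) that ends with a zero byte, which is how the language marks where the text ends. The sketch below counts characters the same way the standard strlen function does; the name text_length is an assumption. The character codes mentioned afterwards assume an ASCII-compatible encoding.

```c
#include <stddef.h>

/* Counts the characters before the terminating zero byte. */
size_t text_length(const char *text)
{
    size_t n = 0;
    while (text[n] != '\0') {   /* the zero byte marks the end of the text */
        n++;
    }
    return n;
}
```

The string "Hi" is stored as the three bytes 72, 105 and 0 in ASCII, and text_length("Hi") is 2.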

39
 
  • "He who refuses to do arithmetic is doomed to
    talk nonsense."
  • John McCarthy
  • "Make no mistake about it: Computers process
    numbers - not symbols. We measure our
    understanding (and control) by the extent to
    which we can arithmetize an activity."
  • Alan Perlis
  • "The question of whether Machines Can Think... is
    about as relevant as the question of whether
    Submarines Can Swim."
  • Edsger W. Dijkstra