Title: The basic question of Computer Science
1 The basic question of Computer Science
- What is computable?
- By asking this we are trying to answer the question of what computers can and cannot do. What are their strengths? What are their limits?
- Shortest Answer:
  - a. We have to be able to write an algorithm for the problem solution.
  - b. The algorithm's cost cannot be too large: O(log n), O(n), O(n log n), O(n^k).
2 CS Concepts: Computational Power
- Computational power refers to the types of problems a computer can solve.
- It has nothing to do with how fast a computer is or how nice the graphics are.
- It turns out very simple computers are just as powerful as very expensive ones (although slower).
- So in terms of computational power, all computers are equal.
- This allows generalized solutions: solve a problem for one and we solve it for all.
- Since all computers are equal, we tend to focus on types of problems rather than types of computers.
- Is the problem computable?
3 Be able to Identify Patterns of Doubling/Halving
- Consider two math operations: taking a number to a power, and taking the log of a number.
- We will focus on the number 2 (binary, no surprise).

n     2^n          n      log2 n
0     1            1      0
1     2            2      1
2     4            4      2
3     8            8      3
8     256          256    8
9     512          512    9
10    1024         1024   10
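The doubling and halving pattern in the table can be reproduced with a short JavaScript sketch. The function names `powerOfTwo` and `halvingsToOne` are illustrative and not from the slides.

```javascript
// Doubling: start at 1 and double n times to reach 2^n.
function powerOfTwo(n) {
  let value = 1;
  for (let i = 0; i < n; i = i + 1) {
    value = value * 2;
  }
  return value;
}

// Halving: count how many times n can be halved before reaching 1.
// For exact powers of two, this count equals log2(n).
function halvingsToOne(n) {
  let count = 0;
  while (n > 1) {
    n = n / 2;
    count = count + 1;
  }
  return count;
}

console.log(powerOfTwo(8));      // 256
console.log(halvingsToOne(256)); // 8
```

Each function undoes the other: doubling 8 times gives 256, and 256 can be halved 8 times.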
4 Example: Color Depth
- How many colors are available if each pixel uses n bits?
- How many bits are required for 256 colors?
  - 8 bits, because log2 256 = 8.
  - That's one byte. Coincidence?

Bits  Values                            Colors
1     0 1                               Black, White
2     00 01 10 11                       Black, Dark Gray, Light Gray, White
3     000 001 010 011 100 101 110 111   Black, Blue, Green, Yellow, Red, Purple, Orange, White
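Both color-depth questions can be answered by a small calculation. The sketch below assumes every bit pattern maps to a distinct color; the names `colorsForBits` and `bitsForColors` are hypothetical.

```javascript
// Number of distinct colors available when each pixel uses `bits` bits.
function colorsForBits(bits) {
  return Math.pow(2, bits);
}

// Smallest number of bits per pixel that can represent `colors` colors.
function bitsForColors(colors) {
  let bits = 0;
  while (Math.pow(2, bits) < colors) {
    bits = bits + 1;
  }
  return bits;
}

console.log(colorsForBits(3));   // 8
console.log(bitsForColors(256)); // 8
```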
5 A Computer Model
Processor Model: A black box that performs specific instructions representing discrete arithmetic and logical operations.
Memory Model: Memory is composed of a single row of storage locations, each referenced by a numerical address.
6 Algorithm Basics
- Definition: A step-by-step procedure written so precisely that there is essentially no possibility for variation in its performance.
- Written in pseudo-code
  - Part English, part computer code
  - Generic enough to be easily translated into most computer languages.
- Five parts, each must be detailed
  - A list of inputs and their types
  - A list of other variables and their types
  - Initialization
  - Computation
  - A list of outputs and end conditions
- If given a simple algorithm, be able to discuss it and how it operates.
7 Why Algorithms are Important
- Algorithms have no mention of a computer.
  - Remember that CS focuses on the computation, not the machine.
  - Thus, an algorithmic solution to a problem means the problem is solved for every computer.
  - We'll find a fast machine if we have to.
- Each step in the algorithm was precise.
  - Therefore each step can be described and measured in terms of cost.
  - This helps us answer the two big questions: Can it be computed, and at what cost?
- Lastly, algorithms can be abstracted into functions.
8 Four Properties of Good Algorithms
- Correctness
  - The algorithm works correctly for all possible inputs.
- Precision
  - The algorithm's procedures are specified in such a way that the same actions are taken no matter who is performing it (notice what the book says about that on p. 32).
- Incremental Operation
  - Each step specified should consist of a single logical step.
- Abstraction
  - Abstraction allows many incremental steps to be logically grouped into a single step.
9 Example Algorithm Problems
- What inputs are required?
- If the program written from this algorithm was executed with x = 12, y = 10 as input, what would the output be?
- Pretend the algorithm was turned into a function, foo(x, y). If used in a program like below, what is the type of "myvar"?
  - myvar = foo(12, 10)
Algorithm foo(x, y)
  Inputs: Integer x, Number y
  Others: Number z
  Initialization: Assume x, y are given to us.
  Computation:
    IF ( y > x ) THEN
      z = y / 2
    ELSE
      z = (x + y) / 2
    ENDIF
  Output: return z
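One way to check answers to these questions is to translate the pseudo-code into JavaScript. This is a sketch under one loud assumption: the operator missing from the ELSE branch is taken to be addition, so that branch computes (x + y) / 2.

```javascript
// Direct translation of the pseudo-code for foo(x, y).
// ASSUMPTION: the ELSE branch is (x + y) / 2; the operator is not
// legible in the slide text, so addition is a guess.
function foo(x, y) {
  let z;
  if (y > x) {
    z = y / 2;
  } else {
    z = (x + y) / 2;
  }
  return z;
}

const myvar = foo(12, 10);
console.log(myvar);        // 11 (under the addition assumption)
console.log(typeof myvar); // "number"
```

With x = 12 and y = 10 the predicate y > x is false, so the ELSE branch runs. The result has the same type as z, which the algorithm declares as a Number.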
10 Definition of Function
Function: A function is a black box that takes input, operates on it, and produces output. A function is fully defined by three things: input, output, and a description relating the output to the input.
Example: The Add1 function
  Input - a number, x.
  Output - a number, F(x).
  Operation - F(x) = x + 1
11 Function Composition
- Function composition: when simple functions are combined to form more complex functions.
- Definition of Add2
  - Input: A number, x
  - Output: A number, F(x)
  - Operation: F(x) = x + 2
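A plausible reading of this slide is that Add2 is built by applying Add1 twice. The sketch below shows that composition. The helper `compose` and the function `add3` are hypothetical additions for illustration.

```javascript
// Add1: the simple building block.
function add1(x) {
  return x + 1;
}

// Add2: composed from two applications of Add1.
function add2(x) {
  return add1(add1(x));
}

// Hypothetical helper: builds the function "apply g, then f".
function compose(f, g) {
  return function (x) {
    return f(g(x));
  };
}

// A further composition, built from Add1 and Add2.
const add3 = compose(add1, add2);

console.log(add2(5)); // 7
console.log(add3(5)); // 8
```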
12 Why are Functions Important?
- Simple functions are the building blocks of all computation.
  - Remember our model of a Processor allows only simple operations.
  - Real Processors allow only simple operations.
- Simple functions are easily described.
- Iteration: One function repeatedly calling another.
- Recursion: One function repeatedly calling itself.
- We can build large programs by composing small functions.
- We can describe large programs by describing the functions that make them up and how they are composed.
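The difference between iteration and recursion, as defined above, can be seen by summing 1 through n in both styles. This is an illustrative sketch; the function names are not from the slides.

```javascript
// The simple function both versions are built from.
function add(a, b) {
  return a + b;
}

// Iteration: one function (sumIterative) repeatedly calling another (add).
function sumIterative(n) {
  let total = 0;
  for (let i = 1; i <= n; i = i + 1) {
    total = add(total, i);
  }
  return total;
}

// Recursion: one function repeatedly calling itself.
function sumRecursive(n) {
  if (n <= 0) {
    return 0; // end condition stops the self-calls
  }
  return add(n, sumRecursive(n - 1));
}

console.log(sumIterative(10)); // 55
console.log(sumRecursive(10)); // 55
```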
13 Functions are Abstract Algorithms
- Algorithms take in input, operate on it, and produce output.
- Functions take in input, operate on it, and produce output.
- When we speak of functions we focus on WHAT is being done.
- When we speak of algorithms we focus on HOW things are done.
- Algorithms can specify how functions produce their results.
- Many different algorithms may specify a single function.
- Functions can be thought of as abstract algorithms.
- Many algorithms can map to the same function.
14 Order Notation
- ORDER NOTATION allows us to relate computational cost to problem size.
- Most input data can have a size associated with it, like the size of a list.
- Be able to compare relative sizes of different orders:
  O(k) < O(log n) < O(n) < O(n log n) < O(n^k) < O(k^n) < O(n!)
  - Polynomial (tractable): O(k) through O(n^k)
  - Exponential, Factorial (intractable): O(k^n) and O(n!)
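To get a feel for this ordering, the sketch below evaluates each cost function at one problem size. It assumes k = 2 for both the polynomial and exponential cases; the names `factorial` and `costs` are illustrative.

```javascript
// n! computed with a simple loop.
function factorial(n) {
  let result = 1;
  for (let i = 2; i <= n; i = i + 1) {
    result = result * i;
  }
  return result;
}

// Cost of each order for a problem of size n, listed smallest to largest.
// ASSUMPTION: k = 2 is used for n^k and k^n.
function costs(n) {
  return [
    ["O(k)", 1],
    ["O(log n)", Math.log2(n)],
    ["O(n)", n],
    ["O(n log n)", n * Math.log2(n)],
    ["O(n^k)", Math.pow(n, 2)],
    ["O(k^n)", Math.pow(2, n)],
    ["O(n!)", factorial(n)]
  ];
}

for (const [name, cost] of costs(16)) {
  console.log(name + ": " + cost);
}
```

The ordering describes growth for large n. For very small n some of the later entries can be smaller than earlier ones.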
15 Why Use Order Notation
- The order (or cost) of a computation tells us whether it is computable or not.
  - All O(k) and O(n) problems are computable.
  - O(n^k) problems are computable (usually k = 1, 2, or 3).
  - O(k^n) problems are not computable (except for very small n).
- Order notation allows us to compare the efficiency of different algorithms that have been developed to solve the same problems.
- Order notation allows us to compare the efficiency of our algorithms with known costs for problems.
  - You don't want the cost of your algorithm to be larger than the known best costs for a problem.
  - You cannot make an algorithm's cost lower than the proven best cost.
16 Program Control: Introduction
- A list of program statements that always execute in sequence from top to bottom is called a BASIC BLOCK.
- Very few interesting programs can be written using only basic blocks.
- Interesting programs have the ability to react to different input data.
- PROGRAM CONTROL statements allow us to change how the program behaves based on decisions made about the input data.
- Two basic program control structures are:
  - If-Then-Else structures, which act like forks in the road during program execution. The program takes either the True branch or the False branch.
  - Do-Loops, which allow the same program statements to be executed over and over in a looping style.
17 Program Control: If-Then-Else
- The If-Then-Else structure has three parts:
  - A predicate that must evaluate to True or False
  - Something to do when the predicate is True
  - Something to do when the predicate is False
  - (Note: Doing nothing counts as something to do.)
- An example in our algorithmic pseudo-code:
    IF ( x > y ) THEN
      print(x)
    ELSE
      print(y)
    ENDIF
- Notice: We needed a way to tell when we were done: ENDIF.
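The same If-Then-Else structure looks like this in JavaScript. In this sketch the function returns the chosen value, so the caller can print it. The name `larger` is illustrative.

```javascript
// Returns x when the predicate (x > y) is True, otherwise y.
function larger(x, y) {
  if (x > y) {
    return x; // True branch
  } else {
    return y; // False branch
  } // the closing brace plays the role of ENDIF
}

console.log(larger(3, 7)); // 7
console.log(larger(7, 3)); // 7
```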
18 Program Control: Do-Loop
- Do-Loops in JavaScript
  - Each loop requires three things: an initial value, a stopping condition, and a way to change the looping variable.
  - The format for this is for (initial value; stop condition; increment rule)
- Example:
    sum = 0
    for ( j = 1; j < 5; j = j + 1 )
      sum = sum + array[j]
- In our example:
  - The initial value was set by j = 1.
  - The stopping condition was j < 5 (so j would be 1, 2, 3, and 4).
  - The increment rule increased the value of j by 1 each time through the loop (JavaScript note: j++ is shorthand for j = j + 1).
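A runnable version of the loop is sketched below. It is wrapped in a function so it can be reused. The name `sumOneToFour` and the sample data are hypothetical.

```javascript
// Adds up elements 1, 2, 3, and 4 of the array, as in the slide's loop.
// Element 0 is skipped because the loop starts at j = 1.
function sumOneToFour(array) {
  let sum = 0;
  for (let j = 1; j < 5; j++) {
    sum = sum + array[j];
  }
  return sum;
}

// Hypothetical sample data: indexes 0 through 5.
const sample = [10, 20, 30, 40, 50, 60];
console.log(sumOneToFour(sample)); // 140 (20 + 30 + 40 + 50)
```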
19 Variables
- Why use variables? It would be almost impossible to try to program by referring to the numerical addresses.
  - Too complicated for the human ("add memory location 1 and memory location 8").
  - Different machines have different memory structures.
- Three things describe a variable: its name, its type, and its structure.
- The act of specifying a new variable is called declaring the variable.
20 Variable Scope
- Scope refers to the portions of a program that may legally reference a given memory location referred to by a variable.
- Global scope indicates the memory location is always accessible.
- Local scope means the memory location is only available during certain portions of the program execution.
  - Memory reserved within functions is only available while the function is executing.
  - This means that variable names defined within a function may only be used while the function is executing.
- If a single variable name is repeated, the memory location accessed by using the variable name is the one from the most local definition.
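The "most local definition wins" rule can be seen in a short JavaScript sketch. The variable and function names are illustrative.

```javascript
// Declared outside every function, so it is visible throughout this file.
let counter = 100;

function localExample() {
  // Same name, declared inside the function: this is a separate memory
  // location that exists only while localExample is executing.
  let counter = 1;
  counter = counter + 1;
  return counter; // refers to the local counter
}

function globalExample() {
  // No local definition here, so the name refers to the outer counter.
  return counter;
}

console.log(localExample());  // 2
console.log(globalExample()); // 100 (the outer counter was never changed)
```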
21 Data Types and Data Structures
- When specifying data in a program we need to describe its type and its structure.
- Data's type imposes meaning onto data (semantics) and data's structure imposes organization (syntax) onto data.
- Data Type (definition): A label applied to data that tells the computer how to interpret and manipulate data.
  - Type tells the computer how much space to reserve for variables and how to interpret operations on them.
- Data Structure (definition): The way data is organized logically.
  - Describes how different pieces of data are organized.
22 Data Structures - Atomic
Name: Atomic
Definition: A data structure containing a single value, or data item, of any type.
Std Functions: Assign places a value in the atomic structure. Retrieve returns the value currently stored in the structure.
Non-Std Functions: None
Notes: This is the most basic data structure, from which all others are built. Contains only one data item. Can be placed anywhere in memory.
23 Data Structures - Array
Name: Array
Definition: A group of data elements stored in contiguous memory.
Std Functions: Assign(n) stores a value in the nth element. Retrieve(n) retrieves the value stored in the nth element.
Non-Std Functions: Length returns the number of elements in the array. Sort sorts the elements of the array. Matrix operations, like determinant, transpose, etc.
Notes: Typically, all elements of an array are of the same type, and that type can be any type. Advantage: Instant access to any element in the array (i.e., it is just as easy to retrieve the value of the first element as any other). Disadvantages: Hard to change the size of the array. Hard to insert new elements in the middle of the array.
24 Data Structures - Linked List
Name: Linked List
Definition: A group of data elements composed of two parts: a value part and a link part. The link points to the next data element in the list.
Std Functions: Insert(value) inserts a new element into the list. Remove(value) removes an element from the list.
Non-Std Functions: None
Notes: The variable representing the linked list points to the head of the list. The last element in the list has a value of NULL for its link. Advantages: The linked list can change size easily. Elements can be inserted into and deleted from linked lists easily. Disadvantage: You do not have quick access to members.
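A minimal linked list can be sketched in JavaScript with plain objects, where each element holds a value part and a link part (`next`). The slide does not say where Insert places the new element. Inserting at the head is an assumption here, and `createList` and `toArray` are hypothetical helpers.

```javascript
// The list variable points to the head; an empty list has a null head.
function createList() {
  return { head: null };
}

// Insert(value): ASSUMPTION - the new element goes at the head of the list.
function insert(list, value) {
  list.head = { value: value, next: list.head };
}

// Remove(value): unlinks the first element holding value.
// Returns true if an element was removed, false if value was not found.
function remove(list, value) {
  if (list.head === null) {
    return false;
  }
  if (list.head.value === value) {
    list.head = list.head.next;
    return true;
  }
  let current = list.head;
  while (current.next !== null) {
    if (current.next.value === value) {
      current.next = current.next.next; // skip over the removed element
      return true;
    }
    current = current.next;
  }
  return false;
}

// Hypothetical helper: walks the links from head to NULL.
function toArray(list) {
  const values = [];
  let node = list.head;
  while (node !== null) {
    values.push(node.value);
    node = node.next;
  }
  return values;
}
```

Reaching a member means following links one at a time from the head, which is why access is slower than in an array.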
25 Data Structures - Stack
Name: Stack
Definition: A structure in which only the top element is visible.
Std Functions: Push(value) pushes the value on the top of the stack. Pop() removes the top value in a stack. Peek() returns the value of the top element on the stack.
Non-Std Functions: Push(value1, value2, value3, ...) pushes multiple values on the stack. Pop(n) pops n elements off the stack. Peek(n) returns the value of the nth element.
Notes: Can be thought of as a stack of plates or trays. Exhibits LIFO (Last In First Out) behavior. Useful in task scheduling in which one task must be completed before others can begin.
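The three standard stack functions can be sketched on top of a JavaScript array, whose built-in `push` and `pop` methods work on the end of the array. Backing the stack with an array is a convenience chosen for this sketch, not something the slide specifies.

```javascript
function createStack() {
  return [];
}

// Push(value): place the value on top of the stack.
function push(stack, value) {
  stack.push(value);
}

// Pop(): remove and return the top value (undefined if the stack is empty).
function pop(stack) {
  return stack.pop();
}

// Peek(): return the top value without removing it.
function peek(stack) {
  return stack[stack.length - 1];
}

const plates = createStack();
push(plates, "first");
push(plates, "second");
console.log(peek(plates)); // "second"
console.log(pop(plates));  // "second" (last in, first out)
console.log(pop(plates));  // "first"
```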
26 Data Structures - Queue
Name: Queue
Definition: A list of elements in which elements are added to one end and removed from the other.
Std Functions: Enqueue(value) adds the value to one end of the queue. Dequeue() removes a value from the queue.
Non-Std Functions: None
Notes: Can be thought of as a line of customers, such as at the bank or grocer's. Exhibits FIFO (First In First Out) behavior. Useful in task scheduling in which FIFO is important (example: printer queues).
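A queue can be sketched the same way, using a JavaScript array with `push` to add at the back and `shift` to remove from the front. The array backing is an assumption made for brevity.

```javascript
function createQueue() {
  return [];
}

// Enqueue(value): add the value at the back of the queue.
function enqueue(queue, value) {
  queue.push(value);
}

// Dequeue(): remove and return the value at the front of the queue
// (undefined if the queue is empty).
function dequeue(queue) {
  return queue.shift();
}

const printJobs = createQueue();
enqueue(printJobs, "report.doc");
enqueue(printJobs, "photo.png");
console.log(dequeue(printJobs)); // "report.doc" (first in, first out)
console.log(dequeue(printJobs)); // "photo.png"
```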
27 Data Structures - Stacks & Queues
[Figure: a STACK diagram and a QUEUE diagram, each with ENTER and EXIT labels]
NOTE: STACKS AND QUEUES ARE TYPICALLY DEPICTED LIKE ARRAYS BUT THE INDIVIDUAL ELEMENTS CAN RESIDE ANYWHERE IN MEMORY.
28 Data Structures - Graph
Name: Graph
Definition: A set of nodes (data elements) connected by edges in an arbitrary manner.
Std Functions: None
Non-Std Functions: None
Notes: The most versatile data structure (linked lists, trees, and heaps are special instances of graphs). Standard problems:
  - Graph Coloring: Coloring the nodes of a graph such that adjacent nodes do not have the same color.
  - Traveling Salesman: Visiting each node in the graph exactly once for the least cost (start and stop at the same node).
  - Maximum Flow: Determine the amount of flow through a network.
29 Data Structures - Tree
Name: Tree
Definition: A graph with directed edges connecting parent to child such that there is exactly one node with no parents and all other nodes have exactly one parent.
Std Functions: None
Non-Std Functions: None
Notes: The first element in the tree is the root node, which has no parents, and from which all others can be reached. Nodes with no children are "leaf" nodes. If nodes a and b are connected by an edge, then a is a child of b if b is closer to the root than a, and a is a parent of b if a is closer to the root than b. Useful in making decisions and categorizing data.
30 Data Structures - Heap
Name: Heap
Definition: A tree in which a parent node has a value larger than all its children.
Std Functions: Heap(a, H) adds new node a to heap H. Unheap(H) removes the root element from heap H and re-establishes the heap.
Non-Std Functions: None
Notes: Flexible data structure useful for sorting elements as they arrive. This allows sorting on lists whose size changes constantly. Used in priority queues or other situations where maintaining and accessing a maximum element is important.
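The two heap functions can be sketched as follows. This version stores a binary heap in a JavaScript array, so the children of the element at index i sit at indexes 2i + 1 and 2i + 2. That layout is a common implementation choice assumed here; the slide only requires that each parent be at least as large as its children.

```javascript
// Swap two positions in the array H.
function swap(H, i, j) {
  const tmp = H[i];
  H[i] = H[j];
  H[j] = tmp;
}

// Heap(a, H): add value a to heap H, then move it up until its parent is
// at least as large.
function heap(a, H) {
  H.push(a);
  let child = H.length - 1;
  while (child > 0) {
    const parent = Math.floor((child - 1) / 2);
    if (H[parent] >= H[child]) {
      break;
    }
    swap(H, parent, child);
    child = parent;
  }
}

// Unheap(H): remove and return the root (the largest value), then
// re-establish the heap by moving the replacement root down.
function unheap(H) {
  if (H.length === 0) {
    return undefined;
  }
  const root = H[0];
  const last = H.pop();
  if (H.length > 0) {
    H[0] = last;
    let parent = 0;
    while (true) {
      const left = 2 * parent + 1;
      const right = left + 1;
      let largest = parent;
      if (left < H.length && H[left] > H[largest]) largest = left;
      if (right < H.length && H[right] > H[largest]) largest = right;
      if (largest === parent) {
        break;
      }
      swap(H, parent, largest);
      parent = largest;
    }
  }
  return root;
}
```

Adding values as they arrive and then calling `unheap` repeatedly returns them from largest to smallest, which is the sorting use described in the notes.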
31 Data Structures - Graph
THESE ARE ALL GRAPHS.
32 Data Structures - Graph
THESE ARE TREES (ABOVE). ARE THEY HEAPS?
THESE ARE NOT TREES (ABOVE). ARE THEY HEAPS?
33 Graph Problems
- There are literally thousands of graph problems, but we will focus on three that occur very commonly and show the diversity of the graph structure:
  - The Traveling Salesman Problem.
  - Graph Coloring Problem.
  - Maximum Flow Problem.
- Each problem has a decision form and an optimization form. The decision form asks "Can we do it?" and the optimization form asks "How well can we do it?"
- At least one of these problems is solved by you every day without you realizing it (until now).
- The fact that the nodes and edges can represent anything means that the graph structure is very versatile and virtually any problem can be mapped to a graph problem.
34 Graph Problems - Traveling Salesman
- Description: Given a graph, G = {N, E}, where
  - N = a set of cities.
  - E = travel routes between the cities, each having a cost associated with it.
  - One special node, s.
- You must begin at city s and travel to each of the other cities exactly once and then return to city s. Thus you make a complete cycle of all cities in the graph.
- Decision form of the problem: Can a route be found where the total cost of the trip is less than X? (Answer is yes or no.)
- Optimization form of the problem: What is the absolute lowest cost?
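Both forms of the problem can be answered by brute force: try every ordering of the cities and keep the cheapest. The sketch below assumes the graph is given as a cost matrix in which `cost[i][j]` is the cost of traveling from city i to city j, with a route between every pair of cities. The function names and sample data are hypothetical.

```javascript
// Total cost of following the route city by city.
function routeCost(cost, route) {
  let total = 0;
  for (let i = 0; i < route.length - 1; i++) {
    total = total + cost[route[i]][route[i + 1]];
  }
  return total;
}

// Optimization form: try every ordering of the other cities.
function cheapestTour(cost, start) {
  const others = [];
  for (let city = 0; city < cost.length; city++) {
    if (city !== start) others.push(city);
  }
  let best = { cost: Infinity, route: null };

  function visit(route, remaining) {
    if (remaining.length === 0) {
      const full = route.concat([start]); // return to city s
      const total = routeCost(cost, full);
      if (total < best.cost) best = { cost: total, route: full };
      return;
    }
    for (let i = 0; i < remaining.length; i++) {
      const rest = remaining.slice(0, i).concat(remaining.slice(i + 1));
      visit(route.concat([remaining[i]]), rest);
    }
  }

  visit([start], others);
  return best;
}

// Decision form: is there a route with total cost less than X?
function tourExists(cost, start, X) {
  return cheapestTour(cost, start).cost < X;
}

// Hypothetical sample data: four cities with symmetric costs.
const sampleCosts = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0]
];
console.log(cheapestTour(sampleCosts, 0).cost); // 80
console.log(tourExists(sampleCosts, 0, 90));    // true
```

With n cities this checks (n - 1)! routes, so it is only practical for very small n.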
35 Graph Problems - Graph Coloring
- Description: Given a graph, G = {N, E}, where
  - N = a set of nodes.
  - E = edges between the nodes.
- The object is to color the graph such that no nodes connected by an edge have the same color.
- Decision form of the problem: Can the graph be colored with X or fewer colors? (Answer is yes or no.)
- Optimization form of the problem: What is the fewest number of colors required to color the graph?
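Checking a proposed coloring is quick, while finding one by brute force means trying up to k^n assignments. The sketch below assumes nodes are numbered 0 to n - 1 and edges are given as pairs of node numbers. All names are illustrative.

```javascript
// Verify: no edge joins two nodes of the same color.
// colors[node] is the color number assigned to that node.
function isProperColoring(edges, colors) {
  for (const [a, b] of edges) {
    if (colors[a] === colors[b]) {
      return false;
    }
  }
  return true;
}

// Decision form by brute force: try every assignment of k colors to n nodes.
function canColor(n, edges, k) {
  const colors = new Array(n).fill(0);
  function assign(node) {
    if (node === n) {
      return isProperColoring(edges, colors);
    }
    for (let c = 0; c < k; c++) {
      colors[node] = c;
      if (assign(node + 1)) {
        return true;
      }
    }
    return false;
  }
  return assign(0);
}

// Optimization form: smallest k for which the decision form answers yes.
function fewestColors(n, edges) {
  let k = 1;
  while (!canColor(n, edges, k)) {
    k = k + 1;
  }
  return k;
}

// Hypothetical sample: a triangle needs three colors.
const triangle = [[0, 1], [1, 2], [0, 2]];
console.log(canColor(3, triangle, 2));  // false
console.log(fewestColors(3, triangle)); // 3
```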
36 Graph Problems - Maximum Flow
- Description: Given a graph, G = {N, E}, where
  - N = a set of nodes.
  - E = edges representing pipes, each assigned a given capacity.
  - Two special nodes. Node s is a source node that can potentially spit out an infinite amount of material. Node f is a sink node that can potentially absorb an infinite amount of material.
- The object is to determine the maximum amount of material that can flow through the network from the source to the sink.
- Decision form of the problem: Can X amount of material be pushed through the network from the source to the sink? (Answer is yes or no.)
- Optimization form of the problem: What is the maximum amount of material that can flow through the network from the source to the sink?
37 Artificial Intelligence
- Artificial Intelligence (AI) is the name given to encoding intelligent or humanistic behaviors in computer software.
- Problem: Nobody has created a widely accepted definition of intelligence.
  - At one time it was considered a uniquely human quality.
  - Now generally accepted to be an animal quality.
  - Has been linked to tool use, tool creation, learning, adaptation to novel situations, capacity for abstraction.
- Problem: Nobody has created a widely accepted definition of artificial intelligence.
  - Cognitive models attempt to recreate the actual processes of the human brain.
  - Behavioral models attempt to produce behavior that is reasonable for a situation regardless of how the behavior was produced.
  - Tend to focus on reasoning, behavior, learning, adaptation.
38 Artificial Intelligence Challenges
- Format and Size of Knowledge: The data structures we have discussed so far capture data values, but not data meaning. How is knowledge represented? How are relationships between
- Ambiguity: Knowledge ultimately represents natural phenomena that are inherently ambiguous. How do we resolve this?
39 Proposed AI Systems
- Rule-Based Behavior: designed behavior specifying sets of conditions and responses.
  - Wealth and complexity of rules limits applications.
- Case-Based and Context-Based Reasoning: attempt to reduce the search space of possible behaviors by only considering those associated with certain situations or contexts.
40 Proposed AI Systems
- Emergent Behavior (Ant Logic): Overall behavior resulting from the interaction of smaller rule sets or weak individual agents. Overall behavior is not designed but desired.
- Genetic Algorithms: Represent behavioral rules as long strings, termed genomes. Behavior is evolved as various genomes are tried and evaluated. Higher-rated genomes are allowed to survive and reproduce with other high-ranking genomes.
- Synthetic Social Structures: Model more complex animal social behaviors, such as those found in herds and packs. Allow efficient interaction without much communication.
41 Parallel Processing
- Parallel Processing occurs when more than one processor works together to solve a single problem.
  - Your lab has a lot of PCs, but that doesn't mean parallel processing is going on. Each separate PC is working on solving separate problems.
- Parallel processing speeds up solution times in two ways:
  - Multiple computers simply add "horsepower" to an existing algorithm.
  - New algorithms can be designed to actually lower the cost of the problem solution.
- Two kinds of parallelism:
  - Data parallelism: when the data is divided up among processors.
  - Task parallelism: when the parts of an algorithm are divided up.
42 Types of Parallel Processing
- Pipelining (not usually considered true parallelism)
  - This occurs when separate processors, each with a specific job, work on a phase of a problem's solution.
  - Most common example is your PC. The "single" processor is actually made up of many smaller special-purpose processors.
- Parallel Processing
  - Separate processors, usually identical, work with one memory.
  - Usually requires special-purpose machines. These machines may contain thousands of simple processors (usually a power of 2, like 4096 or 8192).
- Distributed Processing
  - Separate processors of any type, each with their own memory.
  - Networks or the internet. This is the most general and flexible type of processing, but communication costs between processors may be high.
43 Stages of Parallel Computation
- Everything has a benefit and a cost: parallel processing has an overhead cost.
- The goal is to let the benefit of multiple processors working on a single problem outweigh the cost of keeping track of them.
- Stages of a parallel computation:
  - (cost) Division of problem/data and distribution to processors
  - (cost) Start up of remote processes
  - (benefit) Parallel computation
  - (cost) Transfer local results to a common processor
  - (cost) Collate results and present
44 Terminology
- Closed Problems: These problems are well understood. The known cost and known algorithms are of the same order. It is known whether they are computable or not.
- Open Problems: These problems are not as well understood. The cost of known algorithms to solve them and the theoretical best are not of the same order.
- Tractable: Computable, specifically with respect to cost. Cost is polynomial.
- Intractable: Not computable, specifically with respect to cost. Cost is exponential or worse.
45 More Terminology
- Undecidable Problems: These are problems for which no algorithm can be written (regardless of cost). They are not computable.
- Deterministic Algorithms: Algorithms that have no randomness associated with them. They will always behave the same when given the same input.
- Nondeterministic Algorithms: Algorithms that have random elements. They may behave differently when run repeatedly, even if the input is the same.
46 Oracles
- Oracle: A theoretical device that magically selects the correct choice for an algorithm when the algorithm has choices to make.
- They don't exist, but are used to classify the "hardness" of problems.
- Example: Traveling Salesman with an oracle
  - We want to travel to n cities.
  - Recall that Traveling Salesman is an O(n!) problem.
  - With an oracle, we always make the best choice on each leg of the trip.
  - This means we only check out one path, and it is the optimal path.
  - Since there are n cities, we made n correct choices and the problem is O(n).
- Since Traveling Salesman is intractable without the oracle, but tractable with the oracle, Traveling Salesman is an NP problem.
47 The Hierarchy of Problems
- P: Polynomial, O(n^k). Easy to solve, easy to verify solution. Ex: Searching, sorting.
- NP: Non-Deterministic Polynomial, O(k^n). Hard to solve, easy to verify solution. Ex: Traveling Salesman Decision.
- Hard: Hard, O(k^n). Hard to solve, hard to verify solution. Ex: Listing all n-digit numbers.
48 NP-Complete Problems
- There are literally hundreds of them.
- They can all be mapped to each other (hence the "complete" part of the name).
- They all have exponential upper bound costs (not computable).
- They all have polynomial lower bound costs (computable).
- It is possible that polynomial algorithms will be developed to solve one of them.
  - If one is solved quickly, then they all are.
  - Until then, the notion of a nondeterministic algorithm guided by an oracle is used to discuss them.
49 Cost of Typical Problems
Name | Description | Cost
GetMax | What is the largest number in an unordered list of numbers? | O(n)
Search (Unordered List) | Given a list and a key, determine if the key is in the list. | O(n)
Search (Ordered List) | Given a list and a key, determine if the key is in the list. | O(log n)
Sort (BubbleSort) | Given an unordered list, rank all elements from smallest to largest. | O(n^2)
Sort (QuickSort) | Given an unordered list, rank all elements from smallest to largest. | O(n log n) (average case)
Sort (Problem) | Given an unordered list, rank all elements from smallest to largest. | O(n log n)
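The gap between O(n) and O(log n) search can be made concrete by counting comparisons. The sketch below uses a sequential scan for the unordered case and binary search for the ordered case. The slides do not name the search algorithms, so both are assumptions, and the step counters are added only for illustration.

```javascript
// Unordered list: check every element in turn. Cost O(n).
function linearSearch(list, key) {
  let steps = 0;
  for (let i = 0; i < list.length; i++) {
    steps = steps + 1;
    if (list[i] === key) {
      return { found: true, steps: steps };
    }
  }
  return { found: false, steps: steps };
}

// Ordered list: halve the remaining range on every step. Cost O(log n).
function binarySearch(sorted, key) {
  let low = 0;
  let high = sorted.length - 1;
  let steps = 0;
  while (low <= high) {
    steps = steps + 1;
    const mid = Math.floor((low + high) / 2);
    if (sorted[mid] === key) {
      return { found: true, steps: steps };
    }
    if (sorted[mid] < key) {
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return { found: false, steps: steps };
}

// Hypothetical sample data: the numbers 0 through 1023, already in order.
const numbers = [];
for (let i = 0; i < 1024; i++) {
  numbers.push(i);
}
console.log(linearSearch(numbers, 1023).steps); // 1024
console.log(binarySearch(numbers, 1023).steps); // 11
```

For 1024 elements the halving pattern from earlier applies: log2 1024 = 10, so binary search needs at most 11 comparisons.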
50 Cost of Graph Problems
Name | Description | Cost | Comments
Traveling Salesman Decision | Does a route exist with cost less than X? | O(n!) | Hard to solve, easy to verify yes answers.
Traveling Salesman Optimization | What is the least cost route? | O(n!) | Hard to solve, hard to verify.
Graph Coloring Decision | Can a graph be colored properly using X colors? | O(k^n), where k is the number of colors | Hard to solve, easy to verify yes answers.
Graph Coloring Optimization | What is the least number of colors required to properly color a graph? | O(k^n), where k is the number of colors | Hard to solve, hard to verify.
Maximum Flow | What is the most material that can be pushed through a network? | O(n^3) | Polynomial solution time means easy to solve, easy to verify.
51 Given a Problem, How do you Determine if it is Computable?
- Method One (usually the hard way)
  - Write an algorithm to compute the solution for all inputs.
  - Determine the cost of the algorithm.
  - Compare to the problem hierarchy.
- Method Two (usually the easy way)
  - Show that the new problem can be mapped to a known problem.
  - The new problem then has the same cost as the old.
  - If the old was computable, then so is the new.