1
Moving towards the Artificial Hacker
  • RUXCON 2005
  • Ashley Fox

2
Who am I?
  • Member of Felinemenace
  • University Student
  • Computer Security enthusiast for many years
  • By no means an AI expert!!

3
Talk outline
  • Why Artificial Intelligence?
  • What is Artificial Intelligence?
  • Introduction to Genetic Algorithms
  • Introduction to Genetic Programming
  • Introduction to Artificial Neural Networks
  • Introduction to Intelligent Agents

http://www.felinemenace.org
4
Why Artificial Intelligence?
  • Interesting Field.
  • AI techniques can be used to partly or wholly
    automate tasks.
  • Striving towards software tools that can act with
    some degree of intelligence.
  • There are a lot of tasks performed by security
    professionals that lend themselves well to
    automation.

5
What is Artificial Intelligence?
  • Umbrella term encompassing a lot of techniques.
  • Adapted from Arthur Samuel's definition: AI aims
    to create software or machines to perform tasks
    that, if performed by humans, would be assumed to
    involve the use of intelligence.
  • Includes fuzzy systems, Bayesian networks, neural
    networks, genetic algorithms, genetic
    programming, expert systems; the list goes on.
  • Other disciplines, such as psychology, study AI
    with the aim of better understanding our thought
    processes. Not really of interest to us.

6
Genetic Algorithms - Introduction
  • A type of informed search
  • Suitable for problems in which only the solution
    state matters, not the path(s) that lead to it
  • Search for the optimal value within a search
    space
  • Inspired by evolutionary biology

7
Genetic Algorithms - Introduction continued
  • Attempt to mimic the natural process of evolution
  • Survival of the fittest

8
Genetic Algorithms - Data structures
  • Chromosomes are used to represent potential
    solutions to the problem (think of a chromosome
    as a string)
  • Each chromosome has an assigned fitness produced
    by a fitness function (measure of how close to
    meeting the solution the chromosome is)
  • The population is the entire set of chromosomes
    at a point in evolution

9
Genetic Algorithms - Operations
  • Crossover operation attempts to mimic
    reproduction. In its basic form a crossover
    point within the two chromosomes is selected and
    the two chromosomes are combined
  • Selection operation selects pairs of chromosomes
    from the population to crossover
  • Mutation operation randomly changes a random
    element of a chromosome

10
Genetic Algorithms - Steps
  • A random population of chromosomes is generated.
  • The fitness function is applied to each
    chromosome.
  • The selection operation is applied favoring the
    fittest chromosomes.
  • The crossover operation is applied on the
    selected pairs to generate a new population.
  • Mutation is applied to some offspring
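
The steps above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: the goal string `TARGET`, tournament selection, the mutation rate, and carrying the single best chromosome into the next generation (elitism) are all choices made for this example.

```python
import random

TARGET = "HACK THE PLANET"             # hypothetical goal string
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def fitness(chromosome):
    """Number of positions that already match the target."""
    return sum(a == b for a, b in zip(chromosome, TARGET))

def select(population, rng):
    """Tournament selection: the fitter of two random chromosomes wins."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(mum, dad, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(mum))
    return mum[:point] + dad[point:]

def mutate(chromosome, rng, rate=0.2):
    """With probability `rate`, replace one random element."""
    if rng.random() >= rate:
        return chromosome
    i = rng.randrange(len(chromosome))
    return chromosome[:i] + rng.choice(ALPHABET) + chromosome[i + 1:]

def evolve(pop_size=100, generations=200, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = ["".join(rng.choice(ALPHABET) for _ in TARGET)
                  for _ in range(pop_size)]
    for generation in range(generations):
        # Step 2: apply the fitness function.
        best = max(population, key=fitness)
        if best == TARGET:
            return best, generation
        # Steps 3-5: selection, crossover, mutation.
        offspring = [
            mutate(crossover(select(population, rng),
                             select(population, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [best] + offspring   # elitism (an assumption, see above)
    return max(population, key=fitness), generations

if __name__ == "__main__":
    print(evolve())
```

Because the best chromosome is always kept, the best fitness in the population never decreases from one generation to the next.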

11
Genetic Algorithms - Example
12
Genetic Algorithms - Constructing a GA-based fuzzer
  • Chromosome represents our input into the
    application (file, packet, argument)
  • Genesis state must be chosen carefully
  • If all of the input is rejected we will have
    difficulty evolving
  • If we lack diversity we experience convergence
  • Depending on the application, our mutation and
    crossover functions may have to generate valid
    data.
  • What is our measure of fitness when fuzzing?

13
Genetic Algorithms - Constructing a GA-based fuzzer cont
  • Code coverage!
  • The more code we can test the more bugs we
    (hopefully) expose.
  • We have a few options for a fitness function:
    • Instruction stepping
    • Produce a call-flow-graph - assess fitness based
      on coverage
    • Profile the target application and mark
      interesting code blocks by breakpointing - assess
      fitness based on number of blocks hit
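
A rough sketch of the "number of blocks hit" idea, with the breakpoints replaced by explicit `hit(...)` calls. The toy parser `target`, its block names and its file format are invented for this example; a real fuzzer would collect the same set from a debugger or instrumentation.

```python
def target(data, hit):
    """Toy parser standing in for the application under test (hypothetical)."""
    hit("entry")
    if len(data) < 4:
        hit("too_short")
        return
    if data[:2] != b"FM":
        hit("bad_magic")
        return
    hit("magic_ok")
    length = data[2]
    if length > len(data) - 3:
        hit("bad_length")
        return
    hit("length_ok")
    if data[3] == 0x7F:
        hit("special_record")

def coverage_fitness(data):
    """Fitness = number of distinct marked blocks the input reaches."""
    blocks = set()
    target(data, blocks.add)
    return len(blocks)

if __name__ == "__main__":
    for sample in (b"", b"XXXX", b"FM\xff\x00", b"FM\x01\x7f"):
        print(sample, coverage_fitness(sample))
```

Inputs that get past more checks score higher, which gives the selection operation something to favor even before any bug is found.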

14
Genetic Algorithms - Applications
  • Brute force instances for which we can obtain
    hints as to how successful attempts are.
  • Fyodor/Mikasoft's talk used them to breed
    overflow strings.
  • Evolve firewall or IDS rule-sets.

15
Genetic Programming - Introduction
  • Genetic Programming is an adaptation of genetic
    algorithm techniques.
  • Instead of evolving strings we evolve programs.

16
Genetic Programming - Basics
  • We are no longer confined to evolving string
    chromosomes.
  • Our chromosomes are now represented in the form
    of a syntax tree.
  • Languages that use prefix notation (such as Lisp)
    are ideal for genetic programming as they lend
    themselves well to this tree notation.
  • We are not restricted to evolving Lisp programs
    however.

17
Genetic Programming - Basics cont
  • A syntax tree consists of nodes and links.
  • Each node represents an operation
  • Each link from a node represents one of that
    node's parameters
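
As a small illustration, a syntax tree can be held as nested tuples: the first element is the operation, the remaining elements are the links to its parameters. The tuple layout, the operation names in `OPS` and the variable `x` are assumptions made for this sketch.

```python
import operator

# Hypothetical operation set for the example.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(node, env):
    """Evaluate a tree: tuples are operations, strings are variables,
    anything else is a constant."""
    if isinstance(node, tuple):
        op, *children = node
        return OPS[op](*(evaluate(child, env) for child in children))
    if isinstance(node, str):
        return env[node]
    return node

def to_prefix(node):
    """Render the tree in Lisp-style prefix notation."""
    if isinstance(node, tuple):
        return "(" + " ".join([node[0]] + [to_prefix(c) for c in node[1:]]) + ")"
    return str(node)

if __name__ == "__main__":
    tree = ("add", ("mul", "x", 3), 2)
    print(to_prefix(tree), "=", evaluate(tree, {"x": 4}))
```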

18
Genetic Programming - Syntax Tree
19
Genetic Programming - More Basics
  • We generally build our software using subroutines
  • Using syntax trees we represent this with a
    branch to a sub-tree

20
Genetic Programming - Syntax Trees
  • Branches can also represent:
    • Iteration
    • Recursion
    • Conditionals
    • Predefined functions
  • At times we must also enforce a constrained
    syntactic structure - this enforces what types can
    be used as arguments for specific nodes

21
Genetic Programming - Basics Basics Basics
  • The syntax tree now forms a basis for our
    chromosome
  • We still have populations (of chromosomes)
  • Each tree still has a fitness value assigned by a
    fitness function
  • The fitness is evaluated based upon how well a
    chromosome performs at achieving the particular
    goal

22
Genetic Programming - Basics cont
  • In order to evaluate the fitness, the code is
    built from the syntax tree and interpreted or run
    under a virtual machine.

23
Genetic Programming - Operations
  • We still have the same operations as when dealing
    with Genetic Algorithms.
  • Mutate, Crossover, Selection.
  • One new operation, the Architecture Altering
    operation
  • The only significant change is the data structure
    we're applying these to (the syntax tree).

24
Genetic Programming - Operations cont
  • The crossover operation selects random branches
    of two directed graphs and grafts them.
  • Mutation alters a branch of the graph.
  • Selection still works (pretty much) the same.
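
A minimal sketch of crossover and mutation on trees, assuming the trees are nested tuples of the form `(operation, child, child, ...)` with strings and numbers as leaves. The helper names and the `terminals` set are invented for this example.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indexes) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the branch found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of `tree` with the branch at `path` swapped for `subtree`."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def crossover(a, b, rng):
    """Graft a random branch of `b` onto a random point in `a`."""
    donor = get(b, rng.choice(list(paths(b))))
    return replace(a, rng.choice(list(paths(a))), donor)

def mutate(tree, rng, terminals=("x", 1, 2)):
    """Replace a random branch with a random terminal."""
    return replace(tree, rng.choice(list(paths(tree))), rng.choice(terminals))

if __name__ == "__main__":
    rng = random.Random(0)
    mum = ("add", "x", ("mul", "x", 2))
    dad = ("sub", ("add", 1, "x"), 3)
    print(crossover(mum, dad, rng))
    print(mutate(mum, rng))
```

The parents are never modified; each operation returns a new tree, which keeps the rest of the population intact.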

25
Genetic Programming - Program Architecture and altering operation
  • The arrangement, number and types of branches
    present in the syntax tree dictate the program
    architecture
  • The architecture altering operation is introduced
    so that the underlying architecture of the
    chromosome can be dynamic rather than fixed

26
Genetic Programming - Example
27
Genetic Programming - Breeding shellcode
  • Use PPC instruction set
  • Generally 3 register operands per instruction
    • mnemonic dst, operand, operand
    • add r6, r11, r10

28
Genetic Programming - Breeding shellcode cont
  • Each node in our syntax tree represents an
    instruction.
  • Each node has two links representing its
    operands. Child node destination operands
    evaluate to parent node operands.
  • Links can represent conditional instructions.
  • Terminals with a constant value can evaluate to
    an li or equivalent.
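
One way to picture this mapping is a tiny code generator that walks the tree bottom-up. It is only a sketch: the tuple layout `(mnemonic, left, right)`, the choice of r14 to r31 as scratch registers, and the absence of any register allocation or conditional handling are assumptions of this example.

```python
def emit(node, code, regs):
    """Emit instructions for `node`; return the register holding its result."""
    if isinstance(node, int):        # constant terminal -> load immediate
        dst = next(regs)
        code.append(f"li {dst}, {node}")
        return dst
    if isinstance(node, str):        # register terminal: value already in place
        return node
    mnemonic, left, right = node     # instruction node with two operand links
    a = emit(left, code, regs)
    b = emit(right, code, regs)
    dst = next(regs)                 # child destinations feed the parent
    code.append(f"{mnemonic} {dst}, {a}, {b}")
    return dst

def compile_tree(tree):
    """Flatten a tree into a list of three-operand instructions."""
    code = []
    # Hypothetical scratch registers; the sketch simply runs out after r31.
    regs = (f"r{n}" for n in range(14, 32))
    emit(tree, code, regs)
    return code

if __name__ == "__main__":
    for line in compile_tree(("add", ("xor", "r3", "r3"), 11)):
        print(line)
```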

29
Genetic Programming - Breeding shellcode cont
30
Genetic Programming - Breeding shellcode cont
  • There are many different aims when writing
    shellcode.
  • Some measures of fitness may be:
    • Length
      • Architecture alteration can play a big part in
        this
    • Absence of illegal characters
      • Nulls etc.
    • Variance from previous shellcode.
    • Success against a known firewall or IDS rule-set
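
The first two measures are easy to combine into a single score. The sketch below is only illustrative: the set of illegal bytes, the penalty weight of 100 and the sample byte strings are arbitrary assumptions, and the other measures on the slide are not modelled.

```python
def score(candidate, bad_bytes=b"\x00\x0a\x0d", bad_penalty=100):
    """Higher is better: short candidates with no illegal bytes win."""
    bad = sum(byte in bad_bytes for byte in candidate)
    return -(len(candidate) + bad_penalty * bad)

if __name__ == "__main__":
    # Arbitrary placeholder byte strings, not working code.
    print(score(b"\x7c\x63\x1a\x78"))   # no illegal bytes
    print(score(b"\x38\x00\x00\x0b"))   # two NUL bytes
```

Weighting illegal bytes far more heavily than length means a longer but clean candidate still beats a shorter one that contains a null.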

31
Genetic Programming - Breeding shellcode cont
  • We can emulate code using something like qemu.
  • Base fitness on:
    • required register states, or
    • required system calls hit

32
Artificial Neural Networks - Introduction
  • An artificial neural network attempts to mimic
    the workings of the human brain.
  • Based upon connectionism.
  • Parallel collection of small processing units
    with the basis on the interconnection between
    these processing elements.
  • ANNs are able to perform pattern matching and
    classification (amongst other tasks).

33
Artificial Neural Networks - Data structures
  • ANNs are composed of synapses and neurons.
  • Each neuron performs very simple processing of
    its input and produces output.
  • Synapses are used to connect the output of one
    neuron to the input of another.
  • Each synapse has an associated weight that is
    multiplied by the input value and fed to the
    destination neuron.

34
Artificial Neural Networks - Data structures cont
  • Each neuron has a transfer or activation
    function.
  • The most common function is the sigmoid function.
  • The sigmoid function basically squashes the input
    into the range of values between 0 and 1.
  • The tanh function can also be used; it squashes
    the input into the range of values between -1 and
    1.
  • There is an additional bias/weight input that
    acts as a threshold for each neuron. If the
    addition of the weighted inputs from the synapses
    exceeds this bias then the neuron fires.
  • Some simpler topologies use a basic threshold
    function. If the sum of inputs is greater than 0
    fire the value 1. Otherwise fire the deactivated
    value -1.
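
A single neuron with the activation functions described above might look like the following sketch. One assumption to note: the bias is written here as a value added to the weighted sum, so a firing threshold of t corresponds to a bias of -t.

```python
import math

def sigmoid(x):
    """Squash x into the range (0, 1); fine for moderate values of x."""
    return 1.0 / (1.0 + math.exp(-x))

def threshold(x):
    """Basic threshold function: fire 1 if the sum is above 0, else -1."""
    return 1 if x > 0 else -1

def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(total)

if __name__ == "__main__":
    print(neuron([1, 1], [0.5, 0.5], -0.5))                # sigmoid output
    print(neuron([1, 1], [0.5, 0.5], -0.5, math.tanh))     # tanh output
    print(neuron([1, 1], [0.5, 0.5], -0.5, threshold))     # fires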

35
Artificial Neural Networks - Topologies
  • Many variations upon the concept of an artificial
    neural network
  • For the purpose of this presentation we're only
    concerned with feed-forward networks.
  • Again there are several variations on
    feed-forward networks.
  • The main ones (especially for the purpose of
    today) are single and multi-layer perceptrons.

36
Artificial Neural Networks - Single layer perceptrons
  • A single layer perceptron contains a layer of
    input neurons fed directly to an output layer.
  • The input layer does no processing but passes the
    inputs to the network directly to the outputs.
  • The output layer does the processing within the
    network and outputs the result.
  • These types of networks are simplistic and cannot
    perform well on more complex problems
  • Single layer perceptrons generally use the
    threshold activation function (input > 0: fire 1,
    else fire -1).
  • Some Single layer perceptron networks work with
    continuous output using activation functions such
    as the sigmoid function.
  • Single layer perceptrons can only perform a
    limited number of functions.

37
Artificial Neural Networks - Multi layer perceptrons
  • Multi layer perceptrons have an input layer, one
    or more hidden layers and an output layer.
  • The hidden and output layers perform processing
    whilst the input layer provides input into the
    network. The output layer still provides the
    output for the network.
  • The sigmoid function is generally used as the
    activation function.
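
A feed-forward pass through such a network can be sketched as below. The weights in `XOR_NET` are hand-picked for this example (nothing here is trained): one hidden layer of two sigmoid neurons and one output neuron that together approximate XOR, a function a single layer perceptron cannot represent.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each row of `weights` feeds one neuron."""
    return [
        sigmoid(sum(i * w for i, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in turn."""
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

# Hand-picked weights (an assumption of this sketch):
# hidden neuron 1 acts like OR, hidden neuron 2 like NAND, output like AND.
XOR_NET = [
    ([[20, 20], [-20, -20]], [-10, 30]),   # hidden layer
    ([[20, 20]], [-30]),                   # output layer
]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, round(forward([a, b], XOR_NET)[0], 4))
```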

38
Artificial Neural Networks - Topologies cont
39
Artificial Neural Networks - Topologies cont
40
Artificial Neural Networks - How do they learn?
  • After constructing a suitable network topology a
    set of training input data as well as the
    expected outputs are provided.
  • The weights within the network are initialized to
    small random values (the network knows nothing).
  • The network is fed the training data and the
    output vs the expected output is measured.
  • The weights are adjusted so as to minimize the
    degree of error in the output.
  • Issues can arise when the network overfits the
    data. In this case the network essentially
    becomes a lookup-table of the training input and
    output. The statistical nature of the data is not
    learnt by the network and it will not perform
    well on data outside of the training set.
  • The right amount of training must be performed.
    This is sometimes difficult to predict.
  • The correct network topology must be chosen. This
    is sometimes guesswork and experimentation.
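
The training loop can be illustrated with the simplest case, a single threshold neuron trained with the classic perceptron learning rule. This is a swapped-in technique chosen for brevity (multi-layer networks are normally trained with backpropagation); the AND data set, learning rate and seed are assumptions of this sketch.

```python
import random

def predict(weights, bias, inputs):
    """Threshold neuron: fire 1 if the weighted sum is above 0, else -1."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else -1

def train(samples, rate=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    # Start from small random weights: the network knows nothing.
    weights = [rng.uniform(-0.05, 0.05) for _ in samples[0][0]]
    bias = rng.uniform(-0.05, 0.05)
    for _ in range(max_epochs):
        errors = 0
        for inputs, expected in samples:
            # Measure output vs expected output.
            error = expected - predict(weights, bias, inputs)
            if error:
                errors += 1
                # Nudge each weight in the direction that reduces the error.
                weights = [w + rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += rate * error
        if errors == 0:      # every training sample is classified correctly
            break
    return weights, bias

# Logical AND with the -1 / 1 outputs used on the earlier slides.
AND = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train(AND)
    print(weights, bias)
    print([predict(weights, bias, x) for x, _ in AND])
```

AND is linearly separable, so this rule is guaranteed to settle on weights that classify the whole training set; it says nothing about how the neuron behaves on data outside that set.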

41
Artificial Neural Networks - Applications
  • Anything requiring classification or pattern
    matching.
  • IDS/IPS traffic classification?
  • Virus/Worm classification?
  • Identifying code constructs within binaries?

42
Intelligent Agents - Introduction
  • An agent is simply something that acts.
  • Agents distinguish themselves from general
    programs by:
    • Running under autonomous control
    • Perceiving their environment
    • Adapting to changes within their environment
    • Acting rationally

43
Intelligent Agents - Autonomy
  • In saying that intelligent agents are autonomous
    we mean that they operate independently and
    without user control.
  • An Intelligent Agent can make its own decisions
    and work towards its own goals.
  • No (or very little) guidance required by an
    operator.

44
Intelligent Agents - Rationality
  • What does it mean to act rationally?
  • For the purpose of working with intelligent
    agents, rationality refers to an agent's ability to:
    • Act so as to achieve the best outcome.
    • When there is a lack of certainty, achieve the
      best expected outcome.
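
"Best expected outcome" can be made concrete with a small calculation: weight each possible outcome of an action by its probability and pick the action with the highest total. The actions, probabilities and utility values below are invented purely for illustration.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes."""
    return sum(p * u for p, u in outcomes)

def choose(actions):
    """Rational choice: the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

# Hypothetical options for an auditing agent: (probability, utility) pairs.
ACTIONS = {
    "scan_quietly": [(0.9, 5), (0.1, -20)],
    "scan_loudly": [(0.5, 10), (0.5, -20)],
    "wait": [(1.0, 0)],
}

if __name__ == "__main__":
    for name, outcomes in ACTIONS.items():
        print(name, expected_utility(outcomes))
    print("choice:", choose(ACTIONS))
```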

45
Intelligent Agents - Task Environment
  • The environment in which an agent operates
    dictates its design, architecture and the types
    of AI techniques it employs.
  • The most important aspects of an agent's task
    environment are:
    • The performance measure
    • The environment
    • Actuators
    • Sensors

46
Intelligent Agents - Task Environment cont
  • There are obviously near-infinite task
    environments that an agent can operate within.
  • There are, however, various properties we can
    record that are relevant to each environment and
    agent design.
  • Each environment is either:
    • Fully or partially observable
    • Deterministic or stochastic
    • Episodic or sequential
    • Static or dynamic
    • Discrete or continuous
    • Single or multi agent

47
Intelligent Agents - Agent variations
  • There are many variations in agent software and
    its design.
  • There are four main types that summarize most of
    them:
    • Simple reflexive agents
    • Model-based reflexive agents
    • Goal-based agents
    • Utility-based agents

48
Intelligent Agents - Agent variations cont
  • Simple reflexive agents
    • Receive a percept from the environment and
      perform an appropriate action.
    • Constructed with condition-action rules in which
      a change of state triggers an action.
    • Ignore previous percepts and actions.
  • Model-based reflexive agents
    • Attempt to compensate for partial observability
      within an environment by constructing an
      internal model of their belief of the state of
      the environment.
    • Model is generally constructed based upon percept
      history (states observed up to the present).
  • Goal-based agents
    • Take into account their performance and current
      goals when making decisions.
  • Utility-based agents
    • Use a utility function that maps the current
      state into a numerical measure of success.
    • Distinctly aware of how well they are performing.
    • Utility functions can be constructed based upon
      many measures of performance. These can be hard
      or soft goals for the agent.
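
The difference between the first two agent types can be sketched as follows. The percept fields (`host`, `port_open`, `banner`) and the action names are hypothetical, chosen only to give the rules something to match on.

```python
# Condition-action rules for a simple reflex agent, checked in order.
RULES = [
    (lambda p: p["port_open"] and p["banner"] is None, "grab_banner"),
    (lambda p: p["port_open"], "log_service"),
    (lambda p: True, "move_on"),
]

def simple_reflex_agent(percept):
    """Act on the current percept only; no memory of earlier percepts."""
    for condition, action in RULES:
        if condition(percept):
            return action

class ModelBasedAgent:
    """Keeps an internal model (last known state per host) built from
    its percept history, and acts on differences from that model."""

    def __init__(self):
        self.model = {}

    def act(self, percept):
        host = percept["host"]
        known = self.model.get(host)
        self.model[host] = percept["port_open"]
        if known is None:
            return "record_new_host"
        if known != percept["port_open"]:
            return "report_change"
        return "move_on"

if __name__ == "__main__":
    print(simple_reflex_agent({"port_open": True, "banner": None}))
    agent = ModelBasedAgent()
    for state in (True, True, False):
        print(agent.act({"host": "10.0.0.1", "port_open": state}))
```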

49
Intelligent Agents - Learning
  • Agents that can learn themselves or that can be
    taught are more desirable than having to
    explicitly program an agent.
  • In order to learn we add two more elements:
    • The learning element
    • The problem generator
  • The learning element's responsibility is to
    assess how well the agent is performing and make
    modifications to its actions (actuators).
  • The problem generator's responsibility is to
    suggest new actions for the agent that can result
    in it experiencing new states (experiences) that
    it can learn from.

50
Intelligent Worms?
  • Thus far we have not seen intelligent worms.
  • Worms:
    • Are not aware of their performance
    • Do not perceive their environment
    • Do not adapt their behavior
  • What if a worm:
    • Perceived its environment - operating environment
      and network
    • Adapted its behavior
    • Operated in a cooperative multi-agent fashion
    • Coordinated in a multi-agent fashion
  • Such a worm could:
    • Intelligently select targets - DNS / mail servers?
    • Modify its propagation parameters
    • Communicate its experiences with other instances
      of itself
  • Worms do not necessarily have to perform evil
    • The Nachi/Welchia family of worms patched
      otherwise vulnerable machines

51
Intelligent Worms?
  • Worms could also use genetic operations:
    • Selection
    • Crossover
    • Mutation
  • Variations of a worm are unleashed into the wild
  • The worms perceive their environment and adapt
    accordingly
  • The fittest worms survive and continue to
    propagate
  • Worms locate their brethren and produce offspring
    (cross-over)
  • Some of the offspring are subject to mutation
  • Many variants in the wild with different
    propagation parameters - difficult to isolate

52
Useful links
  • GAUL - Genetic Algorithm Utility Library
    • http://gaul.sourceforge.net
  • FANN - Fast Artificial Neural Network library
    • http://fann.sourceforge.net
  • OpenAI Project
    • http://openai.sourceforge.net

53
References
  • Artificial Intelligence - A Modern Approach, 2nd
    Edition
    • Stuart Russell and Peter Norvig
  • A Genetic Programming tutorial
    • John Koza and Riccardo Poli

54
Questions? Comments? Flames?
  • Hope you enjoyed the talk.
  • If you think of any questions later on I'm sure
    you can find me at the bar :)
