Computer Science / Software Engineering Notes Network

Intelligent Systems

Matthew Barnes, Mathias Ritter

Intro
Classical AI
Blind Search
Problem Types
Single-state Problem Formulation
State vs Nodes
Search Strategies
Searching using Tree Search
Repeated States
Bidirectional Search
Heuristic Search
Best-first Search
Greedy Best-first Search
A* Search
Admissible Heuristics
Consistent Heuristics
Dominance
Relaxed problems
History and Philosophy
Local Search
Hill-climbing Search
Simulated Annealing Search
Local Beam Search
Genetic Algorithms
Constraint satisfaction problems
Adversarial Games: Minimax
Alpha-beta Pruning
Planning
Recent Advances in AI
Classification
Decision Trees
Entropy
Conditional Entropy
Information Gain
Nearest Neighbour
Neural Networks
Perceptrons
Perceptron: Example
Perceptron: Rewriting the Activation Function
Perceptron: Visualisation of the Activation Function
Perceptron: Defining Weights and Bias
Perceptron: Expressiveness
Perceptron: Training with the Delta-Rule (Widrow-Hoff Rule)
Perceptron: Limits
Types of neurons (activation functions)
Multi-layered Neural Networks
Multi-layered Nets: Training
Further Issues of Neural Networks
Deep Learning
Supervised vs Unsupervised Learning
Offline vs. Online Learning
Generalisation & Overfitting
Training vs Testing
Reasoning
Bayes Theorem
Total Probability
Conditional Probability
Bayes’ Rule
Example: Monty Hall Problem
Example: HIV test problem
Bayesian Belief Update
Complex Knowledge Representation
Joint Distribution
Independence
Bayesian Networks
Properties of Bayesian Networks
Pros and Cons of Bayesian Networks
Decision Making
Bandit Theory
Multi-armed Bandit (MAB) Model
MAB: Epsilon-first Approach
MAB: Epsilon-greedy Approach
MAB: Epsilon-first vs. Epsilon-greedy
Other Algorithms
Performance of Bandit Algorithms and Regret
Extensions of the Multi-armed Bandit
Reinforcement Learning & Markov Decision Process
Reinforcement Learning
States, Actions and Rewards
Temporal Difference Learning
Q-Learning
Markov Decision Processes (MDP) and Markov Chain
Monte Carlo Simulation (multi-armed bandits)
Extensions of Reinforcement Learning
Mathematical techniques (not examinable)
Functions, Limits, Derivatives
Continuity
Discontinuity
Derivatives
Linear combinations of functions
More derivative rules
Vectors
What is a vector?
Norm
Scalar product
Linear independence
Basis
Linear maps
Matrices
Scalar fields
Vector fields
Matrices
Partial derivatives
Gradient descent
Application, optimisation and regression problems
TL;DR
Classical AI
Blind Search
Heuristic Search
History and Philosophy
Local Search
Constraint satisfaction problems
Adversarial Games: Minimax
Alpha-beta Pruning
Planning
Recent Advances in AI
Classification
Decision trees
Nearest Neighbour
Neural Networks
Reasoning
Decision Making
Bandit Theory
Reinforcement Learning & Markov Decision Process


Intro


Classical AI

Blind Search

Problem Types

Each problem type below is listed with its properties and an example.

Single-state problem

  • Deterministic
  • Observable

  • You always know what state it is in, at any time during the problem.

The shortest path problem.

This is deterministic because when you input the same graph, starting point and ending point, you’ll always get the same shortest path.

This is observable because you can always see where you’re going.

Sensorless / multi-state problem

  • Deterministic
  • Non-observable

  • The initial state could be anything; this requires a general solution that works for all inputs.

Guessing someone’s random number.

This is deterministic because the solution will always be the same number that your friend picked (e.g. if they pick ‘6’, the solution will always be ‘6’, it can’t be different in any other run).

Guessing a specific number like 5 or 12 only covers a couple of the possibilities. Since you don’t know the number to guess, you’ll need a solution that works for every possible number (like saying their number is “x”).

Contingency problem

  • Nondeterministic
  • Partially observable (maybe)

  • Nondeterministic means that, even with the same input, the action might perform differently (this is due to things like race conditions, random numbers etc.)

  • Partially observable means that you can’t view the whole state; only parts of it.

  • Basically, you don’t really know enough until you do something. Once you’ve performed an action, you see a reaction, and you use that reaction (and more actions) to eventually solve the problem.

A piñata (even though it’s not a computer science problem, it illustrates nondeterminism and partial observability well).

This is nondeterministic because you may not hit the piñata. If you tried multiple times, you’d miss sometimes and hit it other times. In a more technical sense, the actions will randomly give the result of hit or miss, therefore the solution to this problem is ‘n’ swings, where ‘n’ is a random number.

This is partially observable because you know that the piñata is somewhere in front of you (unless everyone spins you around, then it’s non-observable). You just don’t know where exactly. You will find out some information once you swing your bat, though. If you hit it (solve the problem), you’ll feel it hit your bat. If you miss, you’ll know that the piñata is not where you swung.

Exploration problem

  • Unknown state space

  • You don’t even know what effect your actions have, or how big the problem is.
  • Therefore, you need to ‘explore’ the environment before you try to solve the problem.

Robot exploration.

In this problem, a robot looks around using actions and maps out the area it’s in.

This has an unknown state space because the robot has no idea where it is, but it uses its actions (such as cameras, touch-sensitive peripherals etc.) to find out and map out its environment.

Single-state Problem Formulation

Each item of the problem formulation is listed below with a description and an example (using chess).

Initial state

The state we start off at

When you start a chess game, the initial state is always the standard starting position of the pieces.

Actions or successor function

The transitions from one state to the next

When you move a piece in chess, the state changes from one board position to the next.

Goal test

The condition that, when true, means the problem is done

Explicit: “board in state x”

Implicit: “check-mate”

The goal test is satisfied when the opponent’s king is in checkmate.

Path cost

A cost that should be minimised where possible to get the optimal solution

There are multiple path costs that a chess algorithm can adopt.

One could be ‘number of opponent pieces’. This would make the algorithm aggressive, since it would try to reduce the number of opponent pieces as much as it can.

Another could be ‘number of moves taken’. This would make the algorithm more strategic, achieving a checkmate in the least amount of moves.
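To make this concrete, here is a minimal Python sketch (not from the notes) of the ingredients above for a generic search problem; the class and field names are just illustrative.

from dataclasses import dataclass
from typing import Any, Callable, Iterable, Tuple

@dataclass
class SearchProblem:
    initial_state: Any                                              # where we start
    successors: Callable[[Any], Iterable[Tuple[Any, Any, float]]]   # state -> (action, next state, step cost)
    is_goal: Callable[[Any], bool]                                  # goal test

# Hypothetical example: shortest path on a tiny graph; edge weights are the path costs.
graph = {"A": [("B", 1.0), ("C", 4.0)], "B": [("C", 1.0)], "C": []}
problem = SearchProblem(
    initial_state="A",
    successors=lambda s: [("go to " + t, t, w) for t, w in graph[s]],
    is_goal=lambda s: s == "C",
)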

State vs Nodes

Search Strategies

Searching using Tree Search

Tree search

Description

Example / Animation

Breadth-first search

In breadth-first search, all of the nodes at the current depth level are searched before moving onto the next depth level.

This is complete (assuming the branching factor isn’t infinite), even with infinite depth or loops.

Only problem is, the space complexity is terrible.

However, breadth-first search is optimal (finds the least-cost solution) if the step cost is constant. This is true because it finds the shallowest goal node.

Complete?

Yes, if b (branching factor) is finite

Time?

O(b^(d+1))

Space?

O(b^(d+1))

Optimal?

Yes, if step cost is constant

Depth-first search

In depth-first search, it starts at the root and goes all the way down to the far-left leaf, then backtracks and expands shallower nodes.

The space complexity is great (it’s linear) because branches can be released from memory if no solution was found in them. However, it is not complete if the tree has loops since the search is going to get stuck in the loop and will never be able to exit it, or if the depth is infinite as it’ll just keep going down one branch without stopping.

Depth-first search is not optimal because it searches down the furthest branch and could miss shallower goal nodes.

Complete?

No, if depth is infinite (or has loops) it’ll go on infinitely

Time?

O(b^m)

Space?

O(b·m) (linear)

Optimal?

No, because deeper solution might be found first

Depth-limited search

It’s the same as depth-first search, but there’s a depth limit (n), and anything below that depth limit doesn’t exist.

Complete?

No, because the solution might be deeper than the limit

Time?

O(b^n)

Space?

O(b·n)

Optimal?

No, because deeper solution might be found first

Limit: 2

Iterative deepening search

This is an applied version of depth-limited search, where we start from depth limit 0 and increment the depth limit until we find our goal.

We do this to combine depth-first search’s space efficiency with breadth-first search’s completeness.

Although we get those benefits, IDS is a little bit slower than depth-first.

Complete?

Yes

Time?

O(b^d)

Space?

O(b·d)

Optimal?

Yes, if step cost is constant

Goal: Node 7
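As a rough sketch of iterative deepening search (assuming a successor function from node to children; the tiny tree and the helper names below are made up for illustration):

import itertools

def depth_limited_search(node, is_goal, successors, limit):
    # Depth-first search that does not expand nodes below the depth limit.
    if is_goal(node):
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        result = depth_limited_search(child, is_goal, successors, limit - 1)
        if result is not None:
            return [node] + result
    return None

def iterative_deepening_search(root, is_goal, successors):
    # Re-run depth-limited search with limits 0, 1, 2, ... until the goal is found.
    for limit in itertools.count():
        result = depth_limited_search(root, is_goal, successors, limit)
        if result is not None:
            return result

# Tiny example tree: node i has children 2i+1 and 2i+2; the goal is node 7.
successors = lambda n: [2 * n + 1, 2 * n + 2] if n < 7 else []
print(iterative_deepening_search(0, lambda n: n == 7, successors))   # [0, 1, 3, 7]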

Repeated States

Bidirectional Search

Heuristic Search

Best-first Search

Greedy Best-first Search

Complete?

No – can get stuck in loops, like depth-first search, if the heuristic is bad

Time?

O(b^m), but a good heuristic can give dramatic improvement

Space?

O(b^m) -- bad, keeps all nodes in memory

Optimal?

No

A* Search

Step #

Description

Illustration

1

This is the graph we’ll use A* on. We’ll go from node 0 to 6.

The black numbers are the weights of the paths, the green numbers are the heuristic h(n), and the red numbers will be f(n), which is the sum of the path cost up to that node and its heuristic.

The blue nodes are unvisited (open), the red nodes are visited (closed) and the green node will be the current node we’re on.

2

First, we look at the adjacent nodes. We need to calculate their f(n)!

For node 1, we add g(1) and h(1), which is 45 + 317, which is 362.

For node 2, we add g(2) and h(2), which is 56 + 242, which is 298.

3

We need to pick a node with the least f(n).

Right now, it’s 2, so we pick 2.

4

Now we need to find the f(n) of the adjacent nodes of 2!

Firstly, 3. g(n) is 56 + 25, because it’s the cost of going from the starting node up to that node. h(n) is 161, so adding them together gets 242.

Second, 5. Again, it’s g(n) + h(n), which is 246.

5

Now we need to pick the node of the lowest f(n)!

This would be node ‘3’, so we go with 3.

6

We calculate f(n) for 4 and 6, which are 339 and 141, respectively.

7

The next smallest unvisited f(n) is the goal node, 6. Once we’ve discovered 6, we can map out a path from the goal 6 to the start 0.

The shortest path is 0, 2, 3, 6.

Complete?

Yes (unless there are infinitely many nodes with f ≤ f(G) )

Time?

Exponential (in length of optimal solution)

Space?

Keeps all (expanded) nodes in memory

Optimal?

Yes, if the heuristic is admissible
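A small A* sketch on a weighted graph with a heuristic lookup table. The edge weights and h values below are partly guessed so that the f values match the walkthrough above (362, 298, 242, 246, 339, 141); they are not taken from the original figure.

import heapq

def a_star(graph, h, start, goal):
    # graph: {node: [(neighbour, weight), ...]}, h: heuristic estimate per node.
    frontier = [(h[start], 0, start, [start])]      # (f, g, node, path so far)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)  # expand the node with the smallest f
        if node == goal:
            return path, g
        for nbr, w in graph.get(node, []):
            g2 = g + w
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h[nbr], g2, nbr, path + [nbr]))
    return None, float("inf")

graph = {0: [(1, 45), (2, 56)], 1: [], 2: [(3, 25), (5, 100)], 3: [(4, 200), (6, 60)], 4: [], 5: [], 6: []}
h = {0: 0, 1: 317, 2: 242, 3: 161, 4: 58, 5: 90, 6: 0}
print(a_star(graph, h, 0, 6))   # ([0, 2, 3, 6], 141)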

Admissible Heuristics

Consistent Heuristics

Dominance

Relaxed problems

History and Philosophy

Local Search

Hill-climbing Search

  1. We will start at some (possibly random) initial state and let this state be the current state
  2. Then we will check the neighbour of the current state
    • The neighbour state is the highest-valued successor of the current state, determined by the objective function
    • We are not going to check any states which are further away than our neighbour
  3. If the value of this neighbour state is higher than the current state’s, we will assign the neighbour state to the current state and go back to step 2
  4. If the value of the neighbour state is lower than the current state’s, we will stop and return the current state as the solution
    • This is because there is no better state within our neighbours that we can reach
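A minimal hill-climbing sketch of the steps above (the neighbour and objective functions are supplied by the caller; the toy example is made up):

def hill_climbing(initial, neighbours, value):
    # Keep moving to the best neighbour until no neighbour is strictly better.
    current = initial
    while True:
        best = max(neighbours(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current                      # local maximum (or plateau) reached
        current = best

# Toy example: maximise -(x - 3)^2 over the integers, moving in steps of 1.
print(hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))   # 3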

Simulated Annealing Search

  1. We will start at some (possibly random) initial state and let this state be the current state
  2. We will check if the temperature is 0
    • If it is 0 then we will return the current state as the solution
  3. We will randomly select a successor of the current node as the next node
  4. We will check if the next node is better or worse than the current node
    • If it is better, we will select it, i.e. assign it to the current node
    • If it is worse, we will only select it with a probability that depends on the temperature (and on how much worse it is)
  5. We will lower the temperature according to the cooling schedule and go back to the second step
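A rough simulated annealing sketch following the steps above. The exponential acceptance rule exp(delta / T) and the geometric cooling schedule are standard choices assumed here, not something specified in the notes:

import math, random

def simulated_annealing(initial, neighbours, value, temperature=10.0, cooling=0.95, min_temp=1e-3):
    current = initial
    while temperature > min_temp:                        # stop when the temperature has (almost) reached 0
        candidate = random.choice(neighbours(current))   # randomly selected successor
        delta = value(candidate) - value(current)
        if delta > 0 or random.random() < math.exp(delta / temperature):
            current = candidate                          # always accept better moves; worse ones with prob. exp(delta/T)
        temperature *= cooling                           # lower the temperature
    return current

# Toy example: maximise -(x - 3)^2 over the integers.
random.seed(0)
print(simulated_annealing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))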

Local Beam Search

  1. We start with k randomly generated states and let them be the current states
  2. If any of those current states is the goal state then return this state
  3. Otherwise, generate all their successors and let the k best children be the current states
  4. Go back to step 2
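A minimal local beam search sketch under the same assumptions (k states kept per iteration; the toy problem is made up):

def local_beam_search(initial_states, neighbours, value, is_goal, k=3, max_iters=100):
    current = list(initial_states)
    for _ in range(max_iters):
        for state in current:
            if is_goal(state):                 # step 2: return a goal state if we already hold one
                return state
        pool = [s for state in current for s in neighbours(state)]   # step 3: all successors of all current states
        if not pool:
            break
        current = sorted(pool, key=value, reverse=True)[:k]          # keep the k best children
    return max(current, key=value)

# Toy example: find x == 7 starting from three states.
print(local_beam_search([0, 10, 20], lambda x: [x - 1, x + 1], lambda x: -abs(x - 7), lambda x: x == 7))   # 7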

Genetic Algorithms

Constraint satisfaction problems

  1. If the assignment is complete then return the assignment (solution)
    • This is a base case - we have found the solution!
  2. Select any unassigned variable
    • It doesn’t matter which one you select first in order to find a solution
    • However, it will matter for performance reasons, as we are going to explore later
  3. For each value of all values in our domain:
    • If the selected variable can be assigned this value consistently with the assignment according to the constraints, i.e. it does not violate any constraints:
      • Add it to our assignments
      • Recursively call this function with the new assignments set, and store whatever it returns as a result
        • If it didn’t return a failure (so it returned a solution), then return the result
        • If it returned a failure, then remove this particular assignment from our assignments set
          • This is the backtracking part of the algorithm
          • It means that we have gone down a path that didn’t lead to a solution, therefore we are going to undo what we have tried
  4. Return a failure, since we have already checked all values for the unassigned variable
    • This is another base case - there isn’t any solution down this branch
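A small backtracking sketch of the procedure above, using a three-region map-colouring CSP as the (made-up) example; the variable, domain and constraint names are illustrative:

def backtrack(assignment, variables, domains, consistent):
    if len(assignment) == len(variables):                      # base case: the assignment is complete
        return assignment
    var = next(v for v in variables if v not in assignment)    # select any unassigned variable
    for value in domains[var]:
        if consistent(var, value, assignment):                 # does it violate any constraints?
            assignment[var] = value
            result = backtrack(assignment, variables, domains, consistent)
            if result is not None:
                return result
            del assignment[var]                                # backtrack: undo and try the next value
    return None                                                # base case: no value works down this branch

# Toy map colouring: adjacent regions must get different colours.
neighbours = {"WA": ["NT", "SA"], "NT": ["WA", "SA"], "SA": ["WA", "NT"]}
variables = ["WA", "NT", "SA"]
domains = {v: ["red", "green", "blue"] for v in variables}
consistent = lambda var, val, asg: all(asg.get(n) != val for n in neighbours[var])
print(backtrack({}, variables, domains, consistent))   # {'WA': 'red', 'NT': 'green', 'SA': 'blue'}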

Adversarial Games: Minimax

Complete?

Yes (if the tree is finite)

Time?

O(b^m)

Space?

O(b·m) (depth-first exploration)

Optimal?

Yes (against an optimal opponent)

Game

Evaluation Function

Noughts and Crosses

(or Tic-Tac-Toe for the foreigners)

The number of potential nought lines minus the number of potential cross lines

Chess

c1 * material + c2 * pawn structure + c3 * king safety + c4 * mobility …

Where c1 to c4 are some constants (weights)

Each of those sub-functions (material etc.) is itself another linear function
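A depth-limited minimax sketch in the spirit of the evaluation-function discussion above. The game interface (successors, terminal test, evaluation) is assumed rather than taken from the notes, and the toy game tree is made up:

def minimax(state, depth, maximising, successors, is_terminal, evaluate):
    # Return the minimax value of `state`, cutting off at `depth` and applying `evaluate` there.
    if depth == 0 or is_terminal(state):
        return evaluate(state)
    values = [minimax(s, depth - 1, not maximising, successors, is_terminal, evaluate)
              for s in successors(state)]
    return max(values) if maximising else min(values)

# Toy game tree as nested lists: MAX moves first, leaves hold the evaluation values.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
successors = lambda s: s if isinstance(s, list) else []
is_terminal = lambda s: not isinstance(s, list)
evaluate = lambda s: s
print(minimax(tree, 2, True, successors, is_terminal, evaluate))   # 3: MAX takes the best of the MIN values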

quiescent search

  • Quiescent search varies the depth limit of the search
  • For noisy (interesting) states, the tree is searched deeper
  • For quiet (less interesting) states, the tree is not searched as far
  • We use a heuristic function to distinguish between noisy and quiet states

singular extensions

  • Explore a move in greater depth if:
    • one move is significantly better than the others, or
    • there is only one legal move

Alpha-beta Pruning


Planning

Each action in a plan is specified by its preconditions (Precond:) and its effects (Effect:).

Air cargo transport

Initial State

Goal State

Actions

Example plan

Spare tire problem

Initial State

Goal State

Actions

Example plan

Partial-Order Planning (POP)

Shoe example

Initial State

Goal State

Actions


Recent Advances in AI

Classification

Decision Trees

Entropy

City          Good    OK      Terrible
Birmingham    0.33    0.33    0.33
Southampton   0.3     0.6     0.1
Glasgow       0       0       1

Birmingham

              P       log2(P)      -P·log2(P)
Good          0.33    -1.58        0.53
OK            0.33    -1.58        0.53
Terrible      0.33    -1.58        0.53
SUM = 1.58 (bits)

Southampton

              P       log2(P)      -P·log2(P)
Good          0.3     -1.74        0.52
OK            0.6     -0.74        0.44
Terrible      0.1     -3.32        0.33
SUM = 1.29 (bits)

Glasgow

              P       log2(P)      -P·log2(P)
Good          0       -infinity    0
OK            0       -infinity    0
Terrible      1       0            0
SUM = 0 (bits)
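The entropy values in the tables above can be reproduced with a few lines of Python (a direct implementation of H = sum of p·log2(1/p), with 0·log2(0) treated as 0):

import math

def entropy(probabilities):
    # H = sum(p * log2(1/p)); terms with p = 0 contribute nothing.
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

print(round(entropy([1/3, 1/3, 1/3]), 2))   # Birmingham: 1.58 bits
print(round(entropy([0.3, 0.6, 0.1]), 2))   # Southampton: ~1.3 bits
print(round(entropy([0, 0, 1]), 2))         # Glasgow: 0 bits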

Conditional Entropy

Information Gain

              Reads       Skips       Row total
new           7 (70%)     3 (30%)     10
follow_up     2 (25%)     6 (75%)     8
SUM = 18
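Using the counts in the table above, the conditional entropy and the information gain can be computed directly; this is a small sketch (the attribute with values new / follow_up is assumed to be the one we split on):

import math

entropy = lambda ps: sum(p * math.log2(1 / p) for p in ps if p > 0)

def information_gain(rows):
    # rows: one (reads, skips) count pair per attribute value (here: new and follow_up).
    total = sum(r + s for r, s in rows)
    reads = sum(r for r, _ in rows)
    h_prior = entropy([reads / total, 1 - reads / total])                 # H(Reads)
    h_cond = sum((r + s) / total * entropy([r / (r + s), s / (r + s)])    # H(Reads | attribute)
                 for r, s in rows)
    return h_prior - h_cond

print(round(information_gain([(7, 3), (2, 6)]), 2))   # ~0.15 bits of information gained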

Nearest Neighbour

Neural Networks

Perceptrons

Perceptron: Example

Perceptron: Rewriting the Activation Function

Perceptron: Visualisation of the Activation Function

Perceptron: Defining Weights and Bias

Perceptron: Expressiveness

Perceptron: Training with the Delta-Rule (Widrow-Hoff Rule)

Perceptron: Limits

Types of neurons (activation functions)

Multi-layered Neural Networks

Multi-layered Nets: Training

Further Issues of Neural Networks

Deep Learning

Supervised vs Unsupervised Learning

Offline vs. Online Learning

Generalisation & Overfitting

Training vs Testing

Reasoning

Bayes Theorem

Total Probability

Conditional Probability

Bayes’ Rule

Example: Monty Hall Problem

Example: HIV test problem


Bayesian Belief Update

Complex Knowledge Representation

Joint Distribution

Independence

 

Bayesian Networks


Properties of Bayesian Networks

Pros and Cons of Bayesian Networks

Decision Making

Bandit Theory

Multi-armed Bandit (MAB) Model

MAB: Epsilon-first Approach

MAB: Epsilon-greedy Approach

MAB: Epsilon-first vs. Epsilon-greedy

Epsilon-first
  • Explicitly separates exploration from exploitation
  • Observation: typically very good when T is small
  • Drawback: we need to know T in advance
  • Drawback 2: sensitive to the value of epsilon

Epsilon-greedy
  • Exploration and exploitation are interleaved
  • Observation: typically efficient when T is sufficiently large
  • Drawback: slow convergence at the beginning (especially with small epsilon)
  • Drawback 2: sensitive to the value of epsilon
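A minimal epsilon-greedy sketch for a multi-armed bandit. The Bernoulli arms and their success probabilities are made up, and epsilon and T are parameters you would tune:

import random

def epsilon_greedy(arm_probs, T=1000, epsilon=0.1):
    counts = [0] * len(arm_probs)        # pulls per arm
    rewards = [0.0] * len(arm_probs)     # total reward per arm
    total = 0.0
    for _ in range(T):
        if random.random() < epsilon:
            arm = random.randrange(len(arm_probs))   # explore: pick a random arm
        else:                                        # exploit: pick the best empirical arm (untried arms first)
            arm = max(range(len(arm_probs)),
                      key=lambda a: rewards[a] / counts[a] if counts[a] else float("inf"))
        reward = 1.0 if random.random() < arm_probs[arm] else 0.0   # pull the Bernoulli arm
        counts[arm] += 1
        rewards[arm] += reward
        total += reward
    return total

random.seed(0)
print(epsilon_greedy([0.2, 0.5, 0.8]))   # total reward collected over T rounds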

Other Algorithms

Performance of Bandit Algorithms and Regret

Extensions of the Multi-armed Bandit

Reinforcement Learning & Markov Decision Process

Reinforcement Learning


States, Actions and Rewards

Temporal Difference Learning

Q-Learning

Markov Decision Processes (MDP) and Markov Chain

Monte Carlo Simulation (multi-armed bandits)

Extensions of Reinforcement Learning


Mathematical techniques (not examinable)

Functions, Limits, Derivatives

Continuity

Discontinuity

Type of discontinuity

Description

Visual aid

Asymptotic

it curves away and doesn’t come back

Point

the graph has a prick in it

Jump

it jumps from one value to another

Derivatives

Linear combinations of functions

More derivative rules

Rule

Description

Equation

Chain rule

it’s, like, a thing within a thing

(f(g(x)))' = f'(g(x)) * g'(x)

Product rule

timesing the things together

(f(x)g(x))' = f'(x)g(x) + f(x)g'(x)

Quotient rule

dividing the things together

(f(x)/g(x))' = (f'(x)g(x) - f(x)g'(x)) / g(x)^2

Inverse rule

like, one is an inverse of the other and you can do the derivative of it and stuff

(f^(-1))'(y) = 1 / f'(f^(-1)(y))

Vectors

What is a vector?

Norm

Scalar product

Linear independence

Basis

Linear maps

Matrices

Scalar fields

Vector fields

Matrices

Partial derivatives

Gradient descent

Application, optimisation and regression problems


TL;DR

  1. It’s not examinable
  2. It’s not even finished

Classical AI

Blind Search

Heuristic Search

History and Philosophy

  1. Turing used clever heuristic search to break Nazi messages, and he and the government studied AI in secret
  2. After WWII, some smart people modelled artificial neurons, a really smart guy built the first neural network computer, and Turing’s paper “Computing Machinery and Intelligence” was published
  3. In 1956, the Dartmouth Workshop brought 10 super nerds on automata theory (probably Gennaro) to work on AI, and they dominated the field for 20 years
  4. Between 1952 and 1969, they made general problem solvers, had success with checkers, invented Lisp, and made microworlds like vacuum world and blocks world
  5. Around 1966 people got sad and started to give up
  6. Around 1969 people got happy and made stuff like DENDRAL
  7. In 1980, AI becomes more cognitive
  8. In 1986, AI started to look more like neural networks
  9. In 1988, people start using insects as inspiration instead of logic
  10. In 1995, they made them better
  11. Fast forward to today, when AI is used pretty much everywhere.

Local Search

Big β corresponds to a low temperature; small β corresponds to a high temperature.

Constraint satisfaction problems

Discrete variables
  • finite domains: the number of complete assignments is finite
  • infinite domains: variables are integers, strings etc.

Continuous variables
  • e.g. start/end times for Hubble Space Telescope observations

Most constrained variable

Most constraining variable

Least constraining value

Adversarial Games: Minimax

Alpha-beta Pruning

Planning

Recent Advances in AI

Classification

Decision trees

Nearest Neighbour

Neural Networks

  1. Put all of our data into a line graph
  2. Draw a line of best fit
  3. Minimise the mean squared error of the graph and the line
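A tiny sketch of those three steps: fit a straight line to some made-up data by minimising the mean squared error with gradient descent (the data, learning rate and iteration count are all illustrative):

# Fit y = w*x + b by gradient descent on the mean squared error.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]           # roughly y = 2x + 1 with some noise

w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    dw = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)   # d(MSE)/dw
    db = 2 * sum(errors) / len(xs)                              # d(MSE)/db
    w, b = w - lr * dw, b - lr * db

print(round(w, 2), round(b, 2))          # ends up close to the underlying slope and intercept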

Reasoning

Decision Making

Bandit Theory

Reinforcement Learning & Markov Decision Process

Good luck!