Computer Science / Software Engineering Notes Network

Computer Systems I

Matthew Barnes

Contents

Physical        9

Intro to CPU structure        9

Definition        9

Functions of a CPU        9

Structure        9

Registers        9

User visible registers        10

Control registers        10

Assembler, C -> ARM        10

Data flow        11

Computer interfaces        12

Types of input        12

Analogue inputs        12

Converting analogue to digital        12

Quantisation        12

How does ADC work?        13

DAC        13

CISC and RISC        13

Architectures        13

Von Neumann Architecture        13

Harvard Architecture        14

RISC        14

CISC        15

Pipelining and branch prediction        15

Pipelining        15

Dealing with branches        16

Multiple Streams        16

Prefetch Branch Target        16

Loop buffer        16

Branch prediction        16

Delay slot        17

SSE        17

Flynn’s classifications        17

SSE        17

How SSE works        18

Superscalar        18

What is superscalar?        18

Limitations        18

Instruction issues        19

Superscalar requirements        20

Interconnection structure & bus design        20

What is a bus        20

Single bus problems        20

Bus layouts for different units        21

Types of buses        21

Control / Address / Data bus        21

Peripheral Component Interconnect Express (PCIe)        22

Universal Serial Bus (USB)        22

Hard disk buses        23

Parallel ATA (PATA or IDE)        23

Serial ATA (SATA)        23

SCSI        23

Fibre channel        24

I2C        24

Mass storage        24

Magnetic disk        24

Floppy        24

Hard disk        24

Solid State Drives        25

Optical storage and CD-ROM        26

DVD        26

Blu-ray        27

Magnetic tape        27

Storage Area Network (SAN)        27

RAID        27

What is RAID        27

RAID 0        28

RAID 1        28

RAID 0+1        28

RAID 1+0        28

RAID 4        29

RAID 5        29

RAID 6        29

RAID 5+0        30

RAID 6+0        30

Software RAID        30

Hot spares and hot swap        31

Stripe sizes        31

RAID controllers        31

Microcontrollers        31

What is a microcontroller?        31

Embedded systems        32

Input / Output        32

GPIO        32

Analogue to digital inputs        32

Different types of microcontrollers        32

Arduino        32

Microchip’s PIC        32

TI MSP430        32

ARM        33

Cortex M        33

Raspberry Pi 3 details        33

Memory and RAM        33

Location        33

Capacity        34

Unit of transfer        34

Access methods        34

Sequential        34

Direct        34

Random        34

Associative        34

Memory hierarchy        35

Performance        35

Access time        35

Memory cycle time        35

Transfer rate        35

Physical types        35

Semiconductor memory        35

Dynamic RAM (DRAM)        35

Static RAM (SRAM)        36

Read-only Memory (ROM)        36

Double data rate RAM (DDR RAM)        36

Error correction        36

Simple parity bit        37

Flash RAM solid state disk        37

Parallel Processing        37

Amdahl’s law        37

Symmetric multiprocessors        38

Computer cluster        38

Process based message-parsing        39

Thread based message-parsing        39

Instruction level parallelism        39

Main dataflow        40

Code generation        40

PC architectures        41

Chipset        41

Server motherboards        41

Intel QuickPath Interconnect (QPI)        41

Onboard SSD        42

Basic Input/Output System (BIOS)        42

Unified Extensible Firmware Interface (UEFI)        42

Cache        42

Definition        42

Operation        42

Size        43

Instruction / Data split        43

Three level caches        43

Mapping function        43

Direct mapping        44

Full associative mapping        44

Set associative mapping        45

Replacement algorithms        45

Write policies        46

Write-through        46

Write-back        46

Cache coherence        46

Intel’s latest CPUs        46

Multi-core CPU        47

Non uniform memory access (NUMA)        47

Loop detector        47

Hyperthreading        47

Integrated memory controller        48

Power management        48

“Turbo boost”        48

Summary of i3, i5, i7 and Xeon        49

GPUs        49

Definition        49

Why we need GPUs        49

3D rendering pipeline        49

Typical transformation maths        50

Raster operations        50

Comparison to supercomputers        50

GPUs with no graphical output        51

Replacing the pipeline model        51

Onboard Intel GPUs        51

Programming GPUs        51

Uses of GPUs        51

Logical        53

Intro to digital electronics        53

Truth table        53

Logic notation        53

Logic’s relation to electronics        53

Logic gates        53

Boolean algebra        55

Transistors        55

FET transistor        55

MOSFET transistor        56

Flip flops and registers        56

Flip flops        56

Reset-set flip flop (RS)        56

Clocked reset-set flip flop        56

D-type flip flop        56

Registers        57

Reg4        57

Shift registers        58

Serial data transfer        59

Receiving serial data        59

Sending serial data        59

Registers in modern hardware        60

Number systems        60

Definition        60

Convert decimal to binary        61

Convert binary to decimal        61

Convert hexadecimal to binary        62

Convert octal to binary        62

Convert binary coded decimal (BCD) to decimal        63

Arithmetic        63

Unsigned integers        63

Signed integers        63

Sign and magnitude        63

2’s complement        63

1’s complement        64

Biased offset        64

Floating point        65

Exponent field = 0        67

Rounding modes        67

Addition / Subtraction        67

Multiplication / Division        67

Variable types to bitwidths and ranges        68

Gates        68

Gate sizes        68

Gate speed (delays)        68

Glitch generator        68

Inertial delay        69

Karnaugh maps        69

Addition        70

Full adder        70

Ripple adder        70

Carry-lookahead        71

Multicycling        71

Binary multiplication        72

Booth multiplication algorithm        72

ARM and Assembly        74

ARM instruction set        74

Thumb2        74

ARM registers        74

Simple instruction examples        74

Status register        74

Memory access        75

Table        75

Offsetting        75

Looping        75

Stack        75

Subroutine return address        76

Subroutine call with interrupt        76

Return address using stack        76

Return address using LR register        76

LR        76

Networking        77

What is a network        77

Layered network models        77

Layer encapsulation        77

Physical & Link layer        77

Network access layer        78

Internet layer        78

IPv4        79

NAT & NAPT        79

IPv6        79

ICMP / ICMPv6        80

Transport layer        80

TCP and UDP        80

TCP flow / congestion control        80

Application layer        81

Naming and addressing        81

DNS & ARP        81

DNS        81

ARP / NDP        82

Routing        82

Multiple paths        82

Switches        83

Other concerns        83

Monitoring        83

The Internet of things        83

Instruction sets        84

Interpretation of binary data        84

Instruction set        84

Elements of an instruction        84

Instruction types        84

Arithmetic instructions        85

Shift and rotate operations        85

Logical        85

Transfer of control        85

Input / Output        85

Types of operand        85

Instruction set architectures (ISA)        86

Accumulator ISA        86

Stack based ISA        86

Register-memory ISA        86

Register-register ISA        86

Memory-memory ISA        86

Alignment        87

Endedness        87

Address modes        88

Immediate addressing        88

Direct addressing        88

Indirect addressing        88

Multilevel indirect addressing        89

Register addressing        89

Register indirect addressing        89

Displacement addressing        89

Scaled displacement addressing        89

Orthogonality        90

Completeness        90

Operating Systems & Virtual Machines        90

About operating systems        90

Layers and views of a computer system        91

Kernel services        91

Memory management        91

Task management        91

File management        91

Device management        91

Essential hardware features for an OS        92

Memory protection        92

Timer        92

Privileged instructions        92

Interrupts        92

Scheduling        92

Context switching        92

Process control block        92

Memory management        93

Swapping        93

Partitioning        93

Physical and logical addresses        93

Paging        93

Virtual memory        93

Page table sizes        94

Translation lookaside buffer        94

Demand paging        94

Segmentation        94

Amdahl’s Law        95

The principle of locality        95

Disk caches        95


Physical

Intro to CPU structure

Definition

Functions of a CPU

Structure

Registers

User visible registers

Control registers

Assembler, C -> ARM

C

ARM Assembly

main()
{
 
int a,b,c[50];
 b =
2;
 
for( a= 0; a < 50; a++)
   c[a] = a * b;
}

  mov  r3, #2
 str  r3, [fp,
#-16]
 mov  r3,
#0
 str  r3, [fp,
#-20]
 b  .L2
.L3:
 ldr  r1, [fp,
#-20]
 ldr  r2, [fp,
#-20]
 ldr  r3, [fp,
#-16]
 mul  r0, r3, r2
 mvn  r2,
#207
 mov  r3, r1, asl
#2
 sub  r1, fp,
#12
 add  r3, r3, r1
 add  r3, r3, r2
 str  r0, [r3,
#0]
 ldr  r3, [fp,
#-20]
 add  r3, r3,
#1
 str  r3, [fp,
#-20]
.L2:
 ldr  r3, [fp,
#-20]
 cmp  r3,
#49
 ble  .L3
 sub  sp, fp,
#12
 ldmfd  sp, {fp, sp, pc}

Data flow

Computer interfaces

Types of input

Analogue inputs

Converting analogue to digital

Quantisation

How does ADC work?

DAC

CISC and RISC

Architectures

Von Neumann Architecture

Harvard Architecture

RISC

CISC

Pipelining and branch prediction

Pipelining

Dealing with branches

Multiple Streams

Prefetch Branch Target

Loop buffer

Branch prediction

Delay slot

SSE

Flynn’s classifications

SSE

How SSE works

Superscalar

What is superscalar?

Limitations

Instruction issues

Superscalar requirements

Interconnection structure & bus design

What is a bus

Single bus problems

Bus layouts for different units

Unit

Buses

Memory

I/O module (mouse, CD drive etc.)

CPU connection

Types of buses

Control / Address / Data bus

Peripheral Component Interconnect Express (PCIe)

Universal Serial Bus (USB)

Hard disk buses

Parallel ATA (PATA or IDE)

Serial ATA (SATA)

SCSI

Fibre channel

I2C

Mass storage

Magnetic disk

Floppy

Hard disk

Solid State Drives

Optical storage and CD-ROM

DVD

Blu-ray

Magnetic tape

Storage Area Network (SAN)

RAID

What is RAID

RAID 0

RAID 1

RAID 0+1

RAID 1+0

RAID 4

RAID 5

RAID 6

RAID 5+0

RAID 6+0

Software RAID

Hot spares and hot swap

Stripe sizes

RAID controllers

Microcontrollers

What is a microcontroller?

Embedded systems

Input / Output

GPIO

Analogue to digital inputs

Different types of microcontrollers

Arduino

Microchip’s PIC

TI MSP430

ARM

Cortex M

Raspberry Pi 3 details

Memory and RAM

Location

Capacity

Unit of transfer

Access methods

Sequential

Direct

Random

Associative

Memory hierarchy

Performance

Access time

Memory cycle time

Transfer rate

Physical types

Semiconductor memory

Dynamic RAM (DRAM)

Static RAM (SRAM)

Read-only Memory (ROM)

Double data rate RAM (DDR RAM)

Error correction

Simple parity bit

Flash RAM solid state disk

Parallel Processing

Amdahl’s law

Symmetric multiprocessors

Computer cluster

Process based message-parsing

Thread based message-parsing

Instruction level parallelism

a = b + c

x = y + z

Main dataflow

Code generation

PC architectures

Chipset

Server motherboards

Intel QuickPath Interconnect (QPI)

Onboard SSD

Basic Input/Output System (BIOS)

Unified Extensible Firmware Interface (UEFI)

Cache

Definition

Operation

Size

Instruction / Data split

Three level caches

Mapping function

Direct mapping

Full associative mapping

Set associative mapping

Replacement algorithms

Write policies

Write-through

Write-back

Cache coherence

Intel’s latest CPUs

Multi-core CPU

Non uniform memory access (NUMA)

Loop detector

Hyperthreading

Integrated memory controller

Power management

“Turbo boost”

Summary of i3, i5, i7 and Xeon

GPUs

Definition

Why we need GPUs

3D rendering pipeline

Typical transformation maths

Raster operations

Comparison to supercomputers

GPUs with no graphical output

Replacing the pipeline model

Onboard Intel GPUs

Programming GPUs

Uses of GPUs


Logical

Intro to digital electronics

Truth table

Logic notation

Logic’s relation to electronics

Logic gates

Gate

Explanation

AND

OR

NAND

NOR

XOR

Boolean algebra

Transistors

FET transistor

MOSFET transistor

Flip flops and registers

Flip flops

Reset-set flip flop (RS)

Clocked reset-set flip flop

D-type flip flop

Registers

Reg4

Shift registers

Serial data transfer

Receiving serial data

Sending serial data

Registers in modern hardware

Number systems

Definition

Decimal

Binary

Hexadecimal

Octal

BCD

1

00001

0x1

1

0001

2

00010

0x2

2

0010

3

00011

0x3

3

0011

4

00100

0x4

4

0100

5

00101

0x5

5

0101

6

00110

0x6

6

0110

7

00111

0x7

7

0111

8

01000

0x8

10

1000

9

01001

0x9

11

1001

10

01010

0xA

12

0001 0000

11

01011

0xB

13

0001 0001

12

01100

0xC

14

0001 0010

13

01101

0xD

15

0001 0011

14

01110

0xE

16

0001 0100

15

01111

0xF

17

0001 0101

16

10000

0x10

20

0001 0110

Convert decimal to binary

Convert binary to decimal

Bits

1

0

0

1

1

0

Square numbers (in decimal)

32

16

8

4

2

1

Convert hexadecimal to binary

Hex

3

F

1

Decimal

3

15

1

Binary

0011

1111

0001

Convert octal to binary

Octal

2

2

6

Binary

010

010

110

Convert binary coded decimal (BCD) to decimal

Arithmetic

Unsigned integers

Signed integers

Sign and magnitude

2’s complement

1’s complement

Biased offset

Binary

Decimal (unsigned int)

Decimal (biased offset 6)

0000

0

-6

0001

1

-5

0010

2

-4

0011

3

-3

0100

4

-2

0101

5

-1

0110

6

0

0111

7

1

1000

8

2

1001

9

3

1010

10

4

1011

11

5

1100

12

6

Floating point

Binary digits

1

0

1

0

0

0

Decimal representation

1

1/2

1/4

1/8

1/16

1/32

Binary digits

0

0

0

1

0

1

Decimal representation

1

1/2

1/4

1/8

1/16

1/32

Exponent field = 0

Rounding modes

Addition / Subtraction

Multiplication / Division

Variable types to bitwidths and ranges

Gates

Gate sizes

Gate speed (delays)

Glitch generator

Inertial delay

Karnaugh maps

Addition

Full adder

Ripple adder

Carry-lookahead

Multicycling

Binary multiplication

Booth multiplication algorithm

P

B

B-1 

0000

0010

0

P

B

B-1 

0000

0010

0

0000

0001

0

P

B

B-1 

0000

0010

0

0000

0001

0

1010

0001

0

P

B

B-1 

0000

0010

0

0000

0001

0

1010

1101

0001

0000

0

1

P

B

B-1 

0000

0010

0

0000

0001

0

1010

1101

0001

0000

0

1

0011

0001

0000

1000

1

0

P

B

B-1 

0000

0010

0

0000

0001

0

1010

1101

0001

0000

0

1

0011

0001

0000

1000

1

0

0000

1100

0

ARM and Assembly

ARM instruction set

Thumb2

ARM registers

Simple instruction examples

mov r2, r1
mov r4, #7
add r3, r1, r2
subs r3, r1, r2
B <label>
cmp r1, #7
ble <label>

r2 = r1
r4 = 7
r3 = r1 + r2
r3 = r1 – r2
Branch to <label>
Compare r1 with 7
Branch if less or equal

Status register

Memory access

Table

MOV r1, #Table
LDR r0, [r1]

Puts the address of Table into r0 then…

Loads word from address stored in r1 into r0

This is like r0 = Table[0]  

Offsetting

LDR r0, [r1,8]

Adds 8 to the address r1 then reads the word
So this is like r0 = Table[8]

In this case r1 contains a memory address

Looping

Stack

Subroutine return address

Subroutine call with interrupt

Return address using stack

Return address using LR register

LR

Networking

What is a network

Layered network models

Layer encapsulation

Physical & Link layer

Network access layer

Internet layer

IPv4

NAT & NAPT

IPv6

ICMP / ICMPv6

Transport layer

TCP and UDP

TCP flow / congestion control

Application layer

#create an INET, STREAMing socket
s = socket.socket( socket.AF_INET, socket.SOCK_STREAM)

#now connect to the web server on port 80
s.connect(
"www.myself.me", 80)

Naming and addressing

DNS & ARP

DNS

ARP / NDP

$ arp -a 152.78.65.254
iam-router.core.ecs.soton.ac.uk (
152.78.65.254) at 00:14:1b:3d:2c:00 [ether] on eth0

Routing

Multiple paths

Switches

Other concerns

Monitoring

The Internet of things

Instruction sets

Interpretation of binary data

Instruction set

Elements of an instruction

Instruction types

Arithmetic instructions

Shift and rotate operations

Logical

Transfer of control

Input / Output

Types of operand

Instruction set architectures (ISA)

Accumulator ISA

Stack based ISA

Register-memory ISA

Register-register ISA

Memory-memory ISA

Alignment

Endedness

0x900

0x901

0x902

0x903

0x12

0x34

0x56

0x78

0x900

0x901

0x902

0x903

0x78

0x56

0x34

0x12

Address modes

Immediate addressing

Direct addressing

Indirect addressing

Multilevel indirect addressing

Register addressing

Register indirect addressing

Displacement addressing

Scaled displacement addressing

Orthogonality

Completeness

Operating Systems & Virtual Machines

About operating systems

Layers and views of a computer system

Kernel services

Memory management

Task management

File management

Device management

Essential hardware features for an OS

Memory protection

Timer

Privileged instructions

Interrupts

Scheduling

Context switching

Process control block

Memory management

Swapping

Partitioning

Physical and logical addresses

Paging

Virtual memory

Page table sizes

Translation lookaside buffer

Demand paging

Segmentation

Amdahl’s Law

The principle of locality

Disk caches