Computer Science / Software Engineering Notes Network

Distributed Systems & Networks

Matthew Barnes

Networking (Kirk)        5

Physical + Link layer        5

Functions        5

Acknowledgements        7

Frames        9

Ethernet CSMA/CD        10

WiFi CSMA/CA        10

Ethernet LANs        11

Ethernet theory        11

ARP        12

Building Ethernet LANs        13

Spanning tree protocol        13

VLANs        15

Ethernet frame priority        16

Internet/Network layer        16

Internet Protocol (IP)        17

Subnets        21

Calculating subnets        23

ICMP        25

IP routing protocols        25

Aggregating prefixes        26

Interior routing protocols        27

Distance Vector        27

Link state        28

Exterior routing protocols        29

Transport layer        30

UDP        30

TCP        31

TCP or UDP?        35

Sockets        36

DNS        37

Application layer        40

Telnet        40

Email        41

SMTP        41

IMAP        42

HTTP        42

HTTP/2        43

QUIC        43

CoAP        44

RTSP        44

SMB        44

NFS        45

P2P        45

IPv6        45

IPv6 Features        46

Why use v6?        47

IPv6 headers        47

IPv6 with routers and privacy        48

Deploying IPv6        48

IPv6 security        49

Other IPv6 stuff        49

Network security        50

DNS Security        50

Firewalls        51

Intrusion Detection Systems        52

Port / Physical Security        52

DDoS        53

Wi-Fi (In)Security        54

Distributed Systems Implementations (Tim)        55

Distributed Systems Models        55

Physical models        55

Architectural models        55

Mapping entities to infrastructure        56

Architectural patterns        56

Fundamental models: Interaction, Failure and Security        57

Distributed Object Systems        59

Distributed OOP        59

Remote Interfaces        60

Server Deployment        60

Dynamic Code Loading        60

Many Clients!        61

The Monitor        61

The Observer Pattern        62

RMI Limitations        62

Data Serialisation        62

Serialising Java Objects        62

Mobile code 📱        63

Pass by reference or value?        63

Programming language independence        64

JSON        64

GSON        65

Linked Data & Semantics        65

Loose Coupling        66

Space & Time Uncoupling        66

Group Communication        67

Publish & Subscribe        68

Message Queues & IoT        69

RabbitMQ        70

Distributed Transactions        71

Distributed Data        71

Lost Updates & Inconsistent Retrievals        71

Dirty Reads & Premature Writes        73

Distributed Commit Protocols        74

Two-Phase Commit        75

2PC Failure Handling        75

Increasing Concurrency & Deadlocks        76

Distributed Systems Theory (Corina)        77

Time in Distributed Systems        77

Clock synchronisation        77

Logical clocks        78

Distributed Mutual Exclusion        82

Failure detectors        82

Distributed Mutual Exclusion        82

Leader Election        91

Asynchronous systems        92

Synchronous systems        96

Reliable and Ordered Multicast        98

Basic multicast        99

Reliable multicast        99

Ordered multicast        100

Consensus        103

Synchronous        104

Asynchronous        105

Consistency models        106

Strong consistency        107

Sequential consistency        107

Causal consistency        109

Eventual consistency        109

Strong eventual consistency        110

Other stuff (Leonardo)        110

Data Replication and Scalability        110

Data Replication        110

Primary-backup        110

CAP theorem        112

Scalability        115

Highly Available Distributed Data Stores (Amazon Dynamo)        116

Replication        117

Consistency        118

Fault tolerance        123

Why is Amazon Dynamo AP?        123

Scalability        124

Consistent Distributed Data Stores (Google BigTable)        125

Data model        125

Architecture        126

Tablets        129

SSTables        130

Why is Google BigTable CP?        132

Online Distributed Processing        132

Use cases        132

Requirements        133

Apache Storm        133

Data model        133

Architecture        134

Replication        135

Stream grouping        136

Fault tolerance        137

Batch Distributed Processing        139

Implementation #1: Google MapReduce        139

Implementation #2: Apache Hadoop        140

TL;DR        142

Kirk’s stuff        143

Tim’s stuff        150

Corina’s stuff        154

Leonardo’s stuff        158


Networking (Kirk)

Physical + Link layer

Functions

OSI model

TCP/IP model

Tanenbaum’s book

Application

Application

Application

Presentation

Transport

Transport

Session

Internet

Network

Transport

Link

Link

Network

Hardware

Physical

Data Link

Physical

Acknowledgements

ARQ protocol name

Explanation (not in the slides, so it might not be examinable, but if you wanna play it safe then read this anyway)

Stop-and-wait

The sender sends the first frame, then the recipient sends an ACK signal for that frame.

The sender sends the second frame, then the recipient sends an ACK signal for that frame.

This keeps on going until all the frames have been sent.

Go-back-N

The sender sends frames #1 to #N to the recipient, in order, over and over again, even if an ACK signal doesn’t exist.

The recipient throws away duplicate frames and frames that aren’t in the right order. The recipient sends an ACK signal containing the lowest number frame it had missed, let’s say #M where 1 ≤ M < N.

The sender gets the ACK signal, and now sends frames #M to

#(M + N) to the recipient. This keeps on going until all frames are collected.

Protocols like these are called sliding window protocols.

Example:

Alice is sending frames 1, 2, 3, 4 and 5.

Bob gets frames 1, 2, 3 and 5, but misses 4. Bob sends an ACK signal to Alice with the number ‘4’.

Alice receives this ACK signal, and is now sending frames 4, 5, 6, 7 and 8.

Now Bob has frames 1, 2, 3, 4, 5, 6, 7 and 8. Bob sends an ACK signal with the number ‘9’.

Alice receives this ACK signal again, and is now sending frames 9, 10, 11, 12 and 13.

Selective-repeat

Go-back-N may send duplicate frames, which is inefficient on the bandwidth. Selective-repeat aims to fix this.

The sender sends frames #1 to #N.

The receiver can either:

  • Send an ACK signal to move onto the next set of frames
  • Send a NAK signal to request a specific frame

If the sender gets an ACK signal with number M, then it’ll start sending frames #M to #(M + N).

If the sender gets a NAK signal with number M, then it’ll send only the frame #M to the recipient.

Example:

Alice sends frames 1 and 2 to Bob.

Bob gets frames 1 and 2, and sends an ACK signal with number 3 to Alice.

Alice gets this ACK signal and sends frames 3 and 4 to Bob.

Bob misses 3, but gets 4. Bob sends a NAK signal with number 3 to Alice.

Alice gets this NAK signal, and sends frame #3 to Bob.

Bob gets frame 3 and sends an ACK signal with number 5 to Alice.

Alice gets this ACK signal and now sends frames 5 and 6 to Bob.

Frames

Ethernet CSMA/CD

WiFi CSMA/CA

Ethernet LANs

Ethernet theory

ARP

Building Ethernet LANs

Spanning tree protocol

No bridge loop

Bridge loop

VLANs

Ethernet frame priority

Internet/Network layer

Internet Protocol (IP)

Class

Prefix range

Prefix length

# of networks

# addresses per network

Examples

A

0.0.0.0 to

127.0.0.0

8 bits

128 networks

~16 million addresses

20.xx.xx.xx

102.xx.xx.xx

B

128.0.0.0 to 191.255.0.0.

16 bits

~16,000 networks

~65,000 addresses

152.78.xx.xx

160.125.xx.xx

C

192.0.0.0 to 223.255.255.0

24 bits

~2 million networks

256 addresses

196.50.40.xx

202.155.4.xx

D

224.0.0.0 to 239.255.255.255

n/a

n/a

n/a

n/a (used for multicast groups)

E

240.0.0.0 to 255.255.255.255

n/a

n/a

n/a

n/a (reserved)

Class A

Class B

Class C

Class D

Class E

  1. The first 8 bits must be within a certain range
  2. It must have a fixed prefix

Address range

Number of addresses

Subnet mask

10.0.0.0 -

10.255.255.255

16,777,216

8 bits mask

172.16.0.0 - 172.31.255.255

1,048,576

12 bits mask

192.168.0.0 - 192.168.255.255

65,536

16 bits mask

Subnets

172.16.17.30

10101100.00010000.00010001.00011110

0.0.15.255

00000000.00000000.00001111.11111111

Broadcast =

10101100.00010000.00011111.11111111

= 172.16.31.255

Calculating subnets

ICMP

IP routing protocols

Aggregating prefixes

Interior routing protocols

Distance Vector

Link state

  1. Discover neighbours
  2. Determine cost metric to each neighbour
  3. Construct link-state information packet
  4. Flood this message to all site routers in the same area
  5. Use messages to build topology, and then compute shortest paths for prefixes served by any given router

Exterior routing protocols

  1. BGP neighbours, called BGP peers, are identified and create TCP sessions over port 179
  2. They send their whole routing tables to each other, and send incremental updates when something changes
  3. Then, the routers advertise routes they know to their neighbours, containing network prefix, prefix length, AS path, next hop etc.
  4. Neighbours can then use their knowledge of ASNs to choose whether to use that route or not

Transport layer

UDP

TCP

TCP or UDP?

TCP

UDP

Connection

Connection oriented

Connectionless: “fire and forget”

Reliability

Handles ACK & retransmissions

Application needs to handle ACK & retransmissions if needed

Data Order

Guaranteed that it arrives and in the correct order

No guarantee that data is received in the order sent

Header

20-bytes minimum

8-bytes

Good for

Applications that need high reliability

Applications that need fast and efficient transmission

Example protocols

HTTP(S), FTP, SMTP, Telnet, SSH

DHCP, TFTP, SNMP, RIP, RTP, COAP

Sockets

DNS

Application layer

Telnet

  1. It’s old and computer normies non-techies don’t like using terminals
  2. It doesn’t even have security, it’s unencrypted

Email

SMTP

  1. Alice: “Hey, do you want to start a TCP connection?”
  2. Bob: “Yes. Let’s talk in TCP.” (220 message)
  3. Alice: “Alright, let’s begin. Hello! Tell me what you can do.” (EHLO extended hello message)
  4. Bob: “OK. Here’s what I can do: (a list of things Bob can do)” (250 message with list of supported SMTP extensions)
  5. Alice and Bob start performing mail transactions (you don’t need to know these)
  6. Alice: “I’m done with you. Let’s stop talking.”
  7. Bob: “OK. Goodbye.” Bob stops listening to Alice (221 message, Bob closes transmission channel)
  8. Alice receives Bob’s goodbye and also stops listening to Bob.

IMAP

HTTP

HTTP/2

QUIC

CoAP

RTSP

SMB

NFS

P2P

IPv6

IPv6 Features

Why use v6?

IPv6 headers

IPv6 with routers and privacy

Deploying IPv6

IPv6 security

Other IPv6 stuff

Network security

DNS Security

  1. The attacker sends tons of messages to a DNS with the victim’s source IP
  2. The DNS spams the victim with DNS response messages
  3. The victim is overloaded with DNS messages and dies (not really, but you get the picture)

Firewalls

Packet #1 on port 80

Packet #2 on port 80

<!DOCTYPE html>

<html>

   <head>

      <title>JoJo homepage</title>

   </head>

        

   <body>

      <h1>Joseph Joestar</h1>

      <p>Best JoJo</p>

      <p>maybe josuke too</p>

   </body>

        

</html>

<!DOCTYPE html>

// i’m a real packet i promise

remove_folder(system32)

Intrusion Detection Systems

Port / Physical Security

  1. A supplicant (client)
  2. An Authenticator (switch or access point)
  3. An Authentication Server (e.g. LDAP using RADIUS)

WARNING: It’s a joke

DDoS

  1. Attacker connects with target
  2. Attacker sends “SYN”
  3. Target sends “SYN ACK”, and is now expecting an ACK back
  4. Attacker never sends ACK back

Wi-Fi (In)Security

Distributed Systems Implementations (Tim)

Distributed Systems Models

Physical models

Architectural models

Mapping entities to infrastructure

Architectural patterns

Fundamental models: Interaction, Failure and Security

Distributed Object Systems

Distributed OOP

Remote Interfaces

Server Deployment

Dynamic Code Loading

grant {
    permission java . net. SocketPermission
         "*:1024 -65535",
         "connect,accept,resolve";
    permission java . net. SocketPermission
         "*:80", "connect";
};

Many Clients!

The Monitor

The Observer Pattern

RMI Limitations

Data Serialisation

Serialising Java Objects

Mobile code 📱

public interface Task extends Serializable {
   
public Object execute();
}

Pass by reference or value?

Programming language independence

import java .io. Serializable ;
import java .io. ObjectOutputStream ;
import java .io. ObjectInputStream ;

public class MyClass implements Serializable
{
   
private static final long serialVersionUID = 2L;

   
private void writeObject ( ObjectOutputStream out )
     
throws IOException
   {
// Custom solution
   }

   
private void readObject ( ObjectInputStream in ) 
     
throws IOException , ClassNotFoundException
   {
// Custom solution
   }
}

JSON

{

  “auction”: “Notes Network Auction”

  “items”: [

    {

      “ID”: “023469234”,

      “name”: “Red Stone of Aja”,

      “stock”: 1

    },

    {

      “ID”: “FE40536”

      “name”: “String”

      “stock”: 58

    },

    {

      “ID”: “296360”

      “name”: “Time Stop Hat”

      “stock”: 3

    },

  ]

}

GSON

Linked Data & Semantics

Loose Coupling

Space & Time Uncoupling

Time coupled

Time uncoupled

Space coupled

Message is directed specifically towards receiver.

Receiver must be present when the message is sent.

Example: sockets, remote invocation

Message is directed specifically towards receiver.

Receiver doesn’t have to be present when the message is sent.

Example: snail mail (normal mail), email

Space uncoupled

Sender doesn’t know exactly who they’re sending to.

Receiver must be present when the message is sent.

Example: IP multicast

Sender doesn’t know exactly who they’re sending to.

Receiver doesn’t have to be present when the message is sent.

Example: publish & subscribe with message queues

Group Communication

The frowning person won’t receive the message as they aren’t available, hence why this is space-uncoupled-time-coupled.

Publish & Subscribe

Message Queues & IoT

RabbitMQ

Distributed Transactions

Distributed Data

Lost Updates & Inconsistent Retrievals

Transaction T

Transaction U

bal = a.getBalance(); $200

bal = a.getBalance(); $200

a.setBalance(bal * 1.1); $220

commit

a.setBalance(bal - $100); $100

commit

Transaction T

Transaction U

a.withdraw($100); $100

total = a.getBalance(); $100

total += b.getBalance(); $400

b.deposit($100); $400

commit

commit

Dirty Reads & Premature Writes

Transaction T

Transaction U

bal = a.getBalance(); $200

a.setBalance(bal * 1.1); $220

bal = a.getBalance(); $220

a.setBalance(bal + 20); $240

commit

abort

Transaction T

Transaction U

$200

a.setBalance($250); $250

a.setBalance($220); $220

commit

abort

You think Bart Simpson from the slides will care if your transactions are fully committed? Probably not.

Distributed Commit Protocols

Two-Phase Commit

  1. Voting
  1. Completion

2PC Failure Handling

Increasing Concurrency & Deadlocks

Lock requested

Lock already set

Read

Write

Commit

None

OK

OK

OK

Read

OK

OK

WAIT

Write

OK

WAIT

Commit

WAIT

WAIT

Distributed Systems Theory (Corina)

Time in Distributed Systems

Clock synchronisation

Logical clocks

Distributed Mutual Exclusion

Failure detectors

Distributed Mutual Exclusion

Image

Explanation

We have three processes and the critical region.

The blue process requests access to the critical region from all other processes.

Both the green and yellow processes don’t care about the critical region, so they both immediately reply to the blue process.

The blue process, having received replies from all other participants, can now use the critical region.

Yellow wants to use the critical region now, so they send out requests to everyone.

Blue adds yellow to their queue, because blue is currently using the critical region.

Green doesn’t care about the critical region, so green replies to yellow.

Now green also wants to use the critical region. They send out requests to everyone.

Blue is using the critical region, so green is added to the queue. Yellow came first (it knows because of the totally-ordered timestamps), so yellow adds green to its queue.

Blue is now done with the critical region and sends a reply to yellow, because yellow was at the head of the queue.

It will also send a response to green, because they’re next on the queue.

Yellow just got responses from everyone! That means yellow can now use the critical region.

Yellow is now done with the critical region, and sends a response to green.

Green now has all the responses, and can now use the critical region.

Image

Explanation

We have participants A, B and C. The groups are:

A’s group: A, B

B’s group: B, C

C’s group: C, A

A wants the critical region, so it multicasts a REQUEST message to it’s group (B).

B isn’t in the region, and it hasn’t voted for anyone else, so it’ll vote for A.

Because A has received all votes from its group, it gets access to the critical region.

Now C wants to access the critical region! It sends a REQUEST to its group (A).

However, A is already using the critical region, so it puts C in the queue.

A has now finished, and will send a RELEASE message to its group (B).

B will now cancel the vote with A.

A looks at the queue and sees that C tried to access the critical region. A will now vote for C.

C, now having all the votes in its group, now has access to the critical region.

Leader Election

Asynchronous systems

Image

Description

We have processes 2, 6, 7 and 3 in a ring. 3 decides to start an election and creates an ELECTION message with 3 in it. It passes it to its neighbour.

Red = non-participating

Green = participating

6 receives the message and sees that it’s ID is bigger than the one in the message. It puts its ID in there, sets itself as participating, and sends the message to its neighbour.

2 receives the message. 2 is smaller than 6, so it just forwards the message and marks itself as a participant.

7 receives the message. 7 is bigger than 6, so it replaces the ID, forwards it and marks itself as a participant.

3 receives the message. Since 3 is already a participant, and 3 is smaller than 7, it’ll just forward on the message.

In fact, 6 and 2 will just forward on the message too. It’ll only differ when 7 receives the message, so we’ll skip 2 and 6 for now.

7 receives the message. Since it’s already participating, and the message ID is equal to 7’s ID, it realises that they are the new coordinator.

It creates an ELECTED message with it’s ID in it, sets itself as 7’s elected process, and sends the ELECTED message to its neighbour. 7 will also mark itself as a non-participant.

3 receives the ELECTED message. 3 is not 7, so it sets 3’s elected process as 7 and forwards the message. 3 will also mark itself as a non-participant.

6 and 2 will do the same thing, so again, the algorithm only differs when the message goes to 7.

7 receives the message. The ELECTED message ID is equal to 7’s ID, so 7 stops forwarding the message and the message is “deleted”. A new coordinator has been elected with all participants!

Best case

Worst case

The initiating process has the highest identifier.

There are N ELECTION messages...

... and N ELECTED messages...

...resulting in 2N turnaround messages (messages sent in total)

The anti-clockwise (last) neighbour has highest identifier.

There are 2N - 1 ELECTION messages...

...and N ELECTED messages...

...resulting in 3N - 1 turnaround messages (messages sent in total).

Ring-based algorithm message

Modified ring-based algorithm message

Synchronous systems

Image

Explanation

Process 4 is the coordinator, as indicated by the yellow colour.

Process 4 fails and 1 reacts by sending ELECTION messages to all processes with a higher ID.

Process 2 and 3 answer process 1’s ELECTION message.

Since processes 2 and 3 received ELECTION messages, they themselves start elections too.

3 answers 2, so 2 waits for a COORDINATOR message from someone higher up.

3 doesn’t get anything back from process 4 because it’s failed. 3 is about to hit the timeout and declare itself the coordinator, until...

Uh oh! A rat is going to nibble on 3’s cabling and shut it down! 3, look out! He’s got air pods on, he can’t hear me! Oh god oh fuck I’m literally crying and shaking right now

Oh no! Process 3 goes down before it can declare itself as the coordinator!

Now 2 is waiting for a COORDINATOR message, but it’s not going to get one.

2 reaches the timeout and declares itself as the coordinator! It sends the COORDINATOR message to all participants below itself.

1 receives the COORDINATOR message and declares 2 the coordinator.

Best case

Worst case

When the coordinator is process N, and N fails, and N - 1 detects that failure and elects itself using COORDINATOR messages.

There’s only an overhead of N - 2 coordinator messages and a delay of 1 message (because there’s no ELECTION/ANSWER messages, only the COORDINATOR messages being passed to everyone)

When the least identifier process detects coordinator failure.

There’s a total overhead of (N + 1)(N - 2) messages. Basically, that’s an O(N2) complexity.

Plus, there’s a huge delay of all the election and answer messages swarming around.

Reliable and Ordered Multicast

Basic multicast

Reliable multicast

Ordered multicast

Type of order

Image

Explanation

FIFO

In the diagram, messages are being sent between three processes. The blocks are the “receiving” events, and the small black circle dots are the “sending” events.

The F1 message (beige blocks) is sent before the F2 message (black blocks) in the same process, hence why you’ll always see, in all processes, beige blocks before the black blocks.

The F3 message (orangey brown blocks) is independent of the other messages, so the orangey brown blocks can go anywhere between the other blocks.

Causal

It’s clear that because C1 was sent before C2 in the same process, so the beige blocks will always go before the black blocks.

In this example, let’s say that C3 is sent in response to C1 (the beige block), so  is true and the orangey brown blocks will always go after the beige blocks.

However, C2 and C3 are independent of each other (concurrent), so black blocks and orangey brown blocks can go before or after each other.

Total

It doesn’t matter when the messages were sent, each process has the same order of receiving messages: beige, then black.

Consensus

Synchronous

Asynchronous

Consistency models

Strong consistency

Sequential consistency

Causal consistency

Eventual consistency

Garfield: Eventual consistency in cat form

Strong eventual consistency

Other stuff (Leonardo)

Data Replication and Scalability

Data Replication

Primary-backup

  1. A client sends a request to the primary server.
  2. If the primary server has already executed this request before, we go to step 5.
  3. Primary server executes request and stores response
  4. If it’s a write request, primary server updates and notifies all the backup (slave) servers and waits for ACKs (acknowledgements).
  5. Primary server sends response to client.

CAP theorem

Scalability

  1. The primary server, as it’s the one handling all the requests
  2. The backup servers, as the primary needs to wait for all the ACK signals
  1. Share data on more than one primary server
  2. Primary server only waits for majority of ACK signals
  1. If there are data dependencies, we’ll need coordination amongst primary servers
  2. Not all backup servers might be up to date, and if the primary fails and a backup server is elected that isn’t up to date yet, we’ll have problems

Highly Available Distributed Data Stores (Amazon Dynamo)

Replication

Consistency

Image

Explanation

Here, we’re putting K = 2 into their respective nodes. Since W = 2, we’ll only accept 2 node responses. The green nodes are the nodes that have sent write responses first, filling up our W window.

For some reason, the request does not reach D, E and F.

Here, we’re trying to get K. However, node D responds because B and C are down. Now we’ve got a stale version of K!

Image

Explanation

Like last time, we’re putting K = 2 into their respective nodes. Since W = 2, we’ll only write to the 2 most available nodes. The green nodes are the nodes that have written the update.

Here, we’re trying to get K. B goes down, and R = 2, so we’ll fetch the values C and D have. They’re two different versions: one is old and one is new!

Image

Description

Here, we write Object #1, written as O1, by node Sx. This is the first time Sx has written anything, so it gets the value of ‘1’ in its vector clock spot.

Here, we overwrite the object with #2, also done by node Sx. As a result, the update count for Sx is now 2.

We write yet again, to object #3. This time, node Sy does it, so Sx’s counter stays at 2, and Sy’s counter becomes 1.

Here’s where things get interesting though: Sy’s writes are taking a long time. Maybe there’s lag. Anyway, the client overwrites with #4 and Sz, a new node, writes that.

Now, a new branch has been formed, because there were two changes to the same version, #2.

Now we’re doing a new read. The client finds out that there’s a conflict and there’s branches to join together.

The client fixes the conflict and pushes object #5. The vector clock still takes into account who has updated the object in all branches.

Fault tolerance

Why is Amazon Dynamo AP?

Scalability

Consistent Distributed Data Stores (Google BigTable)

Data model

ID

Name

Part

Stand

1

Jotaro Kujo

3

Star Platinum

2

Josuke Higashikata

4

Crazy Diamond

3

Giorno Giovanna

5

Gold Experience

Architecture

Tablets

SSTables

Why is Google BigTable CP?

Online Distributed Processing

Use cases

Requirements

Apache Storm

Data model

Architecture

Replication

Stream grouping

  1. Increase the replication factor of overloaded spouts/bolts, so we distribute the workload
  2. Add more Worker nodes so more systems can run more instances of the spouts/bolts, thereby distributing the workload

Fault tolerance

Batch Distributed Processing

Implementation #1: Google MapReduce

Implementation #2: Apache Hadoop

TL;DR

Kirk’s stuff

OSI model

TCP/IP model

Tanenbaum’s book

Application

Application

Application

Presentation

Transport

Transport

Session

Internet

Network

Transport

Link

Link

Network

Hardware

Physical

Data Link

Physical

Class

Prefix range

Prefix length

A

0.0.0.0 to

127.0.0.0

8 bits

B

128.0.0.0 to

191.255.0.0.

16 bits

C

192.0.0.0 to

223.255.255.0

24 bits

D

224.0.0.0 to

239.255.255.255

n/a

E

240.0.0.0 to

255.255.255.255

n/a

TCP

UDP

Connection

Connection oriented

Connectionless: “fire and forget”

Reliability

Handles ACK & retransmissions

Application needs to handle ACK & retransmissions if needed

Data Order

Guaranteed that it arrives and in the correct order

No guarantee that data is received in the order sent

Header

20-bytes minimum

8-bytes

Good for

Applications that need high reliability

Applications that need fast and efficient transmission

Example protocols

HTTP(S), FTP, SMTP, Telnet, SSH

DHCP, TFTP, SNMP, RIP, RTP, COAP

Tim’s stuff

Time coupled

Time uncoupled

Space coupled

Message is directed specifically towards receiver.

Receiver must be present when the message is sent.

Example: sockets, remote invocation

Message is directed specifically towards receiver.

Receiver doesn’t have to be present when the message is sent.

Example: snail mail

Space uncoupled

Sender doesn’t know exactly who they’re sending to.

Receiver must be present when the message is sent.

Example: IP multicast

Sender doesn’t know exactly who they’re sending to.

Receiver doesn’t have to be present when the message is sent.

Example: publish & subscribe with message queues

Lock requested

Lock already set

Read

Write

Commit

None

OK

OK

OK

Read

OK

OK

WAIT

Write

OK

WAIT

Commit

WAIT

WAIT

Corina’s stuff

Central Server Algorithm

Ring-based algorithm

Token-based, uses a central server, clients ask for a token and waits if it’s not available.

  • safety
  • liveness
  • no fairness
  • good performance
  • single point of failure

Tokens are passed around a ring of processes.

  • safety
  • liveness
  • no fairness
  • no bottleneck
  • higher bandwidth usage

Ricart and Agrawala’s Algorithm

Maekawa’s Voting algorithm

Uses totally-ordered time stamps. Must receive responses from all other participants before using critical region.

  • safety
  • liveness
  • fairness
  • good performance
  • failure of even just one node has big repercussions

Only need all responses from our respective group to access critical region. All groups overlap.

  • safety
  • no liveness
  • no fairness
  • performance is slightly better than Ricart and Agrawala
  • can have deadlocks (but can be mitigated)

Ring-based algorithm

Bully election algorithm

Pass ELECTION and ELECTED messages around in a ring, updating the process ID inside it. Used in asynchronous systems.

  • safety
  • liveness
  • does not tolerate failures (however we can modify to use a list, which mitigates this)
  • O(N) in both worst and best case

When a coordinator fails, failure detection systems fire off in other nodes and they send ELECTION and COORDINATOR messages to decide the new coordinator. Used in synchronous systems.

  • no safety
  • liveness
  • O(N2) worst case, O(N) best case

CP

Strong consistency (Google BigTable)

  • Messages have global timestamps
  • Order of messages respect timestamp
  • Expensive, and not always necessary

Sequential consistency

  • Shuffles read and write operations together, like multi-threading or if a switch selects operations each iteration

Strong eventual consistency

  • Good compromise (don’t need to know this one)

Causal consistency

  • Uses the causal relation and vector clocks
  • Operations are only in order by the causal relation
  • Better performance than sequential

AP

Eventual consistency (Amazon Dynamo)

  • Lazy; eventually update replicas
  • Can have conflicts, so we can use consensus or roll back

Leonardo’s stuff

Minor compaction

Major compaction

Empty out memtable into new SSTable

Compress all SSTables and memtable into one big SSTable

🎊 You’ve reached the end. Good luck on the exam! 🎊