Social Computing Techniques (COMP3208) Notes Document

Introductory Lecture
Module Overview
Coursework
Additional Links and Resources
Recommender Systems
Where to Find What
What is Social Computing?
What is Crowdsourcing?
Crowdsourcing Examples
CAPTCHA
NoiseTube
Amazon Mechanical Turk
Some other examples
Recommender Systems
Online Advertising Auctions
Summary
Past & Present
Terminology
History of Social Computing
1714 - Longitude Prize
1791 - Gaspard de Prony’s Logarithmic Tables
Wisdom of the Crowds
1906 - Sir Francis Galton
1986 - Space Shuttle Challenger Disaster
Early Social Computing Systems
Commercialisation
Web 2.0
Rise of Crowdsourcing
Crowdsourcing for Social Good
Crowdsourcing and Machine Learning
Why Do We Still Use Social Computing?
Features of Computing Systems
Recommender Systems
Learning Outcomes
Resources
What are Recommender Systems?
Examples of Recommender Systems
Why use Recommender Systems?
Potential Issues
Paradigms of Recommender Systems
Content-based
Knowledge-based
Collaborative Filtering
Hybrid
Pros and Cons of Different Approaches
Similarity Coefficients
“Cleaning” Features
Calculating Similarity
Exercise
Self-Test Question Answers
Question 1
Question 2
Question 3
User-Based Collaborative Filtering
Similarity Measure (for Ratings)
Pearson Coefficient Example
Neighbourhood Selection
Prediction
Improving Prediction
Produce Recommendation
Item-Based Collaborative Filtering
Similarity Measure (for Ratings)
Adjusted Cosine Similarity Example
Prediction
Exercise 2
Alternative Approaches
Simple Approach: Slope One
Types of Ratings
Example: Items Bought Together
Evaluating Recommender Systems
Correct and Incorrect Classifications
Metrics
Precision
Recall
Accuracy
Rank Score
Error Measures
Online Auctions
Where Are Auctions Used and Why?
Sponsored Search
Display Advertising
Why Use Auctions?
Auction Design
Auction Types
English Auction
Dutch Auction
Sealed-Bid First Price Auction
Sealed-Bid Second Price Auction (Vickrey Auction)
Dominant Strategy
Formal Model and Utility
Computing Expected Utilities
Reputation Systems
Trust Issues…
Terminology
Cherries and Lemons
The Lemon Problem
Moral Hazard
Trust on the Web
The Solution?
Reputation vs. Recommender Systems
Design Considerations
Existing Systems
Reputation Value
Measuring Confidence
Probabilistic Approach
Using the Beta Distribution
Evaluating Reputation Systems
Issues
Financial Incentives
Experimental Design
Human Computation
Modern Recommendation Systems
Incentives in Crowd
Rank Aggregation
Experts Crowdsourcing
Revision Lectures
tl;dr
Introductory Lecture
Past & Present
Recommender Systems
Online Auctions
Reputation Systems
Additional Lectures
Introduction to Coursework
Assignments
Useful Resources
Forming Groups (of 2 or 3)
Task Datasets
Submitting Computed Ratings
Written Report
Academic Integrity
Coursework Surgery Sessions
Coursework “Surgery” 1
Coursework “Surgery” 2
Coursework “Surgery” 3
Coursework “Surgery” 4
Guest Lecture: Davide Zilli from Mind Foundry
Guest Lecture: Mikhail Fain from Cookpad

Introductory Lecture

Module Overview

The primary module aim is to investigate how online collective intelligence and “crowd wisdom” is used in various business and research areas, and the underlying technologies/algorithms used in practice.

Main topics:

Learning outcomes:

There is a strong applied component to the module (i.e. learning how shit is actually used). Note that it’s not about social networks (that’s for COMP6250: Social Media and Network Science).

Assessment:

Lecturers:

Coursework

Additional Links and Resources

Recommender Systems

Where to Find What

What is Social Computing?

It’s a widely used concept with no single definition, but according to HP Research Labs:

"Social Computing Research focuses on methods for harvesting the collective intelligence of groups of people in order to realize greater value from the interaction between users and information”

(basically, harvest info from users to learn about them and their interactions with #content)

ECS has done various research in social computing, e.g. enhancing crowdsourcing techniques for sustainable disaster relief; here’s a flex:

What is Crowdsourcing?

Dictionary definition of crowdsource: “obtain (information or input into a particular task or project) by enlisting the services of a number of people, either paid or unpaid, typically via the Internet”. (Basically, get people online to do shit for you)

Topics to be investigated include:

Crowdsourcing Examples

CAPTCHA

NoiseTube

Amazon Mechanical Turk

Some other examples

Recommender Systems

Topics to be investigated include:

Online Advertising Auctions

Topics to be investigated include:

Summary

Past & Present

Terminology

Before diving into the lecture, some useful terms:

History of Social Computing

1714 - Longitude Prize

1791 - Gaspard de Prony’s Logarithmic Tables

Wisdom of the Crowds

As mentioned before, the wisdom of the crowd is the collective opinion of a group of individuals rather than that of a single expert.

1906 - Sir Francis Galton

1986 - Space Shuttle Challenger Disaster

You may be aware of the tragic Challenger disaster in 1986, where the space shuttle exploded 73 seconds after lift-off: https://www.youtube.com/watch?v=j4JOjcDFtBE

As you can imagine, the disaster had a big impact on the stock market at the time, and particularly the stock prices for NASA’s suppliers, which can be seen in the graph below:

NASA’s 4 main suppliers were:

Early Social Computing Systems

Commercialisation

Over time, such systems became more commercial, and if you haven’t been living under a rock you probably know:

Web 2.0

Web 2.0 (also known as Participative/Participatory and Social Web), coined in 1999, refers to sites that emphasize:

Here are a bunch of sites (that you’ve probably heard of unless you’ve just immigrated from Pluto) that boomed in the 00s built on a crowd-generated content model:

Rise of Crowdsourcing

As mentioned before (doesn’t hurt to recap), coined in 2006 by Jeff Howe in Wired Magazine, crowdsourcing is to “obtain (information or input into a particular task or project) by enlisting the services of a number of people, either paid or unpaid, typically via the Internet”. (Basically, get people online to do shit for you)

Here are some notable examples of crowdsourcing initiatives:

Crowdsourcing for Social Good

Crowdsourcing and Machine Learning

Why Do We Still Use Social Computing?

There’s still vast commercial potential (i.e. companies can make $$$) from social computing:

Features of Computing Systems

One can also think of computing features as being either relevant to “traditional” or “social” computing.  Some of the most important characteristics of social computing can be summarized as user-created content where:

When analysing a social computing system, one can ask the questions (common sense really):

Here’s a soulless, unimaginative corporate graphic for your viewing pleasure:

Recommender Systems

Learning Outcomes

Resources

Some of the slides I’m ripping off are based on tutorials by Dietmar Jannach and Gerhard Friedrich, and there are two textbooks for this part of the module:

What are Recommender Systems?

Recommender systems are a type of information filtering system which filters items based on a personal profile. (tl;dr they recommend you stuff)

They’re different from many other information filtering systems (e.g. standard search engines and reputation systems) in that the recommendations are personal to the specific user or group of users (with the theory that we aren’t as unique as we think we are, you’re not special mate).

Recommender systems are given:

The recommender system can then calculate a relevance score used to rank the options, and can finally choose the set of options to recommend to us.

Examples of Recommender Systems

(the 2000s called, they want their OS back)

Why use Recommender Systems?

Recommender systems can provide:

Potential Issues

However, naturally some challenges exist around developing and using recommenders, such as:

Paradigms of Recommender Systems

These are some different forms/models of recommender system, including:

Looking at the first example:

Content-based

Knowledge-based

Collaborative Filtering

Hybrid

Pros and Cons of Different Approaches

Collaborative

Pros:
  • No knowledge-engineering effort
  • Well understood
  • Works well in some domains
  • Can produce “surprising” recommendations

Cons:
  • Requires some form of rating feedback
  • “Cold start” for new users and items
  • Sparsity problems
  • No integration of other knowledge sources

Content-based

Pros:
  • No community required
  • Comparison between items possible

Cons:
  • Content descriptions necessary
  • “Cold start” for new users
  • No “surprises”

Knowledge-based

Pros:
  • Deterministic (i.e. a given set of inputs will always return the same output) recommendations
  • Guaranteed quality
  • No “cold start”
  • Can resemble sales dialogue

Cons:
  • Knowledge engineering effort to bootstrap (the start up cost is high in terms of data needed to make recommendations) - hence it can be very computationally expensive
  • Basically static
  • Doesn’t react to short-term trends

Similarity Coefficients

(FYI, a coefficient is a number in front of a variable or term, e.g. in 6a + 3b + 2x²y, the term coefficients are 6, 3 and 2 respectively)

So how do we mathematically evaluate the similarity between documents?

Below are a couple of simple approaches, computing document similarity based on keyword overlap:

(x and y are two books, and |X| indicates the size of a set X - remember cardinality in Foundations?)

Dice’s coefficient: sim(x, y) = 2|X ∩ Y| / (|X| + |Y|)

Jaccard’s coefficient: sim(x, y) = |X ∩ Y| / |X ∪ Y|

(where X and Y are the sets of keywords appearing in x and y respectively)

The above two coefficients effectively tackle the problem in the same way, just via slightly different calculations. One tends to choose the coefficient that’s most suitable for the dataset at hand / problem being solved.

At the two extremes:

“Cleaning” Features

(FYI for below, an array is a horizontal list of numbers, whereas a matrix is a 2D grid)

There are some potential shortcomings to using keywords; e.g. they:

One mechanism to combat this is to clean the features, basically optimizing the set of keywords for the task at hand, using techniques such as (if you did a machine learning module, here’s some déjà vu):
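The slide’s full list isn’t reproduced here, but as a rough illustration, a minimal Python sketch (my own, assuming typical steps like lowercasing, punctuation stripping, stop-word removal and crude suffix-stripping; the actual pipeline from the slides may differ) of what “cleaning” keywords might look like:

    # Minimal keyword-cleaning sketch (assumed typical steps, not the slides' exact list).
    STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}

    def crude_stem(word):
        # Very rough stemming: strip a few common English suffixes.
        for suffix in ("ing", "ed", "es", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[:-len(suffix)]
        return word

    def clean_keywords(text):
        # Lowercase, strip punctuation, drop stop words, then stem.
        tokens = [t.strip(".,!?\"'()").lower() for t in text.split()]
        return {crude_stem(t) for t in tokens if t and t not in STOP_WORDS}

    print(clean_keywords("The hobbits were walking to Mordor, slowly."))
    # -> {'hobbit', 'were', 'walk', 'mordor', 'slowly'} (set order may vary)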

Calculating Similarity

Exercise

(This exercise taken from the slides, originally from StackOverflow apparently)

Suppose we want to compare the similarity between two documents, in this case two simple sentences (but we can apply the same approach to books or websites):

x: “Julie loves me more than Linda loves me”

y: “Jane likes me more than Julie loves me”

Step 1: Get the frequency of each term (order does not matter) to produce vectors x̄ and ȳ.

x̄ = (Jane: 0, Julie: 1, likes: 0, Linda: 1, loves: 2, me: 2, more: 1, than: 1)

ȳ = (Jane: 1, Julie: 1, likes: 1, Linda: 0, loves: 1, me: 2, more: 1, than: 1)

Though to simplify we can just write them like:

x̄ = (0,1,0,1,2,2,1,1)

ȳ = (1,1,1,0,1,2,1,1)

Step 2: Calculate the similarity, using the Dice, Jaccard and Cosine approaches.

Dice’s coefficient = (2*5) / (6+7) = 10/13

Jaccard’s coefficient = 5/8 

Cosine similarity = (0*1 + 1*1 + 0*1 + 1*0 + 2*1 + 2*2 + 1*1 + 1*1) / (√(0²+1²+0²+1²+2²+2²+1²+1²) * √(1²+1²+1²+0²+1²+2²+1²+1²)) = 9 / (√12 * √10) = (3√30) / 20

Bear in mind that for Dice and Jaccard, we don’t care about the frequency, only if the words appear or not (i.e. the cardinality of the sets, i.e. number of unique elements).
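To make the arithmetic concrete, here’s a minimal Python sketch (not from the slides) that reproduces the three values above; the two sentence strings are the ones reconstructed from the term vectors:

    import math
    from collections import Counter

    # The two sentences, as reconstructed from the term vectors above.
    x_text = "Julie loves me more than Linda loves me"
    y_text = "Jane likes me more than Julie loves me"

    x_words, y_words = x_text.lower().split(), y_text.lower().split()
    x_set, y_set = set(x_words), set(y_words)

    # Dice and Jaccard only care about which words appear (set cardinalities).
    dice = 2 * len(x_set & y_set) / (len(x_set) + len(y_set))   # 2*5/(6+7) = 10/13
    jaccard = len(x_set & y_set) / len(x_set | y_set)           # 5/8

    # Cosine similarity uses the term-frequency vectors.
    x_tf, y_tf = Counter(x_words), Counter(y_words)
    dot = sum(x_tf[w] * y_tf[w] for w in x_set | y_set)
    norm_x = math.sqrt(sum(f * f for f in x_tf.values()))
    norm_y = math.sqrt(sum(f * f for f in y_tf.values()))
    cosine = dot / (norm_x * norm_y)                            # 9/(√12*√10) ≈ 0.82

    print(dice, jaccard, cosine)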

Self-Test Question Answers

If it isn’t obvious, green = correct, red = incorrect (if you’re colourblind have no fear, I’ll also add a tick ✓ and cross ✗).

Question 1

A definition of a recommender system is:

  1. ✓ A type of information filtering system which filters items based on a personal profile
  2. ✓ A type of information filtering system which filters items based on a personal profile. It is different from many other information filtering systems (such as standard search engines and reputation systems), in that the recommendations are personal to the specific user or group of users
  3. ✗ An information filtering system (don’t ask me why this one’s incorrect)
  4. ✗ A special kind of search engine

Question 2

Examples of providers that use recommender systems are:

  1. ✗ Amazon and Netflix (again I don’t understand why this isn’t technically correct but we move)
  2. ✓ Amazon, Netflix and Spotify
  3. ✓ Amazon, Netflix, Spotify and many large supermarkets
  4. ✗ Amazon, Netflix, Spotify and Google search

Question 3

There are many advantages to using recommender systems for both customers and providers. Some of these for the customer are reducing information overload, and helping them discover new things. For the provider, they increase trust and customer loyalty which can in turn increase sales revenue.

User-Based Collaborative Filtering

We touched on collaborative filtering in the previous lecture; focusing more on user-based collaborative filtering...

In terms of requirements, systems like Amazon have millions of items and tens of millions of users, so clearly algorithms need to have good performance independent of these colossal numbers. The algorithms should ideally have:

So what steps does one need to consider for the system?

Naturally however, some issues exist with user-based collaborative filtering such as:

Similarity Measure (for Ratings)

Pearson Coefficient Example

Say we have 6 users as shown in the table below, but users 1 and 2 have not rated some items:

For users 1 and 2, we can calculate the mean average user rating from the items they’ve both rated so far:

Then we can deduct user 1’s average from each of user 1’s ratings, and deduct user 2’s average from each of user 2’s ratings, so we end up with:

Finally we can plug the values into the Pearson coefficient formula (given above) like so, to calculate the similarity between user 1 and 2’s ratings (it looks more complicated than it is!):

As the resulting value of -0.89 is close to -1, we can see that their ratings are strongly negatively/inversely correlated.
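The slide’s table isn’t reproduced here, but a minimal Python sketch of the Pearson calculation (with made-up ratings) shows the mechanics: take each user’s mean over the co-rated items, mean-centre each rating, then divide the dot product of the deviations by the product of their norms:

    import math

    def pearson_similarity(ratings_a, ratings_b):
        """Pearson correlation between two users over the items both have rated.
        ratings_a / ratings_b map item id -> rating; the means are taken over
        the co-rated items only, as in the example above."""
        common = set(ratings_a) & set(ratings_b)
        if len(common) < 2:
            return 0.0  # not enough overlap to say anything
        mean_a = sum(ratings_a[i] for i in common) / len(common)
        mean_b = sum(ratings_b[i] for i in common) / len(common)
        num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
        den = (math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
               * math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common)))
        return num / den if den else 0.0

    # Made-up ratings (not the table from the slides):
    user1 = {"item1": 5, "item2": 1, "item4": 4}
    user2 = {"item1": 2, "item2": 5, "item4": 2}
    print(pearson_similarity(user1, user2))  # ≈ -0.97: strongly inversely correlated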

Neighbourhood Selection

After calculating the similarity between all users, how do we choose which ones to base our recommendation on? We could use different constraints/selection criteria for neighbours, for example:

Prediction

We can then use the given user ratings and similarity calculations to predict unrated items.

For example, say user 2 and user 3 have rated item 3, but user 1 hasn’t:

We can make a prediction of what user 1 will rate item 3, based on user 2 and 3’s ratings of item 3 and those users’ similarity to user 1.

So, given the following user rating average and user similarity results:

...we can simply plug the values into the prediction formula to calculate user 1’s predicted item 3 rating like so:
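The slide’s numbers aren’t shown above, but here’s a minimal sketch of the standard user-based prediction formula (the active user’s average plus the similarity-weighted, mean-centred neighbour ratings), with made-up values:

    def predict_rating(user_mean, neighbours):
        """User-based CF prediction for one unrated item.
        neighbours is a list of (similarity to the active user, the neighbour's
        rating of the item, the neighbour's mean rating):
            pred = user_mean + sum(sim * (rating - mean)) / sum(|sim|)"""
        num = sum(sim * (rating - mean) for sim, rating, mean in neighbours)
        den = sum(abs(sim) for sim, _, _ in neighbours)
        return user_mean if den == 0 else user_mean + num / den

    # Made-up values: the active user's mean is 3.0, and two neighbours
    # (similarities 0.85 and 0.7) have rated the target item.
    print(predict_rating(3.0, [(0.85, 4.0, 3.5), (0.7, 5.0, 4.0)]))  # ≈ 3.73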

Improving Prediction

Some neighbour ratings may be more “valuable” than others; agreement on controversial items can be more informative than on common items.

Some things that can be done to possibly improve the prediction function include:

Produce Recommendation

This is the final step (kinda obvious) for the user-based collaborative filtering recommender to take; it can:

Item-Based Collaborative Filtering

We can also do collaborative filtering based on items!

As with user-based, one needs to consider the main steps for the system, which are similar but of course in terms of items rather than users:

Naturally however, some issues exist with item-based collaborative filtering; pre-processing techniques can be used to mitigate these.

Similarity Measure (for Ratings)

Looking again at the table, say this time we want to find the similarity between two items (rather than users):

Adjusted Cosine Similarity Example

So for example, say we have user ratings for items 3 and 4 like so (as you can see user 1 hasn’t rated both items, so we just ignore this row):

...and we calculate each user’s mean average rating (for all the items each one’s rated) like so:

...we can simply plug these values into the similarity function (summing the products of the respective weighted item ratings, and dividing this result by the product of the two items’ weighted square sums - sounds a lot more complicated than it is, just look at the formula) and simplify the expression for the final result:

As the resulting value of -0.816 is close to -1 (similar to the previous example), we can see that their ratings are strongly negatively/inversely correlated.
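A minimal sketch of the adjusted cosine calculation with made-up ratings (each rating is centred on that user’s mean over all the items they’ve rated, and users who haven’t rated both items are ignored, as in the worked example):

    import math

    def adjusted_cosine(item_i, item_j, ratings):
        """Adjusted cosine similarity between two items.
        ratings maps user -> {item: rating}."""
        common_users = [u for u, r in ratings.items() if item_i in r and item_j in r]
        num = den_i = den_j = 0.0
        for u in common_users:
            user_mean = sum(ratings[u].values()) / len(ratings[u])
            di = ratings[u][item_i] - user_mean
            dj = ratings[u][item_j] - user_mean
            num += di * dj
            den_i += di ** 2
            den_j += dj ** 2
        den = math.sqrt(den_i) * math.sqrt(den_j)
        return num / den if den else 0.0

    # Made-up ratings (not the slide data):
    ratings = {
        "user2": {"item3": 5, "item4": 1, "item5": 3},
        "user3": {"item3": 1, "item4": 5, "item5": 3},
        "user4": {"item3": 4, "item4": 3, "item5": 2},
    }
    print(adjusted_cosine("item3", "item4", ratings))  # ≈ -0.94: strongly negatively correlated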

Prediction

Similarly to the user-based collaborative filtering system, we can predict unrated items.

Exercise 2

(From Blackboard)

Alternative Approaches

Of course, there are other ways of predicting ratings beyond similarity matrices and whatnot…

Simple Approach: Slope One

This approach focuses on computing the average differences between pairs of items; say for instance we have the following table of user item ratings:

...the (mean) average difference between items 1 and 2 is simply (ignoring user 1 and 2’s ratings, who haven’t rated item 2):

We can also calculate a predicted rating for item 2, taking the mean of the average differences between item 2 and every other item that all of the users have rated (so 1, 4 and 5):

Optionally we can adapt the prediction formula to weight each difference based on the number of jointly rated items, like so:
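As a concrete (made-up) illustration of weighted Slope One, where each item-pair difference is weighted by the number of users who rated both items:

    from collections import defaultdict

    def slope_one_predict(ratings, user, target):
        """Weighted Slope One prediction of `user`'s rating for item `target`.
        ratings maps user -> {item: rating}. dev(target, i) is the average of
        (rating of target - rating of i) over users who rated both; each
        difference is weighted by the number of users who co-rated the pair."""
        diff_sum = defaultdict(float)  # sum of (r_target - r_i) over co-raters
        count = defaultdict(int)       # number of users who rated both items
        for r in ratings.values():
            if target not in r:
                continue
            for item, value in r.items():
                if item != target:
                    diff_sum[item] += r[target] - value
                    count[item] += 1

        num = den = 0.0
        for item, value in ratings[user].items():
            if item in count:
                dev = diff_sum[item] / count[item]
                num += (dev + value) * count[item]
                den += count[item]
        return num / den if den else 0.0

    # Made-up ratings: carol hasn't rated item2, so we predict it.
    ratings = {
        "alice": {"item1": 5, "item2": 3, "item3": 2},
        "bob":   {"item1": 3, "item2": 4},
        "carol": {"item1": 4, "item3": 5},
    }
    print(slope_one_predict(ratings, "carol", "item2"))  # ≈ 4.33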

Types of Ratings

Pure collaborative filtering-based systems, as seen above, only rely on the rating matrix;  the question is, what kind of ratings can one use? For example:

Example: Items Bought Together

Say for instance we have the following matrix of items that people have bought together:

...It’s apparent for example that item 2 is quite frequently bought with item 3 (so we can infer that if a user buys one of the items, they’re more likely to buy the other too).

Evaluating Recommender Systems

What makes for a “good” recommendation?

Of course it can vary between companies, thinking about e.g. what they’re trying to persuade people to buy, what behaviour they’re trying to persuade people to engage in, etc. These things will have an impact on good ways of evaluating the recommender system, going back to the “human factor” (remembering that people are mostly predictable).

So what kinds of measures are used in practice?

How can we measure the performance (again considering the “human factor”)?

For offline experiments, some common protocol:

Correct and Incorrect Classifications

How can we label correct and incorrect classifications that the system makes? The following four test responses are often used:

Metrics

Various metrics can be used to attempt to measure the performance of a classification algorithm, such as:

Precision

Precision = TP / (TP + FP), i.e. the proportion of items the system recommends that are actually relevant.

Recall

Recall = TP / (TP + FN), i.e. the proportion of all relevant items that the system actually recommends.

Plotting precision against recall (and e.g. calculating the area under the curve) gives us a visual representation of the trade-off between the two.

Accuracy

This simple metric considers the proportion of all classifications that are correct (i.e. the true positives and the true negatives), given by the formula below:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

This is arguably the metric of most immediate interest for a recommender system, as it quickly lets us see how effective the system is at making predictions.

Rank Score

In a recommender system, an item’s rank score/position really matters; we want to make sure that items the user actually wants (the ground truth) are ranked above items where the certainty is lower.

For example, looking at the following table excerpt:

...if someone has bought item 2, we want to know that:

Error Measures

There are different ways of measuring the error rate of a recommender; the two most commonly used are probably:

  • Mean Absolute Error (MAE) - the average of the absolute differences between the predicted and actual ratings
  • Root Mean Square Error (RMSE) - the square root of the average squared difference, which penalises large errors more heavily
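To make the metrics above concrete, a minimal Python sketch with made-up confusion-matrix counts and rating predictions:

    import math

    def precision(tp, fp):
        return tp / (tp + fp)

    def recall(tp, fn):
        return tp / (tp + fn)

    def accuracy(tp, tn, fp, fn):
        return (tp + tn) / (tp + tn + fp + fn)

    def mae(predicted, actual):
        return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

    def rmse(predicted, actual):
        return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

    # Made-up counts: 30 TP, 40 TN, 10 FP, 20 FN.
    print(precision(30, 10), recall(30, 20), accuracy(30, 40, 10, 20))  # 0.75 0.6 0.7
    # Made-up predicted vs. actual ratings:
    print(mae([4.0, 3.5, 2.0], [5, 3, 2]), rmse([4.0, 3.5, 2.0], [5, 3, 2]))  # 0.5, ≈0.65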

Online Auctions

Where Are Auctions Used and Why?

One of the first things that may come to mind when people talk about auctions is eBay. There are two main ways you can buy things there: buy it now, and auctions! (You’re probably familiar with the system, but notice the current price, number of bids, time remaining etc.:)

eBay charges a percentage commission fee on items sold on its platform.

Going into some stuff mentioned in the first lecture...

Sponsored Search

Sponsored search is the basic model that most search engines use to generate revenue (for big companies like Google, at least 80% of revenue is generated by selling advertisements on the internet).

e.g. when searching for a product on Google, the organic results (the ones with green links) are ranked by the search engine’s usual page-ranking algorithm. Note how some results are marked as “Ad”; these advertisers pay Google to put their results at the top for those keywords.

Yahoo, for instance, used to call sponsored search auctions “computational advertising”; it’s the same idea.

Display Advertising

Another type of auction is display advertising, a type of graphic advertising on websites (shown via web browsers), usually in the form of some media e.g. an image banner or a video ad.

These days as systems are more sophisticated, they often use cookies from your browsing and purchase history to better target ads to you. (So e.g say you buy some plane tickets and then scroll through Facebook, you might see more airline/holiday advertisements pop up.)

Why Use Auctions?

So why use auctions?

Auction Design

There are of course different types of auctions, each with advantages and disadvantages. There are equivalences between them, and they're more/less suited to different scenarios.

The auction design affects how people behave during an auction - so one must consider:

The most typically desired auction property is for the bidders to bid truthfully; i.e. we ideally want bidders to bid exactly what the item is worth to them (their private value), no matter how valuable that is. This is a price discovery process (of course it isn’t always easy to achieve in practice).

Auction Types

There are four classical auction types…

English Auction

This type has a very long history and is the most commonly known type of auction, entailing (for a single item):

Low-res pic of Sotheby’s (fancy place):

Pretty simple process:

English auctions are most commonly used (as you’d expect) for selling goods, e.g. antiques and artworks.

Dutch Auction

This is an example of an open-bid descending auction (effectively the opposite way to the English auction).

This type of auction is famously used for selling flowers in the Netherlands, fun fact (flowers have a limited life cycle so need to be sold off quickly before they die).

Sealed-Bid First Price Auction

This auction is similar to the English auction in that the highest bid wins the auction (and the winner then pays the amount they bid). However, the difference is that, whereas the English auction is open bid, this type is sealed bid (so bidders don’t see other bidders’ prices).

In summary:

In this type of auction there’s a trade-off between bidding a lower price than your private value and risking losing, or bidding a higher price and risking paying more than might’ve been necessary.

Sealed-Bid Second Price Auction (Vickrey Auction)

Lastly but not leastly, looking at the second price auction…

 

The idea is simple: goods are awarded to the bidder that made the highest bid, but they pay the price of the second highest bid.

Despite only being a small change to the sealed-bid first price auction model, this auction type arguably has preferable properties, and bidders are better incentivised to bid truthfully.

Of course, one could take a gamble and bid higher than their private value, increasing the chances of winning but also increasing the chance of having to pay more than their private value (i.e. ending up with negative utility, which violates individual rationality), so again it’s a trade-off.

Dominant Strategy

A dominant strategy is a strategy that is optimal for a bidder regardless of what the other bidders do. In a sealed-bid second price (Vickrey) auction, bidding your true private value is a dominant strategy: bidding higher only risks paying more than the item is worth to you, while bidding lower only risks losing an auction you would have been happy to win at your value.

Formal Model and Utility

In analysis, one can use utility theory to better capture individual preferences, attaching numerical values to outcomes.

Also, one must consider tie-breaking (for sealed-bid auctions), as in some cases 2 or more bidders bid exactly the same.

Revisiting the English auction example (I’m just gonna screenshot the slide cause cba formatting the symbols):

Note how the minimum increment is more of a practical matter; in theoretical analysis, it can be resolved by using an arbitrarily small ε (epsilon).

Computing Expected Utilities

An expected utility (which we want to maximize) is simply the utility gained if you win in a given scenario, multiplied by the probability of winning:

E[u] = Pr[win]*(v - p)

Let’s look at a first-price sealed bid auction example:
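The slide’s example isn’t reproduced here; instead, here’s a minimal sketch under an assumed setup (our private value is v and a single rival’s bid is uniform on [0, 1], so bidding b wins with probability b). It illustrates the trade-off: a higher bid wins more often but earns less when it does:

    # Expected utility E[u] = Pr[win] * (v - b) in a first-price sealed-bid
    # auction, under the assumed setup of one rival whose bid is uniform on
    # [0, 1], so a bid b wins with probability b. (Not the slides' example.)
    v = 0.8  # our private value

    best_bid, best_eu = max(
        ((b / 100, (b / 100) * (v - b / 100)) for b in range(101)),
        key=lambda pair: pair[1],
    )
    print(best_bid, best_eu)  # 0.4, 0.16 - bidding v/2 (i.e. shading the bid) is optimal here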

Sponsored Search Auctions

e.g. the results enclosed in the grey box are “ads”, paid advertisements by companies:

Payment Models

There are three main types of payment model: pay-per-impression, pay-per-click and pay-per-conversion (paying only when a click leads to a purchase or sign-up).

Pay-Per-Click

(where CTR = click through rate, i.e. the ratio of users who click on a specific link to the number of total users who view a page, email, or ad)

So how do you design an auction that maximises the profit?

Google’s Quality Score

Google has a way to estimate the “quality” of the bidders and the advertisements, computing an ad's quality score; this is based on (amongst other things):

  • the ad’s expected click-through rate
  • the relevance of the ad (and its keywords) to the search query
  • the quality of the landing page

Of course, other search engines can have their own slightly different ”quality score” formulas.

Generalized Second Price Auction (GSP)

A lot of search engines compute the ad allocation based on the product of the bid and the quality score. Higher slots are more valuable, and bidders with a higher product value get a higher slot.

Formally: let bi and qi denote the bidding price and ad quality score respectively. The allocation is given by sorting the ads in descending order according to bi * qi.

In Google’s “generalised second price auction” (GSP), each winning bidder pays the minimum amount that would keep them in their slot: the bidder in slot i pays (per click) the bid of the ad ranked immediately below, multiplied by that ad’s quality score and divided by their own quality score.

...Very similar to second price auctions, right? (Hence the name)

A simple example:
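The slide’s worked example isn’t shown above, so here’s a hedged sketch with made-up bids and quality scores, implementing the allocation and pricing rules just described:

    def gsp(bidders, num_slots):
        """Generalised second price auction with quality scores.
        bidders maps name -> (bid, quality score). Ads are ranked by bid * quality
        (descending); the winner of slot i pays, per click, the minimum needed to
        stay above the next-ranked ad: next_bid * next_quality / own_quality."""
        ranked = sorted(bidders.items(), key=lambda kv: kv[1][0] * kv[1][1], reverse=True)
        results = []
        for i, (name, (bid, q)) in enumerate(ranked[:num_slots]):
            if i + 1 < len(ranked):
                next_bid, next_q = ranked[i + 1][1]
                price = next_bid * next_q / q
            else:
                price = 0.0  # nobody below: pays the reserve price (0 here)
            results.append((name, round(price, 2)))
        return results

    # Made-up bidders: (bid in £ per click, quality score), with two ad slots.
    bidders = {"A": (4.00, 0.6), "B": (3.00, 0.9), "C": (2.50, 0.8)}
    print(gsp(bidders, num_slots=2))
    # B (score 2.7) wins slot 1 and pays 4.00*0.6/0.9 ≈ 2.67;
    # A (score 2.4) wins slot 2 and pays 2.50*0.8/0.6 ≈ 3.33 - both below their own bids.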

Expected Value

If qi represents the click through rate, X the set of bidders allocated, and the bid bi is the true value generated by the click, then the expected value is:

E[value] = Σ (over i in X) qi * bi

It can be shown that, by sorting according to the product, this maximises the total expected value (a.k.a. the social welfare of the allocation, i.e. the sum of the utilities of the bidders, assuming that bi is the true value).

Alternative Models

We have assumed that the click probability is independent of the position, which is a simplification: in reality, higher positions have a higher click probability (which is why they are more attractive).

Thus more refined alternative models for accurately estimating the (often separable) CTR can be used, such as:

Reputation Systems

As buyers and sellers, we’ve always wanted to know that we’re dealing with someone we can rely on; with online systems:

Trust Issues…

Terminology

Cherries and Lemons

The Lemon Problem

Moral Hazard

Trust on the Web

The Solution?

Reputation vs. Recommender Systems

So what’s the difference?

Design Considerations

What should one consider when designing a reputation system?

Existing Systems

Probably one of the most well-known reputation systems is the one that eBay uses for its sellers and buyers, trying to tackle the issue of asymmetric information.

Reputation Value

Measuring Confidence

Probabilistic Approach

For binary ratings we would use the beta distribution (effectively a probability of probabilities), which has parameters α (number of successes, i.e. a positive rating) and β (number of failures, i.e. a negative rating).

(the distribution’s mean is μ = α / (α + β), and its variance is σ² = αβ / ((α + β)²(α + β + 1)))

Using the Beta Distribution

The beta distribution is a family of continuous probability distributions defined on the interval [0, 1]. It is parameterized by two positive shape parameters, denoted by α and β. These appear as exponents of the random variable and control the shape of the distribution.
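As a hedged illustration of how this is used for reputation (a common formulation, not necessarily the exact model from the lecture): treat positive ratings as “successes” and negative ratings as “failures”, take the distribution’s mean as the reputation value, and use the width of a credible interval as a measure of confidence:

    from scipy.stats import beta

    def reputation(positive, negative, confidence=0.95):
        """Beta-distribution reputation sketch (assumed formulation, not
        necessarily the exact lecture model): alpha = positives + 1,
        beta = negatives + 1. Returns the mean (the reputation value) and a
        credible interval whose width reflects our confidence in it."""
        a, b = positive + 1, negative + 1
        mean = a / (a + b)
        lower, upper = beta.ppf([(1 - confidence) / 2, (1 + confidence) / 2], a, b)
        return mean, (lower, upper)

    print(reputation(8, 2))      # mean 0.75, wide interval (few ratings)
    print(reputation(800, 200))  # mean ≈ 0.80, much narrower interval (more evidence)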

Evaluating Reputation Systems

This can (unfortunately) be a much harder task than evaluating recommender systems...

Evaluation methods:

Issues

Of course, reputation systems have a number of issues associated with them. One issue is ballot stuffing – when a single user can easily place multiple ratings.

To mitigate this:

Another common issue is that of people leaving false/unfair ratings, as people can be biased; if they received an item they think is inferior or they’ve overpaid for (or if they’re having technical issues with it), they’re far more likely to leave a negative review, often a very harsh one.

There’s also the issue of slander, where a competitor places a false rating to discredit the victim. There’s also the issue of self-promotion, where a seller may want to promote their own product; one has to watch out for “farms”, groups of people who are employed to leave fake positive reviews on items to artificially boost their reputation.

Some solutions to the aforementioned issues include:

Furthermore, there’s the issue of whitewashing, where sellers can just change their identity to “reset” their reputation if they get a bad rep. To combat this, sites can:

Another problem is reciprocation and retaliation, resulting in bias towards positive ratings; eBay used to have 99% positive ratings overall, and there was a high correlation between buyer and seller ratings (indicative of reciprocation/retaliation fears).

Some possible solutions:

To briefly touch on a few other issues (yeah, there are a lot)...

Financial Incentives

In terms of financial incentives for reputation systems...

Experimental Design and Web Analytics

What is Experimental Design?

Where is experimental design used in the real world? Some example applications include:

Example - Web design/Landing Page Optimisation

Experimental Approach

A-B Split Testing

Multivariate Testing

One-factor-at-a-time (OFAT)

Let’s look at an example (from the slides of course)!

Fractional Factorial Design

Main vs. Interaction Effects

As aforementioned, Latin squares for instance can only be used to measure the main effects and some interaction effects (whereas a full factorial design can measure the main effects and all interaction effects).

...So what's the difference between the two? In short, a “main” effect is the effect of a single factor on its own (averaged over the other factors), whereas an “interaction” effect occurs when the effect of one factor depends on the level of another factor.

Determining the Optimal Design

There are different ways we can try to determine which design is optimal; one simple approach that can be used:

User Trials

So how do we decide how many user trials we need to run in order to minimize the variance in the statistics?
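One standard (assumed, not necessarily the slides’) way to answer this is a two-proportion power calculation, which estimates how many users each variant needs before a given difference in conversion rate can be detected reliably:

    import math
    from scipy.stats import norm

    def users_per_variant(p1, p2, alpha=0.05, power=0.8):
        """Approximate sample size per variant to detect a change in conversion
        rate from p1 to p2 (standard two-proportion formula - an assumption
        here, not necessarily the method used on the slides)."""
        z_alpha = norm.ppf(1 - alpha / 2)  # two-sided significance threshold
        z_power = norm.ppf(power)          # desired statistical power
        n = (z_alpha + z_power) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
        return math.ceil(n)

    # e.g. detecting a lift from a 10% to a 12% conversion rate:
    print(users_per_variant(0.10, 0.12))  # ≈ 3,839 users per variant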

Further Reading

For more info on experimental design, check out:

Human Computation

Modern Recommendation Systems

Incentives in Crowd

Rank Aggregation

Experts Crowdsourcing

Revision Lectures

tl;dr

In Matthew Barnes’ notes spirit, I’m going to try to sum up the module concisely, a COMP3208 Any% speedrun if you will. So smash that like button and let’s go:

Introductory Lecture

"Social Computing Research focuses on methods for harvesting the collective intelligence of groups of people in order to realize greater value from the interaction between users and information”

Dictionary definition of crowdsource: “obtain (information or input into a particular task or project) by enlisting the services of a number of people, either paid or unpaid, typically via the Internet”.

Many sites rely on ads for income, which are determined through fast auctions; advertisers bid on users the moment they enter a website.

Two main types of online advertising:

Past & Present

Terminology:

History:

Early social computing systems:

Commercialisation:

Web 2.0 (i.e. Participative/Participatory and Social Web) refers to sites emphasizing:

Sites with crowd-generated content:

Notable crowdsourcing initiatives:

Crowdsourcing for social good:

Crowdsourcing and AI:

Why use social computing now?

Some important characteristics of social computing can be summarized as user-created content where:

When analysing a social computing system:

Recommender Systems

Recommender systems are a type of information filtering system which filters items based on a personal profile. Recommendations are personal to the specific user/group of users.

Recommender systems are given:

Examples of recommender systems:

Recommender systems can provide:

However, potential issues:

Paradigms of recommender systems:

Content-based:

Knowledge-based:

Collaborative filtering:

Hybrid:

Similarity coefficients:

Shortcomings to using keywords:

To help, one can clean the features, with techniques like:

Calculating similarity:

User-based collaborative filtering:

Steps for the system to consider:

Similarity measure for ratings:

Possible neighbourhood selection criteria:

Prediction function:

Ways to improve prediction function:

Lastly, making a recommendation:

Item-based collaborative filtering:

Some issues exist with item-based; pre-processing techniques can be used to mitigate these.

Similarity measure for ratings:

Prediction function:

An alternative simple approach is Slope One, which focuses on computing the average differences between pairs of items. e.g. say we have:

Average difference between items 1 and 2:

Predicted rating for item 2:

Adapted prediction to weight each difference based on the number of jointly rated items:

Types of ratings:

Evaluating recommender systems:

Measuring their performance:

For offline experiments:

Labelling classifications:

To note:

Classification performance metrics:

Error measures:

Online Auctions

Why use auctions:

Auction design:

Four classical auction types:

Dominant strategy:

Formal model and utility:

Reputation Systems

How do we know we’re dealing with a reliable buyer or seller, and that the quality of product/service is as described?

Terminology:

Trust issues:

Reputation systems are based on the assumption of consensus; reputation is ideally an objective measure; they are designed to address asymmetric information.

Design considerations:

e.g. eBay:

e.g. Amazon:

Reputation values:

Probabilistic approach:

(μ = population mean, σ = population standard deviation)

Evaluating reputation systems:

Issues with reputation systems:

Additional Lectures

Introduction to Coursework

There are going to be 4 coursework “surgery” sessions, scheduled throughout the duration of the module. These are interactive drop-in sessions to ask any questions you might have.

Assignments

Task:

Deliverables:

Learning Outcomes (which are practical in nature):

Useful Resources

Forming Groups (of 2 or 3)

Task Datasets

< walkthrough of data files >

Submitting Computed Ratings

Module Wiki >> Assignment >> Submit (pretty standard m8)

...When you submit you’ll get a standard handin receipt, shortly followed by another email.

This confirms that the test harness has queued the submission, and tells you how many submissions your group has remaining (it doesn’t matter who in the group submits):

...Then after your code is evaluated, you’ll receive another email with the mean square error results for both the small and large datasets (bear in mind though it’s only the large one that counts for marking!):

Written Report

As for the marking scheme, it’s linked on the Wiki (on ECS intranet), go check it out!

Academic Integrity

Coursework Surgery Sessions

Coursework “Surgery” 1

Coursework “Surgery” 2

Coursework “Surgery” 3

Coursework “Surgery” 4

Guest Lecture: Davide Zilli from Mind Foundry

Guest Lecture: Mikhail Fain from Cookpad