Crowdsourcing using MTurk for HCI research

Crowdsourcing using Mechanical
Turk for Human Computer
Interaction Research

Ed H. Chi

Research Scientist

Google

(work done while at [Xerox] PARC)

1

Historical Footnote

De Prony, 1794, hired hairdressers

•  (unemployed after French revolution; knew only
addition and subtraction)

•  to create logarithmic and trigonometric tables.

•  He managed the process by splitting the
work into very detailed workﬂows.

!"#$% &'#(")$)*'%+ ,'"%-
• !"#$%/ 0121 )31

56'#(")12/+7 "/1-
#$)3 6'#(")$)*'%

–  Grier, When computers were human, 2005

• !"#$%&'() 6'#(")
– &9$*2$")+ $/)2'%'#
&'#(")1- )31 !$9
6'#1) '2?*) @)3211
(2'?91#A )&*&)&%#
-$./" '4 %"#12*6
6'#(")$)*'%/ $62'
$/)2'%'#12/
C2*12+ D31% 6'#(")12/ 0
C2*12

2

Talk in 3 Acts

•  Act 1:

–  How we almost failed in using MTurk?!

–  [Kittur, Chi, Suh, CHI2008]

•  Act II:

–  Apply MTurk to visualization evaluation

–  [Kittur, Suh, Chi, CSCW2008]

•  Act III:

–  Where are the limits?

Aniket Kittur, Ed H. Chi, Bongwon Suh.

Crowdsourcing User Studies With Mechanical Turk. In CHI2008.

Aniket Kittur, Bongwon Suh, Ed H. Chi. Can You Ever Trust a Wiki?
Impacting Perceived Trustworthiness in Wikipedia. In CSCW2008.

3

Example Task from Amazon MTurk

4

Using Mechanical Turk for user studies

Traditional user Mechanical Turk

studies

Task complexity

Complex

Simple

Long

Short

Task subjectivity

Subjective

Objective

Opinions

Veriﬁable

User information

Targeted demographics

Unknown demographics

High interactivity

Limited interactivity

Can Mechanical Turk be usefully used for user studies?

5

Task

•  Assess quality of Wikipedia articles

•  Started with ratings from expert Wikipedians

–  14 articles (e.g., Germany , Noam Chomsky )

–  7-point scale

•  Can we get matching ratings with mechanical turk?

6

Experiment 1

•  Rate articles on 7-point scales:

–  Well written

–  Factually accurate

–  Overall quality

•  Free-text input:

–  What improvements does the article need?

•  Paid $0.05 each

7

Experiment 1: Good news

•  58 users made 210 ratings (15 per article)

–  $10.50 total

•  Fast results

–  44% within a day, 100% within two days

–  Many completed within minutes

8

Experiment 1: Bad news

•  Correlation between turkers and Wikipedians
only marginally signiﬁcant (r=.50, p=.07)

•  Worse, 59% potentially invalid responses

Experiment 1
Invalid 49%
comments
<1 min 31%
responses

•  Nearly 75% of these done by only 8 users

9

Not a good start

•  Summary of Experiment 1:

–  Only marginal correlation with experts.

–  Heavy gaming of the system by a minority

•  Possible Response:

–  Can make sure these gamers are not rewarded

–  Ban them from doing your hits in the future

–  Create a reputation system [Delores Lab]

•  Can we change how we collect user input ?

10

Design changes

•  Use veriﬁable questions to signal monitoring

–  How many sections does the article have?

–  How many images does the article have?

–  How many references does the article have?

11

Design changes


•  Make malicious answers as high cost as good-faith
answers

–  Provide 4-6 keywords that would give someone a
good summary of the contents of the article

12

Design changes


answers

•  Make veriﬁable answers useful for completing
task

–  Used tasks similar to how Wikipedians evaluate quality
(organization, presentation, references)

13

Design changes


answers

•  Make veriﬁable answers useful for completing
task

•  Put veriﬁable tasks before subjective responses

–  First do objective tasks and summarization

–  Only then evaluate subjective quality

–  Ecological validity?

14

Experiment 2: Results

•  124 users provided 277 ratings (~20 per article)

•  Signiﬁcant positive correlation with Wikipedians

–  r=.66, p=.01

•  Smaller proportion malicious responses

•  Increased time on task

Experiment 1

Experiment 2

Invalid 49%

3%

comments

<1 min 31%

7%

responses

Median time

1:30

4:06

15

Quick Summary of Tips

•  Mechanical Turk offers the practitioner a way to access a
large user pool and quickly collect data at low cost

•  Good results require careful task design

1.  Use verifiable questions to signal monitoring

2.  Make malicious answers as high cost as good-faith answers

3.  Make verifiable answers useful for completing task

4.  Put verifiable tasks before subjective responses

16

Generalizing to other MTurk studies

•  Combine objective and subjective questions

–  Rapid prototyping: ask veriﬁable questions about content/
design of prototype before subjective evaluation

–  User surveys: ask common-knowledge questions before
asking for opinions

•  Filtering for Quality

–  Put in a ﬁeld for Free-Form Responses and Filter out
data without answers

–  Results that came in too quickly

–  Sort by WorkerID and look for cut and paste answers

–  Look for outliers in the data that are suspicious

17

Talk in 3 Acts

•  Act 1:

–  How we almost failed?!

•  Act II:

–  Applying MTurk to visualization evaluation

•  Act III:


18

What would make you trust Wikipedia more?

20

What is Wikipedia?

Wikipedia is the best thing ever. Anyone in the world can write
anything they want about any subject, so you know you re getting the
best possible information.
– Steve Carell, The Office

21


Nothing

22


Wikipedia, just by its nature, is
impossible to trust completely. I don't
think this can necessarily be
changed.

23

WikiDashboard
  Transparency of social dynamics can reduce conflict and coordination
issues
  Attribution encourages contribution
–  WikiDashboard: Social dashboard for wikis
–  Prototype system: http://wikidashboard.parc.com

  Visualization for every wiki page
showing edit history timeline and
top individual editors

  Can drill down into activity history
for specific editors and view edits
to see changes side-by-side

Citation: Suh et al.
CHI 2008 Proceedings

2011 UCBerkeley Visual Computing Retreat 24

Hillary
Clinton

2011 UCBerkeley Visual
25
Computing Retreat 25

Top
Editor
-‐
Wasted
Time
R

2011 UCBerkeley Visual
26
Computing Retreat

Surfacing information

•  Numerous studies mining Wikipedia revision
history to surface trust-relevant information
–  Adler & Alfaro, 2007; Dondio et al., 2006; Kittur et al., 2007;
Viegas et al., 2004; Zeng et al., 2006

Suh, Chi, Kittur, & Pendleton, CHI2008

•  But how much impact can this have on user
perceptions in a system which is inherently
mutable?
27

Hypotheses

1.  Visualization will impact perceptions of trust
2.  Compared to baseline, visualization will
impact trust both positively and negatively
3.  Visualization should have most impact when
high uncertainty about article
•  Low quality
•  High controversy

28

Design

•  3 x 2 x 2 design

Controversial Uncontroversial

Visualization Abortion Volcano
High quality
•  High stability George Bush Shark
•  Low stability
•  Baseline (none) Pro-life feminism Disk
defragmenter Low quality
Scientology and
celebrities Beeswax

29

Example: High trust visualization

30

Example: Low trust visualization

31

Summary info

•  % from anonymous
users

32

Summary info

users
•  Last change by
anonymous or
established user

33

Summary info

users
•  Last change by
anonymous or
established user
•  Stability of words

34

Graph

•  Instability

35

Method

•  Users recruited via Amazon s Mechanical Turk
–  253 participants
–  673 ratings
–  7 cents per rating
–  Kittur, Chi, & Suh, CHI 2008: Crowdsourcing user studies
•  To ensure salience and valid answers, participants
answered:
–  In what time period was this article the least stable?
–  How stable has this article been for the last month?
–  Who was the last editor?
–  How trustworthy do you consider the above editor?

36

Results

7 High stability Baseline Low stability

6
Trustworthiness rating
5

4

3

2

1
Low qual High qual Low qual High qual

Uncontroversial Controversial

main effects of quality and controversy:
• high-quality articles > low-quality articles (F(1, 425) = 25.37, p < .001)
• uncontroversial articles > controversial articles (F(1, 425) = 4.69, p = .
031)

37

Results


6
5

4

3

2

1


interaction effects of quality and controversy:
• high quality articles were rated equally trustworthy whether controversial
or not, while
• low quality articles were rated lower when they were controversial than
when they were uncontroversial.
38

Results
1.  Significant effect of visualization: High-Stability > Low-Stability, p < .001
2.  Viz has both positive and negative effects:
–  High-Stability > Baseline (p < .001) > Low-Stability, p < .01
3.  No interaction of visualization with either quality or controversy
–  Robust across visualization conditions

6

5

4

3

2

1

39

Results

6

5

4

3

2

1

40

Results

6

5

4

3

2

1

41

Talk in 3 Acts

•  Act 1:

–  How we almost failed?!

•  Act II:

–  Applying MTurk to visualization evaluation

•  Act III:


42

Limitations of Mechanical Turk

•  No control of users environment

–  Potential for different browsers, physical distractions

–  General problem with online experimentation

•  Not yet designed for user studies

–  Difﬁcult to do between-subjects design

–  May need some programming

•  Hard to control user population

–  hard to control demographics, expertise

43

Crowdsourcing for HCI Research

•  Does my interface/visualization work?

–  WikiDashboard: transparency vis for Wikipedia [Suh et al.]

–  Replicating Perceptual Experiments [Heer et al., CHI2010]

•  Coding of large amount of user data

–  What is a Question in Twitter? [Sharoda Paul, Lichan Hong, Ed Chi]

•  Incentive mechanisms

–  Intrinsic vs. Extrinsic rewards: Games vs. Pay

–  [Horton & Chilton, 2010 for Mturk] and [Ariely, 2009] in general

44

Crowdsourcing for HCI Research

•  Does my interface/visualization work?

–  WikiDashboard: transparency vis for Wikipedia [Suh et al. VAST,
Kittur et al. CSCW2008]

–  Replicating Perceptual Experiments [Heer et al., CHI2010]

•  Coding of large amount of user data

–  What is a Question in Twitter? [S. Paul, L. Hong, E. Chi, ICWSM 2011]

•  Incentive mechanisms

–  Intrinsic vs. Extrinsic rewards: Games vs. Pay

–  [Horton & Chilton, 2010 on MTurk] and Satisﬁcing

–  [Ariely, 2009] in general: Higher pay != Better work

45

Managing Quality

•  Quality through redundancy: Combining votes

–  Majority vote [work best when similar worker quality]

–  Worker-Quality‐adjusted vote

–  Managing dependencies

•  Quality through gold data

–  Advantaged when imbalanced dataset & bad workers

•  Estimating worker quality (Redundancy + Gold)

–  Calculate the confusion matrix and see if you actually
get some information from the worker

•  Toolkit: http://code.google.com/p/get‐another‐label/

Source: Ipeirotis, WWW2011 46

Coding and Machine Learning
!"#$%& '(%)*"(+

•  Integration with Machine Learning

• ,)#-+' %-.&% */-"+"+0 1-*- using
–  Build automatic classiﬁcation models
crowdsourced data

• 2'& */-"+"+0 1-*- *( .)"%1 #(1&%

Data from existing
crowdsourced answers

N
New C
Case Automatic Model Automatic
(through machine learning) Answer

Source: Ipeirotis, WWW2011
47

Crowd Programming for Complex Tasks

•  Decompose tasks into smaller tasks

–  Digital Taylorism

–  Frederick Winslow Taylor (1856-1915)

–  1911 'Principles Of Scientiﬁc Management’

•  Crowd Programming Explorations

–  MapReduce Models

•  Kittur, A.; Smus, B.; and Kraut, R. CHI2011EA on CrowdForge.

•  Kulkarni, Can, Hartmann, CHI2011 workshop & WIP

–  Little, G.; Chilton, L.; Goldman, M.; and Miller, R. C. In
KDD 2010 Workshop on Human Computation.

48

CHI 2011 • Work-in-Progress May 7–12, 2011 • Vancouver, BC, Canada

Crowd Programming for Complex Tasks
!

!

"#!$%&!'%()(*!%!(&+,-.-+/!&01,+((-#2!('+&!-(!%&&3-+/!'1! &%0'-'-1#!('+&!%()+/!:10)+0(!'1!,0+%'+!%#!%0'-,3+!18'3-#+*!
+%,4!-'+$!-#!'4+!&%0'-'-1#5!64+(+!'%()(!%0+!-/+%337! 0+&0+(+#'+/!%(!%#!%00%7!1.!(+,'-1#!4+%/-#2(!(8,4!%(!
•  Crowd Programming Explorations

(-$&3+!+#1824!'1!9+!%#(:+0%93+!97!%!(-#23+!:10)+0!-#!%!
(410'!%$18#'!1.!'-$+5!;10!+<%$&3+*!%!$%&!'%()!.10!
EF-('107G!%#/!EH+120%&47G5!"#!%#!+#=-01#$+#'!:4+0+!
:10)+0(!:183/!,1$&3+'+!4-24!+..10'!'%()(*!'4+!#+<'!('+&!

–  Kittur, A.; Smus, B.; and Kraut, R. CHI2011EA on
%0'-,3+!:0-'-#2!,183/!%()!%!:10)+0!'1!,133+,'!1#+!.%,'!1#!
%!2-=+#!'1&-,!-#!'4+!%0'-,3+>(!18'3-#+5!?83'-&3+!-#('%#,+(!
$-24'!9+!'1!4%=+!(1$+1#+!:0-'+!%!&%0%20%&4!.10!+%,4!
(+,'-1#5!F1:+=+0*!'4+!/-..-,83'7!%#/!'-$+!-#=13=+/!-#!

CrowdForge.

1.!%!$%&!'%()(!,183/!9+!-#('%#'-%'+/!.10!+%,4!&%0'-'-1#@!
+525*!$83'-&3+!:10)+0(!,183/!9+!%()+/!'1!,133+,'!1#+!.%,'!
.-#/-#2!'4+!-#.10$%'-1#!.10!%#/!:0-'-#2!%!,1$&3+'+!
&%0%20%&4!.10!%!4+%/-#2!-(!%!$-($%',4!'1!'4+!31:!:10)!
+%,4!1#!%!'1&-,!-#!&%0%33+35! ,%&%,-'7!1.!$-,01I'%()!$%0)+'(5!648(!:+!901)+!'4+!'%()!
–  Kulkarni, Can, Hartmann, CHI2011 workshop & WIP

8&!.80'4+0*!(+&%0%'-#2!'4+!-#.10$%'-1#!,133+,'-1#!%#/!
;-#%337*!0+/8,+!'%()(!'%)+!%33!'4+!0+(83'(!.01$!%!2-=+#! :0-'-#2!(89'%()(5!B&+,-.-,%337*!+%,4!(+,'-1#!4+%/-#2!
$%&!'%()!%#/!,1#(13-/%'+!'4+$*!'7&-,%337!-#'1!%!(-#23+! .01$!'4+!&%0'-'-1#!:%(!8(+/!'1!2+#+0%'+!$%&!'%()(!-#!
0+(83'5!"#!'4+!%0'-,3+!:0-'-#2!+<%$&3+*!%!0+/8,+!('+&!
$-24'!'%)+!.%,'(!,133+,'+/!.10!%!2-=+#!'1&-,!97!$%#7!
:10)+0(!%#/!4%=+!%!:10)+0!'80#!'4+$!-#'1!%!&%0%20%&45! “Please solve the 16-question SAT located at
A#7!1.!'4+(+!('+&(!,%#!9+!-'+0%'-=+5!;10!+<%$&3+*!'4+! http://bit.ly/SATexam”. In both cases, we paid workers
'1&-,!.10!%#!%0'-,3+!(+,'-1#!/+.-#+/!-#!%!.-0('!&%0'-'-1#!
between $0.10 and $0.40 per HIT. Each “subdivide” or
,%#!-'(+3.!9+!&%0'-'-1#+/!-#'1!(89(+,'-1#(5!B-$-3%037*!'4+!
&%0%20%&4(!0+'80#+/!.01$!1#+!0+/8,'-1#!('+&!,%#!-#! “merge” HIT received answers within 4 hours; solutions
'80#!9+!0+10/+0+/!'401824!%!(+,1#/!0+/8,'-1#!('+&5!
to the initial task were complete within 72 hours.
!"#$%#&'()$#%
C+!+<&310+/!%(!%!,%(+!('8/7!'4+!,1$&3+<!'%()!1.!
:0-'-#2!%#!+#,7,31&+/-%!%0'-,3+5!C0-'-#2!%#!%0'-,3+!-(!%! Results
,4%33+#2-#2!%#/!-#'+0/+&+#/+#'!'%()!'4%'!-#=13=+(!$%#7! The decompositions produced by Turkers while running
/-..+0+#'!(89'%()(D!&3%##-#2!'4+!(,1&+!1.!'4+!%0'-,3+*!
41:!-'!(4183/!9+!('08,'80+/*!.-#/-#2!%#/!.-3'+0-#2! Turkomatic are displayed in Figure 1 (essay-writing)
-#.10$%'-1#!'1!-#,38/+*!:0-'-#2!8&!'4%'!-#.10$%'-1#*!
.-#/-#2!%#/!.-<-#2!20%$$%0!%#/!(&+33-#2*!%#/!$%)-#2!
and Figure 4 (SAT).
'4+!%0'-,3+!,14+0+#'5!64+(+!,4%0%,'+0-('-,(!$%)+!%0'-,3+!
Figure 4. For the SAT task, we uploaded
:0-'-#2!%!,4%33+#2-#2!98'!0+&0+(+#'%'-=+!'+('!,%(+!.10!
sixteen questions from a high school
180!%&&01%,45! In the essay task, each “subdivide” HIT was posted
Scholastic Aptitude Test to the web and three times by Turkomatic and the best of the three
61!(13=+!'4-(!&0193+$!:+!,0+%'+/!%!(-$&3+!.31:! *)+',$%-.%/",&)"0%,$#'0&#%12%"%3100"41,"&)5$%
was selected by experimenters (simulating Turker 49
posed ,1#(-('-#2!1.!%!&%0'-'-1#*!$%&*!%#/!0+/8,+!('+&5!!64+!
the following task to Turkomatic: 6,)&)7+%&"#89%

“Please solve the 16-question SAT located voting) to continue the solution process. The proposed
at http://bit.ly/SATexam”. decompositions were overwhelmingly linear and chose
1804

Future Directions in Crowdsourcing

•  Real-time Crowdsourcing

–  Bigham, et al. VizWiz, UIST 2010

What color is this pillow? What denomination is Do you see picnic tables What temperature is my Can you please tell me What k
this bill? across the parking lot? oven set to? what this can is? thi

(89s) . (24s) 20 (13s) no (69s) it looks like 425 (183s) chickpeas. (91s)
(105s) multiple shades (29s) 20 (46s) no degrees but the image (514s) beans (99s) n
of soft green, blue and is difficult to see. (552s) Goya Beans picture
gold (84s) 400 (247s)
(122s) 450

Figure 2: Six questions asked by participants, the photographs they took, and answers received with latency in s
50

the total time required to answer a question. quikTurkit also distribution was set such that half of the HI




•  Embedding of Crowdwork inside Tools

–  Bernstein, et al. Solyent, UIST 2010

51

Crowd Feedback
To effectively design feedback mechanism
the goals of learning, engagement, and qu
improvement, we first analyze the importa

dimensions of the design space for crowd
(Figure 2).

Timeliness: When should feedback be sho

In micro-task work, workers stay with tas

while, then move on. This implies two tim
synchronously deliver feedback when wor
•  Embedding of Crowdwork inside Tools

engaged in a set of tasks, or asynchronou
–  Bernstein, et al. Solyent, UIST 2010

feedback after workers have completed th

•  Shepherding Crowdwork

Synchronous feedback may have more im
–  Dow et al. CHI2011 WIP

task performance since
while workers are still th
the task domain. It also
probability that workers
onto similar tasks. Howe
synchronous feedback p
burden on the feedback
they have little time to r
This implies a need for t
scheduling algorithms th
near real-time feedback
Asynchronous feedback
52
feedback providers more
Figure 2: Current systems (in orange) focus on
asynchronous, single-bit feedback by requesters. review and comment on

Tutorials

•  Matt Lease http://ir.ischool.utexas.edu/crowd/

•  AAAI 2011 (w HCOMP 2011): Human Computation: Core Research Questions
and State of the Art (E. Law & Luis von Ahn)

•  WSDM 2011: Crowdsourcing 101: Putting the WSDM of Crowds to Work for
You (Omar Alonso and Matthew Lease)

–  http://ir.ischool.utexas.edu/wsdm2011_tutorial.pdf

•  LREC 2010 Tutorial: Statistical Models of the Annotation Process (Bob Carpenter
and Massimo Poesio)

–  http://lingpipe-blog.com/2010/05/17/

•  ECIR 2010: Crowdsourcing for Relevance Evaluation. (Omar Alonso)

–  http://wwwcsif.cs.ucdavis.edu/~alonsoom/crowdsourcing.html

•  CVPR 2010: Mechanical Turk for Computer Vision. (Alex Sorokin and Fei‐Fei Li)

–  http://sites.google.com/site/turkforvision/

•  CIKM 2008: Crowdsourcing for Relevance Evaluation (D. Rose)

–  http://videolectures.net/cikm08_rose_cfre/

•  WWW2011: Managing Crowdsourced Human Computation (Panos Ipeirotis)

–  http://www.slideshare.net/ipeirotis/managing-crowdsourced-human-computation

53

Social Q&A on Twitter!
!

S.
Paul,
L.
Hong,
E.
Chi,
ICWSM
2011

3/27/12 54

Why social Q&A?!

!

!

People turn to their friends on social networks because they
trust their friends to provide tailored answers to subjective
questions on niche topics.!
!

3/27/12! 55

Research Questions!
!
What kinds of questions are Twitter users asking
their friends?!
!
Types and topics of questions!
!
Are users receiving responses to the questions
they are asking?!
Number, speed, and relevancy of responses!
!
How does the nature of the social network affect Q&A
behavior?!
Size and usage of network, reciprocity of relationship!

3/27/12 58

Identifying question tweets was challenging!
!
Advertisement framed as question!
!
!

Rhetorical question!
!

!
Missing context!
!
!

Used heuristics to identify candidate tweets! that were
possibly questions!

! 3/27/12 59

Classifying candidates tweets using
Mechanical Turk!
Crowd-sourced question tweet identification to Amazon Mechanical
Turk!

!

!
Control tweet!

!
•  Each Tweet classified by two Turkers!
!
•  Each Turker classified 25 tweets: 20 candidates and 5
control tweets!
•  Only accepted data from Turkers who classified all
control tweets correctly!
3/27/12 60

Overall method for filtering questions!
! Candidate tweets
Random sample of public tweets

Applied heuristics to !
identify candidate tweets !

12,000! 1.2 million!
(4,100 presented to Turkers)!

Classified candidates! Tracked responses !
using Mechanical Turk! to each candidate tweet!

624!

1152!

3/27/12 61

Findings: Types and topics of questions!
!
Rhetorical (42%), factual (16%), and poll (15%) questions
were common!
Significant percentage of personal & health (11%)questions!
!
!
Question types! Question topics!

How do you feel about Which team is better
interracial dating? others
raiders or steelers?
16%

uncategorized

5%

entertainment

professional
32%

4%

restaurant/food
Any good iPad app
4%
recommendations?
current
events

In UK, when you need to 4%
gree8ngs

technology

see a specialist, do you 7%
10%

personal

need special forms or &
health

permission? 11%

ethics
&

philosophy

7%
Any idea how to lost
weight fast?

3/27/12 62

Findings: Responses to questions!
8
!
7

!
log(number of questions)

6
Number of responses
5
have a long tail Low (18.7%)
4 distribution! response rate in
3 general, but quick
2 responses!
1

0
!
0
1
2
3
4
5
6
7
8
10
16
17
28
29
39
147
Number of answers

Most often reciprocity between asker and
answerer was one-way (55%)!
Responses were largely (84%) relevant!

3/27/12
! 63

Findings: Social network characteristics!
!
Which characteristics of asker predict whether she will
receive a response?!
!
!
Network size and status in network are good
!
predictors of whether asker will receive
response!

Logistic regression modeling (structural properties)!
!
Number of followers (+) " "Number of tweets posted!
Number of days on Twitter (+)" "Frequency of use of Twitter!
Ratio of followers/followees (+)!
Reciprocity rate (-)!

!
!
!
3/27/12 64

Thanks!

•  chi@acm.org

•  http://edchi.net

•  @edchi

•  Aniket Kittur, Ed H. Chi, Bongwon Suh. Crowdsourcing User Studies
With Mechanical Turk. In Proceedings of the ACM Conference on Human-
factors in Computing Systems (CHI2008), pp.453-456. ACM Press, 2008.
Florence, Italy.

•  Aniket Kittur, Bongwon Suh, Ed H. Chi. Can You Ever Trust a Wiki?
Impacting Perceived Trustworthiness in Wikipedia. In Proc. of Computer-
Supported Cooperative Work (CSCW2008), pp. 477-480. ACM Press, 2008.
San Diego, CA. [Best Note Award]

66

Crowdsourcing using MTurk for HCI research

Recommended

Recommended

More Related Content

Similar to Crowdsourcing using MTurk for HCI research

Similar to Crowdsourcing using MTurk for HCI research (20)

More from Ed Chi

More from Ed Chi (20)

Recently uploaded

Recently uploaded (20)

Crowdsourcing using MTurk for HCI research