First visit to My Boss is a Robot? Try starting with the “about the experiment” page.
As we dig into the philosophical roots of radical crowdsourcing, we are finding that the idea that humans can be organized and ultimately made subservient to machines is not particularly new.
The book Technopoly, written in 1992 by the American cultural critic Neil Postman, lays out much of the history of society’s relationship with technology and, though it was written before the widespread use of computer networks, proves somewhat prescient about where the unexamined adoption of information technologies might lead.
Postman frames his argument with the story of King Thamus, described in Plato’s Phaedrus. Thamus is presented with a number of inventions–calculation, astronomy, and writing, to name a few–but is unimpressed. He takes specific issue with writing, claiming that it will cause people to stop remembering things in favor of writing them down, leading them to gain much information, but little wisdom. To the inventor, Theuth, he admonishes, “the discoverer of an art is not the best judge of the good or harm that will accrue to those who practice it.”
In broad terms, Postman then describes the evolution of society from tool-using cultures where technology is integrated into daily life but is bound by a social or religious worldview, through technocracy–where technology plays a central role in a culture, becoming a source of social power and challenging traditional structures–and finally technopoly, which Postman calls the “submission of all forms of culture to the sovereignty of machines and technology.”
Postman writes that we are entering a period of technopoly, the immediate roots of which stretch back at least as far as Frederick Taylor, the father of scientific management. Taylor was famous for efficiency studies in which repetitive tasks were measured and standardized similarly to machine parts. Postman sees in this behavior “the first statement that society is best served when human beings are placed at the disposal of their techniques and technology, that human beings are, in a sense, worth less than their technology.”
The rest of the book examines the dawning information age and its attendant glut of data which, Postman argues, encourages an over-reliance on statistics–an “invisible technology”–to replace social or religious sources of meaning. Postman also has a special loathing for computers, which he sees as usurping decision-making power from people.
Despite immediate impressions, the book is more than just the ranting of a Luddite; Postman readily acknowledges the many conveniences and advances that technology has enabled. He simply cautions that society take a note from King Thamus and look carefully at what is lost when a technology is adopted.
One wonders what Postman, who died in 2003, would make of crowdsourcing sites like Mechanical Turk. The idea seems an almost pure expression of technopoly–people literally take direction from, and are compensated by, computers. Ever the technological skeptic, he might take heart to know that there may be a limit to this trend, perhaps demonstrated in some small way by the fact that our experiment to completely mechanize the process of writing a news article isn’t going very well.
Posted by MacGregor Campbell at 9:49 am on June 1st, 2011.
Categories: Uncategorized.
We have now assembled an entire article using crowdsourced labour. Time to head to the editing stage. Except there’s a problem. Or rather there are lots of little problems. There are so many minor errors in the copy that there isn’t much point trying to edit it. An experienced editor would be able to clean it up, but we want to crowdsource the editing process. That will require cleaner copy than we have. So it’s time to go back to the drawing board.
We think we know where we went wrong. In our work-flow, the laborers on Mechanical Turk construct the article paragraph by paragraph. The results looked quite reasonable at each stage (see, for example, Surprising good writing from the Turkers). But each paragraph had one or two problems: some contained unnecessary information, others had minor errors. When the paragraphs were strung together, the errors compounded. The result is something of a mess.
To make the process work, we need to produce cleaner copy first time round. If we eliminate the errors in the initial writing stage we should, I think, be able to produce editable copy. To do so, we need to experiment with different quality-control mechanisms, such as having workers check paragraphs for accuracy, or having them follow a stricter template, similar to Martin Robbins’ brilliant satire of science “churnalism” for The Guardian.
I spent a day at a crowdsourcing workshop earlier this week and quality-control mechanisms were a big part of the discussion. Crowds can execute tasks quickly and cheaply, but they are very hard to control. That’s one reason why CrowdFlower, which helps companies execute jobs on Mechanical Turk, is doing so well. (I’ve visited their offices several times over the past few years, and almost every time I was directed to newer and bigger premises.)
Anyway, it’s time to start rethinking our work-flow, with a new emphasis on quality. Niki Kittur and colleagues, our collaborators at Carnegie Mellon, are going to experiment with different quality-control mechanisms. These could include voting or fact-checking, for example. The idea is to find mechanisms that are quick, cheap and reliable. Results to come soon!
Posted by Jim Giles at 2:03 pm on May 12th, 2011.
Categories: the experiment.
We already know that it’s possible to produce encyclopaedia entries quickly and cheaply using Mechanical Turk — our collaborators at Carnegie Mellon unveiled a system for doing so last year. But news articles are harder to crowdsource, in part because journalists need to do interviews. How can we automate the process of finding and interviewing expert sources?
Here’s our approach. Like I said in a previous post on the ethics of crowdsourced journalism, it’s something of a fudge. But it seems to work quite well, at least up to a point.
To find sources, we use the references section of the paper that we’re writing about. (Quick recap: our system takes a scientific paper and automates the process of producing a news article about that paper). We start by asking workers to find email addresses for the authors of relevant references. We’re still experimenting with the best definition of “relevant”, but using the first few citations works fairly well, since these tend to refer to studies that are closely related to the paper itself. To weed out random answers, we only use emails that are identified by several workers.
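As a rough illustration of that agreement filter, here is a minimal Python sketch. It is not the code our system runs. The function name `agreed_emails`, the `min_workers` threshold of 2 and the sample addresses are all assumptions made up for the example, since "several workers" is not pinned to a specific number here.

```python
from collections import Counter


def agreed_emails(submissions, min_workers=2):
    """Return addresses reported by at least `min_workers` distinct workers.

    `submissions` maps a worker ID to the list of addresses that worker
    submitted. Both the data shape and the default threshold are
    hypothetical, chosen only for illustration.
    """
    counts = Counter()
    for worker_answers in submissions.values():
        # Normalise and de-duplicate so one worker cannot vote twice
        # for the same address.
        unique = {a.strip().lower() for a in worker_answers if a.strip()}
        counts.update(unique)
    return sorted(addr for addr, n in counts.items() if n >= min_workers)


# Made-up worker answers for a single cited author list.
submissions = {
    "worker_a": ["j.smith@example.edu", "r.jones@example.org"],
    "worker_b": ["J.Smith@example.edu "],
    "worker_c": ["random@example.com", "r.jones@example.org"],
    "worker_d": ["j.smith@example.edu", "j.smith@example.edu"],
}

print(agreed_emails(submissions))
```

Counting each worker at most once per address matters: otherwise a single worker pasting the same random answer several times would pass the filter on their own.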
Once we have addresses for relevant experts, the robot boss emails our potential sources a copy of the paper that we’re writing about, together with a list of generic questions, such as “What are the applications of this work?” The emails are signed by us.
When the answers come back, the robot boss creates two final tasks. One bunch of workers selects sections of the email that sound most interesting; a second set reads the draft of the story and inserts the selections where appropriate. For example, here’s a quote that the workers culled from one expert’s response to our email. (By way of context: we’re writing about a study on the factors that make music popular).
“Without passing judgment on their talents, I wonder if YouTube creations like Justin Bieber and Russell Cooper would have become stars if YouTube did not post the number of times a video clip has been viewed.”
It’s a nice quote and, having seen the full email from the source, I think the workers did a good job of homing in on the quotable sections. But problems arose when we asked them to insert the quote into the story. In fact, the problems were significant enough for us to think that we need to tweak some of the steps that precede this quote-getting stage. More on that next week.
Posted by Jim Giles at 10:42 am on May 5th, 2011.
Categories: the experiment.
Our plan to automate the creation of news articles raises a couple of ethical problems. I talked about pay (it’s below US minimum wage) in a previous post. The other issue is journalistic in nature. We need expert sources to inform the article. But is it ethical to ask crowdsourced workers to be reporters?
Some journalists would have no problem with this. Many working reporters, myself included, never received formal journalism training. I started off as a freelancer and later got a staff job, but at no point did anyone sit me down and discuss the ethics of reporting. My sense is that this has traditionally been the way it works in Britain, where I was first based, but less so in the United States.
But that doesn’t mean that journalistic ethics don’t matter. For example, I don’t think that reporters should misrepresent themselves (that is, pretend to be someone else) unless there is a strong public interest rationale for doing so. Some newspapers, particularly in the United States, never permit their reporters to do so. A more minor example involves the business of turning a question into a quote. For example, a reporter might ask: “Would you say that this project is now dead?” If the source replies “maybe, yes”, the quote becomes “this project is now dead,” says source. That’s okay in British tabloid papers, not so much in The New York Times.
Reporters usually know what the rules are. That’s why publications require staff to follow codes of conduct. But that control can disappear when reporting is outsourced. On a platform like Mechanical Turk, workers only have time to read a few lines of instructions. There’s no room for an ethics lesson.
In the initial tests of our crowdsourcing system, we dodged this issue by interviewing sources by email. We first prepared a set of generic questions that could apply to any scientific paper, such as “What is the major contribution of this paper?” and “What are the applications of this work?” Then workers obtained email addresses for academics who are well placed to comment on the paper we’re writing about. The software that controls the process — what we’re calling the robot boss — emailed the academics. (I’ll post something on the results of this process in the next few days).
Like I said, that’s a dodge. It might work for our purposes, but to do more thorough reporting we would need to ask specific questions. It’s hard to see how we could automate the generation of those questions, yet we lose control of the process if we let workers do the interviews. Could we have workers read the paper and come up with more detailed questions, but not actually talk to sources directly? Possibly. But this issue makes me think that our system, which might produce workable short stories, won’t easily be extended to more detailed pieces.
Posted by Jim Giles at 4:25 pm on April 20th, 2011.
Categories: ethics.