The curly fry conundrum: do social media "likes" reveal your secrets?

Jennifer Golbeck

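Description

Do you like curly fries? Have you "liked" them on Facebook? Watch this talk to find out how much Facebook (and other social networks) can infer about you from seemingly random likes and posts. Computer scientist Jennifer Golbeck explains how this is possible, why some applications of the technology are not so benign, and why control over information should be returned to its rightful owners.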

Category: News and politics

Script

If you remember that first decade of the web, it was really a static place. You could go online, you could look at pages, and they were put up either by organizations who had teams to do it or by individuals who were really tech-savvy for the time. And with the rise of social media and social networks in the early 2000s, the web was completely changed to a place where now the vast majority of content we interact with is put up by average users, either in YouTube videos or blog posts or product reviews or social media postings. And it's also become a much more interactive place, where people are interacting with others, they're commenting, they're sharing, they're not just reading.

So Facebook is not the only place you can do this, but it's the biggest, and it serves to illustrate the numbers. Facebook has 1.2 billion users per month. So half the Earth's Internet population is using Facebook. They are a site, along with others, that has allowed people to create an online persona with very little technical skill, and people responded by putting huge amounts of personal data online. So the result is that we have behavioral, preference, demographic data for hundreds of millions of people, which is unprecedented in history. And as a computer scientist, what this means is that I've been able to build models that can predict all sorts of hidden attributes for all of you that you don't even know you're sharing information about. As scientists, we use that to help the way people interact online, but there's less altruistic applications, and there's a problem in that users don't really understand these techniques and how they work, and even if they did, they don't have a lot of control over it. So what I want to talk to you about today is some of these things that we're able to do, and then give us some ideas of how we might go forward to move some control back into the hands of users.

So this is Target, the company. I didn't just put that logo on this poor, pregnant woman's belly. You may have seen this anecdote that was printed in Forbes magazine where Target sent a flyer to this 15-year-old girl with advertisements and coupons for baby bottles and diapers and cribs two weeks before she told her parents that she was pregnant. Yeah, the dad was really upset. He said, "How did Target figure out that this high school girl was pregnant before she told her parents?" It turns out that they have the purchase history for hundreds of thousands of customers and they compute what they call a pregnancy score, which is not just whether or not a woman's pregnant, but what her due date is. And they compute that not by looking at the obvious things, like, she's buying a crib or baby clothes, but things like, she bought more vitamins than she normally had, or she bought a handbag that's big enough to hold diapers. And by themselves, those purchases don't seem like they might reveal a lot, but it's a pattern of behavior that, when you take it in the context of thousands of other people, starts to actually reveal some insights. So that's the kind of thing that we do when we're predicting stuff about you on social media. We're looking for little patterns of behavior that, when you detect them among millions of people, lets us find out all kinds of things.

So in my lab and with colleagues, we've developed mechanisms where we can quite accurately predict things like your political preference, your personality score, gender, sexual orientation, religion, age, intelligence, along with things like how much you trust the people you know and how strong those relationships are. We can do all of this really well. And again, it doesn't come from what you might think of as obvious information.

So my favorite example is from this study that was published this year in the Proceedings of the National Academies. If you Google this, you'll find it. It's four pages, easy to read. And they looked at just people's Facebook likes, so just the things you like on Facebook, and used that to predict all these attributes, along with some other ones. And in their paper they listed the five likes that were most indicative of high intelligence. And among those was liking a page for curly fries. (Laughter) Curly fries are delicious, but liking them does not necessarily mean that you're smarter than the average person. So how is it that one of the strongest indicators of your intelligence is liking this page when the content is totally irrelevant to the attribute that's being predicted? And it turns out that we have to look at a whole bunch of underlying theories to see why we're able to do this. One of them is a sociological theory called homophily, which basically says people are friends with people like them. So if you're smart, you tend to be friends with smart people, and if you're young, you tend to be friends with young people, and this is well established for hundreds of years. We also know a lot about how information spreads through networks. It turns out things like viral videos or Facebook likes or other information spreads in exactly the same way that diseases spread through social networks. So this is something we've studied for a long time. We have good models of it. And so you can put those things together and start seeing why things like this happen. So if I were to give you a hypothesis, it would be that a smart guy started this page, or maybe one of the first people who liked it would have scored high on that test. And they liked it, and their friends saw it, and by homophily, we know that he probably had smart friends, and so it spread to them, and some of them liked it, and they had smart friends, and so it spread to them, and so it propagated through the network to a host of smart people, so that by the end, the action of liking the curly fries page is indicative of high intelligence, not because of the content, but because the actual action of liking reflects back the common attributes of other people who have done it.
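The hypothesis above can be sketched as a small simulation. This is only a minimal illustration, not the method of the study mentioned in the talk: the function name `simulate_like_spread`, the binary "high scorer" trait, the network size, the homophily level, and the chance that a friend passes the like on are all made-up assumptions.

```python
import random


def simulate_like_spread(n_users=2000, picks_per_user=10, homophily=0.9,
                         like_prob=0.2, n_seeds=20, rounds=2, seed=0):
    """Spread a 'like' through a homophilous friend network.

    Returns (share of high scorers among likers, share in the population).
    All parameters are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical binary trait: the first half of the users are high scorers.
    high = [i < n_users // 2 for i in range(n_users)]
    by_trait = {
        True: [i for i in range(n_users) if high[i]],
        False: [i for i in range(n_users) if not high[i]],
    }

    # Homophily: each user picks friends mostly from their own group.
    friends = [set() for _ in range(n_users)]
    for u in range(n_users):
        for _ in range(picks_per_user):
            group = high[u] if rng.random() < homophily else not high[u]
            v = rng.choice(by_trait[group])
            if v != u:
                friends[u].add(v)
                friends[v].add(u)

    # The page is first liked by a few high scorers (users 0..n_seeds-1).
    liked = set(range(n_seeds))
    frontier = sorted(liked)

    # Each round, friends of recent likers see the like and may copy it.
    for _ in range(rounds):
        newly_liked = []
        for u in frontier:
            for v in sorted(friends[u]):
                if v not in liked and rng.random() < like_prob:
                    liked.add(v)
                    newly_liked.append(v)
        frontier = newly_liked

    share_among_likers = sum(high[u] for u in liked) / len(liked)
    share_overall = sum(high) / n_users
    return share_among_likers, share_overall


if __name__ == "__main__":
    likers, overall = simulate_like_spread()
    print(f"high scorers among likers: {likers:.2f}, overall: {overall:.2f}")
```

In this toy model the page itself carries no information about the trait. The like becomes a signal only because it started among high scorers and travelled along friendships that mostly connect similar people. The spread is cut off after a couple of rounds on purpose: if it were allowed to reach almost everyone, the signal would wash out.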

So this is pretty complicated stuff, right? It's a hard thing to sit down and explain to an average user, and even if you do, what can the average user do about it? How do you know that you've liked something that indicates a trait for you that's totally irrelevant to the content of what you've liked? There's a lot of power that users don't have to control how this data is used. And I see that as a real problem going forward.

So I think there's a couple paths that we want to look at if we want to give users some control over how this data is used, because it's not always going to be used for their benefit. An example I often give is that, if I ever get bored being a professor, I'm going to go start a company that predicts all of these attributes and things like how well you work in teams and if you're a drug user, if you're an alcoholic. We know how to predict all that. And I'm going to sell reports to H.R. companies and big businesses that want to hire you. We totally can do that now. I could start that business tomorrow, and you would have absolutely no control over me using your data like that. That seems to me to be a problem.

So one of the paths we can go down is the policy and law path. And in some respects, I think that that would be most effective, but the problem is we'd actually have to do it. Observing our political process in action makes me think it's highly unlikely that we're going to get a bunch of representatives to sit down, learn about this, and then enact sweeping changes to intellectual property law in the U.S. so users control their data.

We could go the policy route, where social media companies say, you know what? You own your data. You have total control over how it's used. The problem is that the revenue models for most social media companies rely on sharing or exploiting users' data in some way. It's sometimes said of Facebook that the users aren't the customer, they're the product. And so how do you get a company to cede control of their main asset back to the users? It's possible, but I don't think it's something that we're going to see change quickly.

So I think the other path that we can go down that's going to be more effective is one of more science. It's doing science that allowed us to develop all these mechanisms for computing this personal data in the first place. And it's actually very similar research that we'd have to do if we want to develop mechanisms that can say to a user, "Here's the risk of that action you just took." By liking that Facebook page, or by sharing this piece of personal information, you've now improved my ability to predict whether or not you're using drugs or whether or not you get along well in the workplace. And that, I think, can affect whether or not people want to share something, keep it private, or just keep it offline altogether. We can also look at things like allowing people to encrypt data that they upload, so it's kind of invisible and worthless to sites like Facebook or third party services that access it, but that select users who the person who posted it want to see it have access to see it. This is all super exciting research from an intellectual perspective, and so scientists are going to be willing to do it. So that gives us an advantage over the law side.
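A warning of the kind described above, "Here's the risk of that action you just took," could be sketched with a single Bayes update. This is an assumption about how such a tool might work, not something described in the talk: the function name `posterior_after_like` and every probability in the example are hypothetical.

```python
def posterior_after_like(prior, p_like_given_trait, p_like_given_no_trait):
    """Bayes' rule: probability of a hidden trait after one observed like.

    prior                  -- estimated probability of the trait before the like
    p_like_given_trait     -- how often people with the trait like the page
    p_like_given_no_trait  -- how often people without the trait like the page
    All three inputs are hypothetical numbers a real system would have to
    estimate from data.
    """
    evidence = (prior * p_like_given_trait
                + (1 - prior) * p_like_given_no_trait)
    return prior * p_like_given_trait / evidence


if __name__ == "__main__":
    before = 0.10  # made-up prior for some sensitive attribute
    after = posterior_after_like(before, 0.30, 0.10)
    print(f"Liking this page moves the estimate from {before:.2f} to {after:.2f}")
```

Showing the before and after estimates side by side is one way to let a user judge, at the moment of clicking, whether a like is worth what it gives away.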

One of the problems that people bring up when I talk about this is, they say, you know, if people start keeping all this data private, all those methods that you've been developing to predict their traits are going to fail. And I say, absolutely, and for me, that's success, because as a scientist, my goal is not to infer information about users, it's to improve the way people interact online. And sometimes that involves inferring things about them, but if users don't want me to use that data, I think they should have the right to do that. I want users to be informed and consenting users of the tools that we develop.

And so I think encouraging this kind of science and supporting researchers who want to cede some of that control back to users and away from the social media companies means that going forward, as these tools evolve and advance, means that we're going to have an educated and empowered user base, and I think all of us can agree that that's a pretty ideal way to go forward.

Thank you.

(Applause)
