TED Japanese

TED Talks (videos with English and Japanese subtitles)

TED Japanese - Kenneth Cukier: Big data is better data

TED Talks

Big data is better data

Kenneth Cukier

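Description

Self-driving cars are only the beginning. What does the future hold for technology and design driven by big data? In an exciting science talk, Kenneth Cukier examines what comes next for machine learning and human knowledge.

Category

Culture and education

Tags: TED Japanese

External link: TED | Kenneth Cukier: Big data is better data

Subtitles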

Script

America's favorite pie is?

Audience: Apple.

Kenneth Cukier: Apple. Of course it is. How do we know it? Because of data. You look at supermarket sales. You look at supermarket sales of 30-centimeter pies that are frozen, and apple wins, no contest. The majority of the sales are apple. But then supermarkets started selling smaller, 11-centimeter pies, and suddenly, apple fell to fourth or fifth place. Why? What happened? Okay, think about it. When you buy a 30-centimeter pie, the whole family has to agree, and apple is everyone's second favorite. (Laughter) But when you buy an individual 11-centimeter pie, you can buy the one that you want. You can get your first choice. You have more data. You can see something that you couldn't see when you only had smaller amounts of it.

Now, the point here is that more data doesn't just let us see more, more of the same thing we were looking at. More data allows us to see new. It allows us to see better. It allows us to see different. In this case, it allows us to see what America's favorite pie is: not apple.
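Editor's illustration: the pie effect can be reproduced with a few lines of Python. This is only a sketch under stated assumptions. The households and their rankings are invented, and the rule that a family buys the pie with the best combined rank is an assumed stand-in for "the whole family has to agree"; none of it comes from real sales data.

```python
from collections import Counter

# Hypothetical households, invented for illustration. Each member ranks the
# same four pies from most liked to least liked.
HOUSEHOLDS = [
    [["cherry", "apple", "pecan", "pumpkin"],
     ["pecan", "apple", "pumpkin", "cherry"],
     ["pumpkin", "apple", "cherry", "pecan"]],
    [["cherry", "apple", "pumpkin", "pecan"],
     ["pecan", "apple", "pumpkin", "cherry"]],
    [["apple", "cherry", "pecan", "pumpkin"],
     ["cherry", "apple", "pumpkin", "pecan"],
     ["cherry", "apple", "pecan", "pumpkin"]],
]

def family_pick(household):
    """Pick one shared pie: the one with the lowest total rank position."""
    totals = Counter()
    for ranking in household:
        for position, pie in enumerate(ranking):
            totals[pie] += position
    # Ties are broken alphabetically so the result is deterministic.
    return min(totals, key=lambda pie: (totals[pie], pie))

# Large pies: one compromise purchase per household.
large_pie_sales = Counter(family_pick(h) for h in HOUSEHOLDS)
# Small pies: every member buys their own first choice.
small_pie_sales = Counter(ranking[0] for h in HOUSEHOLDS for ranking in h)

print("large pies:", large_pie_sales.most_common())
print("small pies:", small_pie_sales.most_common())
```

With these made-up households, apple leads the large-pie tally because it is the usual compromise, while the per-person tally is led by a different pie.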

Now, you probably all have heard the term big data. In fact, you're probably sick of hearing the term big data. It is true that there is a lot of hype around the term, and that is very unfortunate, because big data is an extremely important tool by which society is going to advance. In the past, we used to look at small data and think about what it would mean to try to understand the world, and now we have a lot more of it, more than we ever could before. What we find is that when we have a large body of data, we can fundamentally do things that we couldn't do when we only had smaller amounts. Big data is important, and big data is new, and when you think about it, the only way this planet is going to deal with its global challenges -- to feed people, supply them with medical care, supply them with energy, electricity, and to make sure they're not burnt to a crisp because of global warming -- is because of the effective use of data.

So what is new about big data? What is the big deal? Well, to answer that question, let's think about what information looked like, physically looked like in the past. In 1908, on the island of Crete, archaeologists discovered a clay disc. They dated it from 2000 B.C., so it's 4,000 years old. Now, there's inscriptions on this disc, but we actually don't know what it means. It's a complete mystery, but the point is that this is what information used to look like 4,000 years ago. This is how society stored and transmitted information.

Now, society hasn't advanced all that much. We still store information on discs, but now we can store a lot more information, more than ever before. Searching it is easier. Copying it is easier. Sharing it is easier. Processing it is easier. And what we can do is we can reuse this information for uses that we never even imagined when we first collected the data. In this respect, the data has gone from a stock to a flow, from something that is stationary and static to something that is fluid and dynamic. There is, if you will, a liquidity to information. The disc that was discovered off of Crete that's 4,000 years old, is heavy, it doesn't store a lot of information, and that information is unchangeable. By contrast, all of the files that Edward Snowden took from the National Security Agency in the United States fit on a memory stick the size of a fingernail, and they can be shared at the speed of light. More data. More.

Now, one reason why we have so much data in the world today is we are collecting things that we've always collected information on, but another reason why is we're taking things that have always been informational but have never been rendered into a data format and we are putting it into data. Think, for example, of the question of location. Take, for example, Martin Luther. If we wanted to know in the 1500s where Martin Luther was, we would have to follow him at all times, maybe with a feathery quill and an inkwell, and record it, but now think about what it looks like today. You know that somewhere, probably in a telecommunications carrier's database, there is a spreadsheet or at least a database entry that records your information of where you've been at all times. If you have a cell phone, and that cell phone has GPS, but even if it doesn't have GPS, it can record your information. In this respect, location has been datafied.

Now think, for example, of the issue of posture, the way that you are all sitting right now, the way that you sit, the way that you sit, the way that you sit. It's all different, and it's a function of your leg length and your back and the contours of your back, and if I were to put sensors, maybe 100 sensors into all of your chairs right now, I could create an index that's fairly unique to you, sort of like a fingerprint, but it's not your finger.

So what could we do with this? Researchers in Tokyo are using it as a potential anti-theft device in cars. The idea is that the carjacker sits behind the wheel, tries to stream off, but the car recognizes that a non-approved driver is behind the wheel, and maybe the engine just stops, unless you type in a password into the dashboard to say, "Hey, I have authorization to drive." Great.
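Editor's illustration: one way to picture a posture "fingerprint" is as a vector of seat-sensor readings compared against a stored profile. The sketch below is not the Tokyo researchers' method, which the talk does not describe. The synthetic pressure values, the noise level, the averaging step, and the distance threshold are all assumptions chosen for the example; only the figure of 100 sensors comes from the talk.

```python
import math
import random

NUM_SENSORS = 100  # the figure mentioned in the talk

def distance(a, b):
    """Euclidean distance between two sensor readings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def enroll(readings):
    """Average several readings from one driver into a stored profile."""
    return [sum(values) / len(readings) for values in zip(*readings)]

def is_authorized(reading, profile, threshold=1.0):
    """Accept the sitter only if the reading is close to the profile."""
    return distance(reading, profile) <= threshold

# Synthetic data (assumed): each person has a fixed pressure pattern,
# and every time they sit down the sensors add a little noise.
rng = random.Random(0)

def new_person():
    return [rng.uniform(0.0, 1.0) for _ in range(NUM_SENSORS)]

def sit(person, noise=0.02):
    return [value + rng.gauss(0.0, noise) for value in person]

owner, stranger = new_person(), new_person()
profile = enroll([sit(owner) for _ in range(5)])

print(is_authorized(sit(owner), profile))     # the owner sits down again
print(is_authorized(sit(stranger), profile))  # someone else sits down
```

A real system would need a threshold tuned on measured data and a fallback such as the dashboard password mentioned in the talk, since a fixed cutoff will sometimes reject the legitimate driver.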

What if every single car in Europe had this technology in it? What could we do then? Maybe, if we aggregated the data, maybe we could identify telltale signs that best predict that a car accident is going to take place in the next five seconds. And then what we will have datafied is driver fatigue, and the service would be when the car senses that the person slumps into that position, automatically knows, hey, set an internal alarm that would vibrate the steering wheel, honk inside to say, "Hey, wake up, pay more attention to the road." These are the sorts of things we can do when we datafy more aspects of our lives.

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive areas where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the problem and tell the computer to figure it out for itself. And it will help you understand it by seeing its origins. In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play checkers, so he wrote a computer program so he could play against the computer. He played. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the background, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It collects more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his ability in a task that he taught it.
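Editor's illustration: the loop described here (know the legal moves, score each position by how often it led to a win, then improve by playing yourself) can be sketched on a game much smaller than checkers. This is not a reconstruction of Samuel's program. It swaps checkers for a simple take-away game in which players remove 1 to 3 stones and whoever takes the last stone wins, and the class name, exploration rate, pile size, and number of games are all assumptions made for the example.

```python
import random

MAX_TAKE = 3     # a player may take 1, 2 or 3 stones per turn
START_PILE = 10  # assumed starting pile; taking the last stone wins

class SelfPlayLearner:
    """Scores positions by how often the player who created them won."""

    def __init__(self, epsilon=0.2, seed=0):
        self.rng = random.Random(seed)
        self.epsilon = epsilon  # share of moves chosen at random
        self.wins = {}          # stones left -> wins for the player who left them
        self.plays = {}         # stones left -> times that position occurred

    def win_probability(self, stones_left):
        plays = self.plays.get(stones_left, 0)
        if plays == 0:
            return 0.5  # no data yet, so no opinion
        return self.wins.get(stones_left, 0) / plays

    def best_take(self, pile):
        """Legal move whose resulting position has the best score so far."""
        moves = range(1, min(MAX_TAKE, pile) + 1)
        return max(moves, key=lambda take: self.win_probability(pile - take))

    def choose(self, pile):
        if self.rng.random() < self.epsilon:
            return self.rng.randint(1, min(MAX_TAKE, pile))
        return self.best_take(pile)

    def play_one_game(self):
        pile, player = START_PILE, 0
        history = []  # (player, stones left after that player's move)
        while pile > 0:
            pile -= self.choose(pile)
            history.append((player, pile))
            player = 1 - player
        winner = history[-1][0]  # whoever took the last stone
        for mover, stones_left in history:
            self.plays[stones_left] = self.plays.get(stones_left, 0) + 1
            if mover == winner:
                self.wins[stones_left] = self.wins.get(stones_left, 0) + 1

    def train(self, games):
        for _ in range(games):
            self.play_one_game()

learner = SelfPlayLearner()
learner.train(20000)  # the program plays itself and collects data
print({pile: learner.best_take(pile) for pile in (5, 6, 7)})
```

Nobody tells the program the winning strategy for this game, which is to leave the opponent a multiple of four stones. After enough self-play the recorded win rates should point to it anyway, which is the sense in which the talk says the machine learned from data.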

And this idea of machine learning is going everywhere. How do you think we have self-driving cars? Are we any better off as a society enshrining all the rules of the road into software? No. Memory is cheaper. No. Algorithms are faster. No. Processors are better. No. All of those things matter, but that's not why. It's because we changed the nature of the problem. We changed the nature of the problem from one in which we tried to overtly and explicitly explain to the computer how to drive to one in which we say, "Here's a lot of data around the vehicle. You figure it out. You figure it out that that is a traffic light, that that traffic light is red and not green, that that means that you need to stop and not go forward."

Machine learning is at the basis of many of the things that we do online: search engines, Amazon's personalization algorithm, computer translation, voice recognition systems. Researchers recently have looked at the question of biopsies, cancerous biopsies, and they've asked the computer to identify by looking at the data and survival rates to determine whether cells are actually cancerous or not, and sure enough, when you throw the data at it, through a machine-learning algorithm, the machine was able to identify the 12 telltale signs that best predict that this biopsy of the breast cancer cells are indeed cancerous. The problem: The medical literature only knew nine of them. Three of the traits were ones that people didn't need to look for, but that the machine spotted.

Now, there are dark sides to big data as well. It will improve our lives, but there are problems that we need to be conscious of, and the first one is the idea that we may be punished for predictions, that the police may use big data for their purposes, a little bit like "Minority Report." Now, it's a term called predictive policing, or algorithmic criminology, and the idea is that if we take a lot of data, for example where past crimes have been, we know where to send the patrols. That makes sense, but the problem, of course, is that it's not simply going to stop on location data, it's going to go down to the level of the individual. Why don't we use data about the person's high school transcript? Maybe we should use the fact that they're unemployed or not, their credit score, their web-surfing behavior, whether they're up late at night. Their Fitbit, when it's able to identify biochemistries, will show that they have aggressive thoughts. We may have algorithms that are likely to predict what we are about to do, and we may be held accountable before we've actually acted. Privacy was the central challenge in a small data era. In the big data age, the challenge will be safeguarding free will, moral choice, human volition, human agency.

There is another problem: Big data is going to steal our jobs. Big data and algorithms are going to challenge white collar, professional knowledge work in the 21st century in the same way that factory automation and the assembly line challenged blue collar labor in the 20th century. Think about a lab technician who is looking through a microscope at a cancer biopsy and determining whether it's cancerous or not. The person went to university. The person buys property. He or she votes. He or she is a stakeholder in society. And that person's job, as well as an entire fleet of professionals like that person, is going to find that their jobs are radically changed or actually completely eliminated. Now, we like to think that technology creates jobs over a period of time after a short, temporary period of dislocation, and that is true for the frame of reference with which we all live, the Industrial Revolution, because that's precisely what happened. But we forget something in that analysis: There are some categories of jobs that simply get eliminated and never come back. The Industrial Revolution wasn't very good if you were a horse. So we're going to need to be careful and take big data and adjust it for our needs, our very human needs. We have to be the master of this technology, not its servant. We are just at the outset of the big data era, and honestly, we are not very good at handling all the data that we can now collect. It's not just a problem for the National Security Agency. Businesses collect lots of data, and they misuse it too, and we need to get better at this, and this will take time. It's a little bit like the challenge that was faced by primitive man and fire. This is a tool, but this is a tool that, unless we're careful, will burn us.

Big data is going to transform how we live, how we work and how we think. It is going to help us manage our careers and lead lives of satisfaction and hope and happiness and health, but in the past, we've often looked at information technology and our eyes have only seen the T, the technology, the hardware, because that's what was physical. We now need to recast our gaze at the I, the information, which is less apparent, but in some ways a lot more important. Humanity can finally learn from the information that it can collect, as part of our timeless quest to understand the world and our place in it, and that's why big data is a big deal.

What is America's favorite pie?

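Audience: "Apple pie." Apple pie, of course. How do we know? Because we have data. Consider supermarket sales, specifically sales of 30-centimeter frozen pies. Apple pie is far and away number one. Apple accounts for the majority of sales. But when supermarkets began selling small 11-centimeter pies, apple pie suddenly dropped to fourth or fifth place. Why? What happened? Think about it. When you buy a 30-centimeter pie, you choose one that suits the whole family, and apple pie is the family's second choice. (Laughter) But when you buy an individual 11-centimeter pie, you buy the pie you want. You can buy your first choice. With more data, you can see things you could not see when you had less.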

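In other words, more data does not just let us see more; it lets us learn more from what we were already looking at. The more data we have, the more new things we can see. We gain better and different ways of seeing. What this example shows is that America's favorite pie is not apple pie.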

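You have all probably heard the term "big data." In fact, you may have heard it so often that you are sick of it. Parts of big data are overhyped, which is very unfortunate, because big data is an extremely important tool, indispensable to the progress of society. In the past, we tried to understand the world from small amounts of data. Now we have more data than was ever conceivable before. We have found that with large amounts of data, things that were impossible with less become fundamentally possible. Big data is important, and it is new. When you think about it, the only way to deal with global challenges such as food, the supply of medical care, and the supply of energy and electricity, and to keep us from being burnt to a crisp by global warming, is to use data effectively.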

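What is new about big data, and what is the big deal? To answer that question, let's think about what information was like and how it physically looked in the past. In 1908, on the island of Crete, archaeologists discovered a clay disc. It dates from 2000 B.C., 4,000 years ago. There is writing on this disc, but we cannot actually decipher it. It is a complete mystery, but my point is that this is what information looked like 4,000 years ago. This is how society stored and transmitted information.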

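Now, society has not advanced all that much. We still store information on discs, but we can store far more of it than before. Searching and copying are easier. Sharing and processing are easier. We can also reuse information in ways we never imagined when we collected it. In this respect, data has changed from something fixed to something flowing, from something static and unchanging to something fluid and dynamic. There is, so to speak, a liquidity to information. The 4,000-year-old disc discovered on Crete is heavy, does not hold much information, and cannot be rewritten. By contrast, all of the files Edward Snowden took from the U.S. National Security Agency fit on a USB stick the size of a fingernail and can be shared at the speed of light. Data keeps growing.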

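Now, one reason there is so much data in the world today is that we are collecting things we have always gathered information on. Another reason is that we are taking things that have always carried information but were never rendered into a data format, and turning them into data. Consider location, for example. Take Martin Luther. If we wanted to know where Martin Luther was in the 1500s, we would have had to follow him at all times, carrying a quill and an inkwell, and record where he was. But what about today? A telecommunications carrier's database reveals where you are. There is a spreadsheet, or a database entry, that records information about where you have been at all times. If you have a cell phone, it has GPS, and even a model without GPS can record your information. In other words, location has been datafied.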

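As another example, consider posture. You are all sitting right now, and the way you sit, the way you sit, and the way you sit are all different. Your leg length, your back, and the contours of your back all differ. If I put 100 sensors into the chairs you are sitting in now, I could classify the characteristics of your own way of sitting like a fingerprint, although it is not your finger.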

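What could we do with this? Researchers in Tokyo think it could be used as an anti-theft device for cars. The idea is that a car thief sits in the driver's seat, and the car uses that to prevent the theft. If a non-approved driver sits behind the wheel, the engine might not start unless they enter a password on the dashboard to say, "I am an authorized driver." Great, isn't it?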

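What if every car in Europe carried this technology? What could we do then? Perhaps, by collecting the data, we could pinpoint that a car accident is going to happen in the next five seconds. We would then have datafied driver fatigue: when the car senses that the driver's posture is slumping, it automatically sets off an internal alarm. It vibrates the steering wheel or says, "Wake up. Pay more attention to the road." When we datafy more aspects of our lives, these are the kinds of things we can do.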

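So what is the value of big data? Think about it. You have more information, and you can do things you could not do before. One of the most impressive areas where this concept applies is machine learning. Machine learning is part of artificial intelligence, which is a branch of computer science. The idea is that instead of telling a computer what to do, we simply throw data at the problem and the computer figures it out for itself. Tracing its origins makes this easier to understand. In the 1950s, an IBM computer scientist named Arthur Samuel liked checkers. He wrote a computer program and played against the computer. He played and won. He played and won. He played and won, because the computer only knew the legal moves. Arthur Samuel knew something more. He knew strategy. So he wrote a sub-program and ran it in the background. After every move, the sub-program recorded the probability of winning and the probability of losing from that board configuration. He played the computer and won. He played the computer and won. He played the computer and won. Then Arthur Samuel left the computer to play itself. The computer played itself and collected more data. As it collected more data, the accuracy of its predictions improved. Then Arthur Samuel went back to the computer, played it, and lost. He played and lost. He played and lost. Arthur Samuel had built a computer that surpassed his own ability in a task he had taught it.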

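The idea of machine learning is everywhere. How do you think self-driving cars were built? Are we a better-off society for writing every rule of the road into software? No. Is memory cheaper? No. Are algorithms faster? No. Are processors better? No. All of those matter, but they are not the reason. It is because we changed the nature of the problem. We changed it from one in which we tried to explain explicitly to the computer how to drive into one in which we say, "There is a lot of data around the vehicle. You figure it out. Figure out that this is a traffic light, that the light is red and not green, and that this means you need to stop and not go forward."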

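Machine learning underlies much of what we do online: search engines, Amazon's personalization algorithm, machine translation, voice recognition, and so on. Recently, researchers have been studying biopsies, cancer biopsies, and have had a computer use the data and survival rates to determine whether cells are actually cancerous. Sure enough, once the data was fed in, the machine-learning algorithm let the computer identify 12 telltale signs that accurately predict that a biopsy of breast cancer cells is cancerous. The problem: the medical literature knew only nine of them. Three of the traits were ones nobody had needed to look for, but the computer found them.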

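Now, big data has a dark side too. It will improve our lives, but there are problems we need to be aware of. The first is that, as in "Minority Report," the police may use big data for their own purposes and punish people on the basis of predictions. This is called predictive policing, or algorithmic criminology. The idea is that with a lot of data, for example on where past crimes occurred, we know where to send patrols. That makes sense, but there is of course a problem: it will not stop at location data. It will go down to the level of the individual. Why not use data from a person's high school transcript? We might use whether they are unemployed, their credit information, their web-surfing patterns, whether they stay up late at night. Once a Fitbit can pick up biochemical information, it could even show that the wearer is having aggressive thoughts. We may have algorithms that can predict what we are about to do, and we may be held accountable before we have actually acted. In the small data era, privacy was the central challenge. In the big data era, the challenge will be safeguarding free will, moral choice, human volition, and human agency.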

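There is another problem: big data will take our jobs. Big data and algorithms will challenge white-collar and professional work in the 21st century, just as factory automation and the assembly line challenged blue-collar labor in the 20th century. Think about a lab technician who examines a cancer biopsy under a microscope and decides whether it is cancerous. That technician went to university. He or she buys property, votes, and is a stakeholder in society. And that technician's job, along with those of a whole group of similar professionals, will change radically or disappear completely. Now, we like to think that technology creates jobs over the long run, after a short, temporary period of disruption. For the frame of reference we all live in, the Industrial Revolution, that is true, because that is exactly what happened. But that analysis forgets something: some categories of jobs disappeared and never came back. The Industrial Revolution was not very good if you were a horse. So we need to handle big data carefully and adjust it to our needs, our very human needs. We must be the master of this technology, not its servant. The big data era has only just begun, and honestly, we are not very good at handling all the data we can now collect. This is not just a problem for the National Security Agency. Businesses also collect a lot of data and misuse it. Getting better at this will take time. It is a little like the challenge primitive man faced with fire. This is a tool, but it is a tool that will burn us unless we are careful.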

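Big data will transform how we live, how we work, and how we think. It will help us manage our careers and lead lives of satisfaction, hope, happiness, and health. But in the past, with information technology, our eyes tended to go to the T, the technology and the hardware, because that was what was physical. We now need to turn our attention to the I, the information, which is less apparent but in some ways much more important. Humanity can finally learn from the information it collects, as part of our timeless quest to understand the world and our place in it. That is why big data is a big deal.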

(Applause)
