TED Talks

How I built an information time machine

Frederic Kaplan

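Description

What if you could see a Facebook of the Middle Ages? The idea is not as far-fetched as it sounds. In this entertaining and intriguing talk, researcher and engineer Frederic Kaplan presents the Venice Time Machine, a project to digitize the books held in 80 kilometers of archives and recreate 1,000 years of Venetian history and geography. (Filmed at TEDxCaFoscariU.)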

Category

Science and technology

Computers

Script

This is an image of the planet Earth. It looks very much like the Apollo pictures that are very well known. There is something different; you can click on it, and if you click on it, you can zoom in on almost any place on the Earth. For instance, this is a bird's-eye view of the EPFL campus. In many cases, you can also see how a building looks from a nearby street. This is pretty amazing. But there's something missing in this wonderful tour: It's time. I'm not really sure when this picture was taken. I'm not even sure it was taken at the same moment as the bird's-eye view.

In my lab, we develop tools to travel not only in space but also through time. The kind of question we're asking is: Is it possible to build something like Google Maps of the past? Can I add a slider on top of Google Maps and just change the year, seeing how it was 100 years before, 1,000 years before? Is that possible? Can I reconstruct social networks of the past? Can I make a Facebook of the Middle Ages? So, can I build time machines?

Maybe we can just say, "No, it's not possible." Or, maybe, we can think of it from an information point of view. This is what I call the information mushroom. Vertically, you have the time, and horizontally, the amount of digital information available. Obviously, in the last 10 years, we have much information. And obviously the more we go in the past, the less information we have. If we want to build something like Google Maps of the past, or Facebook of the past, we need to enlarge this space, we need to make that like a rectangle.

How do we do that? One way is digitization. There's a lot of material available -- newspaper, printed books, thousands of printed books. I can digitize all these. I can extract information from these. Of course, the more you go in the past, the less information you will have. So, it might not be enough. So, I can do what historians do. I can extrapolate. This is what we call, in computer science, simulation. If I take a log book, I can consider, it's not just a log book of a Venetian captain going to a particular journey. I can consider it is actually a log book which is representative of many journeys of that period. I'm extrapolating. If I have a painting of a facade, I can consider it's not just that particular building, but probably it also shares the same grammar of buildings where we lost any information.

So if we want to construct a time machine, we need two things. We need very large archives, and we need excellent specialists. The Venice Time Machine, the project I'm going to talk to you about, is a joint project between the EPFL and the University of Venice Ca' Foscari.

There's something very peculiar about Venice, that its administration has been very, very bureaucratic. They've been keeping track of everything, almost like Google today. At the Archivio di Stato, you have 80 kilometers of archives documenting every aspect of the life of Venice over more than 1,000 years. You have every boat that goes out, every boat that comes in. You have every change that was made in the city. This is all there. We are setting up a 10-year digitization program which has the objective of transforming this immense archive into a giant information system. The type of objective we want to reach is 450 books a day that can be digitized. Of course, when you digitize, that's not enough, because these documents, most of them are in Latin, in Tuscan, in Venetian dialect, so you need to transcribe them, to translate them in some cases, to index them, and this is obviously not easy. In particular, the traditional optical character recognition methods that can be used for printed documents do not work well on handwritten documents. So the solution is actually to take inspiration from another domain: speech recognition. This is a domain of something that seems impossible, which can actually be done, simply by putting additional constraints. If you have a very good model of a language which is used, if you have a very good model of a document, how well they are structured. And these are administrative documents. They are well structured in many cases. If you divide this huge archive into smaller subsets where a smaller subset actually shares similar features, then there's a chance of success.
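The idea of constraining handwriting recognition with a language model, as in speech recognition, can be illustrated with a small sketch. This is not the project's actual method. The candidate words, the scores, the `word_prior` table, and the `best_reading` function are all hypothetical, chosen only to show how a prior over likely words can overrule an ambiguous visual reading.

```python
import math

# Hypothetical recognizer output for one handwritten word:
# candidate reading -> probability based on the image alone (made-up values).
candidates = {"ducati": 0.40, "ducali": 0.45, "ducatl": 0.15}

# Hypothetical language model for this kind of administrative document:
# word -> prior probability (made-up values).
word_prior = {"ducati": 0.08, "ducali": 0.01}

# Prior assigned to any word the language model has never seen.
FLOOR = 1e-6


def best_reading(candidates, word_prior, floor=FLOOR):
    """Pick the reading that maximizes log P(image | word) + log P(word)."""
    def score(word):
        return math.log(candidates[word]) + math.log(word_prior.get(word, floor))
    return max(candidates, key=score)


# The image alone slightly favors "ducali"; the language model tips the
# decision toward "ducati".
print(best_reading(candidates, word_prior))
```

With an empty prior table every candidate gets the same floor value, so the choice falls back to the visual score alone.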

If we reach that stage, then there's something else: we can extract from this document events. Actually probably 10 billion events can be extracted from this archive. And this giant information system can be searched in many ways. You can ask questions like, "Who lived in this palazzo in 1323?" "How much did a sea bream cost at the Rialto market in 1434?" "What was the salary of a glass maker in Murano maybe over a decade?" You can ask even bigger questions because it will be semantically coded. And then what you can do is put that in space, because much of this information is spatial. And from that, you can do things like reconstructing this extraordinary journey of that city that managed to have a sustainable development over a thousand years, managing to have all the time a form of equilibrium with its environment. You can reconstruct that journey, visualize it in many different ways. But of course, you cannot understand Venice if you just look at the city. You have to put it in a larger European context. So the idea is also to document all the things that worked at the European level. We can reconstruct also the journey of the Venetian maritime empire, how it progressively controlled the Adriatic Sea, how it became the most powerful medieval empire of its time, controlling most of the sea routes from the east to the south.
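To make the idea of searchable, semantically coded events concrete, here is a minimal sketch of what a query like "Who lived in this palazzo in 1323?" might look like. The `Event` fields, the `find_events` function, and every record are hypothetical placeholders. None of the values come from the archive, and the real system's data model is not described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    year: int
    place: str
    kind: str      # e.g. "residence", "price", "salary"
    subject: str   # who or what the event concerns
    detail: str    # free-text payload


# Placeholder records only: none of these values come from the archive.
EVENTS = [
    Event(1323, "palazzo-001", "residence", "household-A", "listed as occupant"),
    Event(1323, "palazzo-002", "residence", "household-B", "listed as occupant"),
    Event(1434, "Rialto market", "price", "sea bream", "price-placeholder"),
]


def find_events(events, kind, year=None, place=None, subject=None):
    """Return events of one kind, filtered by whichever fields are given."""
    def matches(event):
        return (event.kind == kind
                and (year is None or event.year == year)
                and (place is None or event.place == place)
                and (subject is None or event.subject == subject))
    return [event for event in events if matches(event)]


# "Who lived in this palazzo in 1323?"
who = find_events(EVENTS, "residence", year=1323, place="palazzo-001")
print([event.subject for event in who])
```

Because each event carries a place, the same records could also be plotted on a map for a given year.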

But you can even do other things, because in these maritime routes, there are regular patterns. You can go one step beyond and actually create a simulation system, create a Mediterranean simulator which is capable actually of reconstructing even the information we are missing, which would let you ask questions as if you were using a route planner.

"If I am in Corfu in June 1323 and want to go to Constantinople, where can I take a boat?"

Probably we can answer this question with one or two or three days' precision.

"How much will it cost?"

"What are the chances of encountering pirates?"

Of course, you understand, the central scientific challenge of a project like this one is qualifying, quantifying and representing uncertainty and inconsistency at each step of this process. There are errors everywhere, errors in the document, it's the wrong name of the captain, some of the boats never actually took to sea. There are errors in translation, interpretative biases, and on top of that, if you add algorithmic processes, you're going to have errors in recognition, errors in extraction, so you have very, very uncertain data.
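The route-planner questions and the need to quantify uncertainty can be combined in one small sketch: a shortest-path search in which every leg carries a range of sailing days instead of a single number. The port names are real places, but which ports are connected and all the durations are invented for illustration, and `plan_route` is a hypothetical helper, not part of the project.

```python
import heapq

# Hypothetical sailing legs: port -> {next port: (fastest days, slowest days)}.
# The connections and durations are invented for illustration.
LEGS = {
    "Corfu": {"Modon": (3, 5)},
    "Modon": {"Candia": (4, 6), "Negroponte": (5, 7)},
    "Candia": {"Constantinople": (10, 14)},
    "Negroponte": {"Constantinople": (6, 8)},
}


def plan_route(legs, start, goal):
    """Dijkstra search on each leg's midpoint duration.

    Returns (path, fastest_total, slowest_total), or None if unreachable.
    """
    queue = [(0.0, start, [start], 0, 0)]
    visited = set()
    while queue:
        expected, port, path, low, high = heapq.heappop(queue)
        if port == goal:
            return path, low, high
        if port in visited:
            continue
        visited.add(port)
        for nxt, (lo, hi) in legs.get(port, {}).items():
            if nxt not in visited:
                heapq.heappush(
                    queue,
                    (expected + (lo + hi) / 2, nxt, path + [nxt], low + lo, high + hi),
                )
    return None


path, fastest, slowest = plan_route(LEGS, "Corfu", "Constantinople")
print(" -> ".join(path), f"({fastest} to {slowest} days)")
```

Reporting a range instead of one number keeps the uncertainty visible to whoever asks the question. A fuller simulator would also need departure dates, costs, and risk estimates, which this sketch leaves out.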

So how can we detect and correct these inconsistencies? How can we represent that form of uncertainty? It's difficult. One thing you can do is document each step of the process, not only coding the historical information but what we call the meta-historical information, how is historical knowledge constructed, documenting each step. That will not guarantee that we actually converge toward a single story of Venice, but probably we can actually reconstruct a fully documented potential story of Venice. Maybe there's not a single map. Maybe there are several maps. The system should allow for that, because we have to deal with a new form of uncertainty, which is really new for this type of giant database.
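One way to picture "documenting each step" is to store, next to every extracted statement, the chain of processing steps that produced it, and to keep conflicting statements side by side instead of forcing a single answer. The sketch below is an assumption-laden illustration. The `Step` and `Claim` classes, the reliability numbers, and the two conflicting claims are all hypothetical, and multiplying step reliabilities assumes the steps fail independently, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str           # e.g. "scan", "transcription", "extraction"
    reliability: float  # assumed chance that this step introduced no error


@dataclass
class Claim:
    statement: str
    source: str
    steps: list = field(default_factory=list)

    def confidence(self):
        """Multiply step reliabilities (assumes steps fail independently)."""
        score = 1.0
        for step in self.steps:
            score *= step.reliability
        return score


# Two conflicting, entirely made-up claims about the same voyage.
claims = [
    Claim("ship-17 sailed under captain-A", "register-1",
          [Step("scan", 0.99), Step("transcription", 0.90), Step("extraction", 0.80)]),
    Claim("ship-17 sailed under captain-B", "register-2",
          [Step("scan", 0.99), Step("transcription", 0.70), Step("extraction", 0.80)]),
]

# Keep both versions, ranked, instead of discarding the less likely one.
ranked = sorted(claims, key=lambda claim: claim.confidence(), reverse=True)
for claim in ranked:
    print(f"{claim.confidence():.2f}  {claim.statement}  (from {claim.source})")
```

Keeping every claim with its provenance is what would allow several maps, or several stories, to coexist in the same system.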

And how should we communicate this new research to a large audience? Again, Venice is extraordinary for that. With the millions of visitors that come every year, it's actually one of the best places to try to invent the museum of the future. Imagine, horizontally you see the reconstructed map of a given year, and vertically, you see the document that served the reconstruction, paintings, for instance. Imagine an immersive system that lets you dive in and reconstruct the Venice of a given year, some experience you could share within a group. On the contrary, imagine actually that you start from a document, a Venetian manuscript, and you show, actually, what you can construct out of it, how it is decoded, how the context of that document can be recreated. This is an image from an exhibit which is currently conducted in Geneva with that type of system.

So to conclude, we can say that research in the humanities is about to undergo an evolution which is maybe similar to what happened to life sciences 30 years ago. It's really a question of scale. We see projects which are much beyond what any single research team can do, and this is really new for the humanities, which very often have the habit of working in small groups or only with a couple of researchers. When you visit the Archivio di Stato, you feel this is beyond what any single team can do, and that should be a joint and common effort. So what we must do for this paradigm shift is actually foster a new generation of "digital humanists" that are going to be ready for this shift.

I thank you very much.

(Applause)
