0:00:03 > 0:00:10The world we live in is awash with data that comes pouring in from everywhere around us.
0:00:10 > 0:00:14On its own this data is just noise and confusion.
0:00:14 > 0:00:22To make sense of data, to find the meaning in it, we need the powerful branch of science - statistics.
0:00:22 > 0:00:26Believe me there's nothing boring about statistics.
0:00:26 > 0:00:29Especially not today when we can make the data sing.
0:00:29 > 0:00:33With statistics we can really make sense of the world.
0:00:33 > 0:00:35And there's more.
0:00:35 > 0:00:40With statistics, the data deluge, as it's being called, is leading us
0:00:40 > 0:00:46to an ever greater understanding of life on Earth and the universe beyond.
0:00:46 > 0:00:50And thanks to the incredible power of today's computers,
0:00:50 > 0:00:57it may fundamentally transform the process of scientific discovery.
0:00:57 > 0:01:02I kid you not, statistics is now the sexiest subject around.
0:01:23 > 0:01:25Did you know that there is one million boats in Sweden?
0:01:25 > 0:01:27That's one boat per nine people!
0:01:27 > 0:01:31It's the highest number of boats per person in Europe!
0:01:41 > 0:01:45Being a statistician, you don't like telling your profession at dinner parties.
0:01:45 > 0:01:48But really, statisticians shouldn't be shy
0:01:48 > 0:01:51because everyone wants to understand what's going on.
0:01:51 > 0:01:56And statistics gives us a perspective on the world we live in
0:01:56 > 0:01:59that we can't get in any other way.
0:02:03 > 0:02:09Statistics tells us whether the things we think and believe are actually true.
0:02:19 > 0:02:25And statistics are far more useful than we usually like to admit.
0:02:25 > 0:02:29In the last recession there was this famous call-in to a talk radio station.
0:02:29 > 0:02:37The man complained, "In times like this when unemployment rates are up to 13%, income has fallen by 5%,
0:02:37 > 0:02:41"and suicide rates are climbing, and I get so angry that the government
0:02:41 > 0:02:45"is wasting money on things like collection of statistics."
0:02:48 > 0:02:50I'm not officially a statistician.
0:02:50 > 0:02:55Strictly speaking, my field is global health.
0:02:58 > 0:03:03But I got really obsessed with stats when I realised how much people
0:03:03 > 0:03:06in Sweden just don't know about the rest of the world.
0:03:06 > 0:03:10I started in our medical university, Karolinska Institutet,
0:03:10 > 0:03:13an undergraduate course called Global Health.
0:03:13 > 0:03:17These students coming to us actually have the highest grade you can get
0:03:17 > 0:03:18in the Swedish college system,
0:03:18 > 0:03:22so I thought, "Maybe they know everything I'm going to teach them."
0:03:22 > 0:03:25So I did a pre-test when they came, and one of the questions
0:03:25 > 0:03:28from which I learned a lot was this one -
0:03:28 > 0:03:32which country has the highest child mortality of these five pairs?
0:03:32 > 0:03:34I won't put you at test here, but it is Turkey
0:03:34 > 0:03:37which is highest there, Poland,
0:03:37 > 0:03:40Russia, Pakistan, and South Africa.
0:03:40 > 0:03:43And these were the results of the Swedish students.
0:03:43 > 0:03:44An average of 1.8 right answers out of five possible.
0:03:44 > 0:03:49And that means there was a place for a professor of International Health and for my course.
0:03:49 > 0:03:56But one late night when I was compiling the report, I really realised my discovery.
0:03:56 > 0:04:01I had shown that Swedish top students know statistically
0:04:01 > 0:04:04significantly less about the world than the chimpanzees.
0:04:06 > 0:04:09Because the chimpanzees would score half right.
0:04:09 > 0:04:12If I gave them two bananas with Sri Lanka and Turkey,
0:04:12 > 0:04:15they would be right half of the cases, but the students are not there.
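The comparison with chance can be sketched as a simple one-sided binomial check. The five questions, the 50% chance rate and the 1.8 average come from the talk; the class size of 50 is a hypothetical figure chosen only for illustration, and treating every answer as independent is a simplifying assumption.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

QUESTIONS = 5      # five country pairs, from the talk
CHANCE = 0.5       # a banana-picking chimpanzee is right half the time
MEAN_SCORE = 1.8   # average student score, from the talk
N_STUDENTS = 50    # HYPOTHETICAL class size, not stated in the talk

answers = QUESTIONS * N_STUDENTS            # total answers given
correct = round(MEAN_SCORE * N_STUDENTS)    # total correct answers
expected_by_chance = CHANCE * QUESTIONS     # score expected from guessing

# Probability of doing this badly or worse by pure guessing.
p_value = binom_cdf(correct, answers, CHANCE)
print(f"chance score: {expected_by_chance}, p-value: {p_value:.2e}")
```

Under these assumptions the p-value is far below the usual 0.05 threshold, which is what "statistically significantly less" means here. A different class size would change the number but not the logic.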
0:04:15 > 0:04:20I did also an unethical study of the professors of the Karolinska Institutet,
0:04:20 > 0:04:25that hands out the Nobel Prize for medicine, and they are on par with the chimpanzees there.
0:04:28 > 0:04:32Today there's more information accessible than ever before.
0:04:32 > 0:04:35'And I work with my team at the Gapminder Foundation
0:04:35 > 0:04:41'using new tools that help everyone make sense of the changing world.
0:04:41 > 0:04:45'We draw on the masses of data that's now freely available
0:04:45 > 0:04:49'from international institutions like the UN and the World Bank.
0:04:49 > 0:04:53'And it's become my mission to share the insights
0:04:53 > 0:05:00'from this data with anyone who'll listen, and to reveal how statistics is nothing to be frightened of.'
0:05:02 > 0:05:05I'm going to provide you a view of
0:05:05 > 0:05:09the global health situation across mankind.
0:05:09 > 0:05:14And I'm going to do that in hopefully an enjoyable way, so relax.
0:05:14 > 0:05:17So we did this software which displays it like this.
0:05:17 > 0:05:19Every bubble here is a country -
0:05:19 > 0:05:21this is China, this is India.
0:05:21 > 0:05:23The size of the bubble is the population.
0:05:23 > 0:05:27I'm going to stage a race between this sort of yellowish Ford here
0:05:27 > 0:05:32and the red Toyota down there and the brownish Volvo.
0:05:32 > 0:05:36The Toyota has a very bad start down here, and United States,
0:05:36 > 0:05:38Ford is going off-road there,
0:05:38 > 0:05:40and the Volvo is doing quite fine, this is the war.
0:05:40 > 0:05:43The Toyota got off track, now Toyota is on the healthier side of Sweden.
0:05:43 > 0:05:46That's about where I sold the Volvo and bought the Toyota.
0:05:46 > 0:05:47AUDIENCE LAUGH
0:05:47 > 0:05:50This is the Great Leap Forward, when China fell down.
0:05:50 > 0:05:53It was the central planning by Mao Zedong.
0:05:53 > 0:05:56China recovered and said, "Never more stupid central planning,"
0:05:56 > 0:05:57but they went up here.
0:05:57 > 0:06:02No, there is one more inequity, look there - United States.
0:06:02 > 0:06:07They broke my frame. Washington DC is so rich over there,
0:06:07 > 0:06:13but they are not as healthy as Kerala in India. It's quite interesting, isn't it?
0:06:13 > 0:06:14LAUGHTER AND APPLAUSE
0:06:20 > 0:06:25Welcome to the USA, world leaders in big cars
0:06:25 > 0:06:28and free data.
0:06:28 > 0:06:35There are many here who share my vision of making public data accessible and useful for everyone.
0:06:35 > 0:06:43The city of San Francisco is in the lead, opening up its data on everything.
0:06:43 > 0:06:47Even the police department is releasing all its crime reports.
0:06:47 > 0:06:50This official crime data has been turned
0:06:50 > 0:06:55into a wonderful interactive map by two of the city's computer whizzes.
0:06:55 > 0:06:58It's community statistics in action.
0:07:09 > 0:07:13Crimespotting is a map of crime reports from the San Francisco Police Department
0:07:13 > 0:07:16showing dots on maps for citizens to be able to see
0:07:16 > 0:07:19patterns of crime around their neighbourhoods in San Francisco.
0:07:19 > 0:07:25The map is not just about individual crimes but about broader patterns that show you where crime is
0:07:25 > 0:07:27clustered around the city, which areas have high crime,
0:07:27 > 0:07:30and which areas have relatively low crime.
0:07:36 > 0:07:41We're here at the top of Jones Street on Nob Hill...
0:07:42 > 0:07:45..quite a nice neighbourhood.
0:07:45 > 0:07:49What the crime maps show us is the relationship between
0:07:49 > 0:07:51topography and crime.
0:07:51 > 0:07:54Basically the higher up the hill, the less crime there is.
0:07:56 > 0:07:58You cross over the border
0:07:58 > 0:08:00into the flats...
0:08:02 > 0:08:09Essentially as soon as you get into the lower lying areas of Jones Street the crime just skyrockets.
0:08:20 > 0:08:24We're here in the uptown Tenderloin district.
0:08:26 > 0:08:30It's one of the oldest and densest neighbourhoods in San Francisco.
0:08:30 > 0:08:32This is where you go to buy drugs.
0:08:32 > 0:08:33Right around here.
0:08:37 > 0:08:41We see lots of aggravated assaults, lots of auto thefts.
0:08:41 > 0:08:48Basically a huge part of the crime that happens in the city happens in this five or six block radius.
0:08:55 > 0:08:58If you've been hearing police sirens in your neighbourhood,
0:08:58 > 0:09:02you can use the map to find out why.
0:09:02 > 0:09:05If you're out at night in an unfamiliar part of town,
0:09:05 > 0:09:09you can check the map for streets to avoid.
0:09:09 > 0:09:12If a neighbour gets burgled, you can see -
0:09:12 > 0:09:16is it a one-off or has there been a spike in local crime?
0:09:16 > 0:09:19If you commute through a neighbourhood and you're worried
0:09:19 > 0:09:23about its safety, the fact that we have the ability to turn off all
0:09:23 > 0:09:25the night-time and middle-of-the-day crimes
0:09:25 > 0:09:28and show you just the things that are happening during the commute,
0:09:28 > 0:09:32it is a statistical operation. But I think to people that are interacting with the thing
0:09:32 > 0:09:38it feels very much more like they're just sort of browsing a website or shopping on Amazon.
0:09:38 > 0:09:43They're looking at data and they don't realise they're doing statistics.
0:09:43 > 0:09:47What's most exciting for me is that public statistics
0:09:47 > 0:09:52is making citizens more powerful and the authorities more accountable.
0:10:02 > 0:10:04We have community meetings that the police attend
0:10:04 > 0:10:08and what citizens are now doing are bringing printouts
0:10:08 > 0:10:12of the maps that show where crimes are taking place,
0:10:12 > 0:10:16and they're demanding services from the police department
0:10:16 > 0:10:20and the police department is now having to change how they police,
0:10:20 > 0:10:22how they provide policing services,
0:10:22 > 0:10:27because the data is showing what is working and what is not.
0:10:28 > 0:10:31People in San Francisco are also using public data
0:10:31 > 0:10:35to map social inequalities and see how to improve society.
0:10:35 > 0:10:39And the possibilities are endless.
0:10:39 > 0:10:43I think our dream government data analysis project
0:10:43 > 0:10:46would really be focused on live information,
0:10:46 > 0:10:51on stuff that was being reported and pushed out to the world over the internet as it was happening.
0:10:51 > 0:10:55You know, trash pickups, traffic accidents, buses,
0:10:55 > 0:10:57and I think through the kind of stats-gathering power
0:10:57 > 0:11:02of the internet it's possible to really begin to see the workings of the city
0:11:02 > 0:11:04displayed as a unified interface.
0:11:07 > 0:11:09So that's where we are heading.
0:11:09 > 0:11:14Towards a world of free data with all the statistical insights that come from it,
0:11:14 > 0:11:21accessible to everyone, empowering us as citizens and letting us hold our rulers to account.
0:11:21 > 0:11:26It's a long way from where statistics began.
0:11:26 > 0:11:32Statistics are essential to us to monitor our governments and our societies.
0:11:32 > 0:11:36But it was our rulers up there who started
0:11:36 > 0:11:40the collection of statistics in the first place in order to monitor us!
0:11:46 > 0:11:51In fact the word 'statistics' comes from 'the state'.
0:11:51 > 0:11:55Modern statistics began two centuries ago.
0:11:55 > 0:11:59Once it got going, it spread and never stopped.
0:11:59 > 0:12:01And guess who was first!
0:12:03 > 0:12:07The Chinese have Confucius, the Italians have da Vinci,
0:12:07 > 0:12:10and the British have Shakespeare.
0:12:10 > 0:12:12And we have the Tabellverket -
0:12:12 > 0:12:16the first ever systematic collection of statistics!
0:12:16 > 0:12:21Since the year 1749 we have collected data
0:12:21 > 0:12:26on every birth, marriage and death, and we are proud of it!
0:12:29 > 0:12:32The Tabellverket recorded information
0:12:32 > 0:12:34from every parish in Sweden.
0:12:34 > 0:12:39It was a huge quantity of data and it was the first time any government
0:12:39 > 0:12:41could get an accurate picture of its people.
0:12:49 > 0:12:53Sweden had been the greatest military power in Northern Europe,
0:12:53 > 0:12:58but by 1749 our star was really fading
0:12:58 > 0:13:00and other countries were growing stronger.
0:13:00 > 0:13:03At least we were a large power,
0:13:03 > 0:13:09thought to have 20 million people, enough to rival Britain and France.
0:13:13 > 0:13:18But we were in for a nasty surprise.
0:13:18 > 0:13:20The first analysis of the Tabellverket
0:13:20 > 0:13:24revealed that Sweden only had two million inhabitants.
0:13:24 > 0:13:30Sweden was not just a power in decline, it also had a very small population.
0:13:30 > 0:13:36The government was horrified by this finding - what if the enemy found out?
0:13:37 > 0:13:44But the Tabellverket also showed that many women died in childbirth and many children died young.
0:13:44 > 0:13:48So government took action to improve the health of the people.
0:13:48 > 0:13:52This was the beginning of modern Sweden.
0:13:53 > 0:13:59It took more than 50 years before the Austrians, Belgians, Danes,
0:13:59 > 0:14:02Dutch, French, Germans, Italians
0:14:02 > 0:14:08and, finally, the British, caught up with Sweden in collecting and using statistics.
0:14:24 > 0:14:29It was called political arithmetic. It was a lovely phrase that was used for statistics.
0:14:29 > 0:14:33Governments could have much more control and understanding of
0:14:33 > 0:14:36the society - how it was working, how it was developing
0:14:36 > 0:14:40and essentially so they could control it better.
0:14:43 > 0:14:47It wasn't just governments who woke up to the power of statistics.
0:14:47 > 0:14:54Right across Europe, 19th century society went mad for facts.
0:14:54 > 0:14:57And, despite its late start, Britain,
0:14:57 > 0:15:01with its Royal Statistical Society in London,
0:15:01 > 0:15:04was soon a statisticians' nirvana.
0:15:05 > 0:15:09I love looking at old copies of the Royal Statistical Society journal
0:15:09 > 0:15:11because it's full of such odd stuff.
0:15:11 > 0:15:14There's a wonderful paper from the 1840s
0:15:14 > 0:15:19which shows a map of England and the rates of bastardy in each county.
0:15:19 > 0:15:23So you can identify very quickly the areas with high rates of bastardy.
0:15:23 > 0:15:27Being in East Anglia it always makes me slightly laugh that Norfolk
0:15:27 > 0:15:30seems to top the "bastardy league" in the 1840s.
0:15:30 > 0:15:36One of the founders of the Royal Statistical Society
0:15:36 > 0:15:42was the great Victorian mathematician and inventor Charles Babbage.
0:15:42 > 0:15:50In 1842 he read the latest poem by an equally great Victorian, Alfred Tennyson.
0:15:50 > 0:15:53The Vision of Sin contained the lines:
0:15:53 > 0:15:55"Fill the cup, and fill the can
0:15:55 > 0:15:58"Have a rouse before the morn
0:15:58 > 0:16:03"Every moment dies a man, Every moment one is born."
0:16:03 > 0:16:07So keen a statistician was Babbage that he could not contain himself.
0:16:07 > 0:16:09He dashed off a letter to Tennyson
0:16:09 > 0:16:12explaining that because of population growth,
0:16:12 > 0:16:13the line should read,
0:16:13 > 0:16:18"Every moment dies a man and one and a 16th is born."
0:16:18 > 0:16:22"I may add that the exact figure is 1.067,
0:16:22 > 0:16:27"but something must be conceded to the laws of metre."
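The gap between the verse-friendly figure and the quoted exact one is small, as a quick check shows. This is only arithmetic on the two numbers given in the letter; the variable names are illustrative.

```python
from fractions import Fraction

verse_figure = 1 + Fraction(1, 16)   # "one and a 16th" births per death
quoted_exact = 1.067                 # the "exact figure" quoted in the letter

# Difference conceded to the laws of metre.
gap = quoted_exact - float(verse_figure)
print(float(verse_figure), round(gap, 4))
```

One and a sixteenth is 1.0625, so the line is off from 1.067 by less than half a percent.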
0:16:31 > 0:16:36In the 19th century, scholars all over Europe did amazing work
0:16:36 > 0:16:39in measuring their societies.
0:16:39 > 0:16:42They were hoovering up data on almost everything.
0:16:42 > 0:16:46But numbers alone don't tell you anything.
0:16:46 > 0:16:51You have to analyse them, and that's what makes statistics.
0:16:55 > 0:16:59When the first statisticians began to get to grips with
0:16:59 > 0:17:00analysing their data
0:17:00 > 0:17:05they seized upon the average, and they took the average of everything.
0:17:09 > 0:17:13What's so great about an average is that
0:17:13 > 0:17:18you can take a whole mass of data and reduce it to a single number.
0:17:21 > 0:17:26And though each of us is unique, our collective lives produce
0:17:26 > 0:17:29averages that can characterise whole populations.
0:17:41 > 0:17:45I looked in my local newspaper one week and saw a pensioner
0:17:45 > 0:17:49had accidentally put her foot on the accelerator
0:17:49 > 0:17:52and crushed her friend against a wall.
0:17:52 > 0:17:56Devastating, hideous, horrible thing to happen.
0:17:56 > 0:18:01And then there was a second one about a young man who didn't have
0:18:01 > 0:18:07a driving licence, was driving a car under the influence of drugs and alcohol
0:18:07 > 0:18:10and he bashed into a pedestrian and killed him.
0:18:10 > 0:18:15What's remarkable, absolutely remarkable, if you look at the number
0:18:15 > 0:18:22of people who die each year in traffic crashes, it's nearly a constant.
0:18:22 > 0:18:24What?
0:18:24 > 0:18:31All these individual events, somehow when you sum them all up there's the same number every year.
0:18:31 > 0:18:35And every year, two and a half times as many men
0:18:35 > 0:18:38die in traffic crashes as women, and it's a constant.
0:18:38 > 0:18:44And every year the rate in Belgium is double the rate in England.
0:18:44 > 0:18:47There are these remarkable regularities.
0:18:47 > 0:18:54So that these individual particular events sum up into a social phenomenon.
0:18:56 > 0:18:58Let's see what Sweden have done.
0:18:58 > 0:19:01We used to boast about fast social progress, that's where we were....
0:19:01 > 0:19:05'In my lectures, to tell stories about the changing world,
0:19:05 > 0:19:08'I use the averages from entire countries,
0:19:08 > 0:19:12'whether the average of income, child mortality, family size
0:19:12 > 0:19:13'or carbon output.'
0:19:13 > 0:19:16OK, I give you Singapore. The year I was born,
0:19:16 > 0:19:20Singapore had twice the child mortality of Sweden, the most tropical country in the world,
0:19:20 > 0:19:22a marshland on the Equator, and here we go.
0:19:22 > 0:19:25It took a little time for them to get independent,
0:19:25 > 0:19:27but then they started to grow their economy,
0:19:27 > 0:19:29and they made the social investment, they got away malaria,
0:19:29 > 0:19:33they got a magnificent health system that beat both US and Sweden.
0:19:33 > 0:19:37We never thought it would happen that they would win over Sweden!
0:19:37 > 0:19:40LAUGHTER AND APPLAUSE
0:19:40 > 0:19:46But useful as averages are, they don't tell you the whole story.
0:19:48 > 0:19:53On average, Swedish people have slightly less than two legs.
0:19:53 > 0:19:57This is because a few people have only one leg or no legs,
0:19:57 > 0:19:59and no-one has three legs.
0:19:59 > 0:20:06So almost everybody in Sweden has more than the average number of legs.
0:20:06 > 0:20:10The variation in data is just as important as the average.
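The legs example can be reproduced with a toy population. The counts below are hypothetical and chosen only to illustrate the point; they are not real Swedish data.

```python
from statistics import mean, median

# HYPOTHETICAL population of 10,000 people, for illustration only.
population = [2] * 9_990 + [1] * 8 + [0] * 2

average_legs = mean(population)      # pulled below 2 by a few low values
typical_legs = median(population)    # the middle value is still 2

# Share of people who have more legs than the average.
share_above_average = sum(
    1 for legs in population if legs > average_legs
) / len(population)

print(average_legs, typical_legs, share_above_average)
```

In this made-up population the average is just under two, yet 99.9% of people are above it, which is why the average alone can mislead.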
0:20:16 > 0:20:19But how do you get a handle on variation?
0:20:19 > 0:20:23For this, you transform numbers into shapes.
0:20:23 > 0:20:26Let's look again at the number of adult women in Sweden
0:20:26 > 0:20:27for different heights.
0:20:27 > 0:20:31Plotting the data as a shape shows how much their heights
0:20:31 > 0:20:36vary from the average and how wide that variation is.
0:20:36 > 0:20:41The shape a set of data makes is called its distribution.
0:20:41 > 0:20:46This is the income distribution of China, 1970.
0:20:46 > 0:20:51This is the income distribution of the United States, 1970.
0:20:51 > 0:20:54Almost no overlap, and what has happened?
0:20:54 > 0:20:56China is growing, it's not so equal any longer,
0:20:56 > 0:21:01and it's appearing here overlooking the United States.
0:21:01 > 0:21:03Almost like a ghost, isn't it?
0:21:03 > 0:21:05It's pretty scary.
0:21:05 > 0:21:06Rrrr!
0:21:06 > 0:21:08LAUGHTER
0:21:17 > 0:21:21The statisticians who first explored distribution
0:21:21 > 0:21:25discovered one shape that turned up again and again.
0:21:25 > 0:21:28The Victorian scholar Francis Galton
0:21:28 > 0:21:32was so fascinated he built a machine that could reproduce it,
0:21:32 > 0:21:36and he found it fitted so many different sets of measurements
0:21:36 > 0:21:38that he named it the normal distribution.
0:21:38 > 0:21:45Whether it was people's arm spans, lung capacities,
0:21:45 > 0:21:47or even their exam results,
0:21:47 > 0:21:51the normal distribution shape recurred time and time again.
0:21:51 > 0:21:56Other statisticians soon found many other regular shapes,
0:21:56 > 0:22:01each produced by particular kinds of natural or social processes.
0:22:01 > 0:22:05And every statistician has their favourite.
0:22:05 > 0:22:09The Poisson distribution, the Poisson shape is my favourite distribution.
0:22:09 > 0:22:11I think it's an absolute cracker.
0:22:15 > 0:22:18The Poisson shape describes how likely it is
0:22:18 > 0:22:21that out-of-the-ordinary things will happen.
0:22:21 > 0:22:24Imagine a London bus stop where we know that on average
0:22:24 > 0:22:26we'll get three buses in an hour.
0:22:26 > 0:22:29We won't always get three buses, of course.
0:22:29 > 0:22:33Amazingly, the Poisson shape will show us the probability
0:22:33 > 0:22:37that in any given hour we will get four, five, or six buses,
0:22:37 > 0:22:39or no buses at all.
0:22:40 > 0:22:43The exact shape changes with the average.
0:22:43 > 0:22:46But whether it's how many people will win the lottery jackpot
0:22:46 > 0:22:48each week,
0:22:48 > 0:22:51or how many people will phone a call centre each minute,
0:22:51 > 0:22:54the Poisson shape will give the probabilities.
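For the bus stop with an average of three buses an hour, the probabilities can be computed directly from the standard Poisson formula, P(k) = e^(-m) m^k / k!, where m is the average. This is a minimal sketch; the function and variable names are illustrative, and it assumes buses arrive independently of one another.

```python
from math import exp, factorial

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the average count is `mean`."""
    return exp(-mean) * mean**k / factorial(k)

BUSES_PER_HOUR = 3.0  # the average from the bus stop example

# Probability of seeing 0, 1, ..., 6 buses in any given hour.
probabilities = {k: poisson_pmf(k, BUSES_PER_HOUR) for k in range(7)}

for k, p in probabilities.items():
    print(f"{k} buses: {p:.3f}")
```

With an average of three, this gives roughly a 5% chance of an hour with no buses at all and about a 22% chance of exactly three. Changing `BUSES_PER_HOUR` changes the shape, as the narration notes.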
0:22:57 > 0:23:01The wonderful example where this was applied to in the late 19th century
0:23:01 > 0:23:04was to count each year the number of Prussian officers,
0:23:04 > 0:23:07cavalry officers, who were kicked to death by their horses.
0:23:07 > 0:23:10Now, some years there were none, some years there were one,
0:23:10 > 0:23:13some years there were two, up to seven, I think, one particularly bad year.
0:23:13 > 0:23:16But with this distribution, however many years there were
0:23:16 > 0:23:19with nought, one, two, three, four Prussian cavalry officers
0:23:19 > 0:23:23kicked to death by their horses, beautifully obeyed the Poisson distribution.
0:23:42 > 0:23:48So statisticians use shapes to reveal the patterns in the data.
0:23:48 > 0:23:51But we also use images of all kinds
0:23:51 > 0:23:54to communicate statistics to a wider public.
0:23:54 > 0:23:57Because if the story in the numbers
0:23:57 > 0:24:02is told by a beautiful and clever image, then everyone understands.
0:24:02 > 0:24:09Of the pioneers of statistical graphics, my favourite is Florence Nightingale.
0:24:24 > 0:24:27There are not many people who realise that she was known
0:24:27 > 0:24:30as a passionate statistician and not just the Lady of the Lamp.
0:24:30 > 0:24:34She said that "to understand God's thoughts, we must study statistics,
0:24:34 > 0:24:37"for these are the measure of His purpose."
0:24:37 > 0:24:40Statistics was for her a religious duty and moral imperative.
0:24:42 > 0:24:45When Florence was nine years old she started collecting data.
0:24:45 > 0:24:48Her data was different fruits and vegetables she found.
0:24:48 > 0:24:50Put them into different tables.
0:24:50 > 0:24:52Trying to organise them in some standard form.
0:24:52 > 0:24:55And so we have one of Nightingale's first statistical tables
0:24:55 > 0:24:57at the age of nine.
0:25:04 > 0:25:11In the mid 1850s Florence Nightingale went to the Crimea to care for British casualties of war.
0:25:11 > 0:25:14She was horrified by what she discovered.
0:25:14 > 0:25:19For all the soldiers being blown to bits on the battlefield, there were many, many more soldiers
0:25:19 > 0:25:25dying from diseases they caught in the army's filthy hospitals.
0:25:25 > 0:25:29So Florence Nightingale began counting the dead.
0:25:29 > 0:25:34For two years she recorded mortality data in meticulous detail.
0:25:34 > 0:25:39When the war was over she persuaded the government to set up
0:25:39 > 0:25:41a Royal Commission of Inquiry,
0:25:41 > 0:25:44and gathered her data in a devastating report.
0:25:44 > 0:25:48What has cemented her place in the statistical history books
0:25:48 > 0:25:50are the graphics she used.
0:25:50 > 0:25:53And one in particular, the polar area graph.
0:25:53 > 0:25:58For each month of the war, a huge blue wedge represented
0:25:58 > 0:26:02the soldiers who had died from preventable diseases.
0:26:02 > 0:26:05The much smaller red wedges were deaths from wounds,
0:26:05 > 0:26:10and the black wedges were deaths from accidents and other causes.
0:26:10 > 0:26:17Nightingale's graphics were so clear they were impossible to ignore.
0:26:17 > 0:26:19The usual thing around Florence Nightingale's time
0:26:19 > 0:26:23was just to produce tables and tables of figures - absolutely really tedious stuff that,
0:26:23 > 0:26:26unless you're an absolutely dedicated statistician,
0:26:26 > 0:26:29it's really quite difficult to spot the patterns quite naturally.
0:26:29 > 0:26:33But visualisations, they tell a story, they tell a story immediately.
0:26:33 > 0:26:38And the use of colour and the use of shape can really tell a powerful story.
0:26:38 > 0:26:41And nowadays of course we can make things move as well.
0:26:41 > 0:26:44Florence Nightingale would have loved to have played with...
0:26:44 > 0:26:48She would have produced wonderful animations, I'm absolutely certain of it.
0:26:50 > 0:26:54Today, 150 years on, Nightingale's graphics
0:26:54 > 0:26:57are rightly regarded as a classic.
0:26:57 > 0:27:00They led to a revolution in nursing, health care
0:27:00 > 0:27:05and hygiene in hospitals worldwide, which saved innumerable lives.
0:27:07 > 0:27:11And statistical graphics has become an art form of its very own,
0:27:11 > 0:27:16led by designers who are passionate about visualising data.
0:27:24 > 0:27:27This is the Billion Pound-O-Gram.
0:27:27 > 0:27:29This image arose out of frustration
0:27:29 > 0:27:32with the reporting of billion pound amounts in the media.
0:27:32 > 0:27:34£500 billion for this war.
0:27:34 > 0:27:36£50 billion for this oil spill.
0:27:36 > 0:27:39It doesn't make sense - the numbers are too enormous to get your mind round.
0:27:39 > 0:27:43So I scraped all this data from various news sources and created this diagram.
0:27:43 > 0:27:48So the squares here are scaled according to the billion pound amounts.
0:27:48 > 0:27:51When you see numbers visualised like this
0:27:51 > 0:27:54you start to have a different relationship with them.
0:27:54 > 0:27:56You can start to see the patterns, and the scale of them.
0:27:56 > 0:27:59Here in the corner, this little square - £37 billion.
0:27:59 > 0:28:02This was the predicted cost of the Iraq war in 2003.
0:28:02 > 0:28:06As you can see it's grown exponentially over the last few years
0:28:06 > 0:28:10and the total cost now is around about £2,500 billion.
0:28:10 > 0:28:13It's funny because when you visualise statistics
0:28:13 > 0:28:15you understand them, and when you understand them
0:28:15 > 0:28:18you can really start to put things in perspective.
0:28:23 > 0:28:27Visualisation is right at the heart of my own work too.
0:28:27 > 0:28:30I teach global health.
0:28:30 > 0:28:33And I know having the data is not enough -
0:28:33 > 0:28:39I have to show it in ways people both enjoy and understand.
0:28:39 > 0:28:42Now I'm going to try something I've never done before.
0:28:42 > 0:28:45Animating the data in real space,
0:28:45 > 0:28:50with a bit of technical assistance from the crew.
0:28:50 > 0:28:52So here we go.
0:28:52 > 0:28:54First, an axis for health.
0:28:54 > 0:28:58Life expectancy from 25 years to 75 years.
0:28:58 > 0:29:01And down here an axis for wealth.
0:29:01 > 0:29:06Income per person - 400, 4,000, 40,000.
0:29:06 > 0:29:10So down here is poor and sick.
0:29:10 > 0:29:14And up here is rich and healthy.
0:29:14 > 0:29:18Now I'm going to show you the world
0:29:18 > 0:29:21200 years ago, in 1810.
0:29:21 > 0:29:22Here come all the countries.
0:29:22 > 0:29:26Europe, brown; Asia, red; Middle East, green;
0:29:26 > 0:29:29Africa south of the Sahara, blue; and the Americas, yellow.
0:29:29 > 0:29:33And the size of the country bubble shows the size of the population.
0:29:33 > 0:29:37In 1810, it was pretty crowded down there, wasn't it?
0:29:37 > 0:29:39All countries were sick and poor.
0:29:39 > 0:29:43Life expectancy was below 40 in all countries.
0:29:43 > 0:29:48And only UK and the Netherlands were slightly better off. But not much.
0:29:48 > 0:29:52And now I start the world.
0:29:52 > 0:29:56The industrial revolution makes countries in Europe and elsewhere
0:29:56 > 0:29:59move away from the rest.
0:29:59 > 0:30:02But the colonized countries in Asia and Africa,
0:30:02 > 0:30:04they are stuck down there.
0:30:04 > 0:30:08And eventually the Western countries get healthier and healthier.
0:30:08 > 0:30:13And now we slow down to show the impact of the First World War
0:30:13 > 0:30:15and the Spanish flu epidemic.
0:30:15 > 0:30:18What a catastrophe!
0:30:18 > 0:30:22And now I speed up through the 1920s and the 1930s and,
0:30:22 > 0:30:24in spite of the Great Depression,
0:30:24 > 0:30:27Western countries forge on towards greater wealth and health.
0:30:27 > 0:30:29Japan and some others try to follow.
0:30:29 > 0:30:32But most countries stay down here.
0:30:32 > 0:30:35And after the tragedies of the Second World War,
0:30:35 > 0:30:39we stop a bit to look at the world in 1948.
0:30:39 > 0:30:421948 was a great year.
0:30:42 > 0:30:43The war was over,
0:30:43 > 0:30:48Sweden topped the medal table at the Winter Olympics and I was born.
0:30:48 > 0:30:51But the differences between the countries of the world
0:30:51 > 0:30:52was wider than ever.
0:30:52 > 0:30:54United States was in the front.
0:30:54 > 0:30:56Japan was catching up.
0:30:56 > 0:30:58Brazil was way behind,
0:30:58 > 0:31:03Iran was getting a little richer from oil but still had short lives.
0:31:03 > 0:31:05And the Asian giants...
0:31:05 > 0:31:08China, India, Pakistan, Bangladesh, and Indonesia,
0:31:08 > 0:31:11they were still poor and sick down here.
0:31:11 > 0:31:14But look what was about to happen! Here we go again.
0:31:14 > 0:31:18In my lifetime, former colonies gained independence and then finally
0:31:18 > 0:31:22they started to get healthier and healthier and healthier.
0:31:22 > 0:31:26And in the 1970s, then countries in Asia and Latin America
0:31:26 > 0:31:28started to catch up with the Western countries.
0:31:28 > 0:31:31They became the emerging economies.
0:31:31 > 0:31:32Some in Africa followed,
0:31:32 > 0:31:36some Africans were stuck in civil war, and others were hit by HIV.
0:31:36 > 0:31:41And now we can see the world in the most up-to-date statistics.
0:31:42 > 0:31:45Most people today live in the middle.
0:31:45 > 0:31:48But there is huge difference at the same time
0:31:48 > 0:31:51between the best-off countries and the worst-off countries.
0:31:51 > 0:31:54And there are also huge inequalities within countries.
0:31:54 > 0:31:59These bubbles show country averages but I can split them.
0:31:59 > 0:32:02Take China. I can split it into provinces.
0:32:02 > 0:32:05There goes Shanghai...
0:32:05 > 0:32:08It has the same health and wealth as Italy today.
0:32:08 > 0:32:11And there is the poor inland province Guizhou,
0:32:11 > 0:32:12it is like Pakistan.
0:32:12 > 0:32:18And if I split it further, the rural parts are like Ghana in Africa.
0:32:19 > 0:32:23And yet, despite the enormous disparities today,
0:32:23 > 0:32:27we have seen 200 years of remarkable progress!
0:32:27 > 0:32:31That huge historical gap between the west and the rest is now closing.
0:32:31 > 0:32:35We have become an entirely new, converging world.
0:32:35 > 0:32:37And I see a clear trend into the future.
0:32:37 > 0:32:40With aid, trade, green technology and peace,
0:32:40 > 0:32:43it's fully possible that everyone can make it
0:32:43 > 0:32:45to the healthy, wealthy corner.
0:32:48 > 0:32:51Well, what you've just seen in the last few minutes
0:32:51 > 0:32:56is a story of 200 countries shown over 200 years and beyond.
0:32:56 > 0:33:00It involved plotting 120,000 numbers.
0:33:00 > 0:33:02Pretty neat, huh?
0:33:07 > 0:33:13So, with statistics, we can begin to see things as they really are.
0:33:13 > 0:33:18From tables of data to averages, distributions and visualisations,
0:33:18 > 0:33:22statistics gives us a clear description of the world.
0:33:22 > 0:33:28But, with statistics, we can not only discover WHAT is happening
0:33:28 > 0:33:30but also explore WHY,
0:33:30 > 0:33:34by using the powerful analytical method - correlation.
0:33:35 > 0:33:38Just looking at one thing at a time doesn't tell you very much.
0:33:38 > 0:33:41You've got to look at the relationships between things,
0:33:41 > 0:33:43how they change, how they vary together.
0:33:43 > 0:33:45That's what correlation is about.
0:33:45 > 0:33:48That's how you start trying to understand the processes
0:33:48 > 0:33:50that are really going on in the world and society.
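The idea of two things "varying together" can be made concrete with the Pearson correlation coefficient, which runs from -1 (perfectly opposed) through 0 (no linear relationship) to +1 (perfectly in step). The sketch below is only an illustration: the function name `pearson` and the income and lifespan figures are made up for the example and are not data from the programme.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How far each pair strays from its means, multiplied together and summed.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative numbers, not real statistics.
income = [1, 2, 3, 4, 5]
lifespan = [40, 50, 55, 65, 70]
r = pearson(income, lifespan)
print(f"correlation: {r:.3f}")
```

A value close to +1 here says only that the two made-up series rise together; it says nothing yet about why.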
0:33:52 > 0:33:57Most of us today would recognise that crime correlates to poverty,
0:33:57 > 0:34:00that infection correlates to poor sanitation,
0:34:00 > 0:34:02and that knowledge of statistics correlates
0:34:02 > 0:34:05to being great at dancing!
0:34:06 > 0:34:10Correlations can be very tricky.
0:34:10 > 0:34:12I got a joke about silly correlations.
0:34:12 > 0:34:15There was this American who was afraid of heart attack.
0:34:15 > 0:34:19He found out that the Japanese ate very little fat
0:34:19 > 0:34:22and almost didn't drink wine,
0:34:22 > 0:34:25but they had much less heart attacks than the Americans.
0:34:25 > 0:34:28But, on the other hand, he also found out that the French
0:34:28 > 0:34:35eat as much fat as the Americans and they drink much more wine but they also have less heart attacks.
0:34:35 > 0:34:40So he concluded that what kills you is speaking English.
0:34:40 > 0:34:43# Smoke, smoke, smoke that cigarette
0:34:43 > 0:34:48# Puff, puff, puff and if you smoke yourself to death... #
0:34:48 > 0:34:51The time, the pace, the cigarette. Weights Tilt.
0:34:51 > 0:34:56The best example of a really ground-breaking correlation
0:34:56 > 0:35:01is the link that was established in the 1950s between smoking and lung cancer.
0:35:01 > 0:35:07Not long after the Second World War, a British doctor, Richard Doll,
0:35:07 > 0:35:11investigated lung cancer patients in 20 London hospitals.
0:35:11 > 0:35:15And he became certain that the only thing they had in common was smoking.
0:35:15 > 0:35:18So certain, that he stopped smoking himself.
0:35:18 > 0:35:22But other people weren't so sure.
0:35:22 > 0:35:25A lot of the discussion of the early data,
0:35:25 > 0:35:29linking smoking to lung cancer, said, "It's not the smoking, surely,
0:35:29 > 0:35:32"that thing we've done all our lives, that can't be bad for you.
0:35:32 > 0:35:35"Maybe it's genes.
0:35:35 > 0:35:39"Maybe people who are genetically predisposed to get lung cancer
0:35:39 > 0:35:43"are also genetically predisposed to smoke."
0:35:43 > 0:35:47"Maybe it's not the smoking, maybe it's air pollution -
0:35:47 > 0:35:52"that smokers are somehow more exposed to air pollution than non-smokers.
0:35:52 > 0:35:56"Maybe it's not smoking, maybe it's poverty."
0:35:56 > 0:36:00So now we've got three alternative explanations, apart from chance.
0:36:02 > 0:36:06To verify his correlation did imply cause and effect,
0:36:06 > 0:36:10Richard Doll created the biggest statistical study of smoking yet.
0:36:10 > 0:36:14He began tracking the lives of 40,000 British doctors,
0:36:14 > 0:36:17some of whom smoked and some of whom didn't,
0:36:17 > 0:36:19and gathered enough data
0:36:19 > 0:36:22to correlate the amount the doctors smoked
0:36:22 > 0:36:24with their likelihood of getting cancer.
0:36:24 > 0:36:30Eventually, he not only showed a correlation between smoking and lung cancer,
0:36:30 > 0:36:35but also a correlation between stopping smoking and reducing the risk.
0:36:35 > 0:36:37This was science at its best.
0:36:39 > 0:36:44What correlations do not replace is human thought.
0:36:44 > 0:36:46You've got to think about what it means.
0:36:46 > 0:36:50What a good scientist does, if he comes with a correlation,
0:36:50 > 0:36:55is try as hard as she or he possibly can to disprove it,
0:36:55 > 0:37:00to break it down, to get rid of it, to try and refute it.
0:37:00 > 0:37:05And if it withstands all those efforts at demolishing it
0:37:05 > 0:37:10and it is still standing up then, cautiously, you say, "We really might have something here."
0:37:26 > 0:37:32However brilliant the scientist, data is still the oxygen of science.
0:37:32 > 0:37:39The good news is that the more we have, the more correlations we'll find, the more theories we'll test,
0:37:39 > 0:37:42and the more discoveries we're likely to make.
0:37:46 > 0:37:53And history shows how our total sum of information grows in huge leaps as we develop new technologies.
0:37:53 > 0:38:00The invention of the printing press kicked off the first data and information explosion.
0:38:00 > 0:38:06If you piled up all the books that had been printed by the year 1700,
0:38:06 > 0:38:11they would make 60 stacks each as high as Mount Everest.
0:38:12 > 0:38:15Then, starting in the 19th century,
0:38:15 > 0:38:19there came a second information revolution with the telegraph,
0:38:19 > 0:38:23gramophone and camera. And later radio and TV.
0:38:23 > 0:38:28The total amount of information exploded.
0:38:28 > 0:38:35And by the 1950s the information available to us all had multiplied 6,000 times.
0:38:35 > 0:38:41Then, thanks to the computer and later the internet, we went digital.
0:38:41 > 0:38:47And the amount of data we have now is unimaginably vast.
0:38:49 > 0:38:55A single letter printed in a book is equivalent to a byte of data.
0:38:55 > 0:38:58A printed page equals a kilobyte or two.
0:39:01 > 0:39:06Five megabytes is enough for the complete works of Shakespeare.
0:39:08 > 0:39:1110 gigabytes - that's a DVD movie.
0:39:16 > 0:39:23Two terabytes is the tens of millions of photos added to Facebook every day.
0:39:24 > 0:39:32Ten petabytes is the data recorded every second by the world's largest particle accelerator.
0:39:32 > 0:39:35So much only a tiny fraction is kept.
0:39:35 > 0:39:43Six exabytes is what you'd have if you sequenced the genomes of every single person on Earth.
0:39:48 > 0:39:50But really, that's nothing.
0:39:50 > 0:39:55In 2009, the internet added up to 500 exabytes.
0:39:55 > 0:40:02In 2010, in just one year, that will double to more than one zettabyte!
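The ladder of units above is easy to check with a little arithmetic. This sketch assumes decimal prefixes (each step is a factor of 1,000), since the programme does not say whether it means powers of 1,000 or 1,024; the helper names `to_bytes` and `convert` are invented for the example.

```python
# Decimal (SI) prefixes are an assumption; the programme does not specify.
UNITS = {
    "B": 1,
    "kB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}

def to_bytes(amount, unit):
    """Express an amount in the given unit as a plain number of bytes."""
    return amount * UNITS[unit]

def convert(amount, unit, target):
    """Convert an amount from one unit to another."""
    return to_bytes(amount, unit) / UNITS[target]

# How many complete works of Shakespeare (5 MB) fit in one DVD movie (10 GB)?
shakespeares_per_dvd = to_bytes(10, "GB") // to_bytes(5, "MB")

# Doubling 500 exabytes, expressed in zettabytes.
internet_2010_zb = convert(2 * 500, "EB", "ZB")

print(shakespeares_per_dvd, internet_2010_zb)
```

Under these assumptions a DVD holds 2,000 copies of Shakespeare's complete works, and doubling 500 exabytes gives one zettabyte.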
0:40:06 > 0:40:14Back in the real world, if we turned all this data into print it would make 90 stacks of books,
0:40:14 > 0:40:18each reaching from here all the way to the sun!
0:40:18 > 0:40:23The data deluge is staggering, but, with today's computers
0:40:23 > 0:40:28and statistics, I'm confident we can handle it.
0:40:28 > 0:40:31When it comes to all the data on the internet,
0:40:31 > 0:40:33the powerhouse of statistical analysis
0:40:33 > 0:40:37is the Silicon Valley giant Google.
0:40:44 > 0:40:50The average person over their lifetime is exposed to about 100 million words of conversation.
0:40:50 > 0:40:54And so if you multiply that by the six billion people on the planet,
0:40:54 > 0:40:58that amount of words is about equal to the number of words
0:40:58 > 0:41:01that Google has available at any one instant in time.
0:41:03 > 0:41:08Google's computers hoover up and file away every document, web page, and image they can find.
0:41:08 > 0:41:14They then hunt for patterns and correlations in all this data,
0:41:14 > 0:41:17doing statistics on a massive scale.
0:41:17 > 0:41:25And, for me, Google has one project that's particularly exciting - statistical language translation.
0:41:25 > 0:41:30We wanted to provide access to all the web's information, no matter what language you spoke.
0:41:30 > 0:41:33There's just so much information on the internet,
0:41:33 > 0:41:37you couldn't hope to translate it all by hand into every possible language.
0:41:37 > 0:41:41We figured we'd have to be able to do machine translation.
0:41:44 > 0:41:47In the past, programmers tried to teach their computers
0:41:47 > 0:41:53to see each language as a set of grammatical rules - much like the way languages are taught at school.
0:41:53 > 0:41:58But this didn't work because no set of rules could capture a language
0:41:58 > 0:42:01in all its subtlety and ambiguity.
0:42:01 > 0:42:05"Having eaten our lunch the coach departed."
0:42:05 > 0:42:07Well, that's obviously incorrect.
0:42:07 > 0:42:12Written like that it would imply that the coach has eaten the lunch.
0:42:12 > 0:42:15It would be far better to say...
0:42:15 > 0:42:19"having eaten our lunch we departed in the coach."
0:42:19 > 0:42:26Those rules are helpful and they are useful most of the time, but they don't turn out to be true all the time.
0:42:26 > 0:42:30And the insight of using statistical machine translation is saying,
0:42:30 > 0:42:35"If you've got to have all these exceptions anyways, maybe you can get by without having any of the rules.
0:42:35 > 0:42:39"Maybe you can treat everything as an exception." And that's essentially what we've done.
0:42:48 > 0:42:52What the computer is doing when he's learning how to translate
0:42:52 > 0:42:55is to learn correlations between words
0:42:55 > 0:42:57and correlations between phrases.
0:42:57 > 0:43:00So we feed the system very large amounts of data
0:43:00 > 0:43:04and then the system is seeing that a certain word or a certain phrase
0:43:04 > 0:43:07correlates very often to the other language.
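A toy version of "learning correlations between words" is to count how often a word in one language turns up in the same sentence pair as a word in the other. The sketch below is not Google's system: the four-sentence corpus is made up, the names `train` and `best_match` are invented, and the Dice coefficient is simply one convenient way to score how strongly two words go together.

```python
from collections import Counter
from itertools import product

# Tiny made-up parallel corpus (Swedish, English); purely illustrative.
PAIRS = [
    ("en bil", "a car"),
    ("en båt", "a boat"),
    ("min bil", "my car"),
    ("min båt", "my boat"),
]

def train(pairs):
    """Count word occurrences and cross-language co-occurrences per sentence pair."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint.update(product(src_words, tgt_words))
    return src_count, tgt_count, joint

def best_match(word, model):
    """Return the target word most strongly associated with `word`, or None."""
    src_count, tgt_count, joint = model
    # Dice coefficient: 2 * (times seen together) / (total times each was seen).
    scores = {
        t: 2 * joint[(word, t)] / (src_count[word] + tgt_count[t])
        for t in tgt_count
        if joint[(word, t)]
    }
    return max(scores, key=scores.get) if scores else None

model = train(PAIRS)
print(best_match("bil", model), best_match("min", model))
```

Nobody told the program any grammar or vocabulary; with more sentence pairs the same counting gets steadily more reliable, which is the point made above about feeding the system very large amounts of data.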
0:43:09 > 0:43:15Google's website currently offers translation between any of 57 different languages.
0:43:15 > 0:43:22It does this purely statistically, having correlated a huge collection of multilingual texts.
0:43:22 > 0:43:25The people that built the system don't need to know Chinese
0:43:25 > 0:43:29in order to build the Chinese-to-English system, or they don't need to know Arabic.
0:43:29 > 0:43:33But the expertise that's needed is basically knowledge of statistics,
0:43:33 > 0:43:35knowledge of computer science, knowledge of infrastructure
0:43:35 > 0:43:40to build those very large computational systems that we are building for doing that.
0:43:42 > 0:43:48I hooked up with Google from my office in Stockholm to try the translator for myself.
0:43:48 > 0:43:51'I will type... some Swedish sentences.'
0:43:51 > 0:43:53OK.
0:43:53 > 0:43:55Sveriges...
0:43:55 > 0:43:59..guldring i örat.
0:44:00 > 0:44:07OK. So it says, "Sweden's finance minister has a ponytail and a gold ring in your ear."
0:44:07 > 0:44:11- I guess it probably means in his ear.- 'That's exactly correct, it's amazing!
0:44:11 > 0:44:15'He comes from the Conservative party, that's the kind of Sweden we have today.
0:44:15 > 0:44:18'I will type one more sentence.'
0:44:18 > 0:44:22'I sitt samkönade...'
0:44:22 > 0:44:25partnerskap...
0:44:25 > 0:44:28nya biskop.
0:44:28 > 0:44:35"In his same-sex partnership has Stockholm's new bishop and his partners a three-year son."
0:44:35 > 0:44:38It's almost perfect, there's one important thing -
0:44:38 > 0:44:41it's HER, it's a lesbian partnership.
0:44:41 > 0:44:46OK, so those kinds of words his and her are one of the challenges
0:44:46 > 0:44:49in translation to get really those right.
0:44:49 > 0:44:51Especially when it comes to bishops one can excuse it!
0:44:51 > 0:44:53'Right, right.'
0:44:53 > 0:44:58- I guess more often than not it would probably be a "his". - 'I will write one more sentence.'
0:44:58 > 0:45:01När Sverige deltar i olympiader är målet
0:45:01 > 0:45:03'inte att vinna utan att slå Norge.'
0:45:06 > 0:45:11OK. "When Sweden is taking part in Olympic goal is not to win but to beat Norway."
0:45:11 > 0:45:13'Yes! This is what it is!
0:45:13 > 0:45:17'But they are very good in Winter Olympics, so we can't make it, but we are trying.'
0:45:17 > 0:45:19Ah, very good, very good.
0:45:19 > 0:45:24'This is absolutely amazing, you know, and I was especially impressed
0:45:24 > 0:45:30'that it picks up words like "same-sex partnership" which are very new to the language.'
0:45:30 > 0:45:36'The translator is good, but if they succeed with what's next, that'll be remarkable.'
0:45:36 > 0:45:38One of the exciting possibilities
0:45:38 > 0:45:42is combining the machine translation technology with the speech recognition technology.
0:45:42 > 0:45:45Now, both of these are statistical in nature.
0:45:45 > 0:45:51The machine translation relies on the statistics of mapping from one language to another,
0:45:51 > 0:45:57and similarly speech recognition relies on the statistics of mapping from a sound form to the words.
0:45:57 > 0:45:59When we put them together,
0:45:59 > 0:46:03now we have the capability of having instant conversation
0:46:03 > 0:46:06between two people that don't speak a common language.
0:46:06 > 0:46:08I can talk to you in my language,
0:46:08 > 0:46:11you hear me in your language and you can answer back.
0:46:11 > 0:46:15And in real time we can make that translation,
0:46:15 > 0:46:18we can bring two people together and allow them to speak.
0:46:31 > 0:46:39The internet is just one of many technologies created to gather massive amounts of data.
0:46:39 > 0:46:43Scientists studying our earth and our environment
0:46:43 > 0:46:47now use an incredible range of instruments
0:46:47 > 0:46:50to measure the processes of our planet.
0:46:52 > 0:47:00All around us are sensors continuously measuring temperature, water flow, and ocean currents.
0:47:00 > 0:47:06And high in orbit are satellites busy imaging cloud formations, forest growth and snow cover.
0:47:06 > 0:47:11Scientists speak of "instrumenting the earth".
0:47:13 > 0:47:20And pointing up to the skies above are powerful new telescopes mapping the universe.
0:47:30 > 0:47:34What's happening in astronomy is typical of how profoundly
0:47:34 > 0:47:39this new torrent of data is transforming science.
0:47:39 > 0:47:45Astronomers are now addressing many enduring mysteries of the cosmos
0:47:45 > 0:47:49by applying statistical methods to all this new data.
0:47:59 > 0:48:03The galaxy is a very big place and it's got billions of stars in it,
0:48:03 > 0:48:09and so to put together a coherent picture of the whole galaxy requires having an enormous amount of data.
0:48:09 > 0:48:13And before you could do a large sky survey with sensitive, digital detectors
0:48:13 > 0:48:16that meant that you could map many, many stars all at once,
0:48:16 > 0:48:20it was very difficult to build up enough data on enough of the galaxy.
0:48:24 > 0:48:28In the past, large surveys of the night sky had to be done
0:48:28 > 0:48:32by exposing thousands of large photographic plates.
0:48:32 > 0:48:37But these surveys could take 25 years or more to complete.
0:48:39 > 0:48:44Then, in the 1990s, came digital astronomy and a huge increase
0:48:44 > 0:48:49in both the amount and the accessibility of data.
0:48:49 > 0:48:55The Sloan Sky Survey is the world's biggest yet, using a massive digital sensor
0:48:55 > 0:49:00mounted on the back of a custom-built telescope in New Mexico.
0:49:00 > 0:49:05It's scanned the sky night after night for eight years,
0:49:05 > 0:49:09building up a composite picture in unprecedented resolution.
0:49:09 > 0:49:14The Sloan is some of the best, deepest survey data that we have in astronomy.
0:49:14 > 0:49:18Both on our own galaxy and on galaxies further away from ours.
0:49:24 > 0:49:27All the Sloan data is on the internet,
0:49:27 > 0:49:34and with it astronomers have identified millions of hitherto unknown stars and galaxies.
0:49:34 > 0:49:37They also comb the database for statistical patterns
0:49:37 > 0:49:42which will prove, disprove, or even suggest new theories.
0:49:42 > 0:49:49So we have this idea that galaxies grow, they become large galaxies like the one we live in, the Milky Way,
0:49:49 > 0:49:55not all at once, or not smoothly, but by continuously incorporating,
0:49:55 > 0:49:59basically cannibalising, smaller galaxies.
0:49:59 > 0:50:04They dissolve them and they become part of the bigger galaxy as it grows.
0:50:06 > 0:50:12It's a startling idea, and, in the Sloan data, is the evidence to support it.
0:50:12 > 0:50:16Groups of stars that came from cannibalised galaxies
0:50:16 > 0:50:21stand out in the Sloan data as statistically different from other stars
0:50:21 > 0:50:24because they move at a different velocity.
0:50:24 > 0:50:28Each big spike on one of these distribution graphs
0:50:28 > 0:50:35means Professor Rockosi has found a group of stars all travelling in a different way to the rest.
0:50:35 > 0:50:38They are the telltale patterns she's looking for.
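The "big spike" on a distribution graph can be pictured as one velocity bin holding far more stars than its neighbours. The sketch below is a deliberately crude stand-in, not the method used on the Sloan data: the velocities are synthetic, the function name `velocity_spikes` is invented, and the rule "more than three times the median bin count" is an arbitrary assumption chosen to keep the example short.

```python
from collections import Counter
from statistics import median

def velocity_spikes(velocities, bin_width=20, factor=3):
    """Return the starting velocity of each bin whose star count
    exceeds `factor` times the median bin count."""
    # Floor each velocity to the start of its bin and count stars per bin.
    bins = Counter((v // bin_width) * bin_width for v in velocities)
    typical = median(bins.values())
    return sorted(b for b, n in bins.items() if n > factor * typical)

# Synthetic data: a smooth background of stars spread over many velocities,
# plus a hypothetical group of 15 stars all moving at about 125 km/s.
background = list(range(-200, 200, 5))
stream = [125] * 15
spikes = velocity_spikes(background + stream)
print(spikes)
```

A real analysis would have to allow for measurement error and for how many spikes chance alone would produce, which this toy rule ignores.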
0:50:40 > 0:50:44The evidence is accumulating that, in fact, this really is how galaxies grow,
0:50:44 > 0:50:47or an important way in which galaxies grow.
0:50:47 > 0:50:53And so this is an important part of understanding how galaxies form, not only ours but every galaxy.
0:50:56 > 0:51:00The more data there is, the more discoveries can be made.
0:51:00 > 0:51:03And the technology is getting better all the time.
0:51:03 > 0:51:07The next big survey telescope starts its work in 2015.
0:51:07 > 0:51:10It will leave Sloan in the dust!
0:51:10 > 0:51:16Sloan has taken eight years to cover one quarter of the night sky.
0:51:17 > 0:51:25The new telescope will scan the entire sky, in even greater resolution, every three days!
0:51:34 > 0:51:41The vast amounts of data we have today allow researchers in all sorts of fields
0:51:41 > 0:51:46to test their theories on a previously unimaginable scale.
0:51:46 > 0:51:53But more than this, it may even change the fundamental way science is done.
0:51:53 > 0:51:58With the power of today's computers applied to all this data,
0:51:58 > 0:52:03the machines might even be able to guide the researchers.
0:52:14 > 0:52:17We're at a potentially profoundly important
0:52:17 > 0:52:22and potentially one of the most significant points in science,
0:52:22 > 0:52:24and certainly one of the most exciting,
0:52:24 > 0:52:32where there's the potential to transform not just how scientists do science but even what science is possible.
0:52:32 > 0:52:34And what will power that transformation
0:52:34 > 0:52:38of both how science is done and even what science is possible
0:52:38 > 0:52:40is going to be computation.
0:52:41 > 0:52:49Many of the dynamics of the natural world, like the interplay between the rainforests and the atmosphere,
0:52:49 > 0:52:53are so complex that we don't as yet really understand them.
0:52:53 > 0:52:59But now computers are generating literally tens of thousands of different simulations
0:52:59 > 0:53:03of how these biological systems might work.
0:53:03 > 0:53:07It's like creating thousands of hypothetical parallel worlds.
0:53:07 > 0:53:10Each and every one of these simulations
0:53:10 > 0:53:18is analysed with statistics to see if any are a good match for what is observed in nature.
0:53:18 > 0:53:21The computers can now automatically generate,
0:53:21 > 0:53:26test and discard hypotheses with scarcely a human in sight.
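The generate, test and discard loop can be sketched in a few lines. Everything here is hypothetical: the "model" is a toy in which a quantity grows by a fixed rate per step, the observations are made up, and mean squared error with a fixed tolerance is just one simple assumption about how to decide whether a simulation is a good match.

```python
def simulate(rate, steps):
    """Hypothetical toy model: a quantity that grows by `rate` each step."""
    return [rate * t for t in range(steps)]

def mean_squared_error(predicted, observed):
    """Average squared gap between a simulation and the observations."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def surviving_hypotheses(candidates, observed, tolerance):
    """Simulate every candidate, discard poor matches, rank the rest by error."""
    kept = []
    for rate in candidates:
        error = mean_squared_error(simulate(rate, len(observed)), observed)
        if error <= tolerance:
            kept.append((rate, error))
    return sorted(kept, key=lambda pair: pair[1])

# Made-up observations and a grid of candidate growth rates from 0.0 to 4.0.
observed = [0.1, 1.9, 4.2, 5.8, 8.1]
candidates = [i / 2 for i in range(9)]
survivors = surviving_hypotheses(candidates, observed, tolerance=0.5)
print(survivors)
```

Of the nine candidate "worlds", only the one with a growth rate of 2.0 comes close enough to the made-up observations to survive.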
0:53:28 > 0:53:35This new application of statistics will become absolutely vital for the future of science.
0:53:35 > 0:53:39It's creating a new paradigm, if you like,
0:53:39 > 0:53:42in science, in the way in which we can do science,
0:53:42 > 0:53:45which is increasingly...
0:53:45 > 0:53:51Which one might characterise as... data-centric or data driven
0:53:51 > 0:53:55rather than being hypothesis-driven or experimentally-driven.
0:53:55 > 0:53:58So, it's exciting times in terms of the science,
0:53:58 > 0:54:02in terms of the computation and in terms of the statistics.
0:54:08 > 0:54:15Now, if all that sounds a bit abstract and theoretical to you, how about one final frontier?
0:54:15 > 0:54:19Could statistics even make sense of your feelings?
0:54:21 > 0:54:25In California - where else? - one computer scientist
0:54:25 > 0:54:32is harvesting the internet to try to divine the patterns of our innermost thoughts and emotions.
0:54:44 > 0:54:46This is the Madness movement.
0:54:46 > 0:54:50The Madness movement represents a skyscraper view of the world.
0:54:50 > 0:54:54Each of these brightly coloured dots is an individual feeling
0:54:54 > 0:54:58expressed by someone out there in a blog or a tweet.
0:54:58 > 0:55:04And when you click on the dot it explodes to reveal the underlying feeling of that person.
0:55:04 > 0:55:07This is what people say they're feeling today.
0:55:07 > 0:55:10Better...safe...
0:55:10 > 0:55:12crappy...
0:55:12 > 0:55:14well...
0:55:14 > 0:55:18pretty...special...
0:55:18 > 0:55:20sorry...alone...
0:55:25 > 0:55:29So, every minute, We Feel Fine crawls the world's blogs,
0:55:29 > 0:55:34takes all the sentences that start with the words "I feel" or "I am feeling",
0:55:34 > 0:55:35and puts them in a database.
0:55:35 > 0:55:40We collect all the feelings and we count the most common.
0:55:40 > 0:55:43They are better...bad...
0:55:43 > 0:55:45good...right...
0:55:45 > 0:55:48guilty...sick...
0:55:48 > 0:55:51the same...like shit...
0:55:51 > 0:55:54sorry...well...
0:55:54 > 0:55:56and so on.
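The collecting and counting described above can be sketched as follows. This is not the We Feel Fine code: the posts are invented, the pattern simply takes the single word that follows "I feel" or "I am feeling" wherever the phrase appears, and a real system would need a more careful way of recognising feelings.

```python
import re
from collections import Counter

# Capture the word straight after "I feel" or "I am feeling" (an assumption
# made for the sketch; multi-word feelings such as "the same" are missed).
FEELING = re.compile(r"\bI (?:feel|am feeling) (\w+)", re.IGNORECASE)

def count_feelings(posts):
    """Count the feeling words found across a collection of posts."""
    counts = Counter()
    for post in posts:
        counts.update(match.lower() for match in FEELING.findall(post))
    return counts

# Made-up posts, purely illustrative.
posts = [
    "I feel better today.",
    "Honestly, I am feeling guilty about it.",
    "I feel better now that it's over. I feel sorry too.",
]
feelings = count_feelings(posts)
top = feelings.most_common(1)
print(top)
```

In these three invented posts the most common feeling is "better", which appears twice.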
0:55:58 > 0:56:01And we can take a look at any one feeling and analyse it.
0:56:01 > 0:56:04Right now a lot of people are feeling happy.
0:56:04 > 0:56:11We can take a look at all the people who are happy and break it down by age, gender or location.
0:56:11 > 0:56:16Since bloggers have public profiles we have that information and so we can ask questions like,
0:56:16 > 0:56:21"Are women happier than men?" or, "Is England happier than the United States?"
0:56:30 > 0:56:33We find that, as people get older, they get happier.
0:56:33 > 0:56:40And, moreover, we find that for younger people they associate happiness more with excitement,
0:56:40 > 0:56:47and, as people get older, they associate happiness more with peacefulness.
0:56:51 > 0:56:57And we also find that women feel loved more often than men, but also more guilty.
0:56:57 > 0:57:02While men feel good more often than women, but also more alone.
0:57:06 > 0:57:12As people lead more and more of their lives online, they leave behind digital traces,
0:57:12 > 0:57:19and with these digital traces we can begin to statistically analyse what it means to be human.
0:57:51 > 0:57:54So where does all of this leave us?
0:57:54 > 0:58:00We generate unimaginable quantities of data about everything you can think of.
0:58:00 > 0:58:02We analyse it to reveal the patterns.
0:58:02 > 0:58:10And now not only experts but all of us can understand the stories in the numbers.
0:58:18 > 0:58:21Instead of being led astray by prejudice,
0:58:21 > 0:58:28with statistics at our fingertips, our eyes can be open for a fact-based view of the world.
0:58:28 > 0:58:33So, more than ever before, we can become authors of our own destiny.
0:58:33 > 0:58:36And that's pretty exciting, isn't it?!
0:58:37 > 0:58:44# 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
0:58:44 > 0:58:50# 1, 22, 3, 24, 25, 26, 27, 28, 9, 30, 31, 32, 3, 34, 35, 36, 7
0:58:50 > 0:58:54# 38, 39, 40, 41, 42, 3, 44, 45, 46, 47
0:58:54 > 0:58:58LYRICS DEGENERATE INTO GIBBERISH
0:59:08 > 0:59:13GIBBERISH DEGENERATES INTO NOISE
0:59:13 > 0:59:14# 100. #