Big Data

Transcript

0:00:02 > 0:00:05Like it or not, our world is driven by computers.

0:00:05 > 0:00:09The details of your life, my life, and the world we live in

0:00:09 > 0:00:13are being recorded and kept in vast stores as digital information.

0:00:13 > 0:00:16And now, a new generation of technology

0:00:16 > 0:00:21is analysing our data in ways that are already changing our lives.

0:00:21 > 0:00:26We now live in a world of big data. Computers talking to each other,

0:00:26 > 0:00:30sharing our information in ways we never believed possible,

0:00:30 > 0:00:32sending out a stream of 1s and 0s.

0:00:32 > 0:00:34So tonight, on Bang,

0:00:34 > 0:00:38we find out exactly what Big Data is and what it's really

0:00:38 > 0:00:43capable of doing, the good and the not so good of this brave new world.

0:00:43 > 0:00:46Maggie will be looking at the frightening things people can do

0:00:46 > 0:00:47with personal data.

0:00:47 > 0:00:51And how some of us might be leaving ourselves vulnerable to crime online.

0:00:51 > 0:00:54This is about someone else being really careless with your data.

0:00:54 > 0:00:57Kind of a shocking thing to see.

0:00:57 > 0:00:59I will be looking at how big data technology

0:00:59 > 0:01:01can improve our lives...

0:01:01 > 0:01:04From getting you to your holiday destination safely...

0:01:05 > 0:01:07..to helping us to save lives.

0:01:07 > 0:01:11This is really going to make a huge difference, isn't it?

0:01:11 > 0:01:14And Jem will take us back to basics of what data actually is

0:01:14 > 0:01:17and how it's become so powerful.

0:01:17 > 0:01:21That's Bang, on data - and the new digital revolution.

0:01:26 > 0:01:28For most of us, data means the digital information

0:01:28 > 0:01:30that we personally use every day.

0:01:30 > 0:01:33But there is another kind of data that we rely on without even

0:01:33 > 0:01:35thinking about it.

0:01:35 > 0:01:38For example, every time we take to the skies.

0:01:40 > 0:01:43We're used to seeing streams of vapour left in the wake of planes,

0:01:43 > 0:01:47but as we have been hearing in the news, they can also leave a data trail -

0:01:47 > 0:01:52a stream of data that monitors the plane's performance.

0:01:52 > 0:01:56At the headquarters of Rolls-Royce in Derby, engineers make nearly

0:01:56 > 0:02:00half the world's passenger jet engines, including this,

0:02:00 > 0:02:03the Trent 1000 - the engine that powers

0:02:03 > 0:02:06many of our transatlantic flights.

0:02:06 > 0:02:09The temperatures in the back of the engine are staggeringly high -

0:02:09 > 0:02:11we talk about the temperature

0:02:11 > 0:02:14as being half of the temperature of the surface of the sun,

0:02:14 > 0:02:17and in fact it's 200 degrees above the melting point

0:02:17 > 0:02:19of the metals that we use.

0:02:19 > 0:02:22The only reason they don't melt is that we pass cooling air

0:02:22 > 0:02:24through special passages

0:02:24 > 0:02:28and channels that keep the gas away from touching the metal.

0:02:28 > 0:02:31The engine is full of vital components -

0:02:31 > 0:02:33all engineered with absolute precision,

0:02:33 > 0:02:36including an on-board computer.

0:02:38 > 0:02:42This relatively unassuming box is the brains of the engine.

0:02:42 > 0:02:44Not only does it control it,

0:02:44 > 0:02:47but it also performs another crucial function.

0:02:47 > 0:02:51It receives data from sensors buried deep within the engine,

0:02:51 > 0:02:55measuring 40 parameters 40 times a second including temperatures,

0:02:55 > 0:02:58pressures and turbine speeds.

0:02:58 > 0:03:02All of the measurements received by the computer are stored,

0:03:02 > 0:03:05and then streamed via satellite back to base, here in Derby.

0:03:06 > 0:03:09And that's not just true for the Trent 1000.

0:03:09 > 0:03:13It's the same for the entire fleet - that's thousands of engines.

0:03:13 > 0:03:17A Rolls-Royce-powered engine takes off or lands every two

0:03:17 > 0:03:19and a half seconds, somewhere in the world.

0:03:19 > 0:03:21That's a very cool factoid.

0:03:21 > 0:03:24And wherever they are up in the air, there's information coming back

0:03:24 > 0:03:29about the functionality of this engine back to Derby, to here.

0:03:29 > 0:03:31Absolutely, and they're constantly monitored

0:03:31 > 0:03:34using clever data analytics that are looking for anything

0:03:34 > 0:03:37going wrong in the engine, or any sign that it might need to be

0:03:37 > 0:03:39serviced early or something like that.

0:03:39 > 0:03:41So wherever you're flying to,

0:03:41 > 0:03:44while you're 30,000 feet or higher,

0:03:44 > 0:03:48thousands of streams of data are constantly sent back to base.

0:03:49 > 0:03:53Here, computers are programmed to sift through it

0:03:53 > 0:03:54for any anomalies.

0:03:54 > 0:03:57So you've got 11 engines that have flagged something up on your system.

0:03:57 > 0:04:00That's not necessarily an emergency

0:04:00 > 0:04:02but something that needs to be looked at. Is that how it works?

0:04:02 > 0:04:05Just an example - this has been flagged due to that -

0:04:05 > 0:04:08a step change in the behaviour of the oil pressure parameter

0:04:08 > 0:04:10of around ten PSI.

0:04:10 > 0:04:13So when we start to see something that is not what

0:04:13 > 0:04:16we would expect to see, that's the first trigger point.

0:04:16 > 0:04:19- And that's when you guys step in. - Absolutely.
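
The "step change" flag described in this exchange can be illustrated with a small sketch. It is not the Rolls-Royce analytics, which the programme does not detail; it assumes a simple rule that compares the average of the readings just before a point with the average just after it. The function name, window size, threshold and oil pressure values are all hypothetical.

```python
from statistics import mean


def find_step_change(readings, window=5, threshold=8.0):
    """Return (index, jump) for the largest shift in the local average.

    For each position, the mean of the `window` readings before it is
    compared with the mean of the `window` readings from it onwards.
    Returns (None, 0.0) if no shift reaches `threshold`.
    """
    best_index, best_jump = None, 0.0
    for i in range(window, len(readings) - window + 1):
        before = mean(readings[i - window:i])
        after = mean(readings[i:i + window])
        jump = after - before
        if abs(jump) >= threshold and abs(jump) > abs(best_jump):
            best_index, best_jump = i, jump
    return best_index, best_jump


# Hypothetical oil pressure trace in PSI: steady, then a drop of about ten PSI.
oil_pressure = [60.2, 59.8, 60.1, 60.0, 59.9, 60.1,
                50.3, 49.8, 50.1, 50.0, 49.9, 50.2]

index, jump = find_step_change(oil_pressure)
print(f"Step change at reading {index}: {jump:+.2f} PSI")
```

Averaging over a window rather than comparing single readings is one way to avoid flagging ordinary sensor noise; a real system would presumably use something more sophisticated.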

0:04:20 > 0:04:24Analysts here take a closer look at any problem data,

0:04:24 > 0:04:28and get on the phone to the airlines immediately. The result?

0:04:28 > 0:04:31Technical faults are dealt with before they become a major

0:04:31 > 0:04:34problem, preventing delays.

0:04:34 > 0:04:38Plus, the working life of the engine is dramatically improved.

0:04:38 > 0:04:43One of these engines will fly around the world 450 times before it

0:04:43 > 0:04:46needs to be overhauled - that's a hell of a mileage.

0:04:46 > 0:04:49And just as big data works for plane engines,

0:04:49 > 0:04:53it can also work for healthcare and the human body.

0:04:53 > 0:04:56We've all seen computerised systems in our hospitals.

0:04:56 > 0:04:58In intensive care, vital signs

0:04:58 > 0:05:02need to be monitored frequently at the bedside.

0:05:02 > 0:05:05Now traditionally, this information is noted down on paper,

0:05:05 > 0:05:07but here at King's College Hospital,

0:05:07 > 0:05:10a new technique is being trialled

0:05:10 > 0:05:13that records all this information and important new data

0:05:13 > 0:05:15in a way that could mean the difference

0:05:15 > 0:05:19between life and death for a huge number of people.

0:05:19 > 0:05:21Brain injuries are the most common cause of death

0:05:21 > 0:05:23and disability in young people.

0:05:23 > 0:05:26In UK hospitals, we treat over

0:05:26 > 0:05:31220,000 patients with them every year.

0:05:31 > 0:05:35Jordan Ball was admitted to King's after a motorbike accident last year.

0:05:35 > 0:05:38His recovery is as rare as it is remarkable.

0:05:38 > 0:05:41When I woke up, my mum and my brother came to see me.

0:05:41 > 0:05:45Only one eye was open, and I was just staring at my brother.

0:05:45 > 0:05:49- And a tear came down my eye. - Don't. You will make me cry.

0:05:49 > 0:05:53And he was like, "Mum, he knows it's me! He knows it's me!"

0:05:53 > 0:05:5690% of people with his injury don't even wake up.

0:05:56 > 0:05:59The 10% of people that do wake up, they need help with

0:05:59 > 0:06:03eating, walking, they don't tend to recover like Jordan has.

0:06:03 > 0:06:05He is on the top end of recovery.

0:06:05 > 0:06:08- Do you realise how lucky you are, Jordan?- I do.

0:06:08 > 0:06:11- He does do, don't you? - It's incredible.

0:06:12 > 0:06:16Most patients are a lot less lucky - and for them,

0:06:16 > 0:06:19some of the most serious problems can occur in the following hours

0:06:19 > 0:06:22and days after admission to hospital.

0:06:22 > 0:06:27Secondary brain injuries are a serious concern in patients

0:06:27 > 0:06:29that suffer trauma to the brain.

0:06:29 > 0:06:36If you imagine that this is the initial point of injury...

0:06:36 > 0:06:39one of the biggest problems

0:06:39 > 0:06:43is that the electrical activity in the tissue surrounding this point

0:06:43 > 0:06:47short-circuits, and storms through all the cells,

0:06:47 > 0:06:51using up all the energy supply, the glucose.

0:06:51 > 0:06:56Eventually, the cells stop working and they die, never to be replaced.

0:06:56 > 0:06:59If it was possible to know when these secondary events were starting

0:06:59 > 0:07:03to happen, doctors could intervene and potentially limit the damage.

0:07:05 > 0:07:09When Jordan was in ICU, he suffered seizures

0:07:09 > 0:07:11that the doctors understood very little about.

0:07:11 > 0:07:13So obviously, the doctors were doing everything they possibly could

0:07:13 > 0:07:17but there was a big part of all of this that was an unknown.

0:07:17 > 0:07:20If there was anything that could detect brain activity

0:07:20 > 0:07:25and that doctors were then able to make a move to stop that happening,

0:07:25 > 0:07:27that would have been wonderful.

0:07:27 > 0:07:29King's College Hospital have been working with

0:07:29 > 0:07:31Professor Martyn Boutelle

0:07:31 > 0:07:36and his team at Imperial College on a big data early warning system.

0:07:36 > 0:07:39This bolt can be fitted to the skull by a neurosurgeon without

0:07:39 > 0:07:41even having to go into theatre,

0:07:41 > 0:07:45turning what's going on inside the brain into data.

0:07:45 > 0:07:48You can see, sticking out, we have a number of different sensors.

0:07:48 > 0:07:51One of them measures brain electrical activity.

0:07:51 > 0:07:55Another one measures the pressure and tissue oxygen

0:07:55 > 0:07:58and brain temperature as well.

0:07:58 > 0:08:00And then the last one measures chemically

0:08:00 > 0:08:02what is going on in the brain tissue.

0:08:02 > 0:08:05So if all of the indications from a probe like this

0:08:05 > 0:08:09- are letting you know that there is trouble ahead...- Yes.

0:08:09 > 0:08:12That gives you a chance to act before the secondary brain

0:08:12 > 0:08:14injury really takes effect?

0:08:14 > 0:08:15Exactly. That is the idea.

0:08:16 > 0:08:19This data could produce vital new insights,

0:08:19 > 0:08:23but it's recording between 16 and 32 channels,

0:08:23 > 0:08:26each being measured up to 200 times a second.

0:08:27 > 0:08:29Very quickly you can't see what's going on

0:08:29 > 0:08:31when there's that much data.
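
To put rough numbers on "that much data", here is a short worked calculation using only the figures quoted in the programme: 16 to 32 channels at up to 200 measurements a second for the brain monitor, and 40 parameters 40 times a second for the engine. The function name is an illustrative choice, and the results are upper bounds because the sampling rate is given as "up to" 200.

```python
def readings_per_second(channels, samples_per_second):
    """Total individual measurements arriving each second."""
    return channels * samples_per_second


brain_low = readings_per_second(16, 200)   # 3,200 readings a second
brain_high = readings_per_second(32, 200)  # 6,400 readings a second
engine = readings_per_second(40, 40)       # 1,600 readings a second

# One hour of monitoring at the upper rate.
brain_high_per_hour = brain_high * 60 * 60  # 23,040,000 readings

print(brain_low, brain_high, engine, brain_high_per_hour)
```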

0:08:31 > 0:08:35Doctors in ICU need an automated solution to turn all this

0:08:35 > 0:08:39available data into something immediately useful.

0:08:39 > 0:08:42So Professor Boutelle turned to Cybula - big data specialists

0:08:42 > 0:08:46who also worked on the engine monitoring systems at Rolls-Royce.

0:08:46 > 0:08:48So, obviously,

0:08:48 > 0:08:52an engine is completely different to a person's heart or a person's brain,

0:08:52 > 0:08:59- but you're using the same programme to pinpoint problems in each.- Yes.

0:08:59 > 0:09:01- How does that work?- Exactly.

0:09:01 > 0:09:03Because it doesn't matter to us

0:09:03 > 0:09:06whether it's brain data or whether it's an aero engine.

0:09:06 > 0:09:09It is really just the data that's the issue.

0:09:09 > 0:09:11We are able to look at the patterns

0:09:11 > 0:09:14in the data that characterise those events

0:09:14 > 0:09:16through those shapes.

0:09:16 > 0:09:19Here is an example of a brain event that we are actually looking for.

0:09:19 > 0:09:22In this section here, we are looking

0:09:22 > 0:09:25for this kind of spreading kind of wave.

0:09:25 > 0:09:27The liquid goes into here...

0:09:27 > 0:09:31So, with a big data solution at its heart, this prototype brain

0:09:31 > 0:09:34monitoring system works in near-real-time.

0:09:34 > 0:09:38Importantly for the busy critical care staff,

0:09:38 > 0:09:42they could see that something was happening here so they can see,

0:09:42 > 0:09:46"Yes, it has started to happen. We need to do something."

0:09:48 > 0:09:52Big data is being used in flood alert, transport, natural disaster

0:09:52 > 0:09:54response systems...

0:09:54 > 0:09:57You name it, it can provide vital new information.

0:09:57 > 0:10:00And always as a collaboration between research groups,

0:10:00 > 0:10:04engineers and experts in data collection and analysis.

0:10:05 > 0:10:08The big data revolution isn't just about storing more

0:10:08 > 0:10:11and more unconnected information.

0:10:11 > 0:10:13It's also about programmers

0:10:13 > 0:10:16designing new software to spot patterns and make connections in the

0:10:16 > 0:10:21data - and for that they need access to the data in the first place.

0:10:21 > 0:10:24Many believe that if we could make more data

0:10:24 > 0:10:25freely and openly available,

0:10:25 > 0:10:31then we could crack problems in ways that were previously unimaginable.

0:10:31 > 0:10:34The Open Data Institute is encouraging businesses

0:10:34 > 0:10:38and the UK government to share more of their information.

0:10:38 > 0:10:40If you make this data available,

0:10:40 > 0:10:42it could be used to show more transparently what's

0:10:42 > 0:10:45going on, it can make people accountable

0:10:45 > 0:10:48for the performance of, for example, public services, health education...

0:10:48 > 0:10:52So give us a few examples of the difference that can be made.

0:10:52 > 0:10:55The data that was held by the Department of Transport

0:10:55 > 0:10:59on bicycle accidents - within three days, that data had been taken,

0:10:59 > 0:11:03turned from one form into another and somebody had written

0:11:03 > 0:11:05an application that basically avoided

0:11:05 > 0:11:07the bicycle accident black spots around London.

0:11:07 > 0:11:11Just a very obvious thing that never occurred to the people

0:11:11 > 0:11:14that had the data in the first place to do.

0:11:14 > 0:11:18Sharing data clearly has some advantages but when it comes

0:11:18 > 0:11:21to personal information, it can be very controversial.

0:11:21 > 0:11:24Unless they opt out, people in England will soon

0:11:24 > 0:11:27have their medical records put into a digital database,

0:11:27 > 0:11:30and the NHS there plan to make them available for research.

0:11:30 > 0:11:33It could lead to medical breakthroughs - but some

0:11:33 > 0:11:37people still worry about releasing sensitive information like this.

0:11:37 > 0:11:39If I'm undergoing a medical crisis,

0:11:39 > 0:11:42I'd really like that my medical records could be shared

0:11:42 > 0:11:45between the appropriate services in an appropriate way,

0:11:45 > 0:11:48but blanket data publication, you have to be very

0:11:48 > 0:11:52cautious about...in this area when it's relating to individual data.

0:11:52 > 0:11:55And I think that both corporations

0:11:55 > 0:11:59and governments have to be extremely careful to respect

0:11:59 > 0:12:02and maintain the privacy of an individual.

0:12:02 > 0:12:05The NHS say that details that could identify individuals will be

0:12:05 > 0:12:09removed before the information is made available.

0:12:09 > 0:12:11But medical records aren't the only sensitive data

0:12:11 > 0:12:13we have to be conscious of.

0:12:13 > 0:12:16Today, it seems everyone wants to know what you're doing.

0:12:16 > 0:12:20Take a train or a bus and your journey is tracked.

0:12:20 > 0:12:22Use a points card when you're out shopping,

0:12:22 > 0:12:25and they keep track of what you're spending

0:12:25 > 0:12:28and use that information to market yet more stuff to you.

0:12:28 > 0:12:30And your bank account

0:12:30 > 0:12:33and spending patterns are monitored by financial services.

0:12:33 > 0:12:36Now, that's for your protection to prevent fraud,

0:12:36 > 0:12:39but it's also for future credit checks.

0:12:39 > 0:12:43It doesn't stop even when we think we're having some time to ourselves.

0:12:43 > 0:12:46We now spend nearly half our waking hours in front of some screen

0:12:46 > 0:12:50or other - but it's not always just between you and the computer.

0:12:51 > 0:12:53Take e-mail, for example - now,

0:12:53 > 0:12:57if you use a free service like Yahoo or Gmail, you'll be very

0:12:57 > 0:13:01familiar with those targeted ads that appear on the page.

0:13:01 > 0:13:03So have a look at this - this account belongs to

0:13:03 > 0:13:04a member of our production team,

0:13:04 > 0:13:07and you'll see that the ad that's appeared

0:13:07 > 0:13:10is offering flights to Australia.

0:13:10 > 0:13:13No surprise, because he uses this e-mail address

0:13:13 > 0:13:14to book most of his travel.

0:13:14 > 0:13:18Some free services automatically scan your search queries,

0:13:18 > 0:13:22social networks and even e-mails to get a sense of who you are,

0:13:22 > 0:13:25so they can target their adverts better.

0:13:25 > 0:13:26Sometimes,

0:13:26 > 0:13:29when a service is being offered for free, WE are what's being sold.

0:13:29 > 0:13:33In this case, as a potential customer for a targeted ad.

0:13:34 > 0:13:37This is what we sign up to in return for free communication,

0:13:37 > 0:13:41map services and a world of knowledge at our fingertips.

0:13:41 > 0:13:43Internet giants like Google, Facebook, Yahoo

0:13:43 > 0:13:47and Twitter don't release our personal details directly to

0:13:47 > 0:13:51advertisers, but they do generate an income from our profile.

0:13:51 > 0:13:55So how bothered are we, really, about sharing this sort of data?

0:13:55 > 0:14:00Let's find out from some volunteers at City University London.

0:14:00 > 0:14:04I've got some cards for them to choose from - where red means

0:14:04 > 0:14:08they share a lot online, green means they're sharing very little,

0:14:08 > 0:14:11and yellow is somewhere in the middle.

0:14:11 > 0:14:14OK, so let's start by asking you to choose a card.

0:14:14 > 0:14:16- I'll go with one of these, actually.- OK.

0:14:16 > 0:14:18Yellow, but I'd like to be green.

0:14:18 > 0:14:20I'd like to think that I protect myself.

0:14:20 > 0:14:23I do tend to try and go through privacy settings

0:14:23 > 0:14:24on things like Twitter and Facebook.

0:14:24 > 0:14:27I never save card details on any of my accounts.

0:14:27 > 0:14:29I'm a bit paranoid so...!

0:14:30 > 0:14:32Later on we'll find out whether they share

0:14:32 > 0:14:34as little information as they think.

0:14:35 > 0:14:39But first - internet security expert Professor Alan Woodward

0:14:39 > 0:14:42is showing me a murky corner of the internet

0:14:42 > 0:14:44where personal details are bought and sold,

0:14:44 > 0:14:47known as the Dark Web.

0:14:47 > 0:14:49These are bulletin boards where people

0:14:49 > 0:14:51are discussing selling now,

0:14:51 > 0:14:53not just credit card details,

0:14:53 > 0:14:55but all sorts of different personal information.

0:14:55 > 0:14:58There is actually quite a black humour side to this -

0:14:58 > 0:14:59he's saying how professional he is -

0:14:59 > 0:15:04"This is the result of three years' hard work".

0:15:04 > 0:15:05You can get very specific.

0:15:05 > 0:15:08There you are, date of birth, 15.

0:15:08 > 0:15:10Why is that important? Because in the UK, for example,

0:15:10 > 0:15:13a name and a date of birth, to a credit agency,

0:15:13 > 0:15:16can be considered a unique combination.

0:15:16 > 0:15:19So how much information does someone need to glean

0:15:19 > 0:15:21for it to be really useful?

0:15:21 > 0:15:25Anything you can add that starts to make your reference more unique.

0:15:25 > 0:15:27So you don't need much of that.

0:15:27 > 0:15:29Including, for example, put your home address down,

0:15:29 > 0:15:31your date of birth,

0:15:31 > 0:15:34if you can get hold of something like social security numbers,

0:15:34 > 0:15:36National Insurance numbers, in the UK.

0:15:36 > 0:15:40I could be whoever I want, from this list of people I can buy.

0:15:40 > 0:15:42And to give you an idea of how easy it is

0:15:42 > 0:15:44to collect data once it's out there,

0:15:44 > 0:15:49Alan's asked James Lyne from Cyber Security giant Sophos to join us.

0:15:49 > 0:15:53He's using some legal and freely available data harvesting tools

0:15:53 > 0:15:57to gather information about our volunteers.

0:15:57 > 0:15:59What these tools all really do,

0:15:59 > 0:16:02is, they take individual pieces of information,

0:16:02 > 0:16:04that in themselves would be completely innocuous,

0:16:04 > 0:16:08so a name, a social media profile,

0:16:08 > 0:16:09an e-mail address,

0:16:09 > 0:16:13and they combine them together using these Big Data techniques,

0:16:13 > 0:16:15expanding the information massively

0:16:15 > 0:16:19and make a very accurate profile of what that person looks like.

0:16:19 > 0:16:21Surprisingly, James doesn't need much information

0:16:21 > 0:16:23to build an accurate profile.

0:16:23 > 0:16:26People don't realise that often photos, tweets

0:16:26 > 0:16:31and other data they may upload, contain GPS coordinates, by default.

0:16:31 > 0:16:34So you might not give away your address or postcode,

0:16:34 > 0:16:38but you're giving away your location to plus or minus 10-15 metres.

0:16:39 > 0:16:44You might see 160 tweets that correlate to that location.

0:16:44 > 0:16:46Plus, the tweet content may talk about being at home,

0:16:46 > 0:16:48doing something for the kids.

0:16:48 > 0:16:51It gives away very clearly that's where they live.

0:16:51 > 0:16:54With potentially two to three years' backlog of data,

0:16:54 > 0:16:56that's enough to build a profile of anyone.
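
The idea that many geotagged posts "correlate" to one location can be sketched as follows. This is not the tool James uses, which is not named in the programme; it is a minimal illustration that assumes each post carries a latitude and longitude, snaps them to a grid roughly ten metres across by rounding to four decimal places, and counts which grid cell turns up most often. The coordinates are made up.

```python
from collections import Counter


def most_frequent_location(points, decimals=4):
    """Return ((lat, lon), count) for the busiest grid cell.

    Rounding to four decimal places groups points that lie within
    roughly ten metres of each other. Returns (None, 0) for no points.
    """
    cells = Counter(
        (round(lat, decimals), round(lon, decimals)) for lat, lon in points
    )
    if not cells:
        return None, 0
    return cells.most_common(1)[0]


# Hypothetical geotags: four posts from one spot, one from elsewhere.
posts = [
    (51.50012, -0.12003),
    (51.50009, -0.11998),
    (51.50011, -0.12001),
    (51.52050, -0.09820),
    (51.50010, -0.12004),
]

cell, count = most_frequent_location(posts)
print(f"{count} posts near {cell}")
```

A fixed grid can split neighbouring points across two cells, so this is only a rough stand-in for proper clustering.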

0:16:56 > 0:16:58And we'll be letting the volunteers know

0:16:58 > 0:17:01what we've found out about them, later on.

0:17:01 > 0:17:05From keeping planes in the air to stealing identities,

0:17:05 > 0:17:06if you've got access to data

0:17:06 > 0:17:10you can build some incredibly powerful tools.

0:17:10 > 0:17:13We first found this out long before the Big Data revolution,

0:17:13 > 0:17:15over 70 years ago.

0:17:15 > 0:17:17Jem takes up the story.

0:17:17 > 0:17:18During World War II,

0:17:18 > 0:17:22brilliant minds gathered in these buildings at Bletchley Park

0:17:22 > 0:17:25to decipher encrypted German messages.

0:17:25 > 0:17:27The results helped shorten the war by two years,

0:17:27 > 0:17:30and saved countless lives.

0:17:30 > 0:17:34At first they used human computers, real people sat at desks

0:17:34 > 0:17:37cracking the codes by pen and paper.

0:17:37 > 0:17:41But by 1943, the engineers had realised machines might be able

0:17:41 > 0:17:43to do a much better job -

0:17:43 > 0:17:47machines that processed with simple on/off switches.

0:17:47 > 0:17:51So how do you link simple switches to answer a problem?

0:17:51 > 0:17:53Well, I've got two of them here.

0:17:53 > 0:17:55Essentially, it's a tiny computer -

0:17:55 > 0:17:57I now need to programme it.

0:17:57 > 0:18:03Now the input to this I'm assigning to "Is it Monday?"

0:18:03 > 0:18:06so if it is Monday - it gets a positive input.

0:18:06 > 0:18:10If it isn't - it gets nothing at all.

0:18:10 > 0:18:15This switch, the input for that, "Is it 7.30?"

0:18:15 > 0:18:21And the output here, I'm assigning "Good time to watch BBC1?"

0:18:21 > 0:18:26Right. Let's start using the computer. Is it Monday?

0:18:26 > 0:18:29Yes. Is it 7.30?

0:18:29 > 0:18:30Yes.

0:18:32 > 0:18:35The computer says it is an ideal time

0:18:35 > 0:18:37for checking out some science on your telly.
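
Jem's two-switch computer behaves like what is usually called an AND gate: the output is on only when both inputs are on. A minimal sketch, with a hypothetical function name:

```python
def good_time_to_watch(is_monday: bool, is_half_past_seven: bool) -> bool:
    """Two switches in series: the output is on only if both are on."""
    return is_monday and is_half_past_seven


# Try every combination of the two switches.
for monday in (False, True):
    for half_past_seven in (False, True):
        result = good_time_to_watch(monday, half_past_seven)
        print(f"Monday={monday}, 7.30={half_past_seven} -> {result}")
```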

0:18:38 > 0:18:39At Bletchley Park,

0:18:39 > 0:18:43engineers hooked up their own network of on/off switches

0:18:43 > 0:18:45to crack the German codes.

0:18:45 > 0:18:49Where I've used two switches, this machine used over 2,000,

0:18:49 > 0:18:52and it was aptly called Colossus.

0:18:52 > 0:18:55Back in the day, Colossus was revolutionary

0:18:55 > 0:18:58because it used these electronic valves

0:18:58 > 0:19:01for its fast and reliable switching.

0:19:01 > 0:19:04Fast and reliable for its time, because within ten years,

0:19:04 > 0:19:09that same job was being done by transistors, considerably smaller.

0:19:09 > 0:19:13Now, I pulled this out of a modern computer.

0:19:13 > 0:19:16The central processing unit - the chip that does the switching.

0:19:16 > 0:19:21And on there, there are 54 million transistors.

0:19:21 > 0:19:23And it's that kind of miniaturisation

0:19:23 > 0:19:28that has revolutionised what we can do with computers.

0:19:28 > 0:19:33Using switches to process ons and offs is how all computers work,

0:19:33 > 0:19:36but today they're known as 1s and 0s.

0:19:37 > 0:19:39You might not think you can get much subtlety

0:19:39 > 0:19:42out of a switch just being on or off,

0:19:42 > 0:19:46but there are millions of them at work for you right now,

0:19:46 > 0:19:49sending out a stream of 1s and 0s,

0:19:49 > 0:19:53sequentially telling every pixel on your screen

0:19:53 > 0:19:56just how bright or dark they need to be.

0:19:56 > 0:19:59Your holiday snaps? A sequence of 1s and 0s.

0:19:59 > 0:20:02Your MP3s? A load of 1s and 0s.

0:20:05 > 0:20:06And every letter on a keyboard

0:20:06 > 0:20:11is an eight digit code of 1s or 0s, to a computer.
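
As an illustration of an eight-digit code for a letter, the sketch below assumes the common ASCII encoding, which the programme does not name. The function name is hypothetical.

```python
def to_bits(letter: str) -> str:
    """Return the eight-digit code of 1s and 0s for one character."""
    code = ord(letter)
    if len(letter) != 1 or code > 255:
        raise ValueError("expected a single character that fits in eight bits")
    return format(code, "08b")


for letter in "Bang":
    print(letter, to_bits(letter))
```

Under this assumption a capital "A" is stored as 01000001.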

0:20:11 > 0:20:15For Colossus, data was fed in on paper tape.

0:20:15 > 0:20:20Each punched hole, or unpunched space, acted as a 1 or a 0.

0:20:20 > 0:20:25Today, individual 1s or 0s are called bits.

0:20:25 > 0:20:28Nowadays we reckon eight bits are a byte.

0:20:28 > 0:20:33And to match the storage capacity of something like a hard-drive -

0:20:33 > 0:20:39250GB - your piece of paper would need to go to the moon

0:20:39 > 0:20:41and back, and probably back to the moon again.

0:20:43 > 0:20:46So how can we pack so many bits into such a little box?

0:20:46 > 0:20:50Well, most hard-drives work using magnets.

0:20:50 > 0:20:53Computers magnetise an area of a disc

0:20:53 > 0:20:55like I'm magnetising these bolt-heads.

0:20:58 > 0:21:01I'll use magnetic North for 1 and South for 0,

0:21:01 > 0:21:04which can then be detected later.

0:21:04 > 0:21:07In a real hard drive the magnetisable areas

0:21:07 > 0:21:09are sitting on a spinning disc.

0:21:09 > 0:21:13Quite literally, on there, there are millions and billions

0:21:13 > 0:21:15of magnetisable areas,

0:21:15 > 0:21:19each of them so small, that they're smaller than a virus.

0:21:19 > 0:21:2210,000 of them would fit across the width of a human hair.

0:21:22 > 0:21:27And this is spinning around at 100 times a second or more.

0:21:27 > 0:21:30And yet still the computer is extracting

0:21:30 > 0:21:33a phenomenal amount of data incredibly quickly.

0:21:33 > 0:21:35Now just because this all seems like

0:21:35 > 0:21:39some ridiculous fantasy piece of engineering

0:21:39 > 0:21:43doesn't mean you shouldn't have a go at building your own.

0:21:43 > 0:21:44Now where's that MDF?

0:21:47 > 0:21:52As a team, we are putting together a massive four byte hard-drive.

0:21:52 > 0:21:55Four rings of eight magnets on a spinning platter.

0:21:55 > 0:21:58As the disc spins past the electro magnet

0:21:58 > 0:22:01it reads each bit as a 0 or a 1.

0:22:03 > 0:22:06I've left Chris and Jim to secretly encode each ring

0:22:06 > 0:22:10as a sequence of eight bits - enough for a letter on a keyboard.

0:22:12 > 0:22:14That's a zero.

0:22:14 > 0:22:17And I decipher the code back into letters.

0:22:17 > 0:22:19What takes me 30 seconds,

0:22:19 > 0:22:22a computer does at nearly the speed of light.
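
The deciphering step can be sketched in code. The actual message Chris and Jim encoded is not given in the transcript, so the rings below spell a made-up example, and the sketch again assumes ASCII, with 1 standing for magnetic North and 0 for South as Jem describes. All names are hypothetical.

```python
def decode_ring(bits):
    """Turn one ring of eight bits (1 = North, 0 = South) into a letter."""
    value = 0
    for bit in bits:
        value = value * 2 + bit  # shift left, then add the new bit
    return chr(value)


def decode_platter(rings):
    """Decode every ring on the platter, one letter per ring."""
    return "".join(decode_ring(ring) for ring in rings)


# A made-up four byte platter: four rings of eight magnets each.
rings = [
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 1, 1],
]

print(decode_platter(rings))
```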

0:22:22 > 0:22:26Oh! I mean, that's just brilliant! What can I say?

0:22:26 > 0:22:29Milk and two sugars, please.

0:22:31 > 0:22:35Our ability to store vast quantities of information digitally,

0:22:35 > 0:22:39a bit like this, and process it with tiny, lightning fast switches,

0:22:39 > 0:22:41is what's driven computing,

0:22:41 > 0:22:45and opened up this whole field of Big Data.

0:22:45 > 0:22:47And as engineers develop even better storage,

0:22:47 > 0:22:49and even faster processing,

0:22:49 > 0:22:53Big Data applications are going to have a bigger and bigger influence

0:22:53 > 0:22:55on our everyday lives.

0:22:56 > 0:22:59Back at City University, London, it's results time

0:22:59 > 0:23:02in our personal data experiment.

0:23:02 > 0:23:04First up, those who chose green,

0:23:04 > 0:23:08believing they put no personal data about themselves online.

0:23:08 > 0:23:10Could we find anything about them?

0:23:11 > 0:23:13You've got my mobile number in there.

0:23:13 > 0:23:15Which I'm a bit surprised about but I'm guessing that might come

0:23:15 > 0:23:17from a shopping website or something like that.

0:23:17 > 0:23:19It was from somewhere that you'd published it.

0:23:19 > 0:23:23But it's not just his mobile that's public.

0:23:23 > 0:23:26Do you use that e-mail address for resetting certain accounts?

0:23:26 > 0:23:27Uh...yes.

0:23:27 > 0:23:29I think when I said I was green

0:23:29 > 0:23:31I'd forgotten how I put some of those things in.

0:23:31 > 0:23:33It's amazing how much this information just stays there.

0:23:33 > 0:23:34Absolutely.

0:23:34 > 0:23:37There's actually an astonishing number of cases

0:23:37 > 0:23:39where people thought they were really, really secure,

0:23:39 > 0:23:41and gave nothing away, but in reality,

0:23:41 > 0:23:43posted an awful lot of information online.

0:23:43 > 0:23:47And even for those in the red group, who knew they had data online,

0:23:47 > 0:23:49there were still surprises.

0:23:49 > 0:23:51Under my name there's only my phone number and my e-mail address.

0:23:51 > 0:23:55Whereas with the other guys it's their full home details,

0:23:55 > 0:23:58addresses, many phone numbers for them.

0:23:58 > 0:24:01This isn't about you being careless with your data.

0:24:01 > 0:24:05This is about someone else being really careless with your data,

0:24:05 > 0:24:09and all those other coaches, and the names of those children.

0:24:09 > 0:24:11All the names just listed out there,

0:24:11 > 0:24:14it's kind of a shocking thing to see.

0:24:14 > 0:24:17In fact, all of our groups were quite shocked.

0:24:17 > 0:24:21Can anybody who is not a member of this website just access it,

0:24:21 > 0:24:23or do you have to, you know, become a member of it?

0:24:23 > 0:24:26It's all accessible. We didn't register to get that.

0:24:26 > 0:24:29That's good then(!)

0:24:29 > 0:24:32So what, do you think, constitutes safe online behaviour?

0:24:32 > 0:24:35So, firstly, don't be too paranoid.

0:24:35 > 0:24:37I use Twitter, I use LinkedIn,

0:24:37 > 0:24:39I enjoy online services.

0:24:39 > 0:24:41But we have to think a little carefully

0:24:41 > 0:24:42about the information we upload.

0:24:42 > 0:24:47Do we want to give away the location of this photo, in our back garden,

0:24:47 > 0:24:50that contains the location of our house, plus or minus 10-15 metres?

0:24:50 > 0:24:54Secondly, consider lying online.

0:24:54 > 0:24:56Now, I know that sounds like a strange thing to say

0:24:56 > 0:24:58in the real world, but when a service provider says,

0:24:58 > 0:25:00"What's your date of birth?",

0:25:00 > 0:25:04don't tell them, and if they demand you give them that information,

0:25:04 > 0:25:06give them a fake answer.

0:25:06 > 0:25:08Keep note of that for future purposes, for a reset,

0:25:08 > 0:25:10but don't tell them the truth.

0:25:10 > 0:25:13And remember, once something's on the internet,

0:25:13 > 0:25:15you really can't delete it.

0:25:15 > 0:25:18So think before you put anything there in the first place.

0:25:18 > 0:25:21So we should be wary about any information

0:25:21 > 0:25:23that's out there and unrestricted.

0:25:23 > 0:25:26But as Liz is finding out, Big Data's offering up

0:25:26 > 0:25:29more than just new ways to reveal our identity.

0:25:29 > 0:25:30It's also offering up

0:25:30 > 0:25:33a new generation of facial recognition techniques

0:25:33 > 0:25:37that eventually, may even be able to tell how we're feeling.

0:25:37 > 0:25:41Two dimensional facial recognition systems have a wide variety

0:25:41 > 0:25:45of very useful applications, but they're not completely foolproof.

0:25:45 > 0:25:53If I hold up a picture... of Jem Stansfield's head,

0:25:53 > 0:25:56because the system only analyses in two dimensions,

0:25:56 > 0:25:59this flat picture can fool it into thinking

0:25:59 > 0:26:01I'm someone completely different.

0:26:02 > 0:26:052D systems mostly work by measuring the distance

0:26:05 > 0:26:07between your key facial features.

0:26:07 > 0:26:10But the technology can be easily confused.
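
A rough sketch of distance-based matching shows why a flat photograph can fool a 2D system. It is not the software used in the demonstration; it assumes each face is reduced to a few landmark points in an image, and compares the distances between them after scaling by the gap between the eyes. The landmark names, coordinates and tolerance are hypothetical.

```python
from itertools import combinations
from math import dist


def feature_vector(landmarks):
    """Distances between every pair of landmarks, scaled by the eye gap."""
    names = sorted(landmarks)
    scale = dist(landmarks["left_eye"], landmarks["right_eye"])
    return [dist(landmarks[a], landmarks[b]) / scale
            for a, b in combinations(names, 2)]


def is_match(face_a, face_b, tolerance=0.05):
    """True if every scaled distance agrees to within `tolerance`."""
    pairs = zip(feature_vector(face_a), feature_vector(face_b))
    return max(abs(x - y) for x, y in pairs) <= tolerance


enrolled = {"left_eye": (30, 40), "right_eye": (70, 40),
            "nose": (50, 60), "mouth": (50, 80)}

# A printed photo of the same face, held closer so it appears twice as large.
photo = {name: (2 * x, 2 * y) for name, (x, y) in enrolled.items()}

# A different face with a longer nose.
stranger = dict(enrolled, nose=(50, 70))

print(is_match(enrolled, photo))     # the flat photo passes
print(is_match(enrolled, stranger))  # the different face does not
```

Because only positions in a flat image are compared, a photograph produces the same measurements as the real face, which is the weakness a 3D system is meant to address.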

0:26:10 > 0:26:13Here at the Centre for Machine Vision

0:26:13 > 0:26:15at the Bristol Robotics Laboratory,

0:26:15 > 0:26:18Mark Hansen and his team have made a system

0:26:18 > 0:26:19that can see in 3D, like we can.

0:26:21 > 0:26:24This booth uses a high speed camera,

0:26:24 > 0:26:30and five near-infrared flashes to build up a 3D likeness of my face.

0:26:30 > 0:26:31So it's captured all the images.

0:26:31 > 0:26:33God, I look hideous!

0:26:33 > 0:26:36That's an awful photograph!

0:26:36 > 0:26:39That is a 3D image of my face, and it's saying, "Access denied."

0:26:39 > 0:26:41Why is it not recognising me?

0:26:41 > 0:26:45Because we haven't enrolled you on the system yet.

0:26:45 > 0:26:48I walk through a few more times and Mark programmes the computer

0:26:48 > 0:26:52to recognise the face it's detecting as mine.

0:26:52 > 0:26:56Yay! Good afternoon, Liz. Excellent.

0:26:56 > 0:26:59We're extracting the key features of your face...

0:26:59 > 0:27:01The height of my cheeks, the bump on my nose,

0:27:01 > 0:27:03is that what it all boils down to?

0:27:03 > 0:27:05Absolutely, yep.

0:27:05 > 0:27:07This is Big Data facial recognition,

0:27:07 > 0:27:11matching patterns captured across my entire 3D image,

0:27:11 > 0:27:14with what it has already learned about my face.

0:27:14 > 0:27:16It's more robust than 2D systems,

0:27:16 > 0:27:17and you'd need a twin,

0:27:17 > 0:27:21or a 3D print-out of someone's head, to fool it.

0:27:21 > 0:27:25But the most advanced system on display today isn't 3D.

0:27:25 > 0:27:26It's actually 4D.

0:27:27 > 0:27:31This system can process my reactions to a series of YouTube clips,

0:27:31 > 0:27:34in real-time, guessing what I'm feeling.

0:27:36 > 0:27:40These kinds of technologies aren't ready to leave the lab quite yet,

0:27:40 > 0:27:43but this is how robots could see us in the future,

0:27:43 > 0:27:46and identify what we are thinking.

0:27:46 > 0:27:48Whoa!

0:27:48 > 0:27:52And you can't help but imagine what else might be on the horizon.

0:27:54 > 0:27:57Can Big Data predict the future?

0:27:57 > 0:27:59This may seem a little far-fetched,

0:27:59 > 0:28:01but in many ways it's already happening.

0:28:01 > 0:28:05Police forces in the UK are trialling the use of data,

0:28:05 > 0:28:08like weather forecasts and records of break-ins,

0:28:08 > 0:28:11to predict where the next crimes might happen.

0:28:11 > 0:28:14And online retailers are planning to pre-package our goods

0:28:14 > 0:28:16before we've even ordered them.

0:28:16 > 0:28:19Whether we like it or not, Big Data is here,

0:28:19 > 0:28:21and it's going to change our world

0:28:21 > 0:28:23in ways we could never have imagined.

0:28:25 > 0:28:29Next week on Bang Goes the Theory, we look at the science of ageing.

0:28:29 > 0:28:32And we'll be joined by Sir Terry Wogan.

0:28:32 > 0:28:35Meanwhile, if you fancy working in Big Data,

0:28:35 > 0:28:37check out our careers guide at bbc.co.uk/bang.

0:28:40 > 0:28:43And for information on keeping your data secure,

0:28:43 > 0:28:46follow the links to the Open University website,

0:28:46 > 0:28:48and play their interactive privacy game.