Unlocking the Code

Transcript

0:00:04 > 0:00:07Inside every living thing

0:00:07 > 0:00:10is the most incredible molecule in the universe.

0:00:10 > 0:00:13It's DNA.

0:00:17 > 0:00:21It holds the code to make every single one of us,

0:00:21 > 0:00:24and all other life on earth.

0:00:24 > 0:00:27It's simply wonderful.

0:00:27 > 0:00:32And in the last decade, our understanding of that genetic code

0:00:32 > 0:00:35has undergone nothing less than a revolution.

0:00:36 > 0:00:39We finally finished reading the human genome.

0:00:39 > 0:00:43We made a list consisting of every single one

0:00:43 > 0:00:47of the three billion units that make up human DNA.

0:00:47 > 0:00:51You could say that, after 13 years and billions of dollars,

0:00:51 > 0:00:54we had finally read the book of life.

0:00:55 > 0:00:58But how does that book of life actually work?

0:01:00 > 0:01:04How does this long list in our DNA make us unique?

0:01:07 > 0:01:10How does it influence what we look like?

0:01:10 > 0:01:13How smart we are?

0:01:13 > 0:01:17How long we live, and our ultimate fate?

0:01:19 > 0:01:23How does the genome make you, you?

0:01:39 > 0:01:43My name is Dr Adam Rutherford.

0:01:43 > 0:01:45After years in the lab as a geneticist,

0:01:45 > 0:01:50I'm now a journalist who writes about how biology shapes our lives.

0:01:53 > 0:01:58I believe that the defining science of the new century was born

0:01:58 > 0:02:00almost exactly ten years ago.

0:02:00 > 0:02:03In February 2001, a multi-billion dollar project,

0:02:03 > 0:02:07that had united thousands of scientists

0:02:07 > 0:02:11from across the world, finally published its first results.

0:02:11 > 0:02:14We had read our entire genetic code.

0:02:14 > 0:02:19Without a doubt, this is the most important, most wondrous map

0:02:19 > 0:02:21ever produced by humankind.

0:02:21 > 0:02:26Today, we are learning the language in which God created life.

0:02:28 > 0:02:33Whatever your religious views, that is a bold statement.

0:02:33 > 0:02:39I believe that this endeavour DID change science - and the world.

0:02:39 > 0:02:42But maybe not quite as we thought it would.

0:02:42 > 0:02:44So ten years on, we have to ask ourselves,

0:02:44 > 0:02:46just how far have we come?

0:02:46 > 0:02:51How much does decoding our genetic make-up tell us about being human?

0:02:56 > 0:03:00The human genome is the total of our hereditary information,

0:03:00 > 0:03:04the complete list of every single one

0:03:04 > 0:03:07of the three billion bases in our DNA.

0:03:07 > 0:03:12Those bases are the chemical rungs inside the double helix.

0:03:12 > 0:03:14There are four different kinds.

0:03:14 > 0:03:18A for Adenine, T for Thymine,

0:03:18 > 0:03:22C for Cytosine, and G for Guanine.

0:03:23 > 0:03:29In 2001, we finally decoded the entire list for an average human.

0:03:29 > 0:03:31And that became the reference

0:03:31 > 0:03:34against which others could now be compared.

0:03:34 > 0:03:36It ushered in a new era,

0:03:36 > 0:03:41where the previously unimaginable was now quite easily possible.

0:03:41 > 0:03:45For instance, we can now routinely dig into the very heart of DNA,

0:03:45 > 0:03:48pretty well in the comfort of our own homes.

0:03:53 > 0:03:56This is Hugh Rienhoff.

0:03:56 > 0:03:58- Hi, how are you? - How's it going? Nice to see you.

0:03:58 > 0:04:03His youngest daughter Bea was born in 2003.

0:04:03 > 0:04:04At what point did you figure out

0:04:04 > 0:04:09there was something not exactly right with your daughter?

0:04:09 > 0:04:10The minute she was born,

0:04:10 > 0:04:16when Bea was taken out of the womb by caesarean section

0:04:16 > 0:04:18and I saw that she had very long feet

0:04:18 > 0:04:22and she had contracted fingers

0:04:22 > 0:04:25and she also had a port wine stain on her face.

0:04:27 > 0:04:31Bea also had long, floppy legs, poor muscle co-ordination,

0:04:31 > 0:04:35poor growth, and her eyes were set unusually far apart.

0:04:39 > 0:04:41As a clinical geneticist,

0:04:41 > 0:04:46Rienhoff figured his daughter's unique symptoms had a genetic cause.

0:04:50 > 0:04:53The doctors couldn't work out what it was,

0:04:53 > 0:04:57so, in a makeshift laboratory in their home in San Francisco,

0:04:57 > 0:05:00he started to rifle through her genome,

0:05:00 > 0:05:02her entire genetic code.

0:05:04 > 0:05:08And because of the advances made in the human genome project,

0:05:08 > 0:05:11the technology to do this is now commonplace.

0:05:13 > 0:05:16This is the DNA kit which you can buy

0:05:16 > 0:05:20and in there are the solutions that allow me

0:05:20 > 0:05:23to purify all the things away from the DNA.

0:05:23 > 0:05:29So, at the end of the day, when I add alcohol, just grain alcohol,

0:05:29 > 0:05:31it causes the DNA to come out of solution

0:05:31 > 0:05:34and it looks like a white piece of cotton,

0:05:34 > 0:05:37which is just floating in a clear liquid.

0:05:41 > 0:05:46But the next stages were totally unimaginable, even ten years ago.

0:05:48 > 0:05:52He isolated specific stretches of Bea's DNA,

0:05:52 > 0:05:57copied them and then sent them off to a commercial lab to be read.

0:06:02 > 0:06:09So what I've done, Bea Bea, is I've taken that piece of DNA from you,

0:06:09 > 0:06:15and I'm looking for instances where your DNA sequence

0:06:15 > 0:06:22does not match the sequences in the reference genomes.
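The comparison described here, checking a sample sequence against a reference and noting where they differ, can be sketched in a few lines of Python. This is only an illustration: the function name `find_mismatches` and both sequences are made up, and the sketch assumes the two sequences are already lined up and of equal length, which real sequence data would not guarantee.

```python
def find_mismatches(reference, sample):
    """Return (position, reference_base, sample_base) for every difference.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical sequences, invented purely for illustration.
reference = "ATCGGATTAC"
sample = "ATCGCATTAC"
print(find_mismatches(reference, sample))  # [(4, 'G', 'C')]
```

Each reported position is a candidate variant; deciding whether it matters is the hard part, which this sketch does not attempt.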

0:06:22 > 0:06:27But, using this new technology, and the dogged persistence of a parent,

0:06:27 > 0:06:29Rienhoff located sections of code

0:06:29 > 0:06:32that might be the key to Bea's condition.

0:06:32 > 0:06:39We found one gene that was clearly not being made properly in Bea.

0:06:39 > 0:06:42And one of them is involved in muscle development.

0:06:43 > 0:06:44Excellent.

0:06:44 > 0:06:48Rienhoff is confident that he may have found a direct link

0:06:48 > 0:06:51between his daughter's DNA and her condition.

0:06:51 > 0:06:55# A, B, C, D, E, F, G... #

0:06:55 > 0:07:01It's not a cure, but it's, without a doubt, a huge insight.

0:07:02 > 0:07:05There is some comfort in knowing exactly what's wrong,

0:07:05 > 0:07:07even if you can't do anything,

0:07:07 > 0:07:10even if you don't know what to expect in the future,

0:07:10 > 0:07:12it's still nice to know what's wrong.

0:07:14 > 0:07:18Just think for a moment what Hugh Rienhoff has achieved.

0:07:18 > 0:07:21It's truly impressive.

0:07:21 > 0:07:26Working alone, without the support of a university or a hospital,

0:07:26 > 0:07:29he personally decoded the DNA that caused his daughter's

0:07:29 > 0:07:32unique and unknown medical condition.

0:07:34 > 0:07:38This is where genetics has taken us.

0:07:45 > 0:07:49But to fully understand how far we have come,

0:07:49 > 0:07:54we need to go back 50 years to the dawn of modern genetics.

0:07:55 > 0:07:58Back then we had just worked out

0:07:58 > 0:08:04that the mechanism of inheritance, and of life itself, lay in DNA.

0:08:04 > 0:08:10But what DNA was and how it actually worked was still a mystery.

0:08:10 > 0:08:15The long journey to unravelling just how our DNA made us who we are,

0:08:15 > 0:08:18and also how it could go wrong,

0:08:18 > 0:08:21began one summer morning nearly 50 years ago.

0:08:25 > 0:08:28On the 8th of July, 1953,

0:08:28 > 0:08:32an envelope arrived at an office in Cambridge University.

0:08:32 > 0:08:36In it was a letter from America and it was addressed to Francis Crick,

0:08:36 > 0:08:39who just three months earlier, along with his colleague Jim Watson,

0:08:39 > 0:08:43had discovered that the DNA molecule was shaped like a twisted ladder,

0:08:43 > 0:08:45the famous double helix.

0:08:48 > 0:08:51The letter addressed a question

0:08:51 > 0:08:54that Crick and Watson had been unable to answer.

0:08:54 > 0:08:57How does the DNA code work?

0:08:59 > 0:09:02Strangely, the letter wasn't written by a biologist,

0:09:02 > 0:09:06but by a physicist, called George Gamow, better known

0:09:06 > 0:09:10for his theories on radioactivity and the Big Bang.

0:09:12 > 0:09:15The letter was riddled with spelling mistakes and errors,

0:09:15 > 0:09:17but it did contain an original insight,

0:09:17 > 0:09:20something that the biologists had not yet considered.

0:09:20 > 0:09:24Gamow was looking past what had captivated everyone

0:09:24 > 0:09:28about Crick and Watson's discovery, which was its famous twisted shape.

0:09:28 > 0:09:31Instead, he was looking INSIDE the double helix

0:09:31 > 0:09:33at the rungs of the ladder.

0:09:35 > 0:09:40Gamow saw information, where others just saw a twisted molecule.

0:09:40 > 0:09:43He became fascinated by the four different molecules

0:09:43 > 0:09:45that made the rungs of the spiral -

0:09:45 > 0:09:48A, T, C and G -

0:09:48 > 0:09:51and the patterns that they formed.

0:09:51 > 0:09:53He guessed that the way DNA worked

0:09:53 > 0:09:56was through a hidden code in the patterns

0:09:56 > 0:10:00that these four different chemicals made inside the DNA spiral.

0:10:00 > 0:10:05He was suggesting an entire cryptic language hidden in the DNA molecule.

0:10:07 > 0:10:11Francis Crick himself said the importance of Gamow's work

0:10:11 > 0:10:14was that it was an abstract theory of coding,

0:10:14 > 0:10:17and was uncluttered by unnecessary chemical details.

0:10:17 > 0:10:21Which is a polite way of saying his biology was terrible,

0:10:21 > 0:10:23but his insight was piercing.

0:10:23 > 0:10:26Within a year, Crick, Watson, Gamow

0:10:26 > 0:10:30and a handful of the most brilliant scientists of their generation

0:10:30 > 0:10:33had formed a gang to try and decipher the code,

0:10:33 > 0:10:34to try and understand

0:10:34 > 0:10:38how the letters are rendered into flesh and blood.

0:10:40 > 0:10:42Scientists already knew

0:10:42 > 0:10:46that chemicals called proteins make living tissue.

0:10:46 > 0:10:51All our body's organs - muscles, skin, heart and brain -

0:10:51 > 0:10:54are all made of, or by, proteins.

0:10:57 > 0:11:00And proteins themselves are made of

0:11:00 > 0:11:03smaller building blocks called amino acids.

0:11:05 > 0:11:09And, although there are millions of proteins, it only takes combinations

0:11:09 > 0:11:14of just 20 amino acids to make every protein.

0:11:15 > 0:11:18Think of it like a set of plastic bricks

0:11:18 > 0:11:21in which there are only 20 different types of brick.

0:11:21 > 0:11:24Each amino acid is represented by a different brick,

0:11:24 > 0:11:26and, just like amino acids,

0:11:26 > 0:11:28the bricks can be different shapes and sizes.

0:11:28 > 0:11:30In order to make a protein, all you have to do

0:11:30 > 0:11:32is build a length of different bricks.

0:11:37 > 0:11:38But how did the DNA molecule,

0:11:38 > 0:11:43with the secret code Gamow suspected was there,

0:11:43 > 0:11:47actually make the proteins that make up our body?

0:11:47 > 0:11:49Well, it was obvious that the DNA would have to

0:11:49 > 0:11:53encode the amino acids, the building blocks of those proteins.

0:11:57 > 0:12:01But discovering just how DNA could make particular amino acids

0:12:01 > 0:12:04wouldn't come for another eight years,

0:12:04 > 0:12:08until 1961, in Washington DC.

0:12:10 > 0:12:12Two young and unknown scientists,

0:12:12 > 0:12:15Marshall Nirenberg and Heinrich Matthaei,

0:12:15 > 0:12:19believed they had figured out something that no-one else had -

0:12:19 > 0:12:23how to find out which particular letters in DNA

0:12:23 > 0:12:27encode which amino acids that make which proteins.

0:12:32 > 0:12:36They laboriously tested combination after combination

0:12:36 > 0:12:40of amino acids, proteins and pieces of DNA.

0:12:40 > 0:12:45If they got the combination right, they would begin to reveal exactly

0:12:45 > 0:12:48how the code in DNA actually worked.

0:12:49 > 0:12:53Weeks passed with no end in sight.

0:12:53 > 0:12:58Then, late one Saturday night, they had a go at an untried combination.

0:12:58 > 0:13:03They put together a stretch of code that effectively spelt out only Ts.

0:13:03 > 0:13:07And they discovered that this particular stretch of code

0:13:07 > 0:13:11made only one particular amino acid, phenylalanine.

0:13:11 > 0:13:15It was the Rosetta Stone moment.

0:13:15 > 0:13:18Nirenberg and Matthaei had cracked it.

0:13:18 > 0:13:21They had shown that a string of Ts in the genetic code

0:13:21 > 0:13:25was an instruction for the cell to go and get some phenylalanine,

0:13:25 > 0:13:27and string it together into a protein.

0:13:27 > 0:13:30And in doing so, they had taken the first step

0:13:30 > 0:13:32in deciphering the genetic code.

0:13:32 > 0:13:37They had translated the first word in the secret language of our genes.
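The idea of a stretch of Ts acting as an instruction for phenylalanine can be sketched as a simple lookup. The sketch below assumes the standard fact that the code is read three letters at a time, which the programme does not spell out at this point. The table is deliberately partial (six of the 64 triplets), and the names `CODON_TO_AMINO_ACID` and `translate` are chosen here for illustration.

```python
# Partial table of DNA triplets and the amino acids they stand for.
# Only a handful of the 64 possible triplets are listed, for illustration.
CODON_TO_AMINO_ACID = {
    "TTT": "Phe",  # phenylalanine, the first "word" to be translated
    "TTC": "Phe",
    "AAA": "Lys",
    "CCC": "Pro",
    "GGG": "Gly",
    "ATG": "Met",
}


def translate(dna):
    """Read dna three letters at a time and look up each triplet.

    Triplets missing from the partial table are reported as "?".
    Any leftover one or two letters at the end are ignored.
    """
    amino_acids = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        amino_acids.append(CODON_TO_AMINO_ACID.get(codon, "?"))
    return amino_acids


print(translate("TTTTTTTTTTTT"))  # ['Phe', 'Phe', 'Phe', 'Phe']
```

A string of nothing but Ts yields nothing but phenylalanine, which is the result described above.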

0:13:44 > 0:13:45But what no-one knew

0:13:45 > 0:13:50was how DNA could make each of us so very different.

0:13:50 > 0:13:55What bits of our DNA made us tall or blue-eyed, asthmatic or diabetic.

0:13:55 > 0:14:00Could we isolate bits of DNA that make particular proteins,

0:14:00 > 0:14:03and give us particular features and qualities

0:14:03 > 0:14:05that we can all easily see?

0:14:06 > 0:14:11To begin to understand that, there's another idea I need to explain.

0:14:11 > 0:14:13If I told you that this family

0:14:13 > 0:14:16have something in common on a genetic level,

0:14:16 > 0:14:20you'd probably pretty quickly guess what it is.

0:14:20 > 0:14:22They all have a ginger gene.

0:14:22 > 0:14:24So what is a gene?

0:14:28 > 0:14:31Well, a gene is a unit of inheritance.

0:14:31 > 0:14:35Physically, it is a small section of your DNA that influences a trait,

0:14:35 > 0:14:39like gingerness, or eye colour, or even ear waxiness!

0:14:39 > 0:14:44Genes spell out in our DNA the precise nature of a protein.

0:14:44 > 0:14:46So this family has a pigmentation gene

0:14:46 > 0:14:50that encodes a protein that makes their hair ginger.

0:14:50 > 0:14:53The code in that gene is obviously different

0:14:53 > 0:14:56from the code of people who are, for example, blonde.

0:14:56 > 0:15:00And this difference in code is called a variant.

0:15:00 > 0:15:04But which bits of our DNA molecule hold the bits of code

0:15:04 > 0:15:07that give us ginger or blonde hair?

0:15:07 > 0:15:12Finding these individual genes was an incredible challenge.

0:15:12 > 0:15:15Our DNA molecules are both immensely long -

0:15:15 > 0:15:20containing over three billion letters of code - and they're microscopic.

0:15:20 > 0:15:24But what if those bits of code can give you not ginger hair

0:15:24 > 0:15:26but a devastating disease?

0:15:26 > 0:15:31Then, tracking them down is obviously hugely important.

0:15:31 > 0:15:35And it was this that was the next challenge geneticists faced.

0:15:37 > 0:15:43In the 1980s in Britain, the search to link diseases to specific genes

0:15:43 > 0:15:47was led by Professor Kay Davies, a young ambitious researcher.

0:15:49 > 0:15:53She focused on a degenerative disease,

0:15:53 > 0:15:57Duchenne Muscular Dystrophy, or DMD, that affected boys.

0:15:57 > 0:15:59Duchenne Muscular Dystrophy

0:15:59 > 0:16:02is a progressive muscle-wasting disease,

0:16:02 > 0:16:06so these boys tend to have difficulty walking and climbing up stairs,

0:16:06 > 0:16:07about the age of four or five.

0:16:07 > 0:16:11They generally go into a wheelchair about the age of 12.

0:16:11 > 0:16:14Many of them would be dead by 20.

0:16:14 > 0:16:18We knew, for example, that Duchenne Muscular Dystrophy was a muscle gene.

0:16:18 > 0:16:21We had no idea what it did. There were all sorts of theories,

0:16:21 > 0:16:24but it was impossible because there are thousands of genes

0:16:24 > 0:16:27expressed in muscles to decide which was the one that was mutated.

0:16:30 > 0:16:32But how could she trace the suspect gene?

0:16:33 > 0:16:35The technology of the time meant

0:16:35 > 0:16:39she couldn't easily read the genetic code directly,

0:16:39 > 0:16:44but she could look at huge stretches of the DNA, called chromosomes.

0:16:44 > 0:16:47Chromosomes are lengths of bunched-up DNA,

0:16:47 > 0:16:51hundreds of millions of base pairs long.

0:16:51 > 0:16:54We humans have 23 pairs of chromosomes -

0:16:54 > 0:16:57one of each pair from each of our parents.

0:17:00 > 0:17:04With Duchenne Muscular Dystrophy, Professor Davies had a crucial clue

0:17:04 > 0:17:06to help her find the abnormal variant.

0:17:09 > 0:17:12She knew the disease only affected boys.

0:17:12 > 0:17:15This meant she could trace the genetic fault

0:17:15 > 0:17:18to one of the chromosomes relating to sex,

0:17:18 > 0:17:21in this case, the X chromosome.

0:17:23 > 0:17:27So her first step was to collect a bank of X chromosomes

0:17:27 > 0:17:30from families with a history of the disease.

0:17:30 > 0:17:34We had to purify people's X chromosomes,

0:17:34 > 0:17:36and then we could amplify the material

0:17:36 > 0:17:39and put it in bacteria and grow it and look at it.

0:17:39 > 0:17:41And we'd never been able to do that before.

0:17:45 > 0:17:51Davies chemically chopped these chromosomes into small chunks.

0:17:51 > 0:17:55She could now start to search through the DNA of affected families

0:17:55 > 0:17:58for variations in their genetic code.

0:17:58 > 0:18:03To do this, she made a family tree for each affected family,

0:18:03 > 0:18:06showing the patterns of DMD

0:18:06 > 0:18:08inherited through the generations.

0:18:10 > 0:18:11Then she compared that tree

0:18:11 > 0:18:14with a tree showing patterns of inheritance

0:18:14 > 0:18:17from the pieces of X chromosome she had collected.

0:18:19 > 0:18:23When one of those trees matched the tree showing the DMD inheritance,

0:18:23 > 0:18:26she knew the piece of DNA she was looking at

0:18:26 > 0:18:29was very close to the gene responsible

0:18:29 > 0:18:31for Duchenne Muscular Dystrophy.

0:18:32 > 0:18:37It was just a case, which was a challenge still, of then homing in.

0:18:37 > 0:18:40We knew it was in that five million base pairs of DNA

0:18:40 > 0:18:42and all we had to do was find the gene.

0:18:42 > 0:18:47That painstaking search took several groups over ten years,

0:18:47 > 0:18:52but finally the gene responsible for DMD was located.

0:18:52 > 0:18:56So the eureka moment was when we found where the gene was.

0:18:56 > 0:18:59We knew then we could develop prenatal diagnosis for the disease

0:18:59 > 0:19:01which hadn't been available up to that point,

0:19:01 > 0:19:03so it was a very exciting time.

0:19:05 > 0:19:09It was the first time genetics had made a serious clinical impact.

0:19:09 > 0:19:15We could now diagnose a crippling genetic disease in an unborn child.

0:19:16 > 0:19:18We did, for example, diagnosis in a family

0:19:18 > 0:19:22where a particular mother had had a couple of abortions

0:19:22 > 0:19:25because she didn't want an affected male,

0:19:25 > 0:19:28and that was quite frequent in DMD families

0:19:28 > 0:19:31because it's such a distressing disease,

0:19:31 > 0:19:33and then we were able to do a diagnosis.

0:19:33 > 0:19:38We could predict whether the foetus was affected, and there were twins.

0:19:38 > 0:19:41I remember it very well because the diagnosis came back

0:19:41 > 0:19:44that she was going to have two normal twins.

0:19:44 > 0:19:48In fact, one of the twins was a boy and the other a girl.

0:19:48 > 0:19:50So this lady then had an instant family.

0:19:50 > 0:19:53These two twins were born, obviously normal,

0:19:53 > 0:19:55and the female was not a carrier.

0:19:55 > 0:19:58So that was just a wonderful story.

0:20:00 > 0:20:03Professor Davies' discovery of the gene variant linked to

0:20:03 > 0:20:07Duchenne Muscular Dystrophy was a genuine landmark.

0:20:10 > 0:20:15Years of genetic research finally had a real effect on people

0:20:15 > 0:20:19and it fired the starting gun for the race to understand other

0:20:19 > 0:20:26brutal genetic diseases, like Cystic Fibrosis and Huntington's Disease.

0:20:31 > 0:20:35But being diagnosed with a genetic disease isn't necessarily

0:20:35 > 0:20:37the easiest thing to take,

0:20:37 > 0:20:42because understanding its causes is not a cure.

0:20:42 > 0:20:46Charles Sabine was a war correspondent for NBC,

0:20:46 > 0:20:50working in Afghanistan, Iraq and Kuwait.

0:20:50 > 0:20:52EXPLOSION

0:20:52 > 0:20:56Then, in 2003, he was told he had the faulty gene

0:20:56 > 0:20:59which causes Huntington's disease.

0:20:59 > 0:21:06I had never, in all the experiences that I had been through,

0:21:06 > 0:21:10from being taken, captured, by Mujahedin guerrillas

0:21:10 > 0:21:12and had a grenade held to my head...

0:21:14 > 0:21:18None of those experiences scared me as much as Huntington's disease,

0:21:18 > 0:21:21because of the finality, the terrible finality of the disease.

0:21:21 > 0:21:25This disease takes away your dignity

0:21:25 > 0:21:29and, right now, it has a complete vacuum of hope.

0:21:29 > 0:21:34So that is what makes it so impossible to deal with.

0:21:34 > 0:21:38Huntington's is a genetic disease that attacks the brain,

0:21:38 > 0:21:42and, in all cases, leads to mental and physical decline,

0:21:42 > 0:21:45and, then, without exception, death.

0:21:45 > 0:21:50What I experienced was this sudden feeling,

0:21:50 > 0:21:54first of all, of lack of control of any aspect of my life

0:21:54 > 0:21:59because suddenly it was not me that was determining

0:21:59 > 0:22:02the way my life was going to go,

0:22:02 > 0:22:06but by 50/50 chance was going to be determined by this gene inside me

0:22:06 > 0:22:07that I had no control of.

0:22:09 > 0:22:14In the 1980s, using techniques like those developed by Kay Davies,

0:22:14 > 0:22:16scientists finally located the gene

0:22:16 > 0:22:19responsible for this devastating disease.

0:22:19 > 0:22:22We found the Huntington's gene on chromosome four.

0:22:22 > 0:22:26That revolutionised Huntington's, because you could tell,

0:22:26 > 0:22:30in instances where those individuals wish to know the information,

0:22:30 > 0:22:33you could tell them whether they were going to be affected,

0:22:33 > 0:22:35but more so, you could protect them, if they wanted,

0:22:35 > 0:22:38against having affected children in the future.

0:22:38 > 0:22:40That was a huge breakthrough.

0:22:42 > 0:22:45And although Charles may not be cured,

0:22:45 > 0:22:48because of this breakthrough and genetic screening,

0:22:48 > 0:22:52Sabine knows his daughter will never have to live through

0:22:52 > 0:22:54this horrific disease.

0:22:54 > 0:22:58Her existence and the fact that she does not have the gene

0:22:58 > 0:23:01for Huntington's disease gives me probably more joy

0:23:01 > 0:23:03than anything in the world.

0:23:03 > 0:23:08The success of genetic screening made the '80s a crucial time

0:23:08 > 0:23:10in our story of the genome.

0:23:12 > 0:23:17But all the diseases isolated in the '80s have one thing in common.

0:23:17 > 0:23:21They're all caused by just one gene - they are monogenic.

0:23:32 > 0:23:35But monogenic diseases are unusual...

0:23:37 > 0:23:40..because most diseases, and indeed most human traits,

0:23:40 > 0:23:44are not simply linked to a single gene,

0:23:44 > 0:23:49but to many, sometimes dozens of genes.

0:23:52 > 0:23:53Just take height.

0:23:58 > 0:24:03You, quite clearly, are the tallest, so stand over this side here.

0:24:03 > 0:24:05You, come in this gap here...

0:24:05 > 0:24:11'At 5ft 10in, I am rather boringly an inch over the national average.'

0:24:11 > 0:24:13But there is a large range around that mean.

0:24:16 > 0:24:18So what determines how tall you are?

0:24:18 > 0:24:22So if you think about height, it seems quite obvious

0:24:22 > 0:24:26that height has an inherited component, and that means genes.

0:24:26 > 0:24:30Tall parents tend to give birth to tall children.

0:24:30 > 0:24:34But when we began to look comprehensively in the genome

0:24:34 > 0:24:38for the genes which affect height, we found dozens of them.

0:24:38 > 0:24:41Height is what's known as polygenic.

0:24:41 > 0:24:44It's influenced by many genes.

0:24:49 > 0:24:53Even though it's one measurement to us,

0:24:53 > 0:24:55it's actually a mishmash of loads of components -

0:24:55 > 0:24:59bone lengths, muscle growth, nutrition, and so on -

0:24:59 > 0:25:02all combining into how tall you are.

0:25:02 > 0:25:05And that would make the genetics very murky.

0:25:08 > 0:25:13So, to understand polygenic diseases and traits, we'd have to link

0:25:13 > 0:25:17each trait with every single possible influencing gene.

0:25:17 > 0:25:20That would be a massively difficult thing to do

0:25:20 > 0:25:21because, to find each gene,

0:25:21 > 0:25:26we'd have to read and know more of our DNA sequence than ever before.

0:25:27 > 0:25:31By the late '70s, a new invention was being developed

0:25:31 > 0:25:34that would pave the way to unpack the whole genome

0:25:34 > 0:25:39and ultimately read every single one of the three billion bases in it.

0:25:40 > 0:25:44It's time to meet the man who cracked it, who finally figured out

0:25:44 > 0:25:48how to read every single letter of any DNA molecule.

0:25:48 > 0:25:52He was born in a small Gloucestershire village in 1918

0:25:52 > 0:25:54and his name was Fred Sanger.

0:25:57 > 0:26:00Sanger was a quiet, unassuming man

0:26:00 > 0:26:03who spent the Second World War studying in Cambridge,

0:26:03 > 0:26:08and there began his lifelong love for unpicking the molecules of life.

0:26:11 > 0:26:14Fred Sanger's first great achievement was to discover

0:26:14 > 0:26:17the chemical structure of insulin.

0:26:17 > 0:26:21For that, he got a Nobel Prize in 1958.

0:26:23 > 0:26:27That's impressive enough, but winning TWO Nobel Prizes?

0:26:27 > 0:26:29Well, that's just showing off.

0:26:29 > 0:26:33In 1977, Fred Sanger invented a technique which earned him

0:26:33 > 0:26:37his second Nobel Prize, and for which he'll always be remembered.

0:26:37 > 0:26:40Officially, it goes by the rather sinister title

0:26:40 > 0:26:42of the Chain Termination Method.

0:26:42 > 0:26:44But as a tribute, in the business,

0:26:44 > 0:26:47it's better known as Sanger Sequencing.

0:26:49 > 0:26:52So how does it work?

0:26:52 > 0:26:56In us, our genomes are more than three billion letters long.

0:26:56 > 0:27:00But for purposes of simplicity, I'm going to sequence a gene

0:27:00 > 0:27:02of just six letters.

0:27:04 > 0:27:07The problem is, what with DNA being so small,

0:27:07 > 0:27:09is that we can't read it directly.

0:27:09 > 0:27:13In other words, we can't see what the letters are.

0:27:13 > 0:27:15So we need an indirect way of reading the cards,

0:27:15 > 0:27:20and this is where Sanger's cunning technique comes into its own.

0:27:20 > 0:27:24First, he got the DNA to start copying itself

0:27:24 > 0:27:26into shorter fragments.

0:27:28 > 0:27:30And here's the cunning bit.

0:27:30 > 0:27:34So essentially, Sanger's technique is a chemical trick that allows you

0:27:34 > 0:27:39to read just one card in your shortened fragment of DNA,

0:27:39 > 0:27:41and that's the end card.

0:27:43 > 0:27:46So what good does that do, you may very well ask?

0:27:46 > 0:27:49How does knowing the end card in a shortened fragment

0:27:49 > 0:27:52help you read the entire sequence of your original DNA?

0:27:52 > 0:27:54Well, the answer is it's a numbers game.

0:27:56 > 0:28:01Sanger got the original DNA to replicate itself

0:28:01 > 0:28:05millions of times at every possible length.

0:28:05 > 0:28:08Now, there was an end letter,

0:28:08 > 0:28:14a letter he could read, at every possible position in the sequence.

0:28:14 > 0:28:19So you end up with a mix containing fragments of your original DNA

0:28:19 > 0:28:24that terminate at every single position along the sequence.

0:28:24 > 0:28:25So the final step

0:28:25 > 0:28:30is that you read along the rows. A...A.

0:28:30 > 0:28:33T, along the line...it's a T.

0:28:33 > 0:28:37C, all the way along, it's a C.

0:28:37 > 0:28:44T, T, A, A and G.

0:28:44 > 0:28:48And bingo! There is your DNA sequence.
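The card demonstration can be mimicked in code: make a fragment ending at every possible position, jumble them, then sort them by length and read only the last letter of each. This is a simplified sketch of the logic only, not of the chemistry. The six-letter sequence and the function names are hypothetical, and it assumes every fragment starts at the same point and that exactly one fragment of each length is present.

```python
import random


def chain_termination_fragments(sequence):
    """Return one fragment ending at every possible position."""
    return [sequence[:end] for end in range(1, len(sequence) + 1)]


def read_sequence(fragments):
    """Sort fragments shortest first, then read only each end letter."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


original = "ATCTAG"  # hypothetical six-letter "gene"
fragments = chain_termination_fragments(original)
random.shuffle(fragments)  # the mix of fragments starts out in no order
print(read_sequence(fragments))  # ATCTAG
```

Sorting by length plays the part of laying the cards out in rows: the shortest fragment gives the first letter, the next shortest gives the second, and so on.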

0:28:49 > 0:28:53In real life, the results of sequencing look something like this.

0:28:53 > 0:28:58Fans of forensic detective shows will recognise this.

0:28:58 > 0:28:59It's a sequencing gel.

0:28:59 > 0:29:06It's in four columns, one for each letter, A, T, C and G.

0:29:06 > 0:29:08And by reading from the bottom upwards,

0:29:08 > 0:29:11you can see that the actual sequence is...

0:29:11 > 0:29:12A...

0:29:14 > 0:29:15A...

0:29:16 > 0:29:18T...

0:29:19 > 0:29:20A...

0:29:21 > 0:29:23..C, and so on.

0:29:23 > 0:29:28When Sanger and his colleagues first came up with this technique

0:29:28 > 0:29:31in the 1970s, it was manual and a painstaking slog.

0:29:31 > 0:29:35Nowadays, the process has developed and is fully automated.

0:29:35 > 0:29:38At a fraction of the cost, now in a matter of weeks,

0:29:38 > 0:29:41we can sequence billions of letters of DNA.

0:29:41 > 0:29:45But the basic technique is still that of Fred Sanger.

0:29:47 > 0:29:50Over the next 30 years, as technology grew in sophistication,

0:29:50 > 0:29:55the few thousand bases scientists could sequence grew to millions,

0:29:55 > 0:29:59and in the '90s, the awesome potential of Sanger's technique

0:29:59 > 0:30:01could finally be realised.

0:30:01 > 0:30:04And then, we set our sights on what I think is

0:30:04 > 0:30:08the most ambitious scientific project of all time -

0:30:08 > 0:30:11sequencing the entire human genome.

0:30:15 > 0:30:18Upscaling Sanger's sequencing system for the human genome

0:30:18 > 0:30:21was a colossal task.

0:30:22 > 0:30:27A truly global collaboration that took over a decade...

0:30:28 > 0:30:32..thousands of scientists and billions of dollars.

0:30:34 > 0:30:39But in February 2001, the first results of all that work and money

0:30:39 > 0:30:41hit the news stands.

0:30:43 > 0:30:47So in February 2001, I was sitting in the lab doing my PhD,

0:30:47 > 0:30:51about a mile in that direction, at Great Ormond Street Hospital,

0:30:51 > 0:30:55and the copy of Nature and the copy of Science landed on my desk,

0:30:55 > 0:30:59announcing that the human genome sequence was completed.

0:30:59 > 0:31:02There was a big, grandstanding announcement saying,

0:31:02 > 0:31:05"We've done it, we've sequenced the human genome,

0:31:05 > 0:31:08"we've read the book of life." Great big phrases like that.

0:31:08 > 0:31:12It will revolutionise the diagnosis, prevention and treatment

0:31:12 > 0:31:15of most, if not all, human diseases.

0:31:15 > 0:31:18In coming years, doctors increasingly will be able to cure

0:31:18 > 0:31:22diseases like Alzheimer's, Parkinson's, diabetes and cancer,

0:31:22 > 0:31:24by attacking their genetic roots.

0:31:27 > 0:31:30I have to admit that the President's words

0:31:30 > 0:31:32left many of us in the business uneasy.

0:31:34 > 0:31:37Just having the code still meant we were a long way from being able

0:31:37 > 0:31:40to do anything clinically useful with it.

0:31:42 > 0:31:45After all, reading the code is one thing,

0:31:45 > 0:31:48but understanding all of it is something else.

0:31:48 > 0:31:52In fact, as we started to look at the code and search for

0:31:52 > 0:31:56all of the genes that made us, we were in for a big shock.

0:32:03 > 0:32:07This is Dr Ewan Birney. At the tender age of 26,

0:32:07 > 0:32:12he was one of the lead researchers on the human genome project.

0:32:14 > 0:32:17With the human genome nearly decoded,

0:32:17 > 0:32:20the best brains in the genetics world were asking,

0:32:20 > 0:32:22how many genes does a human have?

0:32:25 > 0:32:30Certainly, the consensus feeling, I can remember being told,

0:32:30 > 0:32:34that it was somewhere between 50,000 and 100,000 genes

0:32:34 > 0:32:36that seemed to make sense to most people.

0:32:36 > 0:32:39What were these guys, who are the experts in their fields,

0:32:39 > 0:32:41the top geneticists in the world,

0:32:41 > 0:32:44where were they getting these numbers from?

0:32:44 > 0:32:48There was a kind of textbook, back-of-the-envelope calculation,

0:32:48 > 0:32:50where they took the average length

0:32:50 > 0:32:55of a human gene, on the bits of genomic sequence known at the time.

0:32:55 > 0:32:58It was 30,000 base pairs

0:32:58 > 0:33:02and they took the whole size of the human genome - three billion -

0:33:02 > 0:33:05divided one by the other and you get 100,000.
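The back-of-the-envelope estimate described above can be written out as a short calculation. This is only a sketch: the variable names are illustrative, and the figures are the rounded ones quoted in the programme.

```python
# Back-of-the-envelope gene count, using the rounded figures quoted above.
GENOME_SIZE_BP = 3_000_000_000   # roughly three billion base pairs
AVG_GENE_LENGTH_BP = 30_000      # average gene length assumed at the time

# Dividing one by the other implicitly treats the whole genome as genes.
estimated_genes = GENOME_SIZE_BP // AVG_GENE_LENGTH_BP
print(estimated_genes)  # 100000
```

Note that the division only works as an estimate if genes fill the entire genome end to end, which is the hidden assumption behind the 100,000 figure.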

0:33:07 > 0:33:09And by a strange quirk, we know exactly

0:33:09 > 0:33:14what the best brains in the genetics world actually believed back then,

0:33:14 > 0:33:18because Ewan Birney got them to put their money where their mouths were,

0:33:18 > 0:33:22and got them to bet on how many genes they thought we had.

0:33:22 > 0:33:26So I went round with a plastic beer thing and the book

0:33:26 > 0:33:31and I bumped into people and said, "Do you want to bet?"

0:33:32 > 0:33:36If ever you want to see evidence of brilliant scientists

0:33:36 > 0:33:38getting it really wrong, this is it.

0:33:38 > 0:33:41You're in there first. Ewan Birney, number...

0:33:41 > 0:33:46- 48,251.- And the next number down...

0:33:46 > 0:33:49It's John Quackenbush, one of the big, big betters,

0:33:49 > 0:33:54118,259.

0:33:54 > 0:33:55- Huge.- Huge.

0:33:55 > 0:33:58Absolutely huge, but kind of in the consensus.

0:34:00 > 0:34:05Then, in early 2001, using the new complete Human Genome,

0:34:05 > 0:34:10Ewan Birney was able to count the real number of genes in a human.

0:34:11 > 0:34:14So when we got to the publication -

0:34:14 > 0:34:17I can't actually remember the phrase we used.

0:34:17 > 0:34:23I think we said something like we can confidently identify 25,000 genes,

0:34:23 > 0:34:27and we believed that maybe up to 35,000 genes in the human genome,

0:34:27 > 0:34:30and that up to 35,000 was because

0:34:30 > 0:34:34people were frankly not happy about the smaller number.

0:34:35 > 0:34:39Within a few years, scientists agreed on a rough figure.

0:34:39 > 0:34:44They could only find around 24,000 genes in the human genome.

0:34:44 > 0:34:49By far the majority of the code in our DNA seemed to be just useless.

0:34:49 > 0:34:51It wasn't genes at all.

0:34:51 > 0:34:55What most scientists, in fact, called "junk DNA".

0:34:55 > 0:34:59Imagine that this building is your genome -

0:34:59 > 0:35:02three billion letters of DNA code.

0:35:02 > 0:35:05Now, this is the amount that makes up genes.

0:35:05 > 0:35:10So according to the classical genetics model, a tiny proportion,

0:35:10 > 0:35:14just two or three percent, make the proteins that make you,

0:35:14 > 0:35:17and the rest is darkness.

0:35:20 > 0:35:23This was a real shock.

0:35:23 > 0:35:3098% of our genome is not genes and doesn't code for proteins.

0:35:30 > 0:35:33There's an assumption in a lot of genomics

0:35:33 > 0:35:37that a lot of the DNA is just junk, it's garbage, it's rubbish.

0:35:37 > 0:35:40And I have to say, at first glance, that seems reasonable

0:35:40 > 0:35:42because a lot of it just doesn't produce anything.

0:35:42 > 0:35:44There are only about 24,000 genes

0:35:44 > 0:35:47that go to make a mammal, a human being, say,

0:35:47 > 0:35:49which is about the same number of bits you need

0:35:49 > 0:35:52to make a double-decker bus. It's not very many.

0:35:52 > 0:35:54I would like to think I'm more complicated than a bus

0:35:54 > 0:35:56and that is a surprise.

0:35:56 > 0:35:58And what it tells you is something very important.

0:35:58 > 0:36:01It's that we don't understand genetics at all.

0:36:01 > 0:36:05We're in a situation that we've got a lot of boxes labelled

0:36:05 > 0:36:07screws, washers, bulbs, and we don't even know

0:36:07 > 0:36:09how to put them together,

0:36:09 > 0:36:13let alone how to start the bus and drive it through the streets.

0:36:13 > 0:36:17But because we were looking for genes that cause disease,

0:36:17 > 0:36:20this low number had an unexpected upside.

0:36:20 > 0:36:23It meant fewer genes to study.

0:36:23 > 0:36:24Now, this was crucial,

0:36:24 > 0:36:29because at the time, sequencing DNA was still colossally expensive.

0:36:29 > 0:36:32So by narrowing down on just a small proportion of the genome,

0:36:32 > 0:36:36it meant that large-scale studies were financially realistic.

0:36:38 > 0:36:41And then scientists found something intriguing.

0:36:41 > 0:36:44As we started to compare people's whole genomes,

0:36:44 > 0:36:50we realised that everyone's DNA is almost identical.

0:36:51 > 0:36:54If you compare one human genome with another

0:36:54 > 0:36:57they would be identical at most positions,

0:36:57 > 0:37:00they differ at about one position in a thousand, on average.
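To put "one position in a thousand" into absolute terms, the rate can be applied to the three-billion-letter genome mentioned earlier in the programme. A minimal sketch, with illustrative variable names and rounded figures:

```python
# How many letters differ between two genomes at a rate of 1 in 1,000?
GENOME_SIZE_BP = 3_000_000_000    # roughly three billion DNA letters
POSITIONS_PER_DIFFERENCE = 1_000  # "about one position in a thousand"

expected_differences = GENOME_SIZE_BP // POSITIONS_PER_DIFFERENCE
print(expected_differences)  # 3000000
```

So "almost identical" still leaves room for around three million differing positions between any two people, on these rounded numbers.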

0:37:05 > 0:37:09Yet we know we are hugely different.

0:37:09 > 0:37:11We are all unique.

0:37:11 > 0:37:14So the challenge now was to find those relatively few

0:37:14 > 0:37:18individual differences in genes, genetic variants,

0:37:18 > 0:37:20that account for differences in people.

0:37:20 > 0:37:24And more specifically, to find the variants that cause disease.

0:37:25 > 0:37:29So in 2005, the Wellcome Trust, here in the UK,

0:37:29 > 0:37:33united many labs, by launching a huge survey

0:37:33 > 0:37:36to read half a million DNA letters,

0:37:36 > 0:37:39within known genes, for not just one,

0:37:39 > 0:37:42but thousands of ill and healthy people.

0:37:42 > 0:37:46The hope was that half a million DNA letters would be sufficient

0:37:46 > 0:37:51to identify the most significant common variants that link to disease.

0:38:03 > 0:38:06Professor Peter Donnelly was part of the team

0:38:06 > 0:38:09who actually crunched the massive amounts of data.

0:38:09 > 0:38:14Some ten billion pieces of genetic information were analysed,

0:38:14 > 0:38:19harvested from over 10,000 people, at a cost of over £9 million.

0:38:20 > 0:38:24The first experiment looked at seven illnesses

0:38:24 > 0:38:28that, like height, were linked to many genes.

0:38:28 > 0:38:31So here's an example from the large study we did initially

0:38:31 > 0:38:33and the paper we published.

0:38:33 > 0:38:38So this shows a row for each disease, and along each row we plot a measure

0:38:38 > 0:38:42of the difference for each of the 500,000 variants we measured

0:38:42 > 0:38:44between the sick people and healthy people.

0:38:44 > 0:38:47The graph shows a summary of those results.

0:38:47 > 0:38:50The half a million DNA letters are run from left to right,

0:38:50 > 0:38:54divided up from chromosomes 1 to 22 and the X chromosome.

0:38:54 > 0:38:58And when there is a noticeable difference

0:38:58 > 0:39:01in the letters between sick and healthy people,

0:39:01 > 0:39:03it shows up as a green peak.

0:39:03 > 0:39:06You're saying that the green ones

0:39:06 > 0:39:09are where a disease is associated with the genome?

0:39:09 > 0:39:13Yes, the green ones are the ones where there's a genetic variant

0:39:13 > 0:39:16which is considerably more common in the sick people

0:39:16 > 0:39:19than the healthy people, in a way which is associated with disease.
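The comparison described above, where a variant stands out when it is considerably more common in sick people than in healthy people, can be illustrated with a simple test of association. The programme does not say which statistic the study used, so the Pearson chi-square test below is a swapped-in textbook choice, and the allele counts are invented purely for illustration.

```python
def chi_square_2x2(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts.

    Rows are sick people (cases) and healthy people (controls);
    columns are the variant letter (alt) and the usual letter (ref).
    """
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_totals)

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the variant were unrelated to the disease.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts: the variant is somewhat more common in cases.
associated = chi_square_2x2(600, 1400, 500, 1500)

# Hypothetical counts: identical frequencies in both groups.
unassociated = chi_square_2x2(500, 1500, 500, 1500)

print(associated > unassociated)  # True
```

In a survey of this kind the same test would be repeated for each of the half a million variants, and the largest statistics would correspond to the peaks on the plot. Because so many tests are run, a real analysis would need a far stricter significance threshold than a single test would.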

0:39:19 > 0:39:22It was a huge breakthrough.

0:39:22 > 0:39:26Now, for the first time, it looked like we could find diseases

0:39:26 > 0:39:30that were caused by errors in more than just one gene in our DNA.

0:39:30 > 0:39:33I still remember the first time we sat down and had a serious look.

0:39:33 > 0:39:36It was an extraordinary moment, knowing it would deliver

0:39:36 > 0:39:39and we'd get some insights into the genetics of those common diseases.

0:39:39 > 0:39:40It was really exciting.

0:39:42 > 0:39:47That excitement was felt well beyond the scientific community.

0:39:50 > 0:39:54Good evening. British scientists unveiled a new era in medicine today,

0:39:54 > 0:39:57when they announced they'd finally unravelled a genetic link

0:39:57 > 0:39:59to seven major diseases,

0:39:59 > 0:40:03raising the prospect of predicting a child's medical future at birth.

0:40:03 > 0:40:07Well, it was an awesome breakthrough, but looking back,

0:40:07 > 0:40:11perhaps the media machine got a little ahead of itself,

0:40:11 > 0:40:15because what this genome survey actually says about the health

0:40:15 > 0:40:19of individuals, of real people, is, in fact, rather limited.

0:40:21 > 0:40:23Because what we have to remember

0:40:23 > 0:40:26is that even though our genes may indicate that we are susceptible

0:40:26 > 0:40:31to disease, it doesn't mean we will actually get that disease.

0:40:35 > 0:40:39Mark Hurst is a senior lecturer in human genetics.

0:40:40 > 0:40:43He has a strong family history of diabetes.

0:40:45 > 0:40:47This is my mum and dad.

0:40:47 > 0:40:50They got married just after the war.

0:40:50 > 0:40:55I was aware that my dad had diabetes. He'd test his sugar levels

0:40:55 > 0:41:00and he was on various drugs to try and control it.

0:41:00 > 0:41:03And it became obvious in the late '70s and '80s

0:41:03 > 0:41:05that there was a genetic component

0:41:05 > 0:41:08and so, as I was studying human genetics,

0:41:08 > 0:41:11I sort of followed it with some interest.

0:41:13 > 0:41:16The heritability of Type 2 diabetes,

0:41:16 > 0:41:19if you've got close relatives, is very high.

0:41:19 > 0:41:23Over the last 20 years, two of my sisters and one of my brothers

0:41:23 > 0:41:26have all developed Type 2 diabetes,

0:41:26 > 0:41:28so the genetic lottery says

0:41:28 > 0:41:32I may have some of the genes, I might not, I don't know.

0:41:34 > 0:41:39But Dr Hurst knows that even though he probably has variants in his DNA

0:41:39 > 0:41:42which mean he is likely to suffer from diabetes,

0:41:42 > 0:41:46it's far from certain he will actually get the disease,

0:41:46 > 0:41:49because other factors are important, too.

0:41:49 > 0:41:55There's a great...almost belief that you're a slave to your genes

0:41:55 > 0:41:59and I think for some of the monogenetic disorders,

0:41:59 > 0:42:02that probably is very much the case,

0:42:02 > 0:42:06but for most complex diseases, multigenic diseases,

0:42:06 > 0:42:09there's a large amount of environmental component,

0:42:09 > 0:42:12so you can control large amounts of your environment

0:42:12 > 0:42:14through things like exercise and diet.

0:42:14 > 0:42:18Recent studies suggest that, simplistically,

0:42:18 > 0:42:22diabetes is 70% genetic and 30% environmental.

0:42:22 > 0:42:25There's this environmental component which was related to

0:42:25 > 0:42:28your body weight, your waist size, your diet,

0:42:28 > 0:42:32and I decided I can't control my genes, but I can at least

0:42:32 > 0:42:35do something about the environment, so I started to run.

0:42:37 > 0:42:41Hurst runs three miles a day.

0:42:41 > 0:42:45This, he believes, has kept his diabetes at bay.

0:42:47 > 0:42:50I suspect I would have been quite a lot heavier...

0:42:52 > 0:42:55..and I think I would have probably developed

0:42:55 > 0:42:57the signs of early diabetes by now.

0:42:58 > 0:43:02Hurst's story reminds us that in most cases our traits and diseases

0:43:02 > 0:43:07spring from our environment as well as our genetic code.

0:43:13 > 0:43:18So now we have to ask ourselves a crucial question.

0:43:18 > 0:43:22How much of a disease is to do with our genes at all?

0:43:22 > 0:43:25It turns out we can make a guess at that

0:43:25 > 0:43:29from a more traditional kind of genetic experiment.

0:43:29 > 0:43:33The next questionnaire for you is a questionnaire about you.

0:43:33 > 0:43:36This is Dr Claire Haworth.

0:43:36 > 0:43:38And her unusual research tool?

0:43:38 > 0:43:40Twins.

0:43:48 > 0:43:52So why are twins so useful for any sort of genetic study?

0:43:52 > 0:43:55It's a fantastic natural experiment, twins.

0:43:55 > 0:43:59They provide the opportunity to investigate the roles

0:43:59 > 0:44:02of nature - genes - and nurture - the environment,

0:44:02 > 0:44:05so if a trait has a genetic influence

0:44:05 > 0:44:09then you'd expect identical twins who share more genes to be more similar

0:44:09 > 0:44:13than non-identical twins, who share less of their genes.

0:44:14 > 0:44:18Identical twins grow from one egg and one sperm,

0:44:18 > 0:44:20so they are genetically the same.

0:44:20 > 0:44:22There are quite a lot of similarities between us.

0:44:22 > 0:44:25We look quite similar. We talk quite similar.

0:44:25 > 0:44:27But I can't really say.

0:44:27 > 0:44:32- Same likes, dislikes almost, kind of mainly.- Yeah.

0:44:32 > 0:44:34What do you like that's the same?

0:44:34 > 0:44:37- We like the same music.- Yep, music.

0:44:37 > 0:44:39- Like the same food.- Sport.

0:44:39 > 0:44:41You're about to go to university, right?

0:44:41 > 0:44:43- In two years.- Two years, yeah.

0:44:43 > 0:44:46You really do finish each other's sentences!

0:44:49 > 0:44:53But non-identical twins are from different eggs and sperm,

0:44:53 > 0:44:57and like normal siblings, share only about 50% of their genes.

0:44:58 > 0:45:01I'm very into my sport, like watching,

0:45:01 > 0:45:04whereas Maddy's more into playing.

0:45:04 > 0:45:08I'm really, really musical. Caroline can't carry a tune in a bucket.

0:45:08 > 0:45:10She doesn't play any instruments.

0:45:10 > 0:45:13I love art and design more.

0:45:18 > 0:45:22Studies like Dr Haworth's compare traits like height

0:45:22 > 0:45:26between thousands of identical and non-identical twins.

0:45:26 > 0:45:31We can break up the variance in a trait such as height, say,

0:45:31 > 0:45:35and we can say how much that is due to genetic differences between people

0:45:35 > 0:45:39and how much is due to environmental experiences they've had.
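One classic way to break up the variance from twin data is Falconer's formula, which compares how strongly a trait correlates within identical pairs and within non-identical pairs. The programme does not say which method the researchers use, so this is a simplified sketch rather than their analysis, and the two correlations below are hypothetical values chosen for illustration.

```python
def falconer_decomposition(r_identical, r_non_identical):
    """Split trait variance using Falconer's formula.

    Assumes identical twins share all of their genes and non-identical
    twins share about half, and that both kinds of twins share their
    family environment to the same degree.
    """
    heritability = 2 * (r_identical - r_non_identical)
    shared_environment = 2 * r_non_identical - r_identical
    unique_environment = 1 - r_identical
    return heritability, shared_environment, unique_environment


# Hypothetical twin correlations for a trait such as height.
h2, c2, e2 = falconer_decomposition(r_identical=0.90, r_non_identical=0.50)

print(f"genetic: {h2:.0%}, shared environment: {c2:.0%}, unique environment: {e2:.0%}")
```

The three parts always sum to one, so the bigger the gap between identical and non-identical twins, the larger the share attributed to genes.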

0:45:39 > 0:45:42173.

0:45:42 > 0:45:45These studies show that many common traits

0:45:45 > 0:45:48are inherited much more than we thought.

0:45:48 > 0:45:52Reading disability and reading ability are both highly heritable.

0:45:52 > 0:45:55Somewhere between 50% and 70% of the variance

0:45:55 > 0:45:59is due to DNA sequence that people have inherited from their parents.

0:46:04 > 0:46:06And when Haworth started to test

0:46:06 > 0:46:08for traits that genome surveys had looked at,

0:46:08 > 0:46:11she got unexpected results.

0:46:13 > 0:46:16Many traits were more heritable

0:46:16 > 0:46:20than the genetic studies had previously revealed.

0:46:20 > 0:46:26For height, twin studies say it's around 80% inherited

0:46:26 > 0:46:30versus 5% that was found in the genome scan.

0:46:30 > 0:46:35For Type 2 diabetes, it's 70% versus 6%.

0:46:36 > 0:46:40There was a large proportion of the DNA's influence

0:46:40 > 0:46:41simply not being seen.

0:46:43 > 0:46:47Some have called this "the missing heritability".

0:46:47 > 0:46:51We know there is missing heritability because twin studies have told us

0:46:51 > 0:46:53that a lot of traits are very highly heritable.

0:46:53 > 0:46:56For example height is about 80% heritable,

0:46:56 > 0:46:59so when we do a molecular genetics study of height,

0:46:59 > 0:47:02what we find is the DNA variants that we've identified only explain

0:47:02 > 0:47:06about 5% of the variance, so we have this mismatch

0:47:06 > 0:47:12between 80% heritability and only 5% that's been identified in the genome.
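The size of the mismatch can be worked out directly from the percentages quoted in the programme for height and Type 2 diabetes. A minimal sketch; the dictionary layout and names are illustrative only.

```python
# Percent of variance: (twin-study estimate, found in the genome scan),
# using the figures quoted in the programme.
estimates = {
    "height": (80, 5),
    "type 2 diabetes": (70, 6),
}

# "Missing heritability" here is simply the gap, in percentage points.
missing = {
    trait: twin - scan for trait, (twin, scan) in estimates.items()
}
print(missing)  # {'height': 75, 'type 2 diabetes': 64}
```

On these figures, the genome scan accounts for well under a tenth of the inherited influence that twin studies suggest for either trait.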

0:47:16 > 0:47:18This was an extraordinary result.

0:47:18 > 0:47:23Where was this missing heritability coming from?

0:47:24 > 0:47:29Could it be that something in the mysterious 98% of the genome

0:47:29 > 0:47:31that doesn't seem to do anything,

0:47:31 > 0:47:34is actually far more important than we thought?

0:47:35 > 0:47:41And this seemed possible, when we compared the code of our genome

0:47:41 > 0:47:42with the code of other animals,

0:47:42 > 0:47:47looking for parts preserved over millions of years of evolution.

0:47:51 > 0:47:54So if you look between human and chimpanzee, for example,

0:47:54 > 0:47:56most of our DNA is the same.

0:47:56 > 0:48:02Between human and mouse, a fair bit is still pretty much the same.

0:48:02 > 0:48:05But by the time you go to chicken, it's very clear that,

0:48:05 > 0:48:08if there's a piece of DNA that's the same between human and chicken,

0:48:08 > 0:48:12then it's important for humans and it's important for chickens

0:48:12 > 0:48:14and there's no real way of getting round that.

0:48:14 > 0:48:18And there's quite a lot of this stuff and it's not all near genes.

0:48:18 > 0:48:20So there are these big chunks of the genome

0:48:20 > 0:48:23that don't seem to have any protein-coding genes,

0:48:23 > 0:48:27yet is still conserved between human and chicken and human and mouse.

0:48:29 > 0:48:34If these pieces of DNA were cropping up in many species

0:48:34 > 0:48:36they were clearly important for life.

0:48:39 > 0:48:43But many of them weren't in the genes.

0:48:43 > 0:48:47They were in the so-called junk DNA, and that meant that this wasteland

0:48:47 > 0:48:50was far more important than we had previously imagined.

0:48:59 > 0:49:01We'd assumed genes would account for

0:49:01 > 0:49:04the vast majority of our inheritance.

0:49:04 > 0:49:05But the new message was this -

0:49:05 > 0:49:09a good place to be looking for the missing heritability

0:49:09 > 0:49:12was in the 98% of the genome that isn't made up of genes.

0:49:12 > 0:49:15And now we have the technology to start hunting.

0:49:19 > 0:49:21So in this room we've got six or seven

0:49:21 > 0:49:26of the newest generation of machines. Each one of these runs

0:49:26 > 0:49:28for about a week, and in that week,

0:49:28 > 0:49:32it'll sequence well over 20 whole human genomes.

0:49:32 > 0:49:36And that's about 300 billion bases.

0:49:38 > 0:49:40Professor Mark McCarthy

0:49:40 > 0:49:43at the Wellcome Trust Centre for Human Genetics

0:49:43 > 0:49:45is harnessing this new technology

0:49:45 > 0:49:50to sequence the entire genome of hundreds of diabetes sufferers.

0:49:50 > 0:49:54And the hope is that this will reveal influential variants

0:49:54 > 0:49:59that had slipped through the net of earlier, less accurate surveys.

0:49:59 > 0:50:03So the big advance in the last year or two has been the ability

0:50:03 > 0:50:06to sequence the whole genome with much higher accuracy

0:50:06 > 0:50:08and much lower cost than has been possible before.

0:50:08 > 0:50:11If you remember, the original genome sequence took many years

0:50:11 > 0:50:14and many billions of dollars to complete.

0:50:14 > 0:50:15It involved many scientists.

0:50:15 > 0:50:19It's now possible to do experiments on that scale

0:50:19 > 0:50:22in a trivial amount of time for a few thousand dollars.

0:50:22 > 0:50:27It's now become possible to consider re-sequencing the whole genome

0:50:27 > 0:50:31of many thousands of individuals to understand the differences

0:50:31 > 0:50:35between for example those that have diabetes and those that don't.

0:50:35 > 0:50:39His ambitious project intends to sequence

0:50:39 > 0:50:42the whole genome of 3,000 people,

0:50:42 > 0:50:45comparing every single one of the three billion bases

0:50:45 > 0:50:48in diabetes sufferers and healthy people.

0:50:48 > 0:50:52They will throw up new genes and regions

0:50:52 > 0:50:57that we hadn't hitherto implicated in disease risk

0:50:57 > 0:50:59and that will give us new ways of understanding

0:50:59 > 0:51:01the biology of the disease.

0:51:01 > 0:51:06And we may have much better prospects for using genetics

0:51:06 > 0:51:10as a tool for predicting risk of disease and response to treatment

0:51:10 > 0:51:12than is currently possible

0:51:12 > 0:51:15with the common variants that we have identified so far.

0:51:18 > 0:51:22Already McCarthy has begun to find many new variants

0:51:22 > 0:51:25that are associated with diabetes

0:51:25 > 0:51:29in the parts of the genome that aren't made of genes -

0:51:29 > 0:51:33the increasingly misnamed junk DNA.

0:51:33 > 0:51:36And not just a few, but many.

0:51:38 > 0:51:42It seems that, for common variants at least,

0:51:42 > 0:51:46most of the action lies in that non-coding DNA.

0:51:51 > 0:51:55That non-coding DNA, the 98% of our genome,

0:51:55 > 0:52:00was turning out to be not just important, but critical.

0:52:02 > 0:52:05I think McCarthy's technique is the way forward.

0:52:05 > 0:52:08If we want to really understand how our genetics makes us

0:52:08 > 0:52:13utterly unique, we need to sequence many more human genomes.

0:52:13 > 0:52:16Maybe everybody's.

0:52:18 > 0:52:22In 2003, Ewan Birney started a series of experiments

0:52:22 > 0:52:26to find out exactly what this junk DNA actually did.

0:52:26 > 0:52:33The project was called ENCODE, the Encyclopedia of DNA Elements.

0:52:33 > 0:52:35Using the very best technology of the time,

0:52:35 > 0:52:38hundreds of scientists from around the world

0:52:38 > 0:52:41scoured sections of the junk DNA.

0:52:41 > 0:52:44There's a really obvious question which I'm dying to ask,

0:52:44 > 0:52:46which is what is it? What is it doing?

0:52:46 > 0:52:50There isn't an easy answer to what these things do,

0:52:50 > 0:52:54but our best understanding was the prediction going in -

0:52:54 > 0:52:56and it's still what we think now -

0:52:56 > 0:53:00is that a lot of this is switching where genes switch on and off.

0:53:01 > 0:53:06The idea that genes switch on and off grew from my own field,

0:53:06 > 0:53:10the study of the development of embryos.

0:53:11 > 0:53:15As an organism grows, its cells decide to be organs,

0:53:15 > 0:53:19brains, limbs and livers, and this means the genes that control them

0:53:19 > 0:53:22have themselves to be controlled.

0:53:25 > 0:53:30And the location of this system of gene regulation?

0:53:30 > 0:53:33Well, not within the genes themselves,

0:53:33 > 0:53:35but in the rest of the DNA.

0:53:35 > 0:53:40There's this incredible choreography of molecules in each cell.

0:53:40 > 0:53:45And so this dance of how all these different molecules inside the cell

0:53:45 > 0:53:49work out is working on these parts of the genome,

0:53:49 > 0:53:53many of them are not close to even protein-coding genes.

0:53:53 > 0:53:58They're spread out in the big dark matter of the genome.

0:54:01 > 0:54:06But the ENCODE project revealed yet another layer of complexity.

0:54:06 > 0:54:11Not only was the dark matter of DNA actually very important,

0:54:11 > 0:54:15but it was also becoming clear that the physical structure,

0:54:15 > 0:54:18the shape of DNA, affected us, too.

0:54:24 > 0:54:26This is how we think about DNA,

0:54:26 > 0:54:29the classic Crick and Watson double helix.

0:54:29 > 0:54:33You can see in the middle, in the core, are the base pairs,

0:54:33 > 0:54:36and outside, the twin backbones that spiral up

0:54:36 > 0:54:38to give it that iconic shape.

0:54:38 > 0:54:40But this is just a portrait

0:54:40 > 0:54:42and portraits are not the same as people.

0:54:42 > 0:54:46This is a much better representation of DNA in action.

0:54:46 > 0:54:48The double helix here is in purple

0:54:48 > 0:54:52but it's wrapped around a complex of proteins called histones

0:54:52 > 0:54:54and they again wind up on each other

0:54:54 > 0:54:57and the whole thing is covered in another protein.

0:54:57 > 0:54:59It may look like chemical chaos and certainly

0:54:59 > 0:55:04it's a far cry from the classic double helix model we're used to.

0:55:09 > 0:55:14It's a complex world buzzing with activity.

0:55:14 > 0:55:17Chemicals squeezing past other chemicals,

0:55:17 > 0:55:19proteins constantly moving,

0:55:19 > 0:55:24remodelling the shape of the DNA on the fly.

0:55:24 > 0:55:28And this model shows only a tiny stretch of DNA,

0:55:28 > 0:55:30about 150 bases long.

0:55:30 > 0:55:35There are 100 million more of these in the full genome.

0:55:36 > 0:55:38It should come as no surprise

0:55:38 > 0:55:42but this is seriously sophisticated stuff.

0:55:50 > 0:55:53The last 50 years has been a revolution

0:55:53 > 0:55:56in our understanding of our genome.

0:55:56 > 0:55:59From breaking the code in our DNA

0:55:59 > 0:56:01to learning how mistakes in that code

0:56:01 > 0:56:04lead to tragedies like Huntington's disease,

0:56:04 > 0:56:08to glimpsing how our genome relates to complex traits

0:56:08 > 0:56:10and diseases like diabetes,

0:56:10 > 0:56:13and seeing how its effect can be mitigated

0:56:13 > 0:56:15by changing our environment.

0:56:15 > 0:56:17But it's only in the last ten years,

0:56:17 > 0:56:20since the publication of the full human genome sequence,

0:56:20 > 0:56:23that I believe we are seeing the biggest revelations,

0:56:23 > 0:56:27because the real breakthrough has been understanding

0:56:27 > 0:56:30just how little we know about the genome.

0:56:30 > 0:56:33That is true enlightenment.

0:56:33 > 0:56:37There isn't going to be a moment where we can stand up and say,

0:56:37 > 0:56:41"That's it, we understand the human genome."

0:56:41 > 0:56:42It is...

0:56:42 > 0:56:44as complex as we are.

0:56:44 > 0:56:48And we're pretty complex.

0:56:48 > 0:56:52So I don't think it will be reached within my lifetime.

0:56:52 > 0:56:57But I think we'll know so much more in five years than we do now.

0:56:57 > 0:56:59And so much more in ten years than we do now,

0:56:59 > 0:57:02that I think I'll be surprised.

0:57:05 > 0:57:09I believe there never were going to be any easy answers.

0:57:09 > 0:57:14Human beings are amazing, complex creatures.

0:57:14 > 0:57:18Ten years on from completing the Human Genome Project,

0:57:18 > 0:57:20we shouldn't be disappointed

0:57:20 > 0:57:23that the results were different from what we expected,

0:57:23 > 0:57:27nor surprised that we didn't come up with any definitive answers.

0:57:27 > 0:57:29That is how science works.

0:57:29 > 0:57:32It's a journey, a continuous exploration

0:57:32 > 0:57:34of how things work and who we are.

0:57:34 > 0:57:37And now, with the human genome complete,

0:57:37 > 0:57:45we can finally see the road ahead.

0:57:45 > 0:57:48Subtitles by Red Bee Media Ltd

0:57:48 > 0:57:51E-mail subtitling@bbc.co.uk