When Amit Khera explains how he predicts disease, the young cardiologist’s hands touch the air, arranging imaginary columns of people: 30,000 who have suffered heart attacks here, 100,000 healthy controls there.
Never before has genetic data been available on so many people. And that wealth of information is allowing researchers to guess at any person’s chance of getting common diseases like diabetes, arthritis, clogged arteries, and depression.
Doctors already test for rare, deadly mutations in individual genes. Think of the BRCA breast cancer gene. Or the one-letter mutation that causes sickle-cell anemia. But such one-to-one connections between a mutation and a disease—“the gene for X”—aren’t seen in most common ailments. Instead, these have complex causes, which until recently have remained elusive.
The day I visited him, Khera was constructing what is called a polygenic score—“poly” because his calculations involve thousands of genes, not just one. This particular score predicted a person’s chance of developing atrial fibrillation, or irregular heartbeat. It’s a common disorder but is often diagnosed only after someone has been rushed to the ER with a stroke.
Khera pointed to his screen. There, seven-digit numbers, each representing an anonymous DNA donor, appeared alongside their scores. The outliers had a risk four times the average.
Khera, who works in the laboratory of heart doctor and gene hunter Sekar Kathiresan at the Broad Institute, in Cambridge, Massachusetts, says the new scores can now identify as much risk for disease as the rare genetic flaws that have preoccupied physicians until now.
“Where I see this going is that at a young age you’ll basically get a report card,” says Khera. “And it will say for these 10 diseases, here’s your score. You are in the 90th percentile for heart disease, 50th for breast cancer, and the lowest 10 percent for diabetes.”
Such comprehensive report cards aren’t being given out yet, but the science to create them is here. Delving into giant databases like the UK Biobank, which collects the DNA and holds the medical records of some 500,000 Britons, geneticists are peering into the lives of more people and extracting correlations between their genomes and their diseases, personalities, even habits. The latest gene hunt, for the causes of insomnia, involved a record 1,310,010 people.
The sheer quantity of material is what allows scientists like Khera to see how complex patterns of genetic variants are tied to many diseases and traits. Such patterns were hidden in earlier, more limited studies, but now the search for ever smaller signals in ever bigger data is paying off. Give Khera the simplest readout of your genome—the kind created with a $100 DNA-reading chip the size of a theater ticket—and he can add up your vulnerabilities and strengths just as one would a tally in a ledger.
Such predictions, at first hit-or-miss, are becoming more accurate. One test described last year can guess a person’s height to within four centimeters, on the basis of 20,000 distinct DNA letters in a genome. As the prediction technology improves, a flood of tests is expected to reach the market. Doctors in California are testing an iPhone app that, if you upload your genetic data, foretells your risk of coronary artery disease. A commercial test launched in September, by Myriad Genetics, estimates the breast cancer chances of any woman of European background, not only the few who have inherited broken versions of the BRCA gene. Sharon Briggs, a senior scientist at Helix, which operates an online store for DNA tests, says most of these products will use risk scores within three years.
“It’s not that the scores are new,” says Briggs. “It’s that they’re getting much better. There’s more data.”
When they launched the first modern gene searches a decade ago, following the completion of the Human Genome Project, medical researchers still hoped that a few major genetic culprits would explain common diseases like diabetes. “I expect there are about 12 genes involved [in diabetes], and that all of them will be discovered in the next two years,” Francis Collins, now the head of the US National Institutes of Health and one of the leading players in sequencing the human genome, confidently declared in 2006.
If that had turned out to be true, the small list of genes would have given drug designers clear, tangible targets. That would easily have justified the whole enterprise, financed with hundreds of millions of US tax dollars. In the case of a few diseases, like macular degeneration, the searches paid off. Mostly, though, geneticists drew in empty nets. By 2009, Collins and others had begun to talk glumly about “the missing heritability.”
Where were the disease-causing genes? Everywhere, it turns out. And by 2014, genetic studies were finally big enough to prove it. As the number of people with diabetes who enrolled in the gene search studies rose from 661 to 10,128 to 81,412, the “hits” began rolling in. Instead of 12 genes, we now know, type 2 diabetes is influenced by at least 400 locations in our DNA, and probably many more—each with only a tiny, hard-to-detect effect.
To scientists seeking the ultimate cause of common diseases, that’s a huge disappointment. If the causes of diabetes, depression, or schizophrenia are sprinkled around the genome like so much powdered sugar, it means we’re far from understanding or curing them. “No one wanted that to be the answer,” says Mark Daly, a geneticist at the Broad Institute. “But it is what it is.”
The scattershot nature of inheritance may make disease hard to comprehend, but the same data is making it much easier to predict. To create their models, Khera and Kathiresan use 6.6 million positions in a person’s genome. Each position is a single DNA letter. It could be A for you and G for me. From big genetic studies, Khera can now look up how much more likely a person with a G in that position is to have a heart attack. Maybe it raises the chances by 0.1 percent. That’s a negligible amount. Maybe a G in another position reduces the risk by 0.2 percent. But if you add up all the tiny genetic influences, the effect can become substantial.
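The ledger-style tally described above can be sketched in a few lines of Python. This is a toy illustration, not the Broad team's model: the position names, risk letters, and weights below are invented, and only three positions stand in for the 6.6 million. It also assumes each effect is counted once per copy of the letter a person carries (zero, one, or two), and that the effects simply add. Published scores generally weight each position by an effect size estimated from large studies rather than by a raw percentage.

```python
# Hypothetical lookup table: position -> (risk letter, effect per copy).
# All names and numbers are made up for illustration.
EFFECTS = {
    "pos_1": ("G", 0.001),    # a G here nudges risk up a little
    "pos_2": ("G", -0.002),   # a G here nudges risk down a little
    "pos_3": ("A", 0.0015),
}

def polygenic_score(genotype, effects=EFFECTS):
    """Add up the small per-position effects for one person.

    `genotype` maps a position name to the two letters the person
    carries there, e.g. "AG". Positions missing from `genotype`
    contribute nothing.
    """
    total = 0.0
    for position, (letter, weight) in effects.items():
        copies = genotype.get(position, "").count(letter)  # 0, 1 or 2
        total += copies * weight
    return total

# One imaginary person: one G at pos_1, two at pos_2, one A at pos_3.
person = {"pos_1": "AG", "pos_2": "GG", "pos_3": "AC"}
score = polygenic_score(person)
print(f"{score:+.4f}")
```

With only three positions the total is tiny. The point of the real models is that the same sum runs over millions of positions, so many small pluses and minuses can pile up into a large difference between people.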
When they built a predictor for coronary heart disease, for instance, Kathiresan’s team discovered that the people it predicted to have the very highest risk, the top 2.5 percent, had four times the average chance of developing clogged arteries. That’s about equal to the risk of clogged arteries caused by familial hypercholesterolemia, a condition marked by sky-high cholesterol levels and caused by a single critical gene. If doctors worry about that—which they do—why not also pay attention to the high end of genomic risk scores?
“That’s the thing that convinced me,” says Kathiresan. And the number of people whose genome predictions raise a red alert will also be much larger. Familial hypercholesterolemia affects only about one in 250 people. Genome scores would identify about eight times as many people at high risk for heart disease, he believes.
What he’s not yet sure about, Kathiresan says, is how to get the new risk information into people’s hands. He has considered launching an app or selling the statistical model to a diagnostics company. “Everyone wants to get their score. Everyone is asking where is the product for heart disease,” he says. “I tell them, we are working on it.”
Heart disease is, in some ways, a best-case scenario for using risk scores. That’s because you can change your real-life risk—say, by going on a diet or taking a cholesterol-lowering statin pill. What’s more, probabilities are already a big part of heart medicine. Khera, who dons a white coat once a week to treat patients at Massachusetts General Hospital, uses a combination of a person’s age, weight, cholesterol levels, and habits like smoking to guess the chance of a heart attack in the next 10 years. Now genetic scores could be added to those models, making them more accurate.
What’s powerful about DNA predictions is that they are measurable at any time of life, unlike most risk factors. “If you line up a bunch of 18-year-olds, none of them have high cholesterol, none of them have diabetes. It’s a zero in all the columns, and you can’t stratify them by who is most at risk,” says Khera. “But with a $100 test we can get stratification at least as good as when someone is 50, and for a lot of diseases.”
Drug companies have started to notice. Last year Anders Dale, a brain researcher at the University of California, San Diego, announced his intention to market a risk calculator for Alzheimer’s disease. It will guess whether a person will develop the disease and, if so, at what age.
The service won’t launch until this summer, but Dale says drug companies immediately got in touch. Now he is helping three of them test the DNA of people in clinical trials for Alzheimer’s drugs (he declined to name them). Despite the billions spent developing such drugs, every one tried so far has flopped. The problem is that when no one knows who will get the disease, it’s difficult to know whether a preventive drug is working. If companies could test the drugs only on people with a high risk of Alzheimer’s, it would be much easier. It’s possible future drugs will be labeled “Recommended for those with polygenic scores 90 and above.”
Dale is working with commercial partners to put his Alzheimer’s predictor online and charge as little as $99 to anyone who wants to use it. More than ten million people already have their DNA data because they signed up for 23andMe or Ancestry.com to research their family trees. Dale says they will be able to upload the data with a click and receive a report. I asked him why people would want to know ahead of time about a disease that’s currently untreatable. “They may want to make plans,” he said.
Other doctors believe risk scores will give people the push they need to think harder about their well-being. “I love the idea of polygenic risk scores because the future is health, not medicine,” says Steven Tucker, a physician who practices in Singapore. He likes his patients to use wearable devices and trackers, and risk scores could be combined with those. Someone at high risk for atrial fibrillation, for instance, might wear a smart watch with a heart monitor built into it. “My patients want to manage the future,” says Tucker. “If you can define it more accurately, there is a better chance you can do something about it.”
Even so, the value of the new genomic future-gazing is hotly disputed. That’s because the scores are not individual certainties; they are merely rough probabilities derived from large populations. Of people given high scores by Khera’s atrial fibrillation predictor, for example, only a small minority, 7 percent, would actually develop the condition by age 55.
This uncertainty matters because if people are given risk scores, they’ll base decisions on them. Last fall, Myriad Genetics became the first large diagnostics company to introduce a polygenic risk test to the US market. Called riskScore, it measures 81 variants to estimate a woman’s chance of breast cancer. Women with a high score might undergo extra mammograms; those at low risk might skip them. What no one yet knows is whether those decisions will lead to fewer cancer deaths. Finding out will require expensive long-term studies that Myriad, which is selling the test, hasn’t yet done.
One physician who finds all this troubling is Patrick Sullivan at the University of North Carolina, Chapel Hill, where he leads the Psychiatric Genomics Consortium. The group has DNA data from more than 900,000 people with confirmed mental illness, including more than 60,000 with schizophrenia. This disease is known to be highly influenced by genes. If one identical twin develops schizophrenia, for example, there is a 50 percent chance the other one will too.
But Sullivan says it would be reckless to tell apparently healthy people whether their DNA score for schizophrenia is high or low. Just think of those twins, he says: they have the same DNA and the same score, yet it is even odds a schizophrenia prediction would be wrong. Giving such a flawed forecast to someone “is a terrible idea,” says Sullivan. “What you want it to do is distinguish who has it and who doesn’t, and we aren’t there yet.”
In addition to predicting disease, geneticists can build models to predict any human trait that can be measured, including behaviors. Is this person destined for a life of crime and recidivism? Will that one be neurotic, depressed, or smarter than average?
The scoring technology, scientists say, will soon shed uncomfortable light on such questions. In January, two leading psychologists argued that direct-to-consumer DNA IQ tests will soon become “routinely available” and will predict children’s ability “to learn, reason, and solve problems.” They believe parents will test toddlers and use the results to make school plans.
To some, using foggy genetic horoscopes to decide who goes to college and who ends up in trade school sounds like an extraordinarily bad idea. On his blog Gloomy Prospect, Eric Turkheimer, a prominent psychologist at the University of Virginia, says the danger is that the scores will be overinterpreted to “recommend some truly dreadful social policies.” That, he thinks, would be “the worst possible kind of biologically determinist discrimination.” To Turkheimer, polygenic scores are “less than meets the eye” and about as fair as “predicting your IQ from a cousin you haven’t met.”
Such views aren’t stopping the rapid pace of genetic exploration. Until last year, no gene variant had ever been tied directly to IQ test results. Since then, studies involving more than 300,000 people’s DNA have linked 206 variants to intelligence. That means genetic scores can now account for 10 percent of a person’s performance on an IQ test. That could reach 25 percent within a few years, as more data accumulates. One US company, Genomic Prediction, even says it wants to test IVF embryos for intelligence, so parents can discard those expected to be mentally unfit.
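One way to get a feel for those percentages is to convert them into correlations. The sketch below assumes that "account for 10 percent" means the score explains 10 percent of the variance in test results across a population, which is how such figures are usually reported but is an interpretation, not something stated here. Under that reading, and for a simple linear predictor, the correlation between predicted and measured results is the square root of the share explained. The function name is invented for the example.

```python
import math

def correlation_from_variance_explained(share):
    """Correlation between predicted and measured values when a
    linear predictor explains `share` (0 to 1) of the variance."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return math.sqrt(share)

today = correlation_from_variance_explained(0.10)      # the 10 percent figure
projected = correlation_from_variance_explained(0.25)  # the 25 percent figure
print(f"{today:.2f} {projected:.2f}")  # prints "0.32 0.50"
```

A correlation of roughly 0.3 is a real statistical signal across thousands of people, yet it leaves most of the difference between any two individuals unexplained.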
Dystopia, dubious medicine, or a breakthrough in prevention? Genomic prediction may well be all three. What is clear is that, with the data needed to create predictors becoming freely available online, 2018 will be a breakout year for DNA fortune-telling.