This post is an adaptation of “What we learned from 5 million books,” a TED Talk (a video lecture) presented by Harvard University’s Erez Lieberman Aiden and Jean-Baptiste Michel in September 2011. Essentially, the two researchers created a tool called the Ngram Viewer, which allowed them to search the millions of books that Google has scanned and made publicly available. This large body of text, which spans centuries and now includes more than 8 million books and approximately half a trillion words, can provide insights into our world, our history and emerging trends simply based on the popularity of certain words or phrases.
Naturally, I wondered what this nifty tool could tell me about the history of PKU and how its treatment has evolved over the years. Before experimenting, I had to narrow the date range for my search. Google’s digitized books include works published between 1800 and 2000. For the purpose of investigating PKU, I selected 1930-2000. To understand why, here’s a brief look at PKU chronology:
- 1934 – PKU discovered by Norwegian physician Dr. Asbjørn Følling
- 1953 – Researchers found that a phenylalanine-restricted diet helped to treat PKU patients
- 1963 – Guthrie Method developed (Bacterial Inhibition Assay)
- 1965 – Newborn screening initiated in the U.S.
- 1980s – The gene responsible for phenylalanine hydroxylase production was isolated; since then, more than 500 mutations in the gene have been identified
And here are some insights that I gathered:
The term “metabolic disorders” was in the English vernacular long before PKU or newborn screening. Still, the results of this search line up with the chronology of PKU. The PKU acronym appears as early as the 1930s, when Følling discovered PKU, but it does not pick up in frequency until the Guthrie test was created in the 1960s. Soon thereafter, the concept of newborn screening takes off. We now know that the Guthrie test, which was originally developed to screen for PKU at birth, has led to newborn screening tests for more than 40 developmental, genetic and metabolic disorders.
One of the cool features of the Google Ngram Viewer is that you can use the characters “=>” to determine what words are used to modify other words or phrases. For this particular search, I wanted to see which nouns were used to describe PKU. Was it a disease, a disorder or a syndrome? As you can see here, disease is the most common descriptor for PKU, while syndrome is the least frequently used. Interestingly, disease is also the first descriptor we see in the literature. It appears that old habits are hard to break!
When I was on the PKU diet growing up in Pittsburgh, Pa., my mother counted phe exchanges. When I first considered a return to diet not too long ago, I was advised to count grams of protein. However, after speaking to some other women with PKU, I’ve learned that counting milligrams of phe is a more precise way to track intake, especially when considering a PKU pregnancy. Given that experience, I wanted to see which method is most commonly cited in the published works in the Google library. Here you can see that while the use of grams of protein started strong in the 1930s, tracking milligrams of phe has slowly gained popularity and eventually taken the lead. Interestingly, phe exchanges are not mentioned at all.
For my last Ngram PKU search, I decided to compare the phrases PKU diet and PKU treatment. In the past, some PKU patients have been denied access to medical formula and foods because of the misconception that the PKU diet is cosmetic and for losing weight. Despite that barrier to PKU therapies, there has been little effort to change the language by which we describe medical care for PKU. Perhaps if Google expanded the scanned library of books to include the years beyond 2000, we would start to see a reversal in this trend.
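For anyone who wants to repeat this comparison, here is a minimal Python sketch that builds a shareable Ngram Viewer link for the two phrases over the 1930-2000 window used in this post. The helper name `build_ngram_url` is hypothetical, and the endpoint and query parameter names (`content`, `year_start`, `year_end`, `smoothing`) are assumptions based on the links the viewer appears to generate, so treat this as a starting point rather than an official API.

```python
from urllib.parse import urlencode

# Assumed base address of the Ngram Viewer chart page (not an official API).
NGRAM_BASE_URL = "https://books.google.com/ngrams/graph"


def build_ngram_url(phrases, year_start=1930, year_end=2000, smoothing=3):
    """Return a link that charts the given phrases side by side.

    Hypothetical helper: the parameter names mirror what the viewer's
    own links seem to use and may change without notice.
    """
    cleaned = [p.strip() for p in phrases if p.strip()]
    if not cleaned:
        raise ValueError("at least one phrase is required")
    if year_start > year_end:
        raise ValueError("year_start must not be later than year_end")

    params = {
        # The viewer takes a comma-separated list of phrases to compare.
        "content": ",".join(cleaned),
        "year_start": year_start,
        "year_end": year_end,
        "smoothing": smoothing,
    }
    # urlencode escapes spaces and commas so the link is safe to paste.
    return f"{NGRAM_BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_ngram_url(["PKU diet", "PKU treatment"]))
```

Pasting the printed link into a browser should open the chart with both phrases. Swapping in other phrases or a different date range only means changing the arguments.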
To be fair, there are some limitations to conducting this Ngram PKU experiment. PKU literature is mainly published in research journals rather than actual books, and more recent developments, like the first FDA-approved prescription drug to treat PKU, would not be included in this data. Nonetheless, I couldn’t resist having a look to see what kind of insight this tool might provide.
Have you played with Google Labs’ Ngram Viewer? What other PKU trends and phrases do you think would be interesting to search for?