There is no known cause of autism spectrum disorder and no cure. The number of children diagnosed with autism spectrum disorder is increasing rapidly: in 2000, 1 in 150 children received a diagnosis; now it is 1 in 68.
Currently, the Centers for Disease Control and Prevention tracks ASD diagnoses in 11 states. Yet, in order to track its prevalence and distribution, researchers need to scour thousands of pages of often complex records.
A team of UA researchers aims to change that.
“Our project hopes to make that process a lot faster, cheaper, easier, and bigger,” said Gondy Leroy, a professor of management information systems in the Eller College of Management.
The project is being funded by a grant of $292,404 from the Agency for Healthcare Research and Quality, a part of the U.S. Department of Health and Human Services.
Leroy's team is creating natural language processing algorithms that can “read” the text of electronic health records and identify mentions of the criteria that are used to diagnose ASD.
Leroy, alongside Mihai Surdeanu, an associate professor of computer science, is responsible for developing the algorithms.
The team also includes Sydney Pettygrove, assistant professor in the UA Zuckerman College of Public Health, who helps provide the electronic health records, and Maureen Kelly Galindo, genetics and developmental research coordinator in the UA College of Medicine, who provides clinical feedback on the analysis of the health records.
Two months into the two-year project, Leroy said they have an algorithm that can successfully identify and annotate some of the diagnostic criteria for ASD.
They are currently working on training a machine learning algorithm that can learn from its mistaken identifications and do better in the future.
One of the diagnostic criteria the algorithm will look for in the health records is whether children make eye contact with their caregiver. This measure of decreased social interaction can be described by doctors and psychologists in many different ways.
One of the challenges for the algorithm is correctly identifying all these different phrasings, something that comes naturally to human researchers.
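To make the phrasing problem concrete, here is a minimal sketch of how a rule-based matcher might flag a few ways of writing "decreased eye contact" in a clinical note. It is an illustration only, not the UA team's method: the phrase patterns, the name EYE_CONTACT_PATTERNS and the function find_eye_contact_mentions are all hypothetical and were not taken from the project.

```python
import re

# Hypothetical phrasings for illustration only; these are assumptions,
# not patterns taken from the UA team's data or code.
EYE_CONTACT_PATTERNS = [
    r"\b(poor|limited|reduced|fleeting|no|little)\s+eye\s+contact\b",
    r"\b(does\s+not|doesn't|did\s+not|rarely|seldom)\s+"
    r"(make|makes|maintain|maintains)\s+eye\s+contact\b",
    r"\bavoids?\s+(eye\s+contact|looking\s+at)\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in EYE_CONTACT_PATTERNS]


def find_eye_contact_mentions(text):
    """Return (start, end, matched_text) tuples sorted by position in the text."""
    hits = []
    for pattern in COMPILED:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), match.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    note = "Poor eye contact; avoids looking at caregiver."
    for start, end, phrase in find_eye_contact_mentions(note):
        print(start, end, phrase)
```

A hand-written list like this only catches the phrasings someone thought to include, which is one reason a system that learns from annotated examples may generalize better than fixed rules.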
Another challenge is providing the learning algorithms with a sufficient number of examples of all 12 diagnostic criteria for ASD, even those that are very rare, so that they can recognize those criteria in the future.
As of now, the team is working with a small sample of Arizona health records.
“Ideally, we want to scale up and go as big as the entire United States,” Leroy said. “Then we could test whether something in the environment is affecting ASD.”
With the current data, Leroy plans to conduct a set of case studies to determine how ASD diagnosis has changed over time or based on who is doing the diagnosing.
According to Pettygrove, the recent increase in ASD diagnosis could be attributed to a greater awareness of the disorder among the population and doctors. For example, the number of non-Hispanic white children diagnosed with ASD used to be 3.8 percent more than Hispanic children. That gap has since narrowed.
Yet Pettygrove said she cannot rule out environmental or regional factors as possible causes of ASD.
A future case study with the help of data collected from this algorithm could help answer these questions, Leroy said.
Pettygrove also said they are studying how the language used to describe ASD has changed over the years.
More critically, this technology could be used to flag cases of ASD in health care records, leading to earlier diagnoses of ASD and earlier treatment interventions for children, Pettygrove said.
“Children who are evaluated earlier in life and receive services at a younger age do better in the long run,” Pettygrove said.
In the future, Leroy believes this algorithm could also benefit the study and diagnosis of mental illnesses. If it can lead to earlier and increased services for children with ASD, it could do the same to help those suffering from mental illness.