Machine Learning at The Roskamp Institute – An Interview with Ph.D. Candidate Andrew Pearson

The use of artificial intelligence and machine learning has been making headlines across the globe as computers become more powerful and more people gain access to growing collections of information. At The Roskamp Institute, Ph.D. student Andrew Pearson, working under the direction of Dr. Joseph Ojo, talked with Brain Waves about his journey to create a computerized brain to predict potential drug treatments for traumatic brain injury (TBI). By taking a list of hundreds of FDA-approved drugs for other conditions, such as diabetes, asthma, and inflammation, and running them through a complex machine learning algorithm, Andrew was able to pinpoint drugs with beneficial qualities that meet target criteria for further investigation.

Tell us about what you’re doing with machine learning at The Roskamp Institute.
We performed “next generation” RNA sequencing, in which we analyzed the genomic data (transcriptome) of immune cells and other cells in the brain, namely microglia and astrocytes. We performed this in animals that had experienced repetitive mild traumatic brain injury, and also in animal models that mimic Alzheimer’s Disease (AD) and Chronic Traumatic Encephalopathy (CTE) type pathology, to identify potential links between the mechanics of TBI and Alzheimer’s Disease.

So, machine learning really allows us to do a better job of pattern recognition and also to use a more unbiased approach. Machine learning uses the data to identify the important factors, or the important genetic signatures, that regulate the outcomes from TBI. What we can then do with these data, based on the targets that the machine learning has identified, is take a series of drugs already known to act on these targets and train a machine learning algorithm to predict new structures based on the efficacy of existing drugs, in order to design and make new or better drugs that we could put into the clinical study pipeline.

Where do you see machine learning and neural networks playing out in the future of drug discovery?
Machine learning is very important just because of the amount of time that it takes to individually screen drugs against each other, and its ability to use solid evidence to sort through hundreds of thousands of candidate compounds before we even get to the stage of cell culture experiments in the lab – it saves countless amounts of time and money.

Obviously, the more data we get, and the more we learn about what structures work and what structures don’t, the more we can refine our approach, and the machine will continuously learn as we go. The neural networks are really good at that.

We can also throw in curveballs to trick the machine and see whether it can detect patterns and still has a good predictive function, or whether it’s just memorizing what we’ve told it. That kind of artificial intelligence is what’s really going to help set this method of drug discovery apart: we can have the computer understand the important structures of a drug and how they relate to the drug’s efficacy, and we can therefore include those factors in future compounds. Trying to do this by manually synthesizing a bunch of drugs and then screening everything in the laboratory would take a millennium.
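One common way to run this kind of "curveball" check is to compare a model trained on real labels with the same model trained on deliberately shuffled labels, and to score both on data the model has never seen. The sketch below is illustrative only: it uses a simple nearest-neighbour model on made-up one-dimensional data rather than the neural networks described in the interview, and every name in it is hypothetical.

```python
import random

def nearest_neighbor_predict(train, x):
    """Return the label of the training point closest to x (1-nearest-neighbour)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def accuracy(train, data):
    """Fraction of points in `data` whose label the model predicts correctly."""
    correct = sum(nearest_neighbor_predict(train, x) == label for x, label in data)
    return correct / len(data)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def make_data(n):
    """Synthetic data (an assumption): the true label is simply whether x > 0.5."""
    return [(x, x > 0.5) for x in (rng.random() for _ in range(n))]

train = make_data(200)
held_out = make_data(100)  # never shown to the model during training

# The "curveball": same inputs, but labels shuffled so there is no pattern to learn.
shuffled_labels = [label for _, label in train]
rng.shuffle(shuffled_labels)
train_shuffled = [(x, label) for (x, _), label in zip(train, shuffled_labels)]

real_train_acc = accuracy(train, train)
real_test_acc = accuracy(train, held_out)
shuffled_train_acc = accuracy(train_shuffled, train_shuffled)
shuffled_test_acc = accuracy(train_shuffled, held_out)

print("real labels:     train", real_train_acc, "held-out", real_test_acc)
print("shuffled labels: train", shuffled_train_acc, "held-out", shuffled_test_acc)
```

A nearest-neighbour model memorizes its training set, so training accuracy looks perfect in both cases. Only the held-out score separates a model that has picked up a real pattern from one that is just repeating what it was told.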

Do you think that’s the future here at the Institute?
Yes, I think so. We generally like to pursue new technologies and keep ahead of the field, which is why philanthropic support is so important to us, as this can be difficult to do with grant funding. We’ve just acquired a cell sorter to do fluorescence-activated cell sorting, which will allow us to do more specific single-cell transcriptomics (the study of RNA transcripts, which shows which genes are active), so we’ll be able to assess the gene signature of individual cells in the brain. Because we can see the massively heterogeneous populations in the brain, we could look more accurately at targeting one specific type of immune cell, or specific subsets within that type of immune cell, that might be able to beneficially alter outcomes. Continuously feeding in our transcriptomic data and the drug discovery data, back and forth, will really allow us to refine our future therapeutic drug candidates.

Is your Ph.D. thesis about this kind of research?
Yes, we did a lot of base-level screening of drugs that are already approved by the FDA, but not for TBI or Alzheimer’s Disease. We had diabetes drugs, asthma drugs, and anti-inflammatory drugs that have already been shown to have good safety profiles, and we looked at their translational ability to treat traumatic brain injury and how they may impact Alzheimer’s outcomes.

Really, machine learning for drug discovery has only come into play more recently as we started to get into transcriptomics, or what is called “Next Generation Sequencing”. But it’s what we’re planning to do more of in the future. One of the next projects I’m working on is going to be looking at isolating single cells from the brain in transgenic mice (containing genetic material from a different organism). These mice have specifically altered gene profiles to make them more or less likely to have Alzheimer’s. We then switch on or off certain high-risk, high-susceptibility genes, or genes that may influence inflammatory outcomes following brain injury or Alzheimer’s Disease, and see how single genes, or the factors which control them, can alter that. And maybe we can also look at the drugs that would regulate those genes, to give us a better understanding of the mechanics of how these cells are influencing their outcomes.

Can you talk about your results with the machine learning work you’ve done so far?
The machine learning work has allowed us to overlay a lot of our next generation sequencing data with published human data, and not just from one or two studies. We’ve been able to skim datasets from vast studies that include huge cohorts of people (upwards of 2,000 samples) to increase our likelihood of hitting the right subset groups, and we have attempted to map our significantly altered transcripts onto them to make sure that the pathways we identify as important in our models are also biologically relevant in humans.
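As a rough illustration of the overlay idea, the sketch below counts how many human studies report each gene that was significantly altered in a model, and keeps only the genes supported by several studies. The gene names, study names, and the two-study threshold are all placeholders, not data or settings from the Institute’s work.

```python
from collections import Counter

def replicated_genes(model_genes, human_studies, min_studies=2):
    """Return model genes that also appear in at least `min_studies` human datasets."""
    counts = Counter()
    for study_genes in human_studies.values():
        # Only count genes that were also altered in the animal model.
        counts.update(model_genes & study_genes)
    return {gene: n for gene, n in counts.items() if n >= min_studies}

# Placeholder inputs: sets of gene identifiers flagged as significantly altered.
model_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
human_studies = {
    "study_1": {"GENE_A", "GENE_B", "GENE_X"},
    "study_2": {"GENE_A", "GENE_C", "GENE_Y"},
    "study_3": {"GENE_A", "GENE_B"},
}

supported = replicated_genes(model_genes, human_studies)
print(supported)
```

Requiring support from more than one study is one simple guard against a signal that shows up in a single cohort by chance.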

Based on that research we took a few drugs such as pioglitazone, an already approved drug for Type 2 diabetes, and we broke down the structure of each drug into a kind of chemical fingerprint. We then identified the features of the chemical structure that were important to the drug’s efficacy and that allowed it to hit its actual molecular target. From there we allowed the machine to develop a scoring system for efficacy and then, based on the fingerprint and the scoring system, we were able to predict new drug candidates that we are looking to develop further.
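To make the fingerprint idea concrete, here is a minimal sketch that treats a fingerprint as a set of structural feature IDs and scores candidates by their Tanimoto similarity to a known active reference drug. This is a stand-in for the learned scoring system described above, not the method actually used; the feature IDs and compound names are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints stored as sets of feature IDs."""
    if not fp_a and not fp_b:
        return 0.0  # two empty fingerprints share nothing meaningful
    return len(fp_a & fp_b) / len(fp_a | fp_b)

def rank_candidates(reference_fp, candidates):
    """Sort candidate compounds from most to least similar to the reference fingerprint."""
    scored = [(name, tanimoto(reference_fp, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical fingerprints: each integer stands for one structural feature.
reference_fp = {1, 4, 7, 9, 12}
candidates = {
    "candidate_a": {1, 4, 7, 9, 15},   # shares 4 of 6 distinct features
    "candidate_b": {2, 3, 5},          # shares none
    "candidate_c": {1, 4, 7, 9, 12},   # identical to the reference
}

ranking = rank_candidates(reference_fp, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit and the score from a trained model, but the ranking step works the same way: compounds that share more of the important features rise to the top of the list.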

What’s the hardest thing in working with machine learning?
As someone who doesn’t really come from a computer science background, having to learn how to talk to a computer was difficult. Learning Python programming was initially very challenging, but luckily there are people at The Roskamp Institute who are already experts in this field and have a good background in what we’re trying to do.
