
Purdue-USDA team develops fast-track process for genetic improvement of plant traits

Researchers interested in improving a given trait in plants can now identify the genes that regulate the trait’s expression without doing any experiments.

Purdue University’s Kranthi Varala and 10 co-authors published the details of the new web-based regulatory gene discovery tool in the April 23 issue of Proceedings of the National Academy of Sciences. Varala has a patent pending on results related to economically important seed oil biosynthesis.

The Purdue-USDA team sought to build a resource that learns, from large amounts of publicly available data, to quickly identify which special genes, called transcription factors, regulate the expression of a given trait in various plant species.

“Every study focuses on a handful of them,” said Varala, assistant professor of horticulture and landscape architecture. “Our premise was that if we can put all of it into a single analysis, then we can use this data to build something global.”

Arabidopsis served as the PNAS study’s model plant, “but this approach has nothing specific to Arabidopsis,” Varala said. “The approach is general enough that you could start with a corn dataset. You could do it with rice, with tomato, whatever crop you’re working on as long as you have thousands of gene expression measurements that people have done. And there are over a dozen species now where we have tens of thousands of gene-expression studies.”

To prove the system works, the team focused on a genetic pathway that regulates how plants make and store oil in their seeds. The team picked that trait because of its importance in food and biofuel production, and because more than 300 of the genes involved are already known. 

By genetically manipulating a plant’s transcription factors, researchers can increase or decrease the amount of oil produced in its seeds.

Arabidopsis seedlings being cultivated for research to study the effects of specific genes on traits such as rate of growth, plant size etc.

Like other researchers, Varala has pursued many projects over the years where his goal was to identify the genes and regulators involved in solving one problem. This meant conducting careful, time-consuming experiments. But the data generated fell short of providing all the answers he sought. He compared it to working an equation knowing only three of the 10 factors involved.

“You can’t solve the equation,” he said. Likewise, Varala often wanted to ask more questions than the data could answer. That motivated him to build a framework that uses all available data to ask those questions and to produce a list of candidates for genetic validation, without first having to do all the relevant experiments.

“I’m trying to short-circuit the initial data collection phase,” Varala said, so that scientists can focus on conducting the genetic validations. But to do so, his team had to begin with a dataset based on 18,000 individual studies.

Varala and his team analyzed this massive dataset using the Bell and the now-retired Brown supercomputers at Purdue’s Rosen Center for Advanced Computing. The team built a machine-learning framework to speed the process for others.

It would be impossible for one person to do this manually. A team could do it, but that would introduce biases in how group members process the data. The machine-learning classifier applies the same criteria to every dataset, which avoids that source of bias.

The novelty of the approach is that instead of pulling data related to all organs, it focuses on organ-specific datasets. Independent gene networks regulate these organs — leaves, roots, shoots, flowers and seeds.

“Instead of using all organs, we said, within the seed experiments that people have done over the years, can we use all the data to learn something that’s happening in the seed and not necessarily the root or the leaf or the flower? That improved our approach a lot.”

The team used a computational method called the inference approach to predict what transcription factors were going to regulate the seed oil biosynthesis process in Arabidopsis. 

“The ones we know help us validate that our approach is working correctly. The ones that we don’t know are good candidates for finding out new biology,” Varala said. “This purely computational approach knows nothing about seeds or oil or anything like that. We gave it a list of genes and it was able to rediscover the known ones without knowing any biological context.”

The lead author, Rajeev Ranjan, a postdoctoral researcher in the department of horticulture and landscape architecture at Purdue, took the other 12 of the top 20 and asked whether those predictions hold up. “We were able to generate mutant lines for 11 of those 12. Five of those 11 do change the seed oil content,” he said. “Further, we also showed that overexpression of one factor increases seed oil up to 12%.”

Rajeev Ranjan, a postdoctoral researcher in horticulture and landscape architecture, analyzes genetically modified Arabidopsis seeds that have higher oil content to confirm that other agronomically important traits, including seed size and seed per fruit, are not negatively affected.

The eight known regulatory genes, added to the five newly validated ones, showed that the inference approach accurately identified 13 of the top 20 candidates. The strength of the approach is that, working only from a list of genes, it can predict with high accuracy which ones will regulate a trait of interest.
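The 13-of-20 figure can be checked with simple arithmetic from the counts reported in this article. The short Python sketch below is only an illustration of that tally, not part of the team's tool; the variable names are made up for this example, and the count of known regulators is derived on the assumption that the top-20 candidates not sent for testing were the previously known ones.

```python
# Counts as reported in the article; variable names are illustrative only.
top_candidates = 20      # top-ranked transcription factor predictions
tested_new = 12          # "the other 12 of the top 20" taken for validation
mutant_lines = 11        # mutant lines generated among those 12
validated_new = 5        # mutants that changed seed oil content

# Assumption: the remaining top-20 candidates were the already known regulators.
known_regulators = top_candidates - tested_new

# Candidates supported by prior knowledge or by the new mutant experiments.
confirmed = known_regulators + validated_new

print(f"Known regulators rediscovered: {known_regulators}")
print(f"Newly validated regulators: {validated_new} of {mutant_lines} mutant lines")
print(f"Confirmed: {confirmed} of {top_candidates} ({confirmed / top_candidates:.0%})")
```

The one candidate without a mutant line is counted as unconfirmed here because it could not be tested, not because it was shown to have no effect.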

“It took a long time to do because it’s a long, complicated process, and there was no guarantee that it would work,” said Varala of the four-year project. “Nothing on this scale had been attempted before.”

Varala has disclosed the innovation to the Purdue Innovates Office of Technology Commercialization, which has applied for a patent to protect his intellectual property.

This research was supported by the U.S. Department of Energy Office of Science.
