Purdue-USDA team develops fast-track process for genetic improvement of plant traits

Researchers interested in improving a given trait in plants can now identify the genes that regulate the trait’s expression without doing any experiments.

Purdue University’s Kranthi Varala and 10 co-authors published the details of the new web-based regulatory gene discovery tool in the April 23 issue of Proceedings of the National Academy of Sciences. Varala has a patent pending on the results relating to economically important seed oil biosynthesis.

The Purdue-USDA team sought to build a resource that learns, from large amounts of publicly available data, to quickly identify which special genes, called transcription factors, regulate the expression of a given trait in various plant species.

“Every study focuses on a handful of them,” said Varala, assistant professor of horticulture and landscape architecture. “Our premise was that if we can put all of it into a single analysis, then we can use this data to build something global.”

Arabidopsis served as the PNAS study’s model plant, “but this approach has nothing specific to Arabidopsis,” Varala said. “The approach is general enough that you could start with a corn dataset. You could do it with rice, with tomato, whatever crop you’re working on as long as you have thousands of gene expression measurements that people have done. And there are over a dozen species now where we have tens of thousands of gene-expression studies.”

To prove the system works, the team focused on a genetic pathway that regulates how plants make and store oil in their seeds. The team picked that trait because of its importance in food and biofuel production, and because more than 300 of the genes involved are already known. 

By genetically manipulating a plant’s transcription factors, researchers can increase or decrease the amount of oil produced in its seeds.

Arabidopsis seedlings being cultivated for research to study the effects of specific genes on traits such as rate of growth, plant size etc.

Like other researchers, Varala has pursued many projects over the years where his goal was to identify the genes and regulators involved in solving one problem. This meant conducting careful, time-consuming experiments. But the data generated fell short of providing all the answers he sought. He compared it to working an equation knowing only three of the 10 factors involved.

“You can’t solve the equation,” he said. Likewise, Varala often wanted to ask more questions than the data could answer. That motivated him to build a framework that draws on all available data to produce a list of candidate genes for genetic validation, without first having to run all the relevant experiments.

“I’m trying to short-circuit the initial data collection phase,” Varala said, so that scientists can focus on conducting the genetic validations. But to do so, his team had to begin with a dataset based on 18,000 individual studies.

Varala and his team analyzed this massive dataset using the Bell and the now-retired Brown supercomputers at Purdue’s Rosen Center for Advanced Computing. The team built a machine-learning framework to speed the process for others.

It would be impossible for one person to do this manually. A team could do it, but that would introduce biases in how group members process the data. The machine-learning classifier operates without bias.

The novelty of the approach is that instead of pulling data related to all organs, it focuses on organ-specific datasets. Independent gene networks regulate these organs — leaves, roots, shoots, flowers and seeds.

“Instead of using all organs, we said, within the seed experiments that people have done over the years, can we use all the data to learn something that’s happening in the seed and not necessarily the root or the leaf or the flower? That improved our approach a lot.”

The team used a computational method called the inference approach to predict which transcription factors regulate the seed oil biosynthesis process in Arabidopsis.

“The ones we know help us validate that our approach is working correctly. The ones that we don’t know are good candidates for finding out new biology,” Varala said. “This purely computational approach knows nothing about seeds or oil or anything like that. We gave it a list of genes and it was able to rediscover the known ones without knowing any biological context.”
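As a rough illustration of what "give it a list of genes and let it rank regulators" can look like, here is a minimal sketch. It is not the team's method: it swaps in a plain correlation ranking for the machine-learning framework described in the article, and the expression matrix, the gene names (TF1 to TF6) and the function `rank_regulators` are all hypothetical, generated only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a seed-only expression matrix (samples x genes).
# Nothing here is real data; names and sizes are assumptions.
n_samples = 200
tf_names = [f"TF{i}" for i in range(1, 7)]  # hypothetical transcription factors
tf_expr = rng.normal(size=(n_samples, len(tf_names)))

# Hypothetical pathway genes, driven by TF2 and TF5 plus noise.
driver = tf_expr[:, 1] + tf_expr[:, 4]
pathway_expr = driver[:, None] + 0.5 * rng.normal(size=(n_samples, 10))


def rank_regulators(tf_expr, target_expr, names):
    """Rank TFs by mean absolute Pearson correlation with the target genes."""
    tf_z = (tf_expr - tf_expr.mean(axis=0)) / tf_expr.std(axis=0)
    tg_z = (target_expr - target_expr.mean(axis=0)) / target_expr.std(axis=0)
    corr = tf_z.T @ tg_z / tf_expr.shape[0]  # shape: TFs x target genes
    score = np.abs(corr).mean(axis=1)        # one score per TF
    order = np.argsort(score)[::-1]          # highest score first
    return [(names[i], float(score[i])) for i in order]


ranking = rank_regulators(tf_expr, pathway_expr, tf_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the two planted drivers come out on top, which mirrors the idea of checking a ranking against regulators that are already known before trusting the unfamiliar candidates.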

The lead author, Rajeev Ranjan, a postdoctoral researcher in the department of horticulture and landscape architecture at Purdue, took the other 12 of the top 20 and asked whether those predictions were true. “We were able to generate mutant lines for 11 of those 12. Five of those 11 do change the seed oil content,” he said. “Further, we also showed that overexpression of one factor increases seed oil up to 12%.”

Rajeev Ranjan, a postdoctoral researcher in horticulture and landscape architecture, analyzes genetically modified Arabidopsis seeds that have higher oil content to confirm that other agronomically important traits, including seed size and seed per fruit, are not negatively affected.

The eight known regulatory genes, added to the five new ones, showed that the inference approach accurately identified 13 of the top 20 candidates. The strength of the approach is that, working only from a list of genes, it can predict with high accuracy which ones will regulate a trait of interest.
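The validation counts can be tallied directly from the figures quoted in the article: eight of the top 20 were already known regulators, mutant lines were generated for 11 of the other 12, and five of those 11 changed seed oil content. The short sketch below only restates that arithmetic; the variable names are chosen for illustration.

```python
# Tally of the top-20 predictions, using only counts reported in the article.
top_candidates = 20
known_regulators = 8                          # already known to regulate seed oil
untested = top_candidates - known_regulators  # the "other 12"
mutant_lines = 11                             # candidates with mutant lines generated
newly_validated = 5                           # mutants that changed seed oil content

confirmed = known_regulators + newly_validated
print(f"confirmed: {confirmed} of {top_candidates} "
      f"({confirmed / top_candidates:.0%})")
print(f"validated among tested mutants: {newly_validated} of {mutant_lines} "
      f"({newly_validated / mutant_lines:.0%})")
```

The 13 confirmed candidates are the eight known regulators plus the five newly validated ones.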

“It took a long time to do because it’s a long, complicated process, and there was no guarantee that it would work,” said Varala of the four-year project. “Nothing on this scale had been attempted before.”

Varala has disclosed the innovation to the Purdue Innovates Office of Technology Commercialization, which has applied for a patent to protect his intellectual property.

This research was supported by the U.S. Department of Energy Office of Science.
