Biologists glean insight into repetitive protein sequences: A computational analysis reveals that many repetitive sequences are shared across proteins and are similar in species from bacteria to humans

About 70 percent of all human proteins include at least one sequence consisting of a single amino acid repeated many times, with a few other amino acids sprinkled in. These “low-complexity regions” (LCRs) are also found in most other organisms.
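
To make that definition concrete, here is a minimal Python sketch that flags stretches of a protein sequence dominated by one amino acid. It is only an illustration of the idea, not the method used in the study: the function name `find_repetitive_windows`, the 12-residue window, the 75 percent threshold, and the toy sequence are all assumptions chosen for the example.

```python
from collections import Counter


def find_repetitive_windows(sequence, window=12, min_fraction=0.75):
    """Return (start, end, residue) spans dominated by one amino acid.

    The window size and threshold are illustrative values,
    not parameters taken from the study.
    """
    spans = []
    for start in range(len(sequence) - window + 1):
        chunk = sequence[start:start + window]
        # Most frequent amino acid in this window and how often it occurs.
        residue, count = Counter(chunk).most_common(1)[0]
        if count / window < min_fraction:
            continue
        end = start + window
        # Merge with the previous span when the windows touch or overlap
        # and are dominated by the same residue.
        if spans and spans[-1][2] == residue and start <= spans[-1][1]:
            spans[-1] = (spans[-1][0], end, residue)
        else:
            spans.append((start, end, residue))
    return spans


# Hypothetical toy sequence: a lysine-rich stretch flanked by mixed residues.
toy = "MSTAVLERG" + "KKKKEKKKKDKKKKKK" + "LTPAGWCYHN"
print(find_repetitive_windows(toy))  # prints [(8, 27, 'K')]
```

Because the check works on fixed-size windows, the reported span is approximate and can include a residue or two of the flanking sequence. The "sprinkled in" glutamate (E) and aspartate (D) do not break the lysine-rich region, which is the behavior the definition above describes.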

The proteins that contain these sequences have many different functions, but MIT biologists have now come up with a way to identify and study them as a unified group. Their technique allows them to analyze similarities and differences between LCRs from different species, and helps them to determine the functions of these sequences and the proteins in which they are found.

Using their technique, the researchers have analyzed all of the proteins found in eight different species, from bacteria to humans. They found that while LCRs can vary between proteins and species, they often share a similar role — helping the protein in which they’re found to join a larger-scale assembly such as the nucleolus, an organelle found in nearly all human cells.

“Instead of looking at specific LCRs and their functions, which might seem separate because they’re involved in different processes, our broader approach allows us to see similarities between their properties, suggesting that maybe the functions of LCRs aren’t so disparate after all,” says Byron Lee, an MIT graduate student.

The researchers also found some differences between LCRs of different species and showed that these species-specific LCR sequences correspond to species-specific functions, such as forming plant cell walls.

Lee and graduate student Nima Jaberi-Lashkari are the lead authors of the study, which appears today in eLife. Eliezer Calo, an assistant professor of biology at MIT, is the senior author of the paper.

