The structure of fitness landscapes is critical for understanding adaptive protein evolution. Previous empirical studies on fitness landscapes were confined to either the neighborhood around the wild-type sequence, involving mostly single and double mutants, or a combinatorially complete subgraph involving only two amino acids at each site. In reality, the dimensionality of protein sequence space is higher (20^L) and there may be higher-order interactions among more than two sites. Here we experimentally characterized the fitness landscape of four sites in protein GB1, containing 20^4 = 160,000 variants. We found that while reciprocal sign epistasis blocked many direct paths of adaptation, such evolutionary traps could be circumvented by indirect paths through genotype space involving gain and subsequent loss of mutations. These indirect paths alleviate the constraint on adaptive protein evolution, suggesting that the heretofore neglected dimensions of sequence space may change our views on how proteins evolve.
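To make the size of this sequence space concrete, the following minimal Python sketch enumerates every combination of the 20 standard amino acids at four sites and counts single-mutant neighbors and shortest ("direct") mutational paths. The function names and the example genotypes `AAAA` and `CCCC` are hypothetical placeholders chosen for illustration; they are not taken from the study. The sketch also assumes that a direct path changes each differing site exactly once, so two genotypes that differ at d sites are joined by d! direct paths.

```python
from itertools import product
from math import factorial

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
N_SITES = 4                           # four mutated sites, as in the abstract

# Every genotype in the four-site landscape: 20^4 sequences.
variants = ["".join(p) for p in product(AMINO_ACIDS, repeat=N_SITES)]

def hamming(a: str, b: str) -> int:
    """Number of sites at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def single_mutant_neighbors(genotype: str) -> list[str]:
    """All genotypes exactly one amino acid substitution away."""
    return [
        genotype[:i] + aa + genotype[i + 1:]
        for i in range(len(genotype))
        for aa in AMINO_ACIDS
        if aa != genotype[i]
    ]

def n_direct_paths(a: str, b: str) -> int:
    """Shortest paths from a to b: one ordering per permutation of differing sites."""
    return factorial(hamming(a, b))

if __name__ == "__main__":
    print(len(variants))                         # size of the sequence space
    print(len(single_mutant_neighbors("AAAA")))  # 4 sites x 19 alternatives
    print(n_direct_paths("AAAA", "CCCC"))        # orderings of 4 substitutions
```

An indirect path, in the sense used in the abstract, is any longer route through this graph that passes through genotypes carrying an amino acid found in neither endpoint, so a mutation is gained and later lost.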
Submitter: Marie Ary
Submission Date: Jan. 18, 2017, 4:55 p.m.
Number of data points: 458722
Assays/Quantities/Protocols: Experimental Assay: Count input; Experimental Assay: Count selected; Derived Quantity: Fitness (w); Derived Quantity: Imputed fitness
Libraries: observed variants for all possible combinations of mutants at four positions in positive epistatic region; imputed fitness of missing variants
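The record lists two count assays (input and selected) and a derived fitness (w) but does not state how w is computed. As a hedged sketch only, the Python function below assumes a common convention for selection experiments of this kind: the variant's enrichment (selected count divided by input count) normalized by the wild-type enrichment. The function name, argument names, and the counts in the usage lines are hypothetical and are not values from this dataset.

```python
def relative_fitness(
    input_count: int,
    selected_count: int,
    wt_input_count: int,
    wt_selected_count: int,
) -> float:
    """Enrichment of a variant relative to wild type.

    ASSUMPTION: w = (selected / input) / (wt_selected / wt_input).
    The record itself does not give the formula.
    """
    if input_count <= 0 or wt_input_count <= 0 or wt_selected_count <= 0:
        # Ratios are undefined without input reads or wild-type enrichment.
        raise ValueError("input counts and wild-type selected count must be positive")
    variant_enrichment = selected_count / input_count
    wt_enrichment = wt_selected_count / wt_input_count
    return variant_enrichment / wt_enrichment

if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    w = relative_fitness(input_count=100, selected_count=50,
                         wt_input_count=1000, wt_selected_count=1000)
    print(w)
```

Under this assumed definition, w = 1 means the variant is enriched as strongly as the wild type, and w < 1 means it is depleted relative to it. Variants with no input reads have no defined ratio, which is one reason a dataset might also carry imputed fitness values for missing variants.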