Identifying Patterns in Proteins
Contributors
This tutorial was created with funding from NSF-DUE (1022793, 1323414, 1725940) for the CREST program.
Last revision 1/2021

This Jmol Exploration was created using the Jmol Exploration Webpage Creator from the MSOE Center for BioMolecular Modeling.

version 2.0

High-throughput methods and advancing technology have led to exponential growth in the number of solved protein structures: as of mid-August 2020, over 168,000 structures had been deposited in the Protein Data Bank (PDB). One common scientific practice is to look for similarities among the members of a collection (pattern recognition) to see whether a shared pattern (structure) relates to their function. Scientists have identified a number of common patterns in proteins, which we will briefly explore here.

When you click on a button, the structure of that protein will appear at right. You can click and drag your mouse to rotate the protein and view it from all sides. In most cases, protein names and functions are not discussed, but each button includes a four-character code, which is the protein's accession code (PDB ID) in the PDB. If you wish to dig deeper, you can learn more about any given protein by entering its PDB ID at the PDB.
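For readers who prefer to look up entries from a script, the sketch below turns a four-character PDB ID into a download address. It assumes that RCSB serves PDB-format files under `https://files.rcsb.org/download/`; the function name `pdb_download_url` is made up for this illustration, and the address pattern should be checked against the current RCSB documentation before relying on it.

```python
import re

# Assumption: RCSB serves legacy PDB-format files at this address.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_download_url(pdb_id: str) -> str:
    """Return a download URL for a four-character PDB ID such as '1zaa'."""
    cleaned = pdb_id.strip().upper()
    # Classic PDB IDs are one digit followed by three letters or digits.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", cleaned):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=cleaned)


# PDB IDs taken from the buttons in this exploration.
for pdb_id in ["1lrv", "1prn", "1erj", "1fua", "1zaa"]:
    print(pdb_id, pdb_download_url(pdb_id))
```

The function only builds the address and does not contact the network, so it can be tried without an internet connection.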

Protein Folds

Many proteins adopt one of a limited number of common protein folds, some of which have rather descriptive names. Occasionally, a protein will be described as having a 'unique' fold, meaning that at the time it was discovered, it was the only known protein to adopt that fold. As more proteins are discovered that adopt the same 'unique' fold, they are often grouped together, and the fold may be named after the first protein identified with it. For instance, a number of proteins involved in immunity share a similar fold, known as the immunoglobulin fold. Although this name does not describe the shape of the fold, it identifies a protein that has it. Click the buttons below to view examples of protein folds.

alpha horseshoe PDB ID: 1lrv
beta barrel PDB ID: 1prn
beta propeller PDB ID: 1erj
alpha/beta 3-layer PDB ID: 1fua

Structural Motifs

A structural motif is a smaller region of a protein with characteristic primary or secondary structural features, which are often related to a particular function. A leucine zipper consists of two parts: a basic region that interacts with DNA and an α-helix with a leucine at every seventh residue. Leucine zippers are often found in transcription factors that interact with DNA, and they very often function as dimers. Another motif that interacts with DNA is the zinc finger, whose structure we will explore in a little more depth below.
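The "leucine every seven residues" rule can be checked directly on a one-letter amino acid sequence. The following is a minimal sketch, not a validated motif finder: the function name `heptad_leucine_runs`, the `min_repeats` threshold, and the toy sequence are all invented for illustration, and real leucine zippers tolerate substitutions that this strict test would miss.

```python
def heptad_leucine_runs(sequence: str, min_repeats: int = 4) -> list[int]:
    """Return 0-based start positions of leucines repeated every 7 residues.

    A position is reported when it and the next (min_repeats - 1) positions
    spaced seven residues apart are all leucine ('L').
    """
    seq = sequence.upper()
    span = 7 * (min_repeats - 1)  # distance from the first to the last leucine
    starts = []
    for i in range(len(seq) - span):
        if all(seq[i + 7 * k] == "L" for k in range(min_repeats)):
            starts.append(i)
    return starts


# Hypothetical toy sequence (not a real protein): leucines at 4, 11, 18, 25.
toy = "AAAAL" + "AAAAAAL" * 3 + "AAA"
print(heptad_leucine_runs(toy))                 # [4]
print(heptad_leucine_runs(toy, min_repeats=3))  # [4, 11]
```

Lowering `min_repeats` reports more candidate start positions, at the cost of more chance matches in leucine-rich sequences.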

Zinc Fingers

David Goodsell explores protein structure-function in a regular feature of the PDB called Molecule of the Month. One of the explorations is on Zinc Fingers. You should read this article before proceeding.

Common Features of All Zinc Fingers

A zinc finger consists of an antiparallel two-stranded β sheet and a single α helix. This structure is stabilized by a single zinc atom that is coordinated (held in place) by interactions with four histidine and/or cysteine residues strategically positioned on the helix and β strands. Residues on the inside of the fold, between the helix and sheet, often form a hydrophobic core (look for phenylalanine and leucine residues). At the turn between the helix and sheet, there is often an arginine residue that points away from the structure, much like a finger pointing away from a hand. This residue (arginine or otherwise) reaches in and interacts with the nucleotides in the major groove of DNA. This is a sequence-specific interaction, and thus zinc finger motifs are often found in transcription factors. Very often, a series of zinc finger motifs occurs linearly in a single protein, and the zinc fingers wrap around the DNA, recognizing multiple regions simultaneously.
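Because the four zinc-coordinating residues sit at fairly regular spacings, candidate zinc fingers can be spotted in a sequence with a simple pattern search. The sketch below covers only the two-cysteine, two-histidine (C2H2) arrangement and assumes the spacing C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, modelled on signatures used in motif databases. The function name and the demo sequence are hypothetical, and a match is a hint to inspect the structure, not proof of a zinc finger.

```python
import re

# Assumed spacing: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.
# The four captured groups are the zinc-coordinating Cys and His residues.
C2H2 = re.compile(r"(C).{2,4}(C).{3}[LIVMFYWC].{8}(H).{3,5}(H)")


def find_c2h2_fingers(sequence: str):
    """Yield (matched_text, ligand_positions) for each candidate finger.

    ligand_positions are 0-based indices of the two Cys and two His.
    """
    for m in C2H2.finditer(sequence.upper()):
        yield m.group(0), [m.start(g) for g in range(1, 5)]


# Hypothetical consensus-style sequence written for illustration only.
demo = "TGEKPYKCPECGKSFSQKSNLQKHQRTHTG"
for text, ligands in find_c2h2_fingers(demo):
    print(text, ligands)  # CPECGKSFSQKSNLQKHQRTH [7, 10, 23, 27]
```

The hydrophobic position in the middle of the pattern corresponds to the core residues described above, which is why the search requires it and not just the four ligands.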

Exploring Zif268, a Transcription Factor with Three Zinc Fingers

Zif268 is a mouse transcription factor that affects the rate of transcription of the genes with which it interacts. It contains three zinc fingers, which were crystallized together with a short stretch of DNA in PDB entry 1zaa.

Zif268 zinc fingers and DNA PDB ID: 1zaa

In this depiction, the DNA backbone is green, helices are salmon, and sheets are light yellow. Arginine residues that interact with the DNA are displayed in ball and stick and colored in CPK (carbon is gray, nitrogen is blue). The cysteine and histidine residues that coordinate the zinc atoms (in brick red) are also displayed in ball and stick and colored in CPK (sulfur is yellow). Note how the protein wraps around the DNA and interacts in the major groove. Next, let's look more closely at the structure of a single zinc finger.

First zinc finger from transcription factor Zif268 PDB ID: 1zaa

Click the buttons below to identify various features in this zinc finger.

zinc atom PDB ID: 1zaa
hydrophobic core PDB ID: 1zaa
histidines and cysteines that coordinate zinc PDB ID: 1zaa
arginine residue that interacts with DNA PDB ID: 1zaa
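The histidines and cysteines that coordinate zinc can also be located from a coordinate file by measuring distances to each zinc atom. The sketch below parses ATOM and HETATM records using the fixed column layout of the PDB format and reports Cys and His atoms within a cutoff of each zinc. The 2.8 Å cutoff, the function names, and every coordinate and residue number in the demo records are assumptions invented for illustration; none of them are taken from 1zaa.

```python
import math


def parse_atoms(pdb_lines):
    """Parse ATOM/HETATM records using the fixed PDB column layout."""
    atoms = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "name": line[12:16].strip(),
                "res": line[17:20].strip(),
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
                "element": line[76:78].strip().upper(),
            })
    return atoms


def zinc_ligands(atoms, cutoff=2.8):
    """Map each zinc (chain, resseq) to Cys/His atoms within cutoff angstroms."""
    ligands = {}
    for zn in (a for a in atoms if a["element"] == "ZN"):
        near = [
            (a["res"], a["resseq"], a["name"])
            for a in atoms
            if a["res"] in ("CYS", "HIS") and math.dist(a["xyz"], zn["xyz"]) <= cutoff
        ]
        ligands[(zn["chain"], zn["resseq"])] = sorted(near, key=lambda t: t[1])
    return ligands


def demo_line(record, serial, name, res, chain, resseq, x, y, z, element):
    """Format one coordinate record; all values passed in are invented."""
    return (f"{record:<6}{serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}          {element:>2}")


# Hypothetical records: one zinc, four ligand atoms, two atoms out of range.
demo = [
    demo_line("HETATM", 1, "ZN", "ZN", "A", 201, 0.0, 0.0, 0.0, "ZN"),
    demo_line("ATOM", 2, "SG", "CYS", "A", 7, 2.3, 0.0, 0.0, "S"),
    demo_line("ATOM", 3, "CB", "CYS", "A", 7, 3.3, 1.5, 0.0, "C"),
    demo_line("ATOM", 4, "SG", "CYS", "A", 10, 0.0, 2.3, 0.0, "S"),
    demo_line("ATOM", 5, "CZ", "PHE", "A", 14, 5.0, 5.0, 0.0, "C"),
    demo_line("ATOM", 6, "NE2", "HIS", "A", 23, 0.0, 0.0, 2.1, "N"),
    demo_line("ATOM", 7, "NE2", "HIS", "A", 27, -2.1, 0.0, 0.0, "N"),
]
print(zinc_ligands(parse_atoms(demo)))
```

To try this on a real structure, pass the lines of a downloaded PDB-format file to `parse_atoms` in place of `demo`, and compare the reported residues with those highlighted by the buttons.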