Characteristics of EF-hands and EF-hand proteins
In the simplest scheme a calcium modulated protein is in the apo
form in a quiescent cell with [Ca2+] ~10^-7.2 M. Following cell
stimulation, [Ca2+] rises to ~10^-5.8 M and the protein binds a Ca2+
ion in its EF-hand(s). This induces a change in conformation enabling
the EF-hand protein to bind and activate a target enzyme. There are many
known, and certainly more yet to be discovered, variations or
elaborations on this scheme.
1. By far the most egregious approximation is the simple model of
pCa(out) ~ 2.8, pCa(cyt) ~ 7.2 in a quiescent cell, then a stimulus with
[Ca2+] rising to pCa(cyt) ~ 5.8, then falling back to ~7.2 with relaxation.
Many cells maintain a tonic activity and are seldom maximally stimulated
or fully quiescent. Further, [Ca2+] varies across the cytosol with
distance and time. Of special relevance, there are constant leaks of
calcium into the cytosol through the membranes of the cell surface and
of the endoplasmic reticulum, balanced by ever-active calcium extrusion
pumps. As a result [Ca2+] is higher near these membranes than the
average value in the cytosol.
2. Although we speak of apo and
calcium forms, the situation is more complex. In most EF-hand domains
the affinity for Ca2+ is about 10^4 times greater than that for Mg2+.
With [Mg2+] in the cytosol near constant, pMg ~2.8, this EF-hand
domain would be primarily in the apo form in the quiescent cell. In
contrast if the pKd(Ca2+) ~ 6.8 and pKd(Mg2+) ~ 2.8, then this EF-hand
would be close to half in the magnesium form in the resting cell. At
higher affinity for divalent cations, pKd(Ca2+) ~ 7.4 and pKd(Mg2+) ~
3.4, the EF-hand would be partially in the calcium and partially in the
magnesium form in the resting cell. For instance EF-hands 3 and 4 of troponin C (TNC)
appear to be mostly in the calcium form in the resting muscle cell and
serve a structural function while EF-hands 1 and 2 are mostly apo and,
following stimulation, function to transduce information. The actual
functions of calcium modulated proteins are exquisitely dependent on
their pKd(Ca2+) and pKd(Mg2+) values and on the spatial and temporal
distributions of the calcium pulse. This fine tuning is not captured
in simple on/off diagrams.
3. The target of an EF-hand, calcium
modulated protein need not be an enzyme. Proto-oncogene Cbl protein
(CBL) binds to DNA. Alpha-actinin (ACTN), fodrin (FDRN), and fimbrin
(FIMB) interact with thin filaments. Trichohyalin (HYFL) interacts with
keratin-based intermediate filaments.
4. A single EF-hand protein may
interact with multiple targets. Calmodulin (CAM) regulates over twenty
enzymes or structural proteins.
5. An EF-hand protein may be free in
the cytosol in the apo form and associate with its target in the
calcium form. Alternatively the EF-hand protein, such as calcineurin B
(CLNB), may bind to its target, calcineurin A (a protein phosphatase),
in both apo and calcium forms.
6. This permanent attachment to a target may be
taken one step further. Thirty-one of the 66 subfamilies are
heterochimeras; the gene encoding EF-hands has fused with the gene
encoding other domains. Of these 31, functions are known for only
fourteen (four kinases, a phosphatase, a lipase, a dehydrogenase, a
protease, three that interact with the cytoskeleton, and one each that
binds to a ryanodine receptor, a repressor, and DNA).
7. All of the cytosolic
EF-hand proteins are inferred to be calcium modulated. However, they
need not be directly involved in information transduction. Several
appear to be involved in temporal buffering of calcium, such as
parvalbumin (PARV), and/or in calcium transport, such as ICBP. In a resting
muscle cell PARV, with relatively high affinity for divalent cations,
is (primarily) in the magnesium form. EF-hands 1 and 2 of TNC have low
affinity for divalent cations and are primarily apo. Following
stimulation the incoming calcium binds first to TNC, which has lower
affinity for calcium than does PARV, because it takes a few
milliseconds for the Mg2+ to dissociate from PARV. The Ca2+ then
diffuses to the apo sites on PARV, thereby facilitating and
sharpening the relaxation process and queueing the Ca2+ for the calcium
extrusion pump [3].
8. The preceding overview of functions assumes the
existence of EF-hands as calcium modulated proteins in the cytosol.
Reticulocalbin (RTC) has a leader sequence typical of proteins found
within the lumen of the endoplasmic reticulum and glycerol phosphate
dehydrogenase (GPD) is found on the outer surface of the inner
mitochondrial membrane. Osteonectin (BM40) and QR1 are extracellular
in an environment whose pCa is near constant ~ 2.9. One EF-hand of
BM40 binds calcium, supposedly with high affinity, but not with any
modulation. This appears to be an example of Nature's having taken a
protein initially "designed" for one function and put it to use in
extracellular stabilization. S100 and PARV are sometimes found
extracellularly. Whether this reflects a normal function or pathology
has yet to be determined.
9. Although bacteria extrude calcium to
maintain pCa ~ 7.0 in the cytosol, there is no evidence of their using
calcium as a cytosolic messenger. There is one example of an EF-hand
protein (CMSE) in a prokaryote, Saccharopolyspora, perhaps transduced
from a eukaryote. An EF-hand protein (MSV) is encoded by a virus of the
Entomopoxvirinae.
A great deal of variation in structure is realized
from this seemingly simple, thirty residue domain. When convoluted
with the finely shaped pulse(s) of messenger calcium, a rich tapestry
of information can be transduced by EF-hand proteins.
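The occupancies discussed in point 2 above can be estimated with a short calculation. The sketch below assumes a single site with simple competitive binding of Ca2+ and Mg2+ and no cooperativity; the function name and the binding model are illustrative assumptions, not taken from the text. The pCa, pMg, and pKd values are those quoted above.

```python
def site_fractions(pCa, pMg, pKd_Ca, pKd_Mg):
    """Return (apo, Ca-bound, Mg-bound) fractions of one site.

    Assumes Ca2+ and Mg2+ compete for the same site, with no cooperativity.
    """
    ca = 10.0 ** (pKd_Ca - pCa)  # [Ca2+] / Kd(Ca2+)
    mg = 10.0 ** (pKd_Mg - pMg)  # [Mg2+] / Kd(Mg2+)
    total = 1.0 + ca + mg
    return 1.0 / total, ca / total, mg / total


PMG = 2.8  # cytosolic pMg, taken as constant
for label, pCa in (("resting", 7.2), ("stimulated", 5.8)):
    for pKd_Ca, pKd_Mg in ((6.8, 2.8), (7.4, 3.4)):
        apo, ca, mg = site_fractions(pCa, PMG, pKd_Ca, pKd_Mg)
        print(f"{label:10s} pKd(Ca)={pKd_Ca} pKd(Mg)={pKd_Mg} "
              f"apo={apo:.2f} Ca={ca:.2f} Mg={mg:.2f}")
```

Under these assumptions the site with pKd(Ca2+) ~ 6.8 and pKd(Mg2+) ~ 2.8 is divided about equally between apo and magnesium forms at rest, and is mostly in the calcium form at pCa ~ 5.8. A real EF-hand pair may deviate from this because of cooperativity and the kinetic effects described in point 7.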
Most functions of EF-hand proteins
involve, or are inferred to involve, calcium binding. However, 81 of
the 247 domains of the 66 subfamilies do not bind calcium. In some
domain subfamilies, indicated in table 1 by +/-, some representatives do
and some do not bind calcium. For a few, one is reluctant to predict,
e.g. ?/+. Four subfamily domains have non-canonical Ca2+ coordination,
indicated by a, b, c, or d in table 1 and illustrated in figure 1.
The important point is that one should not assume that all EF-hand
domains bind calcium nor that one can always predict calcium binding,
let alone affinity, from amino acid sequence.
The consensus sequence shown in figure 1 is a valuable
heuristic for illustrating and understanding the main
characteristics of an EF-hand. We emphasize, however, that the most
sensitive search involves testing a candidate protein against a large
data base, not against the heuristic. Such standard searches involve
several complications. One EF-hand is so short that a heterochimeric
protein will find homologs with higher significance based on its
non-EF-hand regions. The portions of the test sequence that are
identified as similar to other proteins should be deleted and the
remaining sequence subjected to a new search. A second caution is that
the proteins in the data base that are most similar to the test
sequence (or to its remainder) may not be indicated in the data base
descriptor as containing EF-hands, hence the value of table 1.
Conversely, one can search the data base, including the test sequence of
interest, with a known EF-hand. Again, because one EF-hand is so short
and because there is such a range of EF-hand sequences, one is well
advised to search with several disparate EF-hands and/or to search
with pairs of EF-hands. It is important to compare the significance of
any alignment with the distribution of alignment scores from the entire
data base.
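The comparison of one alignment score with the distribution of scores from the whole data base can be sketched as follows. This is a minimal illustration, not the search procedure used here: the scores are hypothetical, and the function names are our own. A z-score is only a rough guide, since alignment scores generally follow an extreme value rather than a normal distribution, so the empirical fraction is reported alongside it.

```python
from statistics import mean, stdev


def z_score(hit_score, background_scores):
    """Number of sample standard deviations the hit lies above the mean."""
    mu = mean(background_scores)
    sigma = stdev(background_scores)
    return (hit_score - mu) / sigma


def fraction_at_least(hit_score, background_scores):
    """Fraction of data base scores that equal or exceed the hit score."""
    n = sum(1 for s in background_scores if s >= hit_score)
    return n / len(background_scores)


# Hypothetical alignment scores of one query against ten data base entries.
background = [22, 25, 19, 30, 27, 24, 21, 26, 23, 28]
candidate = 48  # hypothetical score of the alignment under examination
print(z_score(candidate, background), fraction_at_least(candidate, background))
```

In practice the background would be the scores of the query against every entry in the data base, and a search with several disparate EF-hands would repeat the comparison for each query.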
The canonical calcium binding loop of the EF-hand is best
regarded as a reference point for evaluating the numerous
variations. Positions X (residue 10), Y (12), Z (14), -Y (16),
-X (18), and -Z (21) represent the vertices of an octahedron. However,
the Ca2+ ion is seven coordinate in a pentagonal bipyramid, with major
axis along X. Six amino acids are involved in Ca2+ coordination.
Since both oxygen atoms of the carboxylate group at -Z bind, the
coordination number is seven. Usually Asp or Asn is found at X and Y;
Asp, Asn, or Ser at Z; the carbonyl oxygen of a variety of residues is
at -Y; -X is more variable but usually Asp, Asn, or Ser; and usually Glu
at -Z. Although one can make a scheme that correlates calcium affinity
with distribution of residues, we know of no scheme that predicts
affinities for EF-hand loops that were not included in making the
scheme. This is because the free energy of the system depends on the
(change in) conformation of the entire protein, not just the loop in
question.
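The residue numbering of the six vertices can be made concrete with a small lookup. The sketch below reads the residues at positions 10, 12, 14, 16, 18, and 21 of a domain sequence and lists the vertices that depart from the usual residues named above. The function names and the 29-residue example string are illustrative assumptions; as stressed above, such a check describes the loop and does not predict calcium binding or affinity.

```python
# Domain residue numbers (1-based) of the six coordinating vertices.
LIGAND_POSITIONS = {"X": 10, "Y": 12, "Z": 14, "-Y": 16, "-X": 18, "-Z": 21}

# Usual side chains at each vertex; None means any residue is acceptable
# because -Y coordinates through a main-chain carbonyl oxygen.
USUAL = {
    "X": set("DN"),
    "Y": set("DN"),
    "Z": set("DNS"),
    "-Y": None,
    "-X": set("DNS"),
    "-Z": set("E"),
}


def loop_ligands(domain_seq):
    """Map each vertex name to the one-letter residue found at its position."""
    return {v: domain_seq[pos - 1] for v, pos in LIGAND_POSITIONS.items()}


def deviations(domain_seq):
    """List the vertices whose residue is not among the usual ones."""
    ligands = loop_ligands(domain_seq)
    return [v for v, res in ligands.items()
            if USUAL[v] is not None and res not in USUAL[v]]


# Illustrative 29-residue domain; the loop begins at residue 10.
example = "EFKEAFSLF" + "DKDGDGTITTKE" + "LGTVMRSL"
print(loop_ligands(example))
print(deviations(example))
```

For this example string only the -X vertex (Thr) falls outside the usual set, which is in keeping with the remark that -X is the more variable position.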
Little is known of the coordination of the Mg2+ ion in EF-hand
loops. However, given the many precedents [4] from the
structures of small molecules, one can safely infer that the Mg2+ ion
will be six coordinate. Probably the bidentate carboxylate at -Z
rotates to become monodentate and the other oxygen atoms are ~0.2 Å
closer to the Mg2+ than to the Ca2+.
Three (a, b, and c of table 1) of the non-canonical calcium
coordinations involve several carbonyl oxygen atoms in place of
the usual oxygen-containing side chains. None of these coordinations
was predicted; hence one should be cautious about a prediction (without
crystal structure) of non-binding, "-", in table 1. The fourth
exception, d in table 1 for CBL, is the use of the side chain of a Glu
from a nearby α-helix instead of the Ser at the -X vertex in the second
EF-hand loop, which by sequence appears to be canonical.
The angle between helix E and helix F varies between apo and
calcium bound forms. This is important because the targets of calcium
modulated proteins such as CAM, discussed in the next two sections, are
alpha-helices that fit into the groove between the
two EF-hands of a pair. The accessibility and hydrophobicity of this
groove depend critically on the interhelical angles of the two
EF-hands. To a first approximation the interhelical angle, see figure 2
and legend, is closed in the apo form and more open in the calcium
form. However, we emphasize four points. First, the angles associated
with closed forms vary over a broad range, as do the angles of open
conformations. Second, the change in orientation associated with
calcium binding varies. Third, several EF-hands of ELC and of RLC do
not bind calcium. Their conformations are invariant to [Ca2+] but
important to their functions. They cannot be simply classified as open
or closed [5]. Finally, many of these EF-hands are, as will be
discussed, in the magnesium, not the apo form in the quiescent cell; we
have little information about the conformations of magnesium EF-hands.
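As a simple illustration of an interhelical angle, the sketch below computes the angle between two helix axis vectors. It assumes that each axis has already been fitted to the helix coordinates and is supplied as a direction vector; the convention of figure 2 and its legend is not reproduced here, and the example vectors are hypothetical.

```python
import math


def interhelical_angle(axis_e, axis_f):
    """Angle in degrees between two helix axis direction vectors."""
    dot = sum(a * b for a, b in zip(axis_e, axis_f))
    norm_e = math.sqrt(sum(a * a for a in axis_e))
    norm_f = math.sqrt(sum(b * b for b in axis_f))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_e * norm_f)))
    return math.degrees(math.acos(cosine))


# Hypothetical axis vectors for helix E and helix F of one EF-hand.
helix_e = (1.0, 0.0, 0.0)
helix_f = (0.0, 1.0, 0.0)
print(interhelical_angle(helix_e, helix_f))
```

A single angle does not fully specify the relative disposition of two helices, which is one reason that closed and open conformations span broad ranges rather than two discrete values.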
With only a few exceptions EF-hands occur in adjacent pairs.
The fifth EF-hands of milli- and micro-calpains (CALP), and probably
those of the close homolog, sorcin (SORC), pair to form a dimer. The
N-terminal domain of PARV, indicated as #2 in table 1, covers the
hydrophobic surface of the 3, 4 pair.
Just as there is a great range of angles between helix E and
helix F of a single EF-hand, so there is also a range of
relationships between the first (ODD numbered) and second (EVEN)
EF-hand. Both, either, or neither might bind calcium. When both bind
calcium they may do so cooperatively but this is still subject to
debate.
There are three general patterns among the fourteen subfamilies
of known structure. In the first, the EF-hand protein is involved in
information transduction and an α-helix of the target protein lies in
the hydrophobic groove between two EF-hands. This groove, in CAM and in
TNC, is more exposed when the two EF-hands are in the calcium forms and
their respective helices E and F are more open. In the second pattern
an α-helix of a chimeric protein lies within the ODD, EVEN groove of
the same protein. This self binding is inferred to be, at least in
some instances, a component of the information transduction pathway.
In the third pattern the EF-hand protein is involved in calcium
buffering or transport and the ODD, EVEN groove is either covered as in
PARV by EF-hand 2 or partially occluded by the 1,2 loop as in ICBP.
One can safely anticipate that other patterns of function will be
revealed as more structures are determined.
Congruence is an
important characteristic in assigning proteins to subfamilies and in
grouping subfamilies such as CTER and CPV. By definition all members
of a subfamily must be congruent, as illustrated in the dendrogram
computed from the 56 domains of fourteen TNC's from ten different
species, figure 4. The dendrogram illustrates the two essential
characteristics of congruence. All domains 1 are more closely related
to one another than to other domains; correspondingly all domains 2
cluster together as do domains 3 and 4. Second, the distribution of
domains within each of the dendrograms of the four subdomains is
(nearly) identical. An additional interesting characteristic, not
inherent to the concept of congruence, is that the cluster of domains 1
is most closely related to domains 3 and the domains 2 are more closely
related to domains 4. This reflects a gene duplication and fusion of
an ODD, EVEN pair of EF-hands in the
ancestor of all animals that have TNC.
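The first characteristic of congruence can be expressed as a test on pairwise distances between domains. The sketch below is a simplified proxy, not the dendrogram method used for figure 4: it assumes a symmetric matrix of distances and checks that every domain is closer to all domains of the same number than to any domain of a different number. The function name, labels, and distances are hypothetical.

```python
def clusters_by_domain_number(labels, dist):
    """Return True if same-numbered domains are always closer to one another.

    labels: list of (protein, domain_number) tuples, one per domain.
    dist:   square list of lists; dist[i][j] is the distance between
            domain i and domain j.
    """
    for i, (_, dom_i) in enumerate(labels):
        same = [dist[i][j] for j, (_, d) in enumerate(labels)
                if j != i and d == dom_i]
        other = [dist[i][j] for j, (_, d) in enumerate(labels) if d != dom_i]
        if same and other and max(same) >= min(other):
            return False
    return True


# Hypothetical example: two proteins, A and B, each with domains 1 and 2.
labels = [("A", 1), ("B", 1), ("A", 2), ("B", 2)]
dist = [
    [0, 1, 5, 6],
    [1, 0, 6, 5],
    [5, 6, 0, 1],
    [6, 5, 1, 0],
]
print(clusters_by_domain_number(labels, dist))
```

The second characteristic, that the branching order of species is (nearly) the same within each domain cluster, would require comparing tree topologies and is not covered by this check.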
Correspondingly, if two subfamilies are congruent, the
dendrograms computed from their constituent proteins will be
(nearly) identical to those computed from the corresponding domains of
those constituent proteins. For instance all ten subfamilies within
CTER are inferred to have evolved from a common, four domain precursor
by gene duplication and subsequent divergent evolution (figure 5).