The Generation of Diversity for Antigen Recognition
We know that
the immune system has to be capable of recognizing virtually any pathogen that
has arisen or might arise. The awesome genetic solution to this problem of anticipating
an unpredictable future involves the generation of millions of different
specific antigen receptors, probably vastly more than the lifetime needs of the
individual. As this greatly exceeds the estimated 25 000–30 000 genes
in the human genome, there must be some clever ways to generate all this diversity,
particularly as the total number of V, D, J, and C genes
in an individual human coding for antibodies and TCRs is only around 400. Let’s
revisit the genetics of antibody diversity, and explore the enormous
similarities, and occasional differences, seen with the mechanisms employed to
generate TCR diversity.
Intrachain amplification of diversity
Random VDJ combination increases diversity geometrically
We saw in
Chapter 3 that, just as we can use a relatively small number of different
building units in a child’s construction set such as LEGO® to create a rich
variety of architectural masterpieces, so the individual receptor gene segments
can be viewed as building blocks to fashion a multiplicity of antigen specific
receptors for both B‐ and T‐cells. The immunoglobulin light chain variable
regions are created from V and J segments, and the heavy chain
variable regions from V, D, and J segments. Likewise, for
both the αβ and γδ T‐cell receptors the variable region of one of the chains (α
or γ) is encoded by a V and a J segment, whereas the variable
region of the other chain (β or δ) is additionally encoded by a D segment.
As for immunoglobulin genes, the enzymes RAG‐1 and RAG‐2 recognize recombination
signal sequences (RSSs) adjacent to the coding sequences of the TCR V, D,
and J gene segments. The RSSs again consist of conserved heptamers and
nonamers separated by spacers of either 12 or 23 base‐pairs and are found at
the 3′ side of each V segment, on both the 5′ and 3′ sides of each D segment,
and at the 5′ side of each J segment. For the β and δ chains, a D segment is
always included in the rearrangement; Vβ cannot join directly to Jβ, nor Vδ
directly to Jδ. To see how sequence diversity is generated for TCR, let us take
the αβ TCR as an example (Table 4.2). Although the precise number of gene
segments varies from one individual to another, there are typically around 75 Vα
gene segments and 60 Jα gene segments. If there were entirely random
joining of any one V to any one J segment, we would have
the possibility of generating 4500 VJ combinations (75 × 60). Regarding
the TCR β‐chain, there are approximately 50 Vβ genes that lie upstream
of two clusters of DβJβ genes, each of which is associated with a Cβ gene (Figure
4.11). The first cluster, associated with Cβ1, has a single Dβ1 gene and 6 Jβ1
genes, whereas the second cluster associated with Cβ2 again has a single Dβ
gene (Dβ2) with 7 Jβ2.
The Dβ1
segment can combine with any of the 50 Vβ genes and with any of the 13 Jβ1
and Jβ2 genes (Figure 4.11). Dβ2 behaves similarly but can only
combine with one of the 7 downstream Jβ2 genes. This provides
1000 different possible VDJ combinations for the TCR β‐chain. Therefore,
although the TCR α and β chain V, D, and J genes add up
arithmetically to just 200, they produce a vast number of different α and β
variable regions by geometric recombination of the basic
elements. But, as with immunoglobulin gene rearrangement, that is only the
beginning.
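The arithmetic in this passage can be checked with a short sketch. The segment counts are the approximate figures quoted above (they vary between individuals), and the variable names are illustrative only.

```python
# Approximate gene segment counts quoted in the text.
V_ALPHA, J_ALPHA = 75, 60
V_BETA = 50
J_BETA1, J_BETA2 = 6, 7  # J segments in the Cβ1 and Cβ2 clusters
D_BETA = 2               # Dβ1 and Dβ2

# α chain: any V segment joined to any J segment.
alpha_vj = V_ALPHA * J_ALPHA

# β chain: Dβ1 can use all 13 J segments, Dβ2 only the 7 downstream Jβ2 segments.
beta_vdj = V_BETA * (J_BETA1 + J_BETA2) + V_BETA * J_BETA2

# Arithmetic total of the building blocks themselves.
segment_total = V_ALPHA + J_ALPHA + V_BETA + D_BETA + J_BETA1 + J_BETA2

print(alpha_vj, beta_vdj, segment_total)  # 4500 1000 200
```

The contrast between the sum (200 segments) and the products (4500 and 1000 combinations) is what is meant by an arithmetic versus a geometric increase.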
Figure
4.11 Rearrangement of the T‐cell receptor β‐chain gene locus. In this example
Dβ1 has rearranged to Jβ2.2, and then the Vβ2 gene has been selected out of the 50 or so
(Vβn) Vβ genes. If the same V and D segments had been used, but this time Jβ1.4
had been employed, then the Cβ1 gene segment would have been utilized instead
of Cβ2.
Playing with the junctions
Another ploy
to squeeze more variation out of the germline repertoire that is used by both
the TCR and the immunoglobulin genes (see Figure 3.25) involves variable
boundary recombinations of V, D, and J to produce
different junctional sequences (Figure 4.12).
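A minimal sketch of how a variable boundary can change the encoded amino acid is given below. It assumes a single codon at which the join may fall after 0, 1, 2, or 3 nucleotides of the V segment, with the remainder supplied by the J segment. The codons CCC and TGG are chosen purely for illustration and are not taken from Figure 4.12; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def boundary_variants(v_codon, j_codon):
    """Codons formed when k nucleotides come from V and 3 - k from J (k = 0..3)."""
    variants = []
    for k in range(4):
        codon = v_codon[:k] + j_codon[k:]
        variants.append((codon, CODON_TABLE[codon]))
    return variants

# Illustrative germline codons at the V and J boundary (assumed, not from the figure).
for codon, amino_acid in boundary_variants("CCC", "TGG"):
    print(codon, amino_acid)
# TGG W
# CGG R
# CCG P
# CCC P
```

In this toy case four possible crossover points give three distinct amino acids (Trp, Arg, Pro) at the junctional position.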
As discussed
in Chapter 3, further diversity results from the generation of palindromic
sequences (P‐elements) arising from the formation of hairpin structures during
the recombination process and from the insertion of nucleotides at the N region
between the V, D, and J segments, a process associated
with the expression of terminal deoxynucleotidyl transferase. While these
mechanisms add nucleotides to the sequence, yet more diversity can be created
by nucleases chewing away at the exposed strand ends to remove nucleotides.
These maneuvers again greatly increase the repertoire. This is especially important for
the TCR γ and δ genes, which are otherwise rather limited in number.
Additional
mechanisms relate specifically to the D‐region sequence, particularly in
the case of the TCR δ genes, where the D segment can be read in three
different reading frames and two D segments can join together. Such DD
combinations produce a longer third complementarity determining region
(CDR3) than is found in other TCR or antibody molecules.
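To illustrate why reading a D segment in three frames adds diversity, the sketch below translates one short nucleotide sequence in each frame. The sequence is invented for the example and is not a real D segment; the function name is likewise hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_frames(dna):
    """Translate a sequence in reading frames 0, 1 and 2, ignoring incomplete codons."""
    frames = []
    for offset in range(3):
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        frames.append("".join(CODON_TABLE[c] for c in codons))
    return frames

# Made-up 12-nucleotide stretch standing in for a D segment.
print(translate_frames("GGGACAGGGGGC"))  # ['GTGG', 'GQG', 'DRG']
```

The same nucleotides yield three unrelated peptide stretches, so the frame in which the D segment is read, set by the preceding junction, changes the CDR3 sequence.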
As the CDR3
in the various receptor chains is essentially composed of the regions between
the V(D)J segments, where junctional diversity mechanisms can introduce
a very high degree of amino acid variability, one can see why it is that this hypervariable
loop usually contributes the most to determining the fine antigen‐binding
specificity of these molecules.
Figure
4.12 Junctional diversity between a TCR Vα and Jα germline segment producing
three variant protein sequences. The nucleotide triplet that is
spliced out is colored the darker blue. For TCR β chain and Ig heavy chain
genes junctional diversity can apply to V, D, and J segments.
Receptor editing
Recent
observations have established that lymphocytes are not necessarily stuck with
the antigen receptor they initially make: if they don’t like it they can change
it. The replacement of an undesired receptor with one that has more acceptable
characteristics is referred to as receptor editing. This process
has been described for both immunoglobulins and for TCR, allowing the
replacement of either nonfunctional rearrangements or autoreactive
specificities. Furthermore, receptor editing in the periphery may rescue
low‐affinity B‐cells from apoptotic cell death by replacing a low‐affinity
receptor with a selectable one of higher affinity. That this does indeed occur
in the periphery is strongly supported by the finding that mature B‐cells in
germinal centers can express RAG‐1 and RAG‐2 that mediate the rearrangement
process.
But how does
this receptor editing work? Well, in the case of the receptor chains that lack D
gene segments, namely the immunoglobulin light chain and the TCR α chain, a
secondary rearrangement may occur by a V gene segment upstream of the previously
rearranged VJ segment recombining to a 3′ J gene sequence, both
of these segments having intact RSSs that are compatible (Figure 4.13a).
However, for immunoglobulin heavy chains and TCR β chains, the original VDJ rearrangement deletes all of the D segment‐associated
RSSs (Figure 4.13b). Because VH and JH both have 23 base‐pair
spacers in their RSSs, they cannot recombine: that would break the 12/23 rule.
This apparent obstacle to receptor editing of these chains may be overcome by
the presence of a sequence near the 3′ end of the V coding sequences
that can function as a surrogate RSS, such that the new V segment would
simply replace the previously rearranged V, maintaining the same D and
J sequence (Figure 4.13b). This is probably a relatively inefficient
process and receptor editing may therefore occur more readily in immunoglobulin
light chains and TCR α chains than in
immunoglobulin heavy chains and TCR β chains. Indeed, it has been suggested
that the TCR α chain may undergo a series of rearrangements, continuously
deleting previously functionally rearranged VJ segments until a
selectable TCR is produced.
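The 12/23 rule invoked here can be expressed as a simple check. The sketch below uses the heavy chain spacer lengths given or implied in the text (23 base‐pairs for VH and JH, 12 base‐pairs on both sides of D); the dictionary keys and function name are illustrative assumptions.

```python
# RSS spacer lengths (in base pairs) flanking immunoglobulin heavy chain segments.
HEAVY_CHAIN_SPACERS = {
    "VH_3prime": 23,
    "D_5prime": 12,
    "D_3prime": 12,
    "JH_5prime": 23,
}

def obeys_12_23_rule(spacer_a, spacer_b):
    """Recombination requires one RSS with a 12 bp spacer and one with a 23 bp spacer."""
    return {spacer_a, spacer_b} == {12, 23}

s = HEAVY_CHAIN_SPACERS
print(obeys_12_23_rule(s["VH_3prime"], s["D_5prime"]))   # True: V can join D
print(obeys_12_23_rule(s["D_3prime"], s["JH_5prime"]))   # True: D can join J
print(obeys_12_23_rule(s["VH_3prime"], s["JH_5prime"]))  # False: V cannot join J directly
```

Once the D‐associated RSSs have been deleted, only 23 base‐pair spacers remain at the locus, which is why an upstream VH cannot simply recombine with a downstream JH.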
Figure 4.13 Receptor
editing. (a) For immunoglobulin light chain or TCR α chain the recombination
signal sequences (RSSs; heptamer–nonamer motifs) at the 3′ end of each
variable (V) segment and the 5′ end of each joining (J) segment are
compatible with each other and therefore an entirely new rearrangement can
potentially occur as shown. This would result in a receptor with a different
light chain variable sequence (in this example Vκ37Jκ4 replacing Vκ39Jκ3)
together with the original heavy chain. (b) With respect to the immunoglobulin
heavy chain or TCR β chain the organization of the heptamer–nonamer sequences
in the RSS precludes a V segment directly recombining with the J segment.
This is the so‐called 12/23 rule whereby the heptamer–nonamer sequences
associated with a 23 base‐pair spacer (colored violet) can only recombine with
heptamer–nonamer sequences containing a 12 base‐pair spacer (colored red). The
heavy chain V and J both have an RSS with a 23 base‐pair spacer
and so this is a nonstarter. Furthermore, all the unrearranged D segments
have been deleted so that there are no 12 base‐pair spacers remaining. This
apparent bar to secondary rearrangement is probably overcome by the presence of
an RSS‐like sequence near the 3′ end of the V gene coding sequences, so
that only the V gene segment is replaced (in the example shown, the
sequence VH38DH3JH2 replaces VH40DH3JH2).
Recognition of the correct genomic regions by the RAG recombinase
A question
that is only now being resolved is how the RAG‐1/RAG‐2 recombinase selects the
correct genomic regions to target for recombination. Clearly it would be
disastrous were this complex able to access all DNA, randomly leaving
double‐stranded breaks in its wake. One mechanism of protection is to induce
RAG expression only where and when it is needed, but this does not explain how
the RAG complex is targeted only to Ig and TCR loci in the cells in which it is
expressed. This puzzle is explained by observations suggesting that alterations
to histones – the proteins upon which DNA is packaged – flag particular
loci for binding of the RAG complex. Recent studies have shown that histone H3
that has been modified by trimethylation on lysine at position 4 (H3K4me3)
acts as a binding site for RAG‐2. Thus, genomic regions that are poised for
VDJ recombination are located close to H3K4me3 histone “marks.” Consistent with
this idea, experimental ablation of H3K4me3 marks results in greatly impaired
V(D)J recombination. But the H3K4me3 mark is found at many more sites
throughout the genome than there are antigen receptor loci, so how does the
RAG‐1/RAG‐2 complex find the correct sites? The answer seems to be that the
specificity of RAG‐1 for RSS sites, combined with that of RAG‐2 for H3K4me3
chromatin marks, may act as a clamp that guides the recombinase to the right
locations. Binding of the RAG complex to the H3K4me3 mark may also activate the
recombinase activity of RAG‐1 through an allosteric mechanism, increasing the
catalytic activity of the complex when it has been positioned at the correct
location.
Interchain amplification
The immune
system took an ingenious step forward when two different types of chain were
utilized for the recognition molecules because the combination produces not
only a larger combining site with potentially greater affinity, but also new
variability. Heavy–light chain pairing among immunoglobulins appears to be
largely random and therefore two B‐cells can employ the same heavy chain but
different light chains. This route to producing antibodies of differing
specificity is easily seen in vitro where shuffling different
recombinant light chains against the same heavy chain can be used to either
fine‐tune, or sometimes even alter, the specificity of the final antibody. In
general, the available evidence suggests that in vivo the major
contribution to diversity and specificity comes from the heavy chain, perhaps
not unrelated to the fact that the heavy chain CDR3 gets off to a head start in
the race for diversity being, as it is, encoded by the junctions between three
gene segments: V, D, and J.
This random
association between TCR γ and δ chains, TCR α and β chains, and Ig heavy and
light chains yields a further geometric increase in diversity. From Table 4.2
it can be seen that approximately 230 functional TCR and 153 functional Ig
germline segments can give rise to 4.5 million and 2.3 million different
combinations, respectively, by straightforward associations without taking
into account all of the fancy junctional mechanisms described above. Hats off
to evolution!
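For the αβ TCR, the figure of 4.5 million appears consistent with multiplying the chain combinations derived earlier, as the sketch below shows. The immunoglobulin figure of 2.3 million depends on segment counts in Table 4.2 that are not reproduced here, so it is not recomputed.

```python
# Combinations for each chain, using the approximate segment counts quoted in the text.
alpha_vj = 75 * 60            # Vα × Jα
beta_vdj = 50 * 13 + 50 * 7   # Vβ with Dβ1 (13 J segments) plus Vβ with Dβ2 (7 J segments)

# Random pairing of any α chain with any β chain.
alpha_beta_pairs = alpha_vj * beta_vdj

print(f"{alpha_beta_pairs:,}")  # 4,500,000
```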
Somatic hypermutation
As discussed
in Chapter 3, there is inescapable evidence that immunoglobulin V‐region
genes can undergo significant somatic hypermutation. Analysis of
18 murine λ myelomas revealed 12 with identical structure, four showing just
one amino acid change, one with two changes and one with four changes, all
within the hypervariable regions and indicative of somatic hypermutation of the
single mouse λ germline gene. In another study, following immunization with
pneumococcal antigen, a single germline T15 VH gene gave rise by
mutation to several different VH genes all encoding phosphorylcholine
antibodies (Figure 4.14).
A number of
features of this somatic diversification phenomenon are worth revisiting. The
mutations are the result of single nucleotide substitutions, they are
restricted to the variable as distinct from the constant region and occur in
both framework and hypervariable regions. The mutation rate is remarkably high,
approximately 1 × 10⁻³ per base‐pair per generation, which is approximately a
million times higher than for other mammalian genes. In addition, the
mutational mechanism is bound up in some way with class switch recombination as
the enzyme activation‐induced cytidine deaminase (AID)
is required for both processes and hypermutation is more frequent in IgG and
IgA than in IgM antibodies, affecting both heavy (Figure 4.14) and light
chains. However, VH genes are, on average, more mutated than VL
genes. This might be a consequence of receptor editing acting more
frequently on light chains, as this would have the effect of wiping the slate
clean with respect to light chain V gene mutations while maintaining
already accumulated heavy chain V gene point mutations.
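A rough sense of what this mutation rate means can be had from the following sketch. The rate and the million‐fold ratio come from the text; the V‐region length of 350 base‐pairs is an assumed round figure used only for illustration.

```python
MUTATION_RATE = 1e-3         # per base pair per generation (from the text)
FOLD_OVER_BACKGROUND = 1e6   # "approximately a million times higher"
V_REGION_LENGTH_BP = 350     # assumption: round figure for one V region

# Implied background rate for other mammalian genes.
background_rate = MUTATION_RATE / FOLD_OVER_BACKGROUND

# Expected number of point mutations in one V region per cell division.
expected_per_division = MUTATION_RATE * V_REGION_LENGTH_BP

print(background_rate, expected_per_division)
```

Under these assumptions the background rate is about 1 × 10⁻⁹ per base‐pair per generation, and a hypermutating B‐cell would acquire on the order of one V‐region mutation every three divisions.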
As we
outlined in Chapter 3, AID initiates both class switch recombination and
somatic hypermutation through deaminating deoxycytidine within certain DNA
hotspots that are characterized by the presence of WRC sequences (W = A or T, R
= purine, and C is the deoxycytidine that becomes deaminated). Although the
target of AID was initially thought to be RNA, more recent evidence suggests
that this enzyme works directly on DNA, although RNA editing is not ruled out.
Deamination of deoxycytidine changes this base to a deoxyuracil that would
normally be repaired by mismatch repair enzymes. For reasons that are not
yet fully understood, however, removal of the mismatched uracil can instead
leave a gap that is filled in by an error‐prone polymerase, generating a
point mutation at this position and sometimes mutating surrounding bases as well. It
remains unclear how AID is targeted to the correct locations within V regions
of rearranged Ig genes, to ensure that mutations are not inadvertently
introduced at other loci but, as with the RAG recombinase, this might
involve specific histone modifications. Hyperacetylated versions of histones H3
and H4 appear to be more abundant in mutating V regions than in the C regions
of Ig genes. This observation, coupled with observations that AID is recruited
to actively transcribing Ig genes by proteins that bind to CAGGTG sequences
found in all Ig transcriptional enhancers, suggests a possible mechanism. Thus,
the combination of the CAGGTG sequence motif, coupled with the modified
histones discussed above, may position AID at the correct locations from which
to operate.
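The WRC hotspot definition given above lends itself to a simple sequence scan. The sketch below reports the position of the target deoxycytidine for each WRC motif on one strand of a sequence; the example sequence is invented, the function name is hypothetical, and the complementary strand is not examined.

```python
import re

# W = A or T, R = A or G (purine), C = the deoxycytidine that is deaminated.
# A lookahead is used so that overlapping motifs are all reported.
WRC = re.compile(r"(?=([AT][AG]C))")

def wrc_hotspots(dna):
    """Return (0-based index of the target C, motif) for each WRC motif on this strand."""
    return [(m.start() + 2, m.group(1)) for m in WRC.finditer(dna.upper())]

# Made-up sequence for illustration.
print(wrc_hotspots("GGAGCTTACAACGT"))  # [(4, 'AGC'), (8, 'TAC'), (11, 'AAC')]
```

A motif scan of this kind only identifies candidate sites; as the text notes, how AID is restricted to V regions of rearranged Ig genes involves additional factors beyond the sequence motif itself.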
Somatic
hypermutation does not appear to add significantly to the repertoire available
in the early phases of the primary response, but occurs during the generation
of memory and is responsible for tuning the response towards higher affinity.
Recently,
data have been put forward suggesting that there is yet another mechanism for
creating further diversity. This involves the insertion or deletion of short
stretches of nucleotides within the immunoglobulin V gene sequence of
both heavy and light chains. This mechanism would have an intermediate effect
on antigen recognition, being more dramatic than single point mutation, but
considerably more subtle than receptor editing. In one study, a reverse
transcriptase‐polymerase chain reaction (RT‐PCR) was employed to amplify the
expressed VH and VL genes from 365 IgG+
B‐cells and it was shown that 6.5% of the cells contained nucleotide insertions
or deletions. The transcripts were left in‐frame and no stop codons were
introduced by these modifications. The percentage of cells containing these
alterations is likely to be an underestimate. All the insertions and deletions
were in, or near to, CDR1 and/or CDR2. N‐region diversity of the CDR3 meant
that it was not possible to analyze the third hypervariable region for
insertions/deletions of this type and therefore these would be missed in the
analysis. The fact that the alterations were associated with CDRs does suggest
that the B‐cells had been subjected to selection by antigen. It was also
notable that the insertions/deletions occurred at known hotspots for somatic
point mutation, and the same error‐prone DNA polymerase responsible for somatic
hypermutation may also be involved here. The sequences were often a duplication
of an adjacent sequence in the case of insertions or a deletion of a known
repeated sequence. This type of modification may, like receptor editing, play a
major role in eliminating autoreactivity and also in enhancing antibody
affinity.
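Two points from this study can be made concrete with a small sketch: the approximate number of cells involved, and the condition for an insertion or deletion to leave a transcript in‐frame without a stop codon. The sequences are invented, the function name is hypothetical, and the check assumes that the sequence begins at the first nucleotide of a codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def preserves_reading_frame(original, modified):
    """True if the length change is a multiple of three and no in-frame stop codon appears."""
    if (len(modified) - len(original)) % 3 != 0:
        return False
    codons = (modified[i:i + 3] for i in range(0, len(modified) - 2, 3))
    return not any(codon in STOP_CODONS for codon in codons)

# 6.5% of the 365 IgG+ B-cells analyzed.
cells_with_indels = round(0.065 * 365)
print(cells_with_indels)  # 24

original = "GCTAGCTACGGT"
print(preserves_reading_frame(original, "GCTAGCAGCTACGGT"))  # True: AGC duplicated
print(preserves_reading_frame(original, "GCTAGTACGGT"))      # False: one base deleted
print(preserves_reading_frame(original, "GCTTAAAGCTACGGT"))  # False: stop codon inserted
```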
T‐cell receptor genes, on the other hand, do
not generally undergo somatic hypermutation. It has been argued that
this would be a useful safety measure as T‐cells are positively selected in the
thymus for weak reactions with self MHC, so that mutations could readily lead
to the emergence of high‐affinity autoreactive receptors and autoimmunity.
One may ask
how it is that this array of germline genes is protected from genetic drift.
With a library of 390 or so functional V, D, and J genes,
selection would act only weakly on any single gene that had been functionally
crippled by mutation and this implies that a major part of the library could be
lost before evolutionary forces operated. One idea is that each subfamily of
related V genes contains a prototype coding for an antibody
indispensable for protection against some common pathogen, so that mutation in
this gene would put the host at a disadvantage and would therefore be selected
against. If any of the other closely related genes in its set became defective
through mutation, this indispensable gene could repair them by gene conversion,
a mechanism in which two genes interact in such a way that the nucleotide
sequence of part or all of one becomes identical to that of the other. Although
gene conversion has been invoked to account for the diversification of MHC
genes, it can also act on other families of genes to maintain a degree of
sequence homogeneity. Certainly it is used extensively by, for example,
chickens and rabbits, in order to generate immunoglobulin diversity. In the
rabbit only a single germline VH gene is rearranged in the
majority of B‐cells; this then becomes a substrate for gene conversion by one
of the large number of VH pseudogenes. There are also large
numbers of VH pseudogenes and orphan genes (genes located
outside the gene locus, often on a completely different chromosome) in humans
that actually outnumber the functional genes, although there is no evidence to
date that these are used in gene conversion processes.
Figure 4.14 Mutations
in regions of five IgM and five IgG monoclonal phosphorylcholine antibodies
generated during an antipneumococcal response in a single mouse are compared
with the primary structure of the T15 germline sequence. A line indicates
identity with the T15 prototype and an orange circle a single amino acid
difference. Mutations have only occurred in the IgG molecules and are seen in
both hypervariable and framework segments. (After Gearhart P.J. (1982) Immunology
Today 3, 107.) Although in some other studies somatic hypermutation
has been seen in IgM antibodies, the amount of mutation usually greatly
increases following class switching.