RNA as genetic material
While it had been known since Bawden and Pirie’s work in 1937 that TMV particles contained RNA, as was later found for a number of other viruses, it must be remembered that DNA had only really been accepted as the genetic material of cells and viruses after the Hershey-Chase experiment in 1952 and the Watson-Crick demonstration of the structure of DNA in 1953. Moreover, the way in which the information in DNA was used to make proteins was still very obscure in the 1950s, given that the proof that RNA was used as a template for the production of proteins was only provided in 1961 by Marshall Nirenberg.
It was hailed as a major development in molecular biology, therefore, when between 1955 and 1957, Heinz Fraenkel-Conrat, B Singer and Robley C Williams demonstrated that it was possible to reconstitute fully infectious TMV from separately-purified preparations of coat protein and RNA. At the time it was assumed that neither of the two components was infectious on its own; however, it was subsequently shown by Fraenkel-Conrat and Singer, and separately by A Gierer and G Schramm, that purified TMV RNA was in fact infectious – albeit several hundred times less infectious per unit mass than the native or reconstituted particles.
While this was revolutionary in itself, the clinching experiment was the proof that mixed reconstitution – or the reassembly of an RNA of one strain of TMV with the coat protein of another – followed by infection of plants resulted in particles made of protein specified by the RNA component, rather than by the protein donor. This work possibly represents the birth of molecular virology as a sub-discipline within molecular biology, given that the molecular nature of viruses had so conclusively been shown – vindicating the prescient remark made by the virus pioneer Thomas Rivers in 1941, on the occasion of the presentation of a gold medal to Wendell Stanley, that:
“In fun, it has been said that we do not know whether to speak of the unit of this infectious agent [TMV] as an “organule” or a “molechism”” (p.7, CA Knight, Chemistry of Viruses 2nd Edn., 1975. Springer-Verlag, Wien)
Further important developments with TMV included the demonstration in 1958 by Gierer and KW Mundry that TMV mutants with altered genomes could be produced by treatment of virions with nitrous acid, which only alters nucleic acids, and the sequencing of the TMV coat protein in 1960 by two groups: one including Fraenkel-Conrat, Stanley and Knight, and the other including Schramm.
Between 1953 and 1954, an interesting class of new viruses was discovered in humans, birds, and later in other animals too. These were dubbed “respiratory enteric orphans” based on where they were found, and the fact they were not associated with any disease – which gave rise to the name “reovirus”, and their description as a distinct group of viruses by Albert Sabin in 1959. By 1962 the unique double-layered capsid morphology had been seen and the virions shown to contain RNA, and then in 1963 PJ Gomatos and I Tamm showed using physical and chemical techniques that the viruses as well as the similar wound tumour virus isolated from plants had a genome consisting of double-stranded (ds) RNA – a finding unprecedented in biology at the time. Gomatos and W Stoeckenius went on to show in 1964 – by electron microscopy – that the reovirus genome was also segmented – another unprecedented finding for viruses. In the 1963 paper the authors remark that “…all attempts to isolate the nucleic acid of reovirus in an infective form have failed” – which distinguished these viruses from the ssRNA viruses previously looked at – not surprisingly, given the requirement for a different replication method for dsRNA compared to viruses like TMV or poliovirus (see here).
A major highlight in molecular biology in 1961 was Marshall Nirenberg and Heinrich Matthaei’s demonstration of “…an assay system in which RNA serves as an activator of protein synthesis in E. coli extracts”, or the proof in an in vitro translation system that RNA was the “messenger” that conveyed genetic information into proteins.
In 1962, A Tsugita, Fraenkel-Conrat, Nirenberg and Matthaei used the still extremely novel in vitro translation system with purified TMV genomic RNA, and were able to show that:
“The addition of TMV-RNA to a cell-free amino acid incorporating system derived from E. coli caused up to 75-fold stimulation in protein synthesis (C14-incorporation). Part of the protein synthesized formed a specific precipitate with anti-TMV serum.”, indicating that TMV coat protein had been made.
This was the first demonstration of in vitro translation from any specific mRNA, and incidentally also direct proof that the single-stranded TMV genome was “messenger sense”. They also concluded that their result showed that the newly-determined “genetic code” – the nucleotide triplets that code for individual amino acids – was universal, given that it was a tobacco virus RNA being translated by a bacterial system.
Later in 1962, D Nathans and colleagues used coliphage f2 RNA as template for translation in the same type of bacterial extract. They showed that polypeptides corresponding to the coat as well as other proteins were made, showing that it was the input virion RNA that was responsible.
The proof that RNA was both the “messenger” that conveyed information from DNA to be made into protein, and was in fact a genetic material in its own right, made possible a revolution in virology that transformed it into the science we know today. The new molecular biology together with well-established physical and biochemical techniques for molecular characterisation, coupled with the ability to reliably culture bacterial, plant and now animal viruses as well, enabled an explosion of discovery that continues to this day.
A tour de force experiment in the modern molecular biological era was the in vitro synthesis of an infectious phage RNA genome by S Spiegelman and coworkers in 1965, using only purified Qbeta coliphage single-stranded virion RNA and the purified viral replicase. They remarked:
“The successful synthesis of a biologically active nucleic acid with a purified enzyme is itself of obvious interest. However, the implication which is most pregnant with potential usefulness stems from the demonstration that the replicase is, in fact, generating identical copies of the viral RNA. For the first time, a system has been made available which permits the unambiguous analysis of the molecular basis underlying the replication of a self-propagating nucleic acid.”
Ribosomes translating protein from a messenger RNA molecule. Russell Kightley Media
In 1967 there followed the demonstration that the same could be done for a single-stranded (ss)-DNA virus: M Goulian and colleagues reported that they had successfully made a completely synthetic and infectious PhiX174 coliphage genome, by means of a series of syntheses using purified virion ssDNA, E coli DNA polymerase and a “polynucleotide-joining enzyme”, or DNA ligase. It is instructive that the authors offer this as evidence for the involvement of the same enzymes in E coli chromosomal replication, the mechanism for which was still obscure at the time. Their justification for their work:
“If enzymatic synthesis of infectious bacteriophage DNA were achieved, it would be made clear at once that relatively few, if any, mistakes had been made in replicating a DNA sequence of several thousand nucleotides.”
- was undoubtedly borne out, in yet another example in the growing number of cases of the use of viruses to demonstrate important facets of cellular biology.
Naked nucleic acids as infectious agents: viroids
A potato disease that had been known in the New York and New Jersey state areas in the US since the 1920s was the source of an exciting discovery by Theodor (Ted) Diener and WB Raymer, reported in Science in 1967. The potato spindle tuber disease agent had proved recalcitrant over many years to being characterised or isolated; all that was known was that it could be transmitted mechanically using sap, or via grafting, and that no fungi, bacteria or viruses could be isolated from diseased material. Diener and Raymer showed that:
“Infectious entities, extractable, with phosphate buffer, from tissue infected with potato spindle tuber virus and inciting symptoms on tomato that are typical of this virus, have properties incompatible with those of conventional virus particles. …[Their properties] suggest that the extractable infectious agent may be a double-stranded RNA.”
By 1971 Diener had determined that
“…the infectious RNA occurs in the form of several species with molecular weights ranging from 2.5 × 10⁴ to 1.1 × 10⁵ daltons. No evidence for the presence in uninoculated plants of a latent helper virus was found. Thus, potato spindle tuber “virus” RNA, which is too small to contain the genetic information necessary for self-replication, must rely for its replication mainly on biosynthetic systems already operative in the uninoculated plant.”
This was a revolutionary concept: an infectious, pathogenic entity in the form of a naked RNA that was too small to encode a replicase or any other protein. He proposed the term “viroid” to designate this and similar agents, a term that persists up to today. By 1979, they were known to be single-stranded circular RNA molecules with a high degree of sequence self-complementarity, which results in them appearing as “highly base-paired rods”.
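The size argument can be made concrete with some rough arithmetic. The sketch below converts the molecular weights quoted above into approximate chain lengths and an upper bound on coding capacity. The average residue masses (about 340 Da per ribonucleotide and 110 Da per amino acid) are textbook approximations assumed here for illustration, not figures taken from Diener's papers, and the function names are hypothetical.

```python
# Rough estimate of the coding capacity of a viroid-sized RNA.
# ASSUMPTIONS (not from the original papers): average mass per residue.
AVG_NUCLEOTIDE_DA = 340  # approximate mass of one RNA nucleotide residue
AVG_AMINO_ACID_DA = 110  # approximate mass of one amino acid residue


def rna_length_nt(mol_weight_da):
    """Approximate number of nucleotides in an RNA of the given mass."""
    return int(mol_weight_da // AVG_NUCLEOTIDE_DA)


def max_protein(mol_weight_da):
    """Upper bound on an encoded polypeptide, as (residues, mass in Da).

    Assumes, generously, that the whole RNA is one open reading frame.
    """
    codons = rna_length_nt(mol_weight_da) // 3
    return codons, codons * AVG_AMINO_ACID_DA


# The two ends of the molecular weight range quoted by Diener (1971).
for mw in (2.5e4, 1.1e5):
    nt = rna_length_nt(mw)
    residues, mass = max_protein(mw)
    print(f"{mw:.1e} Da RNA: ~{nt} nt, at most {residues} residues (~{mass} Da)")
```

Under these assumptions even the largest species is only a few hundred nucleotides long and could encode a polypeptide of roughly a hundred residues at most, which is the sense in which the RNA was judged too small to specify its own replication machinery.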
Reverse transcription and tumour viruses
While it was apparent in the 1960s that there were single- and double-stranded DNA and RNA viruses, it was only in 1970 that two back-to-back papers in Nature, by Howard Temin and S Mizutani, and David Baltimore respectively, revealed a highly novel viral replication strategy. They showed that “RNA tumour viruses” such as the agents found by Ellerman and Bang and Peyton Rous contained an enzyme activity named reverse transcriptase – a colloquial term for RNA-dependent DNA polymerase – in their virions, which converted the single-stranded RNA genomes into double-stranded DNA. Later this was shown to result in insertion of the DNA into the host cell genome, vindicating Howard Temin’s 1960 proposal that “…a RNA tumor virus can give rise to a DNA copy which is incorporated into the genetic material of the cell”.
When Francis Crick formulated his “Central Dogma” in 1956, it was indisputable that genetic information flowed from DNA to progeny DNA, from DNA to RNA, and from messenger RNA to protein. While he postulated only that there was no return flow of information from protein, it was generally assumed that this was also true for RNA.
In the words of David Baltimore, in his Nature article:
“Two independent groups of investigators have found evidence of an enzyme in virions of RNA tumour viruses which synthesizes DNA from an RNA template. This discovery, if upheld, will have important implications not only for carcinogenesis by RNA viruses but also for the general understanding of genetic transcription: apparently the classical process of information transfer from DNA to RNA can be inverted.”
This gives rise to a modified Central Dogma, where information flows from DNA to DNA, from DNA to RNA, from RNA to RNA, from RNA to DNA, and from RNA to protein. It is interesting that RNA seems central to this flow – which, incidentally, strengthens the proposal that RNA is the original genetic material.
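Two of these information transfers, DNA to RNA and RNA back to DNA, can be illustrated with a toy model. The sketch below treats sequences as plain strings and ignores all enzymology; the function names and the nine-base example sequence are invented for illustration and are not taken from any of the papers discussed.

```python
# Toy model of two information transfers: DNA -> RNA and RNA -> DNA.
# Sequences are written 5'->3'; the example gene is invented.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def transcribe(coding_dna):
    """DNA -> RNA: the mRNA matches the coding strand, with U replacing T."""
    return coding_dna.replace("T", "U")


def reverse_transcribe(rna):
    """RNA -> DNA: first-strand cDNA, complementary to the RNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def second_strand(cdna):
    """DNA -> DNA: copy the first strand to give double-stranded DNA."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(cdna))


gene = "ATGGCCTAA"
mrna = transcribe(gene)          # AUGGCCUAA
cdna = reverse_transcribe(mrna)  # TTAGGCCAT
print(mrna, cdna, second_strand(cdna))
```

Copying the first-strand cDNA regenerates the original coding sequence, so in this simplified picture no information is lost on the round trip from DNA to RNA and back.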
Baltimore and Temin both received a share of the Nobel Prize in Physiology or Medicine 1975 for their discovery of reverse transcriptase – and shared it with Renato Dulbecco, who was credited with clarifying the process of infection and of cellular transformation by DNA tumour viruses. He used the double-stranded (ds) DNA polyomavirus SV40: this was originally isolated from monkeys, but shown to cause a variety of tumours in a number of experimental animals, hence the name “poly-oma”.
He and colleagues showed that polyomavirus grew and could be assayed normally in certain cell cultures, but caused tumour-like transformation of cells in others in which it did not grow. They showed that transformed cell chromosomes contained covalently integrated viral DNA termed a provirus, which was active in producing mRNA which made virus-specific proteins. Thus, his work was the first to show how DNA viruses might cause cancer, and he and his colleagues deserved their award “…for their discoveries concerning the interaction between tumour viruses and the genetic material of the cell.”
Viral genome cloning and sequencing: the new age
The techniques of recombinant DNA technology – or the artificial introduction of genetic material from one organism into the genome of another – were pioneered between 1971 and 1973 by Paul Berg, Herbert Boyer and Stanley Cohen. In 1971 Berg performed an in vitro exercise in which a segment of the lambda phage genome was ligated into the purified DNA of SV40, which had been linearised using the then-new restriction endonuclease, EcoRI. Cohen, Annie Chang, Boyer and Robert Helling took the technology further in 1973 by showing that:
“The construction of new plasmid DNA species by in vitro joining of restriction endonuclease-generated fragments of separate plasmids is described. Newly constructed plasmids that are inserted into Escherichia coli by transformation are shown to be biologically functional replicons that possess genetic properties and nucleotide base sequences from both of the parent DNA molecules.”
Cloning had arrived – made possible in part by use of viruses. The fundamental nature of this advance of molecular biology was rewarded by a half share of the 1980 Nobel Prize in Chemistry to Paul Berg.
Nucleotide sequencing, or the determination of the order of bases in nucleic acids, started with laborious, difficult techniques such as the two-dimensional fractionation of enzyme digests of 32P-labelled RNA described by Frederick Sanger and colleagues in 1965. DNA sequencing followed in 1970: Ray Wu described the use of E coli DNA polymerase and radiolabelled nucleotides to sequence the single-stranded ends of phage lambda DNA. He and colleagues followed this with a more general method in 1973, using extension of synthetic oligonucleotide “primers” annealed to target DNA.
Walter Gilbert and Allan Maxam published in February 1977 an immediately popular paper entitled “A new method for sequencing DNA”. This became known as Maxam-Gilbert sequencing, or the chemical method, as it entailed sequencing by chemical degradation. Also in 1977, however, Frederick Sanger and colleagues adapted the Wu technique to come up with the so-called Sanger method, or “DNA sequencing with chain-terminating inhibitors”: this soon became the industry standard for at least the next twenty years, because it was easier and cheaper than the chemical method.
Gilbert and Sanger were awarded a share of the Nobel Prize in Chemistry in 1980, “for their contributions concerning the determination of base sequences in nucleic acids”.
MS2 phage sequencing
A highlight of Ed Rybicki’s introduction to the world of viruses was discovering, during his Honours year in 1977, the 1976 paper in Nature by Walter Fiers and his coworkers on completing the genome sequencing of the ssRNA E coli phage, MS2. They had previously also been responsible for the first ever gene sequence, in 1972: this was of the coat protein gene from the same virus. The 1976 paper was a landmark publication, because it completed the work of years by their group by sequencing the replicase gene, using the ribonuclease digestion, genome fragmentation and two-dimensional electrophoresis technique from Sanger. Moreover, they proposed a secondary structure for the replicase gene based on intrasequence complementarity, and described it eloquently as follows:
“The secondary structure of the coat gene resembles a flower, and there are similar foldings in other parts of the molecule; the secondary structure of the whole viral RNA therefore constitutes a bouquet”.
Their achievement looks modest in retrospect, in this era of high-throughput sequencing – however, it is worth remembering that at this time in 1976,
“MS2 is the first living organism for which the entire primary chemical structure has been elucidated”.
Depiction of the linear sequence of MS2 phage. The maturation (M), coat (CP) and replicase (Rep) genes and proteins were known at the time of sequencing; the lysis gene that partially overlaps the Rep open reading frame was shown to be functional only in 1982
While this comprised just 3569 nucleotides, encoding only three genes, this is sufficient to constitute a self-replicating entity with an independent evolutionary history.
The immediate value of their work was that it provided a basis for understanding the biology of the interaction of the genome with the bacterial cell at the molecular level. Moreover, the proposed secondary structures also helped explain how such a simple genome managed to temporally regulate its own expression – by means of long-distance interactions between different areas of the sequence.
PhiX174 phage sequencing
The next complete viral genome sequenced was that of the circular single-stranded DNA coliphage PhiX174, in 1977 by Sanger and his team in Cambridge, using the new sequencing technique invented by them. The abstract of their paper reads:
“A DNA sequence for the genome of bacteriophage phi X174 of approximately 5,375 nucleotides has been determined using the rapid and simple ‘plus and minus’ method. The sequence identifies many of the features responsible for the production of the proteins of the nine known genes of the organism, including initiation and termination sites for the proteins and RNAs. Two pairs of genes are coded by the same region of DNA using different reading frames.”
This was the first complete genome sequenced for any DNA-containing organism, and a satisfying conclusion to many decades of work on the virus. One of the most interesting features of the sequence was the fact that several of the genes (nine were known at the time; 11 are now recognised) are highly overlapping: that is, the same DNA sequence is used to encode completely different proteins in different open reading frames. This represented an economy of use of genetic information that was hitherto unknown.
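How one stretch of DNA can specify different proteins depending on the reading frame can be shown with a short sketch. The codon table below is the standard genetic code built from the usual compact TCAG ordering; the 11-base sequence is an invented toy example, not part of the PhiX174 genome, and `translate` is a hypothetical helper name.

```python
# Translate one DNA sequence in each of its three forward reading frames.
# The codon table is the standard genetic code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate complete codons of `dna`, starting at offset `frame`."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


toy = "ATGGCCTAAGC"  # invented sequence, for illustration only
for frame in range(3):
    print(frame, translate(toy, frame))
# 0 MA*
# 1 WPK
# 2 GLS
```

Shifting the starting point by a single base changes every codon, so the three frames yield three unrelated peptides from the same nucleotides. Overlapping genes exploit exactly this property.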
Ed Rybicki was also able to greatly impress his Honours external examiner – one DR Woods – by launching into a detailed account of the sequencing and the genetic implications, when asked “What did you find interesting in the literature this year?”
The simian vacuolating virus 40, or SV40, was discovered in 1960 by Ben Sweet and Maurice Hilleman as a contaminant of live attenuated polio vaccines made between 1955 and 1961: this was as a result of use of vervet or African green monkey cells that were inadvertently infected with SV40 to grow up the polioviruses. As a consequence, between 1955 and 1963 up to 90% of children and 60% of adults – 98 million people – in the USA were inadvertently inoculated with live SV40. Given the demonstration by Bernice Eddy and others in 1962 that hamsters inoculated with simian cells infected with SV40 developed sarcomas and ependymomas, the class of viruses including SV40 and MPyV described earlier became known as “polyomaviruses”, and DNA tumour viruses. However, and despite considerable concern over many years, SV40 has not been shown to cause or to definitively be associated with any human cancers.
Still, it had become an object of considerable interest as mentioned earlier in connection with Renato Dulbecco, and it was accordingly the next virus to be completely sequenced. This was by Walter Fiers’ group: they determined by Maxam-Gilbert sequencing that the circular dsDNA genome comprised 5224 base pairs, and had an interesting organisation. In their words:
“Particular points of interest revealed by the complete sequence are the initiation of the early t and T antigens at the same position and the fact that the T antigen is coded by two non-contiguous regions of the genome; the T antigen mRNA is spliced in the coding region. In the late region the gene for the major protein VP1 overlaps those for proteins VP2 and VP3 over 122 nucleotides but is read in a different frame.”
Linear depiction of the circular SV40 genome and its protein coding capacity. Regions of RNA spliced out of transcribed genomic sequence, and the direction of transcription, are shown as red arrows. Genes shown are those depicted in the current Genbank sequence entry.
This was the first time that RNA splicing had been demonstrated for an entire genome; indeed, it had only been discovered in 1977 when two separate groups of researchers showed that adenovirus-specific mRNAs made late in the replication cycle in cell cultures were mosaics, being composed of sequences from noncontiguous or separated sites in the viral genome. This was subsequently found to be a common feature in eukaryotic but not prokaryotic mRNAs.
The SV40 genome showed major gene overlaps, as for PhiX174, again demonstrating the effectiveness with which viruses could pack protein coding capability into a small genome.