Introduction to Viruses – Part 1


Viruses as a concept are just a little younger than bacteria – they were first described only in the 1890s – yet have probably co-existed with cellular life through nearly the whole of evolutionary history on this planet.

This chapter will give an account of the history of the discovery of viruses, concentrating on the technological developments that were necessary for the discovery events to happen. 

Section 1: Discovery of Viruses

Section 2: Viruses as Organisms

Section 3: Origins of Viruses

Section 1: The Discovery of Viruses


Viruses were discovered as an excluded entity rather than by being seen or cultured, due to the invention of efficient filters: the fact that cell-free extracts from diseased plants and animals could still cause disease led people to theorise that an unknown infectious agent – a “filterable virus” – was responsible.

People were aware many hundreds of years ago of diseases of both humans and animals that are now known to be caused by viruses, but the concept of a virus as a distinct entity dates back only to the very late 1800s.  The word “virus” comes from a Latin word simply meaning “slimy fluid”, and the term had been used for many years previously to describe disease agents.

Porcelain filters and the discovery of viruses

The invention that allowed viruses to be discovered at all was the Chamberland-Pasteur filter.  This was developed in 1884 in Paris by Charles Chamberland, who worked with Louis Pasteur.  It consisted of unglazed porcelain “candles”, with pore sizes of 0.1 – 1 micron (100 – 1000 nm), which could be used to completely remove all bacteria or other cells known at the time from a liquid suspension.  Though this simple invention essentially enabled the establishment of a whole new science – virology – the continued development of the discipline required a string of technical developments, which we will highlight.

As the first in what was to be an interesting succession of events, Adolph Mayer from Germany, working in Holland in 1886, showed that the “mosaic disease” of tobacco could be transmitted to other plants by rubbing a liquid extract, filtered through paper, from an infected plant onto the leaves of a healthy plant.  However, he came to the conclusion it must be a bacterial disease.

The first use of porcelain filters to characterize what we now know to be a virus was reported by Dmitri Ivanowski in St Petersburg in Russia, in 1892.  He had used a filter candle on an infectious extract of tobacco plants with mosaic disease, and shown that it remained infectious: however, he concluded the agent was either a toxin or bacterial in nature.

The Dutch scientist Martinus Beijerinck in 1898 described how he did similar experiments with bacteria-free extracts, but made the conceptual leap and described the agent of mosaic disease of tobacco as a “contagium vivum fluidum”, or living contagious fluid. The extract was completely sterile, could be kept for years, but remained infectious.  The term virus was later used to describe such fluids, also called “filterable agents”, which were thought to contain no particles.  The virus causing mosaic disease is now known as Tobacco mosaic virus (TMV).

Early Virus Discovery

1898: The second virus discovered was what is now known as Foot and mouth disease virus (FMDV) of farm and other animals, by the German scientists Friedrich Loeffler and Paul Frosch.  Again, their “sterile” filtered liquid proved infectious in calves, providing the first proof of viruses infecting animals.

1898: G Sanarelli, working in Uruguay, described the agent of myxomatosis in rabbits – the tumour-causing myxoma virus, a relative of smallpox virus – as a virus.

1901: The first human virus described was the agent which causes yellow fever: this was discovered and reported in 1901 by the US Army physician Walter Reed, after pioneering work in Cuba by Carlos Finlay proving that mosquitoes transmitted the deadly disease.

1902: A finding that was later to have great importance in veterinary virology was the discovery by Maurice Nicolle and Adil Mustafa in Turkey in 1902, that rinderpest or cattle plague was caused by a virus.    

Sir Arnold Theiler, a Swiss-born veterinarian working in South Africa, had developed a crude vaccine against rinderpest by 1897, without knowledge of the nature of the agent: this consisted of blood from an infected animal, injected with serum from one that had recovered.  This risky mixture worked well enough, however, to eradicate the disease in the region.  He went on to do the same thing successfully for African horsesickness virus and others.

1903: The description in Annales de l’Institut Pasteur by Remlinger and Riffat-Bay from Constantinople in 1903 of the agent of rabies as a “filterable virus” was the culmination of many years of distinguished work in France on the virus, started by Louis Pasteur himself.

1906: Adelchi Negri – who had previously discovered the Negri bodies in cells infected with rabies virus – showed that vaccinia virus was filterable.  This was the final step in a long series of discoveries around smallpox, that started with Edward Jenner’s use of what was supposedly cowpox, but may have been horsepox virus, to protect people from the disease in 1796.

Also in 1906, A Zimmermann proposed – in a paper entitled “Die Krauselkrankheit des Maniok” – that the agent of mosaic disease of cassava that had first been described from German East Africa (now Tanzania) in 1894, was a filterable virus.

1908: Oluf Bang and Vilhelm Ellerman in Denmark were the first to associate a virus with leukaemia: they successfully used a cell-free filtrate from chickens with avian leukosis to transmit the disease to healthy chickens.

1908: Karl Landsteiner and Erwin Popper in Vienna, Austria, showed that poliomyelitis or infantile paralysis in humans was caused by a virus.

1911: The first solid tumour-causing virus, or virus associated with cancer, was found by Peyton Rous in the USA.  He showed that chicken sarcomas, or solid connective tissue tumours, could be transmitted by grafting, but also that a filterable or cell-free agent extracted from a sarcoma was infectious.  The virus was named for him as Rous sarcoma virus, and is now known to be a “retrovirus”.

1915: Frederick Twort in the UK accidentally found a filterable agent that caused the bacteria he was growing to lyse, or burst open.  However, he was not sure whether or not it was a virus.

1917: Félix d’Hérelle in Paris published that he had discovered a virus that lysed a bacterial agent he was culturing that caused dysentery, or diarrhoea.  He named the virus “bacteriophage”, or eater of bacteria, derived from the Greek term “phagein”, meaning to eat. 

1918-1922: Possibly the worst human plague the world has ever seen swept across the planet: this was known as the Spanish Flu, from where it was first properly reported, and it went on to kill more than 50 million people all over the world.  We now know it to have been H1N1 influenza type A: modern reconstruction of the virus from archived tissue samples and frozen bodies found in permafrost has shown it probably jumped directly into humans from birds, as all influenza A viruses appear to originate in birds.

1920s: Agents of several other diseases were found to be “filterable viruses” in the 1920s, including yellow fever virus by Adrian Stokes in 1927, in Ghana.  Indeed, the US bacteriologist and virologist Thomas Rivers in 1926 counted some sixty-five disease agents that had been identified as viruses.

1929: Howard Andervont, at Harvard University, showed that human herpes simplex virus could be cultured by injection into the brains of live mice.

1930: The South African-born Max Theiler – son of Sir Arnold, also at Harvard – showed that yellow fever virus could be similarly cultured.  Culture in mice also allowed the development of attenuated or weakened strains of virus by serial passage or repeated transmission of the virus between mice, and the successful animal testing of vaccine candidates and of protective antisera.  Theiler was awarded the Nobel Prize in 1951 for this work, which until 2008 was the first and only recognition of virus vaccine work by the Nobel Foundation.

1931: A landmark in medical virology was the development of human vaccines against yellow fever virus, by Wilbur Sawyer in the USA: this followed on Theiler’s mouse work in using brain-cultured virus plus human immune serum from recovered patients to immunize humans – very similar to Theiler Senior’s strategy with rinderpest.

Also in 1931, Richard E Shope in the USA managed to recreate swine influenza by intranasal administration of filtered secretions from infected pigs.  Moreover, he showed that the classic severe disease required co-inoculation with a bacterium – Haemophilus influenzae suis – originally thought to be the only agent.

1929-1931: Patrick Laidlaw and William Dunkin, working in the UK at the National Institute for Medical Research (NIMR), had by 1929 successfully characterised the agent of canine distemper – a morbillivirus related to measles virus – as a virus, proved it infected dogs and ferrets, and in 1931 got a vaccine into production that protected dogs.

1930 – 1931: The fact that phages adsorbed irreversibly to their hosts as part of the infection process was shown by AP Krueger and Max Schlesinger in 1930 – 1931.  Schlesinger later showed between 1934 and 1936 that the bacteriophage he worked with consisted of approximately equal amounts of protein and DNA, which was incidentally the first proof that viruses might be nucleoprotein in nature.

1931: CG Vinson and AM Petre, working with the infectious agent causing mosaic disease in tobacco – tobacco mosaic virus, or TMV – showed that they could precipitate the virus from suspension as if it were an enzyme, and that infectivity of the precipitated preparation was preserved.  Indeed, in their words: “…it is probable that the virus which we have investigated reacted as a chemical substance”.

1933: Christopher Andrewes, Laidlaw and W Smith reported that they had isolated a virus from humans infected with influenza from an epidemic then raging.  They had done this by infecting ferrets with filtered extracts from infected humans – after the fortuitous observation that infected ferrets could apparently transmit influenza to investigators!  The “ferret model” was very valuable, as strains of influenza virus could be clinically distinguished from one another.

Eggs and viruses

1931: Ernest Goodpasture, working at Vanderbilt University in the USA, showed that it was possible to grow fowlpox virus – a relative of smallpox – by inoculating the chorioallantoic membrane of eggs, and incubating them further.

While tissue culture had in fact been practiced for some time – for example, as early as the 1900s, investigators had grown “vaccine virus” or the smallpox vaccine now called vaccinia virus in minced up chicken embryos suspended in chicken serum – this technique represented a far cheaper and much more “scalable” technique for growing pox- and other suitable viruses.

Eggs, assays and vaccines

1936: Frank Macfarlane Burnet used embryonated egg culture of viruses to demonstrate that it was possible to do “pock assays” on chorioallantoic membranes that were very similar to the plaque assays done for bacteriophages, with which he was also very familiar. 

Also in 1936, Burnet started a series of experiments on culturing human influenza virus in eggs: he quickly showed that it was possible to do pock assays for influenza virus, and that:

“It can probably be claimed that, excluding the bacteriophages, egg passage influenza virus can be titrated with greater accuracy than any other virus.”

1937: Max Theiler and colleagues in the USA took advantage of the new method of egg culture to adapt the French strain of yellow fever virus (YFV) he had grown in mouse brains to being grown in chick embryos, and showed that he could attenuate the already weakened strain even further – but it remained “neurovirulent”, as it caused encephalitis or brain inflammation in monkeys.  He then adapted the first YFV characterised – the Asibi strain, from Ghana in 1927 – to being grown in minced chicken embryos lacking a spinal cord and brain, and showed that after more than 89 passages, the virus was no longer “neurotropic”, and did not cause encephalitis.  The new 17D strain of YFV was successfully tested in clinical trials in Brazil in 1938: the strain remains in use today, and is still made in eggs.

TMV is a nucleoprotein

1935-1937: An important set of discoveries started in 1935, when Wendell Stanley in the USA published the first proof that TMV could be crystallised, at the time the most stringent way of purifying molecules.  He also reported that the “protein crystals” were contaminated with small amounts of phosphorus.  An important finding too, using physical techniques including ultracentrifugation and later, electron microscopy, was that the TMV “protein” had a very high molecular weight, and was in fact composed of large, regular particles.  This was a very significant discovery, as it indicated that some viruses at least really were very simple infectious agents indeed.

However, his first conclusion that TMV was composed only of protein was soon challenged, when Norman Pirie and Frederick Bawden working in the UK showed in 1937 that ribonucleic acid (RNA) – which consists of ribose sugar molecules linked by phosphate groups – could be isolated consistently from crystallised TMV as well as from a number of other plant viruses, which accounted for the phosphorus “contamination”.  This resulted in the realisation that TMV and other plant virus particles – now known to be virions – were in fact nucleoproteins, or protein associated with nucleic acid.

The Phage Age

1939: The former physicist Max Delbrück, working with the biologist Emory Ellis at Caltech, elucidated the growth cycle of a sewage-isolated Escherichia coli bacteriophage in a now-classic paper simply entitled “The Growth of Bacteriophage”.  This used the simple technique of counting plaques in a bacterial lawn in a Petri dish, following infection of a standard bacterial inoculum with a dilution series of a phage preparation.

Their principal finding was that viruses multiply inside cells in one step, and not by division and exponential growth like cells. This was determined using the so-called “one-step growth curve”, which allowed the accurate determination of the titres of viruses released from bacteria that had been synchronously infected.  This allowed calculation of not only the time of multiplication of the virus, but also the “burst size” from individual bacteria, or the number of viruses produced in one round of multiplication.  This was a fundamental discovery, and allowed the rapid progression of the field of bacterial and phage genetics.
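The arithmetic behind plaque counting and burst size can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names and all the numbers are hypothetical and are not taken from the Ellis and Delbrück paper.  It assumes that each plaque arises from one infectious unit, and that burst size is estimated as the ratio of the plateau titre after lysis to the titre of infected bacteria during the latent period.

```python
def titre_pfu_per_ml(plaques, fold_dilution, volume_plated_ml):
    """Plaque-forming units (PFU) per ml of the undiluted preparation.

    plaques          -- plaques counted on one plate
    fold_dilution    -- how many fold the sample was diluted (e.g. 10**6)
    volume_plated_ml -- volume of diluted sample added to the lawn, in ml
    """
    if plaques < 0 or fold_dilution < 1 or volume_plated_ml <= 0:
        raise ValueError("counts, dilution and volume must be positive")
    return plaques * fold_dilution / volume_plated_ml


def burst_size(latent_titre, plateau_titre):
    """Average number of phages released per infected bacterium.

    During the latent period each infected bacterium gives one plaque;
    after lysis each released phage gives one plaque, so the ratio of
    the two titres estimates the burst size.
    """
    if latent_titre <= 0:
        raise ValueError("latent-period titre must be positive")
    return plateau_titre / latent_titre


if __name__ == "__main__":
    # Hypothetical plate: 120 plaques from 0.1 ml of a million-fold dilution.
    stock = titre_pfu_per_ml(120, 10**6, 0.1)
    print(f"Stock titre: {stock:.2e} PFU/ml")

    # Hypothetical one-step growth curve: titre before and after lysis.
    print(f"Burst size: {burst_size(2.0e6, 1.2e8):.0f} phages per bacterium")
```

Plates with very few or very many plaques give unreliable counts, which is why a dilution series is plated rather than a single dilution.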

1952: Alfred Hershey and his assistant Martha Chase performed the legendary Hershey-Chase or “Waring blender” experiment in order to prove whether or not DNA was the genetic material of the phage.  They grew up preparations of the E coli bacteriophage T2 separately in the presence of the radioisotopes 35S and 32P, to label the protein and nucleic acid components of the phage respectively.  They allowed adsorption of phages to bacteria in liquid suspension for different times, then sheared off adsorbed phage particles from the bacteria using the blender.  Pelleting the bacteria by centrifugation and assaying radioactivity allowed them to determine that over 75% of the 35S – incorporated into cysteine and methionine amino acids – remained in the liquid, or outside the bacteria, whereas over 75% of the 32P – incorporated into the phage DNA – was found inside the bacteria.  Subsequent production of phage from the bacteria showed that DNA was probably the genetic material, and that protein was not involved in phage heredity – a fundamental discovery at the time.

Animal cell culture and viruses

1949: Possibly the most important development for the study of animal viruses since their discovery was the growing of poliovirus in cell culture: this was reported by John Enders, Thomas Weller and Frederick Robbins from the USA, and was rewarded with a joint Nobel Prize to them in 1954.  They did this around the same time as David Bodian and Isabel Morgan identified three distinct types of poliovirus. Previously, titration or assay of poliovirus, for example, required the injection of virus preparations into the brains of monkeys, or later, in the case of the Lansing or Type II poliovirus strain, into brains of mice.

1952: Renato Dulbecco adapted the technique to primary cultures of chicken embryo fibroblasts grown as monolayers in glass flasks.  Using Western equine encephalitis virus and Newcastle disease virus of chickens, he showed for the first time that it was possible to produce plaques due to an animal virus infection, and that these could be used to accurately assay infectious virus titres.

He and Marguerite Vogt went on in 1953 to show that the technique could be used to assay poliovirus, and that the principle of “one virus, one plaque”, first established with phages and later extended to plant viruses, applied to animal viruses too.

1954: The agent of measles was characterised by Thomas Peebles and Enders via tissue culture; adenoviruses were discovered in 1953 by Wallace Rowe and Robert Huebner and shown to be associated with acute respiratory disease soon afterwards, by Maurice Hilleman and others.



Section 2: Viruses as Organisms


  1. Viruses are acellular organisms with nucleic acid genomes, which make particles to protect the genome and transfer it between cells.
  2. While they do not exhibit all of the supposed attributes of cellular organisms, viruses are independent entities that are not limited to one host.
  3. Virus-like agents include plasmids, satellite viruses, satellite nucleic acids, viroids and retroelements.

What is a virus?

Viruses are organisms that are at the interface between molecules and cells; between what is usually termed “living” and “dead”.  This creates problems for some traditional biologists; however, many of these can be cleared up with some concise definitions.

Definition 1: Viruses

“Viruses are acellular organisms whose genomes consist of nucleic acid, and which obligately replicate inside host cells using host metabolic machinery and ribosomes to form a pool of components which assemble into particles called VIRIONS, which serve to protect the genome and to transfer it to other cells.”

A more radical definition:

“A virus is an infectious acellular entity composed of compatible genomic components derived from a pool of genetic elements.”

Ed Rybicki, 2009

Definition 2: Organisms

Traditionally, “living” organisms are supposed to display the following properties:

  • Reproduction
  • Nutrition
  • Irritability
  • Movement
  • Growth

However, these derive from a top-down sort of definition, which has been modified over the years to take account of smaller and smaller things (with fewer and fewer legs, or leaves), until it has met the ultimate molecular organisms – or “molechisms” or “organules” – that are viruses, and has proved inadequate.

If one defines life from the bottom up – that is, from the simplest forms capable of displaying the most essential attributes of a living thing – one very quickly realises that the only real criterion for life is:

The ability to replicate

and that only systems that contain nucleic acids – in the natural world, at least – are capable of this phenomenon. This sort of reasoning led some virologists to a new definition of organisms:

“An organism is the unit element of a continuous lineage with an individual evolutionary history.”

SE Luria, JE Darnell, D Baltimore and A Campbell (1978). General Virology, 3rd Edn. John Wiley & Sons, New York, p4 of 578.

The key words here are UNIT ELEMENT, and INDIVIDUAL: the thing that you see, now, as an organism is merely the current slice in a continuous lineage; the individual evolutionary history denotes the independence of the organism over time. Thus, mitochondria and chloroplasts and nuclei of eukaryotic cells are not organisms, in that together they constitute a continuous lineage, but separately have no possibility of survival.  This is despite mitochondria and chloroplasts having been independent bacteria before they entered initially symbiotic, and then dependent, associations within another organism.

The concept of the virus as organism is contained within the concepts of individual viruses constituting continuous genetic lineages, and having independent evolutionary histories.

Thus, given this sort of lateral thinking, viruses become quite respectable as organisms:

  • they most definitely replicate,
  • their evolution can be traced quite effectively, and
  • they are independent in terms of not being limited to a single organism as host, or even necessarily to a single species, genus or phylum of host.

Viruses: Living or Dead?

Viruses are simply acellular organisms – which find their full being inside host cells, where some measure of essential support services is offered in order to keep the virus life cycle turning.  What everybody sees as “viruses” are in fact virions, the particles that viruses cause to be made in order to transport their genomes between cells, and to preserve them while doing so.

Thus, in a very real sense a virus IS the cell it infects – because it effectively takes the latter over, and uses it to make portable versions of the genome that can infect other cells.

Qualitatively, this is exactly what seeds and spores of plants and fungi do: they make specialised vehicles that preserve their genomes, and which can respond to changes in their environment to initiate a new organism.

While the debate on whether viruses are living or are indeed organisms gets almost theological in its intensity in certain biological circles, there is a very simple way around the problems – and that is to regard them as a particle/organism duality, much as physicists have learned to do with the dual wave/particle nature of light.

Indeed, some virologists have gone as far as to suggest that the domain of “life” should be divided into two types of organisms: those that encode the full suite of protein-synthesising machinery (cells), and those that do not (viruses).  We agree with that sort of thinking – and recent evidence on the deep evolution of viruses supports it.

Virus Genomes

Viruses have the largest variety of genome types of all organisms:

[Table of virus genome types.  Recoverable entries: single or multiple components; + sense* or – sense.]

* = mRNA-sense or translatable.  (-)sense is complementary to (+)sense and must be transcribed to give mRNA
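The relationship in the footnote can be illustrated with a short Python sketch.  It is a toy model only: the function name and the nine-base sequence are hypothetical, and real viral transcription involves a polymerase and much else.  The sketch simply shows that a (-)sense strand is the reverse complement of the translatable (+)sense strand, with both written 5' to 3'.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def transcribe_negative_sense(genome):
    """Return the (+)sense RNA complementary to a (-)sense genome.

    Both the input and the result are written 5' to 3', so the
    complemented sequence is reversed before being returned.
    """
    seq = genome.upper()
    unknown = set(seq) - set("ACGU")
    if unknown:
        raise ValueError(f"not RNA bases: {sorted(unknown)}")
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical (-)sense fragment; it cannot be translated as it stands.
    negative = "UUAAGCCAU"
    mrna = transcribe_negative_sense(negative)
    # The complementary strand begins with the start codon AUG.
    print(negative, "->", mrna)
```

A (+)sense genome needs no such step before translation, which is why it is described as mRNA-sense.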

In contrast, prokaryotes have mainly single-component circular (occasionally multiple) or sometimes linear dsDNA (Streptomyces, Helicobacter) while all eukaryotes have multi-component linear dsDNA, and all the genomes replicate via the classic semi-conservative route.

Virus genomes range in size from around 1800 nucleotides (ssDNA circoviruses) up to 2.5 million nucleotides (dsDNA, pandoraviruses).

Viruses are the only organisms on this planet to still have RNA as their sole genetic material. They are also the only autonomously replicating organisms to have single-stranded DNA.

Virus-like agents

Viruses are not alone in the acellular space.

There are a number of other types of genomes which have some sort of independence from cellular genomes: these include “retrons” or retrotransposable elements; bacterial and fungal (and eukaryotic organelle) plasmids; satellite nucleic acids and satellite viruses which depend on helper viruses for replication; and viroids.  A very different class of infectious agents – PRIONS – appear to be “proteinaceous infectious agents”, with no nucleic acid associated with them at all.  Other virus-derived genomes and structural components (polydnaviruses, tailocins) have been co-opted by organisms as diverse as wasps and bacteria for their own survival.

Satellite Viruses

There are viruses which depend for their replication on “helper” viruses: a good example is Tobacco necrosis satellite virus (sTNV), which has a small piece of ssRNA which codes only for a capsid protein, and depends for its replication on the presence of TNV.  Another example is Hepatitis D virus: this has a genome consisting of 1700 bases of negative sense circular ssRNA, which also codes for a structural protein (“delta antigen”).  The 36 nm diameter infectious particles consist of Hepatitis B surface antigen (HBsAg) embedded in a cell-derived envelope, with internal HDV nucleoprotein (200 molecules of delta antigen complexed with the RNA). 

The adeno-associated viruses (AAVs) are also satellite viruses dependent on the linear dsDNA adenoviruses for replication, but which have linear ssDNA genomes and appear to be degenerate or defective parvoviruses.

Much larger satellite viruses have been found parasitising giant dsDNA viruses: the so-called “Sputnik virophage” was discovered associated with a mimivirus inside infected amoebae.  This has 74 nm isometric particles with an internal lipid bilayer, and an 18 kbp circular dsDNA genome capable of encoding up to 21 proteins.  Unlike other satellite viruses, the virophage uses the mimivirus’s cytoplasmic virus factories for transcription, replication, and virion assembly, completely independent of the amoebal host’s genes.

A unified system of classification for satellite viruses has recently been proposed, that regularises the apparent discrepancy between regarding some (eg: AAVs) as defective viruses, and others simply as satellites.

Satellite Nucleic Acids

Certain viruses have associated with them nucleic acids that are dispensable in that they are not part of the genome, and which have no (or very little) sequence similarity with the viral genome, yet depend on the virus for replication, and are encapsidated by the virus. These are mainly associated with plant viruses and are generally ssRNA, both linear and circular – however, several circular ssDNA satellites of plant geminiviruses have recently been found, in two separate classes – alpha and beta satellites.


Plasmids

Plasmids may share a number of properties with viral genomes – including modes of replication, as in single-strand circular DNA viruses and plasmids which replicate via rolling circle mechanisms, and circular dsDNA genomes replicating with “theta-like” bidirectional replication forks – but they are not pathogenic to their host organisms.  Some are transferred by conjugation between cells rather than by free extracellular particles, by means of tra genes encoding pili, or rod-like structures.  Plasmids generally encode some function that is of benefit to the host cell, to offset the metabolic load caused by their presence.


Viroids

Viroids are small naked circular ssRNA genomes which appear rodlike under the electron microscope due to their secondary structure and tertiary folding, and which are capable of causing diseases in plants.

They code for nothing but their own structure, and are presumed to replicate by interacting with host RNA polymerase II, and to cause pathogenic effects by interfering with host DNA/RNA metabolism and/or transcription. A structurally similar disease agent in humans is the Hepatitis D virus, although this does encode protein.


Retroelements

Classic RNA-containing retroviruses, and the DNA-containing pararetroviruses (hepadnaviruses, caulimoviruses and badnaviruses) all share the unlikely attribute of the use of an enzyme complex consisting of an RNA-dependent DNA polymerase/RNAse H (reverse transcriptase) in order to replicate. They share this attribute with several retrotransposons, which are eukaryotic transposable cellular elements with striking similarities with retroviruses. This includes entities such as the yeast Ty element and the Drosophila copia element, now classified as pseudoviruses and metaviruses.

Retroposons are similar in that they are eukaryotic elements which transpose via RNA intermediates, but they share no obvious genomic similarity with any viruses, other than the use of reverse transcriptase. 

The human and other mammalian LINE-1s (Long Interspersed Nuclear Elements) are a group of retrotransposable elements which make up approximately 15 % of the human genome.

Bacteria such as E coli also have reverse-transcribing transposons – known as retrons – but these are very different to any of the eukaryotic types, while preserving similarities in certain of the essential reverse transcriptase sequence motifs.


Polydnaviruses

Polydnaviruses are unusual in that they appear to be integral parts of their insect host genomes, yet have a virus-like stage and are used by their host to modify another insect’s behaviour and physiology.

The two genera so far described – Bracovirus and Ichnovirus – of family Polydnaviridae contain viruses which have a variable number of circular double-stranded DNA components, with components ranging in size from 2 to >31 kbp, for a total genome size of between 150 – 250 kbp.  Both sets of viruses occur as integrated proviruses in the genomes of endoparasitic hymenopteran wasps, replicate by amplification of the host DNA, followed by excision of episomal genomes by site-specific recombination, and only produce particles by budding from (ichnoviruses) or lysis of (bracoviruses) calyx cells in the oviducts of female wasps during pupal-adult transition.

Moreover, the viruses in the two groups may well not be evolutionarily linked to one another, given that there is no antigenic or genome similarity, and the particles formed by the two groups are very different: ichnoviruses make ellipsoidal particles with double membranes containing one nucleocapsid; bracoviruses make single-enveloped particles containing one or more cylindrical nucleocapsids.  The latter may derive from nudiviruses, which appear to have contributed very substantially to wasp survival.

Particles are injected along with eggs into larvae of lepidopteran hosts; the DNA gets into secondary host cells and is expressed, but does not replicate – and this expression leads to some quite profound physiological changes, many of which are responsible for successful parasitism of the larva by the wasp.  The association between wasp and virus has been termed an “obligate mutualistic symbiosis”, and appears to have evolved over more than 70 million years.


Tailocins

Bacteria often make use of bacteriocins, or proteins that mediate a variety of bacteriotoxic effects frequently directed at even close relatives of the strains producing them.  A particular subset of bacteriocins has recently been labelled “tailocins”, or bacteriophage tail-like bacteriocins: these are produced by both Gram-positive and -negative bacteria, and one species of bacteria may produce more than one kind of tailocin.  For example, the human pathogen Pseudomonas aeruginosa produces tailocins with either a flexible (F) or rigid (R) appearance, which are denoted as F- and R-types: these have served as models for similar tailocins found widely among bacteria.

Tailocins are devices that penetrate the cell envelope of target bacteria and cause lysis.  Similar assemblies, related to R-type tailocins, are used by certain bacteria to inject toxic proteins into eukaryotic cells.

Among Pseudomonas spp., the two kinds of tailocins appear to derive from two lineages of myoviruses (family Myoviridae; R-type) and several lineages of siphoviruses (family Siphoviridae; F-type).  The bacteria encode the several genes necessary for specifying the individual tailocins in distinct clusters – and it appears as though recombination between various components of the clusters can produce tailocins with altered target surface specificities.  This can potentially be engineered to produce novel specificities such as for the human enteric pathogens enterotoxigenic Escherichia coli, and Clostridium difficile.

A recent review commented that:

“Tailocins illustrate the daedalian capacity of bacteria to accommodate exogenous genetic elements and domesticate them for their own benefit. The stinging device used by tailed (bacterio)phages against bacteria has been cunningly converted into tools to manipulate eukaryotic cells and into precision weapons for interbacterial warfare.”

Maarten G.K. Ghequire and René De Mot, 2015. The Tailocin Tale: Peeling off Phage Tails. Trends in Microbiology, Vol. 23, No. 10, 587-590.


Section 3: Origins of Viruses

Summary: Virus Origins

  1. Viruses probably have more than one origin
  2. Some virus genomes probably escaped from cells as rogue mRNAs
  3. Many viruses result from the exchange of “cassettes” of essential genes
  4. Many ss(-)RNA viruses infecting plants and vertebrates probably originated in insects
  5. The “retroid cycle”, or DNA -> RNA -> DNA, is a characteristic of cellular elements as well as of two classes of viruses
  6. Rolling circle replication is a feature of bacterial plasmids and diverse families of small ssDNA viruses
  7. Big DNA viruses may have evolved very early

So from what did viruses evolve, or how did they initially arise?

The answer to this question is not simple.  While all viruses are obligate intracellular parasites that use host cell machinery to make their components, which then self-assemble into particles containing their genomes, they most definitely do not have a single origin; indeed, their origins may be spread out over a considerable period of geological and evolutionary time.

Viruses infect all types of cellular organisms, from Bacteria through Archaea to Eukarya; from E. coli to mushrooms; from amoebae to human beings – and virus particles may even be the single most abundant and varied organisms on the planet, given their abundance in all the waters of all the seas of planet Earth. 

Given this diversity and abundance, and the propensity of viruses to swap and share successful modules between very different lineages and to pick up bits of genome from their hosts, it is very difficult to speculate sensibly on their deep origins – but we shall outline some of the probable evolutionary scenarios.

The graphic depicts a possible scenario for the evolution of viruses: “wild” genetic elements could have escaped, or even been the agents for transfer of genetic information between, both RNA-containing and DNA-containing “protocells”, to provide the precursors of retroelements and of RNA and DNA viruses.  Later escapes from Bacteria, Archaea and their progeny Eukarya would complete the virus zoo.

It is generally accepted that many viruses have their origins as “escapees” from cells; rogue bits of nucleic acid that have taken the autonomy already characteristic of certain cellular genome components to a new level. Simple RNA viruses are a good example of these: their genetic structure is far too simple for them to be degenerate cells; indeed, many resemble renegade messenger RNAs in their simplicity.

RdRp cassettes and virus evolution

What they have in common is a strategy which involves use of a virus-encoded RNA-dependent RNA polymerase (RdRp) or replicase to replicate RNA genomes – a process which does not occur in cells, although most eukaryotes so far investigated do have RdRp-like enzymes involved in regulation of gene expression and resistance to viruses.  The surmise is that in some instances, an RdRp-encoding element could have become autonomous – or independent of DNA – by encoding its own replicase, and then acquired structural protein-encoding sequences by recombination, to become wholly autonomous and potentially infectious.

A useful example is provided by the two sets of ss(+)RNA viruses sometimes referred to as the Picornavirus-like and the Sindbis- or Alphavirus-like supergroups.  These viruses can be neatly divided into two groups according to their RdRp affinities, which determine how they replicate.  However, they can also be divided according to their capsid protein affinities, which is where it is obvious that the phenomenon the late Rob Goldbach termed “cassette evolution” has occurred: some viruses that are relatively closely related in terms of RdRp and other non-structural protein sequences have completely different capsid proteins and particle morphologies, due to acquisition by the same RdRp module of different structural protein modules.

Given the very significant diversity in these sorts of viruses, it is quite possible that this has happened a number of times in the evolution of cellular organisms on this planet – and that some single-stranded RNA viruses like bacterial RNA viruses or bacteriophages and some plant viruses (like Tobacco mosaic virus, TMV) may be very ancient indeed.

However, other ssRNA viruses – such as the ss(-)RNA mononegaviruses in Order Mononegavirales, which includes the families Bornaviridae, Rhabdoviridae, Filoviridae and Paramyxoviridae, represented by Borna disease virus, rabies virus, Zaire Ebola virus, and measles and mumps viruses respectively – may be evolutionarily much younger.  In this latter case, the viruses all have the same basic genome, with genes in the same order, and helical nucleocapsids within differently-shaped enveloped particles.

Their host ranges also indicate that they originated in insects: the ones with more than one phylum of host either infect vertebrates and insects or plants and insects, while some infect insects only, or only vertebrates – indicating an evolutionary origin in insects, and a subsequent evolutionary divergence in them and in their feeding targets.

The Retroid Cycle

The ssRNA retroviruses – like HIV – are another good example of possible cell-derived viruses, as many of these have a very similar genetic structure to elements which appear to be integral parts of cell genomes – the previously-mentioned retrotransposons – and share the peculiar property of replicating their genomes via a pathway which goes from single-stranded RNA through double-stranded DNA (reverse transcription) and back again, and yet have become infectious. 

They can go full circle, incidentally, by permanently becoming part of the cell genome by insertion into germ-line cells – so that they are then inherited as “endogenous retroviruses”, which can be used as evolutionary markers for species divergence.

Indeed, there is a whole extended family of reverse-transcribing mobile genetic elements in organisms ranging from bacteria all the way through to plants, insects and vertebrates, indicating a very ancient evolutionary origin indeed – and which includes two completely different groups of double-stranded DNA viruses, the vertebrate-infecting hepadnaviruses or hepatitis B virus-like group, and the plant-infecting badna- and caulimoviruses.
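The RNA -> DNA -> RNA pathway described above can be illustrated with a deliberately simplified toy model.  The sketch below only tracks base-pairing through one turn of the cycle; it ignores long terminal repeats, second-strand synthesis, integration and everything else that real retroelements do.  The function names and the short example sequence are hypothetical, chosen purely for illustration.

```python
# Toy model of one turn of the "retroid cycle": RNA -> DNA -> RNA.
# All names and the example sequence are illustrative assumptions.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the complementary DNA strand of an RNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def transcribe(template_dna):
    """Return the RNA copied from a DNA template strand, written 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna))


genome_rna = "AUGC"                    # hypothetical four-base "genome"
cdna = reverse_transcribe(genome_rna)  # the reverse transcription step
progeny_rna = transcribe(cdna)         # the transcription step

print(genome_rna, "->", cdna, "->", progeny_rna)
```

The point of the sketch is simply that the information in the RNA genome survives the detour through DNA: the progeny RNA has the same sequence as the starting genome.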

Metaviruses and pseudoviruses

These are two families of long terminal repeat-containing (LTR) retrotransposons, with different genetic organisations. 

Members of family Pseudoviridae, also known as Ty1/copia elements, have polygenic genomes of 5-9 kb ssRNA which encode a retrovirus-like Gag-type protein, and a polyprotein with protease (PR), integrase (IN) and reverse transcriptase / RNase H (RT) domains, in that order.  While some members also encode an env-like ORF, the 30-40 nm particles that are an essential replication intermediate have no envelope or Env protein.  They are not infectious.  Host species include yeasts, insects, plants and algae.

Metaviruses – family Metaviridae – are also known as Ty3-gypsy elements, and have ssRNA genomes of 4-10 kb in length.  They replicate via particles 45-100 nm in diameter composed of Gag-type protein, and some species have envelopes and associated Env proteins.  Gene order in the genomes is Gag-PR-RT-IN-(Env), as for retroviruses.  One virus – Drosophila melanogaster Gypsy virus – is infectious; however, as for pseudoviruses, most are not.  The genomes have been found in all lineages of eukaryotes so far studied in sufficient detail.
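The difference in genetic organisation between the two families comes down to the relative positions of the IN and RT domains.  As a minimal sketch, the gene orders given above can be written down and compared directly; the dictionary and function names are hypothetical, and the retrovirus entry simply restates the order the text gives "as for retroviruses".

```python
# Gene/domain orders as stated in the text; the optional Env is left out
# for the two retrotransposon families because not all members have it.
GENE_ORDER = {
    "Pseudoviridae (Ty1/copia)": ["Gag", "PR", "IN", "RT"],
    "Metaviridae (Ty3/gypsy)": ["Gag", "PR", "RT", "IN"],
    "Retroviruses": ["Gag", "PR", "RT", "IN", "Env"],
}


def rt_precedes_in(order):
    """True if the RT domain is encoded upstream of the IN domain."""
    return order.index("RT") < order.index("IN")


for family, order in GENE_ORDER.items():
    layout = "RT before IN" if rt_precedes_in(order) else "IN before RT"
    print(f"{family}: {'-'.join(order)} ({layout})")
```

On this comparison the metaviruses group with the retroviruses, while the pseudoviruses stand apart with IN ahead of RT.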

Both pseudovirus and metavirus genomes are clearly related to classic retroviruses; moreover, RT sequences point to metavirus RTs being most closely related to those of the plant DNA pararetrovirus lineage of caulimoviruses.  This gives rise to the speculation that pseudoviruses and metaviruses have a common and ancient ancestor – and that two different metavirus lineages gave rise to retroviruses and caulimoviruses respectively.

EVEs: Endogenous viral elements

The possibility that certain non-retro RNA viruses can actually insert bits of themselves by obscure mechanisms into host cell genomes – and afford them protection against future infection – complicates the issue rather, by reversing the canonical flow of genetic material.  This may have been happening over aeons of evolutionary time, and may have involved hosts and viruses as diverse as plants (integrated poty- and geminivirus sequences) and honeybees (integrated Israeli acute paralysis virus).  It also includes the recent discovery of “…integrated filovirus-like elements in the genomes of bats, rodents, shrews, tenrecs and marsupials…” which, in the case of mammals, transcribed fragments “…homologous to a fragment of the filovirus genome whose expression is known to interfere with the assembly of Ebolavirus”.

Another fascinating recent example of molecular “paleovirology” was the demonstration that endogenous viral elements (EVEs) derived from a wide variety of RNA viruses (ssRNA+, ssRNA-, dsRNA) could be found in the genomes of mammals and insects, as well as ssDNA circo- and parvoviruses in mammal and bird genomes, and a family of Hepatitis B-like pararetrovirus sequences in birds and a tick.

The finding of related families of virus-derived insertions in widely-diverged animal species allows the deduction that the viruses have been diverging for at least as long as their hosts – pushing back their possible origins to more than 30 million years in the case of filoviruses.

Rolling circle replication

There are also obvious similarities in mode of replication between a family of elements which include bacterial plasmids, bacterial single-strand DNA viruses, and ssDNA viruses of eukaryotes which include geminiviruses and nanoviruses of plants, parvoviruses of insects and vertebrates, and circoviruses and anelloviruses of vertebrates. 

These agents all share a “rolling circle” DNA replication mechanism, with replication-associated proteins and DNA sequence motifs that appear similar enough to be evolutionarily related, albeit very distantly – and again demonstrate a continuum from the cell-associated and cell-dependent plasmids through to the completely autonomous agents such as relatively simple but ancient bacterial and eukaryote viruses.

Big DNA viruses

There are a significant number of viruses with large DNA genomes for which an origin as cell-derived subcomponents is not as obvious.  In fact, one of the larger viruses yet discovered – mimivirus, with a genome size of greater than 1 million base pairs of DNA – has a genome which is larger and more complex than those of obligately parasitic bacteria such as Mycoplasma genitalium (around 0.5 million), despite sharing the obligately intracellular life habits of tiny viruses like canine parvovirus (0.005 million, or 5000 bases). 

Mimivirus has been joined, since its discovery in 2003, by Megavirus (2011; 1.2 Mbp) and now Pandoravirus (2013; 1.9-2.5 Mbp).
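To put these genome sizes in proportion, the short calculation below expresses each one as a multiple of the canine parvovirus genome.  It uses only the approximate figures quoted above; "greater than 1 million" for mimivirus is treated as a lower bound of 1,000,000 bp, and the variable names are hypothetical.

```python
# Approximate genome sizes in base pairs, taken from the figures in the text.
GENOME_SIZE_BP = {
    "Canine parvovirus": 5_000,
    "Mycoplasma genitalium": 500_000,
    "Mimivirus (lower bound)": 1_000_000,
    "Megavirus": 1_200_000,
    "Pandoravirus (low end)": 1_900_000,
    "Pandoravirus (high end)": 2_500_000,
}

reference = GENOME_SIZE_BP["Canine parvovirus"]

# Size of each genome as a multiple of the parvovirus genome.
fold_vs_parvovirus = {
    name: size / reference for name, size in GENOME_SIZE_BP.items()
}

for name, fold in fold_vs_parvovirus.items():
    print(f"{name}: {fold:.0f}x the canine parvovirus genome")
```

On these figures, the largest viral genomes are several hundred times the size of the smallest mentioned here, and two to five times the size of the bacterial genome used for comparison.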

The nucleocytoplasmic large DNA viruses or NCLDVs – including pox-, irido-, asfar-, phyco-, mimi-, mega- and pandoraviruses, among others – have been grouped as the proposed Order Megavirales, and it is proposed that they evolved, and started to diverge, before the evolutionary separation of eukaryotes into their present groupings.

It is a striking fact that the largest viral DNA genomes so far characterised seem to infect primitive eukaryotes such as amoebae and simple marine algae – and they and other large DNA viruses like pox- and herpesviruses seem to be related to cellular DNA sequences only at a level close to the base of the “tree of life”.

This indicates a very ancient origin or set of origins for these viruses, which may conceivably have been as obligately parasitic cellular lifeforms which then made the final adaptation to the “virus lifestyle”. 

However, their actual origin could be in an even more complex interaction with early cellular lifeforms, given that viruses may well be responsible for very significant episodes of evolutionary change in cellular life, all the way from the origin of eukaryotes through to the much more recent evolution of placental mammals.  In fact, there is informed speculation as to the possibility of viruses having significantly influenced the evolution of eukaryotes as a cognate group of organisms, including for some the intriguing possibility that a large DNA virus may have provided the first cellular nucleus.

The unification of cellular and viral “trees of life” has been a goal for many years – and may have been partially achieved, with the demonstration by analysis of protein fold similarities that there must have been ancient protein lineages “common to both cells and viruses before the appearance of the ‘last universal cellular ancestor’ that gave rise to modern cells”.

In summary, viruses are as much a concept as a unitary entity: all that viruses have in common, given their polyphyletic origins, is a base-level strategy for expressing their genomes.  Otherwise, their origins are possibly as varied as their genomes, and may remain forever obscure.
