All published articles of this journal are available on ScienceDirect.
The Strengths and Challenges of Conventional Short-Term Genetic Toxicity Assays
Abstract
The field of toxicology is moving forward rapidly, and the trend is toward non-animal testing. In terms of genetic toxicity testing, there are many accepted OECD test guidelines reliant on in vivo animal models. In the interest of providing a complete picture of the methods available for both animal-derived and non-animal genetic toxicity testing, Part I in the series discusses the existing short-term accepted test methodologies, while Part II examines available New Approach Methodologies (NAMs). The advantages and disadvantages of the current methodologies are discussed here, while those of new methodologies and the current regulatory landscape relative to them are reserved for discussion in Part II.
1. INTRODUCTION
Genotoxicity refers to the ability of a substance to damage genetic material, while mutagenicity refers to the permanent and transmissible variations in the amount or structure of genetic material, leading to increased mutational frequency [1]. Carcinogens may operate by genetic (heritable via the damage of genetic material) or epigenetic (somatically heritable, such as histone modification) mechanisms. Central to the theory of carcinogenesis is that mutations, under the right circumstances, cause cancer. It has been theorized that even one single mutation (“one hit”) may cause cancer, and hence ultimately be fatal.
The origins of the “one hit” model can be traced to the ‘linearity at low-dose’ concept (LNT), also referred to as linear low-dose extrapolation, in ionizing radiation-induced mutation. This concept holds that there is no lowest safe dose for carcinogens (no ‘threshold’ of carcinogenicity) because any random mutational event could result in cancer. As detailed by Calabrese et al. [2-7], from the 1920s onward, physicists and radiation geneticists developed and integrated the LNT theory, and it became accepted science. Originally, this theory asserted that mutational events were proportional to the amount of energy absorbed, driving the evolution of biological organisms. Developed further through work with Drosophila, this concept replaced the previous ‘gold standard’ threshold dose-response theory long accepted in medicine and physiology. By 1958, the LNT theory was generalized to somatic cells and cancer risk assessment by the U.S. National Committee (the National Committee on Radiation Protection [NCRPM] [8]) and quickly adopted nationally and internationally by various committees [4], eventually resulting in adoption by the U.S. Safe Drinking Water Committee (1977) of the National Academy of Sciences, which extended the BEAR/BEIR committee linear dose-response risk assessment model to chemical carcinogens. It was subsequently adopted by the US EPA and the US FDA in 1977 for animal carcinogen drug residues (U.S. CFR 21). Thus, the search was launched for genotoxic compounds, and several different assays were introduced to test for the genotoxicity of various chemical and biological agents. The Ames Test [9] is considered the bedrock of testing. This test for mutational histidine revertants is carried out in four or more Salmonella bacterial strains engineered to be more susceptible to specific types of mutations, enhancing the sensitivity of the test. An E. coli strain was also developed.
The Ames Assay, Micronucleus Test, Chromosomal Aberration Assay, Comet Assay, Thymidine Kinase Assay, and the more recent, indirect ROSGlo Assay are now familiar to every genetic toxicologist. Advantages include in vitro accessibility, relatively low cost, and speed. The advent of in vitro/ex vivo genetox testing accelerated the acquisition of knowledge about the genotoxic potential of hundreds to thousands of substances, both natural and anthropogenic. In some instances, this caused controversy because the substances were not anticipated to be hazardous (such as char-cooked meats or tires [10]). In other cases, the tests provided proof of the mutagenic character of substances that had long eluded cancer biologists, such as cigarette smoke condensate, for which animal models had proved problematic [11]. The strengths and weaknesses of the standard short-term genetic toxicity tests currently in use are discussed in Part I of this review.
2. METHODS
The literature was searched using the following strings:
“Short-term” AND genetox* AND testing AND method*
Genotox* AND testing AND conventional AND method*
Genotox* AND testing AND conventional AND limitation*
and the following phrases:
“Conventional short-term genotoxicity testing methods”
“Limitations of conventional short-term genotoxicity testing”
Searches were run in Google, Google Scholar, PubMed, ResearchGate, ScienceDirect.com, Wiley Online Library, ACS Publications, and the U.S. Food and Drug Administration website.
Then the snowball technique was used to build on the results obtained. ‘Conventional Short-Term Assays’ were defined as those non-chronic or non-sub-chronic assays that have been in use for at least 20 years and are accepted by OECD or EURL/ECVAM and have been subjected to validation through multi-laboratory and various other testing strategies.
The results were sorted according to date of publication and relevancy, and any duplicates were discarded. The methods were defined as either OECD TGs, EURL/ECVAM, or non-OECD/EURL/ECVAM TGs. Publications about other tests were not included.
Results were then categorized into major methodologies. Methods that were not described as (gold) standard/standardized, frequently employed, or having been accepted, validated, and improved through longstanding usage, were not included.
The remaining references were analyzed for their content and included for discussion if they described the advantages or disadvantages of the test method. Finally, they were compared with the specified TG to verify accuracy at specific points.
2.1. Conventional Short-Term Assays: Shortcomings and Strengths
The following conventional short-term genetic toxicity assays are briefly compared and contrasted in Table 1.
2.1.1. Ames Test (OECD 471)
2.1.1.1. Assay Principle & Applicability
The Ames Assay was the first genetic toxicity test to be developed, and it is the ‘gold standard’ test for classifying a chemical substance as genotoxic. Since the test uses prokaryotic (bacterial) cells, direct concordance with human carcinogenesis, or even mutagenesis, is not possible. It is assumed that if a substance is mutagenic in bacteria, it is likely to be mutagenic in mammals as well. However, as pointed out in the test guideline (TG) OECD 471 [12], bacterial cells have different uptake, metabolism, chromosomal structure, and DNA repair processes than mammalian cells. Each test substance should be evaluated in terms of known Toxicokinetics (TK) and metabolism, where possible.
2.1.1.2. Method and Suggested Tips for Success
Due to the inherent differences between bacterial and mammalian cellular systems, some compounds are not suitable for testing in the Ames Assay. These include some antibiotics (because they interfere with bacterial cell systems) and some topoisomerase inhibitors/nucleoside analogues (because they interfere with mammalian cell systems). Indeed, there are mechanisms of carcinogenicity that are considered non-genotoxic, such as oxidative stress or epigenetic (e.g., histone modification) processes. Many of these cannot be tested for in the Ames Assay (except for several oxidative mutagens such as hydrogen peroxide and other peroxides, X rays, bleomycin, neocarzinostatin, streptonigrin, and other quinones and phenylhydrazine, using the TA102 strain) [13]. To reduce the possibility of false negative test results, the preincubation method is considered appropriate for derivatives of aliphatic N-nitroso compounds or alkaloids [14-16], and the Prival method for azo dyes [17], as discussed in Gatehouse et al. [18]. Importantly, some chemicals have specific properties rendering them unsuitable for testing with the Ames Assay (benzene, urethane, procarbazine, salicylazosulfapyridine), and there may be others [19, 20]. For nanomaterials, the inability to permeate the bacterial cell wall is expected to interfere with obtaining test results.
An exogenous source of metabolic activation (typically rat S9 liver extract) is required, and it is usual to include + and – S9 conditions to confirm whether the substance requires metabolic activation to be mutagenic. Since Guengerich et al. [21], different substances have been known to suppress or induce CYP450s, and studies examining induction with Aroclor 1254 and differences between rat and human liver or other organs have shown that induction is not uniform and varies by orders of magnitude. Some CYP450s are expressed less or not at all, depending on species, age, and sex [22-27]. Human liver S9 is generally not recommended for use due to its reduced sensitivity and lack of concordance of enzymatic activity with the level of mutational effect [28]. However, it could prove useful for evaluating chemicals such as aromatic amines, which have species-specific metabolic differences. It is important to understand the system limitations (which CYP450s are present and what their usual induction in S9 is). Examples exist of compounds that were negative in the Ames Assay as usually performed but showed strong results in variations of the test, which depended on the properties of the test substance itself (e.g., Ochratoxin A) [22, 29]. An enhanced Ames Test has been published that addresses the reduced sensitivity of the Ames assay to N-nitrosamines [30], which include substances found as impurities in drugs containing a wide variety of functional groups. The enhanced protocol should be used if the test substance is an N-nitrosamine and does not initially test positive in the standard Ames assay. Thomas et al. [31] discuss the optimal selection criteria for accurate detection of N-nitrosodimethylamine and N-nitrosodiethylamine.
2.1.1.3. Advantages and Disadvantages
Advantages of the test include relative ease of performance, cost, and time. However, conflicting results are sometimes obtained for several chemicals that may be attributed to differences in DNA repair capacity or metabolism among cell types or species used in metabolic activation, bioavailability, or factors specific to the mechanism/endpoint of the substance, resulting in false negatives or positives [28, 29]. The former would appear to be most critical for human health, although the latter may be the most resource-intensive.
The Ames Assay does not eliminate animal testing entirely because the S9 rat liver microsomal fraction and the various cell lines that may be employed all originate from vertebrate mammalian species. However, it can be said to reduce the use of animals in toxicology testing by an extraordinary degree, especially if a ‘negative’ compound is not perceived to require further testing in vivo, or if a ‘positive’ compound is dropped from further development due to the potential for carcinogenicity. Both scenarios can mean substantial savings in effort, time, and resources, as well as in animal lives.
2.1.2. Micronucleus Test (OECD 474, 475, 487)
2.1.2.1. Assay Principle & Applicability
The Micronucleus (MN) test is used to evaluate clastogenic and aneugenic damage in vivo or in vitro. The MN test is a widely accepted and validated assay that is covered under the recently updated OECD TG 487 (in vitro; MNvit) [32, 33], OECD 474 [34], OECD 475 [35], and FDA Redbook [36] (in vivo). Micronuclei were first identified by Howell and then Jolly [37, 38] and first used to identify and quantify chromosomal damage by Evans et al. [39]. Others [40-48], as described below, went on to develop and refine the principles and usage of the test. This assay identifies chromosomal breakage or spindle disruption by means of counting micronucleated cells (cells with an extra, small or ‘micro-’ sized nucleus in addition to the original nucleus), which were early on found to result from folic acid and vitamin B deficiency, X-ray treatment, and exposure to other mutagens like hydrogen peroxide. Such cells were first identified in the bone marrow of fetal mice, but later it was determined that peripheral blood erythrocytes, and then other cell types (both human and rodent), were also a suitable system for quantitating micronuclei, which obviated the need to kill animals and clearly would be an advantage in multiple dosing studies.
2.1.2.2. Method and Suggested Tips for Success
The in vivo MN test (mammalian erythrocyte micronucleus assay) detects chromosomal or spindle disruption in erythrocytes sampled from bone marrow or peripheral blood cells, typically of rodents [35, 37]. If the collection of cells is from bone marrow, the amount of information that can be collected will be limited to the single collection time at the sacrifice of each treatment group. If blood is used, the regimen can be more varied with repeat dosing and multiple collection times. The method is intensive in its use of animals (at least 5 animals per group, multiple doses, potentially multiple animal groups for different sacrifice times, concurrent TK groups, potentially recovery groups, positive and negative control groups, initial limit testing, and spare animals). Animals may be treated once or twice within 24 hours, with blood samples withdrawn 36 to 72 hours after the last dose. Alternatively, they may be treated once per day for several days, with samples collected 24 hours after the last treatment (bone marrow) or 40 hours after the last treatment (peripheral blood). At least 200 erythrocytes should be counted for bone marrow and 1000 for peripheral blood, and the ratio of immature to total red blood cells should be determined for each animal. At least 2000 immature erythrocytes per animal are scored for micronuclei, with the potential to score mature erythrocytes for more information (especially if animals are treated continuously for four or more weeks). Further quality control/assurance measures can be found in Howell and OECD TG 475 [35, 37].
The MNvit test (micronucleus in vitro; OECD TG 487 [33]) is distinct from the in vivo micronucleus test [49]. Chromosomal damage, resulting from the formation of acentric chromosomal fragments or whole chromosomes that fail to migrate to the poles during anaphase, is detected in cells treated in vitro. Cells must have undergone one cell division during or after test substance exposure. The resulting micronuclei are easily visualized and counted, manually or by automated means (e.g., FACS or cell sorting, image analysis, laser scanning cytometry), in at least 2000 cells. Chromosomal aberrations scored in metaphase differ from MNvit scoring because the damage may not transfer to daughter cells, whereas anaphase-scored chromosomal damage indicates that the damage will transfer permanently.
The addition of Cytochalasin B (CytoB) prior to mitosis prevents the completion of cytokinesis after nuclear division, resulting in binucleate cells. This characteristic is useful for counting micronuclei in cells that have undergone only one mitotic event. However, the use of CytoB is not required if it can be shown that cells have undergone one mitosis, except when human lymphocytes are used. This is because cell cycle times vary among donors, and not all lymphocytes respond to Phytohaemagglutinin (PHA) stimulation, which is a requirement for the activation of proliferation in lymphocytes. Further, cells are not typically treated with CytoB if flow cytometry is used for quantitation [33].
Different metrics are used to quantify cytotoxicity: relative increase in cell count (RICC) and relative population doubling (RPD) when CytoB is not used, and cytokinesis-block proliferation index (CBPI) or Replication Index (RI) when CytoB is used. Therefore, for cells not treated with CytoB, RICC or RPD should be used to inform the mitotic status. In either case, cytotoxicity should be quantified with and without metabolic activation during actual testing and may also be quantified during the preliminary phase of dose selection [33]. The key stipulations of the Ames assay apply to MNvit as well, including multiple exposure concentrations (at least 3) plus positive and negative controls, limits on cytotoxicity (the highest concentration should achieve 55 ± 5% cytotoxicity), and control of pH, osmolality, and solubility. Assay conditions should include 3-6 hr exposure to the test chemical ±S9 and removal of the test substance, followed by counting after 1.5 – 2.0 normal cell cycle lengths from the beginning of treatment. To confirm a negative result, conditions should also include a continuous exposure without S9 for 1.5 – 2.0 cell cycle lengths. The conditions can be carried out sequentially, stopping after the first positive result. The same difficulties may arise, such as capturing the optimal active range of the suspected genetic toxicant, dealing with insoluble substances, interactions with cell culture or other reagents, karyotypic instability, and the background frequency of micronuclei. Further, the origin of the cells, their intrinsic p53 status, and their DNA damage repair capabilities should be considered. All rigorous cell culture quality control measures should be followed, as should the recommendations for the specific cell types that are used [33].
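The four cytotoxicity indices named above can be illustrated with a short calculation. The following is a minimal sketch in Python. The formulas follow the definitions commonly given for these indices in OECD TG 487 and are not spelled out in this review, so they should be checked against the guideline before use. All function and parameter names are hypothetical.

```python
import math


def cbpi(mono: int, bi: int, multi: int) -> float:
    """Cytokinesis-block proliferation index from counts of mono-, bi-,
    and multinucleate cells (CytoB-treated cultures)."""
    total = mono + bi + multi
    if total == 0:
        raise ValueError("no cells scored")
    return (mono + 2 * bi + 3 * multi) / total


def cytostasis_from_cbpi(cbpi_treated: float, cbpi_control: float) -> float:
    """Percent cytostasis of a treated culture relative to its control."""
    return 100.0 - 100.0 * (cbpi_treated - 1.0) / (cbpi_control - 1.0)


def replication_index(bi_t: int, multi_t: int, total_t: int,
                      bi_c: int, multi_c: int, total_c: int) -> float:
    """Replication Index (%): division cycles in treated vs. control cells."""
    treated = (bi_t + 2 * multi_t) / total_t
    control = (bi_c + 2 * multi_c) / total_c
    return 100.0 * treated / control


def ricc(start: float, end_treated: float, end_control: float) -> float:
    """Relative increase in cell count (%), for cultures without CytoB."""
    return 100.0 * (end_treated - start) / (end_control - start)


def population_doublings(start: float, end: float) -> float:
    """Number of population doublings between two cell counts."""
    return math.log2(end / start)


def rpd(start: float, end_treated: float, end_control: float) -> float:
    """Relative population doubling (%), for cultures without CytoB."""
    return (100.0 * population_doublings(start, end_treated)
            / population_doublings(start, end_control))
```

In this sketch, cytotoxicity without CytoB would be taken as 100 minus RICC or 100 minus RPD, and with CytoB as the cytostasis value or 100 minus RI. The result for the highest concentration would then be compared against the 55 ± 5% target.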
2.1.2.3. Advantages and Disadvantages
Quantitation of micronuclei in either immature nucleated or enucleated mature red blood cells is relatively simple, as is interpretation in most cases [32]. The in vivo MN assay is superior in that it encompasses metabolism, PK, and DNA-repair processes of the whole organism during the treatment. However, the variations in these parameters among organisms, coupled with the requirement that the test substance reach the bone marrow and interact with the blood-forming elements to see a positive result, are limiting.
Advantages of the MNvit assay are its robustness and validity in many cell types, including human or other mammalian peripheral blood lymphocytes [40, 44, 46, 50], CHO, V79, CHL/IU, L5178Y, or human TK6 [29, 50-62], as endorsed by OECD (2023), SFTG [29, 50-53], IWGT [63, 64], ECVAM [65, 66], and ESAC [67]. Further, the method can be augmented by the addition of immunochemical labeling of kinetochores or the use of FISH (fluorescence in situ hybridization) methods to label centromeres or telomeres [44, 47, 64, 68-64]. These methods increase the amount of information gleaned from the experiments, allowing the mechanism of damage behind a positive result to be identified as clastogenic or aneugenic. The method allows for the identification of aneugens that are otherwise difficult to study (OECD 473) but cannot differentiate substances that induce changes in ploidy or chromosome number without the use of FISH [34].
In identifying positive genetic toxicants, the success of the micronucleus test is considered comparable to that of the Comet and γH2AX phosphorylation assays. It is specific to certain types of cellular damage. However, a major shortcoming has been identified [77]: 30 to 40% of compounds that are negative in both the in vivo MN assay and the ToxTracker assay are positive in the in vitro MN assay, possibly due to oxidative stress generated by the compound rather than direct DNA damage. Additionally, the question arises whether the toxicant or its metabolite(s) fail to reach the target tissue in vivo, producing false negative results, or whether false positives have arisen from systemic or generalized toxicity at excessively high doses. Therefore, caution is advised in the interpretation of MN results, and such results are best used as part of an in vitro and/or in vivo battery of tests.
2.1.3. In Vitro Mammalian Cell Chromosomal Aberration Test (OECD 473)
2.1.3.1. Assay Principle and Applicability
This assay was originally adopted in 1983 and has undergone periodic revision and updates [78]. It is intended for the detection of clastogens causing chromosomal aberrations and does not detect aneugens.
2.1.3.2. Method & Suggested Tips for Success
Human or rodent cell lines or primary cells that have a stable karyotype and a low rate of spontaneous chromosomal aberration should be used with metabolic activation unless they are known to be metabolically competent. Cells are treated ± S9 with the test substance for 3–6 hours or continuously exposed for 1.5 cell cycles. They are treated with Colcemid 1 to 3 hours prior to harvest to arrest the cell cycle at metaphase, then treated hypotonically, fixed, and stained, and the fixed preparations are scored microscopically for the presence of chromatid and chromosomal aberrations, which are recorded separately. Polyploidy and endoreduplication are recorded. At least 300 metaphases are scored to satisfy statistical requirements unless there is a high number of cells with chromosomal aberrations (test substance is clearly positive). The methods for this assay differ from MNvit only in that Colcemid is used instead of CytoB. All other guidelines and recommendations mentioned above for the MN assay also apply [78].
2.1.3.3. Advantages and Disadvantages
This assay is a staple in genetic toxicity testing and is simple procedurally and quantitatively, but the test cannot detect aneugens as polyploidy alone does not distinguish aneugens and may indicate cell cycle perturbation or cytotoxicity only. Additionally, the test requires metabolic activation and requires metaphase arrest.
2.1.4. Comet Assay (OECD 489)
2.1.4.1. Assay Principle and Applicability
Originally developed by Cook et al. [79] and later by Ostling [80], the comet assay (OECD 489) [81] is also known as single-cell gel electrophoresis (SCGE) and the alkaline comet assay. It is used to measure the occurrence of single-strand breaks (SB) and alkali-labile sites (ALS) in eukaryotic DNA [82, 83], which are pro-mutagenic. Not only the presence or absence of DNA damage, but also the type and amount of damage can be identified and quantified [83]. For instance, by the introduction of bacterial lesion-specific endonucleases, it can be determined whether UV-induced pyrimidine dimers, oxidized bases, or alkylation damage has occurred [82, 84]. Specific enzymes used are endonuclease III for oxidized pyrimidines, formamidopyrimidine DNA glycosylase for 8-oxoguanine and other purines, T4 endonuclease V for UV-induced cyclobutane pyrimidine dimers, and AlkA for 3-methyladenine sites. The endonucleases act on the accessible DNA sites; thus, the amount of activity over time is a direct measure of pre-existing damage [82].
2.1.4.2. Method and Suggested Tips for Success
The rate of strand break repair can be measured by adding damaged comet tail material to cell isolates and monitoring the repair over time, as a measure of the cells’ reparative capacity. Cells (either from disaggregated tissue, circulating lymphocytes, or cells in culture; plant cells may be used if finely minced) are embedded in agarose, placed on a plain glass slide, and immersed in a lysis buffer, which denatures the DNA (i.e., releasing supercoils). Finally, the cells are subjected to electrophoresis, and the images are analyzed under fluorescence microscopy [82-84]. Ethidium Bromide or DAPI, both of which bind strongly to double-stranded DNA, are commonly used to visualize comets. Acridine orange can differentiate single-stranded (red) from double-stranded (yellow-green) DNA. The intensity of the fluorescence of the comet DNA in the ‘tail’ is linearly correlated with the amount of damaged DNA. The assay has been commercialized (for instance, R & D Systems).
The lysis conditions can be adjusted to scan for single- vs. double-strand breaks (neutral conditions for double-stranded breaks, alkaline for smaller amounts of damage, including both single- and double-stranded breaks). Software is available to analyze the strand break results, but the human eye can readily visualize and classify the level of severity of a strand break as accurately, if not as efficiently, as machine-aided classification [82-84].
Some common misconceptions have evolved, for instance, that a high pH is needed to detect single strand breaks. The use of alkali increases the visibility of comet tails and the types of damage detected but not the sensitivity of the assay. Tail intensity increases accordingly. Other variations on the comet theme include bromodeoxyuridine (BrDU) labeling to detect DNA breaks associated with replicating DNA in S-phase. The BrDU label will show up in the comet tail, versus the head, which would instead indicate post-replicative labeled DNA. Another nuance is to inhibit DNA synthesis using, for instance, hydroxyurea, cytosine arabinoside, or aphidicolin, blocking repair synthesis and resulting in the accumulation of breaks over time. In peripheral lymphocytes, which are non-dividing, breaks increase over time without the use of inhibitors as the rate of repair is naturally slow. FISH can be used to identify specific areas of chromosomes, centromeres, telomeres, and single-copy genes that are damaged or to monitor gene-specific repair rates [82].
It should not be assumed that the cells in the comet tail are apoptotic; damaged DNA can be repaired while apoptosis is irreversible. According to Collins [82], some apurinic (AP) sites might not be fully converted to strand breaks, although strong alkali conditions are likely to convert more of them than weaker conditions might, leading to the supposition that weaker conditions limit sensitivity. Some caution is advised in interpreting results, as the intensity of staining is likely cell cycle phase dependent, as the total fluorescence signal reflects DNA content. Cells on a slide should be limited to about 2 × 10^4 to prevent overlapping comets that are impossible to count. Additionally, signal saturation may occur, leading to an underestimation of damaged bases. This effect can be checked for by performing a dose-response curve. There should be no deviation from linearity at the highest exposure concentration. Air bubbles and edges should be avoided. About 50 comets per slide should be quantified. It should be noted that strand breaks typically occur (in well-studied X-ray experiments) only about 2.5 times per 10^9 Dalton, that is, once every 160 µm. Consequently, it is not possible to determine fragment length in this assay as in conventional DNA electrophoresis. DNA in the tail does not migrate by fragment length [82-84]. Quality control in the assay should be determined by comparing with a well-known standard, either as supplied by the manufacturer or through γ- or x-irradiated controls [85]. As in any well-planned study, the number of samples needed should be determined a priori by a Power Analysis. Preliminary studies can help determine the intra- and inter-individual variability that needs to be overcome to separate the true result from background noise. A good idea is to establish and maintain a pool of frozen cells with known damage under controlled conditions, which can be used as laboratory controls in any experiment.
Cells can be frozen at -80°C with dimethyl sulfoxide (DMSO) and with or without fetal bovine serum (FBS) [82]. Checking the viability of cells using trypan blue can be misleading, as blue cells may yet be viable notwithstanding damaged cell membranes. Thus, the optimal condition of a cell for use in the comet assay is for untreated cells to show 10% or less of DNA in the comet tail, indicating they are undamaged. After analysis, slides can be dried and stored indefinitely on plain glass slides [82].
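The a priori power analysis recommended above can be sketched with a simple calculation. The following Python example uses the standard normal-approximation formula for comparing the means of two groups. This is an assumption made for illustration, since the review does not specify a statistical design. The function and parameter names are hypothetical, and the variability estimate would come from the preliminary studies described above.

```python
import math
from statistics import NormalDist


def samples_per_group(sd: float, min_difference: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate samples per group for a two-sided, two-group
    comparison of means (normal approximation, equal variances)."""
    if sd <= 0 or min_difference <= 0:
        raise ValueError("sd and min_difference must be positive")
    # Critical values for the chosen significance level and power.
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    # n = 2 * ((z_alpha + z_power) * sd / difference)^2, rounded up.
    n = 2.0 * ((z_alpha + z_power) * sd / min_difference) ** 2
    return math.ceil(n)
```

For example, if pilot data suggested a standard deviation of 5 percentage points of tail DNA and the aim were to detect a 5-point difference, `samples_per_group(5.0, 5.0)` would give 16 samples per group. A t-based or design-specific calculation may give slightly larger numbers.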
2.1.4.3. Advantages and Disadvantages
The method is simple, relatively rapid, and inexpensive to perform. Comet assays are a dependable measure of DNA damage; however, there are nuances to the method and the results are sometimes over-interpreted. As with the Ames assay, the comet assay depends initially on biological material that must be obtained from living organisms. Perpetually cultured cells can be used, decreasing the use of live organisms, but they are subject to deterioration and deviations over time and must be stringently checked to ensure they maintain their cellular and genomic identity and integrity. HeLa cells are a common source of cross-contamination through volatilization and have been found to contaminate many cell lines [86]. Some estimates have even put the number of misidentified or contaminated cell lines as high as 36 percent [86, 87].
2.1.5. Mouse Lymphoma (MLA) and Thymidine Kinase Assay (TK6) (OECD 490)
2.1.5.1. Assay Principle and Applicability
These assays have been widely used since the 1980s and are described under OECD Test Guideline (TG) 490 [88] (superseding TG 476 of 1984 and 1997). TG 490 was written for the MLA, but because it uses the TK locus it covers both assays, which have a similar endpoint, although the two cell lines (the L5178Y mouse lymphoma cell line and the TK6 human lymphoblastoid cell line) are not interchangeable. These two major cell lines measure forward mutations in the endogenous thymidine kinase gene. The endogenous thymidine kinase gene (human TK, rodent Tk, and referred to together as TK) is used as a reporter gene and, if deleted, will produce a cell that does not produce the enzyme thymidine kinase. Viable colonies deficient in thymidine kinase after mutation from TK+/- to TK-/- are then quantified. The types of mutations that can be detected are point mutations, frame-shift mutations, small deletions, chromosomal large deletions, rearrangements, and mitotic recombinations (Loss of Heterozygosity, LOH). Loss of the entire chromosome that might occur from spindle malformation, impairment, or mitotic non-disjunction could also be detected. However, these tests are unable to detect aneugens, for which a more appropriate test would be the MN assay [33].
2.1.5.2. Method and Suggested Tips for Success
Treatment with a mutagenic substance produces two mutant types: normal growing and slow growing. Slow-growing mutants have prolonged doubling times compared with the heterozygous parent cells. In the MLA, they are large-colony and small-colony mutants, while in the TK6 assay, they are early appearing and late appearing colonies. Either way, the slow-growing mutants have genetic damage to growth-regulatory genes near the TK locus, causing increased doubling times and late appearing/small colonies [89], and entailing major structural changes to chromosomes (i.e., clastogenic changes). The normal growing mutants do not have these growth-regulatory changes and are typically point mutations (i.e., mutagenic changes) [90-92]. Treatment with cytostatic trifluorothymidine (TFT) will cause cells to arrest if they are TK proficient (unmutated), and thus mutant cells having this selection advantage will proliferate and form visible colonies. In this test, as in all others described here, metabolic activation with S9 or knowledge of metabolic competency is required. An important consideration is that if the test substance bears resemblance to thymidine by structure or behavior, it may increase spontaneous background mutant frequency, requiring a correction. Nanomaterials are not covered by TG 490 [88].
The assay is carried out by first treating cells in suspension (±S9) with the test substance for 3–4 hr, or up to 24 hr without S9 as necessary, followed by sub-culture to carry out cytotoxicity testing (relative total growth [RTG] for MLA; relative survival [RS] for TK6) and allow for the expression of the mutant phenotype (MLA, 2 days; TK6, 3–4 days).
Once the expression is complete, cells are seeded in TFT-containing medium in soft agar or liquid medium to determine positivity (or without TFT for viability, aka cloning efficiency) and grown for a period, after which large and small colonies are counted. For the MLA, incubation is 10-12 days; for TK6, it is 10-14 days for early appearing colonies and, after re-feeding and re-treating with TFT, an additional 7 days for late appearing colonies. Long-term treatment is recommended [93]. Mutant Frequency (MF) is calculated as the number of mutant colonies corrected by the cloning efficiency. Therefore, careful recordkeeping is important, and daily counts are made at each step [88].
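The correction of mutant colony counts by cloning efficiency can be illustrated as follows. This is a minimal Python sketch assuming direct colony counts on selective (TFT) and non-selective plates, as in agar plating. The microwell version of the assay estimates cloning efficiency differently and is not covered here. The function and parameter names are hypothetical, and the authoritative calculations are those given in TG 490 [88].

```python
def cloning_efficiency(colonies: int, cells_plated: float) -> float:
    """Fraction of plated cells that gave rise to a colony."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return colonies / cells_plated


def mutant_frequency(mutant_colonies: int, cells_plated_selective: float,
                     viable_colonies: int,
                     cells_plated_nonselective: float) -> float:
    """Mutant frequency: cloning efficiency under TFT selection divided
    by cloning efficiency without selection (viability)."""
    ce_mutant = cloning_efficiency(mutant_colonies, cells_plated_selective)
    ce_viable = cloning_efficiency(viable_colonies, cells_plated_nonselective)
    if ce_viable == 0:
        raise ValueError("no viable colonies; MF is undefined")
    return ce_mutant / ce_viable
```

As a made-up example, 60 mutant colonies from 10^6 cells plated in TFT, with 160 colonies from 200 cells plated without TFT (cloning efficiency 0.8), would give a mutant frequency of 75 × 10^-6.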
The MLA assay is carried out using the TK+/- 3.7.2C subline of L5178Y cells, and the TK6 assay is carried out using the WI-L2 human lymphoblastoid cell line; both cell lines have a well-described karyotype and can be obtained from a qualified repository [94]. When cultures are established, the cells should be confirmed free of mycoplasma, karyotyped, and checked for population doubling time, then stored at < -150 °C and cleansed of pre-existing mutant cells before use. The method stipulates that there should be between 10 and 100 spontaneous mutants present throughout the experiment for both MLA and TK6, which necessitates treating at least 6 × 10^6 (MLA) and 20 × 10^6 (TK6) cells. The highest concentration of the test agent should produce cytotoxicity in the range between 20 and 10% RTG (MLA) and between 20 and 10% RS (TK6). The calculations for the RTG, RS, and MF are contained in the method [88]. For MLA, colony characterization by size or growth, according to the method used (agar or microwell), is carried out on the highest acceptable positive concentration and on the positive and negative controls for positive substances, and on the controls alone for negative substances. For TK6, both early and late appearing mutants are scored for all cultures, including positive and negative control cultures. If the positive and negative controls do not give the expected result, then the test substance cannot be characterized [88].
The criteria for an acceptable MLA result are found in previous studies [95-102] but are not available for TK6. For MLA, the Global Evaluation Factor (GEF), which is an induced mutant frequency based on historical negative control data from participating laboratories, is used as a comparator. For the agar version of the test, the GEF is 90 × 10^-6, and for the microwell version, the GEF is 126 × 10^-6. The GEF defines the level of response considered biologically relevant and replaces the use of statistical measures for interpreting MLA positivity/negativity. For a result to be considered positive, the increase in MF must exceed the GEF and be concentration-related, as determined by a trend test, in any experimental condition. For a result to be considered negative, the increase in MF must not exceed the GEF, and no trend should be found, in all experimental conditions. In the TK6 assay, by contrast, a result is positive if, in any experimental condition, at least one test concentration shows a statistically significant increase compared with the negative control, the increase is concentration-related, and any of the results fall outside the bounds of historical negative control data. Conversely, a TK6 result is considered negative if none of the test conditions shows a statistically significant increase compared with the control, there is no concentration-related increase based on a trend test, and all results are within the bounds of historical negative control data as assessed using the Poisson 95% control limit. On rare occasions, results for a test substance can be equivocal [88].
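The GEF-based decision rule for the MLA can be written out as a small function. This is only a sketch of the logic described above for a single experimental condition: the trend-test outcome is assumed to come from a separately run statistical test and is passed in as a boolean, mutant frequencies are expressed per 10^6 cells, and the label returned for mixed evidence is a hypothetical placeholder rather than guideline terminology.

```python
# GEF values quoted in the text (induced mutants per 10^6 viable cells).
GEF_PER_MILLION = {"agar": 90.0, "microwell": 126.0}


def mla_call(control_mf, treated_mfs, trend_significant, method="agar"):
    """Classify one MLA experimental condition; all MFs are per 10^6 cells."""
    gef = GEF_PER_MILLION[method]
    # Induced MF = treated MF minus the concurrent negative-control MF.
    induced = [mf - control_mf for mf in treated_mfs]
    exceeds_gef = any(value > gef for value in induced)
    if exceeds_gef and trend_significant:
        return "positive"
    if not exceeds_gef and not trend_significant:
        return "negative"
    # Hypothetical label for mixed evidence; not a term from the guideline.
    return "needs expert judgement"


# Made-up example: control MF of 70, three treated cultures, significant trend.
print(mla_call(70.0, [85.0, 120.0, 210.0], trend_significant=True))
```

An overall negative call would additionally require that every experimental condition tested returns "negative", as stated in the text.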
2.1.5.3. Advantages and Disadvantages
Harmonization and standardization are a distinct advantage, with the procedures clearly defined and potential pitfalls and nuances of the tests spelled out in detail in OECD 490 [88], particularly for MLA. These assays are best applied as part of a battery of several tests, ideally as a follow-on test to a positive Ames Assay result. They cover a broad spectrum of genotoxic effects, as the heterozygosity of the TK gene makes it possible to detect point mutations, large deletions, and recombination. The results are consistent and comprehensive when used in concert with other assays; for instance, it is possible to detect mutagens that otherwise test negative in the Ames Assay. However, sensitivity is low for some applications, e.g., the detection of direct-acting substances, and for MLA, specificity is low. The time needed to perform the assay is relatively short at 72 hours.
2.1.6. ROSGlo Assay (OECD 442E, OECD 425, OECD 442D)
2.1.6.1. Assay Principle and Applicability
The ROSGlo assay is not strictly a genotoxicity test; rather, it provides indirect evidence of cellular damage through oxidative stress caused by Reactive Oxygen Species (ROS). Oxidative stress is a mechanism that may, in some circumstances, lead to cancer. ROS such as H2O2 are important mediators of oxidative stress, which is implicated in cancer and neurodegenerative diseases/aging [103]. ROS cause oxidation of proteins, lipids, RNA, and DNA. When the balance of reductive/oxidative mediators within the cell shifts toward over-production of ROS, cellular homeostasis may be disrupted, potentially resulting in DNA damage. Importantly, cancer cells elevate ROS production via oncogenic mutation, reduction in tumor suppressor activation or transcription, increased metabolism, and adaptive changes that allow the tumor to proliferate and grow in a hypoxic environment [104]. In this assay, H2O2 present in cells or enzymatic reactions activates a luciferin precursor, and the resulting bioluminescence is directly proportional to the amount of H2O2 [105].
2.1.6.2. Advantages and Disadvantages
Disadvantages are, again, that it is a short-term assay for what may best be described as a chronic process and is only an indirect or inferred measurement of effect (oxidative stress, which may or may not result in mutational events, which may or may not cause cancer). Advantages are that it does not use horseradish peroxidase (HRP), known to produce a high rate of false positive results; it is amenable to high-throughput screening (for instance, via liquid handling), and little sample preparation is required [106]. Multiplexing with other fluorescence-based measures of cell health is possible and should be considered.
2.1.7. Mammalian HPRT and XPRT Assay (OECD 476)
2.1.7.1. Assay Principle and Applicability
First adopted in 1984, the in vitro mammalian cell gene mutation test is used to detect forward mutations of the hypoxanthine-guanine phosphoribosyl transferase gene (HPRT in human cells, Hprt in rodent cells; the HPRT test) and of the xanthine-guanine phosphoribosyl transferase transgene (gpt; the XPRT test) in Chinese Hamster Ovary (CHO) or lung (V79) fibroblasts [107, 108]. The HPRT test detects base pair substitutions, frameshifts, small deletions, and insertions. The XPRT test detects all the foregoing plus large deletions and possibly mitotic recombination, because the gpt transgene is located on an autosome, whereas HPRT is located on the X chromosome [95, 96, 109-112].
2.1.7.2. Method and Suggested Tips for Success
Cells in suspension are incubated with several test concentrations and controls for 3-4 hours ±S9 metabolic activation, subcultured for 7-9 days to allow phenotypic expression, and then seeded in medium with or without 6-thioguanine (TG). TG is cytostatic; thus, mutant cells escape selection and continue to grow, while unmutated cells do not. Cytotoxicity is assessed by relative survival (RS), i.e., the cloning efficiency of cells plated just after treatment, compared with that of the control. Positive results are determined from colony counts as statistically significant, dose-dependent increases in mutant frequency above historical negative controls.
Cell types used are sensitive and stable, with high cloning efficiency and a stable spontaneous mutation rate; they include CHO, CHL, V79, L5178Y, and TK6 cells [112, 113] for HPRT, and AS52 cells (which do not contain hprt) for XPRT. Cultures should be checked for contaminating mycoplasma, and the modal chromosome number, cell cycle time, and spontaneous mutant frequency should be verified. Pre-existing mutant cells may need to be removed from working stocks with the use of specific media (i.e., HAT medium for HPRT, MPA medium for XPRT). Specific cell lines require careful adherence to individual requirements and should be used when growing in the log phase, ensuring optimal cloning efficiency, with at least four test concentrations and controls. Guidelines specify that the spontaneous mutant frequency generally ranges between 5 and 20 × 10^-6 and that at least 10 spontaneous mutants should be present; therefore, at least 20 × 10^6 cells should be treated, and at least 2 × 10^6 are to be seeded for mutant selection [114]. During phenotypic expression, cell subculturing is continued to maintain log phase growth, followed by re-plating in TG-selective medium. For those samples meeting the laboratory historical control limits of positive and negative controls, tested ±S9, at appropriate cell densities and at concentrations that do not exceed the recommended cytotoxicity or the guideline recommendations for maximum testing concentration, positivity is established if the following criteria are met: 1) at least one tested concentration differs significantly from the negative control, including being outside of historical control limits; and 2) there is a dose response as evaluated by a trend test.
A true negative result is established when: 1) none of the test concentrations is outside the current negative control historical limits (i.e., none differs significantly from the negative controls, as judged by the Poisson-based 95% control limit); and 2) no dose response exists [108]. Rarely, equivocal results are obtained.
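The cell numbers quoted for the HPRT test follow from simple arithmetic on the spontaneous mutant frequency. The sketch below shows one way to reproduce them, assuming the lower end of the quoted frequency range (5 × 10^-6) and, as a further assumption introduced here to link the two figures, a worst-case survival of 10% after treatment. The function and parameter names are illustrative only.

```python
def cells_needed(min_spontaneous_mutants, spontaneous_mf_per_million,
                 lowest_survival_percent):
    """Return (cells_for_selection, cells_to_treat) as floats."""
    # Cells required so that the expected number of spontaneous mutants
    # (cells x frequency) reaches the stated minimum.
    cells_for_selection = (min_spontaneous_mutants * 1_000_000
                           / spontaneous_mf_per_million)
    # Scale up so that this many cells remain at the lowest survival level
    # (assumed value; not stated in the text).
    cells_to_treat = cells_for_selection * 100 / lowest_survival_percent
    return cells_for_selection, cells_to_treat


selection, treat = cells_needed(
    min_spontaneous_mutants=10,
    spontaneous_mf_per_million=5,
    lowest_survival_percent=10,
)
print(f"{selection:.0e} cells for selection, {treat:.0e} cells to treat")
```

Under these assumptions the calculation returns 2 × 10^6 cells for selection and 20 × 10^6 cells for treatment, matching the figures in the text; a higher spontaneous frequency or a milder cytotoxicity limit would lower both numbers.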
2.1.7.3. Advantages and Disadvantages
Mutant frequency increases when cells are treated with substances that cause limited or small-scale genetic damage, the kind of damage also detected by the Ames test or seen as large-colony mutants in the MLA; the resulting mutant cells escape 6-thioguanine selection. The HPRT assay is therefore a good confirmatory assay for these types of changes, and its processing efficiency makes it a good screening method. It can catch a relatively small proportion of mutation-causing agents not captured by bacterial reverse mutation or chromosomal aberration testing strategies, as it detects any mutations, not just specific ones. It can detect a wide range of substances capable of causing small mutational changes, using human cells or knock-out cell lines [8, 107].
2.1.8. γH2AX Assay (EURL-ECVAM)
2.1.8.1. Assay Principle and Applicability
γH2AX is a phosphorylated (Ser-139) version of the histone variant H2AX. Formation of γH2AX is an early cellular response to DNA double-strand break formation and is considered an essential part of the DNA Damage Response (DDR) [115]. It is widely recognized as a specific and sensitive marker of DNA damage from ionizing radiation, ultraviolet rays, oxidative stress, chemical agents, and certain drugs [116]. The development of antibodies specific for the detection of γH2AX has produced an assay with high specificity.
2.1.8.2. Method and Suggested Tips for Success
Commonly, the assay results are measured by microscopic quantitation of γH2AX-positive foci or single cells. Other methods are less specific. Flow cytometry has the disadvantage of measuring only relative fluorescence intensity, without regard to specific location or origin. Immunoblotting and ELISA determine only the sample’s total γH2AX protein level, which can include signal from γH2AX-positive apoptotic cells; since these cells are non-viable, they should not be lumped with damaged but viable γH2AX-positive cells in the quantitation. Reddig et al. [117] compared the advantages and disadvantages of microscopic γH2AX foci quantitation, automated fluorescent microscopy, flow cytometry, and immunoblotting in PBMCs treated with etoposide for one hour. Their analysis revealed that automated microscopic γH2AX foci quantitation was the most sensitive and specific, as judged by the Limit of Detection (LoD), with immunoblotting showing the highest LoD. Based on the clinical plasma etoposide levels associated with hematological toxicity and antitumor activity, the authors concluded that clinical utility could be achieved by using automated microscopic γH2AX foci quantitation. An important limitation of the assay is that when signal saturation is reached, individual foci are no longer distinguishable and, therefore, no longer quantifiable. Detection cannot always be improved through the use of shorter exposure times. Recently, an inter-comparison exercise was undertaken by the European biodosimetry network (RENEB) [118], which should help to increase the clinical utility of the assay.
2.1.8.3. Advantages and Disadvantages
Prediscreen, PrediProtect, and PrediRepair are branded versions of the γH2AX test [119-121], which is recommended by the European Union Reference Laboratory for alternatives to animal testing [122]. Claimed results are the true detection of 95% of carcinogenic compounds tested, no false positive compound detection (sensitivity 98%), and a specificity of 91%. Kopp et al. [120] reviewed 27 publications examining 329 chemicals tested using the Ames, MN, HPRT, and Comet assays, compared those results with the ones obtained with their γH2AX (Prediscreen) assay (referred to as an ‘in cell western assay’), and found an overall sensitivity of 60-75%, specificity of 87-100%, and predictivity of 79-90%.
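Performance figures of this kind are derived from a 2 × 2 comparison of assay calls against reference classifications. The sketch below shows the arithmetic with made-up counts; it assumes that "predictivity" corresponds to overall accuracy (the fraction of all chemicals classified correctly), which may differ from the definition used in the cited studies.

```python
def assay_performance(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity and overall accuracy as fractions."""
    # Sensitivity: share of reference genotoxicants that the assay detects.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of reference non-genotoxicants the assay calls negative.
    specificity = true_neg / (true_neg + false_pos)
    # Accuracy: share of all chemicals classified correctly (assumed here to
    # be what "predictivity" denotes).
    total = true_pos + false_neg + true_neg + false_pos
    accuracy = (true_pos + true_neg) / total
    return sensitivity, specificity, accuracy


# Made-up counts for 100 reference chemicals (60 genotoxic, 40 non-genotoxic).
sens, spec, acc = assay_performance(true_pos=45, false_neg=15,
                                    true_neg=36, false_pos=4)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Because specificity depends only on the false positive and true negative counts, a claim of zero false positives corresponds to a specificity of 100% under these definitions.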
2.1.9. Other Assays
Other currently used genetox assays include the 3D Skin Model (EpiDerm®), Embryonic Stem Cell Test, drug uptake in vitro, hepatocyte proliferation assay in vitro (mouse, rat, dog, human), Pig-a Assay, and non-disjunction test using FISH or antikinetochore antibody staining. Co-culture or 3D models have an improved ability to detect secondarily caused genotoxicity, for example through the inclusion of immune cell components. They are excellent for exploring mechanisms of toxicity, for instance by using specifically engineered or treated cells or those of specific population backgrounds.
2.1.10. In vivo Pig-a Gene Mutation Assay (OECD 470)
2.1.10.1. Assay Principle and Applicability
This assay assesses the prevalence of blood cells with mutant phenotypes through the detection of mutations in Pig-a, a gene that encodes a catalytic subunit of the N-acetylglucosamine transferase complex, which synthesizes Glycosylphosphatidylinositol (GPI) anchors for cell surface proteins [123], and that is found only on the X chromosome. Functionally, these mutations are an indicator of Paroxysmal Nocturnal Hemoglobinuria (PNH) affecting erythrocytes (CD59), granulocytes (CD55), and monocytes (CD24) [123]. Because the gene is X-linked, mutation of this single locus produces the functional deficit. Other gene products that are part of the complex are encoded on autosomes, and a single mutation in those genes does not produce the deficit. Thus, while initially a test specifically for paroxysmal nocturnal hemoglobinuria, the Pig-a assay was suggested as a generalized test for gene mutational capacity [124, 125] and later developed by many others. Prototypical mutagens such as N-ethyl-N-nitrosourea (ENU) and Dimethylbenzanthracene (DMBA) were investigated using the Pig-a assay in both rats and mice and found to be positive for both reticulocyte and red blood cell mutations; subsequent work clarified that such changes could be persistent, develop in a sequence over time (reaching a maximum and subsequently declining), and depend on the dosing regimen (frequency and timing). More information regarding dose additivity effects for specific agents and in several species was later discovered. Human cells from patients with Fanconi anemia and ataxia telangiectasia were tested and shown to be susceptible to mutation of Pig-a. Other cells, from individuals with known DNA mutations or manipulated in vitro to produce specific mutations, were also susceptible. Olsen et al. [123] identified 21 studies in mice, as well as various intentional or accidental human exposures, that have been examined using Pig-a. TK6 cells were found to harbor few PIG-A, but many PIG-L mutations.
The examination of the various chemicals, environmental exposures, and hereditary conditions and their outcomes using Pig-a makes interesting reading.
2.1.10.2. Method and Suggested Tips for Success
As with the Ames assay, a chemical treatment may require metabolic activation to exert its mutagenic effects. Mutations may accumulate over time depending on their fixation, but may also be repaired over time. Therefore, the timing of measurements is key, as the maximum mutational frequency may occur weeks or longer after the last exposure. Inter- and intra-individual variation must be accounted for in the experimental design. Other factors, such as dietary deficiency, can play, and have been shown to play, a role in toxicity [126]. An important consideration is that enrichment by magnetic separation techniques strongly affects detection capacity, so care should be taken not to interpret a negative result as evidence of no in vivo genotoxicity. In several cases (see Olsen et al. [123]), no positive Pig-a results were observed, indicating no effects in the bone marrow; however, positive results were obtained in the Comet assay for cells from different organs, highlighting differential organ sensitivity. A similar outcome was seen after testing the chemicals dichloropropane (DCP) and dichloromethane (DCM), which target the liver, in mice. Neither assay was “wrong,” but together they provided more comprehensive information. In other cases, reticulocytes tested positive while red blood cells (RBCs) did not, which was interpreted as a protective effect in the non-treated (dietary sufficiency) group. A battery of genetox assays used in concert is superior to any single assay alone, and equivocal results should be resolved through repeat testing, as demonstrated in nanoparticle testing in mice [127-129]. A significant advancement came from measuring the mutation rate per cell division, rather than just mutation frequency, in cells from Fanconi anemia and ataxia telangiectasia patients, lymphoma cancer patients, transformed myeloid cells, and normal donors [129-132], revealing notable differences among these groups.
Recently, Dertinger et al. [133] adapted the PIG-A assay for use with human blood cells, transitioning it from its original rodent application. Additionally, the assay was applied to B-lymphoblastoid TK6 cells in several studies [134-136]. Interestingly, TK6 cells harbor a heterozygous autosomal deletion of the PIG-L gene on chromosome 17, alongside the X-linked PIG-A gene. This results in a high spontaneous mutation rate, necessitating depletion of pre-existing mutant cells before assay use. It is hypothesized that mutations in PIG-L may detect clastogenic events, while PIG-A mutations primarily identify point mutations. To date, studies have shown positive results for prototypical mutagens and negative results for non-mutagens, with increased sensitivity compared to the p53-deficient WI-L2-NS cell line when exposed to Ethylmethanesulfonate (EMS) and Ultraviolet C light (UVC). Future research will need to determine whether other cell types could also serve as effective substrates for the combined PIG-L/PIG-A assay.
2.1.10.3. Advantages and Disadvantages
The Pig-a assay, which may be performed either as an in vitro or an in vivo assay, offers advantages over other assay types. Very low volumes of peripheral blood are required, allowing animals to be repeatedly sampled without euthanizing, and flow cytometry detection enables rapid quantification. Human, rat, or mouse cells may be used, either from cultures or from the blood of treated animals. Immunomagnetic separation and labeling with fluorescent-labelled aerolysin reagent (FLAER) are variations on the preparation method: wild-type cells are depleted, and mutant cells are identified either by the absence of FLAER binding to GPI anchors or by the lack of immunostaining for GPI-anchored proteins (e.g., CD55, CD59), increasing the assay sensitivity by orders of magnitude. Verification of mutants by DNA sequencing is required to confirm their identity and quantify mutant frequency. New modifications in sequencing have sped up this process as well.
A strength of the Pig-a assay is the ability to investigate other basic cell functions, such as the roles of DNA repair enzymes in base excision repair.
A caution is that the kinetics of accumulation and repair appear to differ among rats, mice, and humans, but specificity is excellent in all three.
A disadvantage of the Pig-a assay is the assumption that the compound or its metabolite reaches the bone marrow tissue at levels comparable to those in the target organs. However, previous observations have shown that different tissues can exhibit varying responses. This limitation does not exist in the comet or the transgenic mouse assay. As previously stated, a negative Pig-a result does not affirm the absence of genotoxicity. Timing, accumulation, and repair of mutations may be crucial. The variations within and between individuals are important study parameters that should be considered when planning or interpreting an experiment.
Table 1 compares the advantages and disadvantages of conventional methods and NAMs for short-term genotoxicity testing.
Table 1.
| Test name | Applicability | Endpoint | Assay Length | Advantages | Disadvantages | OECD TG or regulatory status | Reference |
|---|---|---|---|---|---|---|---|
| Ames Assay | Preliminary screening tool to evaluate the carcinogenic potential of chemicals that are directly acting or require metabolic activation; Best used to rank similar MOA substances by relative potency | DNA frameshift or point mutations | 48 hr incubation; 2 or 5 days (fluctuation method) | Ease of performance; Cost; Time; Availability of a library of tested compound results to compare; Prevents unnecessary further tests; Allows detection of potentially carcinogenic compounds, preventing wasted effort | Conflicting results (false -/false +); Not directly concordant with human carcinogenesis or mutagenesis; Exogenous S9 required from the rat; Dependent on cell culture conditions; Some compounds untestable; Unsuitable for non-genotoxic substances; Must establish a proper concentration range; Complicated test conditions are required to get it right | OECD 471; Required under the Pesticide Act (US); Required under the TSCA (US) | Ames et al. 1973 [9]; Follmann 2013 [154] |
| MN | Staple guideline test; Best used as part of a battery of tests to prevent misinterpretation of results | Chromosomal loss, breakage & spindle malformation | 72 hr incubation | Sensitive; Can test human lymphocytes; In vitro; Easily scorable | 30-40% of compounds that are (-) in both in vivo and ToxTracker are (+) in in vitro MN assay; Question of whether the toxicant reaches the target tissue (false -); Question of excessive doses (false +); May be detecting oxidative stress, not DNA damage | OECD 474, 487; FDA CFSAN Redbook 2000: IV.C.1.d (July 2000) | Evans et al. 1979 [39]; Fenech and Morley 1985, 1986 [40, 44]; Fenech 1999 [46]; Schlegel et al. 1986 [41]; Heddle 1983 [42]; Countryman and Heddle 1976 [43]; Ramalho et al. 1988 [45]; Thomas et al. 2003 [31] |
| In Vitro Mammalian Chromosomal Aberration Test | Staple guideline test | Chromosome or Chromatid damage | If lymphocytes are used, add 48 hr for mitogenic stimulation; Exposure for 3-6 hr, followed by incubation for 1.5 – 2 cell cycles | Simple procedure and quantitation | Cannot detect aneugens; Polyploidy alone does not distinguish aneugens and may indicate cell cycle perturbation or cytotoxicity only; Requires metabolic activation; Requires metaphase arrest | OECD 473 | OECD 2016 [34] |
| TK6/MLA | Staple guideline test used since the 1980s; Best used as part of a battery of tests; Follow-up test after a positive Ames Assay result | Broad spectrum of genotoxic effects | 3-6 hr or 24 hr without S9 if 3 hr is negative; + 48 hr culture time (MLA); 72 hr (TK6) | Heterozygosity of the TK gene makes it possible to detect point mutations, large deletions, and recombination; Consistent results; Comprehensive, with other assays (can detect mutagens that test negative in the Ames Assay) | Sensitivity is low for some applications to detect direct-acting agents; Low specificity (MLA) | OECD 490 (July 2016); Very well standardized; ICH4 | Honma et al. 1999 [93]; OECD 2016 [34] |
| HPRT | Preliminary screening assay; Confirmatory assay for Ames or large colony MLA | Limited or small genetic damage; Detects any mutations | 7-8 days + incubation on selection medium | Efficient processing; Catches mutations missed by Ames or TK6/MLA | Relatively long protocol; Low spontaneous frequency of mutation at the HGPRT locus makes it difficult to derive enough cells for quantitation | OECD 476 | Johnson 2012 [107] |
| Comet | Used as part of a test battery or as a confirmatory assay | DNA single-strand breaks; Type and amount of damage; Rate of strand break repair; Alkaline labile sites | 1 - 3 days | Simple to perform; Rapid; Inexpensive; Adaptable; Reproducible; Reliable; Economical; Sensitive | Caution advised in interpreting results; intensity of stain is cell cycle phase dependent; Careful QC required; Cells come from live organisms; Indirect measure of DNA damage; Low sensitivity for oxidative damage, crosslinks, bulky adducts | OECD 489 | Cook et al. 1976 [79]; Collins 2004 [82]; Karbaschi and Ji 2019 [85] |
| ROSGlo | Used as part of a test battery or as a confirmatory assay | Oxidation of DNA, RNA, proteins, and lipids | Variable incubation period with the test substance; Measurements 2 hr post-reagent addition | Does not use HRP (produces false positive results); Amenable to HTS; Little sample prep required; Multiplexing possible; Simple procedure; Does not require sample manipulation; Fast; Sensitive | Indirect measure; Short-term assay for chronic process; Not a standalone test | OECD 442E; OECD 425; OECD 442D | Holmstrom and Finker 2014 [103]; Promega.com [105]; Biospace.com [106] |
| γH2AX | Clinical use to assess DNA damage in biopsies; Used as part of a test battery or as a confirmatory assay | DNA double-strand breaks | ~8 hrs; Reaction peaks from 30 min to 12 hr (depending on substance and dose level) | Rapid; Specific (91%); Sensitive (98%); HTS is possible, but with reduced interpretability; Detects 95% of carcinogenic compounds tested | Lack of standardization/harmonization; Overlapping foci cannot be quantified, signal saturation | EURL-ECVAM | Reddig et al. 2018 [117]; Kopp et al. 2019 [120]; Khoury et al. 2013, 2020 [119, 121]; Kirkland et al. 2008 [105] |
| Pig-a | Used as part of a test battery or as a confirmatory assay; Monitoring humans for somatic mutation | Deletions or mutations in Pig-a | 28 days of treatment; detection is within minutes | Flexible (in vitro or in vivo); Low volume of blood required; Rapid quantification; The mutation rate per cell division is also determined; Accurately predicts mutagens, non-mutagens; Roles of DNA repair enzymes in BER and other cell functions can be investigated; HTS method | Maximum mutational frequency may occur weeks or longer after the last exposure; Verification of mutants by DNA sequencing is required to confirm the identity and quantify mutant frequency; The timing of measurements is key; Differential organ sensitivity; Negative results should not be interpreted as absence of genotoxicity; Does the compound reach bone marrow? | OECD 470 | Araten et al. 1999, 2005, 2010, 2013 [124, 130-132]; Chen et al. 2001 [125]; Olsen et al. 2017 [123]; Dertinger et al. 2015 [133]; Nicklas et al. 2015 [134]; Kruger et al. 2015, 2016 [135, 136] |
3. DISCUSSION
The current deficiencies that exist in standard testing approaches include insufficient physicochemical characterization of some substances, a lack of demonstration of cell or tissue uptake and internalization, and limitations in the coverage of genotoxic modes of action [137]. As pointed out therein, current in vitro genotoxicity test methods do not evaluate potential carcinogenicity caused secondarily through inflammation (e.g., fibrosis, nanomaterial toxicity). Acute in vitro tests cannot correctly identify carcinogens that act only chronically. Other studies [138, 139] have pointed out that the assays individually provide very little information about the genotoxic Mechanism of Action (MOA), are resource-intensive, and may not be high-throughput capable. Often, the traditional in vitro assays are only poorly predictive of human mutagenicity [138]. The use of numerous assays in an in vitro test battery may require large amounts of test chemical. Although individual traditional assays are capable of a high degree of specificity, the overall specificity of the test battery may be lower. Recent efforts to address these shortcomings are discussed in previous studies [138, 139].
The main criticism of conventional in vitro genotoxicity assays is their tendency to produce excessive false positive results [140]. Chromosomal damage assays, in particular, are well known for yielding false positives. Although test systems have been developed to enhance the sensitivity of in vitro assays, over-prediction remains a concern. For example, a test material may contain amino acids that promote the growth of non-mutant colonies [141] or flavonoids known to be mutagenic in the Ames assay [142], and bacterial nitroreductases may reduce some nitro compounds, an activity not present in mammalian cells [29].
False positives can also arise when repair-deficient rodent cells are used [122, 143, 144], or due to differences between human and non-human cells [61], highlighting species-specific variations in cellular responses. Several studies [61, 77] have shown that the p53 status of cells is an important factor that varies by species. Additionally, some cell lines may be subclones that have developed genomic instability or altered metabolism and detoxification pathways [145]. Other false positive “red herrings” have been linked to cell culture conditions and propagation practices, such as pH, osmolality, excessive toxicity, apoptosis, or chelation effects [146, 147]. For drug or agrichemical developers, these false positives can lead to wasted time, effort, and resources following up on substances that are ultimately not mutagenic to humans [29, 122, 148, 149]. However, in the EU, regulations prohibit in vivo re-testing of cosmetic substances that test positive in vitro. In all in vitro methods, quality control is key to obtaining useful and meaningful experimental results [150]. The researcher should understand the limitations of the method(s) used and, first and foremost, have a good grounding in the state of the art regarding cell culture, which has evolved. They should understand that replicating a cell over multiple passages degrades cellular material and introduces myriad changes that are not usually monitored. Similarly, working with multiple cell lineages in the same laboratory carries risks of cross-contamination, and each different cell lineage requires its own growth conditions and specific monitoring. The genetic identity of a cell line should always be ascertained before first use and at intervals thereafter to prevent uninterpretable and meaningless results.
An important consideration for the Ames Assay and all other in vitro short-term tests is cytotoxicity. The principle of the test method requires that the amount of the substance producing mutations must not fall within the range of cytotoxicity. Therefore, a titration for cytotoxicity of the test substance to the five bacterial strains must be performed prior to testing, using at least five doses separated by at least ½ log. To avoid excessive toxicity, it has been recommended to reduce the top exposure concentration from 5000 µg/mL to 10 mM or 2000 µg/mL [81, 108, 151].
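A dose series with half-log spacing can be generated as follows. This is a minimal sketch: the top dose of 5000 and the choice of five doses are taken from the figures mentioned above, the units are whatever the top dose is expressed in, and the function name is hypothetical. The half-log step is treated here as a fixed spacing, whereas the text states it as a minimum separation.

```python
def half_log_series(top_dose, n_doses=5, log_step=0.5):
    """Return doses descending from top_dose, log_step log10 units apart."""
    if top_dose <= 0 or n_doses < 1:
        raise ValueError("top_dose must be positive and n_doses at least 1")
    # Each step down divides the dose by 10**log_step (about 3.16 for 0.5).
    return [top_dose / 10 ** (log_step * i) for i in range(n_doses)]


doses = half_log_series(5000)
print([round(d, 1) for d in doses])
```

With these inputs the series runs from 5000 down to 50, so five half-log doses span two orders of magnitude.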
For the Ames Assay, it may seem obvious, but the agar must not be overlaid while too hot, as excessive heat will kill the microorganisms. Other specific procedural issues related to the type or class of compound being tested and the bacterial strains being used (including how many and which ones to use for classes of compounds) are described in detail in the OECD TG.
Another important consideration is the test result interpretation [152]. The result of the assay is always compared to the control (reference), but the difference (increase in mutation frequency) may be slight.
Treatment with increasing exposure concentrations is informative, and a trend test is typically performed. In some cases, the highest tested concentration may not produce a statistically significant increase in mutation frequency, but the trend test may suggest that a higher concentration, if tested, would likely yield a significant result. Since the upper concentration is limited by cytotoxicity, it can be challenging to definitively classify a result as positive based on established criteria. Therefore, biological relevance should always be considered alongside statistical thresholds such as p < 0.05 or a ≥2-fold increase.
The International Workshop on Genotoxicity Testing (IWGT) recommends evaluating results using a combination of three criteria: (1) a dose-related increase in revertants, (2) a clear increase in revertants at one or more doses compared to the concurrent negative control, and (3) at least one dose producing revertants above laboratory-established historical control limits [152]. These criteria can be adapted for other conventional assays as well. Schoeny et al. further discuss how to establish a clear response [152].
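The three criteria above can be sketched as a simple checklist for a single strain and test condition. The Python example below is a hedged illustration, not the IWGT procedure itself: the function names, the example data, and the use of a positive least-squares slope as a crude stand-in for a formal trend test are all assumptions, and the 2-fold cutoff is used only because it is one of the thresholds mentioned above.

```python
from statistics import mean


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def evaluate_criteria(mean_revertants, control_mean, historical_upper,
                      fold_cutoff=2.0):
    """Check three criteria for one strain/condition.

    mean_revertants  -- mean revertants per plate, ascending dose order
    control_mean     -- concurrent negative-control mean
    historical_upper -- upper limit of the laboratory's historical controls
    fold_cutoff      -- illustrative fold-increase cutoff (assumption)
    """
    # Criterion 1 (crude): positive slope over dose rank, control = rank 0.
    # This is NOT a significance test; a formal trend test would replace it.
    counts = [control_mean] + list(mean_revertants)
    ranks = list(range(len(counts)))
    dose_related = least_squares_slope(ranks, counts) > 0
    # Criterion 2: clear increase over the concurrent negative control.
    clear_increase = any(m >= fold_cutoff * control_mean
                         for m in mean_revertants)
    # Criterion 3: at least one dose above the historical control limit.
    above_historical = any(m > historical_upper for m in mean_revertants)
    return {
        "dose_related": dose_related,
        "clear_increase": clear_increase,
        "above_historical": above_historical,
        "all_met": dose_related and clear_increase and above_historical,
    }


# Hypothetical data, invented purely for illustration.
result = evaluate_criteria([22, 25, 31, 48, 70], control_mean=20,
                           historical_upper=35)
flat = evaluate_criteria([21, 19, 22, 20, 23], control_mean=20,
                         historical_upper=35)
print(result)
print(flat)
```

In the second, flat data set the slope is marginally positive, yet neither of the other two criteria is met. This shows why the criteria are applied in combination and why a statistical signal on its own should be weighed against biological relevance.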
Good laboratory practice, standardization, and strict adherence to Test Guideline (TG) methods are essential. The TG methods provide detailed instructions, including the use of positive and negative controls, vehicle controls, appropriate concentration ranges, multiple exposure levels, and specific bacterial strains. Dertinger et al. [102], in a recent IWGT report, emphasized the importance of using historical control data and proper methods for its interpretation.
Although these requirements may complicate the execution of the Ames and other conventional assays, over 10,000 substances have already been tested using the Ames assay. This extensive database is available for use by others, helping to reduce redundant testing efforts [153].
CONCLUSION
This review described conventional short-term genetic toxicity tests in detail and discussed how these tests can mislead if they are not carefully performed and interpreted. Before undertaking an assay, one should know the value and limits of the test being performed, the mechanism(s) of action of the compound under study, its physicochemical properties, and the potential confounding issues. Finally, identifying a genotoxic substance (and potentially a carcinogen) requires multiple assays to confirm its genotoxicity and to identify or confirm its mechanism of action. Some assays are far better suited than others to investigating the mechanism, or even the molecular initiating event, of a substance, and knowing which one to choose can avoid many problems.
FUTURE DIRECTIONS
Part II of this manuscript will describe and discuss the following alternative testing approaches (new approach methodologies) for genetic toxicity testing: the in vitro yeast DEL recombination assay, 3D cell cultures and the 3D RS Comet assay, the RS Skin MN, Bhas 42 CTA, ToxTracker™, TGX-DDI transcriptomic biomarker, Multiflow DNA Damage, and MutaMouse FE1 and PH assays. Part II also provides an update on the regulatory status and progress of alternatives to conventional in vitro genetic toxicity methods. It further discusses the quantitative In Vitro – In Vivo Extrapolation (qIVIVE) approach to extrapolating non-whole organism results to human risk assessment, as well as the Weight of Evidence (WoE) approach applicable towards the elimination of the cancer bioassay requirement for new chemical registration.
AUTHOR’S CONTRIBUTIONS
The author confirms sole responsibility for the following: Study conception and design, data collection, analysis and interpretation of results, and manuscript preparation.
LIST OF ABBREVIATIONS
| ALS | = Alkali-Labile Sites |
| AP | = Apurinic |
| BrDU | = Bromodeoxyuridine |
| CBPI | = Cytokinesis Block Proliferation Index |
| CYP450’s | = Cytochrome P450’s, Phase I Detoxifying Enzymes |
| CytoB | = Cytochalasin B |
| DAPI | = 4',6-Diamidino-2-Phenylindole |
| DCM | = Dichloromethane |
| DCP | = Dichloropropane |
| DDR | = DNA Damage Response |
| DMBA | = Dimethylbenzanthracene |
| DMSO | = Dimethylsulfoxide |
| EMS | = Ethylmethanesulfonate |
| ENU | = N-Ethyl-N-Nitrosourea |
| FACS | = Fluorescence-Activated Cell Sorting |
| FBS | = Fetal Bovine Serum |
| FISH | = Fluorescence In Situ Hybridization |
| FLAER | = Fluorescent-Labelled Aerolysin Reagent |
| GEF | = Global Evaluation Factor |
| Genetox | = Genetic Toxicology |
| GPI | = Glycosylphosphatidylinositol |
| gpt | = Xanthine-Guanine Phosphoribosyl Transferase Transgene |
| H2O2 | = Hydrogen Peroxide |
| HPRT | = Hypoxanthine-Guanine Phosphoribosyl Transferase Gene |
| HRP | = Horseradish Peroxidase |
| IWGT | = International Workshop on Genotoxicity Testing |
| LNT | = ‘Linearity at Low Dose’ Concept |
| LoD | = Limit of Detection |
| LOH | = Loss of Heterozygosity |
| MF | = Mutant Frequency |
| MLA | = Mouse Lymphoma Assay |
| MN | = Micronucleus Test |
| MNvit | = In Vitro Micronucleus Test |
| MOA | = Mechanism of Action |
| PBMC | = Peripheral Blood Mononuclear Cell |
| PHA | = Phytohemagglutinin |
| PK | = Pharmacokinetics |
| PNH | = Paroxysmal Nocturnal Hemoglobinuria |
| qIVIVE | = Quantitative In Vitro – In Vivo Extrapolation |
| RBC | = Red Blood Cell |
| RI | = Replication Index |
| RICC | = Relative Increase in Cell Count |
| ROS | = Reactive Oxygen Species |
| RPD | = Relative Population Doubling |
| RS | = Relative Survival |
| RTG | = Relative Total Growth |
| SB | = Single Strand Breaks |
| SCGE | = Single Cell Gel Electrophoresis |
| TFT | = Trifluorothymidine |
| TG | = 6-Thioguanine |
| TG | = OECD Test Guideline |
| TK | = Toxicokinetics |
| TK6 | = Thymidine Kinase Assay |
| UVC | = Ultraviolet C Light |
| XPRT | = Test Using gpt |
ACKNOWLEDGEMENTS
Declared none.

