Big-IN Landing Pads
See our resources page for plasmids available from Addgene
Overview:
In Big-IN, Landing Pads (LPs, also referred to here as LP Regions) are DNA sequences designed as an intermediate step between deleting a specific allele and integrating a Synthetic Assemblon (aka Payload, PL) into that allele/locus. We use a variety of LPs, which typically include the following components:
A pair of heterotypic recombination sequences, currently LoxM on the left (ataacttcgtataggatactttatacgaagttat) and LoxP on the right (ataacttcgtatagcatacattatacgaagttat), flanking all the components listed below.
A promoter (human EF1-alpha) driving a single ORF with multiple components separated by P2A peptides. We also have LPs with a human PGK promoter (weaker than EF1-alpha), but these are not being used currently.
A positive selection marker, mainly PuroR. This marker is used to positively select cells harboring the LP. Note: additional markers such as HygroR, BSD (BlastR), and NeoR can also be used for positive selection, but we have not tested them. Fluorescence Proteins can also be used to sort LP-expressing cells.
A negative selection marker, e.g. HSV-TK or hmPIGA. This marker is used to negatively select for cells that lose the LP following recombination of the PL. Fluorescence proteins can be used to sort cells that lost LP expression. Our current counterselection marker of choice is hmPIGA due to the lack of a bystander effect when applying proaerolysin selection (unlike with HSV-TK). Notably, the use of hmPIGA as a counterselection marker necessitates deletion/inactivation of the cells' endogenous PIGA gene. This inactivation is embryonic lethal and thus prevents engineering mouse models using Big-IN. A potential solution is using HSV-TK, but it has not been completely validated yet (we’re working on it).
Optional: Fluorescence Protein (FP), e.g. mScarlet or HaloTag. Can be theoretically used for positive and negative sorting or for monitoring of LP acquisition and loss.
A terminator, currently EIF1 p(A) signal.
P2A Peptides: a type of 2A self-cleaving peptide derived from Porcine Teschovirus-1. P2A is considered the most effective 2A peptide type. Since we use multiple P2A peptides in a single plasmid and ORF, they were recoded to reduce sequence similarity and prevent recombination.
Important: In the original version of Big-IN, our LPs included an inducible Cre recombinase (CreERT2). Our recent data indicate that CreERT2 is inefficient at, or incapable of, mediating cassette exchange (i.e. PL integration), even though it is very efficient at excising a short LoxP-flanked region. We have shown that cassette exchange can be accomplished efficiently using a co-transfected Cre expression plasmid (pCAG-iCre). Thus, we have decided to remove CreERT2 from the LPs. Preliminary data also indicate that removing CreERT2 from the LP alleviates LP transcriptional silencing.
Currently, LPs are delivered to cells by transfecting a Landing Pad plasmid (pLP) that comprises the LP Region, Homology Arms (HAs), and a plasmid backbone. The HAs direct site-specific integration via homology-directed repair (HDR).
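The heterotypic LoxM/LoxP pair listed above can be sanity-checked in silico. The sketch below is only a sequence check, not a prediction of recombination behavior: it assumes the canonical 34-bp lox layout (13-bp arm, 8-bp spacer, 13-bp arm) and treats two sites as "heterotypic" when their arms are identical and their spacers differ. The function names are illustrative and not part of any Big-IN tooling.

```python
# Sequences copied from the component list above.
LOXM = "ataacttcgtataggatactttatacgaagttat"
LOXP = "ataacttcgtatagcatacattatacgaagttat"


def revcomp(seq):
    """Reverse complement of a lowercase DNA string."""
    comp = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(comp[base] for base in reversed(seq.lower()))


def split_lox(site):
    """Split a 34-bp lox site into (left arm, spacer, right arm)."""
    site = site.lower()
    if len(site) != 34:
        raise ValueError("expected a 34-bp lox site")
    return site[:13], site[13:21], site[21:]


def spacer_mismatches(site_a, site_b):
    """Number of positions at which the two 8-bp spacers differ."""
    spacer_a = split_lox(site_a)[1]
    spacer_b = split_lox(site_b)[1]
    return sum(1 for x, y in zip(spacer_a, spacer_b) if x != y)


def is_heterotypic_pair(site_a, site_b):
    """True if the arms are identical and the spacers differ (assumed definition)."""
    left_a, _, right_a = split_lox(site_a)
    left_b, _, right_b = split_lox(site_b)
    same_arms = left_a == left_b and right_a == right_b
    return same_arms and spacer_mismatches(site_a, site_b) > 0


if __name__ == "__main__":
    print("LoxP parts:", split_lox(LOXP))
    print("LoxM parts:", split_lox(LOXM))
    print("Spacer mismatches:", spacer_mismatches(LOXM, LOXP))
    print("Heterotypic pair:", is_heterotypic_pair(LOXM, LOXP))
```

The same check can be run on any candidate lox variant before cloning it into a new LP or payload design.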
pLPs are (typically) co-transfected with two pSpCas9 plasmids (plasmids that express both SpCas9 and a gRNA). The pLP’s Homology Arms are designed to perfectly match the break sites induced by the pair of gRNAs so that the LP Region would replace the deleted genomic region.
Our latest pLP backbones harbor a negative selection cassette (phPGK1-ΔTK-SV40pA), which allows clones in which the backbone has integrated to be eliminated using ganciclovir (GCV).
Choosing a Landing Pad - Considerations:
The following should be considered when choosing an LP for a project:
Project Type: Different projects might be compatible with different Big-IN strategies. For example, a Synthetic Hypervariation project might be more permissive to genomic scars (e.g. transcriptional units that aid in positive selection of the payload DNA) than a Synthetic Haplotype project, in which there cannot be any transcription-altering scars. Additionally, if the planned project includes assays that involve staining/sorting, make sure not to use fluorescence proteins that might interfere with those assays. As explained above, the use of hmPIGA prevents (or complicates) the generation of mice.
Cell type: Different cell types are compatible with different Big-IN strategies. For example, hESCs and mESCs differ greatly in their passaging method (hESCs are passaged as clumps, mESCs as single cells). These differences affect the ability to use certain counterselection methods such as HSV-TK, FCU1 and hmHPRT1, all of which suffer from a bystander effect that poses a larger problem for hESCs. Similarly, when using non-adherent cells, counterselection methods with a high bystander effect are expected to be more feasible.
Single-cell cloning and sorting: Different cell types tolerate sorting and low-density single-cell plating very differently. This should be considered when choosing the positive selection strategy for the project.
Bystander Effect: As mentioned above, this effect hinders the use of certain counterselection markers (e.g. HSV-TK, FCU1 and hmHPRT1) and is predictably very different between various cell types depending on their mode of passaging and their composition of tight junctions.
Cloning Homology Arms (HAs):
Before planning and cloning HAs, please read the list of considerations below.
Primers for adding HAs (N4: 4-nt extension to facilitate restriction enzyme binding; GGTCTC: BsaI site, followed by one spacer nucleotide and the 4-nt overhang; N20: ~20 sequence-complementary nucleotides):
Fwd primer for HAL: N4-GGTCTCACCCT-N20
Rev primer for HAL: N4-GGTCTCACGTT-N20
Fwd primer for HAR: N4-GGTCTCGTATG-N20
Rev primer for HAR: N4-GGTCTCAGGAT-N20
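The primer templates above can be filled in programmatically. This is a minimal sketch under stated assumptions: the HA sequence is given 5'→3' on the top strand (same orientation as the LP Region), the annealing portion is fixed at 20 nt with no Tm check, and the N4 extension is left as the placeholder "NNNN" for the user to choose. `ha_primers` is a hypothetical helper name.

```python
# BsaI site + spacer nucleotide + overhang, copied from the templates above.
PREFIXES = {
    ("HAL", "fwd"): "GGTCTCACCCT",
    ("HAL", "rev"): "GGTCTCACGTT",
    ("HAR", "fwd"): "GGTCTCGTATG",
    ("HAR", "rev"): "GGTCTCAGGAT",
}


def revcomp(seq):
    """Reverse complement of a DNA string (case preserved)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def ha_primers(arm, ha_seq, pad="NNNN", anneal_len=20):
    """Return (fwd, rev) primers for one homology arm.

    arm        -- "HAL" or "HAR"
    ha_seq     -- homology arm, 5'->3' on the top strand (assumption)
    pad        -- the N4 extension; "NNNN" is a placeholder, not a recommendation
    anneal_len -- length of the template-complementary part (~20 in the text)
    """
    if arm not in ("HAL", "HAR"):
        raise ValueError("arm must be 'HAL' or 'HAR'")
    if len(ha_seq) < 2 * anneal_len:
        raise ValueError("homology arm is too short for two primers")
    ha_seq = ha_seq.upper()
    # Forward primer anneals to the start of the arm.
    fwd = pad + PREFIXES[(arm, "fwd")] + ha_seq[:anneal_len]
    # Reverse primer is the reverse complement of the end of the arm.
    rev = pad + PREFIXES[(arm, "rev")] + revcomp(ha_seq[-anneal_len:])
    return fwd, rev


if __name__ == "__main__":
    toy_arm = "A" * 20 + "C" * 60 + "G" * 20  # toy sequence, not a real locus
    for name, primer in zip(("fwd", "rev"), ha_primers("HAL", toy_arm)):
        print(name, primer)
```

In practice the annealing length would be adjusted per primer to reach a suitable Tm; that step is deliberately left out here.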
HA Design Considerations:
Length: generally, the longer the HAs, the higher the efficiency of integration and perhaps the specificity. In hESCs, HAs in the range of 100-1000 bps have been tested and there was almost a linear correlation between HA length and efficiency of integration. However, longer HAs have the following disadvantages:
Mappability: In NGS, the short read length limits the ability to accurately map the LP integration site. Generally, we've successfully mapped across HAs of 250 bps. Similarly, when PCR-genotyping across the HAs, longer HAs yield longer PCR products, which can be limiting when using crude gDNA as template.
Ease of cloning: longer HAs might be harder to clone (especially from gDNA).
BsaI sites (see below).
Coordinates: It is ideal to design HAs such that they target the LP exactly into the genomic cut site induced by the pair of gRNAs. This means that the left HA’s 3' end would correspond to the 5' gRNA cut site and right HA’s 5' end would correspond to the 3' gRNA site. However, in cases where the Synthetic Assemblon is shorter than the deleted genomic region, one can bridge this gap by extending the HAs inwards beyond the deletion sites. When doing so, it is crucial to eliminate/mutate the PAM sites so that the gRNAs won't be able to bind the HAs.
Composition: It is advised to avoid including low complexity regions such as repetitive elements or areas with extremely low GC content.
Restriction Enzyme Sites: Ideally, the HAs would not contain BsaI sites, as those would render them targets to the BsaI enzyme during the Golden-Gate Assembly. However, if BsaI sites are included, an easy way to overcome this is to use an alternative Golden-Gate Assembly protocol in which the terminal step is an overnight ligation at 16°C (note: keep the reactions cold at all times until transformed).
DNA template source, alleles, and haplotypes: The DNA template used to amplify the HAs should also be considered. BACs are an ideal template as they offer more primer specificity. gDNA should be used if important variants reside in the HA region, and attention should be given to heterozygous variants in the source DNA.
Adding gRNA sites and PAMs: Adding gRNA binding sites and PAMs makes it possible to induce in vivo linearization of the pLP by the same gRNAs that are co-transfected with the pLP to induce the genomic deletion. The added gRNA sites should match one or two of the co-transfected gRNA sequences and include a PAM. Optimally, these sites would face inwards so that only 6 bps would be added to the linearized fragment (3 bps from the gRNA site and 3 bps from the PAM). In preliminary experiments in hESCs we’ve seen that this form of in vivo pLP linearization reduces the frequency of pLP backbone integration. We have also noticed that in hESCs it causes rapid loss of transient Puromycin resistance (conferred by non-integrated pLP plasmids), most likely due to the dramatically shorter half-life of linearized plasmids. However, it is possible that in vivo pLP linearization increases the chance of tandem LP integration. Below is a template for primers for amplifying HAs with gRNA sites (N4: 4-nt extension to facilitate restriction enzyme binding; GGTCTC: BsaI site, followed by one spacer nucleotide and the 4-nt overhang; N20: ~20 sequence-complementary nucleotides):
Fwd primer for HAL: N4-GGTCTCACCCT-5'gRNA-PAM-N20
Rev primer for HAL: N4-GGTCTCACGTT-N20
Fwd primer for HAR: N4-GGTCTCGTATG-N20
Rev primer for HAR: N4-GGTCTCAGGAT-3'gRNA-PAM-N20
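Several of the HA design considerations above lend themselves to a quick automated check before ordering primers. The sketch below flags internal BsaI sites on either strand, low GC content, and any intact gRNA target followed by an NGG PAM (relevant when HAs are extended inwards past the cut sites). The 30% GC cutoff is an arbitrary illustrative value, since the text only says "extremely low", and `check_ha` is a hypothetical helper name.

```python
import re


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def bsai_sites(seq):
    """0-based start positions of BsaI recognition sites on either strand."""
    seq = seq.upper()
    # GAGACC is the reverse complement of GGTCTC (site on the bottom strand).
    return sorted(m.start() for m in re.finditer(r"(?=GGTCTC|GAGACC)", seq))


def gc_fraction(seq):
    """Fraction of G/C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def intact_targets(ha_seq, protospacer):
    """Count occurrences of protospacer + NGG on either strand of the HA."""
    ha = ha_seq.upper()
    pattern = re.compile("(?=" + re.escape(protospacer.upper()) + "[ACGT]GG)")
    return len(pattern.findall(ha)) + len(pattern.findall(revcomp(ha)))


def check_ha(ha_seq, protospacers=(), min_gc=0.30):
    """Return a list of human-readable warnings for one homology arm."""
    warnings = []
    sites = bsai_sites(ha_seq)
    if sites:
        warnings.append(f"BsaI site(s) at positions {sites}")
    if gc_fraction(ha_seq) < min_gc:  # min_gc is an assumed cutoff
        warnings.append("low GC content")
    for spacer in protospacers:
        if intact_targets(ha_seq, spacer):
            warnings.append(f"intact gRNA target + PAM for {spacer}")
    return warnings


if __name__ == "__main__":
    toy_arm = "ACGT" * 5 + "GGTCTC" + "ACGT" * 5  # toy sequence only
    print(check_ha(toy_arm))
```

A warning about BsaI sites does not rule the HA out; as noted above, the alternative Golden-Gate protocol with a terminal overnight ligation can be used instead. Repetitive elements are not detected by this sketch and would need a separate check.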
Deleting endogenous PIGA:
Prior to using hmPIGA as a counterselectable marker, the endogenous copies of PIGA must be deleted. This can be achieved by direct replacement of the locus with a Landing Pad, which would then mediate Ectopic Big-IN. For Allelic Big-IN, the endogenous PIGA gene must be deleted prior to LP integration. Deleting PIGA is achieved simply by transfecting cells with 2 pSpCas9-Puro plasmids expressing 2 gRNAs that cut out the entire gene, and then selecting for loss of PIGA with proaerolysin. Importantly, if the research plan includes making transgenic mice from engineered mESCs, the implications of Piga deletion should be considered (Piga deletion is embryonic lethal).
Integrating Landing Pads into Mammalian Cells:
Overview: There are potentially many strategies to integrate LPs into the mammalian genome. Variations exist in the methods used to induce site-specific DNA breaks (e.g. different ways to deliver Cas9, alternative Cas proteins, 1 or 2 breaks etc.) as well as in the delivery and nature of the integrating LP (circular/linear plasmid, PCR product, 1 or 2 homology arms etc.). We currently deliver Big-IN LPs into human and mouse ESCs by co-transfecting a Landing Pad plasmid (pLP) with 2 pSpCas9-GFP plasmids, each encoding Cas9 and expressing a single gRNA. Following transfection, the pair of gRNAs induces 2 distal dsDNA breaks at the edges of the region to be replaced. Optionally, the same gRNAs also linearize the pLP via gRNA binding sites (and PAMs) placed outside the LP Homology Arms. The genomic break is then repaired by the cell's HDR machinery using the pLP's HAs, which extend distally from the dsDNA breaks. Starting day 1 post-transfection, Puromycin is used to select initially for cells that were transfected with the pLP and later (after the pLP has been degraded or diluted) for cells in which the LP has integrated into the genome. If a negative selection cassette is present on the pLP backbone (e.g. a TK cassette), negative selection (e.g. GCV) is applied at approx. 1 week post-transfection to eliminate cells in which the pLP backbone integrated. It is important not to use pCas9-PuroR plasmids in combination with pLPs that harbor PuroR, as this favors random integration of the pCas9-Puro plasmids. Use pCas9-GFP or pCas9-BSD instead.
Notes and Considerations
Positive selection: All landing pads contain a positive selection marker for isolating cells that harbor the landing pad. The selection marker is currently PuroR, although other markers can be used to isolate LP-harboring cells. When using Puromycin for positive selection, keep in mind that the transfected pLP (as well as any co-transfected pSpCas9-Puro plasmids) renders cells transiently Puromycin resistant following transfection (typically starting day 1 post-transfection) until it is diluted/degraded (the timing varies a lot between cell types and is drastically shortened by in vivo plasmid linearization). Integrated LPs render cells stably resistant to Puromycin. Thus, a prolonged period (> 7 days) of Puromycin selection is necessary for isolating LP-integrated cells. Typical Puromycin concentrations are 0.5-1 µg/ml for human and mouse ESCs when PuroR is driven by phEF1a.
Negative selection: As of 2020, we have included a negative selection cassette (phPGK1-ΔTK-SV40pA) on the pLP backbone. Following transfection and Puromycin selection, GCV selection (~1 µM) is applied (at ~5-7 days post transfection) to eliminate cells in which the pLP backbone integrated.
CRISPR-Cas9 in vivo pLP linearization: Adding gRNA binding sites and PAMs makes it possible to induce in vivo linearization of the pLP by the same gRNA(s) that are co-transfected with the pLP to induce the genomic deletion. The added gRNA sites should match one or two of the co-transfected gRNA sequences and include a PAM. These sites would optimally face "inwards" so that only 6 bps would be added to the resulting linearized fragment (3 bps from the gRNA binding site and 3 PAM bps). In preliminary experiments in hESCs we have seen that this form of in vivo pLP linearization reduces the frequency of pLP backbone integration. We have also noticed that in human and mouse ESCs it causes rapid loss of transient Puromycin resistance (conferred by non-integrated pLP plasmids), most likely due to the dramatically shorter half-life of linearized plasmids.
pSpCas9 plasmids: When integrating LPs, 2 pSpCas9 plasmids are co-transfected with the pLP. We initially used pSpCas9-Puro plasmids and detected high-frequency integration of these plasmids, likely because the puromycin selection for the LP also positively selects for this event. Thus, we are now using pSpCas9-GFP plasmids, which presumably (this has not been validated) integrate at a lower frequency, and whose integration can potentially be tracked by monitoring GFP fluorescence.
Single-cell cloning: It is highly preferable to isolate a single-cell clone of cells with correct LP integration. This greatly simplifies the interpretation of downstream QC analyses and, more importantly, guarantees that all derived lines that later receive different Synthetic Assemblons share a common ancestor. Single-cell cloning can be achieved by several means:
Singularizing cells and plating them at very low densities in large plates followed by manual isolation (picking) of colonies.
Single-cell sorting into 96-well plates.
Plating cells at very low densities in 96-well plates.
Each of these strategies has advantages and disadvantages and these differ considerably between various cell types:
hESCs: In general, hESCs have poor survival and clonability following singularization. To improve their chances of surviving and forming colonies, pre-treat hESCs with RevitaCell (or a different ROCK inhibitor) for 1-24 hours before singularization and for 1-3 days post plating. Use TrypLE-Select or Accutase for singularizing cells and plate singularized hESCs on rhLaminin-521-coated plates. Preliminary efforts (by Ran) to sort hESCs using the SONY SH800 sorter (with a 130 µm sorting chip and very low sample pressure) have yielded extremely low survival rates. Potentially, sorting 3-5 cells/well may increase the % of wells with colonies, at the risk of obtaining polyclonal growth. If you are manually plating singularized cells in 96-well plates, try plating at 0.5 cells/well to maximize monoclonality and at ~5 cells/well as a backup. Another option is to sort hESCs into a recovery medium containing RevitaCell and 10% FBS, and switch to regular growth medium the next day.
mESCs are more easily cloned compared with hESCs and tolerate singularization very well.
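The trade-off between plating at 0.5 and ~5 cells/well can be estimated with a simple model. The sketch below assumes that the number of cells landing in each well follows a Poisson distribution with mean equal to the plating density, which is a common approximation for limiting dilution and not something measured for these cell lines. Under the same model, if each cell independently survives with probability s, the number of colony-forming cells per well is Poisson with mean density × s, which is why a higher backup density can make sense for poorly surviving hESCs.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a well for mean lam (Poisson model)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def plating_stats(cells_per_well, wells=96, survival=1.0):
    """Expected well counts for a plate under the Poisson assumption.

    cells_per_well -- mean plating density
    wells          -- number of wells plated
    survival       -- assumed per-cell probability of forming a colony
    """
    lam = cells_per_well * survival  # effective mean of colony-forming cells
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    p_multi = 1.0 - p_empty - p_single
    return {
        "empty_wells": wells * p_empty,
        "single_clone_wells": wells * p_single,
        "multi_clone_wells": wells * p_multi,
        "monoclonal_fraction_of_occupied": p_single / (1.0 - p_empty),
    }


if __name__ == "__main__":
    for density in (0.5, 5):
        stats = plating_stats(density)
        print(density, {key: round(value, 2) for key, value in stats.items()})
```

With perfect survival, plating at 0.5 cells/well leaves most wells empty but roughly three quarters of the occupied wells are expected to be monoclonal; at 5 cells/well almost every well is occupied and almost none are monoclonal. Passing a low `survival` value shows how the higher density becomes the more useful plate when few cells form colonies.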
Allele-specific Landing Pad integration: When working with a hybrid mouse cell line, and potentially also when working with a locus that has high density of variants, it is possible to target the integration of the Landing Pad to a specific allele. To this end, the gRNAs that cut at the locus ends should be designed to be allele-specific by targeting them to variant regions (ideally, the variants should be at the PAMs). This will guarantee that only one allele is "cleaved" and replaced with the Landing Pad. Moreover, it will ensure that no DNA breaks (leading to indels) are induced in the second allele. Check out Getting Started to help you find gRNAs targeting variant regions in a hybrid mouse cell line.
Quality Control:
Following isolation and expansion of LP-harboring cells, the following quality control (QC) measures can/should be taken to assess the integrity and quality of the generated clones.
PCR Genotyping and Sanger sequencing: By performing PCR on gDNA using primers that span the junctions between the LP and its surrounding genomic sequence, it is possible to verify that the LP has correctly integrated at the designed locus. It is important to design high-quality, specific primers, and to make sure that the primers that target the genomic region outside the LP region bind outside the homology arms. When performing allelic integration, make sure that the LP integrated into only one allele and that the other allele is intact. When preparing gDNA from a few samples, the Qiagen kit is preferred. However, when genotyping many clones, a Quick & Dirty gDNA Prep protocol can be used.
The bands obtained from such genotyping PCRs can be extracted from the gel and submitted to Sanger sequencing to verify precise integration at the base pair level.
It is also important to genotype for possible unwanted outcomes. These include:
pLP backbone integration: This is a common outcome (and can be reduced when including gRNA binding sites outside the HAs to induce in vivo LP linearization, as explained above, as well as by backbone counterselection). We have detected backbone integration by performing PCR for AmpR or Ori.
pCas9 integration.
Finally, it is also recommended to genotype for the loss of the genomic region that is replaced by the LP using two pairs of primers on each side of the deletion, preferably ones that span the deletion edges (one primer inside the deletion and one outside). When integrating LPs allelically this is more complicated as the 2nd (unedited) allele would be amplified (which is in fact something additional to verify - that the 2nd allele is unedited). However, allele-specific primers can sometimes be designed to solve this issue.
Capture-Seq: Capture-Seq is a great way to unbiasedly and relatively cheaply characterize engineered clones. It can enable accurate mapping of LP integration event(s) and assessment of LP integrity. When integrating LPs, we usually perform Capture-Seq with 3 types of Nick-translated baits:
Genomic BAC: bait is produced by one (or, when necessary, more than one) BAC(s) covering the entire edited region of the genome (the region replaced by the LP). When targeting one allele of an autosome we expect the coverage to drop by half over the replaced genomic region.
pLP: bait is produced from the integrated pLP (with or without the HAs cloned; it doesn't matter, as these sequences are included in the genomic BAC). We map reads independently to the LP Region of the pLP, expecting full coverage and no variants over this region, and to the pLP backbone, expecting no coverage above background. Coverage over the pLP backbone indicates a bad clone.
pCas9: bait is produced from a pSpCas9 plasmid (same version as was co-transfected with the pLP). Reads are mapped to pCas9 with good clones expected to have no coverage.
In addition to inspecting the coverage and variants as explained above, a separate analysis, called bamintersect, is performed to unbiasedly identify integration sites. In bamintersect, reads for which only a single mate has been mapped to a reference genome are mapped to other reference genomes, thus detecting desired or undesired integration events of the LP, its backbone, pCas9 etc. For a successful LP integration we expect only two bamintersect hits, corresponding to the 5' and 3' junctions between the LP Region and its surrounding genomic regions. Any additional such hits (junctions), or hits between the pLP backbone or pCas9 and the genome, indicate a bad clone. Importantly, each project requires some filtering out of false-positive bamintersect hits that arise from homology between the integrated regions and the edited genome.
Whole Genome Sequencing: WGS is more expensive than Capture-Seq but has the advantage of being more sensitive for detecting off-target editing/mutations resulting from passaging, Cre activity, or CRISPR-Cas9, and it offers a better chance of identifying off-target LP integration events.
ddPCR ploidy screen: Dave Truong has implemented a droplet digital PCR (ddPCR) assay to screen for aneuploidy in mESCs. The assay measures the copy number of 4 chromosomes (1, 8, 11 and Y), which combined account for 99% of blastocyst injection failures (PMID: 27496052).
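The coverage expectation for the genomic BAC bait (a drop by half when one allele of an autosomal region is replaced) can be turned into a simple numeric check. This sketch assumes that per-base depths over the replaced region and over flanking, unedited BAC sequence have already been extracted into plain lists; the tolerance of 0.2 is an arbitrary illustrative value and the function names are hypothetical. It does not replace inspecting the coverage tracks or running bamintersect.

```python
from statistics import mean


def coverage_ratio(region_depths, flank_depths):
    """Mean depth over the replaced region divided by mean flanking depth."""
    flank = mean(flank_depths)
    if flank == 0:
        raise ValueError("no coverage over the flanking region")
    return mean(region_depths) / flank


def classify_replaced_region(ratio, tol=0.2):
    """Interpret a coverage ratio for a diploid autosomal locus (assumed).

    tol is an assumed tolerance around the expected ratios 0, 0.5 and 1.
    """
    if ratio <= tol:
        return "both copies lost"
    if abs(ratio - 0.5) <= tol:
        return "one copy lost"
    if abs(ratio - 1.0) <= tol:
        return "no copy lost"
    return "ambiguous"


if __name__ == "__main__":
    region = [48, 52, 50, 47, 53]      # toy depths over the replaced region
    flank = [101, 99, 100, 98, 102]    # toy depths over flanking BAC sequence
    ratio = coverage_ratio(region, flank)
    print(round(ratio, 2), classify_replaced_region(ratio))
```

An "ambiguous" call is a prompt to look at the coverage plot, since mixed populations or partial deletions would also shift the ratio away from the expected values.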
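For readers unfamiliar with how ddPCR yields copy numbers, the general calculation is sketched below. It uses the standard Poisson correction (mean copies per droplet = -ln(1 - fraction of positive droplets)) and normalizes the target to a reference assay. The choice of reference locus and its copy number are assumptions here; the design of the actual assay is not described on this page.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of template copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def copy_number(target_pos, target_total, ref_pos, ref_total, ref_copies=2):
    """Target copy number relative to a reference locus.

    ref_copies is the assumed copy number of the reference (2 for a
    diploid autosomal locus).
    """
    ref_lambda = copies_per_droplet(ref_pos, ref_total)
    if ref_lambda == 0:
        raise ValueError("reference assay has no positive droplets")
    target_lambda = copies_per_droplet(target_pos, target_total)
    return ref_copies * target_lambda / ref_lambda


if __name__ == "__main__":
    # Toy droplet counts, not real data.
    print(round(copy_number(5000, 20000, 5000, 20000), 2))  # balanced target
    print(round(copy_number(7010, 20000, 5000, 20000), 2))  # gained target
```

A result near 2 is consistent with a normal autosome, near 3 with a trisomy, and near 1 with a single-copy chromosome such as Y in male cells.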
qPCR can be used to assess and verify several properties of the engineered LP-harboring cells:
Pluripotency and differentiation of hESCs and mESCs can be accurately assessed by measuring the expression (mRNA) of key markers of pluripotency (e.g. Nanog, Oct4, Sox2, Esrrb, Tcl1), endoderm differentiation (e.g. GATA4, GATA6, SOX7, SOX17), mesoderm differentiation (e.g. T, MIXL1) and ectoderm differentiation (e.g. PAX6, PAX3, NES). Note that the markers for mouse and human ESCs are slightly different. It is advisable to include an ES positive control (e.g. a known undifferentiated ESC sample or the parental un-engineered cells), as well as positive controls for differentiation if possible (e.g. MEFs, retinoic acid-treated ESCs, embryoid bodies etc).
LP Region components: Measuring the expression (mRNA) of key LP Region components such as the selection markers can help verify their presence, confirm the activity of the chosen promoter, and compare expression levels between different LP designs. When using selection markers such as hmPIGA, a qPCR assay can also help assess the expression level relative to the endogenous copy (in parental cells). Primers that distinguish between the endogenous and exogenous copies are useful here. To this end, endo-specific primers can be designed to target the UTRs or exon-exon boundaries. Exo-specific primers are harder to design unless the genes have been recoded. Note: we have observed that the expression of LP components can be silenced over time in hESCs, probably due to the presence of toxic components in the LP Region. Therefore, it is useful to allow engineered cells (following positive selection) to grow for 1-2 weeks with or without the positive selection drug (e.g. Puromycin) and compare the expression of LP components between these two conditions. If a marked (tens- to hundreds-fold) reduction in expression is observed in cells grown without selection, it is safe to conclude that there is selective pressure to silence the LP.
LP copy number: While we haven't tried this, there are a few examples in the literature of qPCR assays performed on gDNA that accurately distinguish between cells that have 1, 2 or more integrated copies of an LP.
Additional Pluripotency QC assays: To ensure that engineered human or mouse ESC clones are pluripotent, several different assays can be performed (in addition to the aforementioned qPCR). These include immunofluorescence (IF) of pluripotency markers (e.g. Nanog, Oct4, Sox2) and IF/flow cytometry of the pluripotent stem cell markers SSEA4 and TRA-1-60 (for hESCs) and SSEA1 (for mESCs). RNA-Seq data can also be used to assess pluripotency using public databases and services such as StemChecker.
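The comparison of LP component expression with and without selection, described above, is typically quantified with the 2^-ΔΔCt method. The sketch below assumes roughly 100% amplification efficiency for both primer pairs and normalization to a reference (housekeeping) gene; which reference gene to use is not specified on this page and is left to the user. The 10-fold cutoff reflects the lower end of the "tens to hundreds fold" range mentioned above and is only illustrative.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of the target in test vs control (2^-ddCt).

    test -- e.g. cells grown without Puromycin
    ctrl -- e.g. cells kept under Puromycin
    Assumes ~100% primer efficiency for target and reference assays.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** (-(delta_test - delta_ctrl))


def fold_reduction(fc):
    """Express a fold change below 1 as a fold reduction."""
    if fc <= 0:
        raise ValueError("fold change must be positive")
    return 1.0 / fc


def looks_silenced(fc, min_reduction=10.0):
    """True if expression dropped by at least min_reduction (assumed cutoff)."""
    return fold_reduction(fc) >= min_reduction


if __name__ == "__main__":
    # Toy Ct values: the LP marker amplifies 6 cycles later without selection.
    fc = fold_change(ct_target_test=28.0, ct_ref_test=18.0,
                     ct_target_ctrl=22.0, ct_ref_ctrl=18.0)
    print(fc, fold_reduction(fc), looks_silenced(fc))
```

In the toy example a 6-cycle shift corresponds to a 64-fold reduction, which would fall in the range the text describes as evidence of selective pressure to silence the LP.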
CD59 and FLAER staining: When using hmPIGA as a counterselection marker, it is possible to measure its activity in live cells (using flow cytometry) by staining for either CD59 (a GPI-anchored protein that is lost upon PIGA deletion and restored upon hmPIGA expression) or FLAER (a fluorescent inactive aerolysin that binds GPI anchors). We currently (May 2019) only have human-specific CD59 antibodies, while FLAER should work for both human and mouse.
Positive control payload delivery: The ultimate QC for LP-harboring cells would be to perform Big-IN delivery of a PL. However, as the efficiency of large DNA transfection is low, it is also possible to test and calibrate Big-IN in newly engineered cells with a small positive control payload. To this end, we have generated pPL001, which harbors a phEF1a-eGFP-T2A-BSD-bGHpA cassette flanked by LoxM/LoxP sites. Cells that received pPL001 can be positively selected with Blasticidin S or detected, sorted and analyzed by flow cytometry. When using an LP that harbors an RFP (e.g. mScarlet), it is possible to track the replacement of the LP by the positive control payload by monitoring the transition from red to green fluorescence.