BLASTing Sequences for Detecting Repetitive Regions (Boeke Lab 2019)

This guide shows how to check your sequence for repeats by BLASTing it against itself. Do this before de novo assembly in yeast to see whether the sequence is likely to be challenging: repeated regions can recombine with each other, causing mis-assembly outside the intended homology arms.

We would appreciate it if you emailed us the result of your self-alignment along with your nomination.

 

Open NCBI’s BLASTn, tick “Align two or more sequences” so that the Subject box appears, and paste your sequence into both the Query and Subject boxes.

 Adjust algorithm parameters to the following:

  • Word size=32

  • Low complexity regions=unchecked

  • To restrict the algorithm to nearly perfect stretches of homology (which is what yeast “care” about), change the Match/Mismatch Scores to “1,-4”.

Click on BLAST and wait for the alignment to finish.

2.png
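
For those who prefer the command line, the web settings above can probably be approximated with a local BLAST+ installation. The sketch below only builds the command. The mapping from the web form to these flags (the megablast task, `-dust no` for the unchecked low-complexity filter, `-reward 1 -penalty -4` for the match/mismatch scores) is an assumption and is not part of the original protocol. The function name and `my_sequence.fasta` are placeholders.

```python
import shlex
from typing import List


def build_self_blast_command(fasta_path: str, word_size: int = 32) -> List[str]:
    """Build a blastn command that aligns one FASTA file against itself.

    Assumed equivalents of the web-form settings in this guide:
      Word size = 32                    -> -word_size 32
      Low complexity regions unchecked  -> -dust no
      Match/Mismatch Scores "1,-4"      -> -reward 1 -penalty -4
    """
    return [
        "blastn",
        "-task", "megablast",
        "-query", fasta_path,
        "-subject", fasta_path,   # same file on both sides = self-alignment
        "-word_size", str(word_size),
        "-dust", "no",
        "-reward", "1",
        "-penalty", "-4",
        "-outfmt", "6",           # tab-separated hit table
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it.
    print(shlex.join(build_self_blast_command("my_sequence.fasta")))
```

Running the printed command requires `blastn` on your PATH. The web interface remains the reference for this guide.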

Click on Dot Matrix View

3.jpg
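
The Dot Matrix View plots each hit by its query and subject coordinates, so the full-length self-match forms the main diagonal and repeats appear off it. As a rough text equivalent, assuming the alignment was run locally with BLAST+ tabular output (`-outfmt 6`, default columns), the sketch below keeps only the hits that do not lie on the main diagonal. The `Hit` type and function name are illustrative, not part of any BLAST tool.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    length: int
    identity: float


def off_diagonal_hits(tabular_text: str) -> List[Hit]:
    """Parse default outfmt 6 text and drop hits on the main diagonal.

    Default columns: qseqid sseqid pident length mismatch gapopen
                     qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hit = Hit(int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                  int(f[3]), float(f[2]))
        # A hit with identical query and subject coordinates is the
        # sequence matching itself in place, i.e. the main diagonal.
        if hit.q_start == hit.s_start and hit.q_end == hit.s_end:
            continue
        hits.append(hit)
    return hits
```

Each direct repeat is reported twice in a self-alignment (copy A against copy B, and B against A). Hits with `s_start > s_end` are on the opposite strand, which corresponds to an inverted repeat.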

Here is an example of a ‘good’ sequence that was successfully assembled de novo: the signal lies mostly on the diagonal, with few spots off it.

4.jpg

And here is an example of a more challenging locus containing a large direct repeat: the long stretch of spots off the diagonal indicates repeated sequence. Such an assembly is not impossible, but it is more difficult.

6.jpg
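
For a quick check without BLAST, the idea behind the dot plot can be sketched in a few lines of Python: index every word of a fixed length and report positions where the same word occurs more than once. This is a simplified stand-in, not the BLAST algorithm. It finds only exact direct repeats, ignores the reverse strand, and the function name is illustrative. The default word length of 32 mirrors the word size used above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def repeated_words(seq: str, word_size: int = 32) -> List[Tuple[int, int]]:
    """Return 0-based (i, j) pairs, i < j, where the same word starts at both.

    Each pair corresponds to one off-diagonal dot in a self dot plot.
    """
    seq = seq.upper()
    positions: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - word_size + 1):
        positions[seq[i:i + word_size]].append(i)

    pairs = []
    for starts in positions.values():
        # Words seen once contribute nothing; repeated words give all pairs.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                pairs.append((starts[a], starts[b]))
    return sorted(pairs)


if __name__ == "__main__":
    # Toy sequence: a 7-base unit, a spacer, then the same unit again.
    toy = "ACGTGCA" + "TTTTT" + "ACGTGCA"
    print(repeated_words(toy, word_size=7))  # [(0, 12)]
```

An empty result means no exact repeat of at least `word_size` bases was found. A long run of consecutive pairs with a constant offset corresponds to a long off-diagonal line like the one in the challenging example above.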