The Primer Designer is a tool for designing primers for DNA amplification experiments. It provides a user-friendly interface to input target DNA sequences and generate optimized primer sequences for PCR or other DNA amplification techniques. It is built on the primers Python library (https://pypi.org/project/primers/).
- Input target DNA sequence(s)
- Generate optimized primer sequences
- View primer properties (GC content, melting temperature, etc.)
- Export primer sequences in CSV format.
- Most ordering websites accept comma-separated primers in batch, so you can copy and paste the exported primer names and sequences directly into your order. This works well for Eurofins Genomics, IDT, etc.
- Clone the repository:
git clone https://github.com/KodyKlupt/primer_designer
- Use the scripts in the 'Automated Design' folder. Alternatively, use the original files from the forked repository, which I have left in place.
- A sample entry format is provided. Simply add your sequence name and sequences to be amplified in the csv file.
- Open the Primer Designer application.
- Input the target DNA sequence(s) in the designated field.
- Keep creating new primers. When you are finished, type 'DONE' at the primer name prompt.
- View the generated primer sequences and their properties.
- Export the primer sequences in CSV format. A log file is also written that contains the scores for the designed primers. The primer sequences can be copied and pasted into a batch order for nucleotide synthesis.
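The export step can be pictured with the short sketch below. It is only an illustration of the "name,sequence" layout that batch order forms typically accept, not the code used by the scripts in this repository. The function name `to_batch_order` and the `_FWD`/`_REV` naming scheme are assumptions made for the example.

```python
import csv
import io


def to_batch_order(rows):
    """Format (name, fwd_seq, rev_seq) tuples as 'name,sequence' lines.

    Hypothetical helper: the naming scheme (<name>_FWD / <name>_REV) is an
    assumption for this sketch, not the format used by the repository scripts.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for name, fwd_seq, rev_seq in rows:
        writer.writerow([f"{name}_FWD", fwd_seq])
        writer.writerow([f"{name}_REV", rev_seq])
    return buffer.getvalue()


# Primer pair taken from the `primers create -f GGTCTC -r GAAGAC` example
# in the primers documentation; "geneA" is a made-up construct name.
order_text = to_batch_order(
    [("geneA", "GGTCTCAATGAGACAATAGCACACAC", "GAAGACTTTCGTATGCTGACCTAG")]
)
print(order_text)
```

The printed text can be pasted as-is into a batch order form that accepts one comma-separated name and sequence per line.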
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
For any questions or inquiries, please contact the project maintainer at [email protected].
Does it actually work well? Is it better than just creating primers manually in a program like SnapGene or Benchling?
I think so, for three reasons. 1) It is easy to make small mistakes when designing primers and copying them between files or order entries. The automated and interactive scripts I have written streamline this so you only need to copy and paste once. 2) The original primers package developers have already optimized primer selection for reasonable GC content, length, melting temperature, etc. 3) If you have a library of constructs all going into the same plasmid, and the prefix and suffix (i.e. the restriction digest regions) are identical, designing them in one batch is much quicker.
The prefix and suffix are the forward and reverse primer overhangs, the regions that DO NOT anneal to your construct. The forward overhang (prefix) is generally several nucleotides (e.g. TATA), followed by a restriction enzyme cut site, and likely a start codon (ATG). The reverse overhang (suffix) most likely contains a stop codon, followed by a restriction enzyme cut site, and finally several nucleotides (e.g. TATA). I have altered the program to use the reverse complement function from the Biopython package so that the reverse primer is generated correctly. This means you write the reverse overhang (suffix) in the 5'->3' direction, as it would appear after your construct sequence, and it is AUTOMATICALLY CONVERTED TO THE REVERSE COMPLEMENT.
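To make the suffix conversion concrete, here is a minimal sketch of a reverse complement written with the standard library only. The scripts themselves use Biopython's reverse complement function, so this stand-in is just for illustration. The example suffix (stop codon, XhoI site, padding) is hypothetical.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical suffix, written 5'->3' as it would follow the construct:
# stop codon (TAA) + XhoI site (CTCGAG) + padding (TATA)
suffix = "TAACTCGAGTATA"

# This is the overhang that ends up on the 5' end of the reverse primer.
print(reverse_complement(suffix))  # TATACTCGAGTTA
```

Note that a palindromic restriction site such as CTCGAG reads the same after conversion, while the stop codon and padding do not.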
This is a small, straightforward tool for creating PCR primers. Its target use-case is DNA assembly.
Reasons to choose primers instead of Primer3 include its:
- features: It is uniquely focused on DNA assembly flows like Gibson Assembly and Golden Gate cloning. You can design primers while adding sequence to the 5' ends of primers.
- simplicity: It is a small and simple Python CLI/library with a single dependency (seqfold). It is easier to install and use.
- interface: The Python library accepts and creates primers for Biopython Seq classes. It outputs JSON for easy integration with other applications.
- license: It has a permissive, business-friendly license (MIT) instead of a copyleft GPL v2 license.
pip install primers
primers chooses pairs while optimizing for length, tm, GC ratio, secondary structure, and off-target binding. In the simplest case, you just pass the sequence you want to amplify:
$ primers create CTACTAATAGCACACACGGGGACTAGCATCTATCTCAGCTACGATCAGCATC
dir tm ttm gc dg p seq
FWD 63.6 63.6 0.5 0 2.6 CTACTAATAGCACACACGGG
REV 63.2 63.2 0.5 -0.16 1.52 GATGCTGATCGTAGCTGAGATA
Additional sequence is added to the 5' end of primers via the add_fwd/add_rev args (-f/-r with CLI). By default, it will prepend the entire additional sequence. If you want it to choose the best subsequence to add to the 5' end (factoring in the features discussed below), allow it to choose from a range of indices via the add_fwd_len/add_rev_len args (-fl/-rl with CLI).
Each primer has two tms: "tm", the melting temperature for the portion of the primer that binds to the template sequence, and "tm_total", the melting temperature for the entire primer including the additional sequence added to the primer's 5' end.
from primers import create
# add enzyme recognition sequences to FWD and REV primers: BsaI, BpiI
fwd, rev = create("AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA", add_fwd="GGTCTC", add_rev="GAAGAC")
print(fwd.fwd) # True
print(fwd.seq) # GGTCTCAATGAGACAATAGCACACACA; 5' to 3'
print(fwd.tm) # 62.4; melting temp
print(fwd.tm_total) # 68.6; melting temp with added seq (GGTCTC)
print(fwd.dg) # -1.86; minimum free energy of the secondary structure
# add from a range of sequence to the FWD primer: [5, 12] bp
fwd, rev = create("AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA", add_fwd="GGATCGAGCTTGA", add_fwd_len=(5, 12))
print(fwd.seq) # AGCTTGAAATGAGACAATAGCACACACAGC (AGCTTGA added from add_fwd)
print(fwd.tm) # 62.2
print(fwd.tm_total) # 70.0
$ primers create --help
usage: primers create [-h] [-f SEQ] [-fl INT INT] [-r SEQ] [-rl INT INT] [-t SEQ] [-j | --json | --no-json] SEQ
positional arguments:
SEQ create primers to amplify this sequence
options:
-h, --help show this help message and exit
-f SEQ additional sequence to add to FWD primer (5' to 3')
-fl INT INT space separated min-max range for the length to add from '-f' (5' to 3')
-r SEQ additional sequence to add to REV primer (5' to 3')
-rl INT INT space separated min-max range for the length to add from '-r' (5' to 3')
-t SEQ sequence to check for off-target binding sites
-j, --json, --no-json
write the primers to a JSON array
By default, the primers are logged in table format in rows of dir, tm, ttm, gc, dg, p, seq where:
- dir: FWD or REV
- tm: the melting temperature of the annealing portion of the primer (Celsius)
- ttm: the total melting temperature of the primer with added seq (Celsius)
- gc: the GC ratio of the primer
- dg: the minimum free energy of the primer (kcal/mol)
- p: the primer's penalty score. Lower is better
- seq: the sequence of the primer in the 5' to the 3' direction
$ primers create -f GGTCTC -r GAAGAC AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA
dir tm ttm gc dg p seq
FWD 60.8 67.0 0.5 -1.86 5.93 GGTCTCAATGAGACAATAGCACACAC
REV 60.8 65.8 0.5 0 3.2 GAAGACTTTCGTATGCTGACCTAG
The --json flag prints primers in JSON format with more details on scoring. The example below is truncated for clarity:
$ primers create CTACTAATAGCACACACGGGGACTAGCATCTATCTCAGCTACGATCAGCATC --json | jq
[
{
"seq": "CTACTAATAGCACACACGGG",
"len": 20,
"tm": 63.6,
"tm_total": 63.6,
"gc": 0.5,
"dg": 0,
"fwd": true,
"off_target_count": 0,
"scoring": {
"penalty": 2.6,
"penalty_tm": 1.6,
"penalty_tm_diff": 0,
"penalty_gc": 0,
"penalty_len": 1,
"penalty_dg": 0,
"penalty_off_target": 0
}
},
...
Choosing PCR primers requires optimizing for a few different characteristics. Ideally, pairs of primers for PCR amplification would have similar tms, GC ratios close to 0.5, high minimum free energies (dg), and a lack of off-target binding sites. In primers, like Primer3, choosing amongst those (sometimes competing) goals is accomplished with a linear function that penalizes undesirable characteristics. The primer pair with the lowest combined penalty is chosen.
The penalty for each possible primer, p, is calculated as:
PENALTY(p) =
abs(p.tm - optimal_tm) * penalty_tm + // penalize each deg of suboptimal melting temperature
abs(p.gc - optimal_gc) * penalty_gc + // penalize each percentage point of suboptimal GC ratio
abs(len(p) - optimal_len) * penalty_len + // penalize each bp of suboptimal length
abs(p.tm - p.pair.tm) * penalty_tm_diff + // penalize each deg of melting temperature diff between primers
abs(p.dg) * penalty_dg + // penalize each kcal/mol of free energy in secondary structure
p.offtarget_count * penalty_offtarget // penalize each off-target binding site
Each of the optimal (optimal_*) and penalty (penalty_*) parameters is adjustable in the primers.create() function. The defaults are below:
optimal_tm: float = 62.0
optimal_gc: float = 0.5
optimal_len: int = 22
penalty_tm: float = 1.0
penalty_gc: float = 0.2
penalty_len: float = 0.5
penalty_tm_diff: float = 1.0
penalty_dg: float = 2.0
penalty_offtarget: float = 20.0
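As a worked example of the formula and defaults above, the sketch below recomputes the penalties for the two primers shown in the `primers score` output of this document. It is not the library's implementation. `PrimerStats` and `penalty` are hypothetical names, and the factor of 100 on the GC term is an assumption inferred from the sample output (a GC ratio of 0.4 yields penalty_gc of 2), which matches the "percentage point" wording.

```python
from dataclasses import dataclass


@dataclass
class PrimerStats:
    """Hypothetical container for the features used in the penalty."""
    tm: float               # melting temp of the annealing portion (Celsius)
    gc: float               # GC ratio, 0.0 to 1.0
    length: int             # primer length in bp
    dg: float               # minimum free energy (kcal/mol)
    off_target_count: int = 0


def penalty(p, pair_tm, optimal_tm=62.0, optimal_gc=0.5, optimal_len=22,
            penalty_tm=1.0, penalty_gc=0.2, penalty_len=0.5,
            penalty_tm_diff=1.0, penalty_dg=2.0, penalty_offtarget=20.0):
    """Linear penalty for one primer, given its partner's tm."""
    return (
        abs(p.tm - optimal_tm) * penalty_tm
        # Assumption: GC ratio is converted to percentage points (x100).
        + abs(p.gc - optimal_gc) * 100 * penalty_gc
        + abs(p.length - optimal_len) * penalty_len
        + abs(p.tm - pair_tm) * penalty_tm_diff
        + abs(p.dg) * penalty_dg
        + p.off_target_count * penalty_offtarget
    )


# Feature values copied from the `primers score` JSON example.
fwd = PrimerStats(tm=39.4, gc=0.4, length=18, dg=-1.86)
rev = PrimerStats(tm=59.0, gc=0.5, length=18, dg=0.0)

print(round(penalty(fwd, rev.tm), 1))  # 49.9
print(round(penalty(rev, fwd.tm), 1))  # 24.6
```

Under these assumptions the totals agree with the "penalty" fields reported for the same primers (49.9 and 24.6).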
If you already have primers, and you want to see their features and penalty score, use the primers score command. The command below scores a FWD and REV primer against the sequence (-s) that they were created to amplify:
$ primers score GGTCTCAATGAGACAATA TTTCGTATGCTGACCTAG -s AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAATTT --json | jq
[
{
"seq": "GGTCTCAATGAGACAATA",
"len": 18,
"tm": 39.4,
"tm_total": 55,
"gc": 0.4,
"dg": -1.86,
"fwd": true,
"off_target_count": 0,
"scoring": {
"penalty": 49.9,
"penalty_tm": 22.6,
"penalty_tm_diff": 19.6,
"penalty_gc": 2,
"penalty_len": 2,
"penalty_dg": 3.7,
"penalty_off_target": 0
}
},
{
"seq": "TTTCGTATGCTGACCTAG",
"len": 18,
"tm": 59,
"tm_total": 59,
"gc": 0.5,
"dg": 0,
"fwd": false,
"off_target_count": 0,
"scoring": {
"penalty": 24.6,
"penalty_tm": 3,
"penalty_tm_diff": 19.6,
"penalty_gc": 0,
"penalty_len": 2,
"penalty_dg": 0,
"penalty_off_target": 0
}
}
]
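Because the output is plain JSON, it is straightforward to post-process. The sketch below parses an abridged copy of the output above with the standard library and reports, for each primer, its total penalty and the single largest contributing term. `SAMPLE_JSON` and `summarize` are names made up for this example. In practice the text would come from the CLI's standard output.

```python
import json

# Abridged copy of the `primers score ... --json` output shown above.
SAMPLE_JSON = """
[
  {"seq": "GGTCTCAATGAGACAATA", "fwd": true,
   "scoring": {"penalty": 49.9, "penalty_tm": 22.6, "penalty_tm_diff": 19.6,
               "penalty_gc": 2, "penalty_len": 2, "penalty_dg": 3.7,
               "penalty_off_target": 0}},
  {"seq": "TTTCGTATGCTGACCTAG", "fwd": false,
   "scoring": {"penalty": 24.6, "penalty_tm": 3, "penalty_tm_diff": 19.6,
               "penalty_gc": 0, "penalty_len": 2, "penalty_dg": 0,
               "penalty_off_target": 0}}
]
"""


def summarize(records):
    """Return (direction, seq, total penalty, largest term) per primer."""
    rows = []
    for rec in records:
        scoring = rec["scoring"]
        # Ignore the total when looking for the largest individual term.
        terms = {k: v for k, v in scoring.items() if k != "penalty"}
        worst = max(terms, key=terms.get)
        direction = "FWD" if rec["fwd"] else "REV"
        rows.append((direction, rec["seq"], scoring["penalty"], worst))
    return rows


for row in summarize(json.loads(SAMPLE_JSON)):
    print(row)
```

For this pair, the forward primer's penalty is dominated by its low tm, and the reverse primer's by the tm difference between the two primers.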
Usually, off-target binding sites should be avoided. In primers, off-target binding sites are those with <= 1 mismatch in the last 10 base pairs of the primer's 3' end. This definition is experimentally supported by:
Wu, J. H., Hong, P. Y., & Liu, W. T. (2009). Quantitative effects of position and type of single mismatch on single base primer extension. Journal of microbiological methods, 77(3), 267-275
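The definition above can be sketched as a simple sliding-window scan. This is an illustration of the rule only, not the library's implementation. `count_binding_sites` is a hypothetical name, and scanning both strands is an assumption of this sketch. The count includes the intended binding site, so a primer with no off-targets still scores 1 on its own template here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_binding_sites(primer, template, window=10, max_mismatches=1):
    """Count sites matching the primer's 3' end with few mismatches.

    Scans the template and its reverse complement (an assumption of this
    sketch) for windows that differ from the last `window` bases of the
    primer at no more than `max_mismatches` positions.
    """
    tail = primer.upper()[-window:]
    top = template.upper()
    bottom = top.translate(_COMPLEMENT)[::-1]

    count = 0
    for strand in (top, bottom):
        for i in range(len(strand) - len(tail) + 1):
            site = strand[i:i + len(tail)]
            mismatches = sum(a != b for a, b in zip(tail, site))
            if mismatches <= max_mismatches:
                count += 1
    return count


# Toy template: one exact site plus one site with a single mismatch
# (GATTACAGGC vs GATTACACGC), separated by poly-A spacers.
primer = "CCCCCCGATTACAGGC"
template = "A" * 20 + "GATTACAGGC" + "A" * 20 + "GATTACACGC" + "A" * 20

print(count_binding_sites(primer, template))                    # 2
print(count_binding_sites(primer, template, max_mismatches=0))  # 1
```

Subtracting the one intended site from the first count gives the number of off-target sites under this definition.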
By default, primers are checked for off-targets within the seq parameter passed to create(seq). But the primers can be checked against another sequence if it is passed to the optional offtarget_check argument (-t for CLI). This is useful when PCR'ing a subsequence of a larger DNA sequence like a plasmid.
from primers import create
seq = "AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA"
seq_parent = "ggaattacgtAATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAAggaccagttacagga"
# primers are checked for offtargets in `seq_parent`
fwd, rev = create(seq, offtarget_check=seq_parent)