A fast exact sequential algorithm for the partial digest problem
Background
Restriction site analysis involves determining the locations of restriction sites after the process of digestion by reconstructing their positions based on the lengths of the cut DNA. Using different reaction times with a single enzyme to cut DNA is a technique known as a partial digestion. Determining the exact locations of restriction sites following a partial digestion is challenging due to the computational time required even with the best known practical algorithm.
Results
In this paper, we introduce an efficient algorithm to find the exact solution for the partial digest problem. The algorithm is able to find all possible solutions for the input and works by traversing the solution tree with a breadth-first search in two stages and deleting all repeated subproblems. Two types of simulated data, random and Zhang, are used to measure the efficiency of the algorithm. We also apply the algorithm to real data for the Luciferase gene and the E. coli K12 genome.
Conclusion
Our algorithm is a fast tool to find the exact solution for the partial digest problem. The percentage of improvement is more than 75% over the best known practical algorithm for the worst case. For large numbers of inputs, our algorithm is able to solve the problem in a suitable time, while the best known practical algorithm is unable.
Other Information
Published in: BMC Bioinformatics
License: https://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.1186/s12859-016-1365-2
Funding
Open Access funding provided by the Qatar National Library.
History
Language
- English
Publisher
Springer NaturePublication Year
- 2016
License statement
This Item is licensed under the Creative Commons Attribution 4.0 International License.Institution affiliated with
- Hamad Bin Khalifa University
- Qatar Computing Research Institute - HBKU