Systems Seminar - CSE
Efficient Pattern Mining across DNA Sequences
This talk examines the relevance of a particular form of stationariness to pattern mining and explores its application to a novel approach to DNA homology and sequence alignment. First, a stationariness-based data reduction technique is applied to significantly reduce the size of the input sequences. Then, a stationariness-optimized approximate pattern matching technique is applied to uncover approximate homologues of a given signature within a target sequence. Finally, an optimized alignment technique is applied to each such approximate homologue to verify alignment under a relaxed level of mismatches, insertions, deletions, transpositions, and rearrangements. The approach provides a means to identify near and distant homologues as well as to uncover their evolutionary relationships. More importantly, under certain conditions, the pattern mining identification approach (i.e., a combinatorial similarity-based data mining engine) can yield sub-linear run-time performance with respect to the size of the input DNA sequences.
Dr. Manohar-Alers earned a B.S. in computer engineering in 1987. He also earned an M.S. in computer engineering from the University of Wisconsin at Madison in 1988 and an M.S.E. in industrial engineering from the University of Michigan at Ann Arbor in 1992. Dr. Manohar-Alers earned his Ph.D. in computer science and engineering, specializing in software systems, from the University of Michigan at Ann Arbor in 1997. Previously, he worked at the IBM T. J. Watson Research Center as a Research Staff Member and at AT&T Bell Laboratories (now Lucent Bell Labs) as a Member of the Technical Staff. He has been the lead inventor on several patents on adaptive resource management.