Applied Unsupervised Machine Learning in Bioinformatics Sequences
Abstract
In recent years, bioinformatics has been increasingly applied to identifying disease types, developing vaccines and treatments, forensic medicine, and related areas. Because the data are abundant and accurate results are needed as quickly as possible, the field relies heavily on machine learning algorithms, which are run on these large datasets to perform tasks such as prediction, classification, outlier detection, and model discovery and description.
In this study, clustering, a type of unsupervised machine learning, was used to group DNA sequences. Clustering methods are very useful for biomedical data at different levels (DNA, RNA, and proteins). They are used to predict and identify unknown sequences based on known ones, to classify sequences into groups, and to build a hierarchical structure that represents a genealogical tree, which is useful for tracing genealogy and in criminal investigations.
We used two types of clustering algorithms, K-means and hierarchical clustering, applied the elbow method to find the optimal value of K, and compared different similarity measures. These were found to give similar results. The dataset consists of 160 amino acid sequences collected from GenBank and the College of Agriculture at the University of Basrah. The sequences are stored in files with different extensions (fasta, txt, docx).
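As an illustration of the K-means and elbow steps described above, the following is a minimal sketch and not the authors' implementation. It assumes that each sequence is represented by its amino acid composition (the abstract does not state the feature representation), uses a deterministic farthest-first initialization as a simplification, and runs on six made-up toy sequences rather than the study's dataset. All function and variable names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Assumed feature vector: relative frequency of each amino acid."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in points]


def kmeans(points, k, iters=100):
    """Lloyd's K-means with farthest-first initialization (a simplification)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    # Inertia: within-cluster sum of squared distances, used by the elbow method.
    inertia = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, inertia


# Made-up toy sequences: three Ala/Gly-rich and three Lys/Glu-rich.
sequences = ["AAGGAAGGAG", "GAGAGGAAGG", "AAGAAGAAGG",
             "KKEEKEKKEE", "EKEKEEEKEK", "KEKKEKEKKE"]
points = [composition(s) for s in sequences]

# Elbow method: compute the inertia for several K and look for the bend.
inertias = {k: kmeans(points, k)[1] for k in range(1, 5)}
labels, _ = kmeans(points, 2)

for k, value in inertias.items():
    print(f"K={k}: inertia={value:.4f}")
print("labels for K=2:", labels)
```

On this toy input the inertia drops sharply between K=1 and K=2 and changes little afterwards, which is the bend the elbow method looks for. On real data the bend is usually less clear-cut, and the choice of features and distance measure affects where it appears.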
How to Cite This Article
Esraa Abdul Hussein Alwan, Hassan Nima Habib, Salma A Mahmood (2025). Applied Unsupervised machine Learning in Bioinformatics Sequences. International Journal of Engineering and Computational Applications (IJECA), 1(5), 20-25. DOI: https://doi.org/10.54660/.IJECA.2025.1.5.20-25