BioGem Blog

Posts

Showing posts with the label Multiple Sequence Alignment

Constructing Phylogenetic Tree using UPGMA Method

June 17, 2025

UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a distance-based method for constructing phylogenetic trees. It works by iteratively merging the two closest clusters of sequences into a new cluster until all sequences are grouped into a single tree. The distance between two clusters is the average of all pairwise distances between the sequences of one cluster and the sequences of the other. UPGMA produces rooted trees, meaning each tree has a defined root representing the common ancestor. Here's a more detailed explanation:

1. Distance Matrix

UPGMA begins with a distance matrix, which contains the pairwise distances between all sequences being compared. These distances can be based on sequence alignment, protein structure comparisons, or other relevant metrics. \[D_{i,j}=\max\begin{cases}D_{i-1,j-1} & + & s(a_i,b_j) \\D_{i-1,j} & + & s(a_i,-) \\D_{i,j-1} & + & s(-,b_j)\end{cases}=\max\begin{cases}D_{i-1,j-1}& + ...
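The clustering loop described in this excerpt can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the post: the function name `upgma`, the nested-tuple tree format, and the toy distances are all assumptions made for the example.

```python
from itertools import combinations


def upgma(names, dist):
    """Minimal UPGMA sketch (hypothetical helper, not from the post).

    names: list of sequence labels.
    dist:  square matrix, dist[i][j] = distance between names[i] and names[j].
    Returns a nested tuple (left, right, height); leaves are plain labels.
    """
    # Active clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances between active clusters, keyed by the id pair.
    d = {frozenset((i, j)): dist[i][j]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Pick the two closest clusters.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        # The new node sits at half the distance between the merged clusters.
        height = d[frozenset((a, b))] / 2
        # Size-weighted mean keeps every original pairwise distance equally
        # weighted, which is the arithmetic-mean rule UPGMA is named for.
        for c in clusters:
            d[frozenset((next_id, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b, height), size_a + size_b)
        next_id += 1

    (tree, _), = clusters.values()
    return tree


# Toy distances (made up for illustration): A and B are closest.
toy_names = ["A", "B", "C"]
toy_dist = [[0, 2, 6],
            [2, 0, 6],
            [6, 6, 0]]
toy_tree = upgma(toy_names, toy_dist)
```

With the toy matrix, A and B are merged first at height 1.0, and C joins that cluster at height 3.0. The pair search here is quadratic per merge, which is fine for a handful of sequences but not for large data sets.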

Predicting Functional Regions in the Protein Sequence using SMART

June 05, 2025

Prediction of functional regions in a protein sequence plays a crucial role in computer-aided drug discovery. SMART (Simple Modular Architecture Research Tool) helps identify and annotate protein domains and analyze domain architectures by BLAST search. In bioinformatics, domains are distinct functional, structural, or evolutionary units within proteins, DNA, or RNA. Here are some key types of domains in bioinformatics:

1. Protein Domains

Structural Domains: Compact, independently folding units within a protein (e.g., SH3, zinc finger, immunoglobulin domains).
Functional Domains: Regions responsible for specific biochemical activities (e.g., kinase domain, DNA-binding domain).
Evolutionary Domains: Conserved regions indicating common ancestry (e.g., Pfam domains).

2. DNA/RNA Domains

Regulatory Domains: DNA regions controlling gene expression (e.g., promoters, enhancers).
Functional RNA Domains: Motifs in non-coding RNAs (e.g., ribozyme catalytic cor...

Constructing Phylogenetic Tree using MEGA Software

August 22, 2023

A phylogenetic tree (a.k.a. cladogram or dendrogram) is a graphical representation of the genetic and evolutionary relationships among species, organisms, or genes. It helps to find the common ancestor. Construction of a phylogenetic tree consists of two phases: multiple sequence alignment and computation of a distance matrix. This is a simple video tutorial for constructing a phylogenetic tree using Molecular Evolutionary Genetics Analysis (MEGA) software. The MEGA software produces phylogenetic trees from multiple sequences in various formats: rectangular, slanting, curved, and radial.
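To make the second phase concrete, here is a small Python sketch that turns an already aligned set of sequences into a distance matrix. It uses the p-distance (the fraction of differing sites) purely as an assumption for illustration; MEGA offers several substitution models, and the excerpt does not say which one the tutorial uses. The function names and the toy alignment are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned sites that differ, skipping columns with a gap.

    Assumption: '-' marks a gap and both sequences come from the same
    alignment, so they have equal length.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(alignment):
    """Symmetric matrix of p-distances for a list of aligned sequences."""
    n = len(alignment)
    return [[p_distance(alignment[i], alignment[j]) for j in range(n)]
            for i in range(n)]


# Toy alignment, made up for illustration.
toy_alignment = ["ACGT", "ACGA", "TCGA"]
toy_matrix = distance_matrix(toy_alignment)
```

The resulting matrix is the kind of input a distance-based tree-building method works from.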

Constructing Entropy Plot from Multiple Sequence Alignment

September 13, 2020

In sequence analysis, entropy is a measure of the variation of characters within each column of a multiple sequence alignment. An entropy plot can be computed from a multiple sequence alignment using different entropy formulas, namely Shannon's Entropy, Schneider's Entropy, Shenkin's Entropy, Gerstein's Entropy, and Gap normalized Entropy. Computing an entropy plot consists of two phases: (i) performing multiple sequence alignment and deriving the consensus, and (ii) calculating the entropy value for each column of the multiple sequence alignment. The entropy plot is generated by drawing one vertical line per column, with the positions of the consensus sequence on the x-axis and the entropy value on the y-axis. This simple video tutorial demonstrates how to produce an entropy plot from a multiple sequence alignment. The tools used in this tutorial are ClustalW and Entropy Plotter. Note: We can choose any multiple sequence alignment tool, but the ...
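The per-column calculation in phase (ii) can be sketched in Python for the Shannon variant. This is a minimal sketch, not the Entropy Plotter implementation: the function name is hypothetical, the logarithm base (2, giving bits) is an assumption, and gaps are simply counted as one more symbol.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy in bits for each column of an aligned sequence set.

    Assumptions: all sequences have the same length, and a gap character
    is treated like any other symbol.
    """
    if len(set(map(len, alignment))) != 1:
        raise ValueError("all aligned sequences must have the same length")
    entropies = []
    for column in zip(*alignment):  # iterate column by column
        total = len(column)
        # H = sum over symbols of p * log2(1 / p)
        h = sum((n / total) * math.log2(total / n)
                for n in Counter(column).values())
        entropies.append(h)
    return entropies


# Toy alignment, made up for illustration.
toy_msa = ["AAA", "AAC", "ATG", "ATT"]
toy_entropies = column_entropies(toy_msa)
```

For the toy alignment, the fully conserved first column scores 0.0, the two-symbol second column scores 1.0, and the third column with four different symbols scores 2.0. Plotting these values against column position gives the entropy plot described above.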
