Posts

Compiling and Creating locBLAST Image using Docker

locBLAST is a PHP library that provides a graphical user interface (GUI) for the command-line NCBI BLAST+ programs. The official Docker image of locBLAST is available on Docker Hub. 💿 Using Existing Docker Images for Web BLAST The most straightforward way to run a web-based BLAST service with Docker is to use a pre-built image, such as those provided by the NCBI or the open-source community: NCBI BLAST+ Command Line Tools: The NCBI provides official Docker images for the standalone command-line BLAST+ suite, which can be found on their GitHub page and Docker Hub. 🛠️ Setting up locBLAST in a Docker Environment locBLAST requires a web server (such as Apache or Nginx) with PHP support, the standalone NCBI BLAST+ suite, and the locBLAST library files. To run locBLAST in a Docker container, you would typically need to: Install Docker: Ensure you have Docker installed on your system. Set up a Dockerfile: Create a D...

3D Protein Structure Prediction Server AlphaFold 3

The AlphaFold Server is a free, web-based platform launched by Google DeepMind and Isomorphic Labs to provide the scientific community with access to AlphaFold 3. While the original AlphaFold was revolutionary for predicting protein structures, the AlphaFold 3 server expands this capability to virtually all "life's molecules," allowing researchers to model how proteins interact with DNA, RNA, ligands, and more in a single, unified system. 🧬 Key Capabilities Unlike its predecessors, the AlphaFold 3 server is a multimodal model. It doesn't just fold proteins; it predicts the joint 3D structure of complex molecular assemblies. Protein-Ligand Interactions: Models how small molecules (drugs) bind to proteins, with a reported 50% improvement over traditional docking methods. Nucleic Acids: Predicts the structures of DNA and RNA and how they interact with proteins (e.g., transcription factors or CRISPR complexes). Chemical Modifications: Pr...

Bioinformatics Protocol for NGS Data Analysis

A step-by-step bioinformatics protocol for Next-Generation Sequencing (NGS) data analysis consists of: (1) data quality control using tools like FastQC to assess raw data; (2) data preprocessing for adapter trimming and low-quality base removal with tools like Trimmomatic or fastp; (3) read mapping to a reference genome using aligners such as BWA or Bowtie2; (4) post-alignment processing, including duplicate removal with Picard and variant calling with GATK or SAMtools; and (5) downstream analysis and visualization for specific applications, such as differential gene expression or variant interpretation, using tools like R packages or IGV. A more detailed breakdown is given below. 1. Data Quality Control (QC) Purpose: To check the quality of the raw sequencing reads and identify any potential issues. Tools: FastQC: A widely used tool to generate quality control reports for raw sequencing data. Output: A report summarizing metrics like Phred scores, adapter contamination, and sequence qu...
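
The Phred scores mentioned in the QC step can be illustrated with a minimal Python sketch. It assumes the common Phred+33 ASCII encoding of FASTQ quality strings; the function names are hypothetical helpers for illustration and are not part of FastQC or any other tool named above.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 is an assumption)."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Convert a Phred score Q into a base-call error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def mean_quality(quality_string, offset=33):
    """Average Phred score across one read."""
    scores = phred_scores(quality_string, offset)
    return sum(scores) / len(scores)


# Example quality string made up for illustration: 'I' -> Q40, '5' -> Q20, '!' -> Q0
example = "I5!"
print(phred_scores(example))
print(mean_quality(example))
```

A score of Q20 corresponds to a 1-in-100 chance that the base call is wrong, and Q30 to 1 in 1000, which is why trimming tools usually work with a quality threshold.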

Creating 2D Line Plot using GNU Plot Software

A simple Gnuplot script that generates a customized 2D line plot of potential energy data (presumably vs. simulation steps or time), using data from a file named PE.txt (available on GitHub). The script produces a clean, stylized plot of potential energy vs. simulation steps with (1) custom fonts, colors, and line styles, (2) no legend, (3) margins around the x-axis, and (4) automatic y-axis scaling.

Source Code

    set title '{/Times-New-Roman=14:Bold Potential Energy Plot}' tc rgb '#167116'
    set xlabel '{/Arial:Italic Number of steps}' tc rgb 'red'
    set ylabel '{/Arial:Italic Potential energy}' tc rgb 'red'
    set style line 1 lt 1 lc rgb '#f70453' lw 0.5
    set grid layerdefault lt 0 lc rgb 'blue' lw 0.5
    set border lt 1 lc rgb 'blue' lw 1
    unset key
    plot 'PE.txt' with lines ls 1
    set xrang...

Molecular Dynamics Simulation of Micromolecules using Chimera

Performing a Molecular Dynamics (MD) simulation of a small molecule in UCSF Chimera involves a series of steps: preparing the molecule, setting up the simulation environment, running the simulation, and analyzing the resulting trajectory. Here is a step-by-step guide: 1. Loading and Preparing the Small Molecule Structure Open Chimera: Launch UCSF Chimera or ChimeraX. Load your molecule: Import your small molecule structure into Chimera using File > Open, or File > Fetch by ID if the structure is available in a database like the Protein Data Bank (PDB). Add Hydrogens: Use the "Prep Structure" section of the "Molecular Dynamics Simulation" tool to add hydrogens. Alternatively, use the addh command. Assign Force Field Parameters: Since you are working with a small molecule (a nonstandard residue), you will use Amber's Antechamber module, which is included in Chimera, to assign force field parameters. This involves ass...

Constructing Phylogenetic Tree using UPGMA Method

UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a distance-based method for constructing phylogenetic trees. It works by iteratively merging the two closest clusters of sequences into a new cluster until all sequences are grouped into a single tree. The distance between two clusters is the average of all pairwise distances between a sequence in one cluster and a sequence in the other. UPGMA produces rooted trees, meaning each tree has a defined root representing the common ancestor. Here's a more detailed explanation: 1. Distance Matrix UPGMA begins with a distance matrix, which contains the pairwise distances between all sequences being compared. These distances can be based on sequence alignment, protein structure comparisons, or other relevant metrics. \[D_{i,j}=\max\begin{cases}D_{i-1,j-1} & + & s(a_i,b_j) \\D_{i-1,j} & + & s(a_i,-) \\D_{i,j-1} & + & s(-,b_j)\end{cases}=\max\begin{cases}D_{i-1,j-1}& + ...
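
The clustering loop described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration rather than the post's own implementation: the function name `upgma`, the list-of-lists matrix input, and the Newick-style output are assumptions made for the example.

```python
from itertools import combinations


def upgma(names, matrix):
    """Cluster labelled sequences with UPGMA and return a Newick string.

    names:  list of sequence labels
    matrix: symmetric list-of-lists of pairwise distances (same order as names)
    """
    # Active clusters: id -> (newick text, number of leaves, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller id, larger id)
    d = {(i, j): float(matrix[i][j])
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # 1. Find the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        (ta, na, ha), (tb, nb, hb) = clusters.pop(a), clusters.pop(b)

        # 2. The new internal node sits halfway between the two clusters.
        height = dab / 2.0
        newick = f"({ta}:{height - ha:g},{tb}:{height - hb:g})"

        # 3. Distance to every other cluster is the size-weighted average,
        #    which equals the mean over all leaf-to-leaf pairs.
        new_d = {}
        for k in clusters:
            dak = d[tuple(sorted((a, k)))]
            dbk = d[tuple(sorted((b, k)))]
            new_d[(k, next_id)] = (na * dak + nb * dbk) / (na + nb)

        # 4. Drop distances involving the merged clusters and add the new ones.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(new_d)
        clusters[next_id] = (newick, na + nb, height)
        next_id += 1

    (tree, _, _), = clusters.values()
    return tree + ";"


# Toy distances made up for illustration: A and B are closest, C is the outgroup.
print(upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]]))
# prints (C:3,(A:1,B:1):2);
```

Tracking the number of leaves in each cluster is what keeps the update an unweighted average over the original sequences; averaging the two cluster distances without the sizes would give the WPGMA variant instead.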

Predicting Functional Regions in the Protein Sequence using SMART

Predicting functional regions in a protein sequence plays a crucial role in computer-aided drug discovery. SMART (Simple Modular Architecture Research Tool) helps identify and annotate protein domains and analyze domain architectures by BLAST search. In bioinformatics, domains are distinct functional, structural, or evolutionary units within proteins, DNA, or RNA. Here are some key types of domains in bioinformatics: 1. Protein Domains Structural Domains: Compact, independently folding units within a protein (e.g., SH3, zinc finger, immunoglobulin domains). Functional Domains: Regions responsible for specific biochemical activities (e.g., kinase domain, DNA-binding domain). Evolutionary Domains: Conserved regions indicating common ancestry (e.g., Pfam domains). 2. DNA/RNA Domains Regulatory Domains: DNA regions controlling gene expression (e.g., promoters, enhancers). Functional RNA Domains: Motifs in non-coding RNAs (e.g., ribozyme catalytic cor...