Works


Work Experience in Statistical and Human Genetics

I specialize in analyzing large-scale genomic data, including GWAS of psychiatric disorders in relation to Vitamin D, Rare-Variant Burden Analysis, and Common-Variant GWAS for complex disorders such as Schizophrenia and Bipolar disorder. I also perform Pathway Analysis to understand genetic compensation in zebrafish and use Mendelian Randomization to investigate causal factors in psychiatric disorders. My work extends to integrating multidimensional data from diverse omics datasets, such as RNA, DNA, ChIP, ATAC, and nanopore sequencing, to explore the landscape of human diseases. Collaboration and cross-functional work are key aspects of my approach.

Working experience with high-performance compute clusters at both Yale and Stanford.

Relevant work experience at the interface of statistical genetics and the human microbiome

The gut microbiota plays an important role in the bidirectional communication between the gut and the central nervous system. Mounting evidence suggests that gut microbiota can influence brain function via neuroimmune and neuroendocrine pathways as well as the nervous system. Advances in sequencing techniques facilitate the investigation of the underlying relationship between gut microbiota and psychiatric disorders.

Goals pursued – A meta-analysis of the scarce data available, drawn from existing databases and published studies, is under consideration.

Relevant Post-doctoral work at Yale Division of Psychiatric and Human Genetics

  1. Investigating the causality of complex psychiatric diseases to better understand disease prognosis, therapeutic choices, and clinical interventions. Techniques used include Polygenic Risk Scores, Mendelian Randomization, Generalized Linear Models, Population Genetics, Multi-trait Conditional and Joint Analysis, and Genomic Structural Equation Modelling to decipher the relationship between Vitamin D GWAS signals and Psychiatric diseases.

Post-Doctoral work at Yale Centre for Cardiovascular Research

  • Transcriptomic analysis of zebrafish RNA sequencing data to understand genomic regulation – Using bulk and single-cell transcriptomics bioinformatics approaches to understand the phenomenon of genetic compensation in a zebrafish cardiovascular disease model under Dr. Nicol Stefania at Yale Centre for Cardiovascular Research (manuscript in preparation).

  • Cancer Biology experience- Familiarity with the TCGA and Oncomine databases, leveraging data from metabolomic, proteomic, and exome sequencing profiles for biomarker development.

  • Single-cell RNA sequencing experience- I have experience with single-cell RNA sequencing data, especially with microRNA mutants which show a transition from hematopoietic to stem cell development. I am also working with PBMC (peripheral blood mononuclear cell) datasets using packages like Seurat and Monocle. I have experience using UMAP embeddings to understand trajectories in single-cell datasets. In addition, Yale computer scientists have developed PHATE (Potential of Heat-diffusion for Affinity-based Trajectory Embedding), a Python-based tool for visualizing high-dimensional data. PHATE learns and visualizes the data manifold while preserving both local and global distances, which gives it advantages over the usual trajectory analysis with UMAP. I have also done RNA velocity analysis to find a proper trajectory for epithelial to hematopoietic transitions in microRNA mutants and to examine dynamic differences between spliced and unspliced isoforms.

Finally, I have applied convolutional neural networks (CNN) along with supervised and unsupervised learning, especially hierarchical, agglomerative, and non-Euclidean clustering, to identify which clusters of single cells are most informative and when cells transition from one state to another. I have used both Python and R scripts to achieve these objectives. I have also compared different perturbation algorithms that can be used to make sense of large-scale biological data spanning the whole landscape of single cells.

Current work at Stanford University in brief

In my current postdoc at Stanford University, I am using a combination of sequencing technologies such as RNA sequencing, ATAC sequencing, ChIP-seq, CUT&Tag, and Oxford Nanopore sequencing to help deliver better AAV vectors for effective drug design.

I am currently applying neural network-based deep learning models that use fold enrichment scores, together with entropy-driven modeling approaches, to design vector libraries with better packaging efficiency and capsid library diversity.

Relevant machine learning and deep learning experience

  • Machine Learning techniques used – Linear and Logistic Regression, Random Forest, Convolutional Neural Networks, and Hierarchical clustering to make sense of complex polygenic multimodal biological data.

Proficient in applying a diverse range of machine learning techniques, including random forests, clustering, and dimensionality reduction, within systems biology and systems medicine, especially for single-cell perturbation datasets. Skilled in using these algorithms to uncover complex patterns and relationships within extensive human genetics datasets.

  • Deep learning- Using convolutional neural networks and the U-Net architecture to solve image segmentation problems with tools and interfaces like Cellpose.
  • Workflows deployed- Snakemake, Nextflow, and GATK.
  • Programming experience- Python, R, Bash, MATLAB, C, Java, and version control with Git.