About me

I'm Abdullah Nayem Wasi Emran, a Lecturer in the Department of Computer Science and Engineering at Brac University. I completed my Bachelor's degree in Computer Science and Engineering at Bangladesh University of Engineering and Technology (BUET). My research sits at the intersection of AI/ML for healthcare and bioinformatics, with a focus on methods that remain reliable across shifts in data and population, particularly in low-resource settings.

Recently, I led a cross-dataset study on mental health across diverse cohorts, published in Humanities and Social Sciences Communications (Nature Portfolio, 2025). On the bioinformatics side, our SCOPES work, accepted to the AAAI-26 AI2ASE Workshop, proposes stability-aware, cross-platform feature selection for matched TCGA microarray/RNA-Seq data, targeting reproducible biomarkers rather than fragile, platform-specific signals. Broadly, I care about causal and fair machine learning: treatment-effect modeling, subgroup analysis, and evaluation protocols that reveal when a model’s apparent gains don’t translate to real-world impact. My goal is to design simple, principled pipelines (data curation, robust learning, and transparent metrics) that can be deployed without exotic infrastructure.

Outside of academics, I enjoy learning about advances at the intersection of biology and computing.

Research Experience

  1. Efficient RNA–Protein Interaction Prediction with Foundation-Model Embeddings

    May 2025 – Present

    Bioinformatics

    Computational Biology

    Deep Learning

    I am conducting an ongoing research project under Dr. Mohammad Saifur Rahman on the computational prediction of RNA–protein interactions. The project involves curating datasets from sources such as UniProt and EuRBPDB, and developing preprocessing pipelines that integrate sequence and structural features. I am currently exploring pre-trained protein language models (e.g., ESM2, ProtBERT) and lightweight self-supervised approaches, alongside potential graph-based methods, to design an embedding framework that works under constrained GPU availability. The work is still in the dataset curation and model prototyping phase; the long-term objective is a scalable predictive framework for RNA–protein binding that can contribute to understanding post-transcriptional regulation and to therapeutic target discovery.

  2. Domain-Adaptive Causal Uplift Modeling across Heterogeneous Mental-Health Cohorts

    Apr 2025 – Present

    Causal Inference

    Domain Adaptation

    Fair ML

    I am working with Dr. A. B. M. Alim Al Islam on developing a domain-adaptive causal uplift model for mental-health prediction across medical students, quarantined individuals, and clinically diagnosed psychiatric patients. My work involves building a unified preprocessing pipeline for harmonizing heterogeneous datasets, implementing a Domain-Adversarial Neural Network (DANN) with gradient reversal and multi-head outputs in PyTorch Lightning, and incorporating fairness-aware regularization to reduce disparities across gender and education groups. The current focus is on improving out-of-domain generalization using leave-one-cohort-out evaluation and AUUC as the primary metric, while exploring calibration techniques that account for inverted treatment effects in certain cohorts. This research aims to provide a fair and generalizable framework for treatment-effect estimation in behavioral health.

  3. Hybrid Fireworks–Whale Optimization for Multi-Objective Cloud Task Scheduling

    Nov 2023 – Mar 2025

    Cloud Computing

    Metaheuristic Optimization

    Scheduling

    Undergraduate thesis under Dr. Rezwana Reaz. Designed four hybrid metaheuristics that fuse the Fireworks Algorithm (exploration) with the Whale Optimization Algorithm (exploitation) to minimise makespan, cost, and CPU/RAM/bandwidth usage on CloudSim Plus workloads. Added adaptive explosion amplitudes, time-varying encircling coefficients, and Gaussian-spark diversity controls for a balanced exploration–exploitation trade-off. Benchmarked the hybrids (SEQ, PAR, FWA-Encircling, WOA-Spark) on 100–500 independent tasks and the real-world GoCJ trace; achieved 2.5–7.5% higher composite fitness and shorter makespan than the OBDFWA, MWOA, and GA-GWO baselines (Friedman + Nemenyi, p < 0.05). Open-sourced the complete CloudSim Plus toolkit, CSV logs, and Jupyter notebooks on GitHub for full reproducibility. Manuscript under review at Cluster Computing (Springer).

  4. Mental Health Across Contexts: A Cross-Dataset Study Covering Medical Students, Quarantined Individuals, and Psychiatric-Disordered Subjects

    Jul 2024 – Jan 2025

    Data Mining

    Mental Health

    Network Analysis

    I worked under the supervision of Dr. A. B. M. Alim Al Islam on a cross-dataset mental health research project involving three distinct populations: medical students, quarantined individuals, and patients with psychiatric disorders. Our goal was to uncover demographic and symptomatic trends using statistical modeling, network analysis, and machine learning. We performed causal inference using propensity score matching, identified variable interdependencies via clique-based network modeling, and incorporated intersectional effects with Bayesian and Random Forest models. The study was published in Humanities and Social Sciences Communications (Nature Portfolio).

  5. Large-Language-Model-Enhanced IoT Threat Detection

    Nov 2024 – Present

    IoT Security

    LLMs (in progress)

    Explainable ML

    Working under the supervision of Dr. A. B. M. Alim Al Islam, I began by conducting a thorough exploratory analysis of the 13-million-row NF-ToN-IoT-v2 dataset, standardising feature types, encoding categorical fields, and experimenting with balanced-sampling schemes to address severe class imbalance. I trained and benchmarked classical models (Random Forest, LightGBM, and XGBoost) alongside 1-D CNN and bidirectional LSTM baselines; they achieved more than 98% overall F1 but revealed recall gaps on minority attack classes. Using SHAP and LIME, I built interactive dashboards that highlight how flow-level attributes such as dst_port, proto, and flow_duration influence predictions, laying the groundwork for transparent model auditing. The project is now integrating lightweight LLM-based embeddings as auxiliary features and refactoring the codebase into a dataset-agnostic pipeline so that NF-BoT-IoT-v2, NF-CSE-CIC-IDS2018-v2, and NF-UNSW-NB15 can be ingested with minimal overhead. Upcoming milestones include finalising an imbalance-aware training strategy, running cross-dataset evaluations, and drafting a results manuscript accompanied by a public notebook release.

Publications

Published / Accepted

  1. Abdullah Nayem Wasi Emran, A. B. M. Alim Al Islam, “Domain Adaptive Uplift Modeling across Heterogeneous Mental Health Cohorts,” iScience, 2026 (accepted, to appear).

  2. Abdullah Nayem Wasi Emran, Tanveer Rahman, “SCOPES: Stability-Aware Cross-Platform Feature Selection for Matched TCGA Gene Expression and RNA-Seq Data,” AAAI 2026 Workshop on AI to Accelerate Science and Engineering (AI2ASE), 2026.

  3. Abdullah Nayem Wasi Emran, A. B. M. Alim Al Islam, “Mental Health Across Contexts: A Cross-Dataset Study Covering Medical Students, Quarantined Individuals, and Psychiatric-Disordered Subjects,” Humanities and Social Sciences Communications (Nature Portfolio), 2025.

Under Review

  1. Abdullah Nayem Wasi Emran, Majisha Jahan Disha, Rezwana Reaz, “Fusing Exploration and Exploitation: Hybrid Fireworks–Whale Optimization for Multi-Objective Independent Task Scheduling in Cloud Environments,” under review at Cluster Computing, 2025.

Resume

Experience

  1. Lecturer (Full-time)

    2025–Present

    Department of Computer Science & Engineering (CSE), School of Data and Sciences (SDS), BRAC University, Dhaka, Bangladesh

    • Summer 2025: CSE 420 Compiler Design; CSE 421 Computer Networks (theory + lab)
    • Fall 2025: CSE 421 Computer Networks (theory); CSE 230 Discrete Mathematics (theory)

Education

  1. B.Sc. (Engg.), Computer Science & Engineering

    Bangladesh University of Engineering & Technology

    February 2020 – March 2025

    CGPA: 3.82 / 4.00
    Notable Courses: Introduction to Bioinformatics, Machine Learning, Artificial Intelligence, Simulation & Modelling, Data Structures and Algorithms, Software Engineering, Computer Security, Computer Networking, Operating System, Compiler, Computer Architecture, Microprocessor, Microcontroller & Embedded Systems

  2. Higher Secondary Certificate

    Chattogram Cantonment Public College

    2017 – 2019

    GPA: 5.00

  3. Secondary School Certificate

    Cantonment English School and College

    2015 – 2017

    GPA: 5.00

Highlighted Projects

  • Computer Graphics

    OpenGL

    C++

    Ray Tracing

    Computer Graphics & Ray-Tracing Pipeline

    [Code]

    Built four demos: an OpenGL billiard board with a bouncing ball, a magic cube that morphs from sphere to octahedron, a four-stage raster pipeline covering modeling, view, projection and Z-buffer clipping, and a ray tracer that renders Phong-lit scenes with recursive reflections.

  • SCOPES: per-gene Agilent vs RNA-Seq agreement (Run B subset)

    Bioinformatics

    Machine Learning

    Feature Selection

    SCOPES: Stability-Aware Cross-Platform Feature Selection for Matched TCGA Gene Expression and RNA-Seq Data

    [Code]

    Built a leak-free, multi-objective pipeline to learn small, stable gene signatures on Agilent microarrays that transfer to RNA-Seq without re-training. SCOPES uses an unsupervised MAD slab and NSGA-II to balance three goals: AUC (accuracy), Kuncheva stability, and cross-platform alignment (MMD). On matched TCGA-BRCA (505 tumor / 25 normal), the “alignment-first” solution (1 gene) achieved Agilent AUC ≈ 0.69 and RNA-Seq AUC ≈ 0.61 (ΔAUC ≈ −0.08), while a richer 30-gene set reached near-perfect source AUC but transferred poorly (RNA-Seq AUC ≈ 0.62, ΔAUC ≈ −0.38). The figure shows per-gene platform agreement: panels near the diagonal align well; diffuse panels reveal platform mismatch that inflates MMD and hurts transfer.

  • Lightweight Cancer Cell Classifier

    Medical Imaging

    Deep Learning

    EfficientNet

    Lightweight DL for Histopathology Cancer Cell Classification

    [Code]

    Designed EfficientNet-B0 variants that separately add SEBlock, MSFF or CBAM attention. Transfer-learned from PathMNIST and fine-tuned on CRC-VAL-HE-7K with aggressive augmentation; reached 98% accuracy and showed SEBlock to be the best accuracy-to-cost choice.

  • Mental Health Analysis

    Data Mining

    Mental Health

    Python

    Medical-Student Mental Health Analysis

    [Code]

    Examined survey responses from ~1000 medical students, modelling how gender, age and education relate to stress, anxiety and academic efficacy; statistical tests highlighted high-risk subgroups and key demographic predictors of poor mental health.

  • Compiler

    Flex

    Bison

    C++

    Toy Compiler

    [Code]

    Developed a compiler in C++ using Flex and Bison, featuring a lexer, LALR parser, symbol-table and semantic analysis, and generation of three-address intermediate code.

  • Computer Networks

    Networking

    Java

    NS-3

    Computer-Networks Lab Toolkit

    [Code]

    Implemented Java socket programs for host communication, built NS-3 simulations of diverse network topologies and ran TCP Adaptive Reno experiments to explore congestion-control behaviour.

  • RecipeShare

    Web Dev

    Next.js

    Django

    RecipeShare Platform

    [Code]

    Built a recipe-sharing site with a Next.js front-end and Django back-end that supports recipe upload and editing, blog posts, recipe requests, meal planning, fast search and generation of recipes from ingredient photos.
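For reference, the Kuncheva stability objective named in the SCOPES project above can be sketched in a few lines. This is a minimal illustration of the standard Kuncheva consistency index averaged over pairs of selection runs, not the SCOPES implementation itself; the function names, the gene labels (G1, G2, ...) and the feature count of 10 are hypothetical and chosen only for the example.

```python
from itertools import combinations


def kuncheva_index(a, b, n_features):
    """Kuncheva consistency between two equal-size feature subsets."""
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        raise ValueError("subsets must have the same size")
    if not 0 < k < n_features:
        raise ValueError("subset size must satisfy 0 < k < n_features")
    r = len(a & b)  # features selected in both runs
    # Observed overlap r, corrected for the overlap expected by chance (k^2 / n).
    return (r * n_features - k * k) / (k * (n_features - k))


def kuncheva_stability(subsets, n_features):
    """Mean pairwise Kuncheva index over all selection runs."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(kuncheva_index(a, b, n_features) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical gene signatures from three resampled selection runs.
    runs = [{"G1", "G2", "G3"}, {"G1", "G2", "G4"}, {"G1", "G2", "G3"}]
    print(round(kuncheva_stability(runs, n_features=10), 3))
```

A value of 1 means every run selected the same features, values near 0 mean the overlap is no better than chance, and negative values mean less overlap than chance would give.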

Skills

  • Languages

    Python, C / C++, JavaScript, Java, Bash, SQL

  • Tools & Frameworks

    NS-3, Autopsy, Next.js, Oracle DBMS, Git, Django, LaTeX

  • Libraries

    OpenGL, scikit-learn, Pandas, Matplotlib, Seaborn

  • Software

    Navicat, Wireshark, MS Office, Adobe Photoshop