We are Chen Jingtong, Deng Zifeng, Li Haoran, and Shen Boyu, Class of 2024 students in the Digital Technology and Computer Science programs at Hainan Bielefeld University of Applied Sciences. In the laboratories of the Shanghai Jiao Tong University School of Medicine, we embarked on an extraordinary voyage across the ocean of life sciences, using code as our vessel and data as our sail. This internship was not only a journey of technical advancement but also a profound transformation from “writing code” to “understanding life.”
Chen Jingtong, Class of 2024, Digital Technology
My internship focused on PIP-seq single-cell transcriptomics technology. The core task was to analyze the differentiation trajectory of human pluripotent stem cells (D0) into mesenchymal cells (D7) and to systematically compare this technology with the industry gold standard, 10x Genomics. Working in a Linux environment, I became proficient in the PIPseeker upstream processing pipeline and independently completed the full analysis workflow from raw FASTQ data to the expression matrix. Faced with the complexity of the D7 samples, I precisely adjusted the force-cells parameter and applied a filtering condition of nFeature < 9000, successfully retaining high-quality cell populations and laying a solid foundation for subsequent analysis.
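As a rough illustration of the nFeature filter described above, here is a minimal Python sketch. It is not the PIPseeker or Seurat code used in the internship: the function names, the dictionary layout, and the toy barcodes are hypothetical, and it assumes that nFeature means the number of genes with at least one count in a cell.

```python
def n_features(gene_counts):
    """Number of genes with at least one count in a single cell."""
    return sum(1 for count in gene_counts.values() if count > 0)


def filter_cells(counts_by_cell, max_features=9000):
    """Keep only cells whose detected-gene count is below max_features.

    counts_by_cell maps a cell barcode to a {gene: count} dictionary
    (a hypothetical layout chosen for this sketch).
    """
    return {
        barcode: gene_counts
        for barcode, gene_counts in counts_by_cell.items()
        if n_features(gene_counts) < max_features
    }


# Toy data with a small threshold so the effect is visible.
toy = {
    "cell_a": {"POU5F1": 4, "COL1A2": 0, "CGA": 1},
    "cell_b": {"POU5F1": 2, "COL1A2": 7, "CGA": 3},
}
kept = filter_cells(toy, max_features=3)
```

With the toy threshold of 3, only `cell_a` (two detected genes) is retained; in the real analysis the cutoff was 9000.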
During the Seurat analysis in R, I reconstructed the developmental trajectory from POU5F1+ stem cells to COL1A2+ mesenchymal cells. I also unexpectedly discovered a distinct CGA+ secretory subpopulation, which offers a new perspective on the diversity of cell differentiation. In the cross-platform evaluation, using positive/negative mapping and CCA/Harmony data integration, I confirmed that PIP-seq performs comparably to 10x Genomics in gene detection sensitivity and cell clustering ability and can capture the same biological subpopulations. This conclusion provides critical evidence for the clinical translation of this technology.
This experience deeply impressed upon me that bioinformatics is far from simply running code. It requires integrating biological context, such as EMT transitions and gene alias corrections, to interpret the principles of life hidden within the data.
Deng Zifeng, Class of 2024, Digital Technology
I am Deng Zifeng. My internship revolved around the bioinformatics analysis of replication timing (Repli-seq). The work involved grouping the data into four region categories (CE, CL, EtoL, and LtoE) and then validating differences in chromatin accessibility and methylation by combining ATAC-seq and WGBS data. To ensure reliable results, I repeatedly checked that the groups were mutually exclusive, that no regions were duplicated, and that chromosome naming was consistent across data sources. Rigor became a fundamental principle on my research path.
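The consistency checks mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scripts used in the lab: regions are represented as hypothetical `(chrom, start, end)` tuples on a shared bin grid, exclusivity is tested by exact coordinates rather than by interval overlap, and all function names are made up for this example.

```python
from collections import Counter
from itertools import combinations


def duplicated_regions(regions):
    """Return regions that appear more than once in a single group."""
    counts = Counter(regions)
    return sorted(region for region, n in counts.items() if n > 1)


def shared_regions(groups):
    """Return regions that appear in more than one group.

    groups maps a category name (e.g. "CE") to a list of regions.
    An empty result means the groups are mutually exclusive.
    """
    shared = {}
    for a, b in combinations(sorted(groups), 2):
        common = set(groups[a]) & set(groups[b])
        if common:
            shared[(a, b)] = sorted(common)
    return shared


def chrom_styles(regions):
    """Report which chromosome naming styles occur ("chr1" vs "1")."""
    return {
        "chr-prefixed" if chrom.startswith("chr") else "bare"
        for chrom, _start, _end in regions
    }


# Toy example: one region is shared, one is duplicated.
groups = {
    "CE": [("chr1", 0, 50000), ("chr1", 50000, 100000)],
    "CL": [("chr2", 0, 50000)],
    "EtoL": [("chr1", 50000, 100000)],
    "LtoE": [("chr3", 0, 50000), ("chr3", 0, 50000)],
}
problems = shared_regions(groups)
```

If `chrom_styles` returns more than one style when applied to regions pooled from the Repli-seq, ATAC-seq, and WGBS files, the chromosome names need to be harmonized before the files are intersected.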
The analysis produced an unexpected result. CE regions exhibited higher ATAC signals in both stages, consistent with the classic understanding that “early-replicating regions are more open.” However, the EtoL and LtoE regions did not show the expected significant reversal between the D0 and D7 stages, which differed from conclusions drawn in classical differentiation systems. Faced with this “imperfect” result, I did not rush to a conclusion. Instead, I dug into the sample background and the underlying biological processes. After screening for differentially expressed genes with bulk RNA-seq and re-plotting the data, I ultimately observed the significant reversal.
This experience taught me that bioinformatics analysis is not a numbers game chasing “beautiful results” but a scientific process of continuously testing hypotheses, checking details, and using logic to clarify conclusions. Starting from zero knowledge in biology, I underwent a transformation from a technical operator to a scientific thinker through daily literature reading and consultations with senior lab members.
Li Haoran, Class of 2024, Digital Technology
I am Li Haoran. My research focused on the long-term repopulation capacity of the CD44+ALDH+ population within human keratinocytes. By mining single-cell sequencing data, I analyzed the synergistic role of CD44 and ALDH in maintaining stem cell homeostasis and differentiation potential. When I saw this population's specific gene expression signature emerge from complex heatmaps, along with its strong enrichment in classical stem cell pathways, the sense of accomplishment from lifting the veil on the rules of life made me truly appreciate the allure of scientific research.
The rigorous and pure research atmosphere at the Shanghai Jiao Tong University School of Medicine deeply inspired me. Here, I witnessed bioinformatics becoming a crucial bridge connecting wet-lab and dry-lab experiments and interpreting the mysteries of life. I gradually understood that outstanding bioinformatics researchers require not only solid algorithmic skills but also profound biological knowledge to accurately identify biologically significant signals within vast amounts of data. This internship enabled me to leap from “theoretical learning” to “research practice.” Faced with the highly promising CD44+ALDH+ cell population, I not only mastered data processing logic but also honed patience and critical thinking when dealing with experimental variability. This reverence for life and dedication to technology will drive me to continue delving deeper into the field of biomedical informatics.
Shen Boyu, Class of 2024, Computer Science
I am Shen Boyu. My internship involved weighted gene co-expression network analysis (WGCNA) and joint time-series analysis of ATAC-seq and RNA-seq data. In research on eyelid aging, starting from initial count matrices and RPKM matrices containing over ten thousand genes, I filtered down to 3,000 genes that showed greater inter-group than intra-group variation, reasonable expression levels, and consistent trends. Using RStudio to create visualizations such as soft-threshold selection plots and merged module dendrograms, I screened for key genes within core modules, determined their functional localization through GO enrichment analysis, and ultimately identified the core gene groups regulating eyelid aging.
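The gene pre-filtering step described above can be approximated as follows. This is a minimal sketch under assumptions, not the actual R workflow: the `min_mean` cutoff, the variance-based scoring, and the function names are all hypothetical, and the "consistent trends" criterion is not implemented here.

```python
from statistics import mean, pvariance


def between_within(values_by_group):
    """Return (variance of group means, mean within-group variance)."""
    group_means = [mean(values) for values in values_by_group.values()]
    between = pvariance(group_means)
    within = mean(pvariance(values) for values in values_by_group.values())
    return between, within


def select_genes(expr, min_mean=1.0, top_n=3000):
    """Pick genes that vary more between groups than within groups.

    expr maps gene -> {group name -> list of replicate expression values}.
    Genes with a mean expression below min_mean are dropped first.
    """
    scored = []
    for gene, groups in expr.items():
        all_values = [x for values in groups.values() for x in values]
        if mean(all_values) < min_mean:
            continue  # expression too low to be informative
        between, within = between_within(groups)
        if between > within:
            scored.append((between - within, gene))
    scored.sort(reverse=True)
    return [gene for _score, gene in scored[:top_n]]


# Toy data: only gene_a separates the two groups cleanly.
expr = {
    "gene_a": {"young": [1.0, 1.2], "old": [5.0, 5.2]},
    "gene_b": {"young": [3.0, 7.0], "old": [4.0, 6.0]},
    "gene_c": {"young": [0.1, 0.1], "old": [0.5, 0.5]},
}
selected = select_genes(expr)
```

Here `gene_b` is dropped because its replicates vary more than its group means, and `gene_c` is dropped for low expression, leaving `gene_a` as the only candidate for the downstream network analysis.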
In the multi-omics joint analysis of mouse samples, I performed GO enrichment analysis on pairwise comparisons of the ATAC-seq data, completed gene annotation using the GREAT web tool, and conducted joint time-series analysis of ATAC-seq and RNA-seq data using the Mfuzz package. Confronted with complex multi-omics data, I overcame challenges in data filtering and visualization and came to appreciate the unique value of bioinformatics in aging research.
This challenging internship experience enabled the deep integration of classroom knowledge with actual scientific research. It also made me realize that bioinformatics analysis is not merely a stacking of technologies but a vital link connecting genotype and phenotype and revealing life processes.