Role Summary

*** This position is open only to experienced engineers with a background in AI-based DNA analysis. ***

We are seeking an AI researcher with a deep technical background in model architecture design and a passion for open-source genomics. This role focuses on building and maintaining state-of-the-art machine learning models designed specifically for DNA sequence understanding, including language models, encoders, and predictive sequence transformers. Your work will shape the future of programmable biology, genome interpretation, and foundation models for genomics.

Key Responsibilities
* Design, fine-tune, and benchmark deep learning architectures (e.g., CNNs, Transformers, LSTMs, diffusion models) for DNA sequence tasks such as promoter prediction, enhancer classification, mutation impact scoring, and epigenomic signal inference.
* Lead the development of open-source genomic AI models (e.g., BERT-style models for DNA, Enformer-like architectures, foundation models for genomics).
* Optimize models for long sequences (10k–100k bp) using memory-efficient attention, sparse encoding, and segment-based learning.
* Contribute to and maintain public GitHub repositories of models, datasets, and benchmarks for the genomics AI community.
* Integrate experimental annotations (e.g., ATAC-seq, ChIP-seq) as conditioning signals in multi-modal architectures.
* Collaborate with academic and industry partners on pretraining strategies over large-scale human and non-human genomes.
* Drive internal model interpretability efforts using saliency, attention attribution, and motif visualization tools.

Required Qualifications
* Experience in machine learning, bioinformatics, computational biology, or a related field.
* Demonstrated expertise in deep learning architectures, especially sequence modeling (e.g., BERT, GPT, Performer, Perceiver IO, Hyena, RWKV).
* Strong proficiency in PyTorch or JAX/Flax, with hands-on experience building custom model layers.
* Experience with DNA/RNA sequence data, k-mer tokenization, reverse-complement invariance, and genomic file formats (e.g., FASTA, BED).
* Strong open-source track record (e.g., GitHub projects, community tools, published preprints or code).

Preferred Skills
* Contributions to tools such as DNABERT, Enformer, AlphaFold, Genome Transformer, or related models.
* Experience training large models on genome-scale data (e.g., full reference genomes, species pangenomes).
* Familiarity with distributed training, mixed precision, and hardware optimization.
* Knowledge of bioethical concerns and data governance in genomic datasets.

Salary negotiable
(regular monthly salary of NT$40,000 or more)
Not specified
Benefits exceeding the requirements of the Labor Standards Act,