We are seeking an experienced Data Engineer to join our innovative drug discovery team. You will consolidate tens of thousands of chemical and experimental data records into a data pool, and support the design and preparation of datasets and databases, including data collection, cleaning, normalization, and export. You will also create an automated data update pipeline and develop visualization tools. As a team member, you will work with cross-functional teams to integrate AI technologies into our drug discovery processes, and mentor engineers by providing technical guidance and support.

What You'll Do:
o Build large-scale drug and experiment-related databases: extract, transform, clean, normalize, and integrate data from different source databases.
o Support the team leader in creating training datasets for model development.
o Collaborate with other teams to develop database visualization tools and create related APIs for AI engineers.
o Perform statistical analyses such as outlier detection and clustering, drawing on a background in medicinal chemistry.
o Establish pipelines for automated data updates.

Minimum Qualifications:
o A background in biology or chemistry.
o Proficiency in Python.
o Experience handling and organizing data from related databases such as ChEMBL and PubChem.
o Experience building RESTful APIs with FastAPI or a similar framework.
o Experience building PyTorch datasets and dataloaders.
o Familiarity with various chemical notations and the ability to understand the theory and methods behind different biochemical experiments.
o Experience with multithreading and with handling billion-scale data.
o Strong problem-solving skills and the ability to collaborate and work as part of a team.

Nice To Have:
o Experience handling different types of drug data (small molecule, antibody, …).
o Familiarity with 3D structural data.
o Experience building DL/RL-related models or algorithms.
o Good debugging and performance profiling skills for Python.
Salary: negotiable
(regular salary of NT$40,000 or more)
Not specified