About the Team
The Spatial Intelligence CV group builds production-grade computer vision systems that power multimodal and spatially-aware products, from property imagery understanding to 3D scene analysis. We focus on delivering reliable, high-performance services, turning state-of-the-art vision models into scalable, user-facing products.

About the Role
You will design, optimize, and scale computer vision inference software running on both on-prem and cloud GPU clusters. Beyond writing production-quality Python, you will integrate advanced vision models into end-to-end pipelines that deliver real business value in domains such as architecture, real estate, and interior design. Your work will directly improve system accuracy, throughput, and cost efficiency.

In this role, you will
▌ Build reusable Python inference pipelines for object detection, segmentation, and image generation at scale.
▌ Implement and optimize training/inference flows for YOLO, U-Net, Segment Anything Model (SAM), and Stable Diffusion.
▌ Fine-tune diffusion and LoRA-based models for domain-specific generation and augmentation.
▌ Develop and integrate solutions in Novel View Synthesis, 6D Pose Estimation, Monocular Depth Estimation, Visual Question Answering (VQA), and Style Transfer.
▌ Build Scene Graph pipelines to support space design, dimension measurement, and spatial understanding.
▌ Conduct performance profiling and latency/throughput tuning for CV back-ends on GPU clusters.
▌ Collaborate with product teams to deliver features such as automated property feature extraction, content tagging, and high-quality image synthesis.

You might thrive in this role if you
▌ Enjoy turning cutting-edge CV and spatial reasoning research into robust, scalable production services.
▌ Have experience across multiple vision tasks and can quickly adapt to new architectures.
▌ Are motivated by measurable impact: accuracy gains, inference speedups, or improved user experience.
▌ Communicate clearly with cross-functional stakeholders in Mandarin.

Minimum Qualifications
▌ Master’s degree (or higher) in CS, EE, Applied Math, or related field, or equivalent practical experience.
▌ One or more academic or professional projects related to computer vision.
▌ Hands-on experience with OpenCV, PyTorch, Git, Docker.
▌ Practical experience with YOLO, U-Net, Segment Anything, and Stable Diffusion.
▌ Familiarity with at least two of the following: Novel View Synthesis, 6D Pose Estimation, Monocular Depth Estimation, VQA, Style Transfer.
▌ Understanding of Scene Graph concepts and their application in spatial reasoning tasks.

Bonus Qualifications
▌ Experience in classification, segmentation, detection across various datasets.
▌ Familiarity with ControlNet or other controllable generation techniques.
▌ Publications in top-tier CV/AI conferences (CVPR, ICCV, ECCV, NeurIPS, ICML).
▌ Experience operating large-scale CV inference services with measurable performance metrics.

Annual salary: NT$1,400,000 to NT$2,400,000
(Fixed and variable pay vary with individual qualifications and performance.)
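
【Statutory Benefits】 Two-day weekend, labor insurance, national health insurance, labor pension contributions, annual leave exceeding Labor Standards Act requirements
【Other Benefits】 Coffee bar, snack area, gym
▌ HOMEE.AI puts people first, and every employee is an important member of our family. We are committed to benefits and a promotion system that exceed the requirements of the Labor Standards Act, so that our colleagues and the company can grow quickly together in a cheerful working atmosphere.
▌ Flexible working hours. We welcome self-disciplined, outstanding talent.
▌ Leave policies and benefits that exceed the Labor Standards Act.
▌ A well-developed promotion and salary adjustment system with clear performance reviews, supporting career development and growth.
▌ The gym, lounge area, and yoga room attached to the company building are available on request.
▌ A cheerful company culture and environment: a young, energetic team, a good working atmosphere, transparent internal communication, and discussions that stick to the facts.
▌ A snack area with a varied selection, so you can eat happily and healthily.
▌ The pantry provides coffee freshly ground from fresh beans, as well as tea.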