Junior Research Scientist-speech output

Updated 09/04

Job Details

CerenceAI China RD is seeking a Research Engineer to design and implement next-generation text-to-speech systems and applications. In this role, working with team members around the globe, you will work on both the frontend (grapheme-to-phoneme conversion, text normalization, phrasing and prosodic control, etc.) and backend (acoustic modeling, neural vocoding) components of engine pipelines and end-to-end systems for major languages and dialects. Your work will directly impact CerenceAI products ranging from voice assistants to accessibility tools, delivering natural, expressive, and multilingual speech synthesis.

Job Description

Representative responsibilities/duties include, but are not limited to:
• Design and optimize text/NLP preprocessing pipelines with deep learning or machine learning methods, including grapheme-to-phoneme (G2P) conversion for multilingual support, text normalization, polyphone disambiguation, and prosody prediction and control
• Integrate language models (e.g., BERT, GPT variants) to improve contextual and semantic understanding for natural intonation
• Develop rule-based and neural solutions for emotion/style control in synthesized speech
• Build state-of-the-art acoustic models (e.g., Tacotron, FastSpeech, VITS) to map linguistic features to spectrograms or waveform parameters
• Optimize neural vocoders (e.g., WaveNet, HiFi-GAN, MelGAN, LPCNet) for high-fidelity, real-time speech synthesis
• Optimize inference latency for both edge devices and cloud platforms
• Enhance robustness through noise suppression, speaker adaptation, and multilingual/cross-language/cross-gender voice cloning

Knowledge, skills, and qualifications

Education: Master's degree in CS, AI, EE, Math, or a related field.

Required/preferred skills:
• Hands-on experience in speech generative system development, with deep expertise in both frontend and backend components
• Proficiency in C/C++ and Python, with mastery of ML frameworks (PyTorch, TensorFlow, etc.)
• Some background in NLP techniques and/or speech signal processing is welcome
• Knowledge of transformer-based language models for prosody prediction
• Basic understanding of autoregressive / non-autoregressive acoustic models and neural vocoders
• Experience in optimizing models via quantization, pruning, or knowledge distillation
• Experience with ONNX Runtime, TensorRT, TorchScript, etc.
• Experience with zero-shot/one-shot/few-shot voice cloning or emotional TTS systems
• Skilled user of GPU/TPU clusters and grids
• Fluent English is a must-have

Compensation

Monthly salary: NT$90,000~120,000

(Fixed or variable pay depends on individual qualifications or performance)

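Employment Type

Full-time

Work Location

8F, No. 367, Fuxing N. Rd., Songshan Dist., Taipei City (about 120 m from MRT Zhongshan Junior High School Station)

Remote Work

Partially remote

Management Responsibility

No management responsibility

Business Travel / Overseas Assignment

No business travel or overseas assignment required

Working Hours

Day shift

Leave Policy

Two days off per week

Available Start Date

No restriction

Number of Openings

No limit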

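Requirements

Work Experience

1+ years

Education

Master's degree

Field of Study

Computer science and information engineering related, other mathematics and computer science related, communications

Language Requirements

Not specified

Tools

Other Requirements

Not provided

All job seekers are welcome, including recent graduates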

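Benefits

Weekends off; statutory annual leave / additional annual leave; group insurance; annual health checkup reimbursement; EAP services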

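Contact Information

Contact Person

Nora

Phone

133-26015425

Application Response

Suitable candidates will be contacted within 15 working days; unsuccessful candidates will not be notified separately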