Job Title: Research Engineer / Research Scientist (Speech, Audio, and Music Foundation Models)
Job Type: Permanent
Employment Type: Full-time
Industry: Consulting
Specialization: Consulting
Salary: Negotiable
Location: Tokyo
Posted: 2026-06-26
Job ID: 73461

Job Description

Career Opportunity: Research Engineer / Scientist (Speech, Audio, and Music Foundation Models) in Tokyo, Japan

Note: This position is open to both Japan-based and overseas candidates. No Japanese language proficiency is required; business-level English is sufficient.

■ Research Engineer / Scientist

■ Company Overview

Join a cutting-edge AI research team dedicated to advancing Speech, Audio, and Music Foundation Models. This role offers the opportunity to conduct world-class research using one of Japan's largest AI computing infrastructures while developing next-generation speech language models with real-world impact.

You will work alongside leading researchers to push the boundaries of Speech AI, Large Language Models (LLMs), Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Speech Translation, and Spoken Dialogue Systems.

■ Your Role and Responsibilities 

●Research and develop state-of-the-art Speech Language Models.

●Design and improve technologies for Automatic Speech Recognition (ASR), Speech Synthesis (TTS), Speech Translation, and Spoken Dialogue Systems.

●Build, generate, and preprocess large-scale speech datasets.

●Develop novel training methods combining linguistic knowledge with speech processing.

●Design benchmarks to evaluate Speech Foundation Models.

●Publish research findings at top-tier academic conferences and in journals, and file patent applications.

●Collaborate with multidisciplinary AI research teams on large-scale model development.

 

■ Experience and Qualifications

●Experience developing machine learning models in one or more of the following areas: Speech Recognition, Speech Synthesis, Speech Translation, or Spoken Dialogue Systems.

●Strong programming skills in Python.

●Hands-on experience with PyTorch or similar deep learning frameworks.

●Experience using Git/GitHub for collaborative software development.

●Degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field (Master's degree preferred).

●Strong problem-solving skills and the ability to drive large-scale AI training initiatives.

 

Preferred:

●Ph.D. in Computer Science, AI, Speech Processing, or a related field.

●Publications at leading AI conferences such as ICASSP, INTERSPEECH, ACL, and EMNLP.

●Experience with distributed model training.

●Business-level Japanese and English communication skills.

 

■ Good Reasons to Join

●Work on next-generation Speech Foundation Models with real-world impact.

●Annual salary: ¥6.5M – ¥18M

●Performance incentives may be provided separately.

●Salary is determined based on experience and qualifications.

●Access one of Japan's largest AI computing infrastructures.

●Collaborate with internationally recognized AI researchers.

●Publish research at leading international conferences.

●Contribute to technologies used by millions of users worldwide.

●Flexible work environment with strong support for research and innovation.

 

■ Work Location

Tokyo, Japan


If you're passionate about Speech AI, Audio AI, Music Foundation Models, Deep Learning, and Generative AI research, and want to work on cutting-edge technologies with significant real-world impact, we'd love to hear from you.

 

📩 Send your updated CV to kanika.pal@talisman-corporation.com

Details will be shared during the meeting.
