Job title: LLM Inference Engineer (SR)
Job type: Permanent
Emp type: Full-time
Industry: IT & Telecommunications
Functional Expertise: Technical (IT)
Salary: Negotiable
Location: Sumitomo Fudosan Shinjuku Oak Tower 6F, 6-8-1 Nishi-Shinjuku, Shinjuku-ku, Tokyo 163-6006
Job published: 2026-03-30
Job ID: 70843

Job Description

Career Opportunity for an LLM Inference Engineer in Japan!

 

■ LLM Inference Engineer

 

■ Company Overview

A Japan-based AI technology company developing advanced large-scale language models and next-generation AI platforms, with a focus on delivering innovative AI-driven products and competing with leading global AI players.

 

■ Your Role and Responsibilities

● Build systems that translate the value of large-scale language models into real user-facing applications

● Design and manage development processes to improve engineering productivity

● Develop and operate large-scale distributed systems requiring high scalability and high availability

● Build and maintain infrastructure for machine learning model inference and online serving

● Optimize system performance for large-scale AI workloads

● Collaborate with engineering teams to design reliable and scalable AI platforms

● Contribute to continuous improvement of system architecture, monitoring, and operational processes

 

■ Experience and Qualifications

● 5+ years of experience in software engineering, machine learning infrastructure, or related fields

● Experience developing or operating large-scale distributed systems with high scalability and availability

● Strong system design and problem-solving skills

● Experience building or operating production systems

● Strong commitment to building high-quality and scalable systems

 

■ Additional Preferred Qualifications

● Experience designing systems running on on-premises or cloud GPU clusters

● Experience building high-availability systems across multiple data centers or regions

● Experience with distributed databases or large-scale search engines

● Experience designing online serving infrastructure for machine learning models

● Knowledge of inference optimization and performance acceleration for ML models

● Experience using LLM inference frameworks such as vLLM, SGLang, or TensorRT-LLM

● Experience designing monitoring and observability systems for distributed infrastructure

● Experience contributing to open-source software, publishing technical papers, or participating in technical communities

 

■ Good Reasons to Join

● Opportunity to work on advanced AI technologies and large-scale machine learning systems

 

■ Work Location

Tokyo, Japan
 

Details will be provided during the meeting.