Job title: LLM Inference Engineer (SR)
Job type: Permanent
Emp type: Full-time
Industry: IT & Telecommunications
Functional Expertise: Technical (IT)
Salary: Negotiable
Location: 6F Sumitomo Fudosan Shinjuku Oak Tower, 6-8-1 Nishi-Shinjuku, Shinjuku-ku, Tokyo 163-6006
Job published: 2026-03-30
Job ID: 70843

Job Description

Career Opportunity for an LLM Inference Engineer in Japan!

 

■ LLM Inference Engineer

 

■ Company Overview

A Japan-based AI technology company developing advanced large language models and next-generation AI platforms, focused on delivering innovative AI-driven products and competing with leading global AI players.

 

■ Your Role and Responsibilities 

● Build systems that deliver the value of large language models in real user-facing applications

● Design and manage development processes to improve engineering productivity

● Develop and operate large-scale distributed systems requiring high scalability and high availability

● Build and maintain infrastructure for machine learning model inference and online serving

● Optimize system performance for large-scale AI workloads

● Collaborate with engineering teams to design reliable and scalable AI platforms

● Contribute to continuous improvement of system architecture, monitoring, and operational processes

 

■ Experience and Qualifications

● 5+ years of experience in software engineering, machine learning infrastructure, or related fields

● Experience developing or operating large-scale distributed systems with high scalability and availability

● Strong system design and problem-solving skills

● Experience building or operating production systems

● Strong commitment to building high-quality and scalable systems

 

■ Additional Preferred Qualifications

● Experience designing systems running on on-premises or cloud GPU clusters

● Experience building high-availability systems across multiple data centers or regions

● Experience with distributed databases or large-scale search engines

● Experience designing online serving infrastructure for machine learning models

● Knowledge of inference optimization and performance acceleration for ML models

● Experience using LLM inference frameworks such as vLLM, SGLang, or TensorRT-LLM

● Experience designing monitoring and observability systems for distributed infrastructure

● Experience contributing to open-source software, publishing technical papers, or participating in technical communities

 

■ Good Reasons to Join

● Opportunity to work on advanced AI technologies and large-scale machine learning systems

 

■ Work Location

Tokyo, Japan
 

Further details will be provided during the interview.

File types (doc, docx, pdf, rtf, png, jpeg, jpg, bmp, jng, ppt, pptx, csv, gif) size up to 5MB