Job title: LLM Inference Engineer (SR)
Job type: Permanent
Emp type: Full-time
Industry: IT & Telecommunications
Functional Expertise: Technical (IT)
Salary: Negotiable
Location: Sumitomo Fudosan Shinjuku Oak Tower 6F, 6-8-1 Nishi-Shinjuku, Shinjuku-ku, Tokyo 163-6006
Job published: 2026-03-30
Job ID: 70843

Job Description

Career Opportunity for an LLM Inference Engineer in Japan!

 

■ LLM Inference Engineer

 

■ Company Overview

A Japan-based AI technology company developing advanced large-scale language models and next-generation AI platforms, with a focus on delivering innovative AI-driven products and competing with leading global AI players.

 

■ Your Role and Responsibilities

● Build systems that translate the value of large-scale language models into real user-facing applications

● Design and manage development processes to improve engineering productivity

● Develop and operate large-scale distributed systems requiring high scalability and high availability

● Build and maintain infrastructure for machine learning model inference and online serving

● Optimize system performance for large-scale AI workloads

● Collaborate with engineering teams to design reliable and scalable AI platforms

● Contribute to continuous improvement of system architecture, monitoring, and operational processes

 

■ Experience and Qualifications

● 5+ years of experience in software engineering, machine learning infrastructure, or related fields

● Experience developing or operating large-scale distributed systems with high scalability and availability

● Strong system design and problem-solving skills

● Experience building or operating production systems

● Strong commitment to building high-quality and scalable systems

 

■ Additional Preferred Qualifications

● Experience designing systems running on on-premises or cloud GPU clusters

● Experience building high-availability systems across multiple data centers or regions

● Experience with distributed databases or large-scale search engines

● Experience designing online serving infrastructure for machine learning models

● Knowledge of inference optimization and performance acceleration for ML models

● Experience using LLM inference frameworks such as vLLM, SGLang, or TensorRT-LLM

● Experience designing monitoring and observability systems for distributed infrastructure

● Experience contributing to open-source software, publishing technical papers, or participating in technical communities

 

■ Good Reasons to Join

● Opportunity to work on advanced AI technologies and large-scale machine learning systems

 

■ Work Location

Tokyo, Japan
 

Details will be provided during the meeting.