Job title: LLM Inference Engineer (SR)
Job type: Permanent
Emp type: Full-time
Industry: IT & Telecommunications
Functional Expertise: Technical (IT)
Salary: Negotiable
Location: 6F Sumitomo Fudosan Shinjuku Oak Tower, 6-8-1 Nishi-Shinjuku, Shinjuku-ku, Tokyo 163-6006
Job published: 2026-03-30
Job ID: 70843

Job Description

Career Opportunity for an LLM Inference Engineer in Japan!

 

■ LLM Inference Engineer

 

■ Company Overview

A Japan-based AI technology company developing advanced large language models and next-generation AI platforms, focused on delivering innovative AI-driven products and competing with leading global AI players.

 

■ Your Role and Responsibilities 

● Build systems that deliver the value of large language models in real user-facing applications

● Design and manage development processes to improve engineering productivity

● Develop and operate large-scale distributed systems requiring high scalability and high availability

● Build and maintain infrastructure for machine learning model inference and online serving

● Optimize system performance for large-scale AI workloads

● Collaborate with engineering teams to design reliable and scalable AI platforms

● Contribute to continuous improvement of system architecture, monitoring, and operational processes

 

■ Experience and Qualifications

● 5+ years of experience in software engineering, machine learning infrastructure, or related fields

● Experience developing or operating large-scale distributed systems with high scalability and availability

● Strong system design and problem-solving skills

● Experience building or operating production systems

● Strong commitment to building high-quality and scalable systems

 

■ Additional Preferred Qualifications

● Experience designing systems running on on-premises or cloud GPU clusters

● Experience building high-availability systems across multiple data centers or regions

● Experience with distributed databases or large-scale search engines

● Experience designing online serving infrastructure for machine learning models

● Knowledge of inference optimization and performance acceleration for ML models

● Experience using LLM inference frameworks such as vLLM, SGLang, or TensorRT-LLM

● Experience designing monitoring and observability systems for distributed infrastructure

● Experience contributing to open-source software, publishing technical papers, or participating in technical communities

 

■ Good Reasons to Join

● Opportunity to work on advanced AI technologies and large-scale machine learning systems

 

■ Work Location

Tokyo, Japan
 

Further details will be provided during the interview.

File types (doc, docx, pdf, rtf, png, jpeg, jpg, bmp, jng, ppt, pptx, csv, gif) size up to 5MB