Job Title: Data Engineer (SR)
Job Type: Permanent
Employment Type: Full-time
Industry: IT & Telecommunications
Specialization: Technical (IT)
Salary: Negotiable
Posted: 2026-01-28
Job ID: 69110

Job Description

Career Opportunity for a Data Engineer in Japan!

 

■ Data Engineer

 

■ Company Overview

A research-driven organization at the intersection of AI and robotics, focused on advancing next-generation intelligent systems through cutting-edge research and real-world applications.

 

■ Your Role and Responsibilities 

● Design and implement large-scale data pipelines that cover the full lifecycle of high-quality datasets for robotics foundation models: collection, processing, curation, and publishing.

● Design, build, and maintain data schemas, storage solutions, and query interfaces to enable VLA researchers to efficiently discover, query, and consume curated datasets. 

● Collaborate closely with VLA researchers to capture evolving data requirements and continuously improve data pipelines through analysis and experimentation.

● Design and scale distributed data-processing pipelines capable of handling petabyte-scale multimodal datasets (e.g., RGB/Depth, point clouds) with full lineage and reproducibility. 

● Define data-quality metrics and build feedback loops to continuously monitor and improve data quality.

 

■ Experience and Qualifications

● Master’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).

● 5+ years of professional experience in data engineering or data platform development.

● Experience designing and operating large-scale ETL/ELT pipelines using Spark, Flink, Ray, or a similar distributed engine.

● Experience designing or leading implementations using Delta Lake, Apache Iceberg, or Hudi.

● Experience integrating with Trino, Athena, Databricks SQL, or Glue/Unity Catalog.

● Experience defining schema evolution, ACID compliance, partitioning strategy, time travel, and cost-performance optimization.

 

■ Additional Preferred Qualifications

● Experience working with terabyte- or petabyte-scale datasets.

● Expertise in data lake table formats such as Apache Iceberg or Delta Lake, together with query engines such as Trino and catalog systems such as Nessie.

● Expertise in distributed processing frameworks such as Spark, Flink, or Ray.

 

■ Good Reasons to Join

● Be part of a research-first environment.

 

■ Work Location

Tokyo, Japan
 

Details will be provided during the meeting.