Call for Interns

OMRON SINIC X (OSX) is looking for research interns throughout the year to work with our members on challenging research projects spanning robotics, machine learning, computer vision, and HCI. Many students have participated in our internship program, and their achievements have been published as academic papers at international conferences such as CVPR, ICML, IJCAI, ICRA, and CoRL, or released as OSS libraries. For more information about our activities at OSX, please visit Medium and GitHub.


Constructing a Dataset for Understanding Procedural Videos in Cell Biology Experiments Using Vision and Language Technologies

Our focus in this study is to construct a novel dataset targeting procedural videos captured within clean benches during cell biology experiments. We aim to submit this dataset paper to venues like the NeurIPS Dataset Track. Preference will be given to individuals who have access to environments where they can independently capture such videos.

Required skills and experience
- Programming skills in Python or similar, for statistical analysis of datasets and video analysis using VLLM
Preferred skills and experience
- Knowledge/experience in cell biology experiments
- Experience in writing papers in related fields

Machine learning
Computer vision
Interaction
Natural language processing
Development
Dataset Paper

Research on Vision-Language Models Sensitive to State Changes in Videos

This research focuses on developing Vision-Language Models capable of detailed understanding of object state changes and event transitions within videos.

Required skills and experience
- Knowledge and experience with Python, GitHub, Docker, and deep learning
Preferred skills and experience
- Knowledge and experience in natural language processing
- Knowledge and experience in computer vision

Machine learning
Computer vision
Natural language processing

Research on integration of physical or algorithmic principles into machine learning

In this research project, we aim to develop novel models and methods that seamlessly integrate machine learning with classical problems—such as physics simulations and mathematical optimization—that are grounded in physical and algorithmic principles. Based on the research outcomes, we will prepare and submit a paper targeting top-tier international conferences in the field of machine learning, such as NeurIPS, ICML, and ICLR.

Required skills and experience
- Experience with deep learning, either through research or reproducing existing methods
- Either 1) a good understanding of geometric deep learning or 2) a Bachelor's degree or higher in physics or mathematics
- Python development skills
Preferred skills and experience
- Knowledge of geometric mathematics, such as spherical harmonics and rotational equivariance
- Background in physics or mathematics (e.g., Bachelor's degree in physics or math)
- Research and/or development experience in geometric deep learning
- Skills in implementing custom forward/backward functions and custom GPU kernels (e.g., with PyTorch)
- Proficiency in C++
- Experience using GitHub/GitLab and Docker

Machine learning
Algorithm
Geometric deep learning
Physics

Development of an Explainable Language-Guided Planning Framework

In this project, we aim to create a framework for the automated generation of "trustworthy" task plans through language guidance. We will release it as open-source software and aim to submit to the OSS Track at international conferences like ACM Multimedia.

Required skills and experience
- Basic knowledge of CUI
- Research and development experience with Python
Preferred skills and experience
- Experience participating in development projects for systems with a stable user base

Interaction
Algorithm
Development

3D-printed one-piece force transmission mechanism design

We will develop a design method for a 3D-printed one-piece force transmission mechanism to be used for lightweight manipulators that are actuated from the root, such as the manipulator we developed in the past (https://omron-sinicx.github.io/twistsnake/).

Required skills and experience
- Experience in robotic mechanism design and 3D printing
Preferred skills and experience
- Experience in robot competitions
- Experience in team-based development
- An award from an academic society or a scholarship
- Experience coding in ROS, Python, or C++ for robot development

Robotics
Mechanics design

Learning of object manipulation skills generalized to objects and tasks

We will develop a method for learning object manipulation skills that generalize across objects and tasks.

Required skills and experience
- Publication record in robot learning (ICRA, IROS, CoRL, ICML, NeurIPS, ICLR, etc.)
- Experience in manipulation learning
Preferred skills and experience
- Management of projects and code using git/GitLab/GitHub
- Experience in team-based development

Machine learning
Robotics
Manipulation
Imitation learning

Learning Manipulation with Segmentation Using Large-Scale Language Models

We will develop a method for learning manipulation with segmentation using Large-Scale Language Models.

Required skills and experience
- Publication record in robot learning (ICRA, IROS, CoRL, ICML, NeurIPS, ICLR, etc.)
- Experience in learning
Preferred skills and experience
- Management of projects and code using git/GitLab/GitHub
- Experience in team-based development

Machine learning
Robotics
Manipulation
Large language model

Research on robust image recognition model and multi-modal language model

Image recognition models are known to be vulnerable to domain shifts caused by environmental changes. This project aims to build image recognition models that are robust to such changes. We plan to select research topics from various fields, including but not limited to generalization of image classification models and applications such as vision-language models, and aim to submit the results to international conferences.

Required skills and experience
- Experience in Python development
- Experience in developing machine learning models for image recognition
Preferred skills and experience
- Knowledge of image recognition models
- Knowledge of transfer learning and multi-modal models
- Experience writing papers in related fields

Machine learning
Computer vision
Natural language processing
Domain Generalization
Image recognition
Multi-modal model

AI for Science

You will work on AI research that accelerates and automates research and development itself. You will take part in sub-projects toward realizing AI scientists that can formulate research claims, run experiments, analyze the results, and write papers in an interactive co-evolution with human researchers.

Required skills and experience
- Python, GitHub, Docker
- Knowledge and experience in deep learning
Preferred skills and experience
- Knowledge and experience in natural language processing, computer vision, and data science
- Mathematical knowledge and formulation ability in machine learning and deep learning

Interaction
Algorithm
Development

Research on Practical LLMs in the Medical Field

This research explores the practical application of large language models (LLMs) in the medical field, focusing on solving real-world challenges in diagnostic support, treatment planning, and patient communication. Key approaches include designing multimodal models that integrate diverse data for intuitive use, fine-tuning LLMs to enhance medical expertise, and connecting external knowledge bases for up-to-date medical information access. Furthermore, to support timely decision-making in information-rich medical environments, we also consider a multi-agent LLM system in which specialized agents collaborate to extract and present relevant information. By addressing these aspects, the study aims to reduce the workload of healthcare professionals, improve clinical efficiency, and develop foundational technologies that can be effectively utilized in real medical settings.

Required skills and experience
- Publication record in related fields
- Knowledge and research/development experience with LLMs
Preferred skills and experience
- Experience related to the medical field

Machine learning
Computer vision
Interaction
Natural language processing
LLM
Medical

Research on 3D vision including visual SLAM and NeRF

In this research project, we aim to develop novel models and methods for image-based 3D sensing technologies, such as Visual SLAM and NeRF. Based on the research outcomes, we will prepare and submit a paper targeting top-tier international conferences in the field of computer vision, such as CVPR, ICCV, and ECCV.

Required skills and experience
- Experience with deep learning, either through research or reproducing existing methods
- Good mathematical understanding of 3D geometry (e.g., perspective projection, rotation, translation)
- Python development skills
Preferred skills and experience
- Knowledge and experience in 3D deep learning or classical VSLAM
- Knowledge and experience in numerical optimization
- Skills in implementing custom forward/backward functions and custom GPU kernels (e.g., with PyTorch)
- Proficiency in C++
- Experience using GitHub/GitLab and Docker

Machine learning
Computer vision
3D vision

Study on multi-agent path planning algorithms from the topological viewpoint

We will study approaches to multi-agent path planning from the topological viewpoint. Accepted interns are expected to work in collaboration with the mentors to submit research results to top international conferences in the field of artificial intelligence and machine learning.

Required skills and experience
- Basic knowledge in topological geometry and computational geometry
Preferred skills and experience
- Research and development experience in path planning, especially multi-agent path planning
- Publication record in the field of artificial intelligence (e.g., AAAI, IJCAI, AAMAS, ICML, NeurIPS, ICLR)
- Expert knowledge in topological geometry and computational geometry

Algorithm
Path planning
Multi-agent system

Research on application of machine learning to physics simulation method and results

We will conduct research and development on applying machine learning to the algorithms or data analysis of physics simulations (e.g., DFT/MD/Tensor Network), and submit papers to journals (e.g., Nature/Science, Physical Review) or conferences (e.g., SC/ICML).

Required skills and experience
- Knowledge and experience in physics simulation e.g. DFT/MD/Tensor Network
Preferred skills and experience
- Knowledge and experience in machine learning
- PyTorch, Python
- GitHub, Docker

Interaction
Algorithm
Development

Research on application of machine learning methods to physics simulation

We will conduct research and development on the application of your machine learning methods to physics simulations and write papers in related fields (Nature/Science, Physical Review, SC/HPCG, etc.).

Required skills and experience
- Knowledge and experience in machine learning
Preferred skills and experience
- Knowledge and experience in physics simulation e.g. DFT/MD/Tensor Network
- PyTorch, Python
- GitHub, Docker

Interaction
Algorithm
Development

Conditions

Term:	Three months or longer (assuming 5 working days a week). Start and end dates can be adjusted. Some projects accept short-term interns for periods as short as one month.
Hours:	Full-time or part-time (e.g. 3 days a week, etc. negotiable). 45-minute breaks. Holidays and weekends off.
Location:	On-site, hybrid, or remote options are available. Hybrid and remote options are only available if you live in Japan; for legal reasons, we cannot pay salaries to remote interns who live outside of Japan. If you live outside Japan and join our internship program, you must come to Japan, and we offer support for travel expenses. Some internship projects may require on-site work, in which case you will be assigned to one of our offices in Hongo or Shinagawa.
Salary:	Full-time monthly salary ranges from 240,000 JPY to 480,000 JPY. Hourly rate is applied for part-time work. Social security and other benefits are provided according to the working conditions. Transportation and housing expenses are fully covered. In addition, other expenses necessary for research activities (PC, laptop, etc.) are fully supported.
Language:	Japanese or English (English-only communication is also fine.)
Others:	Two or more mentors with extensive research experience will provide in-depth support for each project. Computational resources (workstations, and server clouds with GPUs) and robotic facilities (robotic arms, various sensors, 3D printers, motion capture systems, and other prototyping and experimental equipment) are available.

How to apply

Please fill out the application form. We will first screen each application based on this information.
For other inquiries, please contact internships@sinicx.com.

Those who pass the above screening will be interviewed remotely. Please prepare slides or other materials to introduce past research and development activities and achievements.

Please contact us at least three months in advance if you need a visa to enter Japan.

Explorer Activities