About the Role
We are seeking a highly skilled Research Scientist/Engineer to advance the reasoning and planning capabilities of large foundation models. In this role, you will enhance model performance across the entire development lifecycle, including data acquisition, supervised fine-tuning (SFT), reward modeling, and reinforcement learning (RL), while driving innovations in reasoning and decision-making. You will synthesize large-scale, high-quality datasets through rewriting, augmentation, and generation techniques to strengthen foundation models during the pretraining, SFT, and RL stages. A key part of the role involves solving complex tasks using System 2 thinking and applying advanced decoding strategies such as MCTS and A*. You will design and implement robust evaluation methodologies, teach models to interact with external tools, APIs, and code interpreters, and build agents and multi-agent systems capable of addressing sophisticated real-world problems.
Responsibilities
- Reasoning and planning for foundation models: Enhance reasoning and planning throughout the entire development process, including data acquisition, model evaluation, SFT, reward modeling, and reinforcement learning, to improve overall performance.
- Synthesize large-scale, high-quality data using methods such as rewriting, augmentation, and generation to improve the capabilities of foundation models in various stages (pretraining, SFT, RL).
- Solve complex tasks using System 2 thinking and leverage advanced decoding strategies such as MCTS and A*.
- Investigate and implement robust evaluation methodologies to assess model performance at various stages.
- Teach foundation models to use tools and to interact with APIs and code interpreters. Build agents and multi-agent systems to solve complex tasks.
Requirements
- Research experience with RL and LLMs; familiarity with large-scale model training is preferred.
- Proficiency in data structures and fundamental algorithms, and fluency in Python, C++, or Java.
- Experience with influential projects or papers in RL, NLP, or Deep Learning is preferred.
- Excellent analytical and problem-solving skills, with the ability to work through difficult challenges in large-scale model training and application.
- Good communication and collaboration skills, with the ability to explore new technologies with the team and promote technological progress.