SoTALab | Expert Intelligence

Research on Data Quality Validation Methods: Develop scalable and generalizable approaches for evaluating data quality in large language models across complex real-world tasks. Areas of focus include, but are not limited to, complex reasoning, multimodal systems, autonomous agents, scalable supervision, and large-scale synthetic data generation.
Design of Quantitative Evaluation Frameworks: Build rigorous and scientifically grounded evaluation systems for a wide range of high-value real-world AI applications and tasks.
Data Processing and Performance Reproduction: Perform effective large-scale data cleaning and synthesis, and reproduce state-of-the-art research methods to validate and improve model performance through high-quality data.

Technical Expertise and Innovation: Passionate about practical AI deployment and breakthrough applications. Strong foundation in machine learning, with proven research and programming skills. Able to quickly reproduce, validate, and iterate on data-driven improvements and new ideas. Experience with large-model data pipelines and optimization strategies is a strong plus.
Self-Motivation and Research Mindset: Highly self-driven, intellectually curious, and able to think critically about underlying principles. Willing to tackle challenging and uncertain problems, explore cutting-edge synthetic data generation techniques, reproduce influential industry research, and keep learning in a fast-evolving field.
Collaboration and Communication: Strong communication skills, a collaborative mindset, and the ability to work effectively in a team environment.

LLM Data Algorithm Researcher