Hi, I'm Wei-Wei Du.
A passionate Applied Research Scientist with a business mindset, driven to solve real-world challenges through data and innovation.
About
I am an Applied Research Scientist at Sony, working on personalization, recommender systems, and off-policy evaluation to enhance user engagement. Before joining Sony, I worked as a Machine Learning Scientist Intern at Appier, developing real-time bidding models. Before graduation, my research spanned several domains, including property valuation (self-supervised learning for few-shot scenarios, graph neural networks), natural language processing (depression detection, multi-modal fact-checking), and Sports AI, with multiple publications on these topics. This year, I published a comprehensive survey on self-supervised learning to make a broader impact on the academic community.
With over five years of experience in the field of Machine Learning, I am passionate about making an impact by applying data-driven solutions to real-world applications and continuously seeking new knowledge. In my free time, I enjoy photography, cycling, and practicing yoga.
- Programming: Python, Linux, Shell, R, SQL, PySpark, Java, C++
- Languages: English, Mandarin
- Tools & Technologies: Git, Docker, AWS, GCP
Experience
Applied Research Scientist, Sony
- Develop off-policy evaluation pipelines to assess and improve recommender system performance, reducing the cost and risk of online A/B testing for a streaming platform with 100M+ monthly active users.
- Lead research on LLM-based recommender systems for next-generation fan engagement (first-author paper accepted at RecSys 2025).
- Collaborate with global R&D teams to deliver solutions across movies, gaming, music, and e-commerce.
Machine Learning Scientist Intern, Appier
- Implemented an ensemble real-time bidding (RTB) model with new data-driven features built from 10M+ e-commerce clickstream records, achieving 2x performance in production.
- Conducted tree analysis and SHAP-based feature importance analysis to interpret model behavior.
- Collaborated with three data scientists to build the RTB model for re-engagement campaigns.
Research Papers
Not Just What, But When: Integrating Irregular Intervals to LLM for Sequential Recommendation (RecSys-25)
- The first work to integrate time intervals into LLMs for sequential recommendation.
- Propose Interval-Infused Attention to capture the temporal relevance between items and intervals.
- Introduce a novel perspective on the cold-start problem by considering time intervals and reveal that existing methods suffer from significant performance drops in interval cold-start scenarios.
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data (ACML-24)
- The first comprehensive survey of recent advancements in SSL4NS-TD, covering problem definitions, taxonomy, application issues, NS-TD datasets, and evaluation protocols.
- Evaluate representative SSL4NS-TD approaches of each learning category on the most recent and large-scale benchmark, TabZilla.
- Highlight the challenges addressed by existing SSL4NS-TD methods and the future directions that remain open.
Benchmarking Stroke Forecasting with Stroke-Level Badminton Dataset (IJCAI-24 Demo)
Ensemble Models with VADER and Contrastive Learning for Detecting Signs of Depression from Social Media (ACL-22 Workshop)
Parameter-Efficient Large Foundation Models with Feature Representations for Multi-Modal Fact Verification (AAAI-23 Workshop)
DoRA: Domain-Based Self-Supervised Learning Framework for Low-Resource Real Estate Appraisal (CIKM-23)
- The first work focusing on low-resource real estate appraisal, which meets the needs of real-world scenarios.
- Introduce novel and effective intra- and inter-sample SSL objectives to learn robust geographical knowledge from unlabeled records.
- Illustrate the developed DoRA system and its real-world industrial scenarios for cities and towns with extremely limited transactions.
Look Around! A Neighbor Relation Graph Learning Framework for Real Estate Appraisal (PAKDD-24)
Invited Talk
- 2025/09 RecSys 2025 Workshop on Evaluating and Applying Recommender Systems with LLMs - LLMs for Next-Generation Recommender Systems: From Understanding User Behavior to Deployment
Education
National Yang Ming Chiao Tung University
Advanced Database System Lab, Advisor: Prof. Wen-Chih Peng
Degree: Master of Data Science and Engineering
Research Interests:
- Recommender System
- Natural Language Processing
- Explainable AI
- Self-supervised Learning
Data Lab, Advisor: Prof. Shan-Hung Wu
Degree: Bachelor of Quantitative Finance and Computer Science
Relevant Coursework:
- Natural Language Processing
- Deep Learning
- Machine Learning
- Statistical Learning
- Database System

