Tong Wu

Peking University · Intelligence Science and Technology

Researching the next interface between perception, action, and reasoning.

I am an undergraduate at Peking University, majoring in Intelligence Science and Technology. My work spans embodied intelligence, Vision-Language-Action (VLA) efficiency, large model reasoning, and long-context systems. I am currently a research intern in Prof. Guozhang Chen's lab, where I work on efficient VLA pipelines and related multimodal systems.


Research Focus

Vision-Language-Action systems
Reasoning in large language models
Long-context inference and memory

Current Status

B.S. Student

Peking University · Beijing

About

What I work on

I am interested in building intelligent systems that can perceive rich environments, reason over long horizons, and make grounded decisions. My recent projects focus on agentic e-commerce assistants, efficient VLA action generation, multimodal dataset construction, and inference-time reasoning behavior in large models.

Education

Academic background

Peking University

Undergraduate in Intelligence Science and Technology · Aug 2023 to Present

Research Experience

What I am building now

Agentic E-commerce Assistant with Function Calling

Joint University-Enterprise Innovation Workstation Project · May 2025 to May 2026

Building an end-to-end LLM agent post-training pipeline with SFT and RL, long-term memory, benchmark expansion, and reward-stabilized data construction.

Research Intern at Prof. Chen's Lab

Peking University · Feb 2025 to Present

Collected spike camera data for SpikeStereoNet; now focusing on efficient VLA systems with conditional SSM-based action denoising and distillation.

TerraVerse Collaborator

TerraVerse Group · Dec 2025 to Feb 2026

Contributed to terrain-centric multimodal data construction for TerraVerse, including collection, cleaning, annotation, and scalable quality filtering.

Selected Publication

SpikeStereoNet: A Brain-Inspired Framework for Stereo Depth Estimation from Spike Streams

A representative work on brain-inspired visual computation, estimating stereo depth from spike streams. Venue, year, and paper links will be added once they are finalized.

Publication List

Direction 01

Embodied Intelligence

Designing systems that connect multimodal perception, action generation, and decision-making under real-world constraints.

Direction 02

Cognitive Architectures

Exploring memory, tool use, and structured control mechanisms that improve reliability and long-horizon capability in LLM systems.

Direction 03

Interpretability

Studying model reasoning traces, efficiency tradeoffs, and mechanism-level understanding for safer and more controllable AI systems.