CulinaryCut-VLAP: A Vision-Language-Action-Physics Framework for Food Cutting via a Force-Aware Material Point Method

Robotics

MPM

VLA

Taichi-Lang

👥 Authors

Hyunseo Koh*, Chang-Yong Song*, Youngjae Choi, Misa Viveiros, David Hyde, Heewon Kim (*Equal Contribution)

🏢 Venue

arXiv Preprint, 2026 (arXiv:2601.06451)

📄 Status

arXiv Preprint, 2026 (arXiv:2601.06451)

🔗 Links

[Paper]

🧠 Keywords

VLA Model, MPM, Diffusion, Autoregressive

1. Background & Motivation (Why?)

Commanding a robot to "cut the apple in half" is deceptively difficult.

• Physical Complexity: Food deforms, fractures, and changes shape under pressure. Standard datasets for rigid bodies cannot capture these dynamics.

• Data Scarcity: Collecting real-world data (e.g., slicing thousands of fruits) is expensive and dangerous. Previous simulations lacked physical accuracy regarding forces and friction.

• Quantitative Grounding: Few existing models can understand and execute precise numerical instructions, such as "cut at the 30% mark from the right."

2. Key Solution: CulinaryCut & VLAP (How?)

The authors couple ManiSkill (a robot simulator) with an MPM (Material Point Method) physics solver, so that a policy can learn food cutting in simulation without the cost and risk of real-world trials.

Core Components

Hybrid Simulation

• ManiSkill handles robot kinematics, and the MPM (Material Point Method) solver takes over the interaction physics the moment the knife touches the food.

• The MPM solver models deformation and fracture and provides real-time estimates of the force and stress on the blade.
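The handoff described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the height-based contact test, and the millimetre units are all assumptions made for illustration. It only shows the idea of switching from kinematic stepping to the contact solver once the blade reaches the food.

```python
from typing import List, Tuple


def select_solver(blade_tip_mm: int, food_top_mm: int) -> str:
    """Pick the solver for this step (hypothetical rule, not from the paper).

    The blade is treated as "in contact" once its tip is at or below the
    top surface of the food; heights are integer millimetres.
    """
    return "mpm" if blade_tip_mm <= food_top_mm else "kinematic"


def run_descent(start_mm: int, food_top_mm: int, step_mm: int,
                steps: int) -> List[Tuple[int, str]]:
    """Lower the blade in fixed increments and log which solver is active."""
    log = []
    height = start_mm
    for _ in range(steps):
        log.append((height, select_solver(height, food_top_mm)))
        height -= step_mm
    return log


if __name__ == "__main__":
    for height, solver in run_descent(100, 60, 20, 4):
        print(height, solver)
```

A real coupling would use the simulators' own contact queries rather than a single height comparison; the sketch only captures when control passes from one solver to the other.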

CulinaryCut Benchmark

• A large-scale dataset featuring diverse ingredients (apples, bananas, etc.) and distinct cutting styles (Normal, Sawing, Bias, Guillotine).

• Each cutting task is paired with a precise language instruction (e.g., "slice the cucumber into three equal parts").
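To make the pairing of ingredient, cutting style, and quantitative instruction concrete, here is a small sketch. The `CutSample` record, its field names, and the convention that cut positions are fractions of the object's length measured from its left end are assumptions for illustration, not the benchmark's actual schema.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# The four cutting styles named in the summary above.
CUT_STYLES = ("normal", "sawing", "bias", "guillotine")


@dataclass(frozen=True)
class CutSample:
    """Hypothetical benchmark record: one ingredient, style, and instruction."""
    ingredient: str
    style: str
    instruction: str

    def __post_init__(self) -> None:
        if self.style not in CUT_STYLES:
            raise ValueError(f"unknown cutting style: {self.style}")


def equal_part_fractions(n_parts: int) -> List[float]:
    """Cut positions (fractions of length) that split an object into n equal parts."""
    if n_parts < 2:
        raise ValueError("need at least two parts")
    return [k / n_parts for k in range(1, n_parts)]


def percent_from_side(instruction: str) -> Optional[float]:
    """Parse 'N% mark from the left/right' into a fraction from the left end."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%\s*mark from the (left|right)",
                      instruction)
    if match is None:
        return None
    fraction = float(match.group(1)) / 100.0
    return fraction if match.group(2) == "left" else 1.0 - fraction


if __name__ == "__main__":
    sample = CutSample("cucumber", "normal",
                       "slice the cucumber into three equal parts")
    print(sample, equal_part_fractions(3))
    print(percent_from_side("cut at the 30% mark from the right"))
```

Under these assumptions, "three equal parts" maps to two cut positions, and "30% from the right" maps to a single position 70% of the way along the object from the left.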

Safety & Style Modules

• Safety Module: Uses force feedback from the simulation to regulate the robot's velocity, preventing it from striking too hard and damaging itself.

• Cutting Style Transfer Module (CSTM): Adapts the robot's motion to the material, for example applying a sawing motion to fibrous foods instead of only pressing down.
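Both modules can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the paper's controllers: the linear force-headroom rule, the `min_scale` floor, the sinusoidal sawing stroke, and every parameter name and value are hypothetical.

```python
import math


def regulate_speed(nominal_speed: float, contact_force: float,
                   force_limit: float, min_scale: float = 0.05) -> float:
    """Scale the commanded speed down as contact force nears the limit.

    Assumed rule: speed is proportional to the remaining force headroom,
    clamped to [min_scale, 1] so the blade never stalls completely.
    """
    if force_limit <= 0.0:
        raise ValueError("force_limit must be positive")
    headroom = 1.0 - contact_force / force_limit
    scale = max(min_scale, min(1.0, headroom))
    return nominal_speed * scale


def sawing_offset(t: float, amplitude: float, frequency_hz: float) -> float:
    """Lateral blade offset at time t for a sinusoidal sawing stroke (assumed shape)."""
    return amplitude * math.sin(2.0 * math.pi * frequency_hz * t)


if __name__ == "__main__":
    # Illustrative numbers only: 0.10 m/s nominal speed, 40 N force limit.
    for force in (0.0, 20.0, 80.0):
        print(force, regulate_speed(0.10, force, 40.0))
    print(sawing_offset(0.125, 0.01, 2.0))
```

The speed rule slows the downward motion as measured force rises, while the sawing offset adds lateral motion on top of the descent, which is the qualitative difference between pressing and sawing described above.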

3. Experimental Results

• Current Limitations Exposed: Even advanced models (like OpenVLA) struggle to differentiate between specific cutting points, such as "30%" versus "50%."

• Safety Verification: With the proposed safety module, the peak contact force dropped from a dangerous 129 N to a safe 37 N, a reduction of roughly 71%.

• Topology Updates: The study showed that visually updating the "cut surface" (showing the object actually separating) is crucial for the robot to recognize task completion and proceed to the next step.

4. Limitations & Future Work

• Cluttered Environments: The models still struggle to identify and cut a specific target when multiple fruits are placed together.

• Continuous Slicing: Long-horizon tasks, like slicing an entire cucumber into many thin disks, remain a challenge with low success rates.