Abstract: Many believe that large-scale datasets for robotics could be a key enabler of dexterous robotic policies that generalize across diverse environments. While teleoperation provides high-fidelity datasets, its high cost limits its scalability. Instead, what if people could use their own hands, just as they do in everyday life, to collect data? In DexWild, a diverse team of data collectors uses their hands to collect hours of interactions across a multitude of environments and objects. To record this data, we create DW-Mocap, a low-cost, mobile, and easy-to-use system. The DexWild learning framework co-trains on both human and robot demonstrations, yielding better performance than training on either dataset alone. Our large-scale dataset enables robots with only a handful of teleoperation demonstrations to generalize across many different environments, objects, and embodiments. The software, hardware, and dataset used in this paper will be released on our website upon acceptance.