Building the Next-Generation Data Pipeline for Pretraining Autonomous Driving Vision Models with YouTube Videos