Data Engineer
Company details
Company: BakeLab
Job type: Remote
Country: United States
City: San Francisco
Experience: 4 years or more
Description of the offer
We are looking for a Data Engineer passionate about LLMs, VLMs, post-training, and reinforcement learning. You will design and implement scalable data systems that power dataset generation, filtering, and evaluation for model alignment and agentic reasoning. You’ll collaborate closely with our research and infrastructure teams to ship real systems that train the next generation of intelligent models.
Key Responsibilities
- Build and maintain scalable data pipelines for mid-training and post-training.
- Design high-throughput systems for data collection, deduplication, and quality measurement.
- Work with researchers to implement reward models, benchmarks, and feedback loops.
- Collaborate cross-functionally with infra and research teams to integrate new data modalities and tasks.
Qualifications
- Strong software engineering background.
- Experience with LLMs, RLHF/RLAIF, and/or post-training pipelines (SFT, DPO, PPO, etc.).
- Familiarity with modern data tooling (e.g., PySpark, Ray, Hugging Face Datasets, Arrow, Parquet).
- Comfort with large-scale data manipulation, storage, and retrieval.
- Understanding of data curation principles, filtering heuristics, and annotation workflows.
- (Bonus) Experience with training reward models.
- (Bonus) Experience with coding, tool-using, or agentic LLM datasets.
- (Bonus) Experience building and maintaining hybrid compute clusters (Kubernetes, Slurm).
What We Offer
- Work with a world-class, research-driven team shaping the future of data-centric AI.
- Early technical ownership and influence in a fast-moving, well-funded startup.
- Competitive compensation with equity.
- Hybrid flexibility (SF Bay Area preferred, remote considered).
- Impactful open-source contributions (papers, code) recognized by top research and industry labs.
