We’re an applied AI research lab pursuing breakthroughs that transform how intelligent systems impact the world.
About the Role:
We’re seeking a Data Engineer to design and scale the data infrastructure that powers our foundation models. In this role, you’ll be central to how we capture, transform, and serve massive datasets for training and deployment. Working alongside research scientists, engineers, and product teams, you’ll build the pipelines and tools that enable us to push AI forward at scale.
What You’ll Do:
Design and maintain scalable, high-performance data pipelines to support training of large-scale foundation models.
Build systems for collecting, cleaning, labeling, and transforming multimodal datasets (text, image, video, structured data).
Optimize data flows for reliability, efficiency, and transparency, enabling explainable and reproducible AI.
Collaborate with research teams to deliver new data modalities and experimental datasets into production.
Implement best practices for data governance, quality, and compliance.
Continuously improve our data infrastructure to support faster iteration and breakthrough model performance.
Who You Are:
Experience: 5+ years of professional experience in data engineering or related roles.
Educational Background: Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
Technical Expertise: Proficiency in SQL, Python, and distributed data frameworks such as Spark, Beam, or Flink.
Cloud & Storage Systems: Hands-on experience with modern cloud platforms (AWS, GCP, Azure), object stores, and data warehouses (Snowflake, BigQuery, Redshift).
Pipeline Mastery: Strong background in ETL/ELT design, workflow orchestration (Airflow, Dagster, Prefect), and data versioning.
Collaboration: Comfortable working in cross-functional teams with researchers and engineers.
Startup Mentality: Thrives in a fast-paced, high-growth environment; proactive about ownership and solving hard problems.
Adaptability: Able to learn quickly, experiment, and pivot when needed.
Nice to Haves:
Experience with ML data pipelines for large-scale model training.
Knowledge of multimodal data (images, video, text, audio).
Familiarity with data labeling systems, weak supervision, or synthetic data generation.
Experience working in a research-driven startup environment.
What We Offer:
Competitive Salary: $150k – $260k
Equity Options: Share in the success and growth of the company.
Comprehensive Benefits: Medical, dental, vision, 401(k), and more.
Flexible Vacation & Remote Work: We value balance and autonomy.
Career Growth: Opportunities to learn, grow, and shape the future of applied AI.
Collaborative Environment: Work alongside top-tier scientists and engineers in a dynamic, mission-driven team.
Commitment to Diversity:
At Synaptically, we are proud to be an Equal Opportunity Employer. We celebrate diversity and are committed to building an inclusive environment for all employees, regardless of race, color, religion, gender, sexual orientation, gender identity, age, national origin, or disability status.
How to Apply:
Ready to help build the data backbone of the next generation of AI? If you’re passionate about data engineering and want to make an impact at the frontier of applied AI, we’d love to hear from you. Submit your resume and a brief note on why you’re excited to join Synaptically.