Career Opportunities at Earlybird Portfolio Companies

Senior Data Engineer (f/m/d)

Aleph Alpha

Data Science
Heidelberg, Germany
Posted on Oct 17, 2024

Overview:

Join our forward-thinking data team to drive the development of cutting-edge data solutions for Generative AI applications. We leverage data to empower our customers, fuel our own innovation, and support groundbreaking research. As a Senior Data Engineer, you will take a leadership role in designing, optimizing, and scaling our data infrastructure, ensuring that our solutions are not only robust but also capable of meeting the challenges of tomorrow.

Your Responsibilities:

  • Lead the design and implementation of scalable data architectures to handle large-scale datasets (terabytes of text) with a focus on storage, versioning, and documentation best practices.

  • Architect, develop, and oversee the maintenance of web services that enable efficient consumption of harvested data.

  • Partner with researchers, software engineers, and leadership to continuously refine data collection methodologies and identify new data opportunities.

  • Strategize and prepare large datasets for diverse Machine Learning use cases, with an emphasis on Generative AI.

  • Build, optimize, and automate advanced preprocessing pipelines tailored to specific applications, ensuring high performance and reliability.

  • Ensure data services are robust, scalable, and meet the needs of cross-functional teams developing new products on top of our data infrastructure.

  • Mentor and guide junior data engineers, fostering a culture of knowledge sharing and continuous improvement.

Your Profile:

  • You have 7+ years of experience as a Data Engineer, with a proven track record of architecting large-scale data systems.

  • You are an expert in Python and proficient in at least one other programming language (e.g., Java, Scala, or Golang).

  • You possess a deep understanding of distributed systems, with a demonstrated ability to design and manage efficient data pipelines in both cloud and on-premises environments.

  • You have a strong software engineering background, with a focus on writing clean, maintainable, and well-documented code.

  • You excel at data wrangling, including advanced techniques for extracting, transforming, cleaning, and standardizing data from multiple sources.

  • You bring expertise in Generative AI use cases and understand the pivotal role of data in developing cutting-edge AI solutions.

  • You have experience in driving projects, influencing stakeholders, and aligning data strategies with broader business goals.

Nice to Have:

  • Experience working in multi-cloud environments (e.g., GCP, Azure, AWS) as well as with on-premises data solutions.

  • Background in Machine Learning or Data Science, with a particular focus on applying data engineering principles to AI research.

  • Proficiency in Golang and an interest in adopting new technologies.

  • Familiarity with Kubernetes for container orchestration and managing scalable deployments.