Data & Python Software Engineer
Ceartas DMCA
Ceartas is your internet delete button. As an AI-powered content protection platform built for the modern internet, we help creators, platforms, and brands protect what they've built - from piracy and leaks to impersonation and deepfakes.
At Ceartas, we lead the way in AI-powered brand protection, copyright law, and digital security, safeguarding the integrity of content creators, brands, and enterprises worldwide. As we scale rapidly, we're looking for a Data & Python Software Engineer to drive innovation in our data pipelines and crawling technologies. In this pivotal role, you'll collaborate with our CTO and Head of Engineering, steering our Data Engineering Team toward developing groundbreaking solutions for digital security challenges.
THE ROLE
What You'll Own
Build Scalable Data Pipelines:
Design and develop robust ingestion and processing pipelines that support large-scale web data extraction and brand protection workflows. Ensure that data moves reliably from collection through processing to storage while maintaining performance, resilience, and operational stability at scale.
Engineer Backend Services and APIs:
Design, implement, and maintain high-performance backend services and APIs that power internal platforms and external integrations. Deliver well-structured, maintainable systems using Python-based frameworks while ensuring scalability, security, and long-term extensibility.
Ensure Data Quality and Governance:
Own data validation, consistency, and governance across ingestion, storage, and serving layers. Establish clear standards for schema design, transformation logic, and monitoring to guarantee trustworthy, production-grade datasets that can be reliably consumed across the organization.
Optimize Performance and Reliability:
Continuously improve system efficiency through performance tuning, cost optimization, and architectural enhancements. Implement logging, metrics, and tracing to monitor production systems, diagnose issues quickly, and maintain high reliability under growing workloads.
Own Production Systems End-to-End:
Take full lifecycle ownership of deployed systems, from design and implementation to operation, iteration, and long-term improvement. Collaborate closely with product and engineering stakeholders to deliver features end-to-end, ensuring that systems remain scalable, maintainable, and aligned with evolving business needs.
WHAT YOU'LL DO
- Design, build, and maintain high-performance backend services and data pipelines supporting web data extraction and brand protection use cases
- Implement and maintain APIs and data services that power internal products and external integrations
- Build reliable ingestion, processing, and storage workflows for large-scale web data
- Ensure data quality, validation, and governance across ingestion, storage, and serving layers
- Optimize systems for performance, scalability, reliability, and cost efficiency
- Monitor, debug, and improve system reliability using observability tools (logging, metrics, tracing)
- Collaborate closely with product and engineering teams to deliver features from design through production deployment
- Take ownership of systems in production, including maintenance, iteration, and improvement over time
Core Requirements:
- Strong SQL skills
- Strong Python experience
- Experience with PostgreSQL or similar relational databases
- Experience designing and building scalable APIs and backend services (e.g. FastAPI, Django, or similar frameworks)
- Proven experience with web scraping or data extraction systems
- Experience designing efficient, scalable data models and database schemas
- Hands-on experience deploying and operating systems in the cloud (AWS, GCP, or Azure)
- Experience working with Docker and containerized environments
Bonus points if you have:
- Experience with workflow orchestration tools such as Airflow
- Experience with browser-based automation tools (Playwright, Selenium, or similar)
- Experience with dbt or analytics-focused data transformation workflows
- Experience building or operating high-concurrency systems and task queues
- Experience designing and deploying cloud-native workflows on AWS
- Familiarity with CI/CD pipelines and production deployment practices
- Experience working in a high-growth, early-stage startup environment
Background
University education in a technical field such as Computer Science, Engineering, or similar (Master's level preferred). 3+ years of relevant experience, or 1+ year in an early-stage startup environment.
What We Offer
🏢 Hybrid role based in Berlin - 3 days per week in office
🌴 25 paid days off per year, plus public holidays and your birthday!
🚀 A culture of innovation, continuous learning, and collaboration
🌍 A vibrant, inclusive workplace that celebrates diversity and equal opportunity
💻 Access to the latest technology
🏆 Employee recognition and reward programs