About the position
Do you want to be part of the team responsible for enabling a data-driven company, with up to 200k events per second and hundreds of different messages? Our mission is to build a platform that works at scale to provide trusted data for the rest of the company.
At data-engineering we operate dozens of services (Scala, Golang, Python), pipelines (Apache Beam, Airflow), and our in-house developed machine learning platform. We are a hands-on team: we manage our own infrastructure (GCE and AWS) and Kubernetes clusters (GKE). We are looking for new members, and this is how you will play an important part in helping us achieve our mission:
· Designing and developing end-to-end data solutions and modern data architectures for Cabify products and teams (streaming ingestion, data lake, data warehouse...)
· Extracting data from internal and external sources to empower our Data Analytics team.
· Evolving and maintaining Lykeion, a Machine Learning platform developed along with the Data Science team, to take care of the whole lifecycle of models and features. It includes a feature store, which allows other teams inside Cabify to make better decisions based on data.
· Collaborating with other technical teams to define, execute and release new services and features.
· Designing and maintaining complex APIs that expose data at scale and help other teams make better decisions.
· Managing and evolving our infrastructure.
· Continuously identifying, evaluating and implementing new tools and approaches to maximize development speed and cost efficiency.
· Providing the company with data discoverability and data governance.
What we’re looking for
We are looking for an engineer with experience building large-scale distributed systems to take Cabify to the next level. Ideally:
· At least 4 years of experience coding and delivering complex software projects
· Fluent in different programming languages (we work with Python, Scala and Go)
· Experience with message delivery systems and stream processing (Kafka, RabbitMQ, Akka Streams, Apache Beam...)
· Good understanding and application of modern data processing technology stacks and distributed processing (Hadoop, Spark, Apache Beam, Apache Flink...)
· Deep understanding of different storage technologies (file-based, relational, columnar, document-based, key-value...)
· Experience with orchestration tools such as Airflow, Luigi or Dagster.
· Be familiar with machine learning, especially its lifecycle (features, models, training & evaluation processes, productionizing)
· Experience with cloud infrastructures (GCP, AWS, Azure)
· Be comfortable with automation/IaC tools (Terraform, Puppet, Ansible...)
o Experience with Google Cloud BigData products (PubSub, Dataflow, BigTable, BigQuery...)
o Experience with Kubernetes.
o Experience with Apache Beam and Scio.
The good stuff:
We’re a company full of happy, motivated people and we never want that to change. Here are some more reasons why it rocks to be part of our family.
· Excellent Salary conditions: L4: 45k - 65k
· We also offer a very competitive stock options plan.
· Recharge day: every third Friday of the month off!
· Remote position. Our HQ is located in Madrid, and an on-site position is also available for this role.
· Regular team events.
· Cabify staff free rides.
· Personal development programs based on our career paths.
· Annual budget for training.
· Flexible compensation plan: Restaurant tickets, transport tickets, healthcare and childcare
· All the equipment you need (you only have to bring your talent).
· A pet room, so you don’t have to leave your furry friend at home.
· And last but not least...free coffee and fruit!