We are looking for a Data Platform / Data Infrastructure Engineer to join a team responsible for building and developing a modern data platform. The role involves working with AWS infrastructure, an S3-based Data Lake, CDC, streaming and Kubernetes in a production environment.
You will directly influence the quality, reliability and scalability of the data that analytics, product and business teams work with.
Technology stack:
Containerization and orchestration
- Docker, containerd
- Kubernetes (AWS EKS).
AWS and data infrastructure
- S3 (Data Lake / DWH storage)
- EC2, VPC, ASG, ALB
- RDS (PostgreSQL, MySQL)
- Amazon Redshift
- Amazon Athena
- AWS Glue.
Data Lake & Table Formats
- Apache Iceberg (S3-based tables)
- Partitioning, schema evolution, lifecycle policies.
Data ingestion, CDC and ETL
- ETL / ELT pipelines
- Airbyte (batch ingestion)
- CDC and incremental loads
- Debezium (CDC with PostgreSQL / MySQL)
- Kafka (AWS MSK)
- S3 as landing / raw / curated storage.
Infrastructure as Code and GitOps
- Terraform
- AWS CloudFormation (support)
- GitHub Actions
- ArgoCD, Helm.
Monitoring, Logging and Security
- Prometheus, Grafana
- ELK / Loki
- AWS CloudWatch
- IAM, AWS Secrets Manager
- AWS GuardDuty, AWS Inspector.
Tasks and area of responsibility:
- Building and developing an S3-based Data Lake / DWH
- Working with Athena, Glue and Redshift
- Apache Iceberg implementation and support
- Ensuring the scalability, reliability and fault tolerance of the data platform
- Data quality control, schema evolution, partitioning.
ETL / ELT and ingestion
- Development and support of ETL / ELT pipelines
- Integration of data sources via Airbyte
- Working with CDC and incremental loads
- Orchestration of data jobs in Kubernetes.
CDC and streaming
- Building CDC pipelines based on Debezium
- Kafka support (AWS MSK)
- Integration of streaming data into S3 / Iceberg / DWH
- Monitoring lag, retries and data consistency.
Infrastructure and Kubernetes
- Support for Kubernetes clusters (AWS EKS)
- Deployment of ETL, CDC and streaming services
- Infrastructure automation through Terraform and Helm
- GitOps approach to releases (ArgoCD).
CI/CD and Automation
- Building and maintaining CI/CD for data and infra components
- Automation of deployments, migrations and upgrades
- Standardization of pipelines and templates.
Databases
- PostgreSQL and MySQL RDS administration
- Replication, performance optimization, user management
- Database preparation for CDC (logical replication, permissions)
- Backups and recovery (snapshots, PITR).
Observability and stability
- Monitoring of ETL / CDC / streaming processes
- Building alerts for data pipelines
- Analysis of incidents and performance problems.
FinOps
- Optimization of expenses for S3, Athena, Glue, Redshift
- Storage growth control and lifecycle policy
- Query optimization, file layout and partitioning
- Using Spot Instances and Savings Plans.
Your background:
- 3+ years of experience in DevOps / Data Platform / SRE roles
- Solid knowledge of AWS
- Experience building or maintaining a Data Lake / Data Warehouse
- Practical experience with Terraform
- Kubernetes (AWS EKS) in production
- Understanding ETL / ELT processes
- Experience with Athena / Glue / Redshift
- Hands-on experience or deep understanding of CDC and Debezium
- Understanding Apache Iceberg or modern table formats
- Experience in PostgreSQL and MySQL administration
- Experience with Kafka (AWS MSK)
- Experience with monitoring and logging.
Will be a plus:
- Deep experience with Apache Iceberg
- Athena optimization (partitioning, file size, cost)
- Building end-to-end CDC (DB → Debezium → Kafka → S3 / Iceberg)
- Understanding data governance and data quality
- Working with large volumes of data (TB+)
- Practical FinOps experience for data platforms.
What we offer:
- Working with a large-scale data platform and real production load
- Impact on architectural solutions and development of the data ecosystem
- Participation in BI and data infrastructure transformation
- Strong technical team and open communication
- Official employment, holidays and sick days
- Regular feedback and professional development plan.