AI Data Engineer in UATech Hires

Posted more than 30 days ago

UATech Hires

Without experience

Kyiv

Full-time work

The Role

We're looking for an AI Data Engineer to build and maintain the data infrastructure powering our AI-driven healthcare platform. This role focuses on implementing robust data pipelines, managing our data lakehouse architecture, and ensuring high-quality data processing for our AI systems.

Responsibilities:

Design and implement scalable data pipelines for diverse healthcare data sources
Build and maintain a data lakehouse architecture on AWS for storing structured and unstructured medical data
Create efficient ETL processes for handling medical transcriptions, clinical documentation, and practice data
Implement data quality monitoring systems and validation frameworks
Develop and maintain data crawlers for collecting domain-specific medical content
Support RAG system implementation with optimized data storage and retrieval mechanisms

Ideal Candidate:

Strong experience with AWS data services (S3, RDS, Glue, EMR Serverless, Athena, DataZone, Lake Formation, DynamoDB)
Expertise in data orchestration tools (Dagster, Apache Airflow, AWS MWAA, Step Functions)
Proficiency in Python, SQL, and PySpark with experience in data processing frameworks
Experience with data lakehouse architectures, ETL pipeline development, and SageMaker Feature Store
Strong background with AWS analytics services (Glue Catalog, Glue ETL/EMR Serverless, Athena)
Experience with the Apache Iceberg table format for organizing data in a data lakehouse architecture, including time travel, ACID transactions, and schema evolution
Experience with PostgreSQL and vector databases (pgvector, OpenSearch, etc.)
Proficiency in data transformation tools like dbt
Experience implementing data quality frameworks (Great Expectations, Glue Data Quality, PyDeequ)
Knowledge of healthcare data structures and medical terminology preferred
Experience with data preprocessing for LLM applications strongly preferred (NLP libraries like spaCy, web scraping tools, text extraction, semantic chunking, etc.)
Understanding of data security and HIPAA compliance requirements
Collaborative mindset and ability to work in a fast-paced startup environment
Bachelor's degree in Computer Science, Engineering, or related field
