AI Data Engineer at UATech Hires

7 February

UATech Hires

No experience required
Kyiv
Full-time

The Role

We're looking for an AI Data Engineer to build and maintain the data infrastructure powering our AI-driven healthcare platform. This role focuses on implementing robust data pipelines, managing our data lakehouse architecture, and ensuring high-quality data processing for our AI systems.

Responsibilities:

  • Design and implement scalable data pipelines for diverse healthcare data sources
  • Build and maintain data lakehouse architecture on AWS for storing structured and unstructured medical data
  • Create efficient ETL processes for handling medical transcriptions, clinical documentation, and practice data
  • Implement data quality monitoring systems and validation frameworks
  • Develop and maintain data crawlers for collecting domain-specific medical content
  • Support RAG system implementation with optimized data storage and retrieval mechanisms

Ideal Candidate:

  • Strong experience with AWS data services (S3, RDS, Glue, EMR Serverless, Athena, DataZone, Lake Formation, DynamoDB)
  • Expertise in data orchestration tools (Dagster, Apache Airflow, AWS MWAA, Step Functions)
  • Proficiency in Python, SQL, and PySpark with experience in data processing frameworks
  • Experience with data lakehouse architectures, ETL pipeline development, and SageMaker Feature Store
  • Strong background with AWS analytics services (Glue Catalog, Glue ETL/EMR Serverless, Athena)
  • Experience with the Apache Iceberg table format for organizing data in a data lakehouse architecture, including time travel, ACID transactions, and schema evolution
  • Experience with PostgreSQL and vector databases (pgvector, OpenSearch, etc.)
  • Proficiency in data transformation tools such as dbt
  • Experience implementing data quality frameworks (Great Expectations, Glue Data Quality, PyDeequ)
  • Knowledge of healthcare data structures and medical terminology preferred
  • Experience with data preprocessing for LLM applications strongly preferred (NLP libraries like spaCy, web scraping tools, text extraction, semantic chunking, etc.)
  • Understanding of data security and HIPAA compliance requirements
  • Collaborative mindset and ability to work in a fast-paced startup environment
  • Bachelor's degree in Computer Science, Engineering, or related field