UKEESS Software House
UKEESS Software House is looking for a Senior Bioinformatics Engineer (either in the office in Lviv or working remotely from within Ukraine).
About the customer:
This is one of the world's largest family history and DNA resources. With this service and DNA analysis it is possible, for example, to determine ethnic origin, susceptibility to allergies and various diseases, features of body structure, etc. (there are currently about 100 DNA-based predictions). You can also build your family tree from more than 30 billion digitized archival records (in the USA) from the 18th century (and this number is constantly growing) and learn interesting facts about your ancestors. In general, our customer is considered the largest in its space. :)
About the project and team:
You will join the Data Science Engineering team working on projects related to DNA research. One of our projects is predicting human traits using SNPs.
In this role you will work on infrastructure and research activities such as EMR setup, Data Lake management, cloud computing, workflow automation, data mining and analytics.
Technical stack of the project: Python 3.8-3.11 / Django / MySQL / Nextflow / Airflow / PySpark / Docker / AWS (IAM, S3, EC2, FSx for Lustre) / Kubernetes
Duties of our future colleague:
Designing and developing complex large-scale systems that process billions of historical records every day
Development of ETL pipelines using Spark and Airflow
Design and implementation of Nextflow pipelines for processing SNPs
Working with data engineering/processing/analytics
Working with the cloud environment (AWS services)
Determining the possibility of implementing innovative technologies
Writing code and unit tests
Ongoing code review of pull requests
Diagnosis of complex problems involving several systems and technologies
Our ideal candidate is an engineer with:
good Python skills
ability to calculate summary statistics for data samples
critical thinking and openness to new knowledge and skills
Experience and skills required:
Good knowledge or experience with Bioinformatics tools and databases: BLAST, GATK, NCBI, UCSC Genome Browser, etc.
5+ years of commercial Python experience
Commercial experience with relational databases (MySQL preferred)
Commercial experience with orchestration tools (Airflow / Nextflow preferred)
Experience with AWS (S3, EC2)
CI/CD experience
English: Upper-Intermediate at least (spoken and written)
Would be an advantage:
Good knowledge or experience with PySpark
Good knowledge or experience with Django or Flask
Good knowledge or experience with Rust
Good knowledge or experience with Java
Experience with Docker and Kubernetes
Knowledge or experience with Scikit-learn
Experience with AWS FSx for Lustre and SageMaker
Knowledge or experience with ML
Commercial experience with Airflow and/or Kubernetes
What do we offer a new colleague?
Competitive compensation (based on market data, but also dependent on candidate's technical level)
Flexible work schedule
Annual paid leave
Free English lessons
Health insurance or two alternatives to choose from
Individual plans for professional and personal development
Absence of bureaucracy and micro-management
Modern comfortable office (barbecue area, kitchen, etc.)
Foreign business trips (after the war)
Parking on the territory and a charging station for electric cars
Corporate gifts, holidays and entertainment
Sports activities: table tennis, football, workout
Send us your resume and let's get to know each other! ;)
------------------------------------------------
The UKEESS Software House team is currently looking for a Senior Bioinformatics Engineer to join us in a full-time position (remotely in Ukraine or in our Lviv office).
About the Customer:
Our customer is a USA-based company with the world's largest DNA network. With more than 30 billion digitized global historical records, 130 million family trees, and 18+ million people in its growing database, our customer helps people discover their family stories and gain actionable insights about their health and wellness.
About the Project and Team:
You will join the Data Science Engineering team and work on DNA research projects. One of our projects is predicting people’s traits using SNPs.
As a part of the team, you will work on infrastructure and research activities, such as setting up EMR, Data Lake management/governance, cloud-environment activities, workflow automation, data engineering, and analytics.
Technology stack: Python 3.8-3.11 / Django / MySQL / Nextflow / Airflow / PySpark / Docker / AWS (IAM, S3, EC2, FSx for Lustre) / Kubernetes
Responsibilities will include:
Help architect, design, and develop complex, large-scale systems that process billions of records every day
Develop ETL pipelines using Spark and Airflow
Design and implement a Nextflow pipeline to process SNPs
Work with data processing/engineering/analytics
Work with the cloud environment (AWS services)
Identify opportunities to adopt innovative technologies and workflow automation
Write code and unit tests
Conduct code reviews
Diagnose complex problems involving multiple systems and technologies
Be an example of engineering excellence
Our ideal candidate will have good Python skills, the right mixture of critical thinking, a can-do attitude, solid programming fundamentals, an understanding of time series data, and the ability to calculate summary statistics for data samples.
Requirements:
Knowledge or experience with Bioinformatics tools & DBs: BLAST, GATK, NCBI, UCSC Genome browser, etc.
5+ years of commercial experience with Python
Commercial experience with relational databases (MySQL is preferable)
Commercial experience with orchestration tools (Airflow / Nextflow is preferable)
Experience with AWS (S3, EC2)
Experience with CI/CD
English: Upper-intermediate at least (both spoken and written)
Nice to have:
Strong knowledge or commercial experience with PySpark
Strong knowledge or commercial experience with Django or Flask
Strong knowledge or commercial experience with Rust
Experience with Docker and Kubernetes
Knowledge or commercial experience with Scikit-learn
Experience with AWS FSx for Lustre and SageMaker
Strong knowledge or commercial experience with Java
Knowledge or experience with ML
What do we offer our new colleague?
Competitive compensation (based on market data but also depending on the technical level of the candidate)
Flexible work schedule
3 health packages to choose from
Annual paid vacation and observance of state holidays
Free English classes (online)
Individual approach to professional growth
Lack of bureaucracy and micromanagement
Modern, comfortable office facilities (a barbecue zone, kitchens, lounge rooms, coffee machines, etc.)
Foreign business trips
On-site parking lot and charging station for electric cars
Corporate gifts, celebrations, and fun activities
Sports activities: ping-pong, soccer, workouts
If you have a passion for solving challenging problems and building scalable, robust systems, love working with the latest technologies in a fast-paced, flexible environment, and are excited at the prospect of having a significant impact on products that have more than 3 million paying subscribers, then we want to talk to you! ;-)