Senior AI Data Scientist / Engineer in UKEESS Software House

18 November

UKEESS Software House

Without experience
Lviv
Full-time work
UKEESS Software House is looking for a Senior AI Data Scientist / Engineer for a full-time position (either in the office in Lviv or remotely within Ukraine).

About our client's project:

This is one of the world's largest resources for family history research and DNA digitization (the client is based in the USA). With this service, you can build your family tree, research your genealogy, learn interesting facts about your ancestors, and find relatives using more than 60 billion digitized archive records (and this number is constantly growing). DNA analysis can also help you find relatives and determine ethnic origin, physical traits, and more (almost 100 DNA-based determinations are currently available). Overall, our customer is considered the largest in its business niche!

About the team:

You will join the AI Content team, a dynamic group working on Document Understanding. You will be instrumental in developing innovative AI models that extract and organize textual and graphical information from billions of historical and genealogical records, enabling clients to find, share and connect with their family history.

As a team member, you will work with KB (Knowledge Bases) and RAG (Retrieval Augmented Generation) implementations, integrating architectures using structured SQL databases together with vector databases supporting semantic search and retrieval applications. You will work with the data team as well as engineering teams to train, optimize and deploy models that drive product development, customer success and content creation within our project.

Responsibilities:

  • Configuration of structured and vector databases: aligning and synchronizing schemas between structured and vector databases.

  • Curating and organizing content: preparing and formatting the provided content collection metadata for compatibility with defined database schemas.

  • Loading content collection metadata: loading metadata from the provided sources into a structured SQL database.

  • Generation of embeddings: helping to develop a tool/script that generates embeddings from the structured data to populate a vector database.

  • Iterative enhancements: repeatedly adjusting the database schema, indexes, embeddings, etc. to support different queries and use cases for analyzing the ingested content collection metadata.

  • Collaboration on cloud deployment: working with ML Ops and Data Science Engineers on seamless deployment of datasets, truth sets, and training and inference pipelines.

  • Effective communication: clear and confident presentation of your conclusions, results, and solutions to technical and non-technical audiences, including teams, stakeholders, and executives.

Requirements:

  • 5+ years of experience in Data Science

  • Commercial experience with LLMs in production and RAG-architecture systems

  • Expertise in collecting, organizing, curating and formatting data to populate SQL databases.

  • Experience with SQL databases, including configuring schemas and indexes to optimize efficient queries. 

  • Understanding of and experience with embedding generation and using vector databases for semantic search and retrieval.

  • Practical experience with AWS cloud services (for example, Amazon SageMaker, EC2, S3, AWS Lambda).

  • English: upper-intermediate level or higher (spoken and written).

It will be an advantage:

  • Knowledge and experience of cloud-based AI/ML services such as Google GCP, Azure, etc.

  • In-depth knowledge and experience with LightLLM

What you will get in this role:

  • Mentoring and professional growth: support from experienced Data Scientists and work on real AI projects. The opportunity to expand your knowledge and professional network within a culture of cooperation.

  • Collaboration and influence: the opportunity to join a team of top specialists that shapes innovative approaches in the field of Document Understanding.

  • Innovation and purpose: your contribution will help millions of users around the world to know their roots better.

What do we offer to the new colleague?

  • Competitive compensation (based on market data, but also depends on the technical level of the candidate)

  • Flexible work schedule

  • Annual paid leave

  • Medical insurance (a choice of packages)

  • Individual plans for professional and personal development

  • Modern, energy-equipped offices in Lviv

  • Parking on the territory and a charging station for electric vehicles

  • Business trips (after the war)

  • Sports activities: table tennis, football, workout

Send us your resume and let's get to know each other! ;)

----------------------------------------------------------------------------------------------------------------

The UKEESS Software House team is currently looking for a Senior AI Data Scientist / Engineer to join us in a full-time position (remotely within Ukraine or in our Lviv office).

About the Customer and the Project:

Our customer is the world's largest DNA network, based in the USA. This presents a unique opportunity to work with more than 60 billion digitized global historical records, 100 million family trees, and 18+ million people in its growing database. Our customer helps people discover their family stories and gain actionable insights about their health and wellness.

About the team:

You will join the AI Content team, a dynamic group at the forefront of Document Understanding. You'll play a vital role in developing innovative AI models that extract and organize text and image information from billions of historical and genealogical records, enabling customers to discover, share, and connect with their family history.

As a member of the team, you will work with KB (Knowledge Base) and RAG (Retrieval Augmented Generation) implementations, integrating architectures leveraging SQL-structured databases along with vector databases supporting semantic search and retrieval applications. You will work with a dedicated mentor from the data science team, as well as engineering teams, to train, optimize, and deploy models that promote product development, customer success, and content creation across our project.

What you will do:

  • Configure structured and vector databases: Align and sync database schemas across structured and vector databases 

  • Curate and organize content collection metadata: Prepare and format provided content collection metadata to be compatible with defined database schemas 

  • Ingest content collection metadata: Ingest collection metadata from provided sources into a structured SQL database.

  • Embeddings generation: Help develop a tool/script to generate embeddings from the structured data to populate the vector database.

  • Iterative improvement: Iterate on adjusting the database schema, indexes, embeddings, etc., to support various queries and use cases for analyzing the ingested content collection metadata

  • Collaborate on Cloud Deployment: Partner closely with ML Ops and Data Science Engineers to seamlessly deploy datasets, truth sets, models, and pipelines for training and inference in cloud environments.

  • Communicate Insights Effectively: Clearly and confidently present your findings, deliverables, and proposed solutions to technical and non-technical audiences, including teams, stakeholders, and executives.

Requirements:

  • 5+ years of experience in Data Science

  • Strong hands-on commercial experience with LLMs in production, RAG architecture, and agentic systems

  • Expertise with data collection, organization, curation, and formatting to populate SQL databases. 

  • Experience with SQL databases, including adjusting schemas and indices to optimize for efficient queries. 

  • Familiarity with embedding generation and the use of vector databases for semantic search and retrieval.

  • Strong proficiency and experience with Python and relevant tools and libraries

  • Practical experience with the AWS cloud platform (e.g. Amazon SageMaker, EC2, S3, AWS Lambda).

  • English: Upper-intermediate at least (both spoken and written)

It will be a plus:

  • Knowledge and experience with cloud platforms and related AI/ML services such as Google Cloud (Gemini API, Vertex AI), Azure, etc.

  • Strong knowledge and experience with LightLLM

  • Commercial experience with Terraform or CloudFormation

  • Experience with agentic web scraping tools

What You’ll Gain

  • Mentorship & Growth: Learn from experienced Data Scientists while tackling meaningful, real-world AI projects, expanding your knowledge and professional network within a collaborative culture. 

  • Collaboration & Impact: Work alongside top industry professionals and help shape the tools that bring family history to life for millions of users.

  • Innovation & Purpose: Join a team at the forefront of applying AI to historical data - where every model you build helps preserve human stories.

What do we offer our new colleague?

  • Competitive compensation (based on market data, but also depending on the technical level of the candidate)

  • Flexible work schedule

  • 3 health packages to choose from

  • Annual paid vacation and observance of state holidays

  • Free English classes (online)

  • Individual approach to professional growth

  • Lack of bureaucracy and micromanagement

  • Modern, comfortable office facilities (a barbecue zone, kitchens, lounge rooms, coffee machines, etc.)

  • Foreign business trips (after the war)

  • On-site parking lot and charging station for electric cars

  • Corporate gifts, celebrations, and fun activities

  • Sports activities: ping-pong, soccer, workouts

If you have a passion for solving challenging problems and building scalable, robust systems, love working with the latest technologies in a fast-paced, flexible environment, and are excited about the prospect of having a significant impact on products with more than 3 million paying subscribers, we want to talk to you! ;-)
