Description:
For the last 9 years we have been building AI-powered tools that understand, discover, and recommend scientific papers and experts. Our platform helps businesses and academic institutions stay at the cutting edge of progress. We work with Europe’s largest grant agencies, top-10 scientific publishers, and universities.
You will take part in actual product development and implement product features from vision and research to production.
Core Challenges You’ll Solve:
* Architect a topic modelling and document clustering pipeline
* Optimize the performance of information retrieval systems
* Improve our author disambiguation system
* And a lot of other stuff: deduplication, entity matching, structured data extraction from documents, resource allocation, etc.
This is not an LLM-integration job! Our tasks require software and data science engineering.
Required background:
* Strong engineering skills in Python, SQL
* Extensive experience in applying data science to product development
* Experience with classic Machine Learning, and Deep Learning in NLP
Nice to have:
* Experience with embedding models (bi-encoders, cross-encoders, etc.)
* Understanding of information retrieval systems
* Experience with text clustering and topic modelling
* Experience with 100+ GB datasets
* Background in math, statistics, probability and information theory
Technical stack: Python, PostgreSQL, Airflow, Vespa search engine. Dedicated servers.
No bureaucracy: the technical team is fewer than 10 people. Bonus points for a non-LLM cover letter.
Join us in making scientific knowledge more discoverable and connected! We’re tackling fascinating technical challenges that directly impact global research.