Data Scientist

  • eiGroup
  • Posting date: 26.12.2023
    The application deadline for this vacancy has passed.

Job description

Responsibilities:

  • Design and develop a scalable, fast auto-correction algorithm (for Azerbaijani and English) that eliminates typos in inventory item descriptions.
  • Design and develop scalable, fast logic for extracting key non-numeric features from inventory item descriptions.
  • Design and develop fast, scalable and consistent NLP/AI-based algorithms for matching, grouping and hierarchically restructuring inventory items based on their descriptions and other features.
  • Develop advanced RAG solutions for smart querying of, and interaction with, custom documents.
  • Design and develop AI/advanced NLP solutions for analyzing and solving domain-specific problems based on unstructured source data such as reports, text, images, etc.
  • Develop new custom OCR solutions and/or embed existing ones into business pipelines.
  • Develop prediction systems and machine learning algorithms for time-series data to solve domain-specific engineering problems.
  • Analyze large amounts of information to find patterns and solutions. Preprocess, cleanse and validate the integrity of structured and unstructured data used for analysis across different projects.
  • Collaborate with Business and IT teams and propose solutions and strategies to tackle business challenges.
  • Present results in a clear manner.

Requirements:

  • Bachelor’s degree in Computer Science, IT, Mathematics/Physics or a similar field; a Master’s degree is a plus.
  • Minimum 3 years’ experience as a data engineer/scientist or in a similar role, with hands-on delivery of several large projects.
  • Excellent knowledge of statistical programming languages such as Python (OOP is a must) and hands-on experience with database query languages such as SQL. Understanding of data structures, data modeling, ML algorithms and software architecture. Experience with ML frameworks (such as TensorFlow or PyTorch) and libraries (such as scikit-learn).
  • Proficiency in handling imperfections in data.
  • Good applied statistical skills, including knowledge of statistical tests, distributions, regression, maximum likelihood estimators, etc. Proficiency in statistics is essential for data-driven companies.
  • Hands-on experience with machine learning techniques, including decision tree learning, clustering, artificial neural networks, NLP, etc., on several projects. Ability to create pure or hybrid custom deep learning or computer vision architectures.
  • Strong math skills: understanding of the fundamentals of multivariable calculus and linear algebra. Great numerical and analytical skills.
  • Hands-on experience using NLP algorithms, pre-processing, and developing models for classification, recommendation, clustering and dimensionality reduction (libraries such as NLTK, spaCy, PyTorch).
  • Hands-on experience with time-series data analysis/modelling and advanced regular expressions.
  • Experience developing RAG-based systems using available LLMs for question answering, auto-correction, etc. (LangChain, LlamaIndex, Hugging Face, etc.). Some experience with fine-tuning large language models for specific applications.
  • Familiarity with cloud-based infrastructure and experience deploying large-scale machine learning models (MLOps) in production environments (Azure ML Studio, AWS S3, AWS SageMaker, etc.).
  • Hands-on experience with data visualization tools such as Power BI, Spotfire, Tableau, matplotlib, etc.
  • Excellent communication skills: communicating effectively with both technical and non-technical audiences.
  • Familiarity with SDLC methodologies such as Agile.
  • Experience with GitLab and Linux environments.
  • Ability to independently build a prototype end to end: from business requirements and data collection through model training to a prototype API for subsequent rollout.

Hiring terms:

  • Full-time job
  • Five-day working week
  • Flexible working hours
  • Medical insurance package

To be considered for this position, please email your resume with the reference “Data Scientist” in the subject line. Only successful candidates will be contacted.