Data Science skill - Machine Learning, NLP, and LLMs
Databricks
Experience: 5 to 8 years
Specialization
Data Science Advanced: Generative AI & Databricks
Job requirements
Job Title: Senior AI Engineer – Generative AI & Databricks
Experience: 5-8+ Years
About the Role
We are seeking a Senior AI Engineer with strong expertise in Generative AI (GenAI), Databricks, and end-to-end ML/LLM systems.
You will be responsible for designing, building, and deploying intelligent, scalable GenAI solutions integrated into enterprise-grade data and analytics platforms.
The ideal candidate combines strong software engineering, MLOps, and LLM engineering experience with the ability to lead agentic AI workflows, data pipeline optimization, and model-driven automation using Databricks, MLflow, and the Azure and Snowflake ecosystems.
Key Responsibilities
1. Solution Architecture & Implementation
Design and implement end-to-end Generative AI solutions on Databricks, leveraging Unity Catalog, MLflow, Delta Lake, and Vector Search.
Architect LLM-based multi-agent frameworks for intelligent automation, chatbot systems, and document reasoning tasks.
Integrate Snowflake Cortex AI, OpenAI, or Anthropic APIs for retrieval-augmented generation (RAG), conversational reasoning, and workflow orchestration.
2. Model Development & Optimization
Fine-tune and evaluate LLMs and domain-specific NLP models (NER, Risk Assessment, Question Answering).
Develop pipelines for prompt engineering, context management, model evaluation, and hallucination detection.
Optimize inference performance, latency, and cost across multi-cloud and Databricks environments.
3. Data Engineering & Governance
Collaborate with data engineering teams to ensure clean, well-governed, and vectorized data pipelines.
Build and maintain feature stores and embedding stores using Databricks or Snowflake.
Implement data validation, lineage, and monitoring using Delta Live Tables and Unity Catalog.
4. MLOps & Automation
Build reusable ML pipelines using Databricks Repos, MLflow, and Feature Store.
Automate deployment, monitoring, and retraining workflows for continuous model improvement.
5. Collaboration & Leadership
Partner with product managers, data scientists, and business stakeholders to translate ideas into production-ready AI systems.
Review code, mentor junior engineers, and enforce best practices in scalable AI/ML development.
Contribute to internal knowledge bases, documentation, and reusable component libraries.
Required Skills & Expertise
Core AI/ML
Strong background in Machine Learning, NLP, and LLMs (Transformers, RAG, embedding models).
Proven experience fine-tuning or implementing models using Hugging Face, LangChain, LlamaIndex, or OpenAI API.
Knowledge of retrieval-augmented generation, multi-agent orchestration, and context management.
Databricks & Cloud Ecosystem
Expertise in Databricks (Delta Lake, MLflow, Unity Catalog, Feature Store, Vector Search).
Familiarity with Azure Databricks, Azure OpenAI, or Snowflake Cortex AI.
Experience integrating external APIs and cloud-native microservices (FastAPI, REST, or gRPC).
Programming & Engineering
Strong proficiency in Python, SQL, PySpark, and Databricks Notebooks.
Experience building modular codebases, deploying APIs, and working with CI/CD pipelines (GitHub Actions, Azure DevOps).
Hands-on experience with Streamlit, Gradio, or other UI frameworks for AI app development.
MLOps & Validation
Hands-on experience with MLflow tracking, the model registry, and experiment management.
Experience in AI validation, faithfulness scoring, drift detection, and integrity match metrics.
Working knowledge of Docker, Kubernetes, and inference scaling techniques.
Soft Skills
Strong communication, stakeholder management, and ability to translate business problems into AI solutions.
Comfort working in agile, multi-disciplinary environments.
Passion for innovation, experimentation, and applied AI problem-solving.
Mandates
• GenAI Data Scientist required: must be a Databricks-certified ML Engineer and work closely with customers.
• The use case will involve data extraction from PDF-based documents.
• Leverage Databricks-native solutions.