We are seeking a highly skilled ML / NLP Evaluation Engineer with strong expertise in Python, large-scale data engineering, and NLP model evaluation. The ideal candidate will have hands-on experience building and validating data pipelines, evaluating ranking/search models using offline metrics, and working with modern cloud-based ML ecosystems.
Required Qualifications
- 5+ years of experience in Software Engineering, Data Engineering, or ML Engineering.
- Strong programming expertise in Python.
- Hands-on experience with data engineering frameworks such as PySpark or Google Dataflow.
- Experience working with large-scale structured and unstructured datasets.
- Strong understanding of NLP model evaluation methodologies and ranking systems.
- Hands-on experience with offline evaluation metrics such as:
  - nDCG (Normalized Discounted Cumulative Gain)
  - MRR (Mean Reciprocal Rank)
  - Precision@K / Recall@K
- Experience designing and running A/B tests and building experimentation frameworks.
- Strong SQL skills and experience with distributed data processing systems.
- Experience building ETL/data pipelines for ML workflows.
- Familiarity with the machine learning lifecycle, model validation, and performance monitoring.
- Experience with REST APIs, data validation, and automation scripting.
- Strong understanding of the SDLC, Agile methodologies, and CI/CD practices.
- Excellent analytical, debugging, and problem-solving skills.
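
For reference, the offline ranking metrics named in the qualifications above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: all function names are hypothetical, each input is assumed to be a list of graded relevance labels in ranked order (0 meaning not relevant), and nDCG uses the linear-gain variant with the ideal ordering taken from the same list.

```python
import math


def precision_at_k(relevances, k):
    """Fraction of the top-k positions holding a relevant item (label > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k


def recall_at_k(relevances, k, total_relevant):
    """Share of all relevant items for the query that appear in the top k."""
    if total_relevant == 0:
        return 0.0
    return sum(1 for r in relevances[:k] if r > 0) / total_relevant


def reciprocal_rank(relevances):
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, r in enumerate(relevances, start=1):
        if r > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over a list of per-query relevance lists."""
    return sum(reciprocal_rank(q) for q in queries) / len(queries)


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(
        r / math.log2(rank + 1)
        for rank, r in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering.

    Assumption: the ideal ordering is built from the same list, i.e. every
    judged item for the query is present in `relevances`.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

In practice these would typically be computed per query over a judged evaluation set and then averaged, often inside a distributed job rather than in plain Python.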
Technologies & Tools
- Python
- PySpark / Google Dataflow
- SQL
- Google Cloud Platform (preferred)
- BigQuery
- Feast / Tecton
- Airflow
- Git / CI/CD
- NLP & Ranking Evaluation Frameworks
- Experimentation & A/B Testing Tools
Preferred Domain Experience
- Search/Relevance Engineering
- NLP / LLM Systems
- Recommendation Engines
- E-commerce Search
- AI/ML Platforms
- Information Retrieval Systems