Job Title: Sr AI/ML Engineer
Location: Bellevue, WA / Atlanta, GA / Overland Park, KS
Duration: / Term: C2C
Experience Desired: 10+ Years
Job Responsibilities: Identity Resolution
- Develop and deploy entity resolution models to match and deduplicate customer records across multiple systems, directly impacting the accuracy of CDP as the source of truth
- Implement probabilistic matching techniques (e.g., Fellegi-Sunter) and ML models (gradient boosting, neural classifiers) for record linkage across the US adult population
- Build candidate blocking pipelines using phonetic algorithms (Soundex, Double Metaphone), token similarity, and LSH to handle billions of potential match pairs efficiently
- Apply fuzzy matching techniques (Levenshtein, Jaro-Winkler, Jaccard) for customer attributes such as name, address, phone, and identifiers
- Develop clustering algorithms (DBSCAN, hierarchical clustering) to create unified "golden customer profiles" that serve as the authoritative representation of each individual
- Build embedding-based similarity systems using Sentence-BERT or transformer-based models for semantic matching
- Implement ANN/KNN retrieval systems (FAISS, Annoy) for large-scale entity matching across population-scale datasets
Job Responsibilities: AI/LLM
- Use LLMs (e.g., GPT, Claude) for classification and disambiguation of entity matches, improving resolution accuracy where traditional methods fall short
- Build and support RAG pipelines to enrich customer profiles with contextual data from unstructured sources
- Perform prompt engineering and evaluation for structured data extraction from unstructured inputs feeding into CDP
- Contribute to NLQ-to-SQL systems, enabling business users to query CDP data using natural language and making the authoritative source of truth accessible to non-technical stakeholders
- Support integration with vector databases (e.g., Pinecone, pgvector, Qdrant) for semantic search across customer data
Education and Work Experience
- Bachelor's or Master's degree in Computer Science, Data Science, or related field
- 3+ years of experience in ML/AI engineering
- At least 1 year of experience in entity resolution, record linkage, or deduplication, ideally at scale
Technical Skills
- Programming: Python (required)
- Libraries: scikit-learn, HuggingFace Transformers, RapidFuzz, jellyfish
- Experience with LLM APIs (OpenAI, Anthropic) and prompt pipelines
- Strong SQL skills and experience with Spark or Dask for distributed processing
- Familiarity with vector databases and embedding-based retrieval
- Experience with ML lifecycle tools (MLflow or similar)
- Understanding of data quality metrics and how identity resolution impacts downstream trust
Knowledge, Skills, and Abilities
- Strong understanding of ML fundamentals and similarity matching techniques applied to customer identity
- Ability to work with large, messy, real-world datasets spanning hundreds of millions of records
- Understanding of precision/recall tradeoffs in identity resolution and their impact on data trust
- Good problem-solving and analytical skills
- Ability to collaborate with data engineering, platform, and business teams to deliver accurate customer profiles
Key Skills:
Machine Learning, Generative AI, NLP, Fraud Detection, Agentic AI, LangChain, LangGraph.