Data Engineer w/ Python & Snowflake
Austin, TX (onsite)
Duration: 6 months
MUST HAVE
• Hands-on experience writing complex SQL queries: joins, self-joins, views, materialized views, cursors, recursive queries, GROUP BY and PARTITION BY functions, and SQL performance tuning
• Hands-on experience with ETL and dimensional data modelling, including Slowly Changing Dimensions (SCD Types 1, 2, and 3)
o Good understanding of schema types and table types (fact vs. dimension), including how to design a dimension vs. a fact table and the design considerations involved
• Proficiency in Python scripting/programming using Pandas, PyParsing, and Airflow
o Pandas, Tableau Server modules, NumPy, datetime, Apache Airflow-related modules, and APIs
o Setting up Python scripts on DataLab, scheduling processes, and connecting to a data lake (S3, etc.)
o Data Pipeline automation
• Good understanding of Snowflake architecture; experience designing and building solutions
o Architecture, design considerations, performance tuning, Time Travel, and warehouse concepts: scaling, clustering, micro-partitioning
o Experience with SnowSQL and Snowpipe
• Must have: experience with Snowflake performance optimization techniques
• Own project delivery, collaborating with the offshore team
• Actively participate in discussions with the business to understand requirements and propose suitable solutions
• Experience with AI (highly beneficial) and advanced AI integration:
o Solid experience with Gen AI and LLM integration, including:
§ A good understanding of retrieval-augmented generation (RAG)
§ Prompt and context engineering: structuring, querying, and managing the data context fed to LLMs
§ Vector data management: handling and storing data (including unstructured data) in vector databases, with indices for semantic search and RAG
§ Experience with LLM orchestration frameworks such as LangChain and LlamaIndex
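To illustrate the kind of work the SCD Type 2 and Python/Pandas requirements above describe, here is a minimal sketch of an SCD Type 2 merge in pandas. All column names (customer_id, city, start_date, end_date, is_current) and the load-date convention are hypothetical; a production pipeline would typically do this in Snowflake with a MERGE statement instead.

```python
# Minimal SCD Type 2 sketch using pandas. Column names and the
# date-stamping convention are illustrative assumptions, not a
# prescribed schema.
import pandas as pd


def scd2_merge(dim: pd.DataFrame, updates: pd.DataFrame,
               key: str, attrs: list, load_date: str) -> pd.DataFrame:
    """Expire changed rows and append new current versions (SCD Type 2)."""
    current = dim[dim["is_current"]]
    # Compare incoming attribute values against the current dimension rows.
    merged = current.merge(updates, on=key, suffixes=("", "_new"))
    changed = merged[
        (merged[[f"{a}_new" for a in attrs]].values
         != merged[attrs].values).any(axis=1)
    ]
    changed_keys = set(changed[key])

    # 1) Close out the old versions: set end_date, flip is_current off.
    dim = dim.copy()
    mask = dim[key].isin(changed_keys) & dim["is_current"]
    dim.loc[mask, ["end_date", "is_current"]] = [load_date, False]

    # 2) Append the new versions as the current rows.
    new_rows = updates[updates[key].isin(changed_keys)].copy()
    new_rows["start_date"] = load_date
    new_rows["end_date"] = None
    new_rows["is_current"] = True
    return pd.concat([dim, new_rows], ignore_index=True)
```

The same pattern maps directly onto a Snowflake MERGE plus INSERT when the dimension lives in the warehouse rather than in memory.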
Skills:
| Category              | Name               | Required | Importance | Experience |
| SkillCategoryTest1_MN | Digital : Python   | Yes      | 1          | 7+ years   |
| SkillCategoryTest1_MN | Digital : Snowflake | Yes     | 1          | 7+ years   |
Regards,
Satya
Technical Recruiter
Key Business Solutions, Inc