Overview
On Site
Depends on Experience
Contract - W2
Contract - Independent
Contract - 12 Month(s)
Skills
LANGCHAIN; ML; SQL; PYTHON; PYSPARK
Job Details
Data Engineer
Atlanta, GA
Job Description
Job Summary:
We are looking for a skilled and motivated Data Engineer with strong hands-on experience in Scala/PySpark and Python, and familiarity with LangChain and Generative AI technologies. The ideal candidate will build scalable data pipelines, design efficient data architectures, and integrate cutting-edge AI tools to deliver data-driven solutions across the organization.
Key Responsibilities:
- Design, develop, and maintain scalable and efficient data pipelines using Scala, Python, and Spark.
- Integrate structured and unstructured data from various sources to support downstream analytics and machine learning models.
- Work closely with Data Scientists and Machine Learning Engineers to deploy LLM and GenAI-powered solutions.
- Explore and implement LangChain and other frameworks for LLM orchestration and prompt chaining.
- Build and maintain ETL/ELT pipelines to support data ingestion, transformation, and loading from diverse sources (cloud, APIs, etc.).
- Ensure data quality, observability, and governance best practices across the pipeline.
- Collaborate in Agile teams to deliver well-architected, high-performance data engineering solutions.
Required Skills:
- Strong programming skills in Python.
- Hands-on experience with Big Data tools like Apache Spark, Kafka, Airflow, etc.
- Proficient in SQL and experience with databases (PostgreSQL, MySQL, Redshift, Snowflake, etc.).
- Working knowledge of data modeling, data warehousing, and data lake architectures.
- Experience in building and deploying pipelines on cloud platforms (AWS, Azure, Google Cloud Platform).
Good to Have:
- Familiarity with LangChain, OpenAI APIs, LlamaIndex, or similar frameworks.
- Exposure to Generative AI concepts, LLM-based app development, and prompt engineering.
- Experience with vector databases (Pinecone, FAISS, Weaviate).
- Experience with containerization (Docker/Kubernetes) and CI/CD pipelines.