AWS Data Engineer (Travel up to 50%)

Remote • Posted 3 hours ago • Updated 3 hours ago
Contract Independent
Contract W2
12 Months
Travel Required (up to 50%)
Remote
Depends on Experience

Job Details

Skills

  • Data Engineering
  • Python
  • Amazon S3
  • Artificial Intelligence
  • Extract, Transform, Load (ETL)
  • SQL
  • Unstructured Data

Summary

Job Role: AWS Data Engineer – Unstructured Data & Document Processing
Location: Remote (50% travel required) 
Hire-Type: Contract

Position Summary
We are looking for a hands-on AWS Data Engineer to build scalable data ingestion and processing pipelines for large volumes of unstructured documents including PDFs, TIFF images, scanned documents, and other image formats. The primary responsibility is to extract structured information from documents using OCR and AI/ML services, transform the data, and store it in relational databases, JSON document stores, and vector databases to support downstream applications, document search, and AI-powered retrieval.
Responsibilities
  • Design and develop scalable data ingestion pipelines on AWS.
  • Process large volumes of PDFs, TIFF files, images, and other unstructured documents.
  • Build OCR and document extraction workflows using AWS AI services and open-source libraries.
  • Extract metadata, entities, tables, and key-value information from documents.
  • Store extracted data into relational databases, JSON document stores, and vector databases.
  • Develop data models that support fast search and retrieval.
  • Implement document chunking, embedding generation, and indexing for semantic search.
  • Optimize pipelines for performance, scalability, reliability, and cost.
  • Build APIs or integration pipelines for downstream web applications.
  • Ensure data quality, monitoring, logging, and error handling throughout the ingestion process.
  • Work closely with AI engineers and application developers to enable enterprise search and retrieval capabilities.
Required Skills
  • 8+ years of Data Engineering experience.
  • Strong Python development skills.
  • Hands-on experience with AWS services such as S3, Lambda, Step Functions, ECS/EKS, Glue, SQS/SNS, and IAM.
  • Experience processing unstructured documents at scale.
  • Strong knowledge of OCR technologies (Amazon Textract preferred; experience with Tesseract or similar is a plus).
  • Experience designing ETL/ELT pipelines.
  • Strong SQL and database design skills.
  • Experience storing and querying JSON data.
  • Experience with vector databases (OpenSearch Vector Engine, Pinecone, Weaviate, pgvector, FAISS, or similar).
  • Understanding of embeddings, semantic search, RAG, and document indexing concepts.
  • Experience building REST APIs is a plus.
  • Familiarity with Docker, Git, and CI/CD pipelines.
  • Strong debugging, communication, and problem-solving skills.
Preferred Qualifications
  • Experience with Amazon Bedrock or other LLM platforms.
  • Experience with LangChain or similar AI orchestration frameworks.
  • Knowledge of Apache Spark or distributed data processing.
  • Experience with document management or enterprise search platforms.
Nice to Have
  • Experience building enterprise document search solutions.
  • Exposure to AI/LLM-based information extraction.
  • Knowledge of Elasticsearch/OpenSearch and search optimization.
  • Experience working with healthcare, legal, financial, or insurance documents.

  • Dice Id: 10122703
  • Position Id: 9010049