This is Srikanth from Reliable Software. We currently have an opportunity with one of our direct clients and would like to share the details with you. Please review the information below and let me know if you are interested. If you would like to be considered, kindly share your updated resume at
Sr. AWS AI Engineer – Unstructured Data & Document Processing
Location: Remote (travel up to 50%)
Position Summary
We are looking for a hands-on AWS Data Engineer to build scalable data ingestion and processing pipelines for large volumes of unstructured documents, including PDFs, TIFF images, scanned documents, and other image formats. The primary responsibility is to extract structured information from these documents using OCR and AI/ML services, transform the data, and store it in relational databases, JSON document stores, and vector databases to support downstream applications, document search, and AI-powered retrieval.
Responsibilities
• Design and develop scalable data ingestion pipelines on AWS.
• Process large volumes of PDFs, TIFF files, images, and other unstructured documents.
• Build OCR and document extraction workflows using AWS AI services and open-source libraries.
• Extract metadata, entities, tables, and key-value information from documents.
• Store extracted data in relational databases, JSON document stores, and vector databases.
• Develop data models that support fast search and retrieval.
• Implement document chunking, embedding generation, and indexing for semantic search.
• Optimize pipelines for performance, scalability, reliability, and cost.
• Build APIs or integration pipelines for downstream web applications.
• Ensure data quality, monitoring, logging, and error handling throughout the ingestion process.
• Work closely with AI engineers and application developers to enable enterprise search and retrieval capabilities.
Required Skills
• 8+ years of Data Engineering experience.
• Strong Python development skills.
• Hands-on experience with AWS services such as S3, Lambda, Step Functions, ECS/EKS, Glue, SQS/SNS, and IAM.
• Experience processing unstructured documents at scale.
• Strong knowledge of OCR technologies (AWS Textract preferred; experience with Tesseract or similar is a plus).
• Experience designing ETL/ELT pipelines.
• Strong SQL and database design skills.
• Experience storing and querying JSON data.
• Experience with vector databases (OpenSearch Vector Engine, Pinecone, Weaviate, pgvector, FAISS, or similar).
• Understanding of embeddings, semantic search, RAG, and document indexing concepts.
• Experience building REST APIs is a plus.
• Familiarity with Docker, Git, and CI/CD pipelines.
• Strong debugging, communication, and problem-solving skills.
Preferred Qualifications
• Experience with Amazon Bedrock or other LLM platforms.
• Experience with LangChain or similar AI orchestration frameworks.
• Knowledge of Apache Spark or distributed data processing.
• Experience with document management or enterprise search platforms.
Nice to Have
• Experience building enterprise document search solutions.
• Exposure to AI/LLM-based information extraction.
• Knowledge of Elasticsearch/OpenSearch and search optimization.
• Experience working with healthcare, legal, financial, or insurance documents.
Educational Qualifications:
- Required - Bachelor’s degree in Computer Science, Information Technology, Computer Engineering, or a closely related field, or equivalent.
- Preferred - Master’s degree in Management Information Systems (MIS), Computer Science, Big Data, or Analytics, or equivalent.
Travel:
• Open to travel, depending on the nature of the engagement.
Thanks & Regards
Srikanth Donkani
Resource Manager | Reliable Software
Direct:
AI & Analytics Generative AI Machine Learning Cloud DevOps SAP Data Engineering Data Science Databricks Snowflake
Industries: Government | Healthcare | Banking | Manufacturing | Retail
ISO Cert: 9001 | 27001
Equal Employment Opportunity: Reliable Software does not discriminate in employment on the basis of race, religion, gender, sexual orientation, age, or any other basis covered by federal, state, or local law. Employment decisions are based solely on qualifications, merit, and business needs.