Overview
On Site
Full Time
Skills
Management
Extract
Transform
Load
ELT
Database Administration
Access Control
Backup
Performance Tuning
Normalization
Video
Extraction
Data Quality
Scalability
Training
Documentation
Jupyter
Confluence
Database
SQL
Database Design
PostgreSQL
MySQL
NoSQL
MongoDB
Apache Cassandra
Storage
Media
Computerized System Validation
Python
Scripting
Bash
Ubuntu
Linux
Servers
Pandas
NumPy
OpenCV
FFmpeg
Communication
Technical Writing
PyTorch
TensorFlow
Open Source
GPU
Git
GitHub
Workflow
Continuous Integration
Continuous Delivery
Mentorship
Collaboration
Leadership
Productivity
Data Engineering
Artificial Intelligence
Machine Learning (ML)
Job Details
Job Description
CornerStone TTS is looking for a part-time engineer to own the end-to-end design, implementation, and maintenance of mixed-media data pipelines. You will collaborate closely with an AI/ML team and client stakeholders to ingest, process, and manage image, video, and tabular datasets, ensuring production-ready systems that support both training and inference workloads.
Key Responsibilities
- Pipeline Architecture & Ownership
  - Architect, build, and optimize ETL/ELT workflows to ingest .jpeg, .gif, .csv, and .mov assets.
  - Ensure pipelines are modular, version-controlled, and easily maintainable.
- Storage & Database Management
  - Design and administer NAS/local storage solutions: setup, access control, backup, and performance tuning.
  - Deploy, tune, and maintain SQL (PostgreSQL/MySQL) and NoSQL (MongoDB/Cassandra) databases for high-volume ingestion and querying.
- Media Processing & Transformation
  - Leverage Python with Pandas and NumPy for tabular data transformations.
  - Use OpenCV for image preprocessing (resizing, normalization, feature extraction).
  - Utilize FFmpeg for video ingestion, transcoding, and key-frame extraction.
- Monitoring, Logging & Reliability
  - Implement comprehensive logging, monitoring, and alerting for all pipelines.
  - Proactively troubleshoot throughput or data-quality issues; optimize for reliability and scalability.
- Collaboration & Documentation
  - Partner with ML engineers to integrate data pipelines into model training and serving workflows.
  - Produce clear, reproducible documentation (Jupyter notebooks, Markdown guides, Confluence pages) covering architecture, pipeline usage, and troubleshooting.
Required Qualifications
- Experience: 5+ years in data engineering or similar roles, with a track record of owning production pipelines.
- Databases: Expert in SQL database design/tuning (PostgreSQL/MySQL) and at least one NoSQL system (MongoDB, Cassandra).
- Storage: Hands-on experience with NAS or equivalent local storage architectures and protocols.
- Media Handling: Deep expertise in processing .jpeg, .gif, .csv, and .mov files at scale.
- Programming: Advanced Python scripting; proficient with Bash; comfortable on Ubuntu/Linux servers.
- Libraries/Tools: Pandas, NumPy, OpenCV, FFmpeg.
- Communication: Strong written and verbal skills; demonstrated ability to produce clear technical documentation.
Preferred Qualifications
- Familiarity with AI/ML frameworks (PyTorch, TensorFlow, open-source GPT/RAG toolkits).
- Experience in GPU-accelerated environments (NVIDIA stack).
- Proficiency with Git/GitHub workflows and CI/CD pipelines.
- Prior leadership or mentorship experience with junior engineers.
- Background in oil & gas or other regulated industries is a plus, but not required.
What You'll Gain
- Ownership & Impact: Lead the data foundation for cutting-edge AI imaging and chat products.
- Collaboration: Work side-by-side with leadership and AI/ML experts.
- Flexibility: Part-time schedule tailored for high productivity and work-life balance.
- Growth: Hands-on exposure to both data engineering and AI/ML integration in a production setting.