We are partnered with a fast-growing data intelligence and marketplace platform operating within a highly data-driven industry. Their platform tracks thousands of businesses and products nationwide, delivering critical market insights powered by large-scale data collection and automation. Data quality is the core of the business, and the engineering team plays a direct role in ensuring the platform remains accurate, reliable, and scalable.
This role is ideal for an engineer who is highly comfortable working with web scraping technologies, data quality investigations, and backend tooling. The team is looking for someone who takes ownership of problems end-to-end, thrives in ambiguous environments, and enjoys solving real-world data challenges at scale. You'll work across scraper infrastructure, data reliability, monitoring systems, and internal tooling while contributing directly to the quality and integrity of the company's core product.
Key Responsibilities
Monitor scraper health dashboards, alerting systems, and pipelines to identify failures, stale data, and coverage gaps (20%)
Diagnose, troubleshoot, and resolve scraper issues while improving reliability and observability across scraper infrastructure (20%)
Build, maintain, and optimize a fleet of 40+ web scrapers using Python and JavaScript (25%)
Investigate and resolve data quality issues across ingestion, transformation, and delivery layers while communicating findings and impact to stakeholders (20%)
Maintain CI/CD pipelines, internal data tools, and reusable data quality testing frameworks written primarily in Python (15%)
Qualifications
Experience building and maintaining web scraping systems and monitoring scraper infrastructure
Strong proficiency in Python
Working proficiency in JavaScript, particularly for scraping and browser automation workflows
Strong SQL skills and experience working with large datasets
Experience troubleshooting messy, real-world data issues and identifying root causes independently
Strong written communication skills with the ability to explain technical issues and business impact clearly
Experience maintaining CI/CD pipelines and automated testing environments is a plus
Comfortable operating in fast-paced, highly ambiguous environments with a strong sense of ownership
The Offer
Competitive base salary + bonus and equity potential
Comprehensive benefits package including medical, dental, and vision coverage
Opportunity to work on large-scale data infrastructure and automation systems
Highly collaborative engineering culture with direct ownership over business-critical systems
Exposure to complex, high-volume data environments and modern data tooling
Work Authorization: Candidates must be authorized to work in the U.S. without current or future sponsorship.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
- Dice Id: 10105282
- Position Id: 878048