Senior Systems Architect (ETL/Data Warehouse)

UI Software Developer
Contract Independent, Contract W2
Work from home not available, Travel not required

Job Description

Position : Senior Systems Architect (ETL/Data Warehouse)

Duration : 12+ Months (extendable up to 5 years)

Location : Rockville, MD


We are looking for a Senior Systems Architect to support our work at NIH. The position will be based in Bethesda/Rockville, MD. We are looking for a skilled and motivated architect with expertise in open source, distributed systems, automated cloud storage, and data engineering functions using a variety of persistent, streaming, and analytical data sources. The successful applicant will develop automated solutions, handle data ingestion, develop and maintain pipelines, and build solutions to store, process, and secure large (TBs-PBs) collections of information. The applicant must have experience working with structured and unstructured data types acquired in collaboration with scientists, biologists, and/or clinicians across the NIH, as well as third-party commercial and open database sources with relevant biological, genomic, or proteomic data.


  • Produce system designs, reference architectures, and logical and physical architectural diagrams using modelling tools and documentation

  • Lead the technical planning, system design, technology evaluation, toolchain selection, research and development, prototyping, and coding of solutions for projects

  • Lead and perform the assessment and migration of infrastructure resources from on-premises to cloud environments, using toolchains, scripts, and automation at each stage of infrastructure deployment

  • Work closely with client counterparts, systems analysts, and program management staff to capture and understand business problems, systems requirements, and/or visions, and translate them into technology decisions and system design activities

  • Develop, document, and advocate coding standards, design principles, integration designs, deployment models, and operational excellence in support of large and complex distributed scientific systems

  • Guide and manage development and systems engineers in applying best practices and system design principles

  • Estimate the level of effort for development/implementation tasks within a project; identify and alleviate technical challenges and technical debt, and enforce standards to improve quality and technical competency

  • Anticipate, monitor, and act to minimize or eliminate performance, capacity, and scaling issues in distributed solutions and/or applications

  • Write code, fix defects, conduct code reviews, and monitor build and release pipelines to ensure development activities are progressing as desired

  • Stay abreast of the system issue log, bugs, and customer complaints to ensure there is a plan of action for continuous improvement in the future roadmap (or product backlog)

  • Stay abreast of emerging technologies in your areas, including but not limited to cloud computing and deployment, site reliability engineering, network design/architecture, infrastructure automation, security best practices, and data engineering


  • 10+ years of experience working with data in research, scientific, or academic sectors is desired, OR a strong background in a research, academic, or scientific setting

  • Must have experience in open source development of cloud storage, data engineering (ingress/egress, modelling, ETL, data warehouse), and processing of big data, including structured and unstructured, high-volume, streaming, and SQL/NoSQL data

  • 6+ years of experience with high proficiency working in Unix environments and/or RedHat, CentOS, Ubuntu

  • 2+ years of professional experience with cloud-based providers (Amazon AWS, Google GCP, Microsoft Azure) developing distributed computation and/or storage/processing systems

  • 2+ years of experience designing and developing/configuring cloud-based storage solutions and networking, and performing data management functions for scientific research data and/or files

  • 1+ years of experience either developing or supporting analytical solutions such as R, SAS, Python, or a similar toolset in a scientific research environment

  • 6+ years of programming experience in at least one of the following technologies: Node.js, Python, Java, or a similar open source language

  • 2+ years of experience with database design and maintenance (MongoDB, PostgreSQL, Apache Arrow/Apache Parquet)

  • Proficiency with developing and integrating cloud services such as serverless functions (Lambda), event-driven frameworks, message queues and topics, REST APIs, virtual machines, Docker containers, Kubernetes, virtual networks, log analysis solutions (Splunk, EFK, or similar), caching, and PaaS and SaaS solutions

  • Experience with data engineering technology stacks or platforms, including but not limited to Apache Spark, Kafka, Hadoop, distributed computing, or similar big data applications/platforms, is desired but not mandatory

  • Familiarity with DevOps toolsets such as Terraform, Ansible, Chef, Puppet, GitHub, SVN, Jenkins, or similar is preferred

  • Familiarity with at least one of the following scripting languages is preferred: Python, Bash, and/or Perl

  • Professional certifications are desired but optional




Adroitts Inc  

12310 Pinecrest Road, Suite 306 | Reston VA 20191

(O) 571 348 4951   | (F) 703 563 6322

(E) shankaratadroittsdotcom

Dice Id : 90751108
Position Id : 264517