Overview
Remote (Harrisburg, PA)
Full Time / Part Time / Contract - Independent / Contract - W2; accepts corp-to-corp applications
Skills
Specification, SOLID, Functional requirements, MuleSoft, Real-time, Artificial intelligence, Workflow management, Data flow, Process flow, Apache Storm, Amazon RDS, SQL, Testing, Oracle, Java, Application development, Data collection, C++, Reporting, Object-Oriented Programming, Streaming, Electronic Health Record (EHR), Data Visualization, Apache Kafka, Accessibility, Storage, Cloud computing, Informatics, Process improvement, Unstructured data, Data integration, Microsoft SQL Server, Metadata management, Extraction, TIBCO Software, Python, Apache Cassandra, Transformation, Business requirements, Apache Spark, Predictive analytics, Modeling, NoSQL, Amazon EC2, Automation, R, Data management, Talend, Data warehouse, Collaboration, Scripting, Big data, Extract-transform-load (ETL), Interfaces, Statistics, Informatica, Data, Microsoft Visio, Amazon Web Services, Machine Learning (ML), Amazon Redshift, Database, Remote Desktop Services, Scala, Reference data, Master data management, Information systems, Apache Hadoop, Data modeling, Root cause analysis, Data architecture, PostgreSQL, Computer science, Project management, ERwin, Design, Relational databases
Job Details
Job Title: Senior Data Architect
Job Location: Remote
Job Duration: Long-Term Contract
Interview: Virtual
Responsibilities:
- Analyze business requirements and convert them into technical specifications, encompassing data streams, integrations, transformations, databases, data lakes, data warehouses, and data products.
- Develop the framework, standards, and principles for the data architecture, including modeling, metadata, reference data, master data, and security within the PA LDS environment.
- Establish a reference architecture as a guide for creating and enhancing data systems.
- Define end-to-end data process flows, covering data origins, organizational data flow and functions, data management, and data transition.
- Implement procedures to ensure data accuracy, quality, timeliness, availability, and accessibility.
- Create and implement data management processes and procedures.
- Collaborate with internal teams to formulate and execute data strategies, build models, and understand stakeholder needs and objectives.
- Design and develop application programming interfaces (APIs) for data retrieval.
- Deploy complex data environments that meet functional and non-functional requirements of various business areas.
- Identify, design, and implement process improvements, including automation, optimized data delivery, and scalable infrastructure redesign.
- Design the infrastructure necessary for efficient extraction, transformation, and loading of data from diverse sources using SQL and 'big data' technologies (a brief ETL sketch follows this list).
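As a concrete illustration of the ETL responsibility above, the sketch below shows what a minimal extract-transform-load job might look like in PySpark, one of the big-data tools this role uses. The bucket paths, column names (event_id, event_ts), and application name are hypothetical placeholders, not details of the actual environment.

```python
# Minimal ETL sketch in PySpark; all paths and column names are
# hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw JSON events from object storage (path is assumed).
raw = spark.read.json("s3a://example-bucket/raw/events/")

# Transform: drop malformed rows, parse the timestamp, derive a date key.
clean = (
    raw.filter(F.col("event_id").isNotNull())
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write date-partitioned Parquet that downstream warehouse tools
# (e.g., Redshift Spectrum external tables) can query.
clean.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3a://example-bucket/curated/events/"
)

spark.stop()
```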
Qualifications:
- Bachelor's Degree in Computer Science, Statistics, Informatics, Information Systems, or a related quantitative field, with 10+ years of experience in data/database roles, including 5+ years as a Data Architect.
- Strong expertise in cloud-based data services on AWS, such as EC2, Glue, EMR, RDS, and Redshift, with 2-3 years of practical experience.
- Proficiency in real-time data streaming technologies like Storm, Spark-Streaming, Kafka, or similar (a minimal consumer sketch appears after this list).
- Solid data management skills for efficient and cost-effective data collection, storage, and utilization.
- Knowledge of system development life cycle, project management methodologies, and requirements, design, and testing techniques.
- Familiarity with established and emerging data management and reporting technologies, including columnar and NoSQL databases, predictive analytics, data visualization, and unstructured data.
- Advanced SQL skills, including query authoring, with experience working with relational databases and familiarity with diverse database systems.
- Experience in building and optimizing 'big data' pipelines, architectures, and datasets.
- Proficiency in root cause analysis to address specific business questions and identify areas for improvement.
- Strong project management and organizational abilities.
- Experience collaborating with cross-functional teams in a dynamic environment.
- Hands-on experience with big data tools such as Hadoop, Spark, Kafka, etc.
- Knowledge of artificial intelligence and machine learning (AI/ML) for developing scalable systems to handle large datasets.
- Proficiency in data modeling tools like ERwin or Visio for visualizing metadata and database schemas.
- Familiarity with relational SQL and NoSQL databases, including Oracle, MS SQL Server, Postgres, Cassandra, etc.
- Experience with data pipeline and workflow management tools like Azkaban, Luigi, Airflow, etc. (a minimal Airflow sketch appears after this list).
- Exposure to data integration services solutions from vendors such as Informatica, MuleSoft, Talend, TIBCO, etc.
- Proficiency in object-oriented and functional scripting languages, such as Python, R, Java, C++, Scala, etc.
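For the real-time streaming qualification above, here is a minimal consumer sketch using the open-source kafka-python client; the topic name, broker address, and record fields are assumptions for illustration only.

```python
# Minimal Kafka consumer sketch using kafka-python; topic, broker, and
# field names are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events-topic",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",  # hypothetical broker address
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    record = message.value
    # Downstream validation, enrichment, and loading would happen here.
    print(record.get("event_id"), record.get("event_ts"))
```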
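And for the workflow-management qualification, a minimal sketch of a scheduled pipeline in Apache Airflow (assuming Airflow 2.4+, where the schedule argument replaced schedule_interval); the DAG id and task callable are hypothetical placeholders.

```python
# Minimal Airflow DAG sketch; the DAG id, schedule, and task logic are
# hypothetical placeholders (assumes Airflow 2.4+).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_and_load():
    # Placeholder for real extract/transform/load logic.
    print("running ETL step")

with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
```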