Spartan Technologies, Inc.
- Atlanta, GA
- This is a contract-to-hire role.
- Remote, but the candidate must be based in the Eastern or Central time zone.
- Requires occasional travel to manufacturing sites (10%).
- US citizens or green card (GC) holders only.
We are looking for a Data Engineer to join the Global Digital Team. The candidate will help optimize operations by manipulating and aggregating disparate operational and back-office data sources into a format that is easily digestible by both data scientists and statistically adept colleagues. The core responsibility is to combine large volumes of disparate, complex data, conduct quality checks on the data, manipulate it, and ensure continuous access to a clean version of the operational data for data scientists and other stakeholders. The candidate will also assist in developing the data pipeline to ensure ongoing data collection, consolidation, and management.
Core skills needed: heavy Python and Azure / Databricks.
- Design and develop data ingestion pipelines and processes in Python and PySpark, based on requirements.
- Create error handling, exception management, and data quality routines to expose anomalies in the data.
- Profile and analyze data to identify gaps and potential data quality issues.
- Identify relationships between disparate data sources.
- Use Python, Databricks, and Spark to code the data engineering routines.
- Perform unit and integration testing.
- Work with the data scientists and business SMEs to gather requirements and present the details in the data.
- Design and jointly develop the data architecture with the data architect, and ensure its security and maintenance.
- Explore suitable options, then design and create data pipelines (data lakes / data warehouses) for specific analytical solutions.
- Identify gaps and implement solutions for data security, data quality, and process automation.
- Build data tools and products that automate effort and make data easily accessible.
- Support maintenance, bug fixing, and performance analysis along the data pipeline.
- Diagnose the existing architecture and data maturity, and identify gaps.
Monday, November 22, 2021