Data Engineer

The Pegasus Agency - New York, NY

Fullstack Data Engineer

  • Be the technical lead engineer on a team of data engineers responsible for data aggregation, transformation, modeling and delivery for both client usage and internal data science teams
  • Full-stack design, development, and operation of core data capabilities like data lake, data warehouse, data marts and data pipelines
  • Own the team's roadmap and project planning process, partnering with stakeholders to develop business objectives and translate those into action
  • Full accountability for one or more data assets
  • Work with data architects to develop data flows and align to platform integration standards
  • Build data flows for data acquisition, aggregation, and modeling, using both batch and streaming paradigms
  • Consolidate/join datasets to create easily consumable, consistent, holistic information
  • Empower other data teams, data scientists and data analysts to be as self-sufficient as possible by building core capabilities as services and developing reusable library code
  • Ensure the efficiency, quality, and resiliency of the core data platform



  • Undergraduate or graduate degree in a technical or scientific field, such as Computer Science, Engineering, Mathematics, or similar
  • 5+ years professional experience as a data engineer, software engineer, data analyst, data scientist, or related role
  • Analytically minded and detail-oriented: you actually like working with data, looking for patterns and outliers, establishing data models, and finding the best answers to business & technology problems
  • Expertise in data engineering languages such as Java, Scala, Python, SQL
  • Data modeling and data governance experience; you've designed and implemented data marts, data warehouses or other large-scale data management systems
  • Experience building ETL and data pipelines, both with traditional ETL solutions like Pentaho, SSIS, and Talend and with code-oriented systems like Spark, Airflow or similar
  • Cloud-oriented with strong understanding of SaaS models
  • Experience operating in a secure networking environment and working with separate production support and SRE teams is a plus
  • Excellent technical documentation and writing skills
  • You have a bias towards automation, an Agile/Lean mindset and embrace the DevOps culture
  • Familiarity with streaming/messaging technologies like Kafka, Kinesis, and Spark Streaming
  • Familiarity with visualizing data with Tableau, Business Objects, QuickSight, Power BI and similar tools
  • Great customer focus and strong technical troubleshooting skills
  • Proficiency in statistics and data science is a nice-to-have, and interest in learning these is even better
  • Experience with clinical trial data is not required, but an interest in learning and understanding it is a must
  • Hadoop/Spark and Graph/RDF/Ontologies experience is a plus

Posted On: Thursday, May 28, 2020
