Senior Data Engineer


Regulatory DataCorp | Information Technology | King of Prussia, PA

Job Summary

RDC is looking for a Senior Data Engineer to be part of a team dedicated to building best-in-class machine learning solutions that protect the world’s financial systems. As part of the Data Science team, you will work with Software Architecture, Engineering and Product to develop and own stable and scalable software solutions around big data.

Essential Duties and Responsibilities

  • Architect, design, implement, monitor, and maintain big data and ETL/ELT pipelines
  • Discover opportunities for data acquisition and pick the right tools to collect and analyze such datasets in batch and/or real-time
  • Recommend ways to improve data reliability, efficiency and quality
  • Implement best practices around data modeling, data partitioning and data backfilling on new and existing data
  • Help the team ensure compliance with all regulatory requirements related to data privacy
  • Help build and maintain robust alerting and monitoring capabilities
  • Work closely with the Data Scientists and Architecture team to ensure efficient and effective delivery of data solutions
  • Interface with Software Engineers, Product Managers and Business Analysts to understand goals and data needs, and to implement data-driven features/products

Equal Employment Opportunity (EEO)

It is the policy of Regulatory DataCorp, Inc. and Regulatory DataCorp Limited (herein referred to as RDC) to provide equal employment opportunity to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, RDC will provide reasonable accommodations for qualified individuals with disabilities.

Job Description Disclaimer

This job description is not intended as and does not create an employment contract. RDC maintains its status as an at-will employer. All descriptions have been reviewed to illustrate the job's functions and the basic duties that represent the minimum standards required to successfully perform the position. The list of duties, responsibilities, and requirements should not be interpreted as all-inclusive. RDC retains the right to change or assign other duties to this position.
Skills & Requirements


  • Bachelor’s degree in Computer Science or related field with a GPA of 3.0 or higher
  • At least 3 years of relevant data engineering experience
  • At least one year of professional experience with:
      • Amazon Web Services (EMR, S3, Glue, IAM, ECS)
      • SQL databases (MSSQL, PostgreSQL)
      • NoSQL databases (MongoDB)
      • Various ETL/ELT approaches and tools to help create Data Warehouses or Data Reservoirs
      • The Spark ecosystem (e.g. DataFrames, MLlib, Spark SQL) and the Hadoop ecosystem (e.g. Hive, Sqoop, HDFS)
      • Various data serialization formats such as Apache Avro, Apache Parquet, JSON, CSV, YAML, XML
      • Elasticsearch/Kibana or a similar distributed search and analytics engine
      • Kafka, Spark Streaming or a similar real-time stream processing framework
      • ActiveMQ, RabbitMQ or a similar messaging system
      • Databricks/AWS or a similar web-based platform for working with Spark and other big data tools
      • Apache Oozie, Apache Airflow, Luigi or a similar workflow management system
  • Excellent programming skills, with 1+ years of experience writing production code in Java, Scala or Python
  • Demonstrated proficiency with:
      • Unix/Linux OS
      • Database Management Systems
      • Distributed Systems
      • Big data concepts/tools
  • Curious, self-driven, analytical and excited to play with data
  • Demonstrated ability to work with ambiguous requirements, adapt, and learn
  • Excellent verbal and written communication