Collaborate with architects, engineers, and business users on product design and features.
Develop ETL pipelines to move data from source systems into staging AWS S3 buckets.
Develop a data ingestion framework that loads the staged data into raw tables using PySpark (see the ingestion sketch after this list).
Understand the data model and organize the raw data into Hub structures.
Design ETL pipelines using PySpark, Scala, and SnowSQL to transform data from Hubs and Satellites into the Data Vault 2.0 structure (see the Hub/Satellite load sketch after this list).
Implement data quality and SOX controls, and schedule batch jobs using Airflow and Autosys (see the Airflow scheduling sketch after this list).
Design test cases and apply a range of testing techniques to validate the development work.
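
As a rough illustration of the staging-to-raw ingestion step, the sketch below assumes a hypothetical S3 staging path (s3a://staging-bucket/orders/), raw table name (raw.orders), and CSV input; these and the audit columns are assumptions for illustration, not details taken from the work described above.

```python
# Minimal PySpark sketch: load staged files from S3 into a raw table.
# Bucket, path, table, and feed names are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("raw_ingestion").getOrCreate()

# Read staged data from the S3 staging bucket (CSV assumed for illustration).
staged_df = (
    spark.read
    .option("header", "true")
    .csv("s3a://staging-bucket/orders/")
)

# Add basic audit columns before landing the data in the raw layer.
raw_df = (
    staged_df
    .withColumn("load_ts", F.current_timestamp())
    .withColumn("record_source", F.lit("orders_feed"))
)

# Append to the raw table (Parquet, append-only raw layer assumed).
raw_df.write.mode("append").format("parquet").saveAsTable("raw.orders")
```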
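The next sketch outlines how raw records could be shaped into a Data Vault 2.0 Hub and Satellite in PySpark; the business key (order_id), hash-key derivation, and vault table names are illustrative assumptions, and the actual pipelines also involve Scala and SnowSQL as noted above.

```python
# PySpark sketch: derive a Hub and a Satellite from a raw table following
# Data Vault 2.0 conventions (hash key, load timestamp, record source, hash diff).
# Table and column names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dv_load").getOrCreate()
raw = spark.table("raw.orders")

# Hub: one row per business key, identified by a hashed business key.
hub_order = (
    raw.select("order_id", "load_ts", "record_source")
    .withColumn("hub_order_hk", F.sha2(F.col("order_id").cast("string"), 256))
    .dropDuplicates(["hub_order_hk"])
)

# Satellite: descriptive attributes keyed by the Hub hash key,
# with a hash diff to detect attribute changes between loads.
sat_order = (
    raw.withColumn("hub_order_hk", F.sha2(F.col("order_id").cast("string"), 256))
    .withColumn("hash_diff", F.sha2(F.concat_ws("||", "status", "amount"), 256))
    .select("hub_order_hk", "load_ts", "record_source", "status", "amount", "hash_diff")
)

hub_order.write.mode("append").saveAsTable("vault.hub_order")
sat_order.write.mode("append").saveAsTable("vault.sat_order_details")
```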
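Finally, a sketch of how such a batch job might be scheduled with Airflow; the DAG id, cron schedule, and spark-submit commands are placeholders, and the corresponding Autosys scheduling is not shown.

```python
# Airflow sketch: schedule a nightly run of the ingestion job followed by
# data quality checks. DAG id, schedule, and commands are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="raw_ingestion_nightly",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",  # daily at 02:00
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="ingest_orders",
        bash_command="spark-submit /opt/jobs/ingest_orders.py",
    )

    dq_check = BashOperator(
        task_id="data_quality_checks",
        bash_command="spark-submit /opt/jobs/dq_checks.py",
    )

    # Run the quality checks only after ingestion completes.
    ingest >> dq_check
```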
Education:
Bachelor’s degree in Computer Science, Computer Information Systems, or a closely related engineering field of study, plus related work experience in the field.