• BSc or MSc (preferred) in a STEM field
• 8 years of relevant work experience
• Fluency in Python (especially NumPy and Pandas) and familiarity with PySpark
• Extensive hands-on experience with AWS analytical components such as S3, EC2, Lambda, Glue, SQS, SNS, DynamoDB, Redshift, RDS, etc.
• Experience with AWS Lake Formation and Athena
• Work experience with industry-standard distributed systems (e.g. Spark, Hive), data pipeline tools (e.g. Airflow), NoSQL databases (DynamoDB), and relational databases (PostgreSQL)
• Experience with data analysis; significant experience optimizing data retrieval processes supporting API output, ideally in a low-query-volume / high-data-volume environment
• Demonstrably deep experience with relevant "big data" processing, either via Spark or through a modern MPP database like Redshift, ideally both
• Demonstrably deep experience with CI/CD tools and practices in a containerized AWS environment, from deployment pipelines (Jenkins, etc.) to infrastructure definition (Terraform, CloudFormation, etc.)
• Ability to understand and design for non-functional concerns such as performance, cost optimization, maintainability, and developer experience