The Data & Software Engineer works with a small team to build complex data flows for a custom application. The successful candidate will have advanced Python programming skills, familiarity with Java, an understanding of data security, privacy, governance, and compliance principles, and a demonstrated history of building production data pipelines and ETL workflows at scale. The candidate must have experience building end-to-end data pipelines in Python; using orchestration tools to deploy data pipelines, including configuring and updating Spark jobs; containerizing and deploying applications in cloud environments such as AWS; working with MySQL and PostgreSQL, including performance tuning, schema design, and query optimization for complex analytical workloads; using industry-standard tools for version control and infrastructure as code (Git, IaC tooling, etc.); working with data catalogs, tracking data lineage, and handling a variety of data formats, including geospatial; using Bash scripting for automation and data processing tasks; and integrating AI/ML services and models.
Clearance: Active TS/SCI with Full Scope Polygraph (Required).
Experience: Minimum of 5 years of professional experience in data engineering.
Data Processing: Expert proficiency with Apache Spark, PySpark, Pandas, and NumPy.
Cloud & Serverless: Extensive experience with AWS (S3, Lambda, Step Functions).
Orchestration & Modern Data Stack: Hands-on experience with Airflow, Apache Iceberg, and Trino.
DevOps & Infrastructure: Mastery of Docker or Podman, and of Terraform or CloudFormation.
Database Management: Advanced SQL skills and experience with NoSQL (DynamoDB).
Governance & Lineage: Practical use of Unity Catalog OSS, Apache Polaris, and OpenLineage.
Visualization & Geospatial: Experience with Apache Superset and geospatial tools like H3 and PostGIS.
Language Versatility: Professional familiarity with Java.
AI/ML Integration: Background in integrating AI/ML services into production pipelines.
Modernization: Experience leading large-scale data migration or platform modernization efforts.
Communication: Ability to design technical solutions with minimal oversight and contribute to engineering best practices and documentation.