Staff Data Platform Architect (Databricks)
Hartford, CT, hybrid (on-site 3 days per week)
Department: Data & Analytics Platform
Business Unit: Infrastructure and Cloud Services
Reports To: Senior Director, Data Platform
Role Overview
We are seeking a Staff Data Platform Architect to serve as the primary technical consultant and strategist for our enterprise Databricks ecosystem. This is a high-impact, senior individual contributor role focused on driving technical excellence, automation, and fiscal efficiency.
Unlike a traditional administrator, you will act as an internal consultant to our extensive Databricks team, providing the blueprint for scalable pipelines, advanced automation, and long-term capacity forecasting. You will bridge the gap between complex infrastructure (Unix/Linux) and modern AI/ML workflows, ensuring our platform is both cutting-edge and cost-effective.
Key Responsibilities
Strategic Consultation & Architecture
- Act as the Technical Authority for Databricks, advising engineering teams on Unity Catalog governance, workspace topology, and complex migration patterns.
- Consult on the design of high-performance data pipelines, specifically optimizing Delta Live Tables (DLT) and Structured Streaming for scale.
- Partner with teams using Ab Initio and Fivetran to ensure seamless integration and architectural alignment across the multi-platform ecosystem.
Platform Optimization & Financial Forecasting
- Capacity Planning: Own the forecasting of DBU consumption and partner with leadership on multi-year contract utilization and commitment management.
- Cost Engineering: Design and implement sophisticated cost-attribution models (chargeback/showback) and proactively identify “leaks” in compute spend.
- Performance Tuning: Define enterprise standards for Z-ordering, partitioning, and compute strategy to maximize performance-per-dollar.
Advanced Automation & AI Operations
- Architect “self-healing” infrastructure through Python and Bash automation, reducing manual toil for the wider engineering team.
- Consult on the operationalization of ML models, leveraging MLflow and Model Serving to move experiments into production.
- Guide the integration of Generative AI and LLM-backed workflows into the standard data engineering lifecycle.
Infrastructure & Linux Engineering
- Provide deep expertise in the Unix/Linux environments underpinning our compute nodes.
- Develop advanced automation scripts for cluster lifecycle management, monitoring, and security hardening.
Required Qualifications
- Experience: 7+ years in Data Engineering/Platform roles, with at least 4 years of deep architectural experience in Databricks.
- The “Consultant” Mindset: Proven ability to advise multiple teams, influence technical roadmaps, and communicate complex trade-offs to senior leadership.
- Technical Depth: Mastery of Unity Catalog, Delta Lake, and PySpark.
- Systems Expertise: Strong proficiency in Unix/Linux systems administration and shell scripting (Bash) for infrastructure automation.
- Financial Acumen: Experience managing cloud consumption (DBUs), forecasting usage, and implementing cost-governance tools.
- Tooling: High proficiency with Git-based CI/CD and experience in Oracle environments.
Preferred Qualifications
- Hands-on experience with Infrastructure-as-Code (Terraform/Ansible), including the Databricks Terraform provider.
- Exposure to Ab Initio or Fivetran in a large-scale enterprise environment.
- Background in highly regulated industries (e.g., Finance or Insurance).