Director Global Technical Services & Operations Management

Ingram Micro

On-site

Regular employment

10 - 15 years of experience

Full Time

Barcelona, Spain

Responsibilities

It's fun to work in a company where people truly BELIEVE in what they're doing!

Job Description:

Job Summary

We are seeking a Director of Global Technical Services and Operations Management to lead and drive process maturity and operational excellence across our IT service management (ITSM) and IT operations management (ITOM) functions, including incident response, event management, and disaster recovery. This position will have primary responsibility for leading and overseeing ITSM and ITOM functions, with additional responsibilities for 24x7 Monitoring Operations and for coordinating all aspects of Technical Operations Management, primarily in the EMEA and APAC time zones. The position will also be responsible for all aspects of Ingram Micro’s release management programs and processes.

The ideal candidate would have deep experience with ITIL and with tools such as ServiceNow, especially ITSM (including Change, Incident, and Problem management), ITOM, CMDB/Service Graph, and Reporting. The candidate would also have experience with ServiceNow integrations with other key tooling, such as monitoring and observability tools (e.g., Datadog, SolarWinds, Splunk, Dynatrace), and experience working in a globally distributed, 24x7, mission-critical environment such as SaaS or eCommerce.

This role requires strong management skills and experience managing IT functions and globally distributed teams composed of both third-party and in-house resources, as well as exceptional written and verbal communication skills. It also requires a data-driven approach to managing performance using KPIs, with oversight and governance that ensure seamless delivery of services and drive performance and accountability across partner teams and vendors, so that our platform meets defined availability, quality, compliance, and other performance objectives.

The ideal candidate would have thought leadership experience with AIOps and experience leading an automation-centric approach (such as auto-healing and automated risk/change assessment) to continually maturing processes and operational excellence, increasing efficiency, driving innovation, improving system resiliency, and optimizing cloud and infrastructure operations.

Key Responsibilities

Strategic Leadership & Vision

Define and execute the long-term platform engineering strategy, aligning it with business objectives.
Integrate DevOps, SRE, and ITSM/ITOM principles to create a unified and efficient operational model.
Drive automation and self-service capabilities to enhance developer productivity and system reliability.
Ensure high availability and reliability for 24x7 global operations, implementing best practices for service continuity.

Infrastructure, DevOps & Automation

Oversee cloud infrastructure (AWS, Azure, or GCP), container orchestration (Kubernetes, Docker), and CI/CD pipelines.
Implement and integrate AIOps solutions for proactive issue detection, incident resolution, and intelligent automation.
Drive Infrastructure as Code (IaC) adoption using tools like Terraform and Ansible.
Develop and execute strategies for cost optimization, security, and governance across cloud environments.

IT Operations, Service Management & Observability

Integrate ITSM/ITOM tools (e.g., ServiceNow) into DevOps and SRE workflows for automated incident management, change management, and service reliability.
Enhance system visibility through observability and monitoring tools like Datadog, Dynatrace, New Relic, and Splunk.
Drive automation-centric service management to improve IT operations efficiency and reduce mean time to resolution (MTTR).

Technology & Architecture

Architect and oversee resilient, scalable, and secure platform solutions, incorporating AIOps, machine learning-driven automation, and event-driven architectures.
Implement API-first and integration-centric approaches for seamless interoperability across IT and engineering ecosystems.
Ensure the alignment of ITSM, DevOps, and cloud-native technologies to create a highly automated and efficient operational model.

Team Leadership & Collaboration

Foster a culture of automation, continuous improvement, and operational excellence.
Collaborate closely with security, software engineering, and product teams to streamline workflows and enhance service reliability.
Ensure 24x7 operational excellence by implementing on-call rotations, automated incident response, and real-time monitoring.

Performance, Reliability & Incident Management

Implement SRE principles, defining and tracking SLAs, SLOs, and error budgets to maintain system reliability.
Develop and refine incident response, root cause analysis, and post-mortem processes using ITSM/ITOM automation.
Optimize service health, incident response, and operational resilience through proactive monitoring and analytics-driven insights.

Qualifications & Experience

Required:

10+ years of experience in software engineering, cloud infrastructure, or platform engineering.
5+ years of leadership experience managing globally distributed platform, SRE, or DevOps teams in a 24x7 operational environment.
Proven expertise integrating DevOps, SRE, and ITSM/ITOM to drive operational efficiency.
Strong knowledge of cloud platforms (AWS, Azure, GCP), Kubernetes, and microservices architecture.
Experience with ServiceNow, AIOps, and IT automation tools to optimize IT operations.
Hands-on expertise in CI/CD pipelines, Infrastructure as Code (Terraform, Ansible), and observability tools (Datadog, Dynatrace, New Relic, Splunk).
Strong background in automation-centric approaches to enhance self-healing infrastructure and intelligent workflows.
Experience implementing AI-driven monitoring, predictive analytics, and auto-remediation solutions.
Excellent verbal and written communication skills, with the ability to present technical concepts to executive leadership and cross-functional teams.

Preferred:

Experience with event-driven architecture and serverless computing.
Knowledge of FinOps, cloud cost optimization, and security best practices.
Prior experience in performance engineering, security automation, or AI/ML infrastructure.

Why Join Us?

Work with cutting-edge technologies, AI-driven automation, and cloud-native solutions.
Competitive salary, equity, and comprehensive benefits.
A culture of innovation, collaboration, and continuous improvement.

Required skills

Automation

AWS

Collaboration

Continuous Improvement

Corporate Governance

Decision Making

Disaster Recovery

E-Commerce

Hardware Engineering

Incident Management

Infrastructure

Integration

IT Operations

IT Service Management

ITIL

Leadership

Machine Learning

Management

Monitoring

Operations Management

Performance Testing

Reliability

Reporting

Root Cause Analysis

Screening Calls

Security

Service Management

Software Engineering

Team Leadership

Docker

Kubernetes

Ansible

CI/CD tools

DevOps

Terraform

Azure

ServiceNow

GCP

Cloud architectures

Sun Solaris

Splunk

Event Management

SRE principles

Observability

SLA

SaaS

cloud-native application

API

Retention Management

KPI Metrics

Dynatrace

Containers

Microservices architectures

Serverless architectures

Cloud Infrastructure

incident response

Analytics

optimization

Infrastructure-as-Code

Data Pipelines

Business Communication

Quality Processes

Operational Excellence

Secure Design

IT resilience

Budget reporting

Availability to work occasionally on US East Coast time

MariaDB

MLOps

English

Job posted 84 days ago