Date Added: Thu 24/07/2025

Observability Engineer - Trading Company - £100 - £120K Base

London, UK

Company: COMPUTAPPOINT

Job Type: Permanent, Full-Time

Salary: £120,000 - £130,000 per annum

Infrastructure Observability Engineer - Leading Trading Company

Location: London, UK

Contract Type: Permanent

Salary: Competitive + Benefits

About Our Client

Our client is a well-established trading company with a strong presence in the global commodities market. They are committed to leveraging cutting-edge technology solutions to drive operational excellence and maintain their competitive edge in the fast-paced trading environment.

The Role

We are seeking an experienced Infrastructure Observability Engineer to lead the design, implementation, and continuous improvement of our client's enterprise observability platform. This role focuses on delivering comprehensive monitoring, event correlation, and impact analysis, drawing on AIOps capabilities and tools such as BMC Helix Operations Manager.

The ideal candidate will be passionate about improving visibility into infrastructure performance, automating operational intelligence, and reducing mean time to resolution (MTTR) through intelligent alerting and root cause analysis.

Key Responsibilities
  • Own and evolve the enterprise observability strategy across all infrastructure tracks
  • Design, implement, and support event management and impact analysis workflows using platforms such as BMC Helix Operations Manager
  • Integrate and correlate data from multiple sources (e.g., 20+ monitoring systems) into a unified monitoring and alerting framework
  • Apply AIOps principles to reduce alert noise, detect anomalies, and predict/prevent potential outages
  • Collaborate with infrastructure, application, and service desk teams to define meaningful service-level metrics and dashboards
  • Maintain and extend the configuration of monitoring tools, event enrichment, suppression rules, and correlation logic
  • Develop and support automation for observability platform configuration using Infrastructure as Code
  • Define best practices for monitoring new platforms and services in collaboration with engineering and operations teams
  • Support the integration of observability data with ITSM platforms (e.g., Ivanti Neurons ITSM) to streamline incident and change processes
  • Ensure observability platforms are reliable, secure, well-documented, and continuously aligned with business requirements
Essential Requirements

Specialist Knowledge:

  • Demonstrable experience in observability engineering, infrastructure monitoring, or event management roles
  • Experience with traditional and modern observability stacks such as SCOM, SolarWinds, Prometheus, Grafana and Elastic Stack (ELK)
  • Hands-on experience with BMC Helix Operations Manager, TrueSight, or similar enterprise monitoring platforms
  • Solid understanding of AIOps concepts, including event correlation, noise reduction, anomaly detection, and root cause analysis
  • Strong proficiency with scripting (e.g., Python, PowerShell, Bash) for automation and data handling
  • Solid understanding of networking fundamentals
  • Excellent problem-solving skills with the ability to diagnose complex issues using observability tools and logs
  • Exposure to cloud-native monitoring for platforms such as Azure Monitor, AWS CloudWatch, or Google Cloud Operations
  • Experience implementing self-healing alerts/systems based on tools such as VMware VCF Operations, Syslog, Splunk and VMware Log Insight
  • Proficiency with observability of Kubernetes clusters

Professional Experience:

  • Minimum of 3 years of experience in Infrastructure Observability Engineering
  • Experience working within financial services or trading environments (highly desirable)

Educational Background:

  • Bachelor's degree in Computer Science, Information Technology or a related field
Key Competencies
  • Strong problem-solving abilities
  • Ability to improve business processes
  • Able to use initiative and work independently
  • Strategic planning and implementation skills
  • Excellent communication skills with the ability to engage stakeholders at all levels
  • Ability to work under pressure in a fast-paced, time-sensitive trading environment
Key Relationships

You will work closely with:

  • Outsourced Event & Impact Management Team
  • Outsourced Monitoring Administration Teams
  • Engineering Teams (Platform, Windows, Networks, SQL Server & Oracle)
  • Vendor management teams
  • Change, Incident & Problem Managers
  • Outsourced IT management