We have an immediate full-time, permanent position with a dynamic, growing client of ours for an Incident and Problem Manager. This position is remote; occasional access to an Atlanta or Houston office is desirable but not required. Candidates should reside in either the Eastern or Central time zone.
The IT Incident and Problem Manager will be responsible for overseeing the incident management and resolution process, monitoring queues, overseeing the prioritization and escalation of IT incidents, coordinating the incident response team (including during ongoing Major Incidents), and, where applicable, conducting trend analyses, root cause analyses, and after-action reviews/post-mortems. They will also be responsible for identifying problems and following them through to resolution with the responsible and impacted teams.
Essential duties and responsibilities
- Work closely with the IT Service Desk and all IT support and business teams as needed to ensure that incident and ticket tracking, transfer, resolution, root cause analysis, and post-mortem activities are conducted properly.
- Track all Major Incidents from inception through resolution.
- Conduct and document After Action Reviews and track follow-up on action items.
- Establish and enforce incident response service level agreements in consultation with business end users, setting expectations and timeframes for incident resolution.
- Analyze performance of incident management activities and documented resolutions, identify problems, and devise and deliver solutions to enhance quality of service and to prevent future problems.
- Track and analyze trends in incident reports and generate statistical reports to inform proactive problem management.
- Regularly iterate on the incident and problem management processes using data gathered about the frequency and severity of incidents.
- Develop and implement a roadmap for maturing the Major Incident process, overall IT Incident and Ticket Management, and IT Problem Management.
- Create metrics, reports and dashboards, and review with teams and management to analyze current status, trends, areas of strength, and opportunities for improvement.
- Assess need for any system reconfigurations (minor or significant) based on data and trends and make recommendations.
- Oversee the development, implementation, and administration of incident and problem management training procedures and policies.
- Train, coach, and mentor all support tiers.
- Oversee and improve the problem and incident management processes; seek out and integrate best practices by staying informed about developments in ITSM, specifically new products, services, technologies, and standards that relate to the practice of effective incident and problem management.
Education and training
- Bachelor of Science/Arts in Computer Science, Management Information Systems, or a related field, or equivalent work experience required.
- ITIL v3 or v4 Foundation certification required. Higher certifications a plus.
- Minimum two years’ experience working in an Information Technology Infrastructure Library (ITIL) environment required.
- Minimum two years’ experience in an IT Service Management (ITSM) environment required.
- Minimum two years’ experience in a medium- to large-size enterprise IT environment required.
- Good knowledge of networking, business continuity, computer hardware, operating systems (desktop & server), security, and the Software Development Lifecycle (SDLC).
- Extensive user support experience.
- Working understanding of diagnostic utilities.
- Experience with conducting After Action Reviews and root cause analysis.
- Experience with developing and providing service level agreements and resolving IT incidents.
- Must be able to work collaboratively and create excellent working relationships across departments and physical locations.
- Exceptional interpersonal skills, with a focus on listening and questioning skills.
- Excellent verbal and written communication skills.
- Strong documentation skills.
- Keen attention to detail.
- Proven analytical and problem-solving abilities.
- Ability to effectively prioritize and execute tasks in a high-pressure environment.
- Motivated self-starter, quick learner, organized, and reliable.
- Ability to work both independently and in a group setting.
- Exceptional customer service orientation.
- Ability to absorb and retain information quickly.
- Ability to present ideas in user-friendly language to non-technical staff and end users.
- Ability to work after-hours as required for incident response.