Critical Incident Management Lead in Jackson, MS at APEX Systems

Date Posted: 8/2/2018

Job Snapshot

Job Description

Job #:  872762

Hudson’s Bay Company is one of the fastest-growing department store retailers in the world. In North America, HBC’s leading banners include Hudson’s Bay, Lord & Taylor, Saks Fifth Avenue, Gilt, Saks OFF 5TH, Find @ Lord & Taylor, and Home Outfitters. In Europe, HBC’s banners include GALERIA Kaufhof (the largest department store group in Germany), Galeria INNO (Belgium’s only department store group), and Sportarena. At HBC we are a company of adventurers who explore uncharted territory, challenge convention, and work with imagination and fun.


Reporting Relationship: The Lead Analyst is a leadership role within the HBC IT Service Desk team reporting to the Manager, IT Service Desk.
Lead Analyst / Critical Incident Mgmt Lead



The primary purpose of this role is to oversee the mitigation of high-priority incidents as quickly as possible, with the goal of minimizing impact to our callers and customers. The Incident Mgmt Lead will apply agile project management techniques to facilitate technology incidents, coordinate and run bridge calls, and engage the correct resources to achieve this goal. The best candidates for the role have a strong technical comprehension of incident response, can command a meeting and a crisis, work well with others, and bring strong verbal and written communication skills, a sense of diplomacy, the ability to anticipate obstacles, and the decision-making skills to handle the fast-paced world of incidents.



• Assume primary responsibility for monitoring and assessing potential impact of local and global events for users and customers

• Prepare notifications and status updates on all incidents for senior internal leadership while managing SLAs.

• Identifies major incidents and escalates via the Critical Incident Management (CIM) Process. Ensures all Major Incident Process guidelines are followed and all Service Level Guidelines are met during Major Incidents.

• Proactively escalate impacting events, establish and facilitate bridge calls, engage resolvers, and coordinate the resolution.

• Takes a command-and-control role as Incident Manager during critical incidents, focusing on minimizing MTTR.

• During a critical event, be responsible for logging and reporting on the timeline of activities and responses.  Keep accurate record of incident timelines and supporting artifacts.

• Works to develop, maintain, and report on SLAs set by the company, ensuring that services are delivered within the defined thresholds.

• Act as an escalation point, ensuring coordination of resolving parties, and effective communication to stakeholders, along with post incident review. 

• Ensure post-incident activities, such as incident summary reports and reason for outage reports, are managed through to delivery.

• Identifies, evaluates and executes preventive measures to minimize/avoid impact to the consumer experience. 

• Interfaces with vendors to ensure appropriate resolution during network outages or periods of reduced performance.

• Execute troubleshooting steps and create incident documentation with proper technical details. Review incident data to ensure the completeness and quality of the information collected. Clearly and concisely communicate incident details, both verbally and in writing, in the form of Incident communications.

• Review of the execution of the incident process to identify opportunities for improvement in the process (missed SLAs, gaps in execution or response).  Documentation of and revision to existing processes, with the end goal of improving quality of services offered to customers by technology teams. 

• Create documentation for Service Desk and Command Center Analysts.

• Monitor and oversee incoming call service levels for technical support through the IT Service Desk.



• Provide support during incidents such as natural disasters, local emergencies, power outages, extreme weather.

• As necessary, handle inquiries from employees regarding process and procedure to resolve or escalate technical support requests.

• Spearhead day-to-day issue resolution and escalations, and ensure incidents are managed to resolution.

• Responsible for mentoring and helping the team establish best practices.

• Serve in the role of team subject matter expert for company tools in order to provide training for peers and management.

• Assist with Project Management when required and develop a strong understanding of how other IT projects may impact the company’s infrastructure.

• Act as an IT ambassador, working with other departments to communicate IT matters and build strong relationships with business owners.

• Attends meetings regarding IT projects, representing the Service Desk team as an ambassador.

• Reviews proposed changes and represents the Service Desk team during Change Board meetings.

• May perform some third-level support of issues in the infrastructure environment, leveraging existing knowledge or support from vendors, including off-hours support when needed.

• Draft communications, assessments, and reports that may be both internal and customer facing, to include leadership and executive management.

• Maintains network security and ensures compliance with security policies and procedures.

• Provides advice and training to end-users.

• Troubleshoots and resolves complex problems.

• Maintains current knowledge of relevant hardware and software applications as assigned.

• Participates in special projects as required.

• Problem-solving and trouble-shooting skills

• Foundational understanding of technology, including infrastructure, and business applications (built on infrastructure).

• Ability to work, adapt, and lead in a high-pressure, fast-paced environment

• Excellent analytical and interpersonal skills

• Displays a strong sense of time management and accountability

• Conceptual knowledge of troubleshooting methodologies in IT and network infrastructures (preferred)

• Record and classify received Incidents and undertake an immediate effort in order to restore a failed IT Service as quickly as possible

• Assign unresolved Incidents to appropriate Tier 2 Support Group

• Log all Incident/Service Request details, allocating categorization and prioritization codes

• Keep users informed about their Incidents’ status at agreed intervals

• Associate Incidents with other records (e.g., Incidents, Changes, Problems, Knowledge Articles, Known Errors)

• Ability to operate under stress in a fast-paced environment.

• Equally comfortable working independently or collaboratively on a project, often under compressed timelines.

• Strong organizational and multitasking skills.

• Highly responsive and proactive, able to own tasks from start to finish.

• Maintain high level of attention to detail.

• Previous knowledge or strong desire to learn about crisis management issues

• Ability to help troubleshoot a variety of technical problems and own issues through resolution while sometimes needing to engage others for assistance

• Strong understanding of client/server fundamentals and concepts

• Requires advanced understanding of technical concepts, very strong communication skills both verbal and non-verbal as well as the ability to function as an independent worker



• Proficiency with standard office computer and web applications (e.g., Microsoft or Google office suites).

• 4-6 years of experience or Bachelor’s degree in computer science or a related field

• Strong understanding of Information Technology

• Knowledge of technology SLAs and OLAs, as well as technology governance, risk, and compliance

• Strong interpersonal skills, including collaboration and analytical thinking

• Ability to communicate with both onshore and offshore teams

• Ability to work an 11:00 AM to 8:00 PM schedule

• Be available to assist team after hours if needed, be on call and work nights and weekends as required.



• 3-5 years’ work experience in Incident Management.

• Infrastructure support and management experience in mainframe, distributed systems, network, and storage.

• Extensive experience with Microsoft Office Word, Excel, and PowerPoint is preferred.

• ITIL Certification for Incident, Problem, and Change Management is preferred.

• 2+ years of experience working with an IT Infrastructure Library (ITIL) framework

• 3+ years of experience providing first level support for enterprise IT systems in a corporate setting, including diagnosing, troubleshooting, and resolving incidents

• 3+ years of experience in software installation and support

• 2+ years of experience writing IT technical documentation

• 2+ years of work experience in a business role requiring interaction with senior leadership




• Work indoors during all seasons and weather conditions

• Limited local and/or remote travel required

• Standard business dress is required


The software platforms used to execute the projects are:

Software: Jira, Confluence, Slack, HipChat, Jabber, Stash / Bitbucket, Aspect, PagerDuty, ScienceLogic EM7, New Relic, Splunk, Microsoft SCOM, Solarwinds, UAD.


If you are interested in this position, contact Kedric Bailey at 205-623-1115

or email










EEO Employer

Apex is an Equal Employment Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at 844-463-6178.