Date of the incident: July 18th, 2022 17:32 CET


Duration: 15 calendar hours


Affected services: Batches sent manually to the device from the UI were not reaching the DFE.


Issue timeline:

  • July 18th - 17:32 - A UI update is released, intended to optimize the UI load on the system.
  • July 18th evening - Several customers report being unable to submit batches to the presses. Troubleshooting starts by validating agent connectivity.
  • July 18th at midnight - The UI release is identified as coinciding with the first issue reports; however, the team still cannot reproduce the issue.
  • July 19th early morning - The issue is reproduced within HP. Connectivity is ruled out as the cause.
  • July 19th morning - New reports confirm that automatic batch send to press works fine; only manual send to device fails to reach the press. The UI release is rolled back.
  • July 19th afternoon - One customer has to reinstall the Local agent in order to have JMF reach the press.

 

Root cause:

A UI update to the system inadvertently affected super-agent connectivity, preventing some files from being submitted to devices.

 

The change was introduced together with a large number of other changes, making it more difficult to identify.

 

Path forward:

 

In line with the continuous development model, the release process is being tightened by making sure changes are broken into smaller incremental pieces.

 

The QA plan for UI changes is being reviewed.

 

Customer communication needs to be improved, in particular when issues affecting multiple customers arise.

 

 

Terms/Glossary

  • “Maintenance Event” means maintenance of the Services that requires their interruption;
  • “Scheduled Maintenance” means a Maintenance Event in respect of which HP has given the Customer at least twenty-four (24) hours prior written notice;
  • “System degradation” means that the Customer is unable to use Site Flow as usual, so their business is being impacted, but the situation is not yet an Outage;
  • “Incident” means any set of circumstances resulting in an Outage;
  • “Outage” means that the Customer is unable to access all parts of the Site Flow Subscription service via both API and web-browser log-in, AND all transmitted orders directed to the Customer’s Site Flow account are not being acknowledged (i.e. the entire Site Flow service is “down”);
  • “Working Hour” means the hours from 09:00 to 17:00 local time, Monday through Friday, excluding national and HP-designated holidays;
  • “Calendar hours” are regular full-day hours covering the full 24x7 period. Correspondence with working hours will depend on the actual customer timezone.
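
To illustrate how calendar hours relate to working hours, the sketch below counts how much of a given interval falls inside Monday through Friday 09:00-17:00. It is only an illustration under stated assumptions: the function name `working_hours_between` is hypothetical, times are treated as naive local times in the customer's timezone, and national and HP-designated holidays are ignored.

```python
from datetime import datetime, time, timedelta

# Working-hour window from the glossary: Monday-Friday, 09:00-17:00 local time.
WORK_START = time(9, 0)
WORK_END = time(17, 0)


def working_hours_between(start: datetime, end: datetime) -> float:
    """Return the hours of [start, end) that fall inside working hours.

    Assumptions: start and end are naive datetimes in the customer's local
    time, and holidays are NOT excluded (no holiday calendar is modeled).
    """
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            window_start = datetime.combine(day, WORK_START)
            window_end = datetime.combine(day, WORK_END)
            overlap_start = max(start, window_start)
            overlap_end = min(end, window_end)
            if overlap_end > overlap_start:
                total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600


# A 15-calendar-hour interval starting Monday, July 18th, 2022 at 17:32 local time.
start = datetime(2022, 7, 18, 17, 32)
end = start + timedelta(hours=15)
print(working_hours_between(start, end))

# The same 15 calendar hours seen from a timezone six hours behind.
shifted_start = start - timedelta(hours=6)
shifted_end = end - timedelta(hours=6)
print(working_hours_between(shifted_start, shifted_end))
```

The same 15 calendar hours can therefore map to a different number of working hours depending on where the customer is located, which is why the two measures are defined separately.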