Unexpected Service Degradation During Scheduled Maintenance

Incident Report for Appmixer

Resolved

This incident has been resolved.
Posted Nov 25, 2025 - 12:30 CET

Monitoring

We are pleased to confirm that all production tenants have been successfully restored and services are now operational.
We are now addressing any remaining individual tenant-specific concerns through our standard support channels. Our engineering team continues to actively monitor all systems to ensure stability.
We will transition this incident to monitoring status and plan to fully resolve it within the coming days, provided no new issues arise.
Posted Nov 20, 2025 - 11:25 CET

Update

Recovery work continued as planned yesterday. We deployed an additional tenant and made significant progress on the Authentication Hub migration to AWS infrastructure. Module deployment is ongoing and will continue today, followed by testing before the switchover.
The status remains consistent with our previous update - nearly all tenants are operational, with final deployment work continuing for the remaining tenants.
Posted Nov 19, 2025 - 07:18 CET

Update

Recovery work is resuming today following the weekend. The status remains consistent with our Friday update - nearly all tenants are operational, with final deployment work continuing for the remaining tenants pending DNS and database configuration.
We will continue with Authentication Hub migration and version 6.2.2 deployment this week.
Posted Nov 18, 2025 - 13:22 CET

Update

Recovery progress continues as planned. Nearly all tenants have been successfully restored and are operational.
We are completing deployment of the remaining tenants, pending DNS configuration confirmation for several tenants. We are also migrating Authentication Hub to AWS infrastructure and implementing a Logstash fix.
Posted Nov 14, 2025 - 09:47 CET

Update

Nearly all tenants running on version 6 have been successfully restored and are operational. Our team is actively working on the final remaining tenants.
Today, we will deploy version 6.2.2 to all restored tenants to address a critical logging infrastructure issue. This will be performed with minimal service impact.
Recovery work continues for the remaining tenants running on version 5. We will provide updates as this work progresses.
Posted Nov 13, 2025 - 09:50 CET

Update

Authentication Hub is restored and functional. We will proceed with recovering the remaining tenants.
Posted Nov 12, 2025 - 09:55 CET

Update

We have deployed the first set of tenants and are now monitoring their operation, fine-tuning resources and limits to ensure smooth performance. Once validated, we will proceed with deploying the remaining tenants.
Posted Nov 11, 2025 - 17:47 CET

Update

Tenant restoration has begun, and the first tenants are currently in progress. Our team is working through the restoration process, including database restoration, DNS configuration, and SSL certificate setup. Once the first tenants are fully restored and verified, we will continue systematically with the remaining tenants.
Posted Nov 11, 2025 - 13:02 CET

Update

New AWS infrastructure has been prepared over the weekend and is now ready for tenant restoration. We are completing final testing today to ensure stability before beginning the restoration process. Tenant restoration will begin today in a phased approach, starting with tenants that have simpler configurations, then proceeding systematically through all remaining tenants. We're rebuilding Docker images in AWS ECR, with versions 6.2.1, 6.1.9, and 6.1.8 already available, and other versions to follow. Work on recovering the Ceph cluster continues in parallel.
Posted Nov 11, 2025 - 09:29 CET
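
The phased approach described in the update above (simpler configurations first, then the remaining tenants in order) can be illustrated with a minimal sketch. This is not the tooling used during the recovery: the `Tenant` type, the `complexity` score, the tenant names, and the batch size are all hypothetical and exist only to show the ordering idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    name: str
    complexity: int  # hypothetical score: higher means more custom configuration


def plan_phases(tenants, batch_size):
    """Order tenants from simplest to most complex and split them into batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Sort by complexity first, then by name so the order is deterministic.
    ordered = sorted(tenants, key=lambda t: (t.complexity, t.name))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Hypothetical tenants, used only for illustration.
tenants = [
    Tenant("tenant-c", 3),
    Tenant("tenant-a", 1),
    Tenant("tenant-d", 5),
    Tenant("tenant-b", 1),
]

phases = plan_phases(tenants, batch_size=2)
for number, batch in enumerate(phases, start=1):
    print(number, [t.name for t in batch])
```

Restoring in small batches ordered this way would let each phase be verified before more complex tenants are attempted.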

Update

Preparation work to restore hosted customers continues. In the meantime, the new AWS ECR registry is ready with the latest version (older versions will be restored later). Access credentials will be sent to all self-managed customers shortly.
Posted Nov 10, 2025 - 13:15 CET

Update

We prepared a new AWS cluster over the weekend to restore hosted customers running on version 6 from backups. Restoration will take place during the day. Our team is also continuing work on the full recovery of the Docker image registry. Additionally, we’re working toward restoring Authentication Hub later today, though this depends on how the remaining recovery tasks progress.
Posted Nov 10, 2025 - 10:00 CET

Update

Recovery of the Ceph storage system has proven more complex and slower than anticipated. We are therefore executing a contingency plan to recreate the cluster in an alternative environment and restore all tenants from backups. The new cluster has been deployed, and we are currently validating the tenant recovery process.
Posted Nov 09, 2025 - 17:58 CET

Update

About 19% of OSDs are left to recover. Ceph is slowly recovering. Work on recovering our registry continues.
Posted Nov 08, 2025 - 10:45 CET

Update

76% of OSDs are up. Once the remaining OSDs are back, the Ceph cluster will need to rebalance.
Posted Nov 07, 2025 - 22:03 CET

Update

Over 60% of OSDs restored. Restoration of Docker images in progress.
Posted Nov 07, 2025 - 17:02 CET

Update

We’re still working on the broken OSDs — it’s taking longer than expected to get them all back online. In the meantime, we’ve also started fixing the issues with the Docker registry.
Posted Nov 07, 2025 - 14:30 CET

Update

We’ve fixed the root cause affecting our Ceph cluster, and the cluster is now being restored. The issue turned out to be a bug in Ceph, where the Python sub-interpreter used by the mgr modules wasn’t loading plugins correctly.
In the meantime, our team has fixed the manager side and is working on restoring the broken OSDs. We expect this to take the next 2-3 hours.
Posted Nov 07, 2025 - 11:13 CET

Update

We’re currently experiencing a partial service outage affecting some tenants following our scheduled maintenance. Unfortunately, a few unexpected issues have made recovery take longer than planned.

Our engineering team is fully engaged and working hard to restore all services as quickly as possible. We’re making progress, but it may take more time before everything is back to normal.

We’re very sorry for the ongoing disruption and truly appreciate your patience and understanding while we work to resolve this. We’ll continue to share updates as soon as we have more information.
Posted Nov 07, 2025 - 08:56 CET

Identified

During ongoing scheduled maintenance, we identified unexpected issues impacting service availability for some tenants. While the maintenance activities were still in progress, several tenants began experiencing partial outages and degraded performance.

Our engineering team is actively investigating the root cause and working to restore full functionality as quickly as possible. Further updates will be provided as more information becomes available.
Posted Nov 06, 2025 - 13:15 CET
This incident affected: Appmixer Hosted (Studio, Engine, API, Backoffice), Appmixer Image Registry, and Authentication Hub.