flespi noc (eu)
flespi eu region NOC
Dear flespi users!

The last downtime was caused by an outage of a backbone network link between Groningen and Amsterdam. Traffic was switched to a backup link automatically, but it took some time to re-route all of it.
Situations like this are outside our control.

Nevertheless, sorry for any inconvenience.
#eu: downtime started, error: Failed to perform https://flespi.io/gw/xxx GET request. Usually this indicates either flespi telematics hub REST API overload or when the hub is in the maintenance mode.
#eu: downtime ended, period: 55 second(s)
Dear flespi users!

The last downtime was caused by a DDoS attack on our datacenter's network. The automatic protection system mitigated the attack, but the uplink was saturated for a short period of time.
We're still monitoring the situation.

Sorry for any inconvenience.
Dear flespi users!

For your information, within one hour from now our SRE team will perform a maintenance operation on the database cluster, switching it from the master node to the slave node and back with adjusted operating parameters. This may cause minor delays or, in some cases, failures of your PUT/POST/DELETE REST API calls. We expect these delays (during each database master transition) not to exceed 10-30 seconds per transition.
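
For clients that want to ride out a short master transition like the one described above, one option is to retry failed write calls with increasing pauses. The following is only a minimal sketch of that idea, not an official flespi recommendation: `retry_with_backoff` and its parameters are hypothetical names, and the defaults are an assumption chosen so that the total wait (1+2+4+8+16 = 31 seconds) just covers the expected 10-30 second window.

```python
import time


def retry_with_backoff(call, attempts=6, base_delay=1.0, sleep=time.sleep):
    """Run call() until it succeeds or the attempts are exhausted.

    Pauses base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    `call` is any zero-argument function that raises on failure
    (hypothetical: e.g. a wrapper around your own REST client).
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:  # narrow this to your client's transient errors
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Note that blindly retrying a POST can create duplicates if the first request was actually applied, so this pattern is safest for calls that are idempotent on your side.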
Dear flespi users!

Information below only applies to users of flespi analytics and does not affect other flespi subsystems.

Tomorrow we will finish our transition to microsecond granularity: the last remaining subsystem in flespi, analytics, will start basing its calculations on microsecond timing. This may affect your consumers if they still expect the values of the "begin", "end" and "duration" counters to be integers.

Once the analytics system is upgraded, new and updated intervals will be generated with microsecond timing granularity, while old intervals will remain as they are.

Please check this post for more information: https://forum.flespi.com/d/16-changelog-flespi-analytics/30
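
A consumer that has to handle both old (integer) and new (fractional) intervals could normalize the three counters before using them. This is a minimal sketch under stated assumptions: `normalize_interval` is a hypothetical helper, intervals are assumed to arrive as already-parsed JSON objects, and the counters are assumed to be numeric values in seconds. The linked changelog post is the place to confirm the actual format.

```python
def normalize_interval(interval):
    """Return a copy of an interval with timing counters coerced to float.

    Assumption: "begin", "end" and "duration" are numeric values that may
    be integers (old intervals) or fractional (new intervals).
    Other fields are passed through unchanged.
    """
    out = dict(interval)
    for key in ("begin", "end", "duration"):
        if key in out:
            out[key] = float(out[key])
    return out
```

Treating all three counters as floats everywhere avoids branching on interval age and keeps code such as `end - begin` working for both kinds of intervals.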
#eu: downtime started, error: Failed to perform https://flespi.io GET request. Usually this indicates either flespi eu datacenter network uplink connection problem or when the platform is in the maintenance mode.
#eu: downtime ended, period: 25 second(s)
#eu: downtime started, error: Failed to perform https://flespi.io/gw/xxx GET request. Usually this indicates either flespi telematics hub REST API overload or when the hub is in the maintenance mode.
#eu: downtime ended, period: 14 second(s)
#eu: downtime started, error: Failed to perform https://flespi.io GET request. Usually this indicates either flespi eu datacenter network uplink connection problem or when the platform is in the maintenance mode.
#eu: downtime ended, period: 17 second(s)
#eu: downtime started, error: Failed to perform https://flespi.io GET request. Usually this indicates either flespi eu datacenter network uplink connection problem or when the platform is in the maintenance mode.
#eu: downtime ended, period: 18 second(s)
#eu: downtime started, error: Failed to connect to flespi MQTT Broker. Usually this indicates either the problem with MQTT Broker itself or when the Broker is shut down for the maintenance.
#eu: downtime ended, period: 14 second(s)
Dear flespi users!

We're observing uplink fluctuations at our datacenter right now and are investigating the case.
We now have a report and an explanation of what happened during the night.

To make a long story short, it was a misconfiguration inside the routing platform of our uplink provider: it failed to correctly detect a flapping link between two routers, which impacted the automatic management of CARP / VRRP. We experienced periodic packet loss for approximately 40 minutes, until the engineer on site detected the problem, disabled the automation and switched the routing to the correct link.

The misconfiguration is being fixed now, and similar situations should be avoided in the future.

We apologize for the problems that occurred and wish you a very good weekend!
Dear flespi users,

Please be informed that we will perform planned database maintenance today, within the next 10 minutes.
We do not expect a significant impact on platform operation, except for a short slowdown of the analytics engine and slower item creation or modification.

Sorry for any inconvenience.