Hey all, thanks for checking out the bot. Looks like it has an issue and is not responding to status checks. If you've subscribed to nodes, updates may or may not be working. I'll debug as soon as I can and send another update
Issue appears to be resolved. Please let me know if you see any other problems
Hey everyone, I've tuned the bot a bit to help reduce the number of offline reports when your node is in fact up. Let me know if you observe more reliable behavior after this change
Bot is now waiting for two successive failed ping attempts before notifying that a node is down. This seems to produce much more useful output. If you still feel like you're getting too many notifications, please let me know and I'll prioritize an update to give you some control over this
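For anyone curious how "two successive failed pings" can work, here's a rough sketch. This is not the bot's actual code: the `DownDetector` class, its method names, and the threshold constant are all made up for illustration.

```python
from collections import defaultdict

# Assumed threshold, matching "two successive failed ping attempts".
FAILURES_BEFORE_ALERT = 2


class DownDetector:
    """Tracks consecutive ping failures per node (hypothetical helper)."""

    def __init__(self, threshold=FAILURES_BEFORE_ALERT):
        self.threshold = threshold
        self.failures = defaultdict(int)  # node id -> consecutive failed pings
        self.reported_down = set()        # nodes already reported as down

    def record(self, node_id, ping_ok):
        """Record one ping result; return "down", "up", or None (no alert)."""
        if ping_ok:
            self.failures[node_id] = 0
            if node_id in self.reported_down:
                self.reported_down.remove(node_id)
                return "up"
            return None
        self.failures[node_id] += 1
        already_reported = node_id in self.reported_down
        if self.failures[node_id] >= self.threshold and not already_reported:
            self.reported_down.add(node_id)
            return "down"
        return None
```

A single dropped ping returns `None`, so it never triggers a notification. Only the second failure in a row does, and the node is reported as up again on the next successful ping.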
An issue with fetching new node details was reported and resolved today. This should not have affected any existing subscriptions
Please note that the bot is currently offline and not sending alerts. The testnet VM hosting the bot was decommissioned for reasons I am still investigating and unfortunately all subscription data appears to be lost. Bot will come back online once I have a plan to make it more resilient with some kind of backup system
Thanks for your patience and for being an early tester
I am pleased to announce that the bot is back online. Please resubscribe to your nodes to start receiving alerts again. Quick recap of issue and fix:
1. Billing issue on testnet caused bot's VM to get decommissioned and all data was lost
2. Bot now uses an external backup that syncs data on every update
3. I'll experiment with running a second copy as fallback in case of outages
New release today removes the bot's dependency on Yggdrasil and uses the Grid Proxy server to check status instead. This should mean that status reports and alerts match what's seen in the explorer. Please report any issues you see with the new version to @scottyeager. Thanks!
Thanks for all reports that subscriptions have not been reliable since the switch to Grid Proxy. Changes deployed today should correct this.
If you prefer the old behavior, the bot also now supports a /ping command to check node status with pings over Yggdrasil. Offering ping based subscriptions again with a separate command is an option I'm considering. Let me know if this is of interest to you
A few updates over the last months:
* The status and ping commands now report the outcome for all subscribed nodes, if no node id is given
* The bot was printing excessive logs and filled up its (rather limited) available disk space. This interrupted operation briefly but was quickly resolved
* A bug affecting some early users of the bot who had not interacted with the bot for a while has been fixed. The bot will respond to these users again
* As of today, the bot now behaves properly when multiple users subscribe to the same node. Previously, only one user would get the alert (not necessarily the same user each time)
* Requesting the status of an invalid node id now gives an error rather than just not responding
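To illustrate the multi-subscriber fix above, here's a minimal sketch of how alerts can reach every subscriber of a node. This is not the bot's real code: `Subscriptions`, `alert_all`, and the `send` callback are hypothetical names.

```python
from collections import defaultdict


class Subscriptions:
    """Maps each node id to the set of chats subscribed to it (hypothetical)."""

    def __init__(self):
        self._by_node = defaultdict(set)  # node id -> set of chat ids

    def subscribe(self, chat_id, node_id):
        self._by_node[node_id].add(chat_id)

    def unsubscribe(self, chat_id, node_id):
        self._by_node[node_id].discard(chat_id)

    def subscribers(self, node_id):
        # .get avoids creating empty entries for unknown nodes
        return sorted(self._by_node.get(node_id, ()))


def alert_all(subs, node_id, message, send):
    """Call send(chat_id, message) once for every subscriber of node_id."""
    recipients = subs.subscribers(node_id)
    for chat_id in recipients:
        send(chat_id, message)
    return recipients
```

Keeping a set of chats per node, rather than a single chat per node, means every subscriber gets the alert.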
Thanks to everyone who reported issues and asked questions about expected operation. I couldn't identify and fix the problems without you!
Roadmap
Ideas for development, as feasible:
* Begin querying (and storing) uptime events from GraphQL for subscribed nodes. Evaluate data as a replacement or fallback for grid proxy
* Provide the ability to check cumulative uptime for a given period
* Add alerts for capacity changes, in case of failed hardware (suggestion from a community member, thx)
* ...?
Suggestions welcome -> @scottyeager. Thanks for tuning in
Update to the bot today, released just a bit ago, corrects a bug that was blocking subscription alerts for some users. This means you may have received some delayed alerts. Please disregard any alerts that are no longer relevant.
Some other changes:
* When using /ping or /status without a node id, to query all subscribed nodes, the lists come back separated by which nodes are up/down and sorted by descending node id
* Bot now refreshes the node's Planetary Network IP before each ping attempt, since they can sometimes change
* Better error handling, so any issues that come up while checking nodes are isolated to a single node
* Bot now sends admin alerts when errors appear in the log file. This should help with catching issues in the future
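As a rough illustration of the up/down grouping and per-node error handling described above (a sketch only, not the bot's code: `check_all`, `format_report`, and the `check` callable are hypothetical):

```python
def check_all(node_ids, check):
    """Check every node; an exception from one node does not stop the rest.

    `check` is a hypothetical callable that returns True if the node is up.
    """
    up, down, errors = [], [], {}
    for node_id in node_ids:
        try:
            (up if check(node_id) else down).append(node_id)
        except Exception as exc:  # isolate the failure to this one node
            errors[node_id] = str(exc)
    # Sort each group by descending node id
    up.sort(reverse=True)
    down.sort(reverse=True)
    return up, down, errors


def format_report(up, down):
    """Build the reply text with up and down nodes listed separately."""
    lines = []
    if up:
        lines.append("Up: " + ", ".join(str(n) for n in up))
    if down:
        lines.append("Down: " + ", ".join(str(n) for n in down))
    return "\n".join(lines)
```

If one node's check raises, the error is recorded for that node and the others are still reported.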
Thanks to everyone who's reported about their experience with the bot. If you notice any issues, please send a DM to @scottyeager or reply to a message on this channel.
Hi everyone, with the 3.9 release coming to mainnet today, there are changes that will interfere with the bot's ability to function until it gets an update
In summary:
* The bot will not alert you if a node you are subscribed to goes down (or comes back up)
* Ping function will not work
* You can still fetch the node's status using the /status command
Furthermore:
* The power saving (Wake-on-LAN) feature complicates the notion of a node being "down" or "up"
* I already wanted to overhaul the bot's code and make it nice enough to open source
Therefore, the bot will remain in its current state until I have the time and energy to bring it up to speed with the changes to the Grid. While the bot has never been an officially supported piece of ThreeFold software, I understand that many farmers depend on it and I'd like to continue maintaining it for the time being.
Thanks for your patience until the new version is ready
Hi again,
Today there's an interim release with two small changes:
* Subscription alerts are now triggered based only on the node's uptime reporting, without any dependence on ping. That means alerts should work again, with the limitation that nodes only report uptime every 40 minutes so it will take at least that long to detect that a node is offline
* The ping command is disabled and displays a message to that effect
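To show why detection takes at least 40 minutes, here's a small sketch of an uptime-report check. The 40 minute interval comes from the note above. The grace period and the `is_offline` function are assumptions for illustration, not the bot's actual logic.

```python
from datetime import datetime, timedelta, timezone

REPORT_INTERVAL = timedelta(minutes=40)  # nodes report uptime every 40 minutes
GRACE = timedelta(minutes=5)             # assumed allowance for late reports


def is_offline(last_report, now, interval=REPORT_INTERVAL, grace=GRACE):
    """Treat a node as offline once its last uptime report is overdue."""
    return now - last_report > interval + grace


now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(minutes=30)   # still within the reporting window
overdue = now - timedelta(minutes=50)  # missed a report
```

With these values, `is_offline(recent, now)` is false and `is_offline(overdue, now)` is true. A node that goes down right after reporting won't look overdue until the next report is missed.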
Work continues on the new bot, but since its release date is unknown, these changes should help a bit in the meantime.
Hello,
After receiving reports that the bot was not responding, I identified and resolved an issue related to network performance on the system where the bot was running. If you see any further issues, please don't hesitate to reach out