Hi all!

I have a nice setup with some containers (rootless Podman) and bare metal services (anything I can install on bare metal usually goes on bare metal).

I used Monit in the past to keep an eye on my services and automatically restart anything that goes down for any reason. I stopped using Monit because it doesn't scale well on mobile browsers and it's frankly clumsy to configure.

I could go back to Monit, I guess, but I am wondering if there is anything better out there to try.

A few requirements (not necessarily mandatory, but preferable):

  • Open source (ideally true open source, not just commercial solutions with dumbed-down free versions)
  • Not limited to, or focused on, containers (no Watchtower and similar)
  • For containers, it can just support “works” or “restart”
  • For containers, if it goes beyond the minimum “works” and “restart”, it must support Podman
  • Must support bare metal services (status, start, stop)
  • Must send email or other kinds of notifications (IM notifications are OK, but email is preferred)
  • Should additionally monitor external machines (e.g. other servers on the LAN) or generic IP addresses
  • Should detect if a web service is alive but blocked
  • No need for fancy GUIs or a web GUI (it's a plus, but not required)
  • No need for data reporting, graphs, and similar amenities. They are a plus, but 100% not required.

What do you guys use?

  • asap@lemmy.world
    2 days ago

    For Podman, you don’t need anything other than Podman itself to monitor and restart failed containers:

    podman-compose --podman-run-args='--health-on-failure=restart' up -d
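
Note that `--health-on-failure` only has something to react to if the container defines a health check. A rough sketch for a single container outside of compose; the function name `run_with_healthcheck`, the `PODMAN` override, and the interval and retry values are illustrative assumptions, not part of the original setup:

```shell
#!/usr/bin/env bash
# Sketch: start a container with a health check so that
# --health-on-failure=restart has something to act on.
# Set PODMAN=echo for a dry run that only prints the arguments.
PODMAN="${PODMAN:-podman}"

# Arguments: container name, image, health command (run inside the container).
run_with_healthcheck() {
    local name="$1" image="$2" health_cmd="$3"
    "$PODMAN" run -d \
        --name "$name" \
        --health-cmd "$health_cmd" \
        --health-interval=30s \
        --health-retries=3 \
        --health-on-failure=restart \
        "$image"
}
```

It could be called as `run_with_healthcheck web <image> 'curl -fsS http://localhost:80/ || exit 1'`, assuming the image ships `curl`. A health command that makes a real HTTP request is also one way to catch a web service that is alive but no longer answering.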
    

    For anything else I use https://healthchecks.io/
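
For bare metal services, a check along these lines can run from cron or a systemd timer. This is only a sketch: the function names and the `HC_BASE` variable are made up for illustration, and the check UUID is a placeholder you get when creating a check. It relies on the hosted ping endpoint `https://hc-ping.com/<uuid>`, with `/fail` appended to signal a failure.

```shell
#!/usr/bin/env bash
# Sketch of a "report in or get alerted" check for a systemd service.
# HC_BASE is the ping endpoint; override it for a self-hosted instance.
HC_BASE="${HC_BASE:-https://hc-ping.com}"

# Build the ping URL: the plain URL signals success, "/fail" signals failure.
ping_url() {
    local uuid="$1" status="$2"
    if [ "$status" -eq 0 ]; then
        printf '%s/%s\n' "$HC_BASE" "$uuid"
    else
        printf '%s/%s/fail\n' "$HC_BASE" "$uuid"
    fi
}

# Check a systemd unit and report the result. A failed ping is ignored on
# purpose: a missing ping is itself what triggers the notification.
report_service() {
    local unit="$1" uuid="$2" status=0
    systemctl is-active --quiet "$unit" || status=$?
    curl -fsS -m 10 --retry 3 -o /dev/null "$(ping_url "$uuid" "$status")" || true
}
```

Usage would be something like `report_service nginx.service <your-check-uuid>` every few minutes, with the check's expected period set a bit longer than the timer interval.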

    • mosiacmango@lemm.ee
      2 days ago

      If you’re using Podman quadlets, this config in the systemd unit file does the same:

      [Service]
      Restart=always

      • asap@lemmy.world
        2 days ago

        It’s just what I use, as I’m specifically looking for something that only notifies me when things aren’t able to report due to failure. It’s free for 20 checks, which is more than enough for me.

        If I were hosting it myself, I wouldn’t know if my own notification system had failed (since it wouldn’t be able to report its own failure).