Thursday, November 25, 2010

Zenoss 3.0.3 Disappointments

Lured by the sleek-looking new Zenoss interface in the 3.x series, and the availability of pre-compiled RHEL/CentOS rpms, I decided to give it a go.  This turned out to be a terrible waste of time and effort.

Installation
The rpm by default created a user zenoss (UID 3117) and group zenoss with bash as shell.  I did not like the fact that the rpm did not add the user as a system user, and decided to change it.  This turned out to be a terrible mistake!  Zenoss developers, for some very strange reason, have decided to hard code the database name, db user, db password, zope password, username, UID, and group in various files (python & shell scripts).

I should have stopped when I found myself grep'ing through all the files looking for these hard coded values, but I was on a fool's mission.

Once I finished editing all the hard coded values, Zenoss refused to start with a very confusing error message: "This account is currently not available."  It turned out that the zenoss user must have a valid interactive login shell (bash in this case).

Once I got Zenoss started, it spit out a lot of errors about a missing librrd.so.4.  I found the file in /opt/zenoss/lib and added that directory to my ld.so.conf file, which resolved the issue.  I am not sure why the rpm did not set this up.
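
For anyone hitting the same error, here is a small sketch of how one might check whether the dynamic linker can resolve librrd before starting the daemons.  It uses only Python's standard ctypes.util module; the helper name find_rrd is my own invention and is not part of Zenoss.

```python
import ctypes.util


def find_rrd():
    """Return the name the linker resolves for librrd, or None if it is not found."""
    # find_library searches the system's default library locations,
    # so a library that only lives in /opt/zenoss/lib will not show up
    # until that directory is known to the linker.
    return ctypes.util.find_library("rrd")


if __name__ == "__main__":
    found = find_rrd()
    if found:
        print("linker resolves librrd as:", found)
    else:
        print("librrd is not on the default linker path")
```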

As per the zenoss documentation, I proceeded to install the zenoss-core-zenpacks.  It turned out to be another failure because it too had the user and group hard coded.  It refused to install until I renamed my user and group from 'monitor' to 'zenoss'.

Next, all my attempts to access the portal resulted in very strange behavior.  The page kept reloading non-stop.  I was expecting some sort of a user setup wizard, but no such luck.  All attempts to guess the password failed.

Uninstallation
At this point, any normal person would call it a night and revisit the issue after getting some sleep.  Not I!  I decided to remove all my changes and accept the defaults set by the rpm.  I made the mistake of deleting the user and group I had created (monitor:monitor) before uninstalling the zenoss and zenoss-core-zenpacks rpms.  Re-adding the user turned out to be insufficient: to uninstall the rpm, you have to add the user and group back with the same name and uid/gid.  However, the ordeal was not over yet.  It turns out that to uninstall the zenoss-core-zenpacks, you must have zenoss running and a particular python process listening on TCP/8100.

Finally, I managed to uninstall the whole mess and decided to start fresh with all the rpm defaults.  Accepting all the rpm defaults resulted in success and I was finally greeted with a functioning monitoring portal.

Reflection
I decided to call it a (very unsatisfactory) night.  I woke up this morning and couldn't help but wonder whether Zenoss is worth using.  If the installation, customization and uninstallation could be so poorly engineered, I can only imagine what issues might crop up in the future.  I have decided to uninstall Zenoss and continue my search for a monitoring solution.

Feedback
This post would not be constructive criticism if I did not offer up some solutions, so here are my 2 cents:

  1. Please don't hard code configuration values in multiple files.  Why not designate one python file which is referenced by all other python scripts (pre_init, post_init, upgrade etc.)?  Similarly, don't hard code configuration values in the init.d script; a better option would be to create a /etc/sysconfig/zenoss file.
  2. When adding a user, please add it as a system user since the role of the user is to run system services.  Also, there is no reason to assign this user a home directory in /home.  /opt/zenoss will be a more appropriate choice.
  3. I am not a fan of running services with an interactive login shell such as bash due to security reasons.  I did not dig deep enough to determine if this was a necessity, but I am sure if daemons like Apache can run without an interactive shell, so can a simple monitoring application.  I think /sbin/nologin is an appropriate choice.
  4. Please reference the user added by the rpm consistently.  I noticed some references were using UID, while others were using username 'zenoss'.
  5. Newer PCI compliance standards require us to not accept vendor defaults when it comes to settings such as passwords.  So it is very important to make it easy for a system administrator to change settings such as database user, database password etc.  I think web applications such as Drupal have done a fantastic job with their installation wizards.
  6. Please update the zenoss-core-zenpacks rpm to check whether the zenoss service is running and, if it is not, proceed with the uninstallation anyway.  If the service is necessary for uninstallation, then either consider starting it or adding such a dependency to the '%preun' section in the rpm spec file.
  7. Please add a zenoss library config to /etc/ld.so.conf.d folder.
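
To illustrate point 1, here is a minimal sketch of what a single shared settings module could look like.  Everything in it is my assumption rather than anything Zenoss ships: the key names (ZENOSS_USER, ZENOSS_GROUP, ZENOSS_HOME), the KEY=VALUE file format, and the function names are all hypothetical.  The idea is simply that the pre_init, post_init and upgrade scripts would all call load_settings() instead of carrying their own copies of the values.

```python
import os

# Hypothetical defaults; these key names are illustrative only.
DEFAULTS = {
    "ZENOSS_USER": "zenoss",
    "ZENOSS_GROUP": "zenoss",
    "ZENOSS_HOME": "/opt/zenoss",
}


def parse_settings(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop optional surrounding quotes, as is common in sysconfig files.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def load_settings(path="/etc/sysconfig/zenoss"):
    """Return the defaults, overridden by whatever the admin set in the file."""
    settings = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path) as handle:
            settings.update(parse_settings(handle.read()))
    return settings
```

With something like this in place, renaming the service account would mean editing one line in /etc/sysconfig/zenoss instead of grep'ing through the whole install tree.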

Cheers & Happy Thanksgiving!
VVK

1 comment:

  1. VVK,

    Very sorry to hear that the Zenoss 3.0.3 install did not go well for you. On our download page, we try to provide multiple install options (stack installers, virtual appliances, native packages, etc.) in order to make getting up and running as easy as possible - http://community.zenoss.org/docs/DOC-3240?noregister .

    That being said, you brought up some great points on how we can make the process better. I plan on taking the feedback to our engineering team on Monday to see what we can do better in the future.

    I am very sorry to hear this first impression left such a bad taste that you decided not to continue with your test drive. Our goal is to make Zenoss the best monitoring solution out there and we have done a lot of engineering work in the last release to continue in that direction.

    One of the best parts of working with Zenoss is people like you giving us feedback and being part of the Zenoss community (http://community.zenoss.org).

    If there is anything I can do please don't hesitate to email me - josh at zenoss.com

    Thanks again for the feedback.

    Josh Duncan
