Unable to assign a tag in vSphere

After a long time, I’m back to dust off my blog.

Today I want to share an issue I ran into while trying to assign a tag to a datastore, so that VMware Cloud Director could use it within the same storage policy.

This is what I found when I was trying to assign the tag:

And the related pop-up, blank:

Looking at the “Tags and Custom Attributes” section, all my tags were there and available to be assigned, but none of them showed up in the pop-up window above.

Wandering through all the links and tabs, I found under Categories that the “Tags per object” field of the category my tag belonged to was set to “One tag”. This means that I can assign only one tag of that category to an object.

Changing it to “Many tags”:

made the window available and populated.

And, in the previous view, the field “Multiple Cardinality” changed to “true”:
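The behaviour can be pictured with a small model. This is only a sketch under assumptions: it supposes the datastore already carried one tag from that category (the post doesn't say so explicitly), and the class and function names are hypothetical, not part of any vSphere API.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    multiple_cardinality: bool  # False models "One tag", True models "Many tags"


@dataclass
class TaggedObject:
    name: str
    tags: dict = field(default_factory=dict)  # tag name -> category name


def assignable_tags(obj, category, tags_in_category):
    """Return the tags of `category` that could still be assigned to `obj`."""
    already = [t for t, c in obj.tags.items() if c == category.name]
    # With "One tag" cardinality, one existing tag blocks every other tag
    # of the same category, so the list to choose from is empty.
    if not category.multiple_cardinality and already:
        return []
    return [t for t in tags_in_category if t not in already]


datastore = TaggedObject("datastore01", {"gold": "storage-tier"})
print(assignable_tags(datastore, Category("storage-tier", False), ["gold", "silver"]))
print(assignable_tags(datastore, Category("storage-tier", True), ["gold", "silver"]))
```

The first call returns an empty list (the blank pop-up), the second one offers the remaining tag.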

Tagging is a very powerful tool, but it needs attention and accuracy to take full advantage of it.

VCD Cell status: “Unknown”

I’d like to share an experience I had recently when I decided to upgrade our VCD from 10.1 to 10.3.
According to VMware’s upgrade procedure, one of the prerequisites was to have the failover mode of all the cells set to “Manual” or “Automatic”.

The cell’s status from appliance console

In my case that status was “Indeterminate”. This usually happens when one of the cells is in one mode and the others are in the opposite one.
I had all the cells in manual mode, but when I sent a “GET” through the API with Postman, that mode came back as “Unknown”.
I tried to set it again with a POST, but nothing changed: still unknown.
This was a production environment, so I didn’t like the idea of redeploying that cell, even though it wasn’t the primary one.
After several VMware GSS support sessions, the last one solved it.
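As a rough illustration of how a single odd cell can drag the whole cluster into “Indeterminate”, here is a minimal sketch. The function and the cell names `vcd1b` and `vcd1c` are hypothetical, and the rule is only modelled on the behaviour described above, not taken from the appliance code.

```python
def cluster_failover_mode(cell_modes):
    """Return the overall mode for a {cell_name: mode} mapping (hypothetical helper)."""
    modes = {mode.strip().upper() for mode in cell_modes.values()}
    if modes == {"MANUAL"}:
        return "MANUAL"
    if modes == {"AUTOMATIC"}:
        return "AUTOMATIC"
    # Mixed values, or any cell reporting something else (e.g. UNKNOWN)
    return "INDETERMINATE"


cells = {"vcd1a": "Unknown", "vcd1b": "Manual", "vcd1c": "Manual"}
print(cluster_failover_mode(cells))  # prints INDETERMINATE
```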
After confirming that the DB was healthy with the commands:

root@vcd1a [ ~ ]# su - postgres
postgres [ ~ ]$ repmgr cluster crosscheck

and that the cell was in an unknown state with
/opt/vmware/appliance/bin/api/replicationClusterStatus.py

SSH console

Then I checked the file
/opt/vmware/vpostgres/10/etc/repmgr.conf

on all of the cells: for some reason, part of the expected content of that file was missing on the affected cell. Actually, it was much shorter than on the other ones. On this cell I only had these lines:

node_id=24618
node_name='vcd1a'
conninfo='host=192.168.100.101 user=repmgr dbname=repmgr'
data_directory='/var/vmware/vpostgres/current/pgdata'
pg_bindir='/opt/vmware/vpostgres/current/bin'

repmgrd_service_start_command='sudo /usr/bin/systemctl start repmgrd'
repmgrd_service_stop_command='sudo /usr/bin/systemctl stop repmgrd'

service_start_command = 'sudo /usr/bin/systemctl start vpostgres'
service_stop_command = 'sudo /usr/bin/systemctl stop vpostgres'
service_restart_command = 'sudo /usr/bin/systemctl restart vpostgres'
service_reload_command = 'sudo /usr/bin/systemctl reload vpostgres'

Many other lines were missing. Thanks to support, I added them:

repmgrd_service_start_command='sudo /usr/bin/systemctl start repmgrd'
repmgrd_service_stop_command='sudo /usr/bin/systemctl stop repmgrd'

monitor_interval_secs=2 #The interval (in seconds, default: 2) to check the availability of the upstream node.
connection_check_type=ping #The option connection_check_type is used to select the method repmgrd uses to determine
                                #whether the upstream node is available.
                                # Possible values are:
                                # ping (default) - uses PQping() to determine server availability
                                # connection - determines server availability by attempting to make a new connection to the upstream node
                                # query - determines server availability by executing an SQL statement on the node via the existing connection
reconnect_attempts=6 #The number of attempts (default: 6) will be made to reconnect to an unreachable upstream node before
                                #initiating a failover.
                                #There will be an interval of reconnect_interval seconds between each reconnection attempt.
reconnect_interval=1 #Interval (in seconds, default: 10) between attempts to reconnect to an unreachable upstream node
                                #The number of reconnection attempts is defined by the parameter reconnect_attempts
degraded_monitoring_timeout=-1 #Interval (in seconds) after which repmgrd will terminate if either of the servers (local node
                                #and/or upstream node) being monitored is no longer available (degraded monitoring mode).
                                #-1 (default) disables this timeout completely.
failover=manual
promote_command='/opt/vmware/vpostgres/current/bin/repmgr standby promote -f /opt/vmware/vpostgres/current/etc/repmgr.conf --log-to-file'
follow_command='/opt/vmware/appliance/bin/standbyFollow.py %n'
promote_check_timeout=30
primary_visibility_consensus=true
#------------------------------------------------------------------------------
# Logging settings
#------------------------------------------------------------------------------
#
# Note that logging facility settings will only apply to `repmgrd` by default;
# `repmgr` will always write to STDERR unless the switch `--log-to-file` is
# supplied, in which case it will log to the same destination as `repmgrd`.
# This is mainly intended for those cases when `repmgr` is executed directly
# by `repmgrd`.
log_level='INFO' # Log level: possible values are DEBUG, INFO, NOTICE,
                                 # WARNING, ERROR, ALERT, CRIT or EMERG

#log_facility='STDERR' # Logging facility: possible values are STDERR, or for
                                 # syslog integration, one of LOCAL0, LOCAL1, ..., LOCAL7, USER
log_file='/var/vmware/vpostgres/current/pgdata/log/repmgr.log' # STDERR can be redirected to an arbitrary file
log_status_interval=300 # interval (in seconds) for repmgrd to log a status message

And yes, it did the trick: this cell was now in Manual mode too.

I can’t tell why that file was truncated, whether as a result of previous upgrades or simply of some clumsy manual command. The main point is that I was now able to upgrade my VCD.
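To spot a truncated repmgr.conf sooner, the setting names of a healthy cell’s file can be compared with those of the suspect one. The following is a minimal sketch, assuming both files have been copied locally as text; the helper names are made up for illustration, and only setting names are compared, since values such as node_id legitimately differ per cell.

```python
def parse_conf_keys(text):
    """Collect the setting names from repmgr.conf-style 'key=value' lines."""
    keys = set()
    for line in text.splitlines():
        # Drop comments (whole-line or trailing) before looking for a setting
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(reference, suspect):
    """Settings present in the reference file but absent from the suspect one."""
    return sorted(parse_conf_keys(reference) - parse_conf_keys(suspect))


healthy = "node_id=1\nfailover=manual\nlog_level='INFO' # Log level\n#log_facility='STDERR'\n"
cropped = "node_id=2\n"
print(missing_keys(healthy, cropped))  # prints ['failover', 'log_level']
```

An empty result would mean the suspect file defines at least the same settings as the reference one.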

Pending activities from a VCD tenant to vCenter

Sometimes, for several reasons, your VCD installation may become unable to process and pass tenant activities to the underlying vCenter.
When these pending requests pile up, your VCD could stop talking to vCenter.
In this case, you need to touch your cloud DB. ATTENTION: we’re talking about an MSSQL-backed VCD installation.


First of all, stop the cells

Second, and I should shout it, SECOND: take a snapshot of your DB AND a backup of your SQL


Third, open SQL Studio and run the following query:

Delete from task;
update jobs set status = 3 where status = 1;
update last_jobs set status = 3 where status = 1;
delete from busy_object;
delete from ccr_drs_host_group_host_inv;
delete from ccr_drs_host_group_inv;
delete from ccr_drs_rule_inv;
delete from ccr_drs_vm_group_inv;
delete from ccr_drs_vm_group_vm_inv;
delete from ccr_drs_vm_host_rule_inv;
delete from compute_resource_inv;
delete from custom_field_manager_inv;
delete from cluster_compute_resource_inv;
delete from datacenter_inv;
delete from datacenter_network_inv;
delete from datastore_inv;
delete from dv_portgroup_inv;
delete from dv_switch_inv;
delete from folder_inv;
delete from managed_server_inv;
delete from managed_server_datastore_inv;
delete from managed_server_network_inv;
delete from network_inv;
delete from resource_pool_inv;
delete from storage_profile_inv;
delete from storage_pod_inv;
delete from task_inv;
delete from task_activity_queue;
delete from activity;
delete from failed_cells;
delete from lock_handle;
delete from vm_inv;
delete from property_map;
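A possible extra safety net on top of the snapshot and backup, and not part of the procedure above: wrapping the statements in one transaction, so that a failure halfway rolls everything back instead of leaving the DB half-cleaned. The sketch below only builds the T-SQL text; the function name is hypothetical, and it assumes that an all-or-nothing run is what you want.

```python
def wrap_in_transaction(statements):
    """Build a T-SQL batch that runs all statements in a single transaction."""
    lines = [
        "SET XACT_ABORT ON;",  # a run-time error rolls back the whole transaction
        "BEGIN TRANSACTION;",
    ]
    # Normalize each statement so it ends with exactly one semicolon
    lines += [s.strip().rstrip(";") + ";" for s in statements if s.strip()]
    lines.append("COMMIT TRANSACTION;")
    return "\n".join(lines)


cleanup = [
    "delete from task;",
    "update jobs set status = 3 where status = 1;",
    "update last_jobs set status = 3 where status = 1;",
    "delete from busy_object;",
    # ...followed by the remaining statements from the list above
]
print(wrap_in_transaction(cleanup))
```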

After restarting the cells, you need to reconnect your vCenter(s) to resync your environment.


Be aware that this operation will not solve the underlying issue affecting communication between VCD and vCenter. That issue is what started everything, so as soon as your VCD is back up and working, start looking into it.

VMware Cloud Director – migration to 9.7 appliances. Troubleshooting.

Tonight I decided to migrate my Linux-based, MSSQL-backed VCD cells to an appliance installation.
I found a very exhaustive post:

Migrating from vCloud Director 9.5 with an SQL Database to the vCloud Director 9.7 Appliance

with the support of official documentation

https://docs.vmware.com/en/VMware-Cloud-Director/9.7/com.vmware.vcloud.install.doc/GUID-826F9B56-7A0D-4159-89E4-2BB522D9F603.html.

Everything went smoothly until the deployment of the stand-by cells.
According to documentation


https://docs.vmware.com/en/VMware-Cloud-Director/9.7/com.vmware.vcloud.install.doc/GUID-26E41AD2-0268-4B12-9505-9F729C5EF63E.html

the only thing to do was to follow the online procedure during the OVF deployment, specifying that this was a stand-by cell (same size as the primary one).
But I realized that the VCD service didn’t start: the configuration logs showed that a password for the certificates keystore hadn’t been provided, and the configuration failed.


So I restarted the configuration. After asking me which NIC I wanted to assign to http and to consoleproxy (which I didn’t expect, since in the appliance deployment both services are provided by eth0), it asked for the private password of both certificates, http and consoleproxy.

Don’t be scared by the colour…. it’s a customized version 🙂


So the service started and the cell was shown as active in the VCD GUI, but… its console on port 5480 couldn’t reach the primary cell (unreachable status), while the primary reported this stand-by cell in a “running” status.
Digging deeper, I connected via ssh to the primary cell, where the following command:

sudo -i -u postgres /opt/vmware/vpostgres/current/bin/repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf cluster matrix

gave me the following output:

INFO: connecting to database
Name | Id | 16711 | 24618 | 26034
----------------+----+----+----+----
xx-vcd-cell01 | 24618 | * | * | *
xx-vcd-cell02 | 26034 | ? | ? | ?
WARNING: following problems detected:
node 24618 inaccessible via SSH

…seems cell #2 was right.

And again, the output of the following command:
sudo -i -u postgres /opt/vmware/vpostgres/current/bin/repmgr -f /opt/vmware/vpostgres/current/etc/repmgr.conf cluster crosscheck
was:
INFO: connecting to database
Name | Id | 16711 | 24618 | 26034
----------------+----+----+----+----
xx-vcd-cell01 | 24618 | * | * | *
xx-vcd-cell02 | 26034 | ? | ? | ?
WARNING: following problems detected:
node 26034 inaccessible via SSH

Same result.
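When there are more than a couple of nodes, the matrix is easier to read with a little help. Below is a minimal sketch that lists the nodes whose row contains anything other than “*”. It assumes the table layout shown above (name, id, then one result per column) and is not part of repmgr; the function name is made up.

```python
def nodes_with_problems(matrix_output):
    """Return the names of nodes whose row has any result other than '*'."""
    problem_nodes = []
    in_table = False
    for line in matrix_output.splitlines():
        stripped = line.strip()
        if not in_table:
            # The separator row (only dashes and '+') precedes the data rows
            if stripped and set(stripped) <= set("-+"):
                in_table = True
            continue
        if "|" not in stripped:
            break  # end of the table (e.g. the WARNING lines)
        fields = [f.strip() for f in stripped.split("|")]
        name, results = fields[0], fields[2:]
        if any(result != "*" for result in results):
            problem_nodes.append(name)
    return problem_nodes


sample = """INFO: connecting to database
Name | Id | 16711 | 24618 | 26034
----------------+----+----+----+----
xx-vcd-cell01 | 24618 | * | * | *
xx-vcd-cell02 | 26034 | ? | ? | ?
WARNING: following problems detected:
node 26034 inaccessible via SSH"""
print(nodes_with_problems(sample))  # prints ['xx-vcd-cell02']
```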
From
tail -100 cell-runtime.log
I found that the cell wasn’t listed in pg_hba.conf; I expected this to be done by the configuration task. So I proceeded manually, creating /opt/vmware/appliance/etc/pg_hba.d/cell-file.txt with this cell’s IP:

#TYPE DATABASE USER ADDRESS METHOD

host vcloud vcloud 192.168.3.182/24 md5

(VLAN3 is the VLAN assigned to eth1, the DB one)
but the console of the stand-by cell didn’t change: primary still unreachable.
At least the log no longer showed that error; instead, the new cell was added to the Broker Network.
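A side note on the ADDRESS column: in pg_hba.conf the part after the slash is a mask length, so an entry written with /24 matches every client in that subnet, while /32 would match this cell only. A quick sketch with Python’s standard ipaddress module shows the difference (the helper name is made up for illustration):

```python
import ipaddress


def describe_hba_address(address):
    """Summarize which client addresses a pg_hba.conf ADDRESS value covers."""
    network = ipaddress.ip_interface(address).network
    return f"{address} matches {network.num_addresses} address(es) in {network}"


print(describe_hba_address("192.168.3.182/24"))
print(describe_hba_address("192.168.3.182/32"))
```

The first call reports 256 addresses in 192.168.3.0/24, the second a single address.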

In any case, I don’t understand why the system was complaining about the missing entry in pg_hba.conf, since it periodically reconnected to the DB: it should be either unreachable or reachable, always!

Going on with the logs, I noticed another strange behaviour: every minute an old cell UUID (the previous one) was removed and a new one was added, over and over. Well… this is not normal, of course, or at least I think so…

Then I had a look at Tom Fojta’s post here,

https://fojta.wordpress.com/tag/cell/,

in order to let all the cells connect to each other without a password by sharing a key. But it only worked one way, from primary to stand-by and not vice versa. Same result when I ran the same commands on the stand-by cell: I was able to manually connect to the primary without a password, but the postgres command still showed no ssh connection. Replacing the vcloud user with the postgres one in the chown command from Tom’s post didn’t work either.

Last: the certificates again, coming back to the original error I received. I remembered that when I last renewed them I had set one password for the two individual certs (http and consoleproxy) and a different one for the keystore. Looking at the configuration file I noticed, as written before, an error when verifying the password: actually, it wasn’t wrong, it was only different from the keystore one.


The solution wasn’t to run the configure command manually (many other parameters are inside the response file). Instead, I set the two private cert passwords to be the same as the keystore password, removed the two stand-by cells and redeployed them from scratch.


This time, as described in the official documentation, I didn’t need to add anything beyond the initial configuration during the OVF deployment (except the user/password section for the DB, since it was already present in the response file).

Now I have a perfectly running VCD system.


About load balancing: active/stand-by applies to the DB role, not to the application. So the cells can run in an active-active configuration from the application’s point of view.

Thanks to GSS for many of these hints, I learned a hard lesson from this case.

Tenant App for vROPS: Access Denied

When the Tenant App for vROPS was announced, I was really excited: the only thing this product was missing was a multitenant option. Thanks to this feature, it not only became multitenant, but also got integrated in VCD!


Setup was more or less smooth, with some FW ports and routing to set up (remember, you also need a public IP to NAT the tenant app to, since it will be reachable through VCD).

Authorization is double for every tenant: on the VCD side and on the Tenant App side.


But the first run wasn’t successful. I kept getting an “Access denied” for the enabled tenants.

This is the VCD part…
…and this, on the tenant app side

Eventually I decided to open an SR with GSS because I couldn’t get out of this situation.
They immediately pointed me to the known issues section: https://docs.vmware.com/en/Management-Packs-for-vRealize-Operations-Manager/2.4/rn/Tenant-App-24-Release-Notes.html#knownissues

Since mine was a greenfield installation, the point that solved my issue was the second one: log in as admin to the Tenant App, go to Access Management, disable the organizations that were enabled, then enable them again.
After this, it works like a charm. I only have to remember the process for the next tenants.
Important: this feature doesn’t work if you log in to VCD as provider, only as tenant administrator.

It opens up new offering opportunities: besides its primary role of monitoring and analyzing the tenant environment, it also makes it possible to bill resources per use.

vROPS Tenant App refusing connection

It happened without any change in the infrastructure, but this VM stopped working. Or, better, the https connection was “refused”. I could ping it and ssh into it, but no web. And of course, no more working VCD plugin for paying customers… In the logs we found nginx-related errors.

So, once again, GSS came to help. It seems to be a known issue, with a simple but definitive fix: adding the vROPS machine FQDN/IP in the vApp Options, as follows:

The value was already set, but not in the default field: I just repeated what was in the “Value” field. Magic: it came back to work. And customers are happy 🙂

Not to Miss at VMworld 2020: Cloud Management…

Not to miss at VMworld 2020: the Cloud Management Hands-on Labs, with many useful free labs to try new products by working on them directly

Don’t miss this opportunity to get close and personal with VMware vRealize Cloud Management products and services. If you are a hands-on user who wants to learn first hand about the newest releases, or further sharpen your existing skills through a deep dive into different aspects of cloud management, check out the Hands-on Labs lineup available at […] The post Not to Miss at VMworld 2020: Cloud Management Hands-on Labs appeared first on VMware Cloud Management.


VMware Social Media Advocacy

The Economics of Cloud Computing

Cloud is one of the least understood technologies in terms of costs and economics

Cloud spending is up—and so is the buzz around cloud economics. Cloud computing is one of the most sought-after IT investments right now. Cloud is also one of the least understood technologies in terms of costs and economics. Especially now, amidst global economic uncertainty, organizations must be smart about how they consume cloud services to […] The post The Economics of Cloud Computing appeared first on VMware Radius.


Why VMware Cloud Director?

VMware Cloud Director isn’t about IaaS anymore. It’s a pervasive cloud fabric stretching across any VMware endpoint, bringing true cloud capability to a Cloud Provider’s Software-Defined Datacenter. With a brand new UI, deep integration with the VMware SDDC, extensibility, and a plethora of services, vCloud Director is not only the best VMware cloud management platform for Cloud Providers, but also the best VMware-in-the-cloud experience for Enterprises. Learn more about VMware Cloud Director here: https://www.vmware.com/products/cloud-director.html Explore Cloud Solutions with Cloud Director: https://cloudsolutions.vmware.com/ Check out our other Cloud Director videos: http://bit.ly/2N0UKtw Products covered: Cloud Director


How VMware Marketplace makes it simple to deploy vendor and open source solutions

During my latest CFD7 attendance as a delegate, I had the chance to discover the VMware Marketplace, presented by Nee Palaka, Product Marketing & Strategy Manager @ VMware.

I had already had a look at it and my curiosity pushed me on, so I couldn’t miss this session.

This is actually a two-part post: the first part presents the product as Nee did, in an amazing way, with some considerations of my own; the second part describes how I implemented it inside Netalia’s production VMware Cloud Director environment, assisted by some brilliant VMware engineers (I apologize, I remember the name of just one of them, Vikrant Singh).

The Marketplace is an opportunity for vendors to reach VMware’s customer base, and also an opportunity for customers to use and install third party solutions plus open source products inside their VMware environments.

The catalog is composed of images that are tested and validated by VMware to work correctly on VCD, VMConAWS, vSphere and PKS, with more to come.

Nee describes 3 main use cases for Marketplace:

  • Moving customers to the cloud by simply shifting their workloads, with the same solutions they had in place on premises directly deployable
  • Maximizing the investment in the VMware platform through third-party solutions that work on different platforms
  • Ensuring developer flexibility, directly tied to the availability of open source solutions in the Marketplace, mostly coming from Bitnami

In our case, the last two points are the important ones. We had to create and maintain several templates for our customers, uploading new ones for new versions or for specific requests from them: it was a very time-consuming activity. Beyond this, we could offer templates that our clients didn’t think they could have or use, and monetize some third-party solutions more effectively by presenting them to our VCD market. Validation by VMware also made tenants feel protected against threats when deploying a template they hadn’t created themselves, thanks to the “VMware security stamp” on it.

Now, about the benefits to customers, Nee points out four interesting topics: convenience, having all these solutions centralized in one location that is easy to reach and use, and that grows day by day; trust, due to the strict validation program that solutions have to pass to be listed in the Marketplace; flexibility, to deploy the same solution on different platforms and versions; transparency, with notifications and updates coming from a central location.

Today the Marketplace exposes more than 300 templates available on several platforms, partly from partners and partly from open source communities.

The Marketplace is available inside the Cloud Services Portal. You can easily get enabled by VMware to adopt it, but it’s also available for browsing for free without signing in to the portal.

The landing page shows the latest uploaded products along with others categorized by format. You can also filter and sort them by publisher, category, solution type, deployment platform and product pricing.

After choosing an appliance you can either download it in OVA format or “subscribe”: this means that VMware Marketplace will connect to the platform of your choice, add the solution to it and maintain it with new versions and patches, with no further effort on your side. In our case, the template was added to a catalog of the tenant that we use to publish catalogs to other tenants, and uploaded to our systems ready to deploy.

Most of the third-party solutions are sold in BYOL mode: you will deploy the appliance, then you’ll be asked to provide a license, sold directly by the vendor.

After this CFD event, back in my (virtual) office, I decided to try this experience, especially because customers were insisting on having new templates created in our VCD catalog.

Netalia is an Italian cloud service provider focusing its offering on the VMware cloud stack, so VCD is our main product.

I spent the last months upgrading the whole platform, so now I’m realizing day by day the benefits of VCD 9.7 (as a CSP, I’m quite conservative: it isn’t simple to stop all the customers to upgrade, so I’ll wait some time before moving to 10.1).

Back to Marketplace: I needed to import the catalog into my installation. Everything worked fine, going through the cloud services and subscribing to the templates most requested by customers. At a certain point I had an issue: all the subscriptions failed, and I couldn’t figure out why.

I have to say that Nee and Vikrant Singh together assisted me in real time in solving the issue (or, better, it wasn’t an issue, just a procedure): I had to check the box in the properties of the tenant used for catalog purposes, from the provider interface.

All these templates are kept up to date by Marketplace, and you can decide how many versions to keep in your catalog.

I think this solution is useful not only for end users, but also for all the companies that are going to present their products in the shape of an OVA. Most of the templates are free; for the others there’s the BYOL licensing method.

This is just an example of how many versions are available for a popular appliance:

And how many of them Marketplace will keep in your catalog before removing them (plus auto-update in case of new versions):

This means that your catalog will not only be updated regularly, but also cleaned up by having old releases removed: as seen, you define the number of versions to keep.
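The retention idea can be sketched in a few lines. This is only an illustration of “keep the N newest versions”, not how Marketplace implements it; the function name is hypothetical and it assumes purely numeric dotted version strings.

```python
def versions_to_remove(versions, keep):
    """Return the versions that fall outside the `keep` newest ones."""
    if keep < 1:
        raise ValueError("keep must be at least 1")

    def as_numbers(version):
        # "1.10.0" -> (1, 10, 0), so that 1.10.0 sorts after 1.9.0
        return tuple(int(part) for part in version.split("."))

    newest_first = sorted(versions, key=as_numbers, reverse=True)
    return newest_first[keep:]


print(versions_to_remove(["1.9.0", "1.10.0", "1.2.3"], keep=2))  # prints ['1.2.3']
```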

Along the way I didn’t mention App Launchpad (here a great post from my favourite engineer @ VMware, Daniel Paluszek): I’m sure it will be a success, but I have to upgrade my VCD first, since App Launchpad runs on version 10.1, and I’ll write about it again.

Virtual is ethereal