In-service upgrade fails due to CDB.RUNNING being locked


ConfD Version: 6.6.4

It seems that even after you enter upgrade mode (start the user session and initiate the upgrade via the MAAPI interface), you are still able to start another session towards CDB.RUNNING with start_session2() and fully lock it. When that happens, the next call to perform_upgrade() is doomed to fail with the following:

the configuration database is locked by session none

My question is: why, even though we are in upgrade mode, is it still possible to establish sessions towards CDB outside of the actual upgrade? Also, how can I make sure that any start_session() call is aborted while ConfD is in said upgrade mode? Or is there a way to somehow lock the actual upgrade session? I also don’t understand why get_phase() does not indicate that the upgrade has been started (it always returns False, even after the upgrade is initiated with init_upgrade()). Or is that flag set only when ConfD initiates the upgrade by itself?

A short outline of how to reproduce the issue:

import socket

import _confd
from _confd import cdb, maapi

maapi_s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
maapi.connect(maapi_s, '127.0.0.1', _confd.CONFD_PORT)

# this emulates some other client or daemon that can connect to CDB at any point...
cdb_s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
cdb.connect(cdb_s, cdb.DATA_SOCKET, '127.0.0.1', _confd.CONFD_PORT)

maapi.start_user_session(sock=maapi_s, username='system', context='system', groups=['admin'], src_addr='127.0.0.1', prot=_confd.PROTO_TCP)
maapi.init_upgrade(maapi_s, timeoutsecs=10, flags=0)

# ... and here's the culprit. Note that this may also be called before the upgrade session is established, i.e., init_upgrade() will still succeed.
cdb.start_session2(cdb_s, cdb.RUNNING, cdb.LOCK_SESSION | cdb.LOCK_WAIT)

# this fails unless the CDB session above is closed first
maapi.perform_upgrade(maapi_s, ...)

Any hints on the above, or on how to work around it, would be really appreciated!

Well, I guess the steps in the in-service upgrade are primarily intended to protect the upgrade process and NB clients (which are outside your control) from each other. I.e. all current transactions are terminated and no new transactions can be started, which prevents NB clients from e.g. running transactions based on a data model that may no longer be valid at the point of commit. As a side effect, probably the most common type of CDB client, i.e. one that only reads CDB (config) data in response to subscription notifications, will not have anything to read during the upgrade process, since no transactions are committed.

Both the CDB clients and the upgrade process are completely under your control, thus you can handle that in whatever way you find most convenient. In many cases a change of the data model in itself also requires restarting or even replacing CDB clients, data providers, etc. You can perhaps make use of the CONFD_NOTIF_UPGRADE_EVENT notifications to orchestrate the upgrade across your different software components. Actually, the cdb_subscriber.c in the in_service_upgrade/simple example in the examples set illustrates this.
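
To make the orchestration idea a bit more concrete, here is a minimal sketch in Python of a gate that a CDB client could consult before calling start_session2(), and that asks the client to close its current session when an upgrade begins. UpgradeGate and listen() are hypothetical helpers, not part of ConfD. The event names are taken from the CONFD_UPGRADE_* values of the C API. The wiring in listen() assumes that the Python bindings expose dp.notifications_connect(), dp.read_notification(), dp.NOTIF_UPGRADE_EVENT and dp.UPGRADE_* constants, and that the notification dict carries the event under ['upgrade']['event']; please check those against the documentation for your ConfD version.

```python
import threading

# Event names follow the CONFD_UPGRADE_* values of the C API (including
# ConfD's own spelling "COMMITED"). Using plain strings is an assumption
# made to keep the gate independent of the ConfD bindings.
BLOCKING = {"INIT_STARTED", "INIT_SUCCEEDED", "PERFORMED"}
RELEASING = {"COMMITED", "ABORTED"}


class UpgradeGate:
    """Hypothetical helper: tracks whether an upgrade is in progress."""

    def __init__(self, on_upgrade_start=None):
        self._idle = threading.Event()
        self._idle.set()  # no upgrade in progress initially
        self._on_upgrade_start = on_upgrade_start

    def on_event(self, name):
        if name in BLOCKING:
            was_idle = self._idle.is_set()
            self._idle.clear()
            # Give the client one chance to close its CDB session (and
            # thereby release its lock) when the upgrade starts.
            if was_idle and self._on_upgrade_start is not None:
                self._on_upgrade_start()
        elif name in RELEASING:
            self._idle.set()

    def may_start_session(self, timeout=0.0):
        """Return True if no upgrade is in progress, waiting up to timeout seconds."""
        return self._idle.wait(timeout)


def listen(gate, ip="127.0.0.1"):
    """Feed the gate from upgrade notifications (assumed API, see above)."""
    import socket
    import _confd
    from _confd import dp

    # Assumption: dp exposes UPGRADE_INIT_STARTED, UPGRADE_COMMITED, etc.
    names = {getattr(dp, "UPGRADE_" + n): n for n in BLOCKING | RELEASING}
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    dp.notifications_connect(sock, dp.NOTIF_UPGRADE_EVENT, ip, _confd.CONFD_PORT)
    while True:
        notif = dp.read_notification(sock)
        # Assumption: the event code is found under notif['upgrade']['event'].
        gate.on_event(names.get(notif["upgrade"]["event"]))
```

A client would run listen() in a background thread, pass a callback that ends its current CDB session as on_upgrade_start, and call gate.may_start_session() before each cdb.start_session2(). There is still a small window between the notification being sent and the client reacting to it, so this reduces the chance of a stray lock rather than ruling it out.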