Multi-thread: cdb_start_session returns error while doing get & edit-config in parallel

Hi,

I have a multi-threaded ConfD application with separate threads for:

  1. subscription
  2. control
  3. Separate worker thread for Notification, Action and operational data

I am running into cdb_start_session errors while trying to do the following operations in parallel

  1. Get operation on operational data thread
  2. Edit-config operation on subscription thread.

As per the recommendation in the start-up guide, I am using the following wrapper around cdb_start_session() for multi-threaded use:

cdb_read_any_session() API:

bool started = false;
int status = CONFD_OK;

while (!started) {

    /* Retry to create a new CDB read session if there are some write
     * transaction occurring in parallel */
    if ((status = cdb_start_session(sock, type)) != CONFD_OK) {
        if (confd_errno != CONFD_ERR_LOCKED) {
            LogMsg("Failed to start cbd session with confd errno %s",
                      confd_strerror(confd_errno));
            return status;
        } else {
            sleep(1);
        }
    } else {
        started = true;
    }
}

Get operation in the operational data thread (this also sets the namespace):

if ((status = cdb_read_any_session(datasock,
                                   CDB_RUNNING)) != CONFD_OK)
{
    return status;
}

cdb_set_namespace(datasock, ona100__ns);

edit-config in Subscription thread:

        if ((status = cdb_read_subscription_socket(subsock,
                                                   sub_points,
                                                   &reslen)) != CONFD_OK)
        {
            LogMsg("terminate sub_read: %d", status);
            goto ctl_thr_cleanup;
        }

        if (reslen > 0)
        {

            if ((status = cdb_start_read_any_session(datasock,
                                                     CDB_RUNNING)) != CONFD_OK)
            {
                LogMsg("Cannot start datasock session");
                goto ctl_thr_cleanup;
            }

            if ((status = cdb_set_namespace(datasock, ona100__ns)) != CONFD_OK)
            {
                LogMsg("Cannot set namespace");
                goto ctl_thr_cleanup;
            }

Error Log
I have seen two different error codes, 18 and 21.

cdb_read_any_session(): Failed to start cbd session with confd errno 21 (Bad protocol usage or unexpected retval), status -1.

cdb_read_any_session(): Failed to start cbd session with confd errno 18 (internal error), status -1.

Can you please help me understand what these errors mean and why I would run into them? Is there any problem with my usage of the cdb_read_any_session() API?

I see it is suggested to use cdb_start_session2(CDB_LOCK_SESSION | CDB_LOCK_WAIT) as an alternative to cdb_start_session() with a retry loop, but I do not understand whether there is any implementation difference between the two.

Appreciate some help on this.

  • Kapil
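
For readers following along: the retry loop in the question retries forever while the session is locked. Below is a minimal sketch of a bounded variant, written against stand-ins so that it compiles without libconfd. The names start_with_retry, start_fn, fake_start and the DEMO_* constants are all hypothetical; in a real application the callback would wrap cdb_start_session() and report confd_errno, and the "locked" case would correspond to CONFD_ERR_LOCKED.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for CONFD_OK / CONFD_ERR and for the errno
 * values; the real definitions come from the ConfD headers. */
enum { DEMO_OK = 0, DEMO_ERR = -1 };
enum { DEMO_ERRNO_NONE = 0, DEMO_ERRNO_LOCKED = 1, DEMO_ERRNO_OTHER = 2 };

/* Callback that tries to start a session once. It returns DEMO_OK or
 * DEMO_ERR and, on failure, stores the reason in *err. */
typedef int (*start_fn)(void *ctx, int *err);

/* Try to start a session up to max_attempts times, sleeping delay_ms
 * between attempts. Only a "locked" failure is retried; any other
 * error is handed back to the caller immediately. */
int start_with_retry(start_fn start, void *ctx,
                     int max_attempts, long delay_ms, int *err)
{
    assert(start != NULL && err != NULL && max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        *err = DEMO_ERRNO_NONE;
        if (start(ctx, err) == DEMO_OK)
            return DEMO_OK;
        if (*err != DEMO_ERRNO_LOCKED)
            return DEMO_ERR;            /* not retryable */
        if (attempt < max_attempts && delay_ms > 0) {
            struct timespec ts = { delay_ms / 1000,
                                   (delay_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }
    }
    return DEMO_ERR;                    /* still locked after all attempts */
}

/* Fake session starter for experiments: reports "locked" for the first
 * N calls (N is the int that ctx points to), then succeeds. */
int fake_start(void *ctx, int *err)
{
    int *locked_calls_left = ctx;
    if (*locked_calls_left > 0) {
        (*locked_calls_left)--;
        *err = DEMO_ERRNO_LOCKED;
        return DEMO_ERR;
    }
    return DEMO_OK;
}
```

Calling start_with_retry(fake_start, &n, 5, 1000, &err) with n set to 2 succeeds on the third attempt. With a bound in place, a lock that never clears surfaces as an error instead of a thread that sleeps indefinitely.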

Could it be that you reuse your CDB_DATA_SOCKET “sock” in multiple threads? E.g. one reading CDB_RUNNING, and another one writing/reading to/from CDB_OPERATIONAL?

Make sure that each thread has its own socket.

From ConfD 6.3 UG Chapter 6.7. The Protocol and a Library Threads Discussion:

ConfD API functions are thread-safe as such, but multiple threads using them with the same socket will have unpredictable results, just as multiple threads using the read() and write() system calls on the same file descriptor in general will. In the ConfD case, one thread may end up getting the response to a request from another, or even a part of that response, which will result in errors that can be very difficult to debug.
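
To make the quoted failure mode concrete, here is a small sketch that uses plain POSIX sockets rather than ConfD. It is a deterministic, single-threaded simulation: two logical callers, A and B, share one socket to a toy server, and B reads first, so B receives the reply meant for A. With real threads the interleaving depends on timing, which is why the resulting errors look random. All names here (read_msg, write_msg, serve_one, shared_socket_demo) are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MSG_LEN 8u  /* fixed-size messages keep the framing trivial */

/* Read exactly MSG_LEN bytes. Returns 0 on success, -1 on error or EOF. */
int read_msg(int fd, char *buf)
{
    size_t got = 0;
    while (got < MSG_LEN) {
        ssize_t n = read(fd, buf + got, MSG_LEN - got);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return 0;
}

/* Write exactly MSG_LEN bytes. Returns 0 on success, -1 on error. */
int write_msg(int fd, const char *buf)
{
    size_t sent = 0;
    while (sent < MSG_LEN) {
        ssize_t n = write(fd, buf + sent, MSG_LEN - sent);
        if (n <= 0)
            return -1;
        sent += (size_t)n;
    }
    return 0;
}

/* Toy server: read one request and answer it. The reply keeps the caller
 * id in byte 1 and marks byte 0 with 'r', so "qA" is answered with "rA". */
int serve_one(int server_fd)
{
    char buf[MSG_LEN];
    if (read_msg(server_fd, buf) != 0)
        return -1;
    buf[0] = 'r';
    return write_msg(server_fd, buf);
}

/* Callers A and B share ONE socket. Both requests are sent before either
 * reply is read, and B happens to read first. On success, *got_by_b holds
 * the caller id found in the reply that B received. */
int shared_socket_demo(char *got_by_b)
{
    const char req_a[MSG_LEN] = "qA";
    const char req_b[MSG_LEN] = "qB";
    char reply[MSG_LEN];
    int sv[2];
    int rc = -1;

    assert(got_by_b != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    if (write_msg(sv[0], req_a) == 0 &&     /* A sends its request    */
        write_msg(sv[0], req_b) == 0 &&     /* B sends its request    */
        serve_one(sv[1]) == 0 &&            /* server answers A first */
        serve_one(sv[1]) == 0 &&            /* then answers B         */
        read_msg(sv[0], reply) == 0) {      /* B reads before A does  */
        *got_by_b = reply[1];
        rc = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

After a successful call, the byte stored through got_by_b is 'A': caller B has consumed the reply to A's request, and A would in turn receive B's.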

Hi Cohult,

Thanks for the response.

I have created separate sockets for each of these threads, so we are not hitting the issue of multiple threads accessing the same socket.

  1. subscription
  2. control
  3. Separate worker thread for Notification, Action and operational data

The errors I am seeing are:
errno 21 is CONFD_ERR_PROTOUSAGE and errno 18 is CONFD_ERR_INTERNAL.

It seems to me this always happens after we LOCK the running CDB.

We lock in one thread and then handle the edit-config on the subscription socket thread. In the meantime, the operational data get is being run by another thread. The edit-config fails at cdb_start_session() with errno 18 or 21.
Retrying cdb_start_session() seems to resolve this error.

Let me know if you need additional info.

Can you (enable and) provide the ConfD developer log, with the log level set to trace, for when the issue occurs, plus the libconfd printout at trace level?

For example in confd.conf:

<developerLog>
  <enabled>true</enabled>
  <file>
    <enabled>true</enabled>
    <name>./devel.log</name>
  </file>
</developerLog>
<developerLogLevel>trace</developerLogLevel>
...

And in your application:
confd_init(argv[0], stderr, CONFD_TRACE);

It still seems suspicious to me that you are using a socket called “datasock” for both operational and configuration data…

Cohult, yes, you are right. I didn’t realize that I am accessing the same ‘datasocket’ from two different threads. Although in the subscription thread I mostly work on the subsocket, I am still accessing the datasocket there to set the namespace.

One thing I can try is to avoid accessing the data socket in the subscription thread, but I am afraid I will not be able to do that, as I need to set the namespace for a certain operation in the subscription thread.

I understand the statements made in ConfD 6.3 UG Chapter 6.7, but is there any other way to work around this scenario, for example by adding an external lock to ensure the data socket is accessed by only one thread at a time?

Even with an external lock, I could still run into this issue, right? “one thread may end up getting the response to a request from another, or even a part of that response, which will result in errors that can be very difficult to debug”

I suggest you create two sockets and connect them both as CDB_DATA_SOCKETs.
One for reading the CDB configuration after you received a configuration change event on the CDB_SUBSCRIPTION_SOCKET.
The other for reading (and writing) CDB operational data.

Thanks Cohult, I will give it a try.
One more query: if I do data socket read operations in 2 threads and read/write in 1 thread, then I should have 3 sockets of type CDB_DATA_SOCKET, right?

Right. One socket per thread. No sharing of sockets between threads.
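
The "one socket per thread" rule can be sketched with plain POSIX sockets and pthreads. This is only an illustration under stated assumptions, not ConfD code: each worker owns a private socketpair to a toy echo server, where a real application would instead give each thread its own socket connected with cdb_connect() as a CDB_DATA_SOCKET. The names worker, worker_main and run_workers are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_WORKERS 8

struct worker {
    int  fd;   /* this thread's private socket; never shared */
    char id;   /* tag placed in the request */
    int  ok;   /* set to 1 if the reply carried our own tag */
};

/* Each worker sends a one-byte request and reads a one-byte reply.
 * Both travel on w->fd, which no other worker touches. */
void *worker_main(void *arg)
{
    struct worker *w = arg;
    char reply = 0;
    if (write(w->fd, &w->id, 1) == 1 && read(w->fd, &reply, 1) == 1)
        w->ok = (reply == w->id);
    return NULL;
}

/* Run n workers, each with its own connection to a toy echo server that
 * is served by the calling thread. Returns the number of workers that
 * received their own reply, or -1 if setup failed. */
int run_workers(int n)
{
    struct worker w[MAX_WORKERS];
    pthread_t tid[MAX_WORKERS];
    int server_fd[MAX_WORKERS];
    int started = 0, good = 0;

    assert(n > 0 && n <= MAX_WORKERS);

    for (int i = 0; i < n; i++) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
            break;
        w[i].fd = sv[0];
        w[i].id = (char)('A' + i);
        w[i].ok = 0;
        server_fd[i] = sv[1];
        if (pthread_create(&tid[i], NULL, worker_main, &w[i]) != 0) {
            close(sv[0]);
            close(sv[1]);
            break;
        }
        started++;
    }

    /* Toy echo server: one request per connection, echoed back.
     * On failure, shut the connection down so the worker unblocks. */
    for (int i = 0; i < started; i++) {
        char c;
        if (read(server_fd[i], &c, 1) != 1 || write(server_fd[i], &c, 1) != 1)
            shutdown(server_fd[i], SHUT_RDWR);
    }

    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        good += w[i].ok;
        close(w[i].fd);
        close(server_fd[i]);
    }
    return started == n ? good : -1;
}
```

With one connection per worker, run_workers(3) reports that all three workers received their own replies, with no locking around the sockets. On some toolchains the program needs to be built with -pthread.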