ConfD User Community

ConfD read() stuck issue

hi
My process occasionally gets stuck in a ConfD read() call. It does not happen every time: sometimes it gets stuck once or twice within an hour, and sometimes it runs for several hours or several days without getting stuck. The related API is called about 10000 times within 1 hour. What is the issue, and how can I resolve it?

  1. The .stacktrace dump is as follows:

    #0 0x00007ff3776a475d in read () from /lib64/libpthread.so.0
    No symbol table info available.
    #1 0x000000000068921d in read_fill ()
    No symbol table info available.
    #2 0x000000000068bb10 in term_read ()
    No symbol table info available.
    #3 0x000000000068bfb4 in op_request_term ()
    No symbol table info available.
    #4 0x0000000000685bb4 in cdb_start_session2 ()
    No symbol table info available.
    #5 0x00000000006520a2 in EmOperWrapper::createCdbSession(int) ()
    No symbol table info available.
    #6 0x0000000000656ab9 in EmOperWrapper::cdbSetValuesToDb(int, confd_tag_value*, int, char*, int) ()
    No symbol table info available.

    #0 0x00007f12958b175d in read () from /lib64/libpthread.so.0
    No symbol table info available.
    #1 0x000000000068921d in read_fill ()
    No symbol table info available.
    #2 0x000000000068bb10 in term_read ()
    No symbol table info available.
    #3 0x000000000068bfb4 in op_request_term ()
    No symbol table info available.
    #4 0x0000000000680997 in cdb_set_timeout ()
    No symbol table info available.
    #5 0x0000000000656b1d in EmOperWrapper::cdbSetValuesToDb(int, confd_tag_value*, int, char*, int) ()
    No symbol table info available.


    #0 0x00007f764913d75d in read () from /lib64/libpthread.so.0
    No symbol table info available.
    #1 0x000000000068921d in read_fill ()
    No symbol table info available.
    #2 0x000000000068bb10 in term_read ()
    No symbol table info available.
    #3 0x000000000068bfb4 in op_request_term ()
    No symbol table info available.
    #4 0x000000000067f4ce in ?? ()
    No symbol table info available.
    #5 0x000000000067f7a0 in cdb_set_values ()
    No symbol table info available.
    #6 0x0000000000656b3e in EmOperWrapper::cdbSetValuesToDb(int, confd_tag_value*, int, char*, int) ()
    No symbol table info available.

  2. I found some error logs in confd.log:

    24-Jun-2020::17:16:52.433 vm47_rms_robo2 confd[3722]: - CDB client (app) timed out, waiting for end_session()
    24-Jun-2020::17:18:22.999 vm47_rms_robo2 confd[3722]: - CDB client (app) timed out, waiting for end_session()
    24-Jun-2020::17:20:07.683 vm47_rms_robo2 confd[3722]: - Daemon app died

    24-Jun-2020::18:12:35.788 vm47_rms_robo2 confd[3722]: - CDB client (app) timed out, waiting for end_session()
    24-Jun-2020::18:14:22.026 vm47_rms_robo2 confd[3722]: - CDB client (app) timed out, waiting for end_session()

The following is the related ConfD API call sequence:

   mCdbSocket = socket(PF_INET, SOCK_STREAM, 0)

   cdb_connect()

   cdb_load_flags = CDB_LOCK_REQUEST|CDB_LOCK_WAIT|CDB_LOCK_PARTIAL;

   cdb_start_session2(mCdbSocket, CDB_OPERATIONAL, cdb_load_flags) 

   cdb_set_namespace(mCdbSocket, nameSpace) 

   cdb_set_timeout(mCdbSocket,60);

   cdb_set_values(mCdbSocket, tVal, numberOfEntries, tablePath )

   cdb_end_session(mCdbSocket); 

   cdb_close(mCdbSocket);

thx

With CDB_LOCK_WAIT set, your call to cdb_set_values() will wait for a lock on the operational datastore to be released before setting the values. This is the correct thing to do if the intention is to trigger notifications to CDB operational data subscriber applications when writing.

If the lock on CDB oper that your application is waiting for is not released before the 60s CDB client timeout that you set, the call to cdb_set_values() will fail with a timeout.

The reason for the timeout is likely that the previous write to CDB oper that triggered CDB oper subscribers has not yet been processed by all notified CDB oper data subscribers. That is, not all subscribers have yet called cdb_sync_subscription_socket(..., CDB_DONE_OPERATIONAL).

Thanks for your answer!
Currently we only write oper data to CDB; we do not have any oper data subscriber app. Why does my app get stuck sometimes? It may get stuck in the ConfD read() several times in one day, or not get stuck at all for several days.

If your call to cdb_set_values() sometimes takes longer than your 60s setting, you will get a client timeout. You likely need to increase or remove that timeout for your use case.
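Before changing the timeout, it may help to measure how long each step of the sequence actually takes, so the timeout is chosen from observed durations. The following is only a minimal sketch in plain POSIX C: `timed_step`, `elapsed_seconds`, and `example_step` are hypothetical helper names, not part of the ConfD API, and `example_step` is a stand-in that just sleeps. In the real application each step would wrap one ConfD call such as cdb_start_session2() or cdb_set_values().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Difference (end - start) in seconds. */
static double elapsed_seconds(const struct timespec *start,
                              const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* A step is any call to be timed; it returns the wrapped call's result. */
typedef int (*step_fn)(void *arg);

/* Runs fn(arg), stores its duration in *took_s, and logs to stderr
 * when the duration exceeds warn_after_s. Returns fn's result. */
static int timed_step(const char *name, step_fn fn, void *arg,
                      double warn_after_s, double *took_s)
{
    struct timespec t0, t1;
    int rc;

    /* CLOCK_MONOTONIC is not affected by wall-clock adjustments. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    rc = fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *took_s = elapsed_seconds(&t0, &t1);
    if (*took_s > warn_after_s)
        fprintf(stderr, "%s took %.3f s (rc=%d)\n", name, *took_s, rc);
    return rc;
}

/* Hypothetical stand-in for a ConfD call: sleeps for *arg milliseconds. */
static int example_step(void *arg)
{
    long ms = *(long *)arg;
    struct timespec d;

    d.tv_sec = ms / 1000;
    d.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&d, NULL);
    return 0;
}
```

Logging only the calls that exceed a threshold keeps the output small even at about 10000 calls per hour, and shows which step stalls and for how long.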

Hi,
The real issue is that my app is blocked for as long as 150 seconds in the ConfD read(), as shown in the .stacktrace file above.
I added cdb_set_timeout() so that the ConfD API would not stay blocked in read() indefinitely. As I understand your comments above, once 60s is reached, cdb_set_values() can return a failure instead of remaining blocked in read().
Do you mean that removing the timeout setting will resolve the ConfD read() blocking issue?
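On the goal of keeping read() from blocking indefinitely: independent of ConfD, a general POSIX way to bound a blocking read() is a receive timeout on the socket itself (SO_RCVTIMEO). The sketch below shows only that generic mechanism. `set_recv_timeout` and `read_timed_out` are hypothetical helper names. Whether the ConfD library handles a timed-out read on its CDB socket gracefully is not confirmed anywhere in this thread, so this is an idea to test, not a recommended fix.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Sets a receive timeout on sock. Returns 0 on success, -1 on error. */
static int set_recv_timeout(int sock, int seconds)
{
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Performs one read() with the given receive timeout.
 * Returns 1 if the read timed out, 0 if it returned data or EOF,
 * and -1 on any other error. */
static int read_timed_out(int sock, int seconds)
{
    char buf[16];
    ssize_t n;

    if (set_recv_timeout(sock, seconds) != 0)
        return -1;

    n = read(sock, buf, sizeof(buf));
    if (n >= 0)
        return 0;

    /* With SO_RCVTIMEO set, an expired timeout is reported as
     * EAGAIN or EWOULDBLOCK rather than blocking forever. */
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```

With no data pending on a connected stream socket, `read_timed_out(sock, 1)` returns after roughly one second instead of blocking until the peer replies.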