We have come across a strange deadlock between a data provider callback and a validation callback. According to the documentation, it should be safe to use the same thread for data provider and validation callbacks because ConfD will call them in a thread-safe manner. However, when the validation callback executes, we call maapi_diff_iterate() in the worker thread while ConfD is calling get_elem for a when condition, which causes the deadlock.
I am wondering whether we need to create a separate thread for validation in that case.
10-Feb-2021::17:25:29.722: devel-c validate request for path /eypvalid:validation
10-Feb-2021::17:35:31.643 confd[71]: devel-c Worker socket query timed out daemon cpa id 1
10-Feb-2021::17:35:31.645 confd[71]: devel-c get_elem error {external_timeout, ""} for callpoint cpa path /helloe:hello/measurement-capabilities/measurement-jobs/job-prioritization-support
10-Feb-2021::17:35:31.645 devel-c validate error {external_timeout, ""} for path /eypvalid:yang-provider-validation
maapi_diff_iterate() is a blocking call, so you will need another thread for at least that API call.
The deadlock that results in a timeout occurs because you are serving callpoint callbacks from the same thread that you call maapi_diff_iterate() from, and that call in turn invokes those callbacks.
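To illustrate the point above, here is a minimal sketch that uses plain pthreads and does not touch the ConfD library. All names (`struct sim`, `blocking_iterate_sim`, `serve_one_callback_sim`, `run_validation_sim`) are hypothetical stand-ins: the first function models a blocking call that cannot return until a data callback has been answered, the second models the loop that answers callbacks. Running the blocking call in its own thread lets the callback loop make progress; calling both from one thread would hang in the same way the log above shows.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state standing in for the worker socket. */
struct sim {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool request_pending;  /* a simulated get_elem request is waiting */
    bool request_served;   /* the callback loop has answered it */
    int  served_count;     /* number of callbacks answered so far */
};

/* Stand-in for a blocking call such as maapi_diff_iterate(): it triggers
 * a data callback and does not return until that callback is answered. */
static void blocking_iterate_sim(struct sim *s)
{
    pthread_mutex_lock(&s->lock);
    s->request_pending = true;
    pthread_cond_broadcast(&s->cond);
    while (!s->request_served)
        pthread_cond_wait(&s->cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
}

/* Stand-in for the loop that answers one callpoint callback. */
static void serve_one_callback_sim(struct sim *s)
{
    pthread_mutex_lock(&s->lock);
    while (!s->request_pending)
        pthread_cond_wait(&s->cond, &s->lock);
    s->request_pending = false;
    s->request_served = true;
    s->served_count++;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

static void *iterate_thread(void *arg)
{
    blocking_iterate_sim((struct sim *)arg);
    return NULL;
}

/* Runs the blocking call in a separate thread while this thread keeps
 * serving callbacks. Returns the number of callbacks served. */
int run_validation_sim(void)
{
    struct sim s;
    pthread_t tid;
    int rc;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    s.request_pending = false;
    s.request_served = false;
    s.served_count = 0;

    rc = pthread_create(&tid, NULL, iterate_thread, &s);
    assert(rc == 0);
    (void)rc;

    serve_one_callback_sim(&s);  /* this thread stays free to answer */
    pthread_join(tid, NULL);     /* the blocking call can now finish */

    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return s.served_count;
}
```

If `run_validation_sim()` instead called `blocking_iterate_sim()` directly before `serve_one_callback_sim()`, the first call would wait forever, because nothing else would ever set `request_served`.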
@cohult, just to avoid problems in the future, I would like to check whether there is any documentation saying which APIs are blocking. Or are all MAAPI calls blocking?
Where it makes sense. For example, maapi_diff_iterate() returns CONFD_OK or an error. There is no function to check for the result, and no callback that is invoked when it is done. Hence, it waits for the iter() function to do its work and returns the result once iter() is done.
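The shape of such a call can be sketched in plain C as follows. This is only an illustration of the calling convention, not the ConfD implementation: `diff_iterate_sim`, `iter_fn_sim`, the `*_SIM` return codes, and the example paths are all hypothetical. The point is that the function drives every iter() invocation itself and only then returns its status, so the caller is blocked for the whole duration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for the iteration callback. */
enum iter_ret_sim { ITER_STOP_SIM = 0, ITER_CONTINUE_SIM = 1 };

/* Hypothetical callback type: called once per changed path. */
typedef enum iter_ret_sim (*iter_fn_sim)(const char *path, void *state);

/* Stand-in for a blocking iterate call: invokes iter() for each path
 * and returns 0 only after the last iter() call has returned. */
int diff_iterate_sim(const char *const *paths, size_t n,
                     iter_fn_sim iter, void *state)
{
    assert(iter != NULL);
    for (size_t i = 0; i < n; i++) {
        if (iter(paths[i], state) == ITER_STOP_SIM)
            break;
    }
    return 0;
}

/* Example callback: counts the changes it is shown. */
static enum iter_ret_sim count_changes(const char *path, void *state)
{
    (void)path;
    (*(int *)state)++;
    return ITER_CONTINUE_SIM;
}

/* Returns the number of changes seen, or -1 on error. */
int count_all_changes_sim(void)
{
    static const char *const paths[] = { "/a", "/a/b", "/c" };
    int count = 0;
    int rc = diff_iterate_sim(paths, 3, count_changes, &count);
    /* Every iter() call has completed by the time we get here. */
    return rc == 0 ? count : -1;
}
```

Because the result is only available as the return value, there is nothing to poll and no completion callback to wait on, which is why the call has to be treated as blocking.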
I believe that’s pretty clear.
I had a try, but it does not seem to work, or I missed something. If I understand correctly, we still need to wait for the thread running maapi_diff_iterate() to return before returning CONFD_OK in the validation callback, right? The problem here is that validation and get_elem are called at the same time, which should not happen according to the documentation.
So, would it be possible to create a new thread only for the validator? There is only the function confd_trans_set_fd, but no confd_validation_set_fd.
Hi @charlie.zha, the problem with validation and get_elem being called at the same time, due to, for example, a when statement, was addressed in ConfD 7.4. Upgrade to 7.4.x or later, or create a separate daemon context and control socket.