ConfD User Community

Create-subscription rpc gets stuck

We use the confd7.5.5.1 and hit this issue.
There have been some sessions that send rpc success and receive the success response and also receive the changed data notification. All works well.
At some point, a new session sends and gets timeout. and later, it re-sends and gets failure like below:


<?xml version="1.0" encoding="UTF-8"?><rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="206">

<create-subscription

xmlns="urn:ietf:params:xml:ns:netconf:notification:1.0">

<stream>exa-events</stream>

</create-subscription>

</rpc>

<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="206"><rpc-error>

<error-type>protocol</error-type>

<error-tag>operation-failed</error-tag>

<error-severity>error</error-severity>

<error-message xml:lang="en">A subscription is already active for this session</error-message></rpc-error>

</rpc-reply>

I look into the confd_netconf.log and found every timeout, there is something handling in other session. Don’t know whether this will impact the response.

Please refer to the below snippet of confd_netconf.log:
The confd took around 3m to respond to the rpc while processing the ont-arrival notification**

....
<INFO> 23-Apr-2022::10:25:47.059 eolus confd[9732]: netconf id=12 got rpc: {urn:ietf:params:xml:ns:netconf:notification:1.0}create-subscription attrs: message-id="m-62642771afa4e6201529724f-0"
<INFO> 23-Apr-2022::10:25:47.063 eolus confd[9732]: netconf id=12 create-subscription stream='exa-events' attrs: message-id="m-62642771afa4e6201529724f-0"
<INFO> 23-Apr-2022::10:25:47.067 eolus confd[9732]: netconf id=12 sending rpc-reply, attrs: message-id="m-62642771afa4e6201529724f-0"
...
...
<INFO> 23-Apr-2022::10:29:05.460 eolus confd[9732]: **netconf id=11** got rpc: {urn:ietf:params:xml:ns:netconf:notification:1.0}**create-subscription** attrs: message-id="201"
<INFO> 23-Apr-2022::10:29:05.463 eolus confd[9732]: netconf id=11 create-subscription stream='exa-events' attrs: message-id="201"
<INFO> 23-Apr-2022::10:32:09.556 eolus confd[9732]: netconf id=12 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 23-Apr-2022::10:32:09.563 eolus confd[9732]: netconf id=11 sending rpc-reply, attrs: message-id="201"

<INFO> 23-Apr-2022::10:32:09.565 eolus confd[9732]: netconf id=12 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival

There are 2 questions:

  1. why does create-subscription rpc get stuck when there is something handling in other session?
  2. How to cancel subscription and then re-create subscription? now the NBI re-sends and gets failure and then stop the following process. It’s a critical issue since it stops and blocks all the following process.

Please help to address it. Thanks.

You may have hit a known (mis)feature: when a NETCONF session A subscribes to notifications and another session B sends a RPC that should cause a notification to be sent to A, B does not get a reply until the notification has been sent. If the session A is processing another long-running request, the reply to B is blocked. Is it possible that this is what you see?

There are some attempts to fix this scenario, but the forthcoming release 7.8 does not contain the fix yet, so you might need work around that, e.g. to split the session that causes the delay (in your log that would probably be session 12) to two, one that has these long-running requests, another that subscribes to events.

@mvf, Sounds like what’s your description is similar with what I have seen, but not same. Our case is:
When a NETCONF session A subscribes to notifications and our EDP has some data change like onts’ arrival or departure, so our EDP generates relative external events like “ont-arrival”, “ont-departure” to the confd, and then confd tries to send these notifications to all sessions like B, C ,etc, which has subscribed the notification.
In this case, session A will be blocked for several minutes. But it’s not always blocked in the same case.
Please take a look at the below logs, between receiving create-subscription and replying create-subscription, there are lots of notifications, but it’s very quickly.

<INFO> 25-Apr-2022::19:23:47.394 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.405 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.410 NAPd-E3-2G-023 confd[9897]: netconf id=39 got rpc: {urn:ietf:params:xml:ns:netconf:notification:1.0}create-subscription attrs: message-id="199"
<INFO> 25-Apr-2022::19:23:47.424 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.436 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.446 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.456 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.465 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.478 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.490 NAPd-E3-2G-023 confd[9897]: netconf id=11 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.497 NAPd-E3-2G-023 confd[9897]: netconf id=12 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.501 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.510 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.512 NAPd-E3-2G-023 confd[9897]: netconf id=12 got rpc: {urn:ietf:params:xml:ns:netconf:base:1.0}get attrs: message-id="m-62642771afa4e6201529724f-1663"
<INFO> 25-Apr-2022::19:23:47.515 NAPd-E3-2G-023 confd[9897]: netconf id=12 get attrs: message-id="m-62642771afa4e6201529724f-1663"
<INFO> 25-Apr-2022::19:23:47.525 NAPd-E3-2G-023 confd[9897]: netconf id=12 sending rpc-reply, attrs: message-id="m-62642771afa4e6201529724f-1663"
<INFO> 25-Apr-2022::19:23:47.526 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.540 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.548 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.558 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.567 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.582 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.602 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.619 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.635 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.651 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.663 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.675 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.687 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.699 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.710 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.724 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.735 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.756 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.771 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.788 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.800 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.812 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.823 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.833 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.844 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.856 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.867 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.879 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.889 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.899 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.907 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.915 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.922 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.930 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.937 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.945 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.958 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.965 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:47.972 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.984 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:47.994 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:48.005 NAPd-E3-2G-023 confd[9897]: netconf id=39 create-subscription stream='exa-events' attrs: message-id="199"
<INFO> 25-Apr-2022::19:23:48.008 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:48.020 NAPd-E3-2G-023 confd[9897]: netconf id=39 sending rpc-reply, attrs: message-id="199"
<INFO> 25-Apr-2022::19:23:48.024 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:48.040 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-link
<INFO> 25-Apr-2022::19:23:48.062 NAPd-E3-2G-023 confd[9897]: netconf id=38 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 25-Apr-2022::19:23:48.070 NAPd-E3-2G-023 confd[9897]: netconf id=39 sending notification {http://www.calix.com/ns/exa/base}environment-door
...
...

so there is no long-running requests from the netconf session, right? any idea?
Thanks a lot.

I suggest you take a look at other logs, like devel.log (possibly with developerLogLevel set to trace), to inspect what is going on in case you wait so long for a reply. You may also try NETCONF trace log; but if there is indeed a problem with large messages sent between ConfD and one of the daemons, the trace log will also become quickly very large.

Thanks @mvf , there is no any logs in the confd_developer.log.
Could you help me to understand what’s the meaning of “notification replay”?

...
<INFO> 23-Apr-2022::10:32:21.379 eolus confd[9732]: netconf id=11 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 23-Apr-2022::10:32:21.391 eolus confd[9732]: netconf id=11 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 23-Apr-2022::10:32:21.399 eolus confd[9732]: netconf id=11 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 23-Apr-2022::10:32:21.406 eolus confd[9732]: netconf id=11 sending notification {http://www.calix.com/ns/exa/gpon-interface-base}ont-arrival
<INFO> 23-Apr-2022::10:32:21.409 eolus confd[9732]: netconf id=11 sending notification replayComplete
..

I see if a session sends “create-subscription”, the confd sends rpc-reply and also sends lots of notifications to this session even at that time there is no data changes from our EDP. I don’t know how the confd gets these data changes or events raised earlier by our EDP. Is there any callback to our EDP to retrieve that? I don’t find that.
Thanks.

That might well explain the problem you see. ConfD supports notification replay, it can be either built-in (which is probably your case) or custom. If configured, when a NETCONF client sends a create-subscription message with parameter startTime, the server is supposed to immediately start streaming notifications that has been created since that startTime value; see the RFC section. If your replay buffer is large, this becomes just another instance of the problem with NETCONF sessions blocked due to notifications.

You have a couple of configuration options:

  • Turn of notifications replay altogether (for your stream). In confd.conf, just modify replaySupport to false for your stream - right now, you obviously have this there:

      <replaySupport>true</replaySupport>
    
  • Limit the replay size; there are two parameters affecting that, maxSize and maxFiles. I’m not really familiar with how they work, you might need to experiment to find reasonable values.

  • Implement your own custom replay where you can limit replays as you please.

  • When creating a subscription, either leave out startTime paramenter (without the paramenter the NETCONF server does not do any replay) or limit it to a not-so-distant past so that the number of notifications is small.

The feature is well documented in the ConfD user guide, and the NETCONF notification example shows how to use it.