Crash report after confd phase0 started

Hi,
During confd starts, there is a crash report listed below. After confd started, all edit-config from 830 port failed from client view. Does this crash impact the get and edit-config operation from 830 port? Thank you very much.

2020-12-10T06:53:45.123411770Z =CRASH REPORT==== 10-Dec-2020::06:53:45.113508 ===
2020-12-10T06:53:45.123428100Z crasher:
2020-12-10T06:53:45.123441377Z initial call: notif_server:stream_handler/1
2020-12-10T06:53:45.123444805Z pid: <0.190.0>
2020-12-10T06:53:45.123447916Z registered_name:
2020-12-10T06:53:45.123450792Z exception exit: {noproc,
2020-12-10T06:53:45.123453606Z {gen_server,call,
2020-12-10T06:53:45.123456515Z [replay_server,
2020-12-10T06:53:45.123459418Z {save_notif,‘mycompany-DATA-CHANGE’,
2020-12-10T06:53:45.123462507Z {19,{2020,12,10,6,53,45,110414,0,0}},
2020-12-10T06:53:45.123465498Z {{aaa_server,chk_notification_access,
2020-12-10T06:53:45.123468527Z [‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’,
2020-12-10T06:53:45.123471756Z ‘push-change-updates-not-created’]},
2020-12-10T06:53:45.123474757Z [{[‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’|
2020-12-10T06:53:45.123477817Z ‘push-change-updates-not-created’],
2020-12-10T06:53:45.123480885Z leaf}]}},
2020-12-10T06:53:45.123483749Z infinity]}}
2020-12-10T06:53:45.123486659Z in function gen_server:call/3 (gen_server.erl, line 234)
2020-12-10T06:53:45.123490060Z in call from notif_server:stream_handler_msg/2 (notif_server.erl, line 972)
2020-12-10T06:53:45.123493293Z ancestors: [notif_server,confd_second_sup,confd_sup,<0.48.0>]
2020-12-10T06:53:45.123496448Z message_queue_len: 0
2020-12-10T06:53:45.123499334Z messages:
2020-12-10T06:53:45.123502173Z links: [<0.188.0>]
2020-12-10T06:53:45.123505256Z dictionary:
2020-12-10T06:53:45.123508103Z trap_exit: false
2020-12-10T06:53:45.123510998Z status: running
2020-12-10T06:53:45.123514028Z heap_size: 376
2020-12-10T06:53:45.123517143Z stack_size: 27
2020-12-10T06:53:45.123520213Z reductions: 248
2020-12-10T06:53:45.123523425Z neighbours:
2020-12-10T06:53:45.123526473Z
2020-12-10T06:53:45.147141293Z =ERROR REPORT==== 10-Dec-2020::06:53:45.124741 ===
2020-12-10T06:53:45.147153357Z ** Genmycom server notif_server terminating
2020-12-10T06:53:45.147155886Z ** Last message in was {‘EXIT’,<0.190.0>,
2020-12-10T06:53:45.147158397Z {noproc,
2020-12-10T06:53:45.147160457Z {gen_server,call,
2020-12-10T06:53:45.147162466Z [replay_server,
2020-12-10T06:53:45.147164343Z {save_notif,‘mycompany-DATA-CHANGE’,
2020-12-10T06:53:45.147171326Z {19,{2020,12,10,6,53,45,110414,0,0}},
2020-12-10T06:53:45.147173837Z {{aaa_server,chk_notification_access,
2020-12-10T06:53:45.147175937Z [‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’,
2020-12-10T06:53:45.147178067Z ‘push-change-updates-not-created’]},
2020-12-10T06:53:45.147180047Z [{[‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’|
2020-12-10T06:53:45.147182155Z ‘push-change-updates-not-created’],
2020-12-10T06:53:45.147184153Z leaf}]}},
2020-12-10T06:53:45.147186150Z infinity]}}}
2020-12-10T06:53:45.147188137Z ** When Server state == {state,1,#{},#{},
2020-12-10T06:53:45.147190210Z [{target_info,yang_push_handler,
2020-12-10T06:53:45.147192160Z yang_push_handler,datastore},
2020-12-10T06:53:45.147194160Z {target_info,‘NETCONF’,<0.189.0>,stream},
2020-12-10T06:53:45.147196310Z {target_info,‘mycompany-DATA-CHANGE’,<0.190.0>,
2020-12-10T06:53:45.147198427Z stream}],
2020-12-10T06:53:45.147200355Z }
2020-12-10T06:53:45.147202555Z ** Reason for termination ==
2020-12-10T06:53:45.147205448Z ** {{noproc,
2020-12-10T06:53:45.147208995Z {gen_server,call,
2020-12-10T06:53:45.147212058Z [replay_server,
2020-12-10T06:53:45.147214947Z {save_notif,‘mycompany-DATA-CHANGE’,
2020-12-10T06:53:45.147218044Z {19,{2020,12,10,6,53,45,110414,0,0}},
2020-12-10T06:53:45.147221072Z {{aaa_server,chk_notification_access,
2020-12-10T06:53:45.147224095Z [‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’,
2020-12-10T06:53:45.147226245Z ‘push-change-updates-not-created’]},
2020-12-10T06:53:45.147228324Z [{[‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’|
2020-12-10T06:53:45.147230348Z ‘push-change-updates-not-created’],
2020-12-10T06:53:45.147232327Z leaf}]}},
2020-12-10T06:53:45.147234218Z infinity]}},
2020-12-10T06:53:45.147236109Z [{notif_server,delete_subscription_internal,4,
2020-12-10T06:53:45.147238072Z [{file,“notif_server.erl”},{line,1938}]},
2020-12-10T06:53:45.147240167Z {notif_server,handle_info,2,[{file,“notif_server.erl”},{line,728}]},
2020-12-10T06:53:45.147242793Z {gen_server,try_dispatch,4,[{file,“gen_server.erl”},{line,653}]},
2020-12-10T06:53:45.147244915Z {gen_server,handle_msg,6,[{file,“gen_server.erl”},{line,727}]},
2020-12-10T06:53:45.147248923Z {proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,249}]}]}
2020-12-10T06:53:45.147251076Z
2020-12-10T06:53:45.148219271Z =CRASH REPORT==== 10-Dec-2020::06:53:45.147101 ===
2020-12-10T06:53:45.148228527Z crasher:
2020-12-10T06:53:45.148231030Z initial call: notif_server:init/1
2020-12-10T06:53:45.148233273Z pid: <0.188.0>
2020-12-10T06:53:45.148235712Z registered_name: notif_server
2020-12-10T06:53:45.148237771Z exception error: {noproc,
2020-12-10T06:53:45.148239799Z {gen_server,call,
2020-12-10T06:53:45.148241756Z [replay_server,
2020-12-10T06:53:45.148243741Z {save_notif,‘mycompany-DATA-CHANGE’,
2020-12-10T06:53:45.148245798Z {19,{2020,12,10,6,53,45,110414,0,0}},
2020-12-10T06:53:45.148249321Z {{aaa_server,chk_notification_access,
2020-12-10T06:53:45.148251390Z [‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’,
2020-12-10T06:53:45.148253537Z ‘push-change-updates-not-created’]},
2020-12-10T06:53:45.148255532Z [{[‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’|
2020-12-10T06:53:45.148257725Z ‘push-change-updates-not-created’],
2020-12-10T06:53:45.148259794Z leaf}]}},
2020-12-10T06:53:45.148261878Z infinity]}}
2020-12-10T06:53:45.148263870Z in function notif_server:delete_subscription_internal/4 (notif_server.erl, line 1938)
2020-12-10T06:53:45.148266026Z in call from notif_server:handle_info/2 (notif_server.erl, line 728)
2020-12-10T06:53:45.148267982Z in call from gen_server:try_dispatch/4 (gen_server.erl, line 653)
2020-12-10T06:53:45.148270042Z in call from gen_server:handle_msg/6 (gen_server.erl, line 727)
2020-12-10T06:53:45.148272018Z ancestors: [confd_second_sup,confd_sup,<0.48.0>]
2020-12-10T06:53:45.148274136Z message_queue_len: 0
2020-12-10T06:53:45.148276098Z messages:
2020-12-10T06:53:45.148277995Z links: [<0.189.0>,<0.94.0>]
2020-12-10T06:53:45.148280198Z dictionary:
2020-12-10T06:53:45.148282076Z trap_exit: true
2020-12-10T06:53:45.148283920Z status: running
2020-12-10T06:53:45.148285745Z heap_size: 6772
2020-12-10T06:53:45.148287566Z stack_size: 27
2020-12-10T06:53:45.148289398Z reductions: 13271
2020-12-10T06:53:45.148291266Z neighbours:
2020-12-10T06:53:45.148293117Z neighbour:
2020-12-10T06:53:45.148300568Z pid: <0.189.0>
2020-12-10T06:53:45.148302642Z registered_name:
2020-12-10T06:53:45.148304605Z initial call: notif_server:stream_handler/1
2020-12-10T06:53:45.148306597Z current_function: {notif_server,stream_handler,1}
2020-12-10T06:53:45.148308557Z ancestors: [notif_server,confd_second_sup,confd_sup,<0.48.0>]
2020-12-10T06:53:45.148310574Z message_queue_len: 0
2020-12-10T06:53:45.148312450Z links: [<0.188.0>]
2020-12-10T06:53:45.148314478Z trap_exit: false
2020-12-10T06:53:45.148316326Z status: waiting
2020-12-10T06:53:45.148318190Z heap_size: 233
2020-12-10T06:53:45.148320034Z stack_size: 6
2020-12-10T06:53:45.148321870Z reductions: 18
2020-12-10T06:53:45.148323730Z current_stacktrace: [{notif_server,stream_handler,1,
2020-12-10T06:53:45.148325740Z [{file,“notif_server.erl”},{line,755}]},
2020-12-10T06:53:45.148328375Z {proc_lib,init_p_do_apply,3,
2020-12-10T06:53:45.148330374Z [{file,“proc_lib.erl”},{line,249}]}]
2020-12-10T06:53:45.149116998Z =SUPERVISOR REPORT==== 10-Dec-2020::06:53:45.148633 ===
2020-12-10T06:53:45.149121617Z supervisor: {local,confd_second_sup}
2020-12-10T06:53:45.149123722Z errorContext: child_terminated
2020-12-10T06:53:45.149125677Z reason: {{noproc,
2020-12-10T06:53:45.149127584Z {gen_server,call,
2020-12-10T06:53:45.149129514Z [replay_server,
2020-12-10T06:53:45.149131503Z {save_notif,‘mycompany-DATA-CHANGE’,
2020-12-10T06:53:45.149133465Z {19,{2020,12,10,6,53,45,110414,0,0}},
2020-12-10T06:53:45.149135466Z {{aaa_server,chk_notification_access,
2020-12-10T06:53:45.149137680Z [‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’,
2020-12-10T06:53:45.149139754Z ‘push-change-updates-not-created’]},
2020-12-10T06:53:45.149141744Z [{[‘urn:rdns:com:mycompany:oammodel:mycompany-notifications’|
2020-12-10T06:53:45.149143776Z ‘push-change-updates-not-created’],
2020-12-10T06:53:45.149145849Z leaf}]}},
2020-12-10T06:53:45.149147770Z infinity]}},
2020-12-10T06:53:45.149149703Z [{notif_server,delete_subscription_internal,4,
2020-12-10T06:53:45.149151658Z [{file,“notif_server.erl”},{line,1938}]},
2020-12-10T06:53:45.149153810Z {notif_server,handle_info,2,
2020-12-10T06:53:45.149155954Z [{file,“notif_server.erl”},{line,728}]},
2020-12-10T06:53:45.149161695Z {gen_server,try_dispatch,4,[{file,“gen_server.erl”},{line,653}]},
2020-12-10T06:53:45.149163942Z {gen_server,handle_msg,6,[{file,“gen_server.erl”},{line,727}]},
2020-12-10T06:53:45.149166010Z {proc_lib,init_p_do_apply,3,
2020-12-10T06:53:45.149167931Z [{file,“proc_lib.erl”},{line,249}]}]}
2020-12-10T06:53:45.149169979Z offender: [{pid,<0.188.0>},
2020-12-10T06:53:45.149171994Z {id,notif_server},
2020-12-10T06:53:45.149173950Z {mfargs,{notif_server,start_link,}},
2020-12-10T06:53:45.149175886Z {restart_type,permanent},
2020-12-10T06:53:45.149177796Z {shutdown,2000},
2020-12-10T06:53:45.149179694Z {child_type,worker}]
2020-12-10T06:53:45.179563125Z 10-Dec-2020::06:53:45.179 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: - ConfD phase1 started
2020-12-10T06:53:45.296866926Z 10-Dec-2020::06:53:45.296 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: - Starting to listen for NETCONF TCP on 127.0.0.1:2023
2020-12-10T06:53:45.350577155Z 10-Dec-2020::06:53:45.349 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: - ConfD started vsn: 7.4.1

Here is the log of “edit-config”:

2020-12-10T06:54:21.611742463Z 10-Dec-2020::06:54:21.611 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/19 assigned to groups: users,system-admin,system-security-admin
2020-12-10T06:54:21.612147498Z 10-Dec-2020::06:54:21.611 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/19 created new session via netconf from 100.109.0.89:33550 with ssh

2020-12-10T06:54:27.846300296Z 10-Dec-2020::06:54:27.846 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/19 commit thandle 18 begin
2020-12-10T06:54:27.846526022Z 10-Dec-2020::06:54:27.846 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/19 commit thandle 18 /helloe:hello/message set to “Hello,Current Time:10/12/20 06:54:27”
2020-12-10T06:54:27.846672702Z 10-Dec-2020::06:54:27.846 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/19 commit thandle 18 end
2020-12-10T06:54:30.985576618Z 10-Dec-2020::06:54:30.985 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/19 terminated session (reason: normal)

2020-12-10T06:54:25.601839708Z 10-Dec-2020::06:54:25.601 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/20 assigned to groups: users,system-admin,system-security-admin
2020-12-10T06:54:25.602232085Z 10-Dec-2020::06:54:25.602 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/20 created new session via netconf from 100.109.0.89:33550 with ssh

2020-12-10T06:54:37.227362746Z 10-Dec-2020::06:54:37.227 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/20 commit thandle 19 begin
2020-12-10T06:54:37.227567321Z 10-Dec-2020::06:54:37.227 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/20 commit thandle 19 /helloe:hello/message set to “Hello,Current Time:10/12/20 06:54:34”
2020-12-10T06:54:37.227648290Z 10-Dec-2020::06:54:37.227 mycom-cm-yang-provider-5f957f9754-2wcg2 confd[49]: audit user: user1/20 commit thandle 19 end

BRs
Michael

It looks to me that you are trying to use ConfD notification server before it is fully up and running, is it possible? If it is so, while ConfD should be somewhat more resilient to such situations, you still should wait for ConfD to start properly, for instance using confd --wait-started or something in this vein.

1 Like

For the sake of completeness, there is a section “Starting ConfD” in the chapter “Advanced Topics” in the ConfD User Guide that shows how to start and synchronize ConfD phases and what you can do in each phase.

1 Like

Hi mvf,
Would you please have a look at this error which also impact the netconf operation? Do you think the netconf function will be recovered if there is NO this kind of this error in the next another in-service data model?

2020-12-10T07:03:15.694460968Z <INFO> 10-Dec-2020::07:03:15.693 eric-cm-yang-provider-6d96c8bf4b-fc89t confd[49]: - Upgrade init succeeded
2020-12-10T07:03:15.702503061Z =CRASH REPORT==== 10-Dec-2020::07:03:15.694141 ===
2020-12-10T07:03:15.702522914Z   crasher:
2020-12-10T07:03:15.702526461Z     initial call: aaa_server:'-read_aaa_fun/3-fun-0-'/0
2020-12-10T07:03:15.702529798Z     pid: <0.551.0>
2020-12-10T07:03:15.702533158Z     registered_name: []
2020-12-10T07:03:15.702535962Z     exception error: no match of right hand side value 
2020-12-10T07:03:15.702539008Z                      {error,{cs_error,undefined,misc,upgrade_in_progress,
2020-12-10T07:03:15.702542154Z                                       undefined,
2020-12-10T07:03:15.702545335Z                                       ['http://tail-f.com/ns/confd/1.0'|
2020-12-10T07:03:15.702547677Z                                        'upgrade-in-progress'],
2020-12-10T07:03:15.702560970Z                                       undefined,undefined,undefined,false}}
2020-12-10T07:03:15.702563282Z       in function  aaa_server:read_aaa/2 (aaa_server.erl, line 4201)
2020-12-10T07:03:15.702565299Z       in call from aaa_server:'-read_aaa_fun/3-fun-0-'/3 (aaa_server.erl, line 4078)
2020-12-10T07:03:15.702567395Z     ancestors: [aaa_server,confd_second_sup,confd_sup,<0.48.0>]
2020-12-10T07:03:15.702569507Z     message_queue_len: 0
2020-12-10T07:03:15.702571310Z     messages: []
2020-12-10T07:03:15.702573072Z     links: []
2020-12-10T07:03:15.702574902Z     dictionary: []
2020-12-10T07:03:15.702576668Z     trap_exit: false
2020-12-10T07:03:15.702578612Z     status: running
2020-12-10T07:03:15.702593338Z     heap_size: 610
2020-12-10T07:03:15.702595150Z     stack_size: 27
2020-12-10T07:03:15.702597282Z     reductions: 248
2020-12-10T07:03:15.702599350Z   neighbours:
2020-12-10T07:03:15.702601123Z 

BRs
Michael

Yes, I believe it can impact the NETCONF server, and I find it likely that it does not fully recover. Again, the problem is that ConfD is asked to do stuff (reload AAA in this case, I think) while it is busy doing important bookkeeping.

1 Like

Hi mvf,
For why does the crash show, you think ConfD is busy doing import bookkeeping, does that mean the CPU resources is not enough at that time?
BRs
Michael

Hi Michael,

I can see that you or a colleague of yours you filed an RT ticket on the “reloading AAA” issue back in November.

In this forum, you should create a new topic for each unrelated issue/question. This is a public forum and the intention is to share our ConfD NETCONF RESTCONF YANG etc. knowledge and solutions to with other forums users and visitors. Therefore it will be hard to find information if the headline says one thing while the content is unrelated.

Regarding reloading AAA, as was answered in that RT ticket by the Tail-f support team:

Reloading the aaa (maapi_aaa_reload) does not require a transaction since it is just a re-population of the aaa data in an internal aaa cache, thus nothing is written to CDB hence a transaction is not needed per se.

That said, considering you do want to synchronize an in-service upgrade operation with a reload of the AAA/NACM config into the aaa cache,

I suggest that you do that by making sure you take a lock before you do the reloading of the aaa cache. You could e.g do that with a global lock using maapi_lock(). Note we also provide a maapi_is_lock_set() which you can use to first check if there is already a lock taken.

Thus this way you can make sure you are synchronizing these events.

Best regards

1 Like