Performance problem when using XPath in a must statement to validate a large list

Here is a simplified demo; the actual model is more complicated.

  container A {
    leaf A1;
    list A2 {
      key index; // 1~8192
      leaf name;
      leaf xx-counter;
    }
  }

  container B {
    leaf B1;
    list B2 {
      key name;
      leaf name {
        must "/a:A/a:A2[name='abc' or name='efg']/a:xx-counter < 5" {
          error-message "xxxxx";
        }
      }
    }
  }

I checked xpath.trace.log and found something unexpected:
ConfD has to iterate over list A2 (index > 100) stored in the candidate database to find the entries named abc or efg.

I also tried adding a validation point that does the same check in C++ using the maapi_* API, but it did not work.

Is there an effective way to solve this performance problem? I would prefer a must statement in YANG over a callback function.

Your must statement should look like the following:

      must "/a:A/a:A2[name='abc']/a:xx-counter < 5 or
           /a:A/a:A2[name='efg']/a:xx-counter < 5 " {
           error-message "xxxxx";
      }

You should also be able to write your own validation callpoint for this. What was the problem with your validation point?

Thanks for your reply.

I made an example to explain my concern.
The YANG file:

  container A {
    leaf A1 {
      type string;
    }
    list A2 {
      key index;
      leaf index {
        type uint16;
      }
      leaf name {
        type string;
      }
      leaf counter {
        type uint8;
      }
    }
  }

  container B {
    list B2 {
      key name;
      leaf name {
        type string;
//        must "count(../../../A/A2[(name='abc' or name='efg') and counter < 5]) > 0" {
        must "count(../../../A/A2[name='abc' and counter < 5]) > 0"+
             " or count(../../../A/A2[name='efg' and counter < 5]) > 0" {
          error-message "invalid value !!!";
        }
      }
      leaf val {
        type uint8;
      }
    }
  }

The audit log for the first must condition, which takes 5 s:

    admin@s100-test-vm% set B B2 ccc val 8
    [ok][2016-03-17 13:57:51]

    [edit]
    admin@s100-test-vm% commit
    Aborted: 'B B2 ccc name' (value "ccc"): invalid value !!!
    [error][2016-03-17 13:57:56]

The audit log for the second must condition, which takes 2 s:

    admin@s100-test-vm% set B B2 ccc val 8
    [ok][2016-03-17 14:03:56]

    [edit]
    admin@s100-test-vm% commit
    Aborted: 'B B2 ccc name' (value "ccc"): invalid value !!!
    [error][2016-03-17 14:03:58]

The following is the xpath.trace log:

17-Mar-2016::13:57:56.323 Evaluating XPath for: /sys:B/B2{ccc}/name:
  count(../../../A/A2[(name='abc' or name='efg') and counter < 5]) > 0
get_next(/sys:A/A2) = {1}
get_elem("/sys:A/A2{1}/name") = a
get_elem("/sys:A/A2{1}/name") = a
get_next(/sys:A/A2{1}) = {2}
get_elem("/sys:A/A2{2}/name") = abc
get_elem("/sys:A/A2{2}/counter") = 5
get_elem("/sys:A/A2{2}/name") = abc
get_next(/sys:A/A2{2}) = {3}
get_elem("/sys:A/A2{3}/name") = a
get_elem("/sys:A/A2{3}/name") = a
get_next(/sys:A/A2{3}) = {4}
get_elem("/sys:A/A2{4}/name") = efg
get_elem("/sys:A/A2{4}/name") = efg
get_elem("/sys:A/A2{4}/counter") = 5
get_next(/sys:A/A2{4}) = {5}
get_elem("/sys:A/A2{5}/name") = a
get_elem("/sys:A/A2{5}/name") = a
get_next(/sys:A/A2{5}) = false
17-Mar-2016::13:57:56.330 XPath for: /sys:B/B2{ccc}/name returns false

17-Mar-2016::14:03:58.514 Evaluating XPath for: /sys:B/B2{ccc}/name:
  count(../../../A/A2[name='abc' and counter < 5]) > 0 or count(../../../A/A2[name='efg' and counter < 5]) > 0
get_next(/sys:A/A2) = {1}
get_elem("/sys:A/A2{1}/name") = a
get_next(/sys:A/A2{1}) = {2}
get_elem("/sys:A/A2{2}/name") = abc
get_elem("/sys:A/A2{2}/counter") = 5
get_next(/sys:A/A2{2}) = {3}
get_elem("/sys:A/A2{3}/name") = a
get_next(/sys:A/A2{3}) = {4}
get_elem("/sys:A/A2{4}/name") = efg
get_next(/sys:A/A2{4}) = {5}
get_elem("/sys:A/A2{5}/name") = a
get_next(/sys:A/A2{5}) = false
get_next(/sys:A/A2) = {1}
get_elem("/sys:A/A2{1}/name") = a
get_next(/sys:A/A2{1}) = {2}
get_elem("/sys:A/A2{2}/name") = abc
get_next(/sys:A/A2{2}) = {3}
get_elem("/sys:A/A2{3}/name") = a
get_next(/sys:A/A2{3}) = {4}
get_elem("/sys:A/A2{4}/name") = efg
get_elem("/sys:A/A2{4}/counter") = 5
get_next(/sys:A/A2{4}) = {5}
get_elem("/sys:A/A2{5}/name") = a
get_next(/sys:A/A2{5}) = false
17-Mar-2016::14:03:58.523 XPath for: /sys:B/B2{ccc}/name returns false

With your suggestion it takes less time, but ConfD still iterates over all entries.
If list A2 has more than 10,000 entries, validation will take a long time.
I think there will be a performance problem whenever there is a loop over a large amount of data, especially when the XPath expression does not use a key.

It seems that the YANG must statement cannot handle large data sets, so I tried using a callback function instead.
My project has the following data model:
/…/A{1~128}/B{1~65535}/type (key values are not auto-incremented)
I have to do something like this:

  1. Loop over A to find which key values exist.
  2. For each A found in step 1, loop over B to find which key values exist.
  3. Build the keypath string [e.g. /…/A{34}/B{60000}/type].

This feels very clumsy, but I have no idea how to validate this quickly.

The times you quoted are for the entire CLI operation, not just the XPath evaluations.

Based on your xpath.trace output above, the actual times taken to perform the XPath evaluations are 7 and 9 milliseconds, respectively (the difference between the "Evaluating XPath" and "returns" timestamps of each evaluation).

My previous must statement suggestion was based on your earlier, incomplete model, in which no index was used. It isn't optimal.

To address your performance concern, you can add the Tail-f secondary-index extension to your YANG model so that the list can also be searched by its name leaf. The XPath comparisons then no longer need to visit list entries that cannot match. Your YANG model would look like this:

  // requires: import tailf-common { prefix tailf; }
  list A2 {
    key index; // 1~8192
    tailf:secondary-index name {
      tailf:index-leafs "name";
    }
    leaf index {
      type uint16;
    }
    leaf name {
      type string;
      mandatory true;
    }
    leaf counter {
      type uint8;
    }
  }

Given your example configuration from the xpath.trace output, the new xpath.trace looks as follows:

17-Mar-2016::10:03:33.910 Evaluating XPath for: /a:B/B2{ccc}/name:
  /a:A/a:A2[name='abc']/a:counter < 5 or
/a:A/a:A2[name='efg']/a:counter < 5 
find_next(/a:A/A2, [name=abc], name) = {2}
get_elem("/a:A/A2{2}/name") = abc
get_elem("/a:A/A2{2}/counter") = 5
get_next(/a:A/A2{2}, name) = {3}
get_elem("/a:A/A2{3}/name") = b
find_next(/a:A/A2, [name=efg], name) = {4}
get_elem("/a:A/A2{4}/name") = efg
get_elem("/a:A/A2{4}/counter") = 5
get_next(/a:A/A2{4}, name) = false
17-Mar-2016::10:03:33.912 XPath for: /a:B/B2{ccc}/name returns false

The XPath evaluation now takes only 2 milliseconds for this small configuration on my test system. The time savings will be significant for a huge list.
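For reference, the fragments above can be assembled into one module roughly as follows. This is only a sketch: the module name `a` and the namespace `urn:a` are assumptions chosen to match the `a:` prefix in the trace, and the B container is copied from the example posted earlier in this thread.

```yang
module a {
  namespace "urn:a";   // assumed namespace, not given in the thread
  prefix a;

  // tailf-common provides the tailf:secondary-index extension
  import tailf-common {
    prefix tailf;
  }

  container A {
    list A2 {
      key index;
      // Lets the list be searched by name instead of being walked in key order
      tailf:secondary-index name {
        tailf:index-leafs "name";
      }
      leaf index {
        type uint16;
      }
      leaf name {
        type string;
        mandatory true;
      }
      leaf counter {
        type uint8;
      }
    }
  }

  container B {
    list B2 {
      key name;
      leaf name {
        type string;
        // One simple [name='...'] predicate per path, as in the trace above
        must "/a:A/a:A2[name='abc']/a:counter < 5 or "
           + "/a:A/a:A2[name='efg']/a:counter < 5" {
          error-message "invalid value !!!";
        }
      }
      leaf val {
        type uint8;
      }
    }
  }
}
```

Note that each path keeps a single equality predicate on name, which is the form shown with find_next in the trace; the combined `name='abc' or name='efg'` predicate from the first post was the one that walked the whole list.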

Thanks for your reply. I think this is what I am looking for.
However, another problem has come up. I have changed the demo so that it resembles the actual data model.

module sys {
  namespace "urn:sys";
  prefix sys;

//  import tailf-common {
//    prefix tailf;
//  }

  grouping fp_g {
    leaf name {
      type string;
      must "(../../aa/name='abc' and ../../aa/counter < 5)"+
        " or (../../aa/name='efg' and ../../aa/counter < 5)" {
        error-message "invalid value !!!";
      }
    }
    leaf val {
      type uint8;
    }
  }

  grouping fp {
    list fp {
      key id;
      leaf id {
        type uint16; // 1~65535
      }
      uses fp_g;
    }
  }

  container system {
    leaf info {
      type string;
    }
    list lif {
      key index;
      leaf index {
        type uint8; // 1~128
      }
      container aa {
        leaf name {
          type string;
        }
        leaf counter {
          type uint8;
        }
      }
      uses fp;
    }
  }
}

I read the ConfD User Guide to learn how to use secondary-index and tried to find the right place for it in the YANG model.
I am sorry to say that I failed, so I have to trouble you again.
My validation has to check the relationship between /lif/aa/name, /lif/aa/counter and /lif/fp/name.
Do you have any suggestions for me?

As stated in the ConfD User Guide, each leaf listed in a secondary-index must be a direct child of the list. The container named aa will need to be removed, with its leafs placed directly under the list, for my previously suggested solution to work.
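A sketch of what the restructured module might look like is given below. The final model is not shown in this thread, so the details are assumptions: name and counter are moved from container aa directly under list lif, the secondary index is declared on lif, and the must expression is adjusted only by dropping the aa/ step. Whether the must stays relative like this or is rewritten to search by name, as in the A2 example, is not stated here.

```yang
module sys {
  namespace "urn:sys";
  prefix sys;

  // Must no longer be commented out: tailf:secondary-index lives here
  import tailf-common {
    prefix tailf;
  }

  grouping fp_g {
    leaf name {
      type string;
      // ../.. is the enclosing lif entry; the aa/ step is gone (assumed rewrite)
      must "(../../name='abc' and ../../counter < 5)"+
        " or (../../name='efg' and ../../counter < 5)" {
        error-message "invalid value !!!";
      }
    }
    leaf val {
      type uint8;
    }
  }

  grouping fp {
    list fp {
      key id;
      leaf id {
        type uint16; // 1~65535
      }
      uses fp_g;
    }
  }

  container system {
    leaf info {
      type string;
    }
    list lif {
      key index;
      // The indexed leaf is now a direct child of the list
      tailf:secondary-index name {
        tailf:index-leafs "name";
      }
      leaf index {
        type uint8; // 1~128
      }
      leaf name {
        type string;
        mandatory true; // mirrors the earlier A2 suggestion
      }
      leaf counter {
        type uint8;
      }
      uses fp;
    }
  }
}
```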

It worked. Thank you very much!