Hi There,
I have a data like this
data a;
a=1; b=5.1; c=5.5; output;
a=1; b=5.2; c=5.5; output;
a=1; b=5.0; c=5.2; output;
a=1; b=4.7; c=5.0; output;
a=1; b=4.6; c=5.2; output;
a=1; b=.; c=4.6; output;
run;
I am trying to create a output in which for a given parent nodie (variable c) it should get all the childs/nodes (variable b) which should not have any child/nodes. for e.g in the above example for parent node with value 5.5 it should output 5.1, 4.7 and 4.6 which do not have any child.
Here is how the output is picked.
5.5 has 5.1 and 5.2 child, then 5.2 has child 5.0, 5.0 has child 4.7 and 5.2 has child 4.6. 4.6, 4.7 and 5.1 do not have any childs.
Please let me know if you need more expalnation.
Thanks in advance for the help.
Anand
.
Are we to understand that a child node has a number < parent node number? Why not code the child node number as missing when it is greater than the parent node?
Is your desired result a data set? Can you show us what your desired result looks like?
I this case the output will be
new variable
5.1
4.7
4.6
All this do not have any child nodes or children for the ancestor or parent 5.5.
Basically when user asks variable c with value 5.5 then output should be look like above.
Steps are
1. The program reads c variable where value is 5.5 and then looks for the children which is variable b and the values are 5.1 and 5.2
2. For each of b value which is 5.1 and 5.2 the program looks for children and their children are for 5.1 no children so it will appear in the report and 5.2 has children 5.0... and it goes on for all the childrens...
data have(keep=_start _end);
a=1; b='5.1'; c='5.5'; output;
a=1; b='5.2'; c='5.5'; output;
a=1; b='5.0'; c='5.2'; output;
a=1; b='4.7'; c='5.0'; output;
a=1; b='4.6'; c='5.2'; output;
a=1; b=' '; c='4.6'; output;
rename c=_start b=_end;
run;
proc sql;
create table _have as
select *
from have where _start not in (select _end from have);
quit;
data want(keep=path want);
if _n_ eq 1 then do;
length path _path $ 800 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();
end;
set _have;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
pa.add();
do while(hi_path.next()=0);
if n ne 1 then pa.remove(key:_n);_n=n;
_path=path;
_start=scan(path,-1,'|');
rc=ha.find();
if rc ne 0 then do; want=_start;output;end;
do while(rc=0);
if not findw(path,strip(_end),'|') then do;
if length(path)+length(_end)+1 gt lengthc(path) then do;
putlog 'ERROR: The length of path and _path are set too short';
stop;
end;
count+1;n=count;
path=catx('|',path,_end);
pa.add();
path=_path;
end;
rc=ha.find_next();
end;
end;
pa.clear();
run;
Thank you very much. Its kind of working. But I need to output this based on value a and c. In this case the variable value for a is 1 and c is 5.5.
data a;
a=1; b=5.1; c=5.5; output;
a=1; b=5.2; c=5.5; output;
a=1; b=5.0; c=5.2; output;
a=1; b=4.7; c=5.0; output;
a=1; b=4.6; c=5.2; output;
a=1; b=.; c=4.6; output;
a=2; b=5.4; c=6.1; output;
a=2; b=5.3; c=6.1; output;
a=2; b=5.0; c=5.3; output;
a=2; b=4.7; c=5.0; output;
a=2; b=4.5; c=4.7; output;
run;
For the second set of record where a=2 and lets say value c is 6.1 the children output must be as below
5.4
4.5
Only these two do not have children.
I am sorry for not mentioning this case earlier.
run;
You could split this table into two tables and running my code separately .
If you have many a, make a macro to go through all of a . That is not too difficulty .
You could split this table into two tables and running my code separately .
If you have many a, make a macro to go through all of a . That is not too difficulty .
OK. Since I have some time. Check the following code .
data have;
a=1; b=5.1; c=5.5; output;
a=1; b=5.2; c=5.5; output;
a=1; b=5.0; c=5.2; output;
a=1; b=4.7; c=5.0; output;
a=1; b=4.6; c=5.2; output;
a=1; b=.; c=4.6; output;
a=2; b=5.4; c=6.1; output;
a=2; b=5.3; c=6.1; output;
a=2; b=5.0; c=5.3; output;
a=2; b=4.7; c=5.0; output;
a=2; b=4.5; c=4.7; output;
rename c=_start b=_end;
run;
proc sql;
create table _have as
select distinct a,_start
from have where catx(' ',a,_start) not in (select catx(' ',a,_end) from have);
quit;
data want(keep=a _start want);
if _n_ eq 1 then do;
if 0 then set have;
length key n path 8;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('a','_start');
ha.definedata('_end');
ha.definedone();
declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();
declare hash k();
k.definekey('key');
k.definedone();
end;
set _have;
count=1;n=1;_n=1;
path=_start;pa.add();
key=_start;k.add();
do while(hi_path.next()=0);
if n ne 1 then pa.remove(key:_n);_n=n;
rc=ha.find(key:a,key:path);
if rc ne 0 then do; want=path;output;end;
do while(rc=0);
if k.check(key:_end) ne 0 then do;
count+1;n=count;
path=_end;
pa.add();
key=_end;
k.add();
end;
rc=ha.find_next();
end;
end;
pa.clear(); k.clear();
run;
Thank you.. I am able to solve this issue with another approach. Thanks for all your help
Can you help me in solving problem below. I am providing here the appropriate ancestor for each of the record. I need to find this ancestor for each of the record by user. Can you please help.
data a;
user=1; base=4.1; cur=4.2; output; /* 4.1 to 4.2 - Ancestor here is 4.4 */
user=1; base=4.2; cur=4.4; output; /* 4.2 to 4.4 - Ancestor here is 4.4 */
user=1; base=4.6; cur=5.5; output; /* 4.6 to 5.5 - Ancestor here is 6.1 */
user=1; base=5.5; cur=6.1; output; /* 5.5 to 6.1 - Ancestor here is 6.1 */
user=2; base=4.2; cur=4.6; output; /* 4.2 to 4.6 - Ancestor here is 6.1 */
user=2; base=4.6; cur=5.5; output; /* 4.6 to 5.5 - Ancestor here is 6.1 */
user=2; base=5.5; cur=6.1; output; /* 5.0 to 6.1 - Ancestor here is 6.1 */
user=3; base=4.0; cur=4.2; output; /* 4.0 to 4.2 - Ancestor here is 5.5 */
user=3; base=4.2; cur=4.6; output; /* 4.2 to 4.6 - Ancestor here is 5.5 */
user=3; base=4.6; cur=5.5; output; /* 4.6 to 6.1 - Ancestor here is 5.5 */
run;
I am able to achieve this if there is only one user in the dataset using code below. But its not working if there are more than one user.
data _tmp1(rename=(base=base1 cur=cur1));
set a;
run;
**Repeat the training module for each of the delta version between base and current module;
data _tmp2;
retain base cur;
set _tmp1; by u;
cur = cur1;
base = base1;
output;
drop base1 cur1 i;
run;
**Get all the base version for the current modue which has training then need to check the
corresponding base version full access;
data _lineage;
set _tmp2; by u;
array hist(6);
count=1;
hist(count)=base;
do i=1 to last while (count<dim(hist)-1);
set _tmp2(rename=(base=base1 cur=cur1)) nobs=last point=i;
if cur=base1 then do;
count+1;
hist(count)=base1;
base=base1;
cur=cur1;
i=1;
end;
end;
count+1;
hist(count)=cur;
ancest = cur;
run;
Hi ,
I have lot of records in the dataset not just 3. It will take lot of time to process if i write a macro to subset these. Is there a way i can do it in one step.
Thank you,
@anandbillava wrote:
Hi ,
I have lot of records in the dataset not just 3. It will take lot of time to process if i write a macro to subset these. Is there a way i can do it in one step.
Thank you,
Creating subsets won't decrease the time required to create the final result. Do you get any errors while processing the complete dataset.
I will have to create more than 4000 datasets which is am already doing and its taking lot of time to process.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.