BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Felmond
Fluorite | Level 6

Hello!

I'm currently stuck trying to do conditional inheritence:

 

Have:

DATA HAVE;
LENGTH Col1 $ 1 Col2 3 Col3 $ 3 Col4 $ 3;
INFILE DATALINES DELIMITER=',';
A,1,
A,2,foo
A,3,
B,1,bar
B,2,
B,1,
B,2,foo
C,1,
C,2,bar
C,3,
C,4,
C,3,
C,4,
D,1,
D,2,
D,3,
;
RUN;
Col1Col2Col3Col4
A1  
A2foo 
A3  
B1bar 
B2  
B1  
B2foo 
C1  
C2bar 
C3  
C4  
C3  
C4  
D1  
D2  
D3  

 

Want:

Col1Col2Col3Col4
A1  
A2foofoo
A3 foo
B1barbar
B2 bar
B1  
B2foofoo
C1  
C2barbar
C3 bar
C4 bar
C3 bar
C4 bar
D1  
D2  
D3  

 

In the above example Col1 is used to group the data while Col2 defines hierarchies within these groups. If Col3 contains a value I want Col4 to retain this value. If Col3 doesn't contain a value I want Col4 to inherit the value from the last observation but only within the same group and if the level in the hierarchy is descending in relation to the last observation in Col3 that held a value.

 

After googling and fiddling around with the LAG function (doesn't seem to be the right tool here?) and the RETAIN statement (failing at groups C and D in the example) for longer than I'd like to admit I remain clueless on how to archive this.

 

I'd be really grateful for a possible solution, comparable examples or a pointer in the right direction!

 

Regards

1 ACCEPTED SOLUTION

Accepted Solutions
PhilC
Rhodochrosite | Level 12

I wish base SAS had an linked list object.  I've needed something like this before.  And I ache for the days I could find a C library or even make a linked list object to code with.  A hash object could be used but it is not neccessary, and it doesn't help code readability, IMO.  Even though I've done this before with my own data, doing it with this data I feel like I came up with a completely different solution than I remembered before.  This code assumes a well formed input dataset, as I understand the problem. 

 

data have;
length n 8 col1 $ 1 col2 8 col3 $ 3 ;
infile datalines delimiter=',' truncover;
input col1 col2 col3  ;
n=_N_;
datalines;
A,1,
A,2,foo
A,3,
B,1,bar
B,2,
B,1,
B,2,foo
C,1,
C,2,bar
C,3,
C,4,
C,3, 
C,2, 
C,3,
C,4,
D,1,
D,2,
D,3,
;
run;


data want;
  do until (last.col1);
    set have;
      by col1;
    length _col2_x 8 col4 $3;
      retain _col2_x col4;
      /*_col2_x is the value col2 equaled when col3 was not missing; IOW the 
           col3 property set to be applied to the children of that parent.    */
      drop _:;
    if not missing(_col2_x) then do;
        if _col2_x>=col2 
          then call missing(_col2_x,col4);
      end;
    if not missing(col3) then do;
        _col2_x=col2;
        col4=col3;
      end;
    output;
  end;
  call missing(_col2_x, col4);
run;

 

Concepts for anyone looking up these methods, search for DOW loops, aka. Whitlock loops.

View solution in original post

9 REPLIES 9
andreas_lds
Jade | Level 19

The step creating work.have has some bugs: input and datalines statements are missing.

I prefer using a hash-object for such problems:

/* just to make improve readability of the declare statement */
data work.lookup / view=work.lookup;
   set have(drop=col4 rename=(col3=col4));
   where not missing(col4);
run;


data work.want;
   set work.have;
   
   if _n_ = 1 then do;
      declare hash h(dataset: 'work.lookup');
      h.defineKey('col1', 'col2');
      h.defineData('col4');
      h.defineDone();
   end;
   
   if not missing(col3) then do;
      col4 = col3;
   end;
   else do;
      rc = h.find(key: col1, key: col2-1);
      rc = h.ref();
   end;
   
   drop rc;
run;
Felmond
Fluorite | Level 6

Thank you very much for the suggestion! I definitly have to read up on hash objects.

One problem:

data have;
length col1 $ 1 col2 3 col3 $ 3 col4 $ 3;
infile datalines delimiter=',';
A,1,
A,2,foo
A,3,
B,1,bar
B,2,
B,1,
B,2,foo
C,1,
C,2,bar
C,3,
C,4,
C,3, /* new */
C,2, /* new */
C,3,
C,4,
D,1,
D,2,
D,3,
;
run;

data lookup;
   set have(drop=col4 rename=(col3=col4));
   where not missing(col4);
run;

data want;
   set have;
   
   if _n_ = 1 then do;
      declare hash h(dataset: 'lookup');
      h.defineKey('col1', 'col2');
      h.defineData('col4');
      h.defineDone();
   end;
   
   if not missing(col3) then do;
      col4 = col3;
   end;
   else do;
      rc = h.find(key: col1, key: col2-1);
      rc = h.ref();
   end;
   
   drop rc;
run;

I've inserted two new observations.

 

Expected result:

OBScol1col2col3col4
1A1

 

 
2A2foofoo
3A3 foo
4B1barbar
5B2 bar
6B1  
7B2foofoo
8C1  
9C2barbar
10C3 bar
11C4 bar
12C3 bar
13C2  
14C3  
15C4  
16D1  
17D2  
18D3  

 

Actual result:

OBScol1col2col3col4
1A1

 

 
2A2foofoo
3A3 foo
4B1barbar
5B2 bar
6B1  
7B2foofoo
8C1  
9C2barbar
10C3 bar
11C4 bar
12C3 bar
13C2  
14C3 bar
15C4 bar
16D1  
17D2  
18D3  

 

Observations 14 und 15 meet the criteria so they inherit the value. But hierarchically they belong to observation 13 which doesn't have an associated value. Is there a way to consider this?

PhilC
Rhodochrosite | Level 12

Row 12? seems to me it is not connected to the rows 9-11 either,  how could I explain that?

 

A1   B1'   B1   C1 ?? C3   C2   D1
             
A2*   B2'   B2*   C2'     ?? C3*   D2
                 
A3*           C3'     ?? C4*   D3
    nil                  
    foo*       C4'            
    bar'                    
Felmond
Fluorite | Level 6

I'll try to explain it more clearly: col2 represents a hierarchical structure like this (here for group C):

  • C1
    • C2 bar
      • C3 bar
        • C4 bar
      • C3 bar
    • C2
      • C3
        • C4

Along the hierarchy the col3 values are introduced (red) and they then count for all subordinate elements (blue).

 

Edit: Adding to that: As soon as a col3 value is introduced, subordinate elements cannot introduce another col3 value.

PhilC
Rhodochrosite | Level 12

Ok, how does the second C3 have "bar"(?), because the first C2 does, ok 

andreas_lds
Jade | Level 19

@Felmond wrote:

Thank you very much for the suggestion! I definitly have to read up on hash objects.

One problem:

....

Observations 14 und 15 meet the criteria so they inherit the value. But hierarchically they belong to observation 13 which doesn't have an associated value. Is there a way to consider this?


Most likely not by using a hash object the way i used it. I am sure that i didn't understood the role of col3 properly, so i think a lengthy data step will be necessary to create the expected result. Maybe renaming the variables to reflect their role and usage could be useful. And it seems as if i need further text explaining the values of col3.

PhilC
Rhodochrosite | Level 12

I wish base SAS had an linked list object.  I've needed something like this before.  And I ache for the days I could find a C library or even make a linked list object to code with.  A hash object could be used but it is not neccessary, and it doesn't help code readability, IMO.  Even though I've done this before with my own data, doing it with this data I feel like I came up with a completely different solution than I remembered before.  This code assumes a well formed input dataset, as I understand the problem. 

 

data have;
length n 8 col1 $ 1 col2 8 col3 $ 3 ;
infile datalines delimiter=',' truncover;
input col1 col2 col3  ;
n=_N_;
datalines;
A,1,
A,2,foo
A,3,
B,1,bar
B,2,
B,1,
B,2,foo
C,1,
C,2,bar
C,3,
C,4,
C,3, 
C,2, 
C,3,
C,4,
D,1,
D,2,
D,3,
;
run;


data want;
  do until (last.col1);
    set have;
      by col1;
    length _col2_x 8 col4 $3;
      retain _col2_x col4;
      /*_col2_x is the value col2 equaled when col3 was not missing; IOW the 
           col3 property set to be applied to the children of that parent.    */
      drop _:;
    if not missing(_col2_x) then do;
        if _col2_x>=col2 
          then call missing(_col2_x,col4);
      end;
    if not missing(col3) then do;
        _col2_x=col2;
        col4=col3;
      end;
    output;
  end;
  call missing(_col2_x, col4);
run;

 

Concepts for anyone looking up these methods, search for DOW loops, aka. Whitlock loops.

Felmond
Fluorite | Level 6

Wow! Thanks a lot! Your code worked like a charm and did exactly what I was looking for - at least after checking a few hundred observations 😉

 

I've found a nice article on DoW loops (https://sasnrd.com/sas-dow-loop-example/) and am now trying to understand how it works.

 

Thanks again!

Regards

ballardw
Super User

I'm not sure what the rules involved in assigning col4.

Too many cases an example does not provide the rules.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1172 views
  • 5 likes
  • 4 in conversation