BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Good Evening All,

 

I have a hierarchy which has holes in it as per the HAVE dataset code below.  I am looking to first maintain the right most populated value whilst filling all blanks to the left (the code attached with the Array does this nicely).  Then I need to find the right most populated record and hold it's value until ORG_LVL_10.  I am struggling with the finding a solution that performs the latter task.  Any help would be greatly appreciated.

 

Thank you for all your help.

 

Regards,

Scott 

 

 

data have;
length org_lvl_1 org_lvl_2 org_lvl_3 org_lvl_4 org_lvl_5 org_lvl_6 org_lvl_7 org_lvl_8 org_lvl_9 org_lvl_10 $20.;
infile datalines dlm= ',' dsd missover;
input org_lvl_1 org_lvl_2 org_lvl_3 org_lvl_4 org_lvl_5 org_lvl_6 org_lvl_7 org_lvl_8 org_lvl_9 org_lvl_10;

datalines;
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL5,LEVEL6,LEVEL7,LEVEL8,LEVEL9,LEVEL10
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL5,
LEVEL1,LEVEL2,LEVEL3,LEVEL4,,LEVEL6,LEVEL7,LEVEL8,LEVEL9,LEVEL10
LEVEL1,LEVEL2,LEVEL3,LEVEL4,,LEVEL6,LEVEL7,,,LEVEL10
;
run;


data want;
length org_lvl_1 org_lvl_2 org_lvl_3 org_lvl_4 org_lvl_5 org_lvl_6 org_lvl_7 org_lvl_8 org_lvl_9 org_lvl_10 $20.;
infile datalines dlm= ',' dsd missover;
input org_lvl_1 org_lvl_2 org_lvl_3 org_lvl_4 org_lvl_5 org_lvl_6 org_lvl_7 org_lvl_8 org_lvl_9 org_lvl_10;

datalines;
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL5,LEVEL6,LEVEL7,LEVEL8,LEVEL9,LEVEL10
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL5,LEVEL5,LEVEL5,LEVEL5,LEVEL5,LEVEL5,
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL6,LEVEL6,LEVEL7,LEVEL8,LEVEL9,LEVEL10
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL6,LEVEL6,LEVEL7,LEVEL10,LEVEL10,LEVEL10
;
run;


DATA test; SET have; ARRAY _ORG $ ORG_LVL_1 - ORG_LVL_10; J=1; DO I=1 TO DIM(_ORG); PUT I = J = _ORG(I) = _ORG(J) =; IF TRIM(LEFT(_ORG(I))) NE "" THEN DO; _ORG(J) = COALESCEC(TRIM(LEFT(_ORG(J))),TRIM(LEFT(_ORG(I)))); J+1; END; END; RUN;

 

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

Here is a relatively simple program which does the trick:

data want;

  set have;

  array org_lvl {*} org_lvl_:;

  /* if last element is missing, set it to last non-missing */

  if missing(org_lvl(dim(org_lvl))) then do;

    do _N_=dim(org_lvl)-1 to 1 by -1 while(missing(org_lvl(_N_)));

      end;

    org_lvl(dim(org_lvl))=org_lvl(_N_);

    end;

  /* fill left when missing */

  do _N_=dim(org_lvl)-1 to 1 by -1;

    if missing(org_lvl(_N_)) then

      org_lvl(_N_)=org_lvl(_N_+1);

    end;

run;

 

Not only is it quite easy to understand, it also performs well (with 400,000 obs, my version took about 0.26 CPU seconds, while your original version took 0.78 seconds - the solution suggested by @Patrick was in between).

View solution in original post

5 REPLIES 5
Patrick
Opal | Level 21

@Scott_Mitchell

Below should do the job:

data have;
 length org_lvl_1 org_lvl_2 org_lvl_3 org_lvl_4 org_lvl_5 org_lvl_6 org_lvl_7 org_lvl_8 org_lvl_9 org_lvl_10 $20.;
 infile datalines dlm= ',' dsd missover;
 input org_lvl_1 org_lvl_2 org_lvl_3 org_lvl_4 org_lvl_5 org_lvl_6 org_lvl_7 org_lvl_8 org_lvl_9 org_lvl_10;

 datalines;
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL5,LEVEL6,LEVEL7,LEVEL8,LEVEL9,LEVEL10
LEVEL1,LEVEL2,LEVEL3,LEVEL4,LEVEL5,
LEVEL1,LEVEL2,LEVEL3,LEVEL4,,LEVEL6,LEVEL7,LEVEL8,LEVEL9,LEVEL10
LEVEL1,LEVEL2,LEVEL3,LEVEL4,,LEVEL6,LEVEL7,,,LEVEL10
;
run;

data want;
  set have;
  array org_lvl {*} org_lvl_:;
  do _i=1 to dim(org_lvl);
    if missing(org_lvl[_i]) then
      do _j=_i+1 to dim(org_lvl) until(not missing(org_lvl[_i]));
        org_lvl[_i]=org_lvl[_j];
      end;
      if missing(org_lvl[_i]) then org_lvl[_i]=org_lvl[max(_i-1,1)];
  end;
run;
s_lassen
Meteorite | Level 14

Here is a relatively simple program which does the trick:

data want;

  set have;

  array org_lvl {*} org_lvl_:;

  /* if last element is missing, set it to last non-missing */

  if missing(org_lvl(dim(org_lvl))) then do;

    do _N_=dim(org_lvl)-1 to 1 by -1 while(missing(org_lvl(_N_)));

      end;

    org_lvl(dim(org_lvl))=org_lvl(_N_);

    end;

  /* fill left when missing */

  do _N_=dim(org_lvl)-1 to 1 by -1;

    if missing(org_lvl(_N_)) then

      org_lvl(_N_)=org_lvl(_N_+1);

    end;

run;

 

Not only is it quite easy to understand, it also performs well (with 400,000 obs, my version took about 0.26 CPU seconds, while your original version took 0.78 seconds - the solution suggested by @Patrick was in between).

Scott_Mitchell
Quartz | Level 8

Thank you both for your help.

 

This worked perfectly.

s_lassen
Meteorite | Level 14

I have been thinking a bit - if all values are missing, you will get an index-out-of-range error, as _N_=0 after the first loop.

So you may want to put in a safeguard:

data want;

  set have;

  array org_lvl {*} org_lvl_:;

  /* if last element is missing, set it to last non-missing */

  if missing(org_lvl(dim(org_lvl))) then do;

    do _N_=dim(org_lvl)-1 to 1 by -1 while(missing(org_lvl(_N_)));

      end;

    if _N_=0 then /* in case all values are missing */

      return;

    org_lvl(dim(org_lvl))=org_lvl(_N_);

    end;

  /* fill left when missing */

  do _N_=dim(org_lvl)-1 to 1 by -1;

    if missing(org_lvl(_N_)) then

      org_lvl(_N_)=org_lvl(_N_+1);

    end;

run;

I did it with a RETURN statement, as the second loop will not do anything useful in that case.

Scott_Mitchell
Quartz | Level 8

Thanks again @s_lassen.

 

I appreciate the rigor and additional consideration you have given my problem.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1860 views
  • 1 like
  • 3 in conversation