BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
awardell
Obsidian | Level 7

Hello,
I have a data set that looks like the following: 

ID DATE VAR1 VAR2 VAR3
1 8-22-2022 a    
1 8-22-2022   b  
1 7-23-2021     c
2 11-12-2019 a    
2 10-14-2019     c

 


data have;
input ID DATE :mmddyy10. VAR1 $ VAR2 $ VAR3 $ ;
format DATE mmddyy10.;
datalines;
1 08-22-2022 a . .
1 08-22-2022 . b .
1 07-23-2021 . . c
2 11-12-2019 a . .
2 10-14-2019 . . c
;
run;


I want something that looks like the following:

 

ID DATE VAR1 VAR2 VAR3
1 8-22-2022 a b  
1 8-22-2022 a b  
1 7-23-2021     c
2 11-12-2019 a    
2 10-14-2019     c

 

data want;
input ID DATE :mmddyy10. VAR1 $ VAR2 $ VAR3 $ ;
format DATE mmddyy10.;
datalines;
1 08-22-2022 a b .
1 08-22-2022 a b .
1 07-23-2021 . . c
2 11-12-2019 a . .
2 10-14-2019 . . c
;
run;


Essentially, I need a data set where the values for var1 - var3 are retained if the id AND the date are the same. I tried something like this; however, it did not work (also other similar iterations of the code below (with first.id and first.date).


data test;
set have;
by id date;
retain met_ihc2ln_;
run;


Thank you in advance for any help you can provide!

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

There are probably more elegant ways of doing this but SQL works for the data you've supplied:

proc sql;
  create table want as

  select A.ID
        ,A.DATE
        ,coalesce(A.VAR1, B.Max_VAR1) as VAR1 
        ,coalesce(A.VAR2, B.Max_VAR2) as VAR2
        ,coalesce(A.VAR3, B.Max_VAR3) as VAR3
  from have as A

  left join
  (select ID
          ,DATE
          ,max(VAR1) as Max_VAR1
          ,max(VAR2) as Max_VAR2
          ,max(VAR3) as Max_VAR3
   from have
   group by ID
           ,DATE
  ) as B
  on A.ID = B.ID
  and A.DATE = B.DATE
  order by A.ID
          ,A.DATE 
  ;
quit;

View solution in original post

9 REPLIES 9
ballardw
Super User

Since your date is missing on the first example record how does that meet the requirement "the id AND the date are the same"? So you need to describe what is going on when a date obviously doesn't match that it should be processed this way.

 

Retain won't work directly because it can keep a value from a PREVIOUS record, not look ahead to the NEXT record as you are doing in a couple of places.

awardell
Obsidian | Level 7

I see what you are saying. I had made a mistake with the date being missing in my first chart, no dates should be missing. Do you know of a way that I could look to the NEXT record across date and id? I am still stuck with how to get this. 

SASKiwi
PROC Star

There are probably more elegant ways of doing this but SQL works for the data you've supplied:

proc sql;
  create table want as

  select A.ID
        ,A.DATE
        ,coalesce(A.VAR1, B.Max_VAR1) as VAR1 
        ,coalesce(A.VAR2, B.Max_VAR2) as VAR2
        ,coalesce(A.VAR3, B.Max_VAR3) as VAR3
  from have as A

  left join
  (select ID
          ,DATE
          ,max(VAR1) as Max_VAR1
          ,max(VAR2) as Max_VAR2
          ,max(VAR3) as Max_VAR3
   from have
   group by ID
           ,DATE
  ) as B
  on A.ID = B.ID
  and A.DATE = B.DATE
  order by A.ID
          ,A.DATE 
  ;
quit;
awardell
Obsidian | Level 7

Thank you so much! This worked great! I appreciate all of your help!

andreas_lds
Jade | Level 19

Just another solution:

data want;
   length v1-v3 $ 1;
   
   do _n_ = 1 by 1 until(last.date);
      set have;
      by id date notsorted;
      v1 = coalescec(v1, var1);
      v2 = coalescec(v2, var2);
      v3 = coalescec(v3, var3);
   end;
   
   do _n_ = 1 by 1 until(last.date);
      set have;
      by id date notsorted;
      var1 = coalescec(v1, var1);
      var2 = coalescec(v2, var2);
      var3 = coalescec(v3, var3);
   end;
   
   drop v1-v3;
run;
awardell
Obsidian | Level 7

Thank you so much for your help! I have never seen the coalesce function used like this, and I appreciate it! 

Ksharp
Super User
data have;
input ID DATE :mmddyy10. VAR1 $ VAR2 $ VAR3 $ ;
format DATE mmddyy10.;
datalines;
1 08-22-2022 a . .
1 08-22-2022 . b .
1 07-23-2021 . . c
2 11-12-2019 a . .
2 10-14-2019 . . c
;
run;


proc sql;
create table want(drop=dummy) as
select ID,DATE,var1 as dummy,max(var1) as var1,max(var2) as var2,max(var3) as var3
 from have 
  group by ID,DATE ;
quit;
awardell
Obsidian | Level 7

Thank you so much for this solution! I appreciate your help!

Tom
Super User Tom
Super User

Your example data is not complex enough to actually test all possible scenarios.  In particular, what do you want to do when there is more than one non-missing value for a variable within a group?

 

Here is method using the UPDATE statement.  First collapse to final values per group by using the existing data as transactions against an empty dataset.  Then reapply the transactions to get the detailed observations written out.

data have ;
  input ID DATE :mmddyy. (VAR1-VAR3) (:$8.);
  format date yymmdd10.;
cards;
1 8-22-2022 a . .    
1 8-22-2022 . b . 
1 7-23-2021 . . c
2 11-12-2019 a . .  
2 10-14-2019 . . c
;

data step1;
  update have(obs=0) have ;
  by id descending date;
run;

data want ;
  update step1 have;
  by id descending date;
  output;
run;

Result:

Obs    ID          DATE    VAR1    VAR2    VAR3

 1      1    2022-08-22     a       b
 2      1    2022-08-22     a       b
 3      1    2021-07-23                     c
 4      2    2019-11-12     a
 5      2    2019-10-14                     c

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2898 views
  • 2 likes
  • 6 in conversation