BookmarkSubscribeRSS Feed
Walternate
Obsidian | Level 7

Hi all,

 

I have a dataset at the person/month/categorical group variable level, like this:

 

ID      First name        Last name       Month       Categorical var

1        abc                    def                  200901              a

1        abc                    def                  200902              a

2        abc                    def                  200903              a

3        abc                    def                  200901              b

4        abc                    def                  200902              b

 

I'm trying to clean the ID variable by setting all values of ID within levels of the categorical variable, as the value of ID as of the most recent month. So basically, as long as a person has the same name, I assume they're the same person and overwrite older values of ID (within categorical var=a) with the newest value, like this:

 

ID      First name        Last name       Month       Categorical var

2        abc                    def                  200901              a

2        abc                    def                  200902              a

2        abc                    def                  200903              a

4        abc                    def                  200901              b

4        abc                    def                  200902              b

 

So basically, I want to retain all the same data, but with ID now consistent within each level of categorical var for each person. 

 

Any help is much appreciated. 

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, few ways to do it, you could sort and retain, or do it in SQL like:

data have;
  input id first_name $ last_name $ month categorical_var $;
datalines;
1 abc def 200901 a
1 abc def 200902 a
2 abc def 200903 a
3 abc def 200901 b
4 abc def 200902 b
;
run;

proc sql;
  create table WANT as
  select  B.ID,
          A.FIRST_NAME,
          A.LAST_NAME,
          A.MONTH,
          A.CATEGORICAL_VAR 
  from    HAVE A
  left join (select CATEGORICAL_VAR,max(ID) as ID from HAVE group by CATEGORICAL_VAR) B
  on      A.CATEGORICAL_VAR=B.CATEGORICAL_VAR;
quit;
Walternate
Obsidian | Level 7

I realize that in the example the IDs got higher the later the month was, but that is not the case in my source data, so I don't think selecting max ID would get me what I want. 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then change the second group per:

  left join (select distinct CATEGORICAL_VAR,ID as ID from HAVE group by CATEGORICAL_VAR having max(MONTH)) B

However that will cause duplicates if there are more than one cat/id in the last month.  If you can post a datastep with some test data, exactly matching your requirements the code can be adapted, can only go on what I see. 

Ksharp
Super User

Assuming the data has been sorted by month like your sample data did.

 

data have;
  input id first_name $ last_name $ month categorical_var $;
datalines;
1 abc def 200901 a
1 abc def 200902 a
2 abc def 200903 a
3 abc def 200901 b
4 abc def 200902 b
;
run;
data want;
do until(last.categorical_var);
 set have;
 by first_name  last_name  categorical_var ;
end;
_id=id;
do until(last.categorical_var);
 set have;
 by first_name  last_name  categorical_var ;
 id=_id;output;
end;
drop _id;
run;
 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 915 views
  • 0 likes
  • 3 in conversation