BookmarkSubscribeRSS Feed
Walternate
Obsidian | Level 7

Hi,

 

I have data at the person-month level, so each row=1 person in one month. It has ID, month/year, and two categorical variables. What I want is to collapse to the person-year level as follows:

  1. I want the value of categ_var1 as of the earliest month that each person has a value (not missing) for categ_var1. 

  2. If the person ever has categ_var2='abc', I want an indicator saying that they had categ_var2='abc' in that year.

 

The data are like this:

ID   mth_year  categ_var1   categ_var2

1     201301            ' '                abc

1     201302            ' '                ghi

1     201303           def              ghi

2     201301           xyz              jkl    

2     201302           def              jkl

 

So the yearly dataset would be like this:

ID   first_categ_var1  categ_var2_abc_ind

1           def                              1

2           xyz                              0

 

Any help is much appreciated.

 

 

 

3 REPLIES 3
jklaverstijn
Rhodochrosite | Level 12

Hi @Walternate,

 

I tried SQL but reconsidered that old faithfull the datastep is well suited for this. Have a go at this one:

 

data have;
input id mnth_year @10 categ_var1 $3. @14 categ_var2 $3.;
cards;
1 201301     abc
1 201302     ghi
1 201303 def ghi
2 201301 xyz jkl
2 201302 def jkl
;

proc sort data=have;
   by id mnth_year;
run;

data want;
   length first_categ_var1 $3 categ_var2_abc_ind $1;
   keep id first_categ_var1 categ_var2_abc_ind;
   set have;
   by id mnth_year;
   retain first_categ_var1 categ_var2_abc_ind;
   if first.id then do;
      first_categ_var1='';
      categ_var2_abc_ind = '0';
   end;
   if first_categ_var1='' and categ_var1 ne '' then first_categ_var1=categ_var1;
   if categ_var2 = 'abc' then categ_var2_abc_ind = '1';
   if last.id and first_categ_var1 ne '' then output;
   put _all_;
run;

Hope this helps,

- Jan.

sthota
Calcite | Level 5

data a;

input id mth_year categ_var1 $ categ_var2 $;

cards;

1 201301 . abc

1 201302 . ghi

1 201303 def ghi

2 201301 xyz jkl

2 201302 def jkl

;

run;

proc sql;

create table b as select min(mth_year) as yr, * from a where categ_var1 is not null group by id having calculated yr = mth_year;

quit;

proc sql;

create table c as select distinct id, categ_var2 from a where categ_var2 = 'abc';

quit;

proc sql;

create table d as select b.id, b.categ_var1, case when c.id is null then 0 else 1 end as categ_var2_id from b left join c on b.id = c.id ;

quit;

Ksharp
Super User

data have;
input id mnth_year @10 categ_var1 $3. @14 categ_var2 $3.;
cards;
1 201301     abc
1 201302     ghi
1 201303 def ghi
2 201301 xyz jkl
2 201302 def jkl
;
run;
data want;
categ_var2_abc_ind=0;
do until(last.id);
 set have;
 by id;
 if not missing(categ_var1) and not found then do;
  first_categ_var1=categ_var1; 
  found=1;
 end;
 if categ_var2='abc' then categ_var2_abc_ind=1;
end;

drop categ_var1 categ_var2 found;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 824 views
  • 0 likes
  • 4 in conversation