BookmarkSubscribeRSS Feed
Walternate
Obsidian | Level 7

Hi,

 

I have a dataset at the person level, with an ID variable and several month/year variables. 

 

ID    200101 200102  200103............through 201505

1          1            .            1

2           .            .            1

3           .            1           1

 

Values of the month/year variables can either be 1 or missing. 

 

What I need is to create a new variable that tells me the date at which someone had their first value of 1. So for ID 1, that would be 200101, for 2, it would be 200103, etc. 

 

I'm hoping for a solution that doesn't involve typing out everything by hand as I have over 100 month/year variables.

 

Thanks!

4 REPLIES 4
MikeZdeb
Rhodochrosite | Level 12

Hi, made some assumptions as to how your data looks.  Solution also creates the variable as a real date.  If you don't care about that and just want the variable name, use the second data step.

 

data x;
input id d200101-d200103;
datalines;
1 1 . 1
2 . . 1
3 . 1 1
;

 

data y;
format first1 yymon7.;
set x;
array d(*) d: ;
first1 = input(substr(vname(d(whichn( 1,of d: ))),2,6),yymmn6.);
run;

 

Obs       date    id    d200101    d200102    d200103

 1     2001JAN     1       1          .          1
 2     2001MAR     2       .          .          1
 3     2001FEB     3       .          1          1

 

Alternative, FIRST1 as a character variable (variable name with the first occurrence of a 1) ...

 

data y;

set x;
array d(*) d: ;
first1 = vname(d(whichn( 1,of d: )));
run;

stat_sas
Ammonite | Level 13

data have;
input id d200101-d200103;
datalines;
1 1 . 1
2 . . 1
3 . 1 1
;

data want;
set have;
array a{*} d200101-d200103;
do _n_=1 to dim(a);
if a{_n_}=1 then do;
date_first=vname(a{_n_});
leave;
end;
end;
run;

 

proc print;

run;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

As I always comment on these types of things, it is not a good idea to have data in your header row.  The name of a variable is an identifier and not data.  At worst, I would recommend changing the columns headers to have the same prefex and a suffix incrementor:

VAR1, VAR2

This way you can use VAR:, for example.  Far simpler coding.  However as you have data in your column names then I would suggest you normlise your data:

ID      YEAR      RESULT

1        2012       0

1        2013       1

...

 

This will make your life far easier, for instnace your question on how to find the first 1 record:

proc sql;

  select ID,min(YEAR) as MIN_YEAR

  from HAVE 

  group by ID

  having RESULT=1;

quit;

 

You see the data is there to be processed on.

Steelers_In_DC
Barite | Level 11

Here's another solutions:

 

data have;
input ID    '200101'n '200102'n  '200103'n;
cards;
1 1 . 1
2 . . 1
3 . 1 1
;

proc transpose data=have out=tran(rename=_NAME_ = DATE);by id;

proc sort data=tran;by id descending col1 date;

data prep;
set tran;
by id descending col1 date;
if first.id then first = date;
run;

proc sql;
create table want as
select distinct a.*, b.First
from have a left join
     prep b on
a.id = b.id
where not missing(b.first)
order by id;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 617 views
  • 1 like
  • 5 in conversation