Re: Filling variable with names of other variables

Walternate · Posted 10-01-2015 05:38 PM

Hi,

I have a dataset at the person level, with an ID variable and several month/year variables.

ID 200101 200102 200103............through 201505

1 1 . 1

2 . . 1

3 . 1 1

Values of the month/year variables can either be 1 or missing.

What I need is to create a new variable that tells me the date at which someone had their first value of 1. So for ID 1, that would be 200101, for 2, it would be 200103, etc.

I'm hoping for a solution that doesn't involve typing out everything by hand as I have over 100 month/year variables.

Thanks!

MikeZdeb · Posted 10-01-2015 06:19 PM

Hi, made some assumptions as to how your data looks. Solution also creates the variable as a real date. If you don't care about that and just want the variable name, use the second data step.

data x;
input id d200101-d200103;
datalines;
1 1 . 1
2 . . 1
3 . 1 1
;

data y;
format first1 yymon7.;
set x;
array d(*) d: ;
first1 = input(substr(vname(d(whichn( 1,of d: ))),2,6),yymmn6.);
run;

Obs date id d200101 d200102 d200103

1     2001JAN     1       1          .          1
2     2001MAR     2       .          .          1
3     2001FEB     3       .          1          1

Alternative, FIRST1 as a character variable (variable name with the first occurrence of a 1) ...

data y;

set x;
array d(*) d: ;
first1 = vname(d(whichn( 1,of d: )));
run;

stat_sas · Posted 10-01-2015 10:51 PM

data have;
input id d200101-d200103;
datalines;
1 1 . 1
2 . . 1
3 . 1 1
;

data want;
set have;
array a{*} d200101-d200103;
do _n_=1 to dim(a);
if a{_n_}=1 then do;
date_first=vname(a{_n_});
leave;
end;
end;
run;

proc print;

run;

RW9 · Posted 10-02-2015 04:17 AM

As I always comment on these types of things, it is not a good idea to have data in your header row. The name of a variable is an identifier and not data. At worst, I would recommend changing the columns headers to have the same prefex and a suffix incrementor:

VAR1, VAR2

This way you can use VAR:, for example. Far simpler coding. However as you have data in your column names then I would suggest you normlise your data&colon;

ID YEAR RESULT

1 2012 0

1 2013 1

...

This will make your life far easier, for instnace your question on how to find the first 1 record:

proc sql;

select ID,min(YEAR) as MIN_YEAR

from HAVE

group by ID

having RESULT=1;

quit;

You see the data is there to be processed on.

Steelers_In_DC · Posted 10-02-2015 10:19 AM

Here's another solutions:

data have;
input ID '200101'n '200102'n '200103'n;
cards;
1 1 . 1
2 . . 1
3 . 1 1
;

proc transpose data=have out=tran(rename=_NAME_ = DATE);by id;

proc sort data=tran;by id descending col1 date;

data prep;
set tran;
by id descending col1 date;
if first.id then first = date;
run;

proc sql;
create table want as
select distinct a.*, b.First
from have a left join
prep b on
a.id = b.id
where not missing(b.first)
order by id;

Filling variable with names of other variables