02-17-2014 04:10 AM
I have a library with data from four years (2010-2014); each dataset includes two variables: id and source.
I now want to append these years and, in the append step (or whichever step is best for this), create a variable that records which year each row comes from. Like this:
id source year (new)
1 a 2010
2 a 2010
1 b 2011
2 b 2011
and so on...
I'd rather not create the variable in each dataset, because they are huge.
Is it possible to create this variable in the append step?
02-17-2014 08:05 AM
See the code sample below. It uses two features available since SAS 9.2: data set lists and the INDSNAME= option on the SET statement.
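A minimal sketch of that technique, assuming the yearly datasets live in WORK and are named data_2010, data_2011, and so on (the names, the test rows, and the variable dsname are all hypothetical):

```sas
/* Hypothetical test data: two small yearly datasets in WORK */
data data_2010;
  input id source $;
  datalines;
1 a
2 a
;
run;

data data_2011;
  input id source $;
  datalines;
1 b
2 b
;
run;

/* Append using a numbered-range data set list. dsname receives the
   fully qualified name of the dataset each row was read from,
   e.g. WORK.DATA_2010. */
data want;
  length dsname $41;
  set data_2010-data_2011 indsname=dsname;
  /* The year is the last underscore-delimited piece of the name */
  year = input(scan(dsname, -1, '_'), 4.);
run;

proc print data=want noobs;
run;
```

With the real data, only the range in the SET statement should need to change (for example data_2010-data_2013). The year is derived while the rows are read, so the source datasets are not modified.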
02-17-2014 09:51 AM
The INDSNAME= option on the SET statement is what you want.
You might also want to use a BY statement with your SET so that the final dataset is still sorted by ID. This requires each input dataset to already be sorted by the BY variables. If you also list the datasets in the SET statement in year order, the final dataset will be sorted by YEAR within each BY group.
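data want ;
  /* libref (8) + '.' + member name (32) = 41 characters */
  length indsname $41 ;
  set data_2010-data_2013 indsname=indsname ;
  /* interleave the inputs; each must already be sorted by id source */
  by id source ;
  /* the year is the last underscore-delimited piece of the dataset name */
  year = input(scan(indsname,-1,'_'),4.);
run;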
02-17-2014 11:23 AM
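How about this one?
proc sql;
  create table all as
  select id, source, 2010 as source_file from data_2010
  union all
  select id, source, 2011 as source_file from data_2011
  union all
  select id, source, 2012 as source_file from data_2012
  union all
  select id, source, 2013 as source_file from data_2013;
quit;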