Hi there,
I have two data sets: have and have1 (have1 is relatively unimportant so I didn't list it below. First I want to create a series of new variables (var_1 - var_42, however here I only listed var1-var4) to Have and to define var_i = date if count = i (please see the ideal output Want).
The code (listed right below) that I ran gave me a result but it's not the one that I want (listed at the bottom). Also is there an easier way than using %let to define a series of variables in a similar naming convention?
data want; set have;
array var (42) var_1 - var_42;
do i= 1 to 42;
if count = i then var(i) = date;
end;
format var_1 - var date9.;
drop i;
run;
%let var_i = var_1,var_2,var_3,var_4,var_5,var_6,var_7,var_8,var_9,var_10,var_11,var_12,var_13,var_14,var_15,var_16,var_17,var_18,var_19,
var_20,var_21,var_22,var_23,var_24,var_25,var_26,var_27,var_28,var_29,var_30,var_31,var_32,var_33,var_34,var_35,var_36,var_37,var_38,var_39,
var_40,var_41,var_42;
proc sql; create table want_1 as select a.*, b.&var_i from have1 a left join want b on a.id = b.id and a.year = b.year; quit;
Have
ID | Year | Date | Count |
1 | 2006 | 5/31/2006 | 1 |
1 | 2007 | 6/7/2007 | 1 |
1 | 2008 | 7/8/2008 | 1 |
1 | 2009 | 7/13/2009 | 1 |
1 | 2009 | 7/8/2008 | 2 |
2 | 2009 | 12/23/2008 | 1 |
2 | 2009 | 1/30/2009 | 2 |
2 | 2009 | 9/30/2009 | 3 |
2 | 2009 | 12/15/2008 | 4 |
Want
ID | Year | Date | Count | var_1 | var_2 | var_3 | var_4 |
1 | 2006 | 5/31/2006 | 1 | 5/31/2006 | |||
1 | 2007 | 6/7/2007 | 1 | 6/7/2007 | |||
1 | 2008 | 7/8/2008 | 1 | 7/8/2008 | |||
1 | 2009 | 7/13/2009 | 1 | 7/13/2009 | 7/8/2008 | ||
2 | 2009 | 12/23/2008 | 1 | 12/23/2008 | 1/30/2009 | 9/30/2009 | 12/15/2008 |
ID | Year | Date | Count | var_1 | var_2 | var_3 | var_4 |
1 | 2006 | 5/31/2006 | 1 | 5/31/2006 | |||
1 | 2007 | 6/7/2007 | 1 | 6/7/2007 | |||
1 | 2008 | 7/8/2008 | 1 | 7/8/2008 | |||
1 | 2009 | 7/13/2009 | 1 | 7/13/2009 | |||
1 | 2009 | 7/8/2008 | 2 | 7/8/2008 | |||
2 | 2009 | 12/23/2008 | 1 | 12/23/2008 | |||
2 | 2009 | 1/30/2009 | 2 | 1/30/2009 | |||
2 | 2009 | 9/30/2009 | 3 | 9/30/2009 | |||
2 | 2009 | 12/15/2008 | 4 | 12/15/2008 |
Here is some code:
data have;
input ID Year Date :mmddyy.;
format date yymmdd10.;
datalines;
1 2006 5/31/2006
1 2007 6/7/2007
1 2008 7/8/2008
1 2009 7/13/2009
1 2009 7/8/2008
2 2009 12/23/2008
2 2009 1/30/2009
2 2009 9/30/2009
2 2009 12/15/2008
;
data want;
array v var_1-var_4;
format var_: yymmdd10.;
do i = 1 by 1 until (last.year);
set have; by id year;
if i <= dim(v) then v{i} = date;
end;
drop i date;
run;
proc print data=want noobs; var id year var_:; run;
ID Year var_1 var_2 var_3 var_4 1 2006 2006-05-31 . . . 1 2007 2007-06-07 . . . 1 2008 2008-07-08 . . . 1 2009 2009-07-13 2008-07-08 . . 2 2009 2008-12-23 2009-01-30 2009-09-30 2008-12-15
As for the SQL part of your question... I don't understand what it's about.
Here is some code:
data have;
input ID Year Date :mmddyy.;
format date yymmdd10.;
datalines;
1 2006 5/31/2006
1 2007 6/7/2007
1 2008 7/8/2008
1 2009 7/13/2009
1 2009 7/8/2008
2 2009 12/23/2008
2 2009 1/30/2009
2 2009 9/30/2009
2 2009 12/15/2008
;
data want;
array v var_1-var_4;
format var_: yymmdd10.;
do i = 1 by 1 until (last.year);
set have; by id year;
if i <= dim(v) then v{i} = date;
end;
drop i date;
run;
proc print data=want noobs; var id year var_:; run;
ID Year var_1 var_2 var_3 var_4 1 2006 2006-05-31 . . . 1 2007 2007-06-07 . . . 1 2008 2008-07-08 . . . 1 2009 2009-07-13 2008-07-08 . . 2 2009 2008-12-23 2009-01-30 2009-09-30 2008-12-15
As for the SQL part of your question... I don't understand what it's about.
Thank you so much. The code works well but the weird thing is why those var_ are added to the front of the table (not stack to the last column of the table).
I have an additional question: the reason why I set dim(var) to be 4 is I know the max(count) = 4. What if I don't know the max(count), is there a better way to do this?
Regarding SQL, I was trying to add those new var_ to another table. But you've seen when I have many var_, my code will be very tedious. So I'm wondering if there's a more efficient way to do it.
Just as an aside: this code
do i= 1 to 42;
if count = i then var(i) = date;
end;
can be made much more efficient:
if 1 <= count <= 42 then var{count} = date;
Normally, I'd to this using the DATA step and arrays (just as @PGStats has done) or the hash object, as it is simpler for me to construct such a program than to recall how to code proc TRANSPOSE to attain the same goal. In this case, though, the latter is so straightforward that it quickly succumbs to trial and error:
data have ;
input id year date :mmddyy. ;
format date yymmdd10. ;
cards ;
1 2006 05/31/2006
1 2007 06/07/2007
1 2008 07/08/2008
1 2009 07/13/2009
1 2009 07/08/2008
2 2009 12/23/2008
2 2009 01/30/2009
2 2009 09/30/2009
2 2009 12/15/2008
;
run ;
proc transpose
data = have
out = want (drop = _:)
prefix = var_
;
by id year ;
var date ;
run ;
Note that I don't understand the logical coherence between the values of the variable COUNT in the input and output files; you'll have to work it out yourself if it's really needed.
Kind regards
Paul D.
Hi Paul,
Thank you very much. PROC TRANSPOSE works too.
So count is a variable that I constructed, indicating how many events happened within the same year. So logically, COUNT should equal to the i of var_i. Essentially, i indicates the ith event within the year. Does this make sense to you?
Thanks,
Max
@maxjiang6999 : It does now, thanks.
For your want dataset, a proc transpose and merge should do;
proc transpose data=have out=trans prefix=var_;
var date;
by id year;
id count;
data want;
merge
have
trans
;
by id year;
if first.year;
run;
Hi there, thanks! Merge step does save me lots of time!
Regarding your SQL I think I understand what you need
PROC SQL;
CREATE TABLE want_1 (drop=year1) AS SELECT
a.*, b.*
FROM have_1 a
LEFT JOIN want (keep=year1 var_: rename=(year=year1)) b
ON a.year=b.year1;
QUIT;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.