Hi, I'm transposing the data with Proc summary and the code is as below -
data have;
input PT DT RULE_ID RULE_CAT;
cards;
1 10 20 20
1 10 21 20
1 10 22 20
1 10 23 20
1 11 20 20
1 11 21 20
1 11 22 20
1 11 23 20
1 12 20 20
1 12 21 20
1 12 22 20
1 12 23 20
2 10 20 20
2 10 21 20
2 10 22 20
2 10 23 20
2 11 20 20
2 11 21 20
2 11 22 20
2 11 23 20
2 12 20 20
2 12 21 20
2 12 22 20
2 12 23 20
;;;;
run;
proc sql noprint feedback;
select max(ptcount) into :dim
from (select count(PT) as ptcount from have group by PT, DT);
quit;
run;
OPTIONS SYMBOLGEN;
proc summary data=have nway;
class PT DT;
output out=need idgroup(out[&dim](RULE_ID: RULE_CAT: )=);
run;
With the above code my final dataset need looks like below.
So the question now is - columns need to be rearranged like, RULE_ID_1, RULE_CAT_1, RULE_ID_2, RULE_CAT_2,RULE_ID_3, RULE_CAT_3,RULE_ID_4, RULE_CAT_4.
Is this possible? Because the number of columns getting created is random and based on the value of the macro variable dim getting created from Proc Sql.
Please share your views and suggestions.
Thanks TomKari.
Your solution works absolutely well with the small datasets like the one provided.
But creating the lnter2 dataset with multi millions of records will take huge CPU time and memory where a column holds all the variable names needs to be transposed.
Can you please suggest - If I need to apply the same logic on huge datasets, do I need to tweak the logic or do we have any new logic to be built?
You may be interested in this macro from this paper that has a few methods built in to do what you're requiring.
http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
I think this is pretty close...
proc sort data=Have out=Inter1;
by PT DT RULE_ID;
run;
data Inter2;
Length CN $20;
retain Count;
keep PT DT CN DV;
set Inter1;
by PT DT;
if first.DT
then do;
Count = 0;
end;
Count = Count + 1;
CN = "RULE_ID_"||compress(put(Count, best7.));
DV = RULE_ID;
output;
CN = "RULE_CAT_"||compress(put(Count, best7.));
DV = RULE_CAT;
output;
run;
PROC TRANSPOSE DATA=Inter2 OUT=Want;
BY PT DT;
ID CN;
VAR DV;
RUN;
Thanks TomKari.
Your solution works absolutely well with the small datasets like the one provided.
But creating the lnter2 dataset with multi millions of records will take huge CPU time and memory where a column holds all the variable names needs to be transposed.
Can you please suggest - If I need to apply the same logic on huge datasets, do I need to tweak the logic or do we have any new logic to be built?
I've found that trying to predict performance is a mugs game.
I'd suggest giving it a try. I usually start with say 100,000 records, if I'm happy up it to a million, and just keep going up by factors of 10.
Tom
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.