BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Balas
Fluorite | Level 6

Hi, I'm transposing the data with Proc summary and the code is as below -

 

data have;
input PT DT RULE_ID RULE_CAT;
cards;
1 10 20 20
1 10 21 20
1 10 22 20
1 10 23 20
1 11 20 20
1 11 21 20
1 11 22 20
1 11 23 20
1 12 20 20
1 12 21 20
1 12 22 20
1 12 23 20
2 10 20 20
2 10 21 20
2 10 22 20
2 10 23 20
2 11 20 20
2 11 21 20
2 11 22 20
2 11 23 20
2 12 20 20
2 12 21 20
2 12 22 20
2 12 23 20
;;;;
run;

proc sql noprint feedback;
select max(ptcount) into :dim
from (select count(PT) as ptcount from have group by PT, DT);
quit;
run;

OPTIONS SYMBOLGEN;
proc summary data=have nway;
class PT DT;
output out=need idgroup(out[&dim](RULE_ID: RULE_CAT: )=);
run;

 

With the above code my final dataset need looks like below.

 

Capture1.1.JPG

Capture1.2.JPG

 

So the question now is - columns need to be rearranged like, RULE_ID_1, RULE_CAT_1, RULE_ID_2, RULE_CAT_2,RULE_ID_3, RULE_CAT_3,RULE_ID_4, RULE_CAT_4.

 

Is this possible? Because the number of columns getting created is random and based on the value of the macro variable dim getting created from Proc Sql.

 

Please share your views and suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

For a huge dataset, it is indeed preferable to use another strategy. Here is what I would try:

 

data _null_;
retain count;
length varlist $2000;
do i = 1 by 1 until(last.dt);
    set have end=done; by pt dt notsorted;
    end;
count = max(count, i);
if done then do;
    do i = 1 to count;
        varList = catx(" ", varList, cats("RULE_ID_", i), cats("RULE_CAT_", i));
        end;
    call symputx("varList", varlist);
    end;
run;

%put &varList;

data want;
do i = 1 by 2 until(last.dt);
    set have end=done; by pt dt notsorted;
    array v &varList;
    v{i} = RULE_ID; v{i+1} = RULE_CAT;
    end;
drop i RULE_ID RULE_CAT;
run;
PG

View solution in original post

9 REPLIES 9
PGStats
Opal | Level 21

Do it with a datastep. No macro variable needed.

 

proc sort data=have; by pt dt; run;

data long;
set have; by pt dt;
if first.dt then i = 0;
i + 1;
var = cats("RULE_ID_", i);
value = rule_id;
output;
var = cats("RULE_CAT_", i);
value = rule_cat;
output;
keep pt dt var value;
run;

proc transpose data=long out=want(drop=_name_);
by pt dt;
var value;
id var;
run;

proc print data=want noobs; run;
PG
Balas
Fluorite | Level 6

Thanks PG Stats.

 

Your solution works absolutely well with the small datasets like the one provided.

 

But creating the long dataset with multi millions of records will take huge CPU time and memory where a column holds all the variable names needs to be transposed.

 

Can you please suggest - If I need to apply the same logic on huge datasets, do I need to tweak the logic or do we have any new logic to be built?

 

 

PGStats
Opal | Level 21

For a huge dataset, it is indeed preferable to use another strategy. Here is what I would try:

 

data _null_;
retain count;
length varlist $2000;
do i = 1 by 1 until(last.dt);
    set have end=done; by pt dt notsorted;
    end;
count = max(count, i);
if done then do;
    do i = 1 to count;
        varList = catx(" ", varList, cats("RULE_ID_", i), cats("RULE_CAT_", i));
        end;
    call symputx("varList", varlist);
    end;
run;

%put &varList;

data want;
do i = 1 by 2 until(last.dt);
    set have end=done; by pt dt notsorted;
    array v &varList;
    v{i} = RULE_ID; v{i+1} = RULE_CAT;
    end;
drop i RULE_ID RULE_CAT;
run;
PG
Balas
Fluorite | Level 6

Perfect. Thanks PG Stats.

Ksharp
Super User

PG,

1) What if one of them is CHARACTER variable?

2) Once you get variable list &listVar, why not use RETAIN statment on the OP's original data ?

PGStats
Opal | Level 21

The code is meant to be fast for the problem at hand. It reads the data only twice and it doesn't create any intermediate dataset. Writing the first step in SQL might be a better strategy if it could be run on a server.

PG
Ksharp
Super User

Now it is MERGE time. Check MERGE skill me,Matt, Arthur.T proposed, if your table is big ,check CALL EXECUTE().

 

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

 

 

data have;
input PT DT RULE_ID RULE_CAT;
cards;
1 10 20 20
1 10 21 20
1 10 22 20
1 10 23 20
1 11 20 20
1 11 21 20
1 11 22 20
1 11 23 20
1 12 20 20
1 12 21 20
1 12 22 20
1 12 23 20
2 10 20 20
2 10 21 20
2 10 22 20
2 10 23 20
2 11 20 20
2 11 21 20
2 11 22 20
2 11 23 20
2 12 20 20
2 12 21 20
2 12 22 20
2 12 23 20
;;;;
run;
data have;
 set have;
 by PT DT;
 if first.DT then n=0;
 n+1;
run;
proc sql;
select distinct catt('have(where=(n=',n,') 
                     rename=(RULE_ID=RULE_ID_',n,' RULE_CAT=RULE_CAT_',n,'))')
        into : list separated by ' '
 from have;
quit;
data want;
 merge &list ;
 by PT DT;
 drop n;
run;

 

Make it better.

Balas
Fluorite | Level 6

Thanks Ksharp. Solution Worked perfectly well with the small datasets. 

 

But I faced a problem, while creating a macro variable list, the maximum allowable character length is more or less 6000 only.

If the length is more the value (for List) is getting truncated for huge datasets and the code execution is getting stopped with the Error message.

 

Could you please suggest how to rectify this unwanted truncation?

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 3062 views
  • 6 likes
  • 3 in conversation