BookmarkSubscribeRSS Feed
AshleyM
Fluorite | Level 6

I'm trying to write the equivalent SAS code for:

proc sql;

create table dataset2 as select var1, var2, var3, var4, var5, count(*)

from dataset1

group by var1, var2, var3, var4, var5;

10 REPLIES 10
FriedEgg
SAS Employee

proc sort data=dataset1; by var1 var2 var3 var4 var5; run;

data dataset2;

set dataset1;

by var1 var2 var3 var4 var5;

if first.var5 then count=0;

count+1;

if last.var5 then output;

run;

sassharp
Calcite | Level 5

data dataset1;
input var1 var2 var3 var4 var5;
datalines;
1 3 65 63 77
1 2 65 63 66

;
proc sort data=dataset1;
by var1 var2 var3 var4 var5;
run;


data dataset2(keep= var1 var2 var3 var4 var5 count);
set dataset1;
by var1 var2 var3 var4 var5;
count+1;
run;

FriedEgg
SAS Employee

sassharp,

Your code will count across the entire file rather than across by group as the SQL will.  You are basically creating a sequence number for dataset1.

sassharp
Calcite | Level 5

Agreed Fried Egg.

sassharp
Calcite | Level 5

As said friedEgg

data dataset1;
input var1 var2 var3 var4 var5;
datalines;
1 3 65 63 77
1 2 65 63 66
2 2  2  2  2
2 2  2  2  2

;
proc sort data=dataset1;
by var1 var2 var3 var4 var5;
run;

data dataset2(keep= var1 var2 var3 var4 var5 count);
set dataset1;
by var1 var2 var3 var4 var5;
if first.var5 then count=0;
count+1;
if last.var5 then output;


run;

Haikuo
Onyx | Level 15

DOW seems pretty handy for this case:

data dataset1;

input var1 var2 var3 var4 var5;

datalines;

1 3 65 63 77

1 2 65 63 66

2 2  2  2  2

2 2  2  2  2

;

proc sort data=dataset1;

by var1 var2 var3 var4 var5;

run;

data want;

  do sum=1 by 1 until (last.var5);

     set dataset1;

by var1-var5;

  end;

  run;

  proc print;run;

Regards,

Haikuo

sassharp
Calcite | Level 5

data dataset1;
input var1 var2 var3 var4 var5;
datalines;
1 3 65 63 77
1 2 65 63 66
2 2  2  2  2
2 2  2  2  2

;
proc sort data=dataset1;
by var1 var2 var3 var4 var5;
run;

data dataset2(keep= var1 var2 var3 var4 var5 count);
set dataset1;
by var1 var2 var3 var4 var5;
if first.var5 then count=0;
count+1;
if last.var5 then output;


run;

/*or*/


proc sql;
create table dataset2 as
select var1,var2,var3,var4,var5,count(*)
from dataset1
group by var1,var2,var3,var4,var5;
run;

Astounding
PROC Star

Another approach:

proc freq data=dataset1;

   tables var1 * var2 * var3 * var4 * var5 / noprint out=dataset2 (drop=percent);

run;

If there are a lot of combinations of the 5 variables, it is conceivable this would run out of memory.  But the tradeoff is that it works on unsorted data.

FriedEgg
SAS Employee

Another alternative would be proc means, which is a multi-threading procedure which will make the operation faster for larger data:

proc means data=dataset1 n nway noprint;

class var1-var5;

output out=dataset2(drop=_type_ rename=(_freq_=count));

run;

Haikuo
Onyx | Level 15

Another approach, using Hash this time, I know it's silly, just try to show what Hash can do:

data have;

input v1-v6;

datalines;

2 2  2  2  2 4

2 2  2  2  2 77

1 3 65 63 77 2

1 2 65 63 66 3

2 2 2 2 2 76

;

data want (drop=_rc);

length sum 8;

if _n_ = 1 then do;

   declare hash h(suminc: "sum", ordered: "a");

   declare hiter hi("h");

   h.defineKey('v1', 'v2','v3', 'v4', 'v5');

   h.definedata('v1', 'v2','v3', 'v4', 'v5');

   h.defineDone();

   sum = 1;

  end;

do while (not done);

   set have end=done;

   _rc = h.ref();

end;

_rc = hi.first();

do while(_rc = 0);

   _rc = h.sum(sum: sum);

   output;

   _rc = hi.next();

end;

stop;

run;

proc print;run;

Regards,

Haikuo

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2805 views
  • 0 likes
  • 5 in conversation