07-25-2015 10:07 AM
Dear Support Communities users:
My current datasets have the average of 100 variables and 30 millions obs. I want to roll up my data to keep minimum variables so that data can import to excel or access. (Note: users do not want spend time to learn SAS).
Currently I wrote the code:
Proc SQL;
create table <output_dts> as
select var1,....,var10,sum(amount) as sum_amt, count(*) as counts
from <input_dts>
group by var1,..var10 ;
Quit;
It worked for 10 variables and running time about 1 hour. If I increase up to 40 variables, this code will work or need modify to run.
I need your knowledge and advice to upgrade the code or suggest other solution.
Thanks for valuable support.
William
07-25-2015 11:00 AM
How about using proc summary ?
proc summary data=have;
by var1-var40;
var amount;
output out=want sum=sum_amount n=counts ;
run;
07-25-2015 10:28 AM
Let us do an easy calculation 100 variables apx 8byte/var , round up 1k 30M records. You are having dataset of a size of 30Gb. 1 hour for 3Gb sounds rather slow for me the data must be on some server and transported by a network for that. With a 32-bit office version there is a 2Gb limit.
This sizing doesn't make any sense trying to propagate to desktops.
Avoiding the need for learming SAS could be achieved by using AMO. Addin Microsoft Office. It is still running SAS on a server but it is the office interface.
SAS Add-In for Microsoft Office
07-25-2015 11:00 AM
How about using proc summary ?
proc summary data=have;
by var1-var40;
var amount;
output out=want sum=sum_amount n=counts ;
run;
Need further help from the community? Please ask a new question.