Hi all,
I want to know what is wrong with this code. I'm fairly sure the code itself is correct, so the problem may be the large sample size: I have 1.6 million observations. What is the best way to solve this?
When I submit the code, after a few minutes the system shows PROC MEANS as not responding.
Regards,
Ibrahim
proc sort data = have;
by year month week;
run; quit;
proc means data = have;
by year month week;
output out = want sum(FATKG)= FAT_KG SUM(PROTKG)= PROT_KG SUM(MILKKG)= MILK_KG;
RUN; QUIT;
Accepted Solutions
I think @Astounding hit the nail on the head there. Without a VAR statement, PROC MEANS analyzes every numeric variable, and without NOPRINT it writes those statistics for every BY group to the Results window, which takes time. So adding NOPRINT and a VAR statement should speed things up a lot.
proc sort data = have;
by year month week;
run;
proc means data = have NOPRINT;
by year month week;
var FATKG PROTKG MILKKG;
output out = want sum(FATKG)= FAT_KG SUM(PROTKG)= PROT_KG SUM(MILKKG)= MILK_KG;
RUN;
@Astounding wrote:
It's not clear how many numeric variables exist in your data set. Without a VAR statement, the default is to compute statistics for every numeric variable. PROC MEANS might be "intelligent" enough to figure out that you only need statistics computed for 3 variables ... but it might not figure that out. You might fix the problem just by adding to PROC MEANS:
var fatkg protkg milkkg;
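As a possible follow-up to the fix above, here is a minimal sketch of a slightly shorter OUTPUT statement. It assumes HAVE is already sorted by year, month, and week, and that the automatic _TYPE_ and _FREQ_ variables are not needed in WANT; neither assumption comes from the original post.

```sas
/* Sketch only: assumes HAVE is already sorted by year month week. */
proc means data=have noprint;
    by year month week;
    var fatkg protkg milkkg;
    /* With a VAR statement, the names after SUM= are matched to the
       VAR variables in order: fatkg -> fat_kg, protkg -> prot_kg,
       milkkg -> milk_kg. */
    /* DROP= removes the automatic _TYPE_ and _FREQ_ variables from WANT. */
    output out=want(drop=_type_ _freq_) sum=fat_kg prot_kg milk_kg;
run;
```

The sums should match the accepted solution; only the spelling of the OUTPUT statement and the columns kept in WANT differ.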
1.6 million observations will take a long time to run; there is no workaround unless you can reduce the number of observations. Just leave it until it's done.
If you have a lot of variables, you can try dropping the ones you don't need with a data set option (DROP= or KEEP=) to make it faster. Not sure if that helps, but it's worth a shot.
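A rough sketch of that suggestion, using the KEEP= data set option on the sort step so that only the needed columns are read and sorted. The output data set name HAVE_SORTED is made up for illustration; writing to a separate data set also avoids overwriting HAVE with a reduced set of columns.

```sas
/* Sketch only: HAVE_SORTED is a hypothetical name, not from the thread. */
/* KEEP= limits the sort to the BY variables and the three analysis      */
/* variables; OUT= leaves the original HAVE untouched.                   */
proc sort data=have(keep=year month week fatkg protkg milkkg)
          out=have_sorted;
    by year month week;
run;

proc means data=have_sorted noprint;
    by year month week;
    var fatkg protkg milkkg;
    output out=want sum(fatkg)=fat_kg sum(protkg)=prot_kg sum(milkkg)=milk_kg;
run;
```

How much this helps depends on how wide HAVE is; with only a handful of variables the gain is probably small.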
It's not clear how many numeric variables exist in your data set. Without a VAR statement, the default is to compute statistics for every numeric variable. PROC MEANS might be "intelligent" enough to figure out that you only need statistics computed for 3 variables ... but it might not figure that out. You might fix the problem just by adding to PROC MEANS:
var fatkg protkg milkkg;
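If it helps to check how many numeric variables the data set actually has, one way is to query the DICTIONARY.COLUMNS table. This sketch assumes HAVE lives in the WORK library, which the original post does not state; library and member names are stored in uppercase there.

```sas
/* Sketch only: assumes the data set is WORK.HAVE. */
proc sql;
    select count(*) as n_numeric
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'HAVE'
      and type = 'num';
quit;
```

A large count would suggest that the missing VAR statement is where most of the time goes.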
I appreciate your help, my SAS friends, hahaha. Now everything is fine and I got what I was looking for.
Thank you very much again.
Regards,
Ibrahim