15 minutes for this proc sql with 2 tables !
Another procedure?
proc sql;
create table pour_TDB.TDB25 as SELECT TDB_bul.Code_diplome, TDB_bul.AnSuiv,
sum(eff_app_total17) as eff_app_total17, sum(eff_app_bul_16) as eff_app_bul_16, sum(eff_app_bul_15) as eff_app_bul_15, sum(eff_app_bul_14) as eff_app_bul_14,
.............
FROM pour_TDB.TDB_bul LEFT JOIN pour_TDB.Quater_MF ON TDB_bul.AnSuiv = Quater_MF.Annee_suivie AND TDB_bul.Code_diplome = Quater_MF.Code_diplome
GROUP BY TDB_bul.Code_diplome, TDB_bul.AnSuiv;
run;quit;
NOTE: Table POUR_TDB.TDB25 created, with 2421 rows and 67 columns.
135 run;quit;
NOTE: PROC SQL statements are executed immediately; The RUN statement has no effect.
NOTE: PROCEDURE SQL used (Total process time):
real time 15:30.63
cpu time 15:30.43
Thank you for everyone.
I did not think about it, join and then proc summary
And how did your previous post which is marked as complete successfully solve this?
https://communities.sas.com/t5/SAS-Procedures/slow-on-Proc-Sql-summation/m-p/451567#M69759
Appears to be the same question?
I suspect what's in the ... is the issue.
I would be tempted to sum the variables first (assuming they all come from the a single data set) and then join, or if the variables that need summary are mixed in the two sets do the summary first and then join the summary results.
proc summary data=settosum nway; class Code_diplome AnSuiv; var eff: ; output out=summary(drop=_type_ _freq_) sum= ; run;
Since you didn't list all of the variables summed I followed the example actually shown in the code where all of the variables that were summed started with eff.
If Code_diplome AnSuiv are both character you could use: Var _numeric_ to sum all of the numeric variables without explicitly listing them.
The example code creates the sum variable with the same name as the input variable. If you request multiple statistics you need to provide some names for each statistic or use the option /autoname to create names with the statistic as a suffix to source name.
It's not the same question, it's with 2 tables
I can't do that
proc summary data=TDB_bul LEFT JOIN pour_TDB.Quater_MF ON TDB_bul.AnSuiv = Quater_MF.Annee_suivie AND TDB_bul.Code_diplome = Quater_MF.Code_diplome ;
class Code_diplome AnSuiv ;/*2 columns from TDB_bul*/
var eff_app_total17 Masc_2015 ; /*67 columns from TDB_bul and Quater_MF*/
output out=pour_TDB.TDB25 sum= ;
run;
@eric23263 wrote:
It's not the same question, it's with 2 tables
I can't do that
proc summary data=TDB_bul LEFT JOIN pour_TDB.Quater_MF ON TDB_bul.AnSuiv = Quater_MF.Annee_suivie AND TDB_bul.Code_diplome = Quater_MF.Code_diplome ; class Code_diplome AnSuiv ;/*2 columns from TDB_bul*/ var eff_app_total17 Masc_2015 ; /*67 columns from TDB_bul and Quater_MF*/ output out=pour_TDB.TDB25 sum= ; run;
No you can't proc summary works on one data set.
Without example data it is very hard to make suggestions. However for summing things creating the summaries from each of the input data sets separately would reduce the number of records being joined. Also Proc summary (or means ) allows use of variable lists so the code can be much shorter if you don't have to list 300 variables by name with the sum(var) as var structure required by SQL.
Since your sql code did not include sum(table.variable) I had to assume that each of the variables being summed only comes from one data set. So that set should be quite amenable to Proc Summary. If the variables are mixed between the two sets the summarize each. Something like
proc summary data=settosum nway; class Code_diplome AnSuiv; var eff: ; output out=summary(drop=_type_ _freq_) sum= ; run; proc summary data=settosum2 nway; class Code_diplome AnSuiv; var eff: ; output out=summary2(drop=_type_ _freq_) sum= ; run; proc sql; create table want as select <stuff> from summary as a left join summary2 as b on <criteria> ; quit;
The error messages from your Proc Summary should tell that you were not using the tool (proc summary) correctly.
I realize that many people have a background in SQL and try to do everything there because of familiarity but the SAS environment as a LOT of tools and some of them are much more efficient to execute, or simpler to code (the proc summary example here). These other tools have there own rules and it helps to learn them.
I do not try to force data steps to do a Cartesian join on multiple variables because SQL is almost always the best tool for that. Similarly summarizing data with another procedure before merging is a common approach.
Also note that with a very simple change to the code from proc summary I can the sums of those variables for 1)all the records, 2) all the values of Code_diplome, 3) all the values of AnSuiv as well as 4) only the combinations of Code_diplome and AnSuiv.
How simple: remove the NWAY option (though with that you want to keep the _type_ variable.
Add std= mean= max= min= /autoname after the sum= in the output statement of proc summary and I now have those statistics and the variables will be named with the original variable name with _sum _std _mean _max and _min appended. Without having to type out 65 (or 300 or what have you) std(var) as var_std, mean(var) as var_mean etc statements from SQL.
Even if Proc summary ran a little slower the time I saved by using variable lists and autoname might make the solution faster than SQL.
thank you, I'll try Monday
What I did today :
NOTE: Table POUR_TDB.TDB_25_1 created, with 2109 rows and 46 columns.
NOTE: Table POUR_TDB.TDB_25_2 created, with 2109 rows and 20 columns.
NOTE: Table POUR_TDB.TDB_25_3 created, with 2109 rows and 21 columns.
NOTE: Table POUR_TDB.TDB_25_4 created, with 2109 rows and 35 columns.
each proc sql in less than 0.10 secondes
I think the INNER JOIN between the 4 tables is fast
Thank you for everyone.
I did not think about it, join and then proc summary
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.