Hello:
I have some questions regarding proc SQL. My main question is: when I am expecting my output table to have three observations, why does it have 200+ observations?
As background:
The code:
proc sql;
create table estabgroup_table as
select mean_estabgroupamp.variable,
estabgroupamp,
estabgroupsst,
estabgroupnur,
estabgrouphih,
F,
Prob,
Pooledmean
from mean_estabgroupamp,
mean_estabgroupsst,
mean_estabgroupnur,
mean_estabgrouphih,
anova_estabgroup1,
pooledmean1;
quit;
proc print data=estabgroup_table; run;
My input data sets into the SQL table are:
- five proc means output datasets
- proc anova data sets
Specifically:
1. Input data set for pooled means column
Input data set for other mean_estabgroup amp/sst/nur/hih columns are:
The desired output:
Obs | Variable | estabgroupamp | estabgroupsst | estabgroupnur | estabgrouphih | F | PROB | pooledmean |
1 | fns | |||||||
2 | parent_revenue | |||||||
3 | firm_employees_parent1 |
Sample actual output:
Ideas?
Many thanks for your time!
You don't post what you expect your output to look like but I expect you want a merge and you haven't specified any criteria on how to join the data so I'm not even sure how SQL would evaluate it. It looks like a cross join I guess, joining all tables to all records of each table. Specify a join condition
You don't post what you expect your output to look like but I expect you want a merge and you haven't specified any criteria on how to join the data so I'm not even sure how SQL would evaluate it. It looks like a cross join I guess, joining all tables to all records of each table. Specify a join condition
Thanks, Reeza!
Question, though: I tried correcting my code by specifying for SQL to join each table by the variable called "variable" as follows
proc sql;
create table estabgroup_table as
select mean_estabgroupamp.variable,
estabgroupamp,
estabgroupsst,
estabgroupnur,
estabgrouphih,
F,
Prob,
Pooledmean
from mean_estabgroupamp as mea inner join
mean_estabgroupsst as mes inner join
mean_estabgroupnur as men inner join
mean_estabgrouphih as meh inner join
anova_estabgroup1 as an inner join
pooledmean1 as pm
on mea.variable eq mes.variable eq men.variable eq meh.variable eq a.variable eq pm.variable eq ;
quit;
proc print data=estabgroup_table; run;
but did I do this incorrectly because I get the following log result?:
2900 proc sql;
2901 create table estabgroup_table as
2902 select mean_estabgroupamp.variable,
2903 estabgroupamp,
2904 estabgroupsst,
2905 estabgroupnur,
2906 estabgrouphih,
2907 F,
2908 Prob,
2909 Pooledmean
2910 from mean_estabgroupamp as mea inner join
2911 mean_estabgroupsst as mes inner join
-----
73
2912 mean_estabgroupnur as men inner join
-----
73
2913 mean_estabgrouphih as meh inner join
-----
73
2914 anova_estabgroup1 as an inner join
-----
73
ERROR 73-322: Expecting an ON.
2915 pooledmean1 as pm
2916 on mea.variable eq mes.variable eq men.variable eq meh.variable eq an.variable eq pm.variable eq
2916! ;
2916 on mea.variable eq mes.variable eq men.variable eq meh.variable eq an.variable eq pm.variable eq
2916! ;
-
22
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string,
a numeric constant, a datetime constant, a missing value, BTRIM, INPUT, PUT, SUBSTRING,
USER.
2917 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Many of the joins want to know how to match up records, that is the ON requirement
proc sql;
create table new
as select a.*, b.onevariable
from thistable as a left joint thattable as b
on a.idvariable=b.idvariable;
quit;
Your join of multiple sets likely will need parentheses to get the order and each join, unless Cartesian, wants the ON criteria. Also with variables from separate table you may need to specify in the SELECT statement which table as in Meh.Variable to get the correct one.
Or possibly you want to do a datastep merge.
Thanks, ballardw! Cleaning through my code on how I did the joins and specifying which tables to get each selected variable from solved the problem.
I'm wondering: for the following table
Obs | Variable | estabgroupamp | estabgroupsst | estabgroupnur | estabgrouphih | F | PROB | pooledmean |
1 | fep mean(std) | 144.1( 126.1) | 167( 122.6) | 207.5( 132.4) | 132.8( 117.8) | 210.59 | 1.7722E-134 | 147.4 |
2 | fns mean(std) | 40.9( 49.3) | 82.8( 68.2) | 82.8( 64.2) | 44( 51.7) | 2416.83 | 0 | 48.6 |
3 | parent_revenue mean(std) | 4.4( 9.1) | 19.1( 177.1) | 11.1( 75.4) | 19.1( 648.7) | . | . | 11.7 |
given columns 3-6 stand for establishment's (variable name: estabgroup) with valid values of either amp/sst/nur/or hih,
how could I add a row of the frequencies for each valid value of this estabgroup variable?
Thank you so much for your time and help!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.