Hi all,
I have a proc sql step which is building some counts by clinic:
proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;
The step works but I'm confused because clin_clean is all upper case, while the resulting variable after this step (clin) is not.
I added the upper function to the SQL step, but the resulting values are still all lower case.
Any help is much appreciated!
That is surprising. Can you post the log from running:
proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;
data _null_;
set clin_counts(obs=5);
put clin= num_recs=;
run;
Please post your log.
proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;
proc print data=clin_counts (obs=5);
run;
Example data in the form of data step that exhibits this behavior with shown. Since the only variable needed is your Clin_clean I would hope that is relatively easy.(See my example below)
What format is currently applied to your Clin_clean variable?
Note: You want to make sure that the variable has the values correct before using groupby in the syntax you show. If the Clin_clean has different case on input you will get multiple groups because Group By is not using the calculated version of the Clin variable.
datalines; abc abc Abc AbC pdq pdq pdq ; proc sql; create table clin_counts as select upper(clin_clean) as clin, count(clin_clean) as num_recs from clean_clin group by clin_clean order by num_recs; quit; ods listing; proc print data=clin_counts noobs; run; /* result*/ clin num_recs ABC 1 ABC 1 ABC 2 PDQ 3
You appear to be grouping by a different variable than you are including in the output.
Here is an example that makes it much clearer.
proc sql;
select mod(age,3) as mod_age,count(*) as nobs
from sashelp.class
group by age
;
mod_age nobs ------------------ 2 2 0 5 1 3 2 4 0 4 1 1
Also you probably need to include the CALCULATED keyword in the GROUP BY clause to make sure it is not grouping by some variable in CLEAN_CLIN that is named CLIN instead of the new CLIN variable you are calculating the in the SELECT statement.
Also do you really want the count to be zero for the group of records where CLIN_CLEAN is missing? That is what happens when you use a variable inside the COUNT() function, missing values are not counted.
Try this instead.
create table clin_counts as
select upper(clin_clean) as clin
, count(*) as num_recs
from clean_clin
group by calculated clin
order by num_recs;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.