BookmarkSubscribeRSS Feed
Walternate
Obsidian | Level 7

Hi all,

 

I have a proc sql step which is building some counts by clinic:

 

proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;

 

The step works but I'm confused because clin_clean is all upper case, while the resulting variable after this step (clin) is not. 

 

I added the upper function to the SQL step, but the resulting values are still all lower case. 

 

Any help is much appreciated!

4 REPLIES 4
Quentin
Super User

That is surprising. Can you post the log from running:

 

proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;

data _null_;
  set clin_counts(obs=5);
  put clin= num_recs=;
run;
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Reeza
Super User

Please post your log. 

proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;

 proc print data=clin_counts (obs=5);
run;
ballardw
Super User

Example data in the form of data step that exhibits this behavior with shown. Since the only variable needed is your Clin_clean I would hope that is relatively easy.(See my example below)

 

What format is currently applied to your Clin_clean variable?

 

Note: You want to make sure that the variable has the values correct before using groupby in the syntax you show. If the Clin_clean has different case on input you will get multiple groups because Group By is not using the calculated version of the Clin variable.

 

 

datalines;
abc
abc
Abc
AbC
pdq 
pdq
pdq
;

proc sql;
create table clin_counts as
select upper(clin_clean) as clin, count(clin_clean) as num_recs
from clean_clin
group by clin_clean
order by num_recs;
quit;

ods listing;
proc print data=clin_counts noobs;
run;

/* result*/
clin    num_recs
ABC         1
ABC         1
ABC         2
PDQ         3


 

 

Tom
Super User Tom
Super User

You appear to be grouping by a different variable than you are including in the output.

 

Here is an example that makes it much clearer.

proc sql;
select mod(age,3) as mod_age,count(*) as nobs
from sashelp.class
group by age
;
 mod_age      nobs
------------------
       2         2
       0         5
       1         3
       2         4
       0         4
       1         1

Also you probably need to include the CALCULATED keyword in the GROUP BY clause to make sure it is not grouping by some variable in CLEAN_CLIN that is named CLIN instead of the new CLIN variable you are calculating the in the SELECT statement.

 

Also do you really want the count to be zero for the group of records where CLIN_CLEAN is missing?  That is what happens when you use a variable inside the COUNT() function, missing values are not counted.

 

Try this instead.

create table clin_counts as
select upper(clin_clean) as clin
     , count(*) as num_recs
from clean_clin
group by calculated clin 
order by num_recs;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 477 views
  • 2 likes
  • 5 in conversation