BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gsnidow
Obsidian | Level 7

Greetings all.  I'm trying to turn a simple SQL group by into SAS code, but I'm not sure which proc I should use.  The SQL is below.

select

     max_n,

     cr_scor,

     bad_debt_at_conn,

     dep_unpaid,

     chg_off,

     count(*) as count

  from npso

where cr_scor is not null

  and is_max_n = 1

group by max_n, cr_scor, bad_debt_at_conn, dep_unpaid, chg_off

I know I could just run proc sql, but I was thinking maybe there was a proc to do this.  There are around 30 other variables in the table, but these are the only ones I want to keep.  Thank you.

Greg


1 ACCEPTED SOLUTION

Accepted Solutions
Aad
Calcite | Level 5 Aad
Calcite | Level 5

Better alternative is to use PROC SUMMARY as it does NOT calculate statistics at all, unless requested to do so. And of course, omitting calculation saves time (eventually you loose your coffee break Smiley Happy)

Proc Summary NoPrint Nway Missing;

     By Max_N Cr_Scor Bad_Dept_at_Conn Dep_Unpaid Chg_Off;

     Output Out = Whichever_Name_You_Want (Drop = _type_);

     Run;

This writes exactly 1 row for each combination of BY-variables.

BTW: You may need a PROC SORT before you call PROC SUMMARY if the data is not in proper order, or you may replace the BY statement with an identical CLASS statement. It depends on your input wether the SORT+BY or the CLASS version is most efficient.

View solution in original post

15 REPLIES 15
AncaTilea
Pyrite | Level 9

Hi.

Why would you want to change a nicely looking SQL code to several PROCs and Data Steps?

You'd have to use PROC SORT, PROC MEANS/SUMMARY and at least one data step to combine the results from the MEANS/SUMMARY procedure.

Smiley Happy

gsnidow
Obsidian | Level 7

Well, I've been using SQL Server for around 10 years now, and I'm pretty proficient with SQL.  However, there are some ANSI standards in SQL, like 'WITH' statement, row_number(), among others, that are not included in SAS SQL, so in learning SAS, I would like to avoid using SQL as much as possible.  I do use it, but the first couple of programs I wrote are nothing but a series of proc sql statements, and I feel like that is cheating a little bit.

AncaTilea
Pyrite | Level 9

:smileylaugh: fair.

Here is some really poorly written code, but it should give you an idea:

*using sashelp.class as example;

proc sql;

select

     age, weight,height,sex,  count(*) as count

  from sashelp.class

  where age is not null

group by sex,age;

quit;

It gives the following output:

     Age    Weight    Height  Sex     count

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

      11      50.5      51.3  F           1

      12      84.5      59.8  F           2

      12        77      56.3  F           2

      13        98      65.3  F           2

      13        84      56.5  F           2

      14     102.5      62.8  F           2

      14        90      64.3  F           2

      15       112      66.5  F           2

      15     112.5      62.5  F           2

      11        85      57.5  M           1

      12        83      57.3  M           3

      12      99.5        59  M           3

      12       128      64.8  M           3

      13        84      62.5  M           1

      14     112.5        69  M           2

      14     102.5      63.5  M           2

      15       112      66.5  M           2

      15       133        67  M           2

      16       150        72  M           1

Now, let's PROC it:

First sort by what you are "group by" in your SQL:


  proc sort data = sashelp.class out =  have; by sex age;run; *Create a temp file, so we don't sort the original;

Then we use "where" and "keep" statements in the "set" statement:

data want(keep = sex age cnt);

    set temp(where = (age ne 0)

                    keep = age weight sex height);

    by sex age;

    if first.age then cnt = 1;

        else cnt ++ 1;

    if last.age; *we want the "total" count;

run;

Finally get it all together;

data really_want;

    merge have want;

    by sex age;

run;

Hee,hee...Smiley Wink

gsnidow
Obsidian | Level 7

Thank you Anca.  That gives me some stuff to think about.  I guess I thought there was a magic proc that would do the same as the SQL in one fell swoop.

Greg

AncaTilea
Pyrite | Level 9

Maybe there is a a PROC, I just don't know it...

ronf57
Calcite | Level 5

Thanks, i learned alot about the SQL to SAS conversion from your discussion and posts.

Reeza
Super User

I think that's a single proc means using a class statement, but you'll need to control the output using a ways or type statement. The only issue is since you're counting you need to have a variable to count, so I use max_n...though you can use any variable in your dataset.

proc means data=npso noprint;

class max_n cr_scor bad_debt_at_conn dep_unpaid chg_off;

ways max_n*cr_scor*bad_debt_at_conn*dep_unpaid*chg_off;

var max_n;

output out=want n=count;

run;

gsnidow
Obsidian | Level 7

I think this will do it, but I'm not quite sure why there would be missing values for my 'N', which is just _freq_ renamed.

proc means data=npso noprint;

     by max_n cr_scor bad_debt_at_conn dep_unpaid chg_off;

     output out=want (rename=(_freq_=N)

                              keep=max_n cr_scor bad_debt_at_conn dep_unpaid chg_off n);

run;

Now, it seems my big problem is that I am unable to copy and paste anything in this window.  Does anyone else have that happen?

Thank you

Greg

Reeza
Super User

The _freq_ would give you what you want...we both forgot the initial where clause from the SQL query, if you put that in you might get what you wanted without renaming.

If you can't copy/paste that means you're using IE, use chrome or firefox instead.

Linlin
Lapis Lazuli | Level 10

I have the same copy/paste problem. The link shows you how I deal with the problem:

https://communities.sas.com/thread/34723

gsnidow
Obsidian | Level 7

Well, that (what I posted above) can't be right either, since there are 36,925 obs in the want table, and only 27,519 in the source table.  I have no option here to use anything other than explorer, but thank you for the tip Linlin.

Reeza
Super User

The where clause needs to be added in:

proc means data=npso noprint;

where not missing(cr_scor)

  and is_max_n = 1;

     by max_n cr_scor bad_debt_at_conn dep_unpaid chg_off;

     output out=want (rename=(_freq_=N)

                              keep=max_n cr_scor bad_debt_at_conn dep_unpaid chg_off n);

run;

gsnidow
Obsidian | Level 7

Ok, I get it now.  Since I dropped all but the columns I want, I did not see there are multiple rows for each category, one for each statatistic.  It seems the only row I want to keep is where _stat_ = 'N'.  The missing values were the STD row, which makes total sense now. 

Aad
Calcite | Level 5 Aad
Calcite | Level 5

Better alternative is to use PROC SUMMARY as it does NOT calculate statistics at all, unless requested to do so. And of course, omitting calculation saves time (eventually you loose your coffee break Smiley Happy)

Proc Summary NoPrint Nway Missing;

     By Max_N Cr_Scor Bad_Dept_at_Conn Dep_Unpaid Chg_Off;

     Output Out = Whichever_Name_You_Want (Drop = _type_);

     Run;

This writes exactly 1 row for each combination of BY-variables.

BTW: You may need a PROC SORT before you call PROC SUMMARY if the data is not in proper order, or you may replace the BY statement with an identical CLASS statement. It depends on your input wether the SORT+BY or the CLASS version is most efficient.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 3926 views
  • 7 likes
  • 6 in conversation