BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yves_Boonen
Quartz | Level 8

I have been wondering about this for a while, and I haven't been able to find any answers online. The following data is fictional, but let's say you have this table:

 

zonereportsoldest_date1accidentsoldest_date2
ZONE1531/01/2016231/01/2016
ZONE1229/01/2016129/01/2016
ZONE1727/01/2016027/01/2016
ZONE133/02/201613/02/2016
ZONE2231/01/2016331/01/2016
ZONE2929/01/2016429/01/2016
ZONE2827/01/2016527/01/2016
ZONE243/02/201603/02/2016
ZONE3231/01/2016831/01/2016
ZONE3129/01/2016129/01/2016
ZONE3327/01/2016027/01/2016
ZONE333/02/201603/02/2016

 

Let's say you wanted to create a query that counts all the reports and accidents for each zone and remembers the lowest date for each report and accident for each zone. The result table should look like this:

 

zonereportsoldest_date1accidentsoldest_date2
ZONE11727/01/2016427/01/2016
ZONE22327/01/20161227/01/2016
ZONE3927/01/2016927/01/2016

 

In SQL you could easily approach this with something like this:

 

for each zone (sum(reports), min(oldest_date1), sum(accidents), min(oldest_date2))

 

And there you have it, you will receive a table like the one above. SAS seems devious when you want to do something like that and (for as far as I know) requires you to write a complex datastep. Is this true? Or could someone enlighten me?

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi Yves,

 

You need the GROUP BY clause:

proc sql;
create table want as
select zone, sum(reports) as reports, min(oldest_date1) as oldest_date1 format=ddmmyy10.,
             sum(accidents) as accidents, min(oldest_date2) as oldest_date2 format=ddmmyy10.
from have
group by zone;
quit;

View solution in original post

5 REPLIES 5
FreelanceReinh
Jade | Level 19

Hi Yves,

 

You need the GROUP BY clause:

proc sql;
create table want as
select zone, sum(reports) as reports, min(oldest_date1) as oldest_date1 format=ddmmyy10.,
             sum(accidents) as accidents, min(oldest_date2) as oldest_date2 format=ddmmyy10.
from have
group by zone;
quit;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Not sure that the SQL variant is much shorted code wise than a datastep:

 

data want;  /* Drop or rename as you need */
  set have;
  by zone;
  retain sum_reports min_date1 sum_accidents min_date2;
  if first.zone then do;
    sum_reports=reports;
    min_date1=oldest_date1;
    sum_accidents=accidents;
    min_date2=oldest_date2;
  end;
  else;
    sum_reports=sum(sum_reports,reports);
    min_date1=ifn(oldest_date1 < min_date1,oldest_date1,min_date1);
    sum_accidents=sum(sum_accidents,accidents);
    min_date2=ifn(oldest_date2 < min_date2,oldest_date2,min_date2);
  end;
  if last.zone then output;
run;
Astounding
PROC Star

A complex DATA step is not needed.  PROC SUMMARY can handle this.  Since it appears that your data set is sorted by ZONE, I will use that in the solution.  Workarounds are possible if that's not the case.

 

proc summary data=have;

by zone;

var reports oldest_date1 accidents oldest_date2;

output out=want (drop=_type_ _freq_) sum(reports)=reports sum(accidents)=accidents min(oldest_date1)=oldest_date1

min(oldest_date2)=oldest_date2;

run;

Yves_Boonen
Quartz | Level 8

How could I forget about "group by". That basically means for each, but I find it very unnatural that you do that in one of your last steps.

Reeza
Super User

Proc summary/means is the equivalent. 

 

Think of the case where you want multiple levels, ie group by and total, or subtotals, proc means can incorporate that. Proc SQL cannot, without multiple queries. 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 922 views
  • 5 likes
  • 5 in conversation