BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fierceanalytics
Obsidian | Level 7

Hello, 

Here is generic code

proc sql; select *, max(datex) FORMAT=DATETIME25.6 as datexx from table1 as a

     group by id1, id2

     having (calculated datexx) = a.datex

    order by id1, id2;

quit;

 

The purpose is to keep only those with max date value. Question is about the scope of max function. This max action is column wise obviously. with 'group by'n effect, it will run max within the groups formed by id1, id2, right? I think so. Does anybody have other opinion? Thanks. 

Jia

    

1 ACCEPTED SOLUTION

Accepted Solutions
RichardDeVen
Barite | Level 11

The SQL Procedure User's Guide contains reference documentation that explains everything, so you don't need an opinion of the interaction of GROUP BY and HAVING.

 

GROUP BY 

Specifies how to group the data for summarizing.

HAVING

The HAVING clause is used with at least one summary function and an optional GROUP BY clause to summarize groups of data in a table. A HAVING clause is any valid SQL expression that is evaluated as either true or false for each group in a query. Alternatively, if the query involves remerged data, then the HAVING expression is evaluated for each row that participates in each group. The query must include one or more summary functions.
 
Typically, the GROUP BY clause is used with the HAVING expression and defines the group or groups to be evaluated. If you omit the GROUP BY clause, then the summary function and the HAVING clause treat the table as one group.

 

View solution in original post

7 REPLIES 7
Reeza
Super User
There shouldn't be 'opinions' here, you should try the different scenarios and see what happens 🙂
There's no cost to testing it.
RichardDeVen
Barite | Level 11

The SQL Procedure User's Guide contains reference documentation that explains everything, so you don't need an opinion of the interaction of GROUP BY and HAVING.

 

GROUP BY 

Specifies how to group the data for summarizing.

HAVING

The HAVING clause is used with at least one summary function and an optional GROUP BY clause to summarize groups of data in a table. A HAVING clause is any valid SQL expression that is evaluated as either true or false for each group in a query. Alternatively, if the query involves remerged data, then the HAVING expression is evaluated for each row that participates in each group. The query must include one or more summary functions.
 
Typically, the GROUP BY clause is used with the HAVING expression and defines the group or groups to be evaluated. If you omit the GROUP BY clause, then the summary function and the HAVING clause treat the table as one group.

 

fierceanalytics
Obsidian | Level 7

I inherited a lot of code loaded with proc SQL. I am not going to spend expensive paid company time to read documentation like my data steps are not good enough. If a quick confirmation or 'no you are wrong' is not available, I will just rewrite it to data step and move on. Whichever way is faster. 

 

Thank you Richard. 

 

Jia

Kurt_Bremser
Super User

not going to spend expensive paid company time to read documentation

I consider any company that discourages learning on the job a hopeless case, not worth wasting time on.

What do you think has made the senior helpers here into experts? Exactly what you disparage in your comment. Thinking like that is a serious mistake, to your personal detriment and to the detriment of your employer.

fierceanalytics
Obsidian | Level 7

When I first saw DS2, I converted several proc sql programs to it. DS2 runs faster. Then I realized DS2 is a life saver because it finishes many SQL jobs that otherwise will die on proc sql. So I spent a lot, buying Jordan's book, enduring all the declare statement... to learn DS2. That brings a lot of joy and value to myself and my organizations. What are you going to label me then? great learner? What are you going to label me if I tell you I am not going to spend time learning other SQL like FEDSQL? "you refuse to learn, you are not good..."

 

After DS2, I spent some time learning programing on proc IMSTAT because it is yet another faster technology, better than SAS foundation that underlies proc sql. Now I have had yet good practice with Viya. Throughout this learning process and years, I took jobs that asked me to rewrite SAS to first R and then Python. I therefore do know first hand that not all packages interact "group by" and 'having'-like the same way SAS does. 

 

At my current job I am helping with modernizing things. The starting point is pile of legacy proc SQL code, among others. Now not only i need to recall what I learned from Howard Schreier's "proc sql by example" book way back, I also have to separate signals from non-SAS package behavior. So I thought I came here to ask people who can just spit out tome quick answers. 

 

So please do not judge companies who expect employees to selectively learn, to learn with value goals in mind, yearly, quarterly if not daily. If you learn SAS, try to learn something that runs on better, modern technologies. In other words, modernize yourself while learning. The first time I read SAS BASE documentation was ~1991 when there was no PDF. From UNIX/VAX, to mainframe, now Windows.  Sometime all platforms together. I don't remember how many times I went through documentations, I still forget. I know who is good programmer when I see one. Actually very good ones often find problems with documentation and wrote books/papers about them. Schreier, Virgile, Dorfman, Whitlock? to recall a few. You are right practice makes it good (expertise not guaranteed) . On the other hand, specialty is specialty. If you ask me how HPFOREST works, or compare HPFOREST with Python, I am not going to send you to  read documentation or go to stackoverflow.com. I can share with you 3-15 bullet points depending on how much detail you like. Just be humble, respect the respectable if you want good ROI of your time. 

 

Jia

 

 

sqlGoddess
SAS Employee

thanks for your question

 

I might simplify the code as follows to still get the same results as your original code.

 

proc sql;
	select * from sashelp.mdv
		group by city
			having shipdate =max(shipdate);
quit;

%put &=sqlobs;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 826 views
  • 3 likes
  • 5 in conversation