I am trying to write a query that selects the MIN and MAX timestamp for a single day. The vast majority of my data pulls correctly, but on a few rows I can pull the MIN timestamp, yet when I change it to MAX the result comes back empty (just a page break). There are usually multiple timestamps per day, but when there is only one, MIN and MAX should both return that same timestamp.
My code is extremely long or I would post it.
I am using a HAVING clause to select the MIN and MAX, similar to this:
proc sql;
   select distinct
          t2.no,
          t1._ID,
          t2.id,
          t3.First_Name,
          t3.Last_Name,
          t3.Title,
          t3.'02_LOCATION'n,   /* name literal because the column name starts with a digit (needs options validvarname=any) */
          t3.Level_01,
          t3.Level_02,
          t1.M_TS,
          t1.O_TS,
          t3.employee,
          t1.O_A
   from TEST.DAY t2
   inner join B.LE t1
     on  t2.ID = t1._ID
     and put(datepart(t2.ADate), date9.) = put(datepart(t1.M_TS), date9.)   /* same calendar day */
   inner join B.E t3
     on  t1.O_A = t3.EMPLOYEE
     and t3.O_TS > '15DEC9999:00:00:00'dt
   where t2.NO = '11111-11111'
   group by t1._ID
   having M_TS = max(t1.M_TS)
      and O_TS = min(t1.O_TS)
      and First_Name = scan(t3.First_Name, 1, ' ')
      and Last_Name  = scan(t3.Last_Name, 1, ' ')
   ;
quit;
If I change nothing in the code but that MAX to MIN, the query works. Does anyone have an idea of what could cause this?
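In essence, the pattern I am after looks something like this (the table and column names here are simplified placeholders, not my real ones):

proc sql;
   create table want as
   select id,
          datepart(event_ts) as event_date format=date9.,
          min(event_ts)      as first_ts   format=datetime20.,
          max(event_ts)      as last_ts    format=datetime20.
   from work.events
   group by 1, 2;   /* positional: group by id and the calculated event_date */
quit;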
My 2 cents:
If you are at an organization that uses SAS at an enterprise, commercial level, I would rather deal with a lot of this piecemeal, preferably with a data step approach, unless management insists on an explicit SQL connect pass-through. Writing tons of variable names in a long SELECT invites typos, and the risk of a "remerge" can cause duplication and other unwarranted intricacies that lead to accuracy and integrity issues.
Taking advantage of the beautiful PDV, the block of addresses that holds the operands read in from the descriptor portion, gives us the breathing space to work with just the variables of interest for create, insert, update, and delete, which alone is priceless, while the other variables can simply sleep and still be neatly written to the output buffer.
Of course, I love PROC SQL and have used it, and will use it, when it is not unwieldy and the column processing is cleaner and involves less typing. Sure, PROC SQL also has to have a memory address block, and the great guru Paul Dorfman has mentioned that the APP functions can be put to use in PROC SQL, which may be of personal interest to me at some point in the future. Nevertheless, a long SELECT clause with many queries nested in it will certainly cause someone a heartache. HTH
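For instance, here is a bare-bones data step sketch of the min/max-per-day idea from the original question (made-up dataset and variable names, just to show the shape):

proc sort data=work.events;
   by id event_date event_ts;
run;

data work.first_last;
   set work.events;
   by id event_date;
   retain first_ts;
   format first_ts last_ts datetime20.;
   if first.event_date then first_ts = event_ts;   /* earliest timestamp in the group */
   if last.event_date then do;
      last_ts = event_ts;                          /* latest timestamp, since the data are sorted */
      output;                                      /* one row per id/event_date */
   end;
run;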
Full code, log, and error info please.
Got it.
HAVING M_TS = MAX(T1.M_TS) and O_TS = MIN(T1.O_TS) AND FIRST_NAME = scan(t3.first_name,1,' ') AND LAST_NAME = SCAN(T3.LAST_NAME,1,' ') ;
Nothing is wrong with the above syntactically. From a business-logic or data point of view I can't tell. The simple conclusion, however, is that no records satisfy all of the conditions specified in the HAVING filter.
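If it helps, a quick diagnostic (sketched against the same table names and aliases as your query) is to look at the grouped extremes without the HAVING filter and check whether any detail row actually carries both at once:

proc sql;
   create table check as
   select t1._ID,
          max(t1.M_TS) as max_m_ts format=datetime20.,
          min(t1.O_TS) as min_o_ts format=datetime20.
   from TEST.DAY t2
   inner join B.LE t1
     on  t2.ID = t1._ID
     and put(datepart(t2.ADate), date9.) = put(datepart(t1.M_TS), date9.)
   where t2.NO = '11111-11111'
   group by t1._ID;
quit;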
@novinosrin I love your answers and learn from you. I agree that the data step gives more flexibility in terms of data processing, but:
1. Explicit pass-through SQL has a significant impact on run times, which matters a lot when there are lots of records.
2. Other than PROC SQL, no other SQL implementation (as far as I know) allows remerging; they require multiple passes instead.
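A toy illustration of what I mean by remerging (made-up table and variable names):

proc sql;
   /* PROC SQL "remerges" the group-level MAX back onto every detail row    */
   /* and writes a NOTE about remerging to the log. Most other SQL dialects */
   /* would reject mixing a non-grouped column with a summary function.     */
   create table remerged as
   select id,
          event_ts,
          max(event_ts) as max_ts format=datetime20.
   from work.events
   group by id;
quit;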
@kiranv_ -- "1. Explicit pass-through SQL has a significant impact on run times, which matters a lot when there are lots of records."
Not necessarily. I will prove that to you in time, though unfortunately not at 6:10 Chicago time while in a hurry to catch the Red Line CTA back home.
I agree with point 2.
Finally, thank you for the kind note.
Sure. I worked on converting close to 3,000 lines of SAS code that processed close to 200 million records to Teradata SQL; the difference was 20 minutes versus 18 hours. The major time issue is because of I/O.
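For anyone following along, the explicit pass-through shape I am talking about is roughly this (the connection options and table names are placeholders):

proc sql;
   connect to teradata (user=&td_user password=&td_pw server=tdprod);
   /* the inner query runs entirely inside Teradata; only the summary rows come back to SAS */
   create table work.summary as
   select * from connection to teradata
   (  select id,
             min(event_ts) as first_ts,
             max(event_ts) as last_ts
      from mydb.events
      group by id
   );
   disconnect from teradata;
quit;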
@novinosrin wrote:
That depends upon the quality of the SAS user and not the product/language, plus taking into account the objective/need and the optimum approach.
Usually it has to do with running on a server versus running on a desktop, is my guess.
It's a good workaround when you can't afford a SAS server but need the power of a server. You are limited by the type of work, though, because it's mostly standard data manipulation. Until SQL introduced PIVOT (2008-ish) and some of the more statistical techniques, it was more of an issue.
Thank you @Reeza, well noted. The discussion might next slip into the performance of PROC SQL vs. the data step (libname engine) at the engine level, as the latter does have higher overhead for direct extracts and column-oriented processing as opposed to row-oriented processing, and offers less flexibility for joins, set operators, and indexes (these are more flexible in PROC SQL than in the data step or even PROC DATASETS, for obvious reasons; reference: The Complete Guide to SAS Indexes by Michael Raithel).
The data step is indeed powerful for a wide variety of tasks. Having said that, how extensive, varied, and complex it can get seems pretty much unlimited, especially to a reader's eye on this very forum when demonstrated by some of the members we look up to as role models (I don't want to keep repeating the ones I adore, lol).
The convenience of typing and maintaining code, the control over the objective/business requirement beyond the PROC SQL functionality described above, and the control a SAS user has over variables in both row-level and column-level processing (I must emphasize the required proficiency with hash objects, multi-dimensional arrays, DOW loops, etc.) are what make SAS a classy gem of its own kind. And you can stay oblivious to the descriptor (metadata) portion, i.e. not worry about the variables that are not required for processing, which does not hinder you.
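A bare-bones DOW-loop sketch of the same min/max idea (again with made-up names), just to show how little you have to touch outside the variables of interest:

/* assumes work.events is sorted by id */
data work.min_max;
   do until (last.id);
      set work.events;
      by id;
      if min_ts = . or event_ts < min_ts then min_ts = event_ts;   /* running minimum */
      if event_ts > max_ts then max_ts = event_ts;                 /* running maximum */
   end;
   format min_ts max_ts datetime20.;
   keep id min_ts max_ts;
run;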
On the other hand, in PROC SQL a short SELECT clause with a small number of variables is fine, but long clauses with a huge list of variables, joins made on multiple variables (with several AND conditions), aliases, ambiguous references, and many more nits have to be taken care of. And on top of that, the accidental Cartesian product is another sweetheart to beware of.
So to conclude, server considerations aside, my view is that different objectives may warrant different efficient approaches, and it tests the individual super user's merits to come up with the most optimal, quality solution.
Best Regards!
PS: IMHO, if SAS didn't provide the scope to keep pursuing competitiveness far and wide, the charm and joy of coding wouldn't exist. 🙂 Lastly, I use SQL when it happens to be the laziest option to type (code) yet comparable to the other options.
This really helped, thank you!
Hey @tugtrog, sorry to bother you, and I hope your day is going well.
If you have a moment, take a look at the new thread below, posted by a new member 🙂
https://communities.sas.com/t5/New-SAS-User/Merging-Data-in-Proc-SQL/m-p/507812
What a deja vu 🙂 So that means our discussion has actually made sense 🙂
Hey @novinosrin, you are not bothering me at all. I appreciate you sharing this with me, I am actually working on using a data step and PDV right now. Our discussion really did help me, and opened my mind to a whole new idea on how to quickly execute some massive queries on massive data sets that I don't know if I would have been exposed to otherwise.
Thanks again! Have a great weekend.
PS feel free to share anything with me!