I'm looking for advice on that one. Some context before.
I have the following table on SAS. There are 711 observations and many more variables. Below is a sample from my table.
date col1 col2 col3
jun14 0 0 0
may14 1 0 2
apr14 1 0 3
The table has no index, no primary key , nothing.
The results I'm aiming for, is to know for a specific date, all the values of that column.
date col1 col2 col3
may14 1 0 2
apr14 1 0 3
Example for May 14, I will have
I'm running the following SQL query on it
proc sql; select * from mytable where date < (input('may14',MONYY5.));
As you can imagine, the query is heavy when you have many variables and many observations. The query started 50 minutes ago and it is still running.
I also thought about using a proc print
proc print data=mytable; var date col1 col2 col3; where date = (input('may14',MONYY5.)); run;
So here is my question.
Is there an other way to have my results rather than through this query or the proc print? Do I need to have a datastep like a transpose , although if I'm doing a transpose, things would be different (see below).
date jun14 may14 apr14
col1 0 1 1
col2 0 0 0
col3 0 2 3
Thanks in advance for your insight.
711 observations and how many columns?
That shouldn't take 50 minutes. Are you on a server or desktop?
Reeza, I have 6000 variables ...
Well, lets think about it for a second. You have 6000 variables. No one in their sane mind is going to scroll through all 6000 variables to read them, I hope. So what are you doing with them.
Second, if you're printing to HTML or LISTING that will take a while to generate the file.
So the real question is what do you want the output to look for whomever is using it next. Then, how do you get it there efficiently.
Did you by any chance miss the quit statement at the end of proc sql query? Proc SQL needs "quit" at the end in order to stop.
If your process is still running then abort the procedure from SAS tool bar and try running it again with "quit"
Hope that helps!
Hi Everyone,
Thanks for the answers.
Let me have a try.
To add, I concur with Reeza, 6000 variables is not plausible, even turning the table round so you have 6000 rows and 711 variables would drastically increase your performance. If you have that many variables then the data should be normalized, e.g.
Date Col Result
JUN14 1 0
JUN14 2 1
JUN14 3 1
MAY14 1 0
MAY14 2 0
...
Whilst working with that many variables may work in SAS, its not a good idea. Also, depending on your data, try summarizing, grouping, etc to shrink the data up.
Hi RW9,
I concur with Reeza and you , actually I'm trying to identify these columns to drop them.
What are you trying to identify about those variables (columns)? And what does it have to do with the DATE variable?
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.