Dear SAS professionals,
I created a table by using the other two but the resulted dataset is completely empty, anyone who might have a idea about the reasons? Many thanks for your help in advance!
212 proc sql;
213 create table mydata as select *
214 from compx2 as a, LinkTable as b
215 where a.char_cusip = b.cusip and
216 b.LINKTYPE in ("LC", "LS", "LU", "LX", "LD", "LN") and
217 (b.LINKDT <= a.endfyr or b.LINKDT = .B) and (a.endfyr <= b.LINKENDDT or b.LINKENDDT = .E)
217! ;
NOTE: Table SIQI_IO.MYDATA created, with 0 rows and 73 columns.
218 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.37 seconds
cpu time 0.40 seconds
Thank you for your post, Reeza!
NOTE: There were 1257843 observations read from the data set SIQI_IO.COMPX2.
NOTE: There were 21 observations read from the data set SIQI_IO.LINKTABLE.
Yes, I tried your suggestion line by line and found actually the error should be starting from WHERE clause:
85 proc sql;
86 create table mydata as select compx2.*, LinkTable.*
87 from compx2 as a, LinkTable as b
88 where a.char_cusip = b.cusip;
NOTE: Table SIQI_IO.MYDATA created, with 0 rows and 73 columns.
88 ! /*and
89 b.LINKTYPE in ("LC", "LS", "LU", "LX", "LD", "LN") and
90 (b.LINKDT <= a.endfyr or b.LINKDT = .B) and (a.endfyr <= b.LINKENDDT or b.LINKENDDT = .E)
90 ! ; */
91 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.92 seconds
cpu time 0.43 seconds
Can you possibly spot the issue? Many thanks for your time with my enquiry!!
Most likely no records met the requirements in the Where clause.
Please see this code for a brief example that does the same thing with fewer variables and easy to see mismatch in the values on the where clause:
data one; input x y; datalines; 1 2 3 4 5 6 ; run; data two; input x z; datalines; 10 20 20 30 40 50 ; run; proc sql; create table work.junk as select a.*, b.z from one as a, two as b where a.x=b.x ; quit;
With only 3 records it is easy to see that x has no matches between the two example sets. But it builds an empty data set with all of the variables referenced.
Since you have a much more complex example you will need to trace things if you expect them to work.
I would start by starting with
where a.char_cusip = b.cusip
to see if you get any results. If you do then add in each of the "and" bits one at a time and see if one of them is failing.
If you are comparing actual date values (possibly LINKDT or LINKENDDT) and your Endfyr is not a complete date but just a year value that may be an issue.
Thank you very much for your example and explanations! It's truly helpful for my understanding!
Yes, I tried and found that starting with WHERE clause, there is no result produced so the issue is there! Do you know why does it like that?
LINKDT or LINKENDDT is actual daily numerical date variables and Endfyr is just a year value. However, this code is used to work but have no idea why it couldn't manage now....
I will double check with the date issue and hopefully it can work eventually. Many thanks!
@Jaaa wrote:
LINKDT or LINKENDDT is actual daily numerical date variables and Endfyr is just a year value. However, this code is used to work but have no idea why it couldn't manage now....
If one value is a year, and one is a SAS date that comparison wouldn't be correct. But if it worked before then I suggest you check your data, it's most likely a data issue.
I think I found the problem is that the cusip length in both datasets is different. One is 8 digits and the other is 9 digits. However, when I used the following code to increase its length, it still didn't make any difference. Do you possibly know how can I achieve that? Many thanks for your help indeed! 🙂
data test;
informat cusip $9.;
format cusip $9.;
set mydata;
run;
@Jaaa wrote:
I think I found the problem is that the cusip length in both datasets is different. One is 8 digits and the other is 9 digits. However, when I used the following code to increase its length, it still didn't make any difference. Do you possibly know how can I achieve that? Many thanks for your help indeed! 🙂
data test; informat cusip $9.; format cusip $9.; set mydata; run;
What should the length be?
If you Endfyr variable happened to be 2017 then likely the only dates less than that value are going to be before 10Jul 1965.
If this code worked before either something such as Year(datevariable)< endfyr or your Endfyr variable was an actual date value such as '31DEC2016'd
@ballardw wrote:If you Endfyr variable happened to be 2017 then likely the only dates less than that value are going to be before 10Jul 1965.
Thank you for your reply! I'm bit confused so what do you mean by that?
@Jaaa wrote:
@ballardw wrote:
If you Endfyr variable happened to be 2017 then likely the only dates less than that value are going to be before 10Jul 1965.
Thank you for your reply! I'm bit confused so what do you mean by that?
If your year value is 2017 then that is its value. The numeric value of a date 01JAN2017, which is what would be used for comparison, is 20820 if the value is a SAS date value since SAS dates are number of days since 01Jan1960. So 2017 as a numeric value corresponds to 10Jul1965. So the only "dates" less than 2017 are prior to 10Jul1965. If your Endfyr value is an actual SAS date corresponding to 31DEC2017 (end of a calendar year) that would be a numeric value of 21184.
It is reasons like this we request example data. We have no idea what actual values you have, what you are comparing and can't tell exactly what fails. I am guessing that you have date values, of some sort, since you have variable names ending in DT or YR but if they are not all actually SAS date values then comparisons can get very sticky.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.