Hi Everyone,
I merge target file with tic | date | clock_time to ALL with tic| date| clock_time| VOL on tic date clock_time. (of course for this illustration purpose)
It is so weird that my simple SQL merge below create missing value for VOL for some records while VOL has value in the ALL file.
I am not sure if it is because of data issue or anything else, so I upload both source files and the result file as well as my code so you can see.
Many thanks,
HHC
proc import datafile='E:\target.xlsx' replace out=target;run;
proc sql;
create table output as select a.*, vol from target as a left join ALL b
on a.tic=b.tic and a.date=b.date and a.clock_time=b.clock_time
order by tic, date, clock_time;
quit;
data missing; set output;
if vol=.;run;
Print out of missing file. These rows have VOL data in the ALL dataset
Obs | tic | clock_time | date | vol |
---|---|---|---|---|
1 | A | 14:33:00 | 14NOV2023 | . |
2 | A | 14:34:00 | 14NOV2023 | . |
3 | A | 14:35:00 | 14NOV2023 | . |
4 | A | 14:36:00 | 14NOV2023 | . |
5 | A | 14:37:00 | 14NOV2023 | . |
However, if you open the target file, you can see it has 14:33:00.
There is no millisecond shown. But when I use the round function to round everything to minute, the SQL works as expected. Still very weird.
There is no milliseconds shown because of the format used to print the values. They can of course still exist as part of the internal value (stored in a numeric variable as count of seconds since midnight).
Also your Excel source uses potentially a format that doesn't surface fractional seconds. If you export your Excel to a .csv then you should see the values there.
Below sample code to illustrate what I'm talking about.
data work.all;
tic='A';
date='14NOV2023'd;
clock_time='14:33:00't;
format date date9. clock_time time.;
vol=123;
run;
data work.target;
tic='A';
date='14NOV2023'd;
clock_time='14:33:00't;
format date date9. clock_time time.;
output;
clock_time='14:33:00.0000001't;
output;
clock_time='14:33:00.999't;
output;
run;
proc print data=work.all;run;
proc sql;
select
tic
,date
,clock_time format=time.
,clock_time format=time16.7
,clock_time format=best32.
from target
;
quit;
proc sql;
select
a.tic
,a.date
,round(a.clock_time) as clock_time format=time8.
,b.vol
from work.target a left join work.all b
on a.tic=b.tic and a.date=b.date and round(a.clock_time)=round(b.clock_time)
;
quit;
It would be easier to help you, if you could post the data used. No excel-files, of course, but data steps using datalines. You have checked the log for suspicious messages?
You left join table ALL that contains your variable Vol to table Target. It appears that for the keys with a missing Vol there is no matching row in your table ALL.
To test the theory run below code. It should create table test with zero rows.
data test;
set all;
if tic='A' and date='14NOV2023'd and clock_time='14:33:00't;
run;
And to ensure that the row exists with this exact key (like no decimals for date and time) also cross-check for table Target. Here you should get a row selected.
data test_2;
set target;
if tic='A' and date='14NOV2023'd and clock_time='14:33:00't;
run;
I see what you meant.
However, if you open the target file, you can see it has 14:33:00.
There is no millisecond shown. But when I use the round function to round everything to minute, the SQL works as expected. Still very weird.
data target; set target;
clock_time = round(clock_time,60);
data all; set all;
clock_time = round(clock_time,60);
data test_2;
set target;
if tic='A' and date='14NOV2023'd and '14:32:00't<clock_time<'14:34:00't;
run;
Obs | tic | clock_time | date |
---|---|---|---|
1 | A | 14:33:00 | 14NOV2023 |
2 | A | 14:34:00 | 14NOV2023 |
The fact that fractions of seconds are not displayed by the assigned format does not mean they're not there.
Repeat until nauseated: The formatted display value is not the entire value of the underlying variable.
Numeric formats as supplied by SAS typically "round" displayed values to fit in a specified or default width.
If the value is 0.0000000001 and the format is Best12. (default for unassigned numeric variables) you will typically see a displayed value of 1E-10.
Date formats typically ignore decimal portions of the value as at best that would represent a fraction of a day and is not in the rules for display.
Time or datetime will display fractions of a second in as many decimal points as you specify with the format. If there is no such in the format then the fractional seconds are not displayed. And may well be rounded to match specified width.
Custom formats can do some even more amazing things to the values but the match for Join or By values in data step still use the value of the variable, not the formatted value unless you go to some extra steps to force that behavior.
However, if you open the target file, you can see it has 14:33:00.
There is no millisecond shown. But when I use the round function to round everything to minute, the SQL works as expected. Still very weird.
There is no milliseconds shown because of the format used to print the values. They can of course still exist as part of the internal value (stored in a numeric variable as count of seconds since midnight).
Also your Excel source uses potentially a format that doesn't surface fractional seconds. If you export your Excel to a .csv then you should see the values there.
Below sample code to illustrate what I'm talking about.
data work.all;
tic='A';
date='14NOV2023'd;
clock_time='14:33:00't;
format date date9. clock_time time.;
vol=123;
run;
data work.target;
tic='A';
date='14NOV2023'd;
clock_time='14:33:00't;
format date date9. clock_time time.;
output;
clock_time='14:33:00.0000001't;
output;
clock_time='14:33:00.999't;
output;
run;
proc print data=work.all;run;
proc sql;
select
tic
,date
,clock_time format=time.
,clock_time format=time16.7
,clock_time format=best32.
from target
;
quit;
proc sql;
select
a.tic
,a.date
,round(a.clock_time) as clock_time format=time8.
,b.vol
from work.target a left join work.all b
on a.tic=b.tic and a.date=b.date and round(a.clock_time)=round(b.clock_time)
;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.