Hoi, Can someone please help me, how can I merge this two file by matching the ID and the year (not the the whole date)? I don't know how i can take only the year part of the date.
File 1:
ID Date value1
File 2:
ID N Date value2
So I would have something like:
ID Date N Value1 value2
...
029948 12/31/1998 100001 0.070933 0.006000
029948 | 01/29/1999 100001 | 0.048564 | 0.006000 |
....
029948 12/31/1999 100001 0.053090 0.006000
029948 | 01/31/2000 100001 | -0.050745 | 0.005200 |
thank you very much!!!
YEAR is a function that you apply to a variable. So the syntax for your where clause should be
year(a.date) = year(b.date)
The syntax that you tried looks more like a method call for an object oriented programming language like PHP.
Use the year() function
I have try this but it did not work
proc sql;
create table want as
select *
from file1 as a left join
file2 as b
on a.id=b.id and a.year(date)= b.year(date);
quit;
YEAR is a function that you apply to a variable. So the syntax for your where clause should be
year(a.date) = year(b.date)
The syntax that you tried looks more like a method call for an object oriented programming language like PHP.
thanks, it works. may i have one additional question. if i want it also to take the first (year) available value or the latest available value. eg. above, 1998 is not available in file2 but it is available in file1 so i want to match all the observations in 1998 from file1 with the first (year 1999) in file 2 how can i do that?
You could try: (not tested)
on a.id=b.id
group by a.id, a.year(date)
having abs(a.year(date)-b.year(date))=min(abs(a.year(date)-b.year(date)) );
Haikuo
i have something like this
proc sql;
create table want as
select a.*, b.value2
from file1 as a left join
file2 as b
on a.id=b.id
group by a.id, year(a.date)
having abs(year(a.date)-year(b.date))= min(year(b.date)-year(a.date)) ;
quit;
but then all the data from 2000 onwards is exclude.
Hi,
If this is exact what you have run, then you need an additional abs(), like mentioned in my previous post:
having abs(a.year(date)-b.year(date))=min(abs(a.year(date)-b.year(date)) );
Good luck,
Haikuo
yes, thank you.
do you may know why i get more observation than the original file which i do not expect?
I can guess. As in general, if you get more obs than the master table after left join, it usually means you have more than one match per id, year(date). One way to find out how and why is to run something like the following query:
select * from want group by id, year(date) having count(*)>1;
Haikuo
thank you!
In addition to what Tom said, if you want to avoid the notes (and possibly assigning the wrong dates), change your select statement. i.e.:
proc sql;
create table want as
select a.*,b.value2
from file1 as a left join
file2 as b
on a.id=b.id and year(a.date)= year(b.date);
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.