Hi,
I am a relatively new user of SAS.
I need to merge two MS excel files by date.
However, from the second file (file 2), I need to merge only observations that are within 30 days of each date listed in the first excel file (file 1).
Pls do provide some insight on the SAS code I can use for this. My code merges all data in both files.
%LET AllData=
File1
File2
;
%_import(&AllData);
data TR;
set
File1
File2
;
Ari.
Can you provide some sample data that we can work with? Without knowing your data structure we'd be making wild guesses.
If your data is confidential make fake data that reflects your data.
Sure, Reeza.
File 1:
Date | rr | vv | |
1/1/1990 | 366.12 | 366.12 | 0.011158 |
1/3/1990 | 353.94 | 353.94 | 0.018884 |
1/6/1990 | 359.53 | 359.53 | 0.012618 |
1/7/1990 | 354.88 | 354.88 | 0.014233 |
1/9/1990 | 365.24 | 365.24 | 0.012643 |
1/14/1990 | 369.79 | 369.79 | 0.011378 |
File 2:
Date | aa | bb | ff |
1/5/1990 | 20.11 | 20.11 | 0.04426 |
1/8/1990 | 20.26 | 20.26 | 0.0074 |
1/9/1990 | 22.2 | 22.2 | 0.08739 |
1/10/1990 | 22.44 | 22.44 | 0.0107 |
1/11/1990 | 20.05 | 20.05 | -0.1192 |
1/16/1990 | 24.18 | 24.18 | -0.0893 |
1/17/1990 | 24.16 | 24.16 | -0.0008 |
1/18/1990 | 24.34 | 24.34 | 0.0074 |
1/19/1990 | 22.5 | 22.5 | -0.0818 |
1/22/1990 | 26.7 | 26.7 | 0.1573 |
1/20/1990 | 27.25 | 27.25 | 0.02972 |
2/11/1990 | 25.36 | 25.36 | -0.0745 |
2/17/1990 | 24.87 | 24.87 | -0.0197 |
2/22/1990 | 24.32 | 24.32 | -0.0226 |
2/25/1990 | 24.54 | 24.54 | 0.00897 |
2/26/1990 | 24.69 | 24.69 | 0.00608 |
3/27/1990 | 24.29 | 24.29 | -0.0165 |
I want to merge only observations from 'file 2' that are within 30 days of each date listed in 'File 1'.
The merged file should look like:
Date | rr | vv | aa | bb | ff | |
1/1/1990 | 366.12 | 366.12 | 0.011158 | |||
1/3/1990 | 353.94 | 353.94 | 0.018884 | |||
1/5/1990 | 20.110001 | 20.110001 | 0.04425669 | |||
1/6/1990 | 359.53 | 359.53 | 0.012618 | |||
1/7/1990 | 354.88 | 354.88 | 0.014233 | |||
1/8/1990 | 20.26 | 20.26 | 0.0074037 | |||
1/9/1990 | 365.24 | 365.24 | 0.012643 | |||
1/9/1990 | 22.200001 | 22.200001 | 0.08738743 | |||
1/10/1990 | 22.440001 | 22.440001 | 0.01069519 | |||
1/11/1990 | 20.049999 | 20.049999 | -0.1192021 | |||
1/16/1990 | 369.79 | 369.79 | 0.011378 | |||
1/17/1990 | 24.16 | 24.16 | -0.0008278 | |||
1/18/1990 | 24.34 | 24.34 | 0.00739523 | |||
1/19/1990 | 22.5 | 22.5 | -0.0817778 | |||
1/22/1990 | 26.700001 | 26.700001 | 0.1573034 | |||
1/20/1990 | 27.25 | 27.25 | 0.02972473 | |||
2/11/1990 | 25.360001 | 25.360001 | -0.0745268 |
Why does file1 have a date=01/14/1990, but the wanted file does not?
1) show the code you used that doesn't do what you want so we don't recommend that.
2) provide some example data from each set NOT as spread sheets but as SAS data step code: Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
3) Show what the results should be for your given example data (just enough records to show the different cases.
Details are needed because we can't tell if only want one record from file 2 matched to one record from file 1 or potentially multiples of each.
You may try next code:
data all_files;
set file1 file2 indsname=_file_in;
input_file = _file_in;
run;
proc sort data=all_files; by date input_file; run;
data want;
retain lag_file lag_date phase;
lag_file = lag(input_file);
if lag_file = 'FILE1' then do;
pase = 1;
lag_date = date;
end;
else phase =2;
if phase=1 or
_N_ > 1 and phase = 2 and date - lag_date le 30;
run;
Thank you, Shumel.
Unfortunately, this code did not read all my data properly.
I used this:
1) According to input you posted, the date variable is named date.
Where from did you get date1 date2 in your code ?
data file1;
infile cards dlm='09'x truncover;
input date mmddyy10. qq rr vv ;
format date yymmdd10.;
cards;
01/01/1990 366.12 366.12 0.011158
01/03/1990 353.94 353.94 0.018884
01/06/1990 359.53 359.53 0.012618
01/07/1990 354.88 354.88 0.014233
01/09/1990 365.24 365.24 0.012643
01/14/1990 369.79 369.79 0.011378
; run;
data file2;
infile cards dlm='09'x truncover;
input date mmddyy10. aa bb ff;
cards;
01/05/1990 20.11 20.11 0.04426
01/08/1990 20.26 20.26 0.0074
01/09/1990 22.2 22.2 0.08739
01/10/1990 22.44 22.44 0.0107
01/11/1990 20.05 20.05 -0.1192
01/16/1990 24.18 24.18 -0.0893
01/17/1990 24.16 24.16 -0.0008
01/18/1990 24.34 24.34 0.0074
01/19/1990 22.5 22.5 -0.0818
01/22/1990 26.7 26.7 0.1573
01/20/1990 27.25 27.25 0.02972
02/11/1990 25.36 25.36 -0.0745
02/17/1990 24.87 24.87 -0.0197
02/22/1990 24.32 24.32 -0.0226
02/25/1990 24.54 24.54 0.00897
02/26/1990 24.69 24.69 0.00608
03/27/1990 24.29 24.29 -0.0165
; run;
/**** YOUR CODE *****
proc sql noprint;
create table file3 as
select * from file1 as a, file2 as b
where a.date-30 <= b.date <= a.date+30;
quit;
******************/
data all_files;
set file1 file2 indsname=_file_in;
input_file = _file_in;
run;
proc sort data=all_files; by date input_file; run;
data want;
set all_files;
retain lag_file lag_date phase;
lag_file = lag(input_file);
if lag_file = 'WORK.FILE1' then do; /* this line was fixed */
pase = 1;
lag_date = date;
end;
else phase =2;
if phase=1 or _N_ > 1 and phase = 2 and date - lag_date le 30;
run;
Sorry for being late to the tread but, unless I missed something, I don't think you have gotten suggested code that meets your specs.
If I correctly understand what you want, wouldn't something like the following suffice?:
proc sql noprint; select min(date) into :mindate from file1 ; quit; data want; set file1 file2 (where=(&mindate. <= date <= (&mindate.+30))); run;
Art, CEO, AnalystFinder.com
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.