04-22-2017 12:23 PM
Hi I have two data that are collected with same time range. However, after deleting the missing value, the two data gave me different date and left me a question about how to matching. Here is the example of the data after deleting missing value.
My question is how to merge these two csv files into one combined file with the date is matched?
Here is the sas code I have so far:
PROC IMPORT OUT= work.CAD DATAFILE= "G:\one_drive\OneDrive\MSCM5P04\A5\FRB_FEX_dailyrate.csv" DBMS=CSV REPLACE; GETNAMES=YES; DATAROW=2; RUN; * Import canadian dollar data; data work.CAD; * Delete missing value; set work.CAD; if CAD= "ND" then CAD = 0; if CAD = 0 then Delete; Rename Time_Period = Date; run; Proc Export data=work.CAD Outfile= "G:\one_drive\OneDrive\MSCM5P04\A5\CAD_filter.csv" DBMS=csv replace; Putname=yes; run; * Export canadian dollar data; PROC IMPORT OUT= work.Oil DATAFILE= "G:\one_drive\OneDrive\MSCM5P04\A5\Crude_oil_price_investing_csv.csv" DBMS=CSV REPLACE; GETNAMES=YES; DATAROW=2; RUN;
Any suggestion about merging or matching is appreciated, thank you in advance.
04-22-2017 01:10 PM
Not sure why you are both importing and exporting the files, but that isn't relevant to your question.
Also not relevant to your question, a variable in SAS is either character or numeric. Since your CAD variable is character, you can't assign numbers to it. You would have to create a new variable using the input function to convert the dollars into numbers.
However, to merge the two files, you could use either a data step or proc sql.
Here is an example using a data step:
data want; merge cad (in=ina) oil (in=inb); by date; if ina and inb; run;
Art, CEO, AnalystFinder.com
04-22-2017 03:57 PM
Hi art297, thank you for replying, the code I attached is not relevant with the quesiton, it is the code for basic data cleaning.
The question is I don't know if I should match the two data with the date and fill the missing date with 0. Or delete very unmatched date.
Speaking of CAD, it is the exchange rate so not the dollar value. Sorry for not stating the data clearly. What should the step be in terms of dealing with exchange rate (or, in another word, converting exchange rate into number)?
04-22-2017 11:04 PM
"should match the two data with the date and fill the missing date with 0. Or delete very unmatched date"
That's a business logic and not a technical question.
If I understand your problem right then you're having a table with prices in a currency and a table with exchange rates for this currency to another currency and you need to create a new variable which is "price currency 2" = "price currency 1" * "exchange rate"
If above assumption is correct then you need of course an exchange rate for every single date where you have a price in currency 1.
In my mind exchange rates get published so my first advice would be: Get better data! If that's not possible then you need to get clarity what business rules to apply for dates with a missing exchange rate; i.e. should you use the available exchange rate closest to the missing date or always the last previous non-missing date or.... That's something your business users/report consumers need to tell you (if that's not you yourself).
Once you've got the business logic sorted I then would write the code to fill in the blanks in your exchange rate table so that you've got a row for every single day - and then use a data step merge or SQL join as already proposed.
With SQL you could also join using date ranges (=match to last previous available date). There are already multiple threads in this forum demonstrating how that can be done.
04-23-2017 12:26 PM - edited 04-23-2017 12:27 PM
Hi Patrick, thank you for replying.
I was trying to match the exchange rate data with oil return data. In general oil return data is less than exchange rate data in daily basis so I don't know whether to fill 0 in oil data in order to match exchange rate data.
For the merging part, I was stilling trying to figure it out. I tried the solutions provided under this question but seems does not work very well.
04-22-2017 01:12 PM
Not sure what CSV files have to do with the quesiton on merge files.
If you want to merge files by date I suggest merging the files by date.
data want ; merge file1 file2 ; by date; run;
04-22-2017 06:08 PM
@Xusheng: The code I suggested was for matching the two files and, by using the in=ina and in=ib options, eliminating any dates that weren't present in both files.
Your subsequent question regarding how to convert character dates into SAS dates, and conversion rates into numeric fields, can also be used to delete records that don't have valid rates. e.g.:
data cad; informat date $10.; informat cad $6.; input date cad; cards; 2004-01-02 1.29 2004-01-05 1.2803 2004-01-06 1.2824 2004-01-07 1.2824 2004-01-08 1.279 2004-01-09 1.269 ND . 2004-01-12 1.2735 2004-01-13 1.2729 ; data want_cad (drop=_:); format date date9.; set cad (rename=(date=_date cad=_cad)); date=input(_date, ?? anydtdte11.); cad=input(_cad,8.); if not missing(cad); run;
Art, CEO, AnalystFinder.com