Hi there!
I am trying to combine two rows into one, taking only some information from both rows.
I have the following hypothetically financial dataset:
Date, BM, Asset_Class, Weight, Return
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
I would like to create only one row, where I take the Date, Asset_Class and the weight from BM1 and the return variable from BM2. The new row will still have to be named named BM1.
Is that possible?
Best, Joe.
Again, assuming that variable Date is the link between the obs. with BM='BM1' and the unique corresponding obs. with BM='BM2', the following code should work:
data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
23-11-2015, BM1, Stocks, 30%, 2.0%
24-11-2015, BM1, Stocks, 30%, 1.5%
25-11-2015, BM1, Stocks, 30%, 1.2%
23-11-2015, BM2, Stocks, 23%, 3.1%
24-11-2015, BM2, Stocks, 26%, 0.2%
25-11-2015, BM2, Stocks, 23%, 4.5%
23-11-2015, BM1, Bonds, 15%, 3.1%
24-11-2015, BM1, Bonds, 15%, 0.2%
25-11-2015, BM1, Bonds, 15%, 4.5%
23-11-2015, BM2, Bonds, 17%, 2.0%
24-11-2015, BM2, Bonds, 18%, 1.5%
25-11-2015, BM2, Bonds, 17%, 1.2%
;
proc sql;
create table want(drop=Return rename=(NewReturn=Return)) as
select *, case when Asset_Class='Stocks'
then (select Return from have where Asset_Class='Stocks' & Date=a.Date & BM='BM2')
else Return
end as NewReturn
from have a
where BM='BM1';
quit;
Here's how you could do this in Base SAS:
data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
;
proc sql;
create table want as
select *, (select Return from have where Date=a.Date & BM='BM2') as Return_BM2
from have a
where BM='BM1';
quit;
This assumes that variable DATE is the identifier to link associated BM1 and BM2 records. I've read Weight and Return as character variables just for simplification. In practice, of course, these should be numeric variables (cf. PERCENTw.d informat). Furthermore, I assume that Return_BM2 is to be displayed for comparison, not as a replacement of the original value of Return for BM1. If you want to replace the original Return value, you can simply delete the "_BM2" suffix from "Return_BM2" and add a (drop=Return) dataset option after the second "have" in the PROC SQL step.
How reliable is your incoming data? Do you always have raw data that includes the BM1 record followed by the matching BM2 record? If that is a reliable pattern, you could code:
data want;
infile rawdata dlm=',' dsd truncover;
length dummy $ 1;
informat date ddmmyy10.;
input date asset $ class $ weight $ / dummy dummy dummy dummy return $;
format date ddmmyy10.;
drop dummy;
run;
With SAS datasets something like below could work:
data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
;
proc sql;
create table want as
select
l.bm,l.date, l.asset_class, l.weight, r.return
from
have(where=(bm='BM1')) l
left join
have(where=(bm='BM2')) r
on l.date=r.date and l.asset_class=r.asset_class
;
quit;
Hi,
Here's another method. It assumes that you always want the value from the next record.
data have;
input Date : ddmmyy10.
BM : $3.
Asset_Class : $10.
Weight : percent8.
Return : percent8. ;
format date date9.;
datalines;
24-11-2015 BM1 Stocks 30% 2%
24-11-2015 BM2 Stocks 34% 5% ;
run;
data want;
set have ;
if eof1=0 then
set have(firstobs=2 keep=return rename=(return=leadReturn)) end=eof1;
else leadReturn = .;
run;
Take a look at this paper from Mark Keintz - http://www.lexjansen.com/nesug/nesug13/113_Final_Paper.pdf
- John
Hi!
Thanks a lot for all the suggestions for solutions, it is really helpfull. I will try them out and get back to you asap.
My biggest concern - and maybe I was not specific enough - is that it is only some of the data that needs to be mended.
So my data set looks something like this:
Date, BM, Asset_Class, Weight, Return
23-11-2015, BM1, Stocks, 30%, 2,0%
24-11-2015, BM1, Stocks, 30%, 1,5%
25-11-2015, BM1, Stocks, 30%, 1,2%
23-11-2015, BM2, Stocks, 23%, 3,1%
24-11-2015, BM2, Stocks, 26%, 0,2%
25-11-2015, BM2, Stocks, 23%, 4,5%
23-11-2015, BM1, Bonds, 15%, 3,1%
24-11-2015, BM1, Bonds, 15%, 0,2%
25-11-2015, BM1, Bonds, 15%, 4,5%
23-11-2015, BM2, Bonds, 17%, 2,0%
24-11-2015, BM2, Bonds, 18%, 1,5%
25-11-2015, BM2, Bonds, 17%, 1,2%
So if the asset class equals stocks, then I would like to mend the rows so I use everything but the return from BM1, and only the return coloum from BM2. Afterwards, every row with BM2 should be deleted from the data set. For Bonds (and all other asset classes) I would just like to delete the rows containing BM2 (not mending the return column).
Thanks a lot again!
- Joe.
Again, assuming that variable Date is the link between the obs. with BM='BM1' and the unique corresponding obs. with BM='BM2', the following code should work:
data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
23-11-2015, BM1, Stocks, 30%, 2.0%
24-11-2015, BM1, Stocks, 30%, 1.5%
25-11-2015, BM1, Stocks, 30%, 1.2%
23-11-2015, BM2, Stocks, 23%, 3.1%
24-11-2015, BM2, Stocks, 26%, 0.2%
25-11-2015, BM2, Stocks, 23%, 4.5%
23-11-2015, BM1, Bonds, 15%, 3.1%
24-11-2015, BM1, Bonds, 15%, 0.2%
25-11-2015, BM1, Bonds, 15%, 4.5%
23-11-2015, BM2, Bonds, 17%, 2.0%
24-11-2015, BM2, Bonds, 18%, 1.5%
25-11-2015, BM2, Bonds, 17%, 1.2%
;
proc sql;
create table want(drop=Return rename=(NewReturn=Return)) as
select *, case when Asset_Class='Stocks'
then (select Return from have where Asset_Class='Stocks' & Date=a.Date & BM='BM2')
else Return
end as NewReturn
from have a
where BM='BM1';
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.