BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JoeBill
Fluorite | Level 6

 Hi there! 

 

I am trying to combine two rows into one, taking only some information from both rows. 

 

I have the following hypothetically financial dataset: 

 

Date, BM, Asset_Class, Weight, Return

24-11-2015, BM1, Stocks, 30%, 2%

24-11-2015, BM2, Stocks, 34%, 5%

 

I would like to create only one row, where I take the Date, Asset_Class and the weight from BM1 and the return variable from BM2. The new row will still have to be named named BM1. 

 

Is that possible? 

 

Best, Joe. 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Again, assuming that variable Date is the link between the obs. with BM='BM1' and the unique corresponding obs. with BM='BM2', the following code should work:

data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
23-11-2015, BM1, Stocks, 30%, 2.0%
24-11-2015, BM1, Stocks, 30%, 1.5%
25-11-2015, BM1, Stocks, 30%, 1.2%
23-11-2015, BM2, Stocks, 23%, 3.1%
24-11-2015, BM2, Stocks, 26%, 0.2%
25-11-2015, BM2, Stocks, 23%, 4.5%
23-11-2015, BM1, Bonds, 15%, 3.1%
24-11-2015, BM1, Bonds, 15%, 0.2%
25-11-2015, BM1, Bonds, 15%, 4.5%
23-11-2015, BM2, Bonds, 17%, 2.0%
24-11-2015, BM2, Bonds, 18%, 1.5%
25-11-2015, BM2, Bonds, 17%, 1.2%
;

proc sql;
create table want(drop=Return rename=(NewReturn=Return)) as
select *, case when Asset_Class='Stocks'
               then (select Return from have where Asset_Class='Stocks' & Date=a.Date & BM='BM2')
               else Return
          end as NewReturn
from have a
where BM='BM1';
quit;

View solution in original post

6 REPLIES 6
FreelanceReinh
Jade | Level 19

Here's how you could do this in Base SAS:

data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
;

proc sql;
create table want as
select *, (select Return from have where Date=a.Date & BM='BM2') as Return_BM2
from have a
where BM='BM1';
quit;

This assumes that variable DATE is the identifier to link associated BM1 and BM2 records. I've read Weight and Return as character variables just for simplification. In practice, of course, these should be numeric variables (cf. PERCENTw.d informat). Furthermore, I assume that Return_BM2 is to be displayed for comparison, not as a replacement of the original value of Return for BM1. If you want to replace the original Return value, you can simply delete the "_BM2" suffix from "Return_BM2" and add a (drop=Return) dataset option after the second "have" in the PROC SQL step.

 

Astounding
PROC Star

How reliable is your incoming data?  Do you always have raw data that includes the BM1 record followed by the matching BM2 record?  If that is a reliable pattern, you could code:

 

data want;

infile rawdata dlm=',' dsd truncover;

length dummy $ 1;

informat date ddmmyy10.;

input date asset $ class $ weight $ / dummy dummy dummy dummy return $;

format date ddmmyy10.;

drop dummy;

run;

Patrick
Opal | Level 21

With SAS datasets something like below could work:

data have;
  infile cards dlm=',';
  input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
  format Date ddmmyyd10.;
  cards;
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
;

proc sql;
  create table want as
    select 
      l.bm,l.date, l.asset_class, l.weight, r.return
    from  
      have(where=(bm='BM1')) l 
      left join 
      have(where=(bm='BM2')) r
    on l.date=r.date and l.asset_class=r.asset_class
    ;
quit;
jnvickery
Obsidian | Level 7

Hi,

 

Here's another method. It assumes that you always want the value from the next record.

 

data have;

   input Date : ddmmyy10.

   BM : $3.

   Asset_Class : $10.

   Weight : percent8.

   Return : percent8. ;

format date date9.;

datalines;
24-11-2015 BM1 Stocks 30% 2%

24-11-2015 BM2 Stocks 34% 5% ;

run;

data want;
   set have ;
   if eof1=0 then
      set have(firstobs=2 keep=return rename=(return=leadReturn)) end=eof1;
   else leadReturn = .;
run; 

 

Take a look at this paper from Mark Keintz - http://www.lexjansen.com/nesug/nesug13/113_Final_Paper.pdf

 

- John

JoeBill
Fluorite | Level 6

Hi! 

 

Thanks a lot for all the suggestions for solutions, it is really helpfull. I will try them out and get back to you asap. 

 

My biggest concern - and maybe I was not specific enough - is that it is only some of the data that needs to be mended.   

 

 

So my data set looks something like this: 

 

Date,             BM,  Asset_Class, Weight, Return

23-11-2015, BM1, Stocks, 30%, 2,0%

24-11-2015, BM1, Stocks, 30%, 1,5%

25-11-2015, BM1, Stocks, 30%, 1,2%

23-11-2015, BM2, Stocks, 23%, 3,1%

24-11-2015, BM2, Stocks, 26%, 0,2%

25-11-2015, BM2, Stocks, 23%, 4,5%

23-11-2015, BM1, Bonds, 15%, 3,1%

24-11-2015, BM1, Bonds, 15%, 0,2%

 

25-11-2015, BM1, Bonds, 15%, 4,5%

23-11-2015, BM2, Bonds, 17%, 2,0%

24-11-2015, BM2, Bonds, 18%, 1,5%

25-11-2015, BM2, Bonds, 17%, 1,2%

 

So if the asset class equals stocks, then I would like to mend the rows so I use everything but the return from BM1, and only the return coloum from BM2. Afterwards, every row with BM2 should be deleted from the data set. For Bonds (and all other asset classes) I would just like to delete the rows containing BM2 (not mending the return column).

 

Thanks a lot again!

 

- Joe.  

 

FreelanceReinh
Jade | Level 19

Again, assuming that variable Date is the link between the obs. with BM='BM1' and the unique corresponding obs. with BM='BM2', the following code should work:

data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
23-11-2015, BM1, Stocks, 30%, 2.0%
24-11-2015, BM1, Stocks, 30%, 1.5%
25-11-2015, BM1, Stocks, 30%, 1.2%
23-11-2015, BM2, Stocks, 23%, 3.1%
24-11-2015, BM2, Stocks, 26%, 0.2%
25-11-2015, BM2, Stocks, 23%, 4.5%
23-11-2015, BM1, Bonds, 15%, 3.1%
24-11-2015, BM1, Bonds, 15%, 0.2%
25-11-2015, BM1, Bonds, 15%, 4.5%
23-11-2015, BM2, Bonds, 17%, 2.0%
24-11-2015, BM2, Bonds, 18%, 1.5%
25-11-2015, BM2, Bonds, 17%, 1.2%
;

proc sql;
create table want(drop=Return rename=(NewReturn=Return)) as
select *, case when Asset_Class='Stocks'
               then (select Return from have where Asset_Class='Stocks' & Date=a.Date & BM='BM2')
               else Return
          end as NewReturn
from have a
where BM='BM1';
quit;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2602 views
  • 5 likes
  • 5 in conversation