BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JoeBill
Fluorite | Level 6

 Hi there! 

 

I am trying to combine two rows into one, taking only some information from both rows. 

 

I have the following hypothetically financial dataset: 

 

Date, BM, Asset_Class, Weight, Return

24-11-2015, BM1, Stocks, 30%, 2%

24-11-2015, BM2, Stocks, 34%, 5%

 

I would like to create only one row, where I take the Date, Asset_Class and the weight from BM1 and the return variable from BM2. The new row will still have to be named named BM1. 

 

Is that possible? 

 

Best, Joe. 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Again, assuming that variable Date is the link between the obs. with BM='BM1' and the unique corresponding obs. with BM='BM2', the following code should work:

data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
23-11-2015, BM1, Stocks, 30%, 2.0%
24-11-2015, BM1, Stocks, 30%, 1.5%
25-11-2015, BM1, Stocks, 30%, 1.2%
23-11-2015, BM2, Stocks, 23%, 3.1%
24-11-2015, BM2, Stocks, 26%, 0.2%
25-11-2015, BM2, Stocks, 23%, 4.5%
23-11-2015, BM1, Bonds, 15%, 3.1%
24-11-2015, BM1, Bonds, 15%, 0.2%
25-11-2015, BM1, Bonds, 15%, 4.5%
23-11-2015, BM2, Bonds, 17%, 2.0%
24-11-2015, BM2, Bonds, 18%, 1.5%
25-11-2015, BM2, Bonds, 17%, 1.2%
;

proc sql;
create table want(drop=Return rename=(NewReturn=Return)) as
select *, case when Asset_Class='Stocks'
               then (select Return from have where Asset_Class='Stocks' & Date=a.Date & BM='BM2')
               else Return
          end as NewReturn
from have a
where BM='BM1';
quit;

View solution in original post

6 REPLIES 6
FreelanceReinh
Jade | Level 19

Here's how you could do this in Base SAS:

data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
;

proc sql;
create table want as
select *, (select Return from have where Date=a.Date & BM='BM2') as Return_BM2
from have a
where BM='BM1';
quit;

This assumes that variable DATE is the identifier to link associated BM1 and BM2 records. I've read Weight and Return as character variables just for simplification. In practice, of course, these should be numeric variables (cf. PERCENTw.d informat). Furthermore, I assume that Return_BM2 is to be displayed for comparison, not as a replacement of the original value of Return for BM1. If you want to replace the original Return value, you can simply delete the "_BM2" suffix from "Return_BM2" and add a (drop=Return) dataset option after the second "have" in the PROC SQL step.

 

Astounding
PROC Star

How reliable is your incoming data?  Do you always have raw data that includes the BM1 record followed by the matching BM2 record?  If that is a reliable pattern, you could code:

 

data want;

infile rawdata dlm=',' dsd truncover;

length dummy $ 1;

informat date ddmmyy10.;

input date asset $ class $ weight $ / dummy dummy dummy dummy return $;

format date ddmmyy10.;

drop dummy;

run;

Patrick
Opal | Level 21

With SAS datasets something like below could work:

data have;
  infile cards dlm=',';
  input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
  format Date ddmmyyd10.;
  cards;
24-11-2015, BM1, Stocks, 30%, 2%
24-11-2015, BM2, Stocks, 34%, 5%
;

proc sql;
  create table want as
    select 
      l.bm,l.date, l.asset_class, l.weight, r.return
    from  
      have(where=(bm='BM1')) l 
      left join 
      have(where=(bm='BM2')) r
    on l.date=r.date and l.asset_class=r.asset_class
    ;
quit;
jnvickery
Obsidian | Level 7

Hi,

 

Here's another method. It assumes that you always want the value from the next record.

 

data have;

   input Date : ddmmyy10.

   BM : $3.

   Asset_Class : $10.

   Weight : percent8.

   Return : percent8. ;

format date date9.;

datalines;
24-11-2015 BM1 Stocks 30% 2%

24-11-2015 BM2 Stocks 34% 5% ;

run;

data want;
   set have ;
   if eof1=0 then
      set have(firstobs=2 keep=return rename=(return=leadReturn)) end=eof1;
   else leadReturn = .;
run; 

 

Take a look at this paper from Mark Keintz - http://www.lexjansen.com/nesug/nesug13/113_Final_Paper.pdf

 

- John

JoeBill
Fluorite | Level 6

Hi! 

 

Thanks a lot for all the suggestions for solutions, it is really helpfull. I will try them out and get back to you asap. 

 

My biggest concern - and maybe I was not specific enough - is that it is only some of the data that needs to be mended.   

 

 

So my data set looks something like this: 

 

Date,             BM,  Asset_Class, Weight, Return

23-11-2015, BM1, Stocks, 30%, 2,0%

24-11-2015, BM1, Stocks, 30%, 1,5%

25-11-2015, BM1, Stocks, 30%, 1,2%

23-11-2015, BM2, Stocks, 23%, 3,1%

24-11-2015, BM2, Stocks, 26%, 0,2%

25-11-2015, BM2, Stocks, 23%, 4,5%

23-11-2015, BM1, Bonds, 15%, 3,1%

24-11-2015, BM1, Bonds, 15%, 0,2%

 

25-11-2015, BM1, Bonds, 15%, 4,5%

23-11-2015, BM2, Bonds, 17%, 2,0%

24-11-2015, BM2, Bonds, 18%, 1,5%

25-11-2015, BM2, Bonds, 17%, 1,2%

 

So if the asset class equals stocks, then I would like to mend the rows so I use everything but the return from BM1, and only the return coloum from BM2. Afterwards, every row with BM2 should be deleted from the data set. For Bonds (and all other asset classes) I would just like to delete the rows containing BM2 (not mending the return column).

 

Thanks a lot again!

 

- Joe.  

 

FreelanceReinh
Jade | Level 19

Again, assuming that variable Date is the link between the obs. with BM='BM1' and the unique corresponding obs. with BM='BM2', the following code should work:

data have;
infile cards dlm=',';
input Date :ddmmyy10. BM $ Asset_Class $ Weight $ Return $;
format Date ddmmyyd10.;
cards;
23-11-2015, BM1, Stocks, 30%, 2.0%
24-11-2015, BM1, Stocks, 30%, 1.5%
25-11-2015, BM1, Stocks, 30%, 1.2%
23-11-2015, BM2, Stocks, 23%, 3.1%
24-11-2015, BM2, Stocks, 26%, 0.2%
25-11-2015, BM2, Stocks, 23%, 4.5%
23-11-2015, BM1, Bonds, 15%, 3.1%
24-11-2015, BM1, Bonds, 15%, 0.2%
25-11-2015, BM1, Bonds, 15%, 4.5%
23-11-2015, BM2, Bonds, 17%, 2.0%
24-11-2015, BM2, Bonds, 18%, 1.5%
25-11-2015, BM2, Bonds, 17%, 1.2%
;

proc sql;
create table want(drop=Return rename=(NewReturn=Return)) as
select *, case when Asset_Class='Stocks'
               then (select Return from have where Asset_Class='Stocks' & Date=a.Date & BM='BM2')
               else Return
          end as NewReturn
from have a
where BM='BM1';
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1422 views
  • 5 likes
  • 5 in conversation