How do I compare variables in subsets of my data based on date?

Reply
New Contributor
Posts: 3

How do I compare variables in subsets of my data based on date?

Hello!

I have data that has 3 variables-- month, ID, and value-- for stock holdings in a portfolio. There are repeats of IDs as the stock could have been held for multiple months. I am trying to how the portfolio changed for a given month compared to the month after.

 

I want it to compare all the IDs for a given month with all the IDs for the month afterwards (so given month + 1). Then, if there are IDs that are in one month but not the other (buying or selling of a stock), I want to record those ID numbers. Additionally, if the ID number shows up in both of the months, I want to compare the value for the ID and record that. 

 

For example, if this was my data: 

1/30/2016  US1   20

1/30/2016  US2   34

2/28/2016  US1   10

2/28/2016  US3    7

 

I would want to know that US1 changed from 20 to 10, US2 was no longer held by 2/28/2016 and US3 was bought in 2/28/2016. 

 

I don't have a lot of SAS experience but I tried using proc compare. I don't know how to get it compare variables ONLY when the months are 1 month apart. Is there a better way to approach this? I also tried doing a while loop with a when statement but I could not figure out how to get the IDs to compare only when there is a month difference. I am using SAS 9.4. 

 

Thank you so much in advance! 

Super Contributor
Posts: 269

Re: How do I compare variables in subsets of my data based on date?

I have written code to have better control. I have used two macro m1 and m2 to supply year and month value. m1 would be base year and base month and m2 would be compare year and compare month. The final output would be stored in _Result dataset.

 

data have;
input v_date mmddyy10. v_id $ v_qty;
format v_date date9.;
datalines;
1/30/2016  US1   20
1/30/2016  US2   34
2/28/2016  US1   10
2/28/2016  US3    7
;

%let m1=201601;
%let m2=201602;



data _base_mth _cmp_mth;
set have;
start_month=MOD(&m1,100);
end_month=MOD(&m2,100);

start_year=int(&m1 / 100);
end_year=int(&m2 / 100);

t_mth=month(v_date);
t_yr=year(v_date);
if t_mth = start_month and t_yr= start_year then output _base_mth;
if t_mth = end_month and t_yr= end_year then output _cmp_mth;
run;

proc sort data=_base_mth ; by v_id; run;
proc sort data=_cmp_mth ; by v_id; run;

data _Result(keep= v_id src_qty  cmp_qty status) ;
merge _base_mth(in=a rename= v_qty =src_qty) _cmp_mth(in=b rename= v_qty =cmp_qty);
by v_id;
length status $20;
if a and not b then do;
status ="Deleted";
output _Result;
end;
if not a and b then do;
status ="Newly Added";
output _Result;
end;
 if a and b then do; 
 			if src_qty ne cmp_qty then do; 
 			status ="Updated"; 
 			output _Result; 
 			end; 
 			if src_qty eq cmp_qty then  do; 
 			status ="No change"; 
 			output _Result; 
 			end; 
 	end; 
run;
Esteemed Advisor
Posts: 5,129

Re: How do I compare variables in subsets of my data based on date?

Assuming your months are identified by the last day of the month:

 

data have;
input m :mmddyy. id :$8. value;
format m yymm.;
datalines;
1/31/2016  US1   20
1/31/2016  US2   34
2/29/2016  US1   10
2/29/2016  US3    7
;

proc sql;
create table sold as
select 
    id, 
    intnx("MONTH", a.m, 1, "END") as m format=yymm.,
    value
from have as a 
where 
    m < (select max(m) from have) and 
    id not in (select id from have where m = intnx("MONTH", a.m, 1, "END"));

create table bought as
select a.*
from have as a 
where a.id not in (select id from have where m = intnx("MONTH", a.m, -1, "END"));

create table kept as
select 
    a.id, 
    a.m, 
    a.value as fromValue, 
    b.value as toValue
from 
    have as a inner join
    have as b on a.id=b.id and a.m=intnx("MONTH", b.m, 1, "END");

create table want as 
select "Bought" as action, * from bought
outer union corr
select "Sold" as action, * from sold
outer union corr
select "Kept" as action, * from kept
order by m, action;

select * from want;
 
quit;
PG
New Contributor
Posts: 3

Re: How do I compare variables in subsets of my data based on date?

Hi! I am going through your code and have a quick question. What is a.m in the INTNX function? Is that a variable that you made?

 

Thank you so much in advance! 

Trusted Advisor
Posts: 1,683

Re: How do I compare variables in subsets of my data based on date?

What about geting a report by proc tabulate:

 

proc tabulate data=have;
  class ID Month;
  var value;
  table ID, Month='Month' * Value * sum=' ';
  format month yymms7.;
run;

You wil see in the report - eache ID when it starts, when it ends and the value in each month.

Super Contributor
Posts: 269

Re: How do I compare variables in subsets of my data based on date?

Very smart use of tabulate.

Super User
Posts: 10,323

Re: How do I compare variables in subsets of my data based on date?

You did not post the output yet .


data have;
input m :mmddyy. id :$8. value;
dummy=intnx('month',m,0);
format dummy yymmdd10.;
format m yymm.;
datalines;
1/31/2016  US1   20
1/31/2016  US2   34
2/29/2016  US1   10
2/29/2016  US3    7
;

proc sql;
select *,(select value from have where id=a.id and dummy=intnx('month',a.dummy,1)) 
 as next_month_value
 from have as a;
quit;

New Contributor
Posts: 3

Re: How do I compare variables in subsets of my data based on date?

Wow! Thank you all for your help!! I appreciate it so much! 

Ask a Question
Discussion stats
  • 7 replies
  • 242 views
  • 1 like
  • 5 in conversation