Hi Everyone,
I have following dataset:
ISIN | tar_event | hcp | event_month |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0.01924 | -2 |
FI0009000012 | 11/09/1995 | 0.04888 | -2 |
FI0009000012 | 11/09/1995 | -0.00481 | -2 |
FI0009000012 | 11/09/1995 | 0.11815 | -2 |
FI0009000012 | 11/09/1995 | 0.00192 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | -0.0077 | -2 |
FI0009000012 | 11/09/1995 | -0.00202 | -2 |
FI0009000012 | 11/09/1995 | 0.00231 | -2 |
FI0009000012 | 11/09/1995 | 0.00385 | -2 |
FI0009000012 | 11/09/1995 | 0.01462 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | 0 | -2 |
FI0009000012 | 11/09/1995 | -0.00077 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | -0.01924 | -1 |
FI0009000012 | 11/09/1995 | -0.01943 | -1 |
FI0009000012 | 11/09/1995 | -0.00096 | -1 |
FI0009000012 | 11/09/1995 | -0.01732 | -1 |
FI0009000012 | 11/09/1995 | -0.02165 | -1 |
FI0009000012 | 11/09/1995 | -0.03646 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0.00135 | -1 |
FI0009000012 | 11/09/1995 | 0.00173 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | -0.00192 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
FI0009000012 | 11/09/1995 | 0 | -1 |
I would like to do the following:
ISIN | tar_event | -1 | -2 |
FI0009000012 | 11/09/1995 | -0.00077 | 0 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | -0.01924 | 0 |
FI0009000012 | 11/09/1995 | -0.01943 | 0.0192 |
FI0009000012 | 11/09/1995 | -0.00096 | 0.0489 |
FI0009000012 | 11/09/1995 | -0.01732 | -0.005 |
FI0009000012 | 11/09/1995 | -0.02165 | 0.1182 |
FI0009000012 | 11/09/1995 | -0.03646 | 0.0019 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | 0 | -0.008 |
FI0009000012 | 11/09/1995 | 0.00135 | -0.002 |
FI0009000012 | 11/09/1995 | 0.00173 | 0.0023 |
FI0009000012 | 11/09/1995 | 0 | 0.0039 |
FI0009000012 | 11/09/1995 | 0 | 0.0146 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | -0.00192 | 0 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | 0 | 0 |
FI0009000012 | 11/09/1995 | 0 | 0 |
In my actual dataset I have many ISINs and 12 months data. Here, I have shown one ISIN and two months data only.
-1, -2, .....-12 indicates lag month 1 to lag month 12.
I have attached a sample dataset for your convenience. Thanks in advance for your help!
For each month (identified as an offset to some dated event, probably tar_event), you have a series of HCP values.
Is this right? You apparently want to transpose these values so that the first output observation contains 12 variables (the first value for each of the 12 preceding months), the second output obs has the second value for each of the preceding months, etc.
Question: your months do not all have the same number of observations (in the downloaded data set event the first month (-8) has only 17 obs). So
Why are the months of differing length? Are they counts of trading days? Or some other category of dates? And if so, do you really want to match 1st-record-of-month to 1st-record-of-month, 2nd-record-of-month to 2nd-record-of-month, etc. for the output?
I've edited out the above, since I see you are telling us each month has 21 observations.
If your data are sorted by ISIN tar_event, then:
data want;
merge sample (where=(event_month=-12) rename=hcp=hcp_12)
sample (where=(event_month=-11) rename=hcp=hcp_11)
sample (where=(event_month=-10) rename=hcp=hcp_10)
sample (where=(event_month=-09) rename=hcp=hcp_09)
sample (where=(event_month=-08) rename=hcp=hcp_08)
sample (where=(event_month=-07) rename=hcp=hcp_07)
sample (where=(event_month=-06) rename=hcp=hcp_06)
sample (where=(event_month=-05) rename=hcp=hcp_05)
sample (where=(event_month=-04) rename=hcp=hcp_04)
sample (where=(event_month=-03) rename=hcp=hcp_03)
sample (where=(event_month=-02) rename=hcp=hcp_02)
sample (where=(event_month=-01) rename=hcp=hcp_01);
by isin tar_event;
run;
Note this crucially depends on all months (for any given ISIN/tar_event) having the same number of observations. If not then the last value from a shorter month is propagated through the subsequent records constructed from the longer months. However, you can protect against that via:
data want;
merge sample (where=(event_month=-12) rename=hcp=hcp_12)
sample (where=(event_month=-11) rename=hcp=hcp_11)
sample (where=(event_month=-10) rename=hcp=hcp_10)
sample (where=(event_month=-09) rename=hcp=hcp_09)
sample (where=(event_month=-08) rename=hcp=hcp_08)
sample (where=(event_month=-07) rename=hcp=hcp_07)
sample (where=(event_month=-06) rename=hcp=hcp_06)
sample (where=(event_month=-05) rename=hcp=hcp_05)
sample (where=(event_month=-04) rename=hcp=hcp_04)
sample (where=(event_month=-03) rename=hcp=hcp_03)
sample (where=(event_month=-02) rename=hcp=hcp_02)
sample (where=(event_month=-01) rename=hcp=hcp_01);
by isin tar_event;
output;
call missing(of _all_);
run;
How do we know which values of HCP go on each line of the output data set?? What is the logic used?
-1, -2, .....-12 indicates lag month 1 to lag month 12.
hcp under event_month should go under that relevant month column. For example, in the illustration, all hcp under -1 should go under the column -1, and all hcp under -2 should go under the column -2.
But what is the logic that results in -0.00077 in row 1 along side a zero. ANd how do you determine which -1 number goes in row 2? And how do you determine which -2 number goes in row 2? And how do you determine which -1 number goes in row 3? And so on?
Ok @bd_user_10 , I have one more question, as your original description didn't state what format you want the output in. Do you want some table that can be presented or included in a report, or do you want a SAS data set?
UPDATE: Okay, I have another question, this seems to be a red flag which you need to fix. In your sample data set, there are not 21 records for each event_month, and so the whole algorithm that you describe is meaningless unless there are exactly 21 records for each event_month. So let's go back to the beginning, and could you please give a complete thorough clear description of the logic you want to use here to create this final table or dataset?
For each month (identified as an offset to some dated event, probably tar_event), you have a series of HCP values.
Is this right? You apparently want to transpose these values so that the first output observation contains 12 variables (the first value for each of the 12 preceding months), the second output obs has the second value for each of the preceding months, etc.
Question: your months do not all have the same number of observations (in the downloaded data set event the first month (-8) has only 17 obs). So
Why are the months of differing length? Are they counts of trading days? Or some other category of dates? And if so, do you really want to match 1st-record-of-month to 1st-record-of-month, 2nd-record-of-month to 2nd-record-of-month, etc. for the output?
I've edited out the above, since I see you are telling us each month has 21 observations.
If your data are sorted by ISIN tar_event, then:
data want;
merge sample (where=(event_month=-12) rename=hcp=hcp_12)
sample (where=(event_month=-11) rename=hcp=hcp_11)
sample (where=(event_month=-10) rename=hcp=hcp_10)
sample (where=(event_month=-09) rename=hcp=hcp_09)
sample (where=(event_month=-08) rename=hcp=hcp_08)
sample (where=(event_month=-07) rename=hcp=hcp_07)
sample (where=(event_month=-06) rename=hcp=hcp_06)
sample (where=(event_month=-05) rename=hcp=hcp_05)
sample (where=(event_month=-04) rename=hcp=hcp_04)
sample (where=(event_month=-03) rename=hcp=hcp_03)
sample (where=(event_month=-02) rename=hcp=hcp_02)
sample (where=(event_month=-01) rename=hcp=hcp_01);
by isin tar_event;
run;
Note this crucially depends on all months (for any given ISIN/tar_event) having the same number of observations. If not then the last value from a shorter month is propagated through the subsequent records constructed from the longer months. However, you can protect against that via:
data want;
merge sample (where=(event_month=-12) rename=hcp=hcp_12)
sample (where=(event_month=-11) rename=hcp=hcp_11)
sample (where=(event_month=-10) rename=hcp=hcp_10)
sample (where=(event_month=-09) rename=hcp=hcp_09)
sample (where=(event_month=-08) rename=hcp=hcp_08)
sample (where=(event_month=-07) rename=hcp=hcp_07)
sample (where=(event_month=-06) rename=hcp=hcp_06)
sample (where=(event_month=-05) rename=hcp=hcp_05)
sample (where=(event_month=-04) rename=hcp=hcp_04)
sample (where=(event_month=-03) rename=hcp=hcp_03)
sample (where=(event_month=-02) rename=hcp=hcp_02)
sample (where=(event_month=-01) rename=hcp=hcp_01);
by isin tar_event;
output;
call missing(of _all_);
run;
This is sloppy but I think it accomplishes what you want:
data tmp;
set sample7236 end=eof1;
by isin tar_event event_month;
if first.event_month then counter = 0;
counter+1;
select (event_month);
when (-1) lag1=hcp;
when (-2) lag2=hcp;
when (-3) lag3=hcp;
when (-4) lag4=hcp;
when (-5) lag5=hcp;
when (-6) lag6=hcp;
when (-7) lag7=hcp;
when (-8) lag8=hcp;
when (-9) lag9=hcp;
when (-10) lag10=hcp;
when (-11) lag11=hcp;
when (-12) lag12=hcp;
otherwise;
end;
run;
%macro runme;
%do i=1 %to 12;
data subset&i. (keep=isin tar_event lag&i counter);
set tmp;
if lag&i ne .;
run;
%end;
%mend runme;
%runme;
%macro merge;
data want (drop=counter);
merge
%do i=1 %to 12;
subset&i
%end;
;
by counter;
run;
%mend merge;
%merge;
%macro droptmp;
proc sql;
%do i=1 %to 12;
drop table subset&i;
%end;
drop table tmp;
quit;
%mend droptmp;
%droptmp;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.