BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bd_user_10
Quartz | Level 8

Hi Everyone,

 

I have following dataset:

ISINtar_eventhcpevent_month
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950.01924-2
FI000900001211/09/19950.04888-2
FI000900001211/09/1995-0.00481-2
FI000900001211/09/19950.11815-2
FI000900001211/09/19950.00192-2
FI000900001211/09/19950-2
FI000900001211/09/1995-0.0077-2
FI000900001211/09/1995-0.00202-2
FI000900001211/09/19950.00231-2
FI000900001211/09/19950.00385-2
FI000900001211/09/19950.01462-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/19950-2
FI000900001211/09/1995-0.00077-1
FI000900001211/09/19950-1
FI000900001211/09/1995-0.01924-1
FI000900001211/09/1995-0.01943-1
FI000900001211/09/1995-0.00096-1
FI000900001211/09/1995-0.01732-1
FI000900001211/09/1995-0.02165-1
FI000900001211/09/1995-0.03646-1
FI000900001211/09/19950-1
FI000900001211/09/19950-1
FI000900001211/09/19950.00135-1
FI000900001211/09/19950.00173-1
FI000900001211/09/19950-1
FI000900001211/09/19950-1
FI000900001211/09/19950-1
FI000900001211/09/19950-1
FI000900001211/09/1995-0.00192-1
FI000900001211/09/19950-1
FI000900001211/09/19950-1
FI000900001211/09/19950-1

 

I would like to do the following:

ISINtar_event-1-2
FI000900001211/09/1995-0.000770
FI000900001211/09/199500
FI000900001211/09/1995-0.019240
FI000900001211/09/1995-0.019430.0192
FI000900001211/09/1995-0.000960.0489
FI000900001211/09/1995-0.01732-0.005
FI000900001211/09/1995-0.021650.1182
FI000900001211/09/1995-0.036460.0019
FI000900001211/09/199500
FI000900001211/09/19950-0.008
FI000900001211/09/19950.00135-0.002
FI000900001211/09/19950.001730.0023
FI000900001211/09/199500.0039
FI000900001211/09/199500.0146
FI000900001211/09/199500
FI000900001211/09/199500
FI000900001211/09/1995-0.001920
FI000900001211/09/199500
FI000900001211/09/199500
FI000900001211/09/199500
FI000900001211/09/199500

 

In my actual dataset I have many ISINs and 12 months data. Here, I have shown one ISIN and two months data only.

 

-1, -2, .....-12 indicates lag month 1 to lag month 12.

 

I have attached a sample dataset for your convenience. Thanks in advance for your help!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

For each month (identified as an offset to some dated event, probably tar_event), you have a series of HCP values.   

 

Is this right?  You apparently want to transpose these values so that the first output observation contains 12 variables (the first value for each of the 12 preceding months), the second output obs has the second value for each of the preceding months, etc.

 

Question: your months do not all have the same number of observations (in the downloaded data set event the first month (-8) has only 17 obs).  So

  1. How many transposed output observations for each ISIN/tar_event combo do you want?
  2. If you want as many output observations as the longest month, do you want missing values for the variables corresponding to the transposed shorter months?

Why are the months of differing length?  Are they counts of trading days?  Or some other category of dates?  And if so, do you really want to match   1st-record-of-month to 1st-record-of-month,  2nd-record-of-month to 2nd-record-of-month, etc. for the output?

 

I've edited out the above, since I see you are telling us each month has 21 observations.

 

If your data are sorted by ISIN tar_event, then:

 

data want;
  merge  sample (where=(event_month=-12) rename=hcp=hcp_12)
         sample (where=(event_month=-11) rename=hcp=hcp_11)
         sample (where=(event_month=-10) rename=hcp=hcp_10)
         sample (where=(event_month=-09) rename=hcp=hcp_09)
         sample (where=(event_month=-08) rename=hcp=hcp_08)
         sample (where=(event_month=-07) rename=hcp=hcp_07)
         sample (where=(event_month=-06) rename=hcp=hcp_06)
         sample (where=(event_month=-05) rename=hcp=hcp_05)
         sample (where=(event_month=-04) rename=hcp=hcp_04)
         sample (where=(event_month=-03) rename=hcp=hcp_03)
         sample (where=(event_month=-02) rename=hcp=hcp_02)
         sample (where=(event_month=-01) rename=hcp=hcp_01);
  by isin tar_event;
run;

Note this crucially depends on all months (for any given ISIN/tar_event) having the same number of observations.  If not then the last value from a shorter month is propagated through the subsequent records constructed from the longer months.  However, you can protect against that via:

data want;
  merge  sample (where=(event_month=-12) rename=hcp=hcp_12)
         sample (where=(event_month=-11) rename=hcp=hcp_11)
         sample (where=(event_month=-10) rename=hcp=hcp_10)
         sample (where=(event_month=-09) rename=hcp=hcp_09)
         sample (where=(event_month=-08) rename=hcp=hcp_08)
         sample (where=(event_month=-07) rename=hcp=hcp_07)
         sample (where=(event_month=-06) rename=hcp=hcp_06)
         sample (where=(event_month=-05) rename=hcp=hcp_05)
         sample (where=(event_month=-04) rename=hcp=hcp_04)
         sample (where=(event_month=-03) rename=hcp=hcp_03)
         sample (where=(event_month=-02) rename=hcp=hcp_02)
         sample (where=(event_month=-01) rename=hcp=hcp_01);
  by isin tar_event;
  output;
  call missing(of _all_);
run;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

How do we know which values of HCP go on each line of the output data set?? What is the logic used?

--
Paige Miller
bd_user_10
Quartz | Level 8

-1, -2, .....-12 indicates lag month 1 to lag month 12.

hcp under event_month should go under that relevant month column. For example, in the illustration, all hcp under -1 should go under the column -1, and all hcp under -2 should go under the column -2.

PaigeMiller
Diamond | Level 26

But what is the logic that results in -0.00077 in row 1 along side a zero. ANd how do you determine which -1 number goes in row 2? And how do you determine which -2 number goes in row 2? And how do you determine which -1 number goes in row 3? And so on?

--
Paige Miller
bd_user_10
Quartz | Level 8
There supposed to be 21 observations for each month based on 252 calendar (working) days in a year. As you could see tar_event day are exactly the same therefore we don't have to follow any hard rule here!
PaigeMiller
Diamond | Level 26

Ok @bd_user_10 , I have one more question, as your original description didn't state what format you want the output in. Do you want some table that can be presented or included in a report, or do you want a SAS data set?

 

UPDATE: Okay, I have another question, this seems to be a red flag which you need to fix. In your sample data set, there are not 21 records for each event_month, and so the whole algorithm that you describe is meaningless unless there are exactly 21 records for each event_month. So let's go back to the beginning, and could you please give a complete thorough clear description of the logic you want to use here to create this final table or dataset?

--
Paige Miller
mkeintz
PROC Star

For each month (identified as an offset to some dated event, probably tar_event), you have a series of HCP values.   

 

Is this right?  You apparently want to transpose these values so that the first output observation contains 12 variables (the first value for each of the 12 preceding months), the second output obs has the second value for each of the preceding months, etc.

 

Question: your months do not all have the same number of observations (in the downloaded data set event the first month (-8) has only 17 obs).  So

  1. How many transposed output observations for each ISIN/tar_event combo do you want?
  2. If you want as many output observations as the longest month, do you want missing values for the variables corresponding to the transposed shorter months?

Why are the months of differing length?  Are they counts of trading days?  Or some other category of dates?  And if so, do you really want to match   1st-record-of-month to 1st-record-of-month,  2nd-record-of-month to 2nd-record-of-month, etc. for the output?

 

I've edited out the above, since I see you are telling us each month has 21 observations.

 

If your data are sorted by ISIN tar_event, then:

 

data want;
  merge  sample (where=(event_month=-12) rename=hcp=hcp_12)
         sample (where=(event_month=-11) rename=hcp=hcp_11)
         sample (where=(event_month=-10) rename=hcp=hcp_10)
         sample (where=(event_month=-09) rename=hcp=hcp_09)
         sample (where=(event_month=-08) rename=hcp=hcp_08)
         sample (where=(event_month=-07) rename=hcp=hcp_07)
         sample (where=(event_month=-06) rename=hcp=hcp_06)
         sample (where=(event_month=-05) rename=hcp=hcp_05)
         sample (where=(event_month=-04) rename=hcp=hcp_04)
         sample (where=(event_month=-03) rename=hcp=hcp_03)
         sample (where=(event_month=-02) rename=hcp=hcp_02)
         sample (where=(event_month=-01) rename=hcp=hcp_01);
  by isin tar_event;
run;

Note this crucially depends on all months (for any given ISIN/tar_event) having the same number of observations.  If not then the last value from a shorter month is propagated through the subsequent records constructed from the longer months.  However, you can protect against that via:

data want;
  merge  sample (where=(event_month=-12) rename=hcp=hcp_12)
         sample (where=(event_month=-11) rename=hcp=hcp_11)
         sample (where=(event_month=-10) rename=hcp=hcp_10)
         sample (where=(event_month=-09) rename=hcp=hcp_09)
         sample (where=(event_month=-08) rename=hcp=hcp_08)
         sample (where=(event_month=-07) rename=hcp=hcp_07)
         sample (where=(event_month=-06) rename=hcp=hcp_06)
         sample (where=(event_month=-05) rename=hcp=hcp_05)
         sample (where=(event_month=-04) rename=hcp=hcp_04)
         sample (where=(event_month=-03) rename=hcp=hcp_03)
         sample (where=(event_month=-02) rename=hcp=hcp_02)
         sample (where=(event_month=-01) rename=hcp=hcp_01);
  by isin tar_event;
  output;
  call missing(of _all_);
run;

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
JeffMaggio
Obsidian | Level 7

This is sloppy but I think it accomplishes what you want:

 

data tmp;

set sample7236 end=eof1;
by isin tar_event event_month;

if first.event_month then counter = 0;
counter+1;

select (event_month);
    when (-1) lag1=hcp;
    when (-2) lag2=hcp;
    when (-3) lag3=hcp;
    when (-4) lag4=hcp;
    when (-5) lag5=hcp;
    when (-6) lag6=hcp;
    when (-7) lag7=hcp;
    when (-8) lag8=hcp;
    when (-9) lag9=hcp;
    when (-10) lag10=hcp;
    when (-11) lag11=hcp;
    when (-12) lag12=hcp;
    otherwise;
end;

run;

%macro runme;
%do i=1 %to 12;
 data subset&i. (keep=isin tar_event lag&i counter);
 set tmp;
 if lag&i ne .;
 run;
%end;
%mend runme;
%runme;

%macro merge;
data want (drop=counter);
merge  
%do i=1 %to 12;
 subset&i 
%end;
;
by counter;
run;
%mend merge;
%merge;

%macro droptmp;
proc sql;
%do i=1 %to 12;
drop table subset&i; 
%end;
drop table tmp;
quit;
%mend droptmp;
%droptmp;

bd_user_10
Quartz | Level 8
This also works! Thank you for your codes!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 969 views
  • 2 likes
  • 4 in conversation