Hello,
I am working with bond data. A single bond series has a header of descriptive information, then a bunch of cusips to identify individual bonds in the series. It resembles the below:
Type | Maturity | Series | ID |
GO | 20 | 5000 | 280129 |
630693 | |||
RV | 5 | 2400 | 273075 |
GO | 7 | 2500 | 170564 |
GO | 15 | 1000 | 321339 |
488161 | |||
379597 | |||
773084 | |||
RV | 12 | 200 | 36892 |
659199 | |||
468958 |
I want the ID variable to be on the same row as the header information as below:
Type | Maturity | Series | ID | ID2 | ID3 | ID4 |
GO | 20 | 5000 | 280129 | 630693 | ||
RV | 5 | 2400 | 273075 | |||
GO | 7 | 2500 | 170564 | |||
GO | 15 | 1000 | 321339 | 488161 | 379597 | 773084 |
RV | 12 | 200 | 36892 | 659199 | 468958 |
There can be upwards of 100 separate IDs that all pertain to the same header information.
I am struggling with how to do this and would appreciate any advice.
Here is an even easier option to clean the data of blank values and then transpose (that doesn't require the macro looping):
DATA WORK.Have (DROP= Type2 Maturity2 Series2);
SET WORK.Have;
RETAIN Type2 Maturity2 Series2;
IF Type NE ' ' THEN Type2=Type;
ELSE IF Type = ' ' THEN Type=Type2;
IF Maturity NE . THEN Maturity2=Maturity;
ELSE IF Maturity = . THEN Maturity=Maturity2;
IF Series NE ' ' THEN Series2=Series;
ELSE IF Series = ' ' THEN Series=Series2;
RUN;
PROC SORT DATA=WORK.Have; BY Type Maturity Series ID; RUN;
PROC TRANSPOSE DATA=WORK.Have out=WORK.WANT (drop=_name_) prefix=ID_;
var ID;
by Type Maturity Series;
run;
If we are using this logic to create a starter dummy dataset that appears as your example does:
DATA WORK.HAVE;
FORMAT Type $2. Maturity 2. Series $4. ID $6.;
INFORMAT Type $2. Maturity 2. Series $4. ID $6.;
INPUT Type Maturity Series ID;
DATALINES;
GO 20 5000 280129
GO 20 5000 630693
RV 5 2400 273075
GO 7 2500 170564
GO 15 1000 321339
GO 15 1000 488161
GO 15 1000 379597
GO 15 1000 773084
RV 12 200 36892
RV 12 200 659199
RV 12 200 468958
;
You can then use a PROC TRANSPOSE step to create new variable fields for each of the individual IDs aligned to a specific combination of Type/Maturity/Series.
PROC SORT DATA=WORK.Have; BY Type Maturity Series ID; RUN;
PROC TRANSPOSE DATA=WORK.Have out=WORK.WANT (drop=_name_) prefix=ID_;
var ID;
by Type Maturity Series;
run;
Hope this helps
Thank you very much for your reply. I thought of doing something similar. However, only the first observation for each series contains the header information. So, as in the example data I provided, Type, Maturity, and Series are all missing for the subsequent observations after each "header observation".
Are you able to copy the header information to each subsequent blank cell?
I should be able to, but there can be hundreds of IDs under the header. So I could copy using the below
if series = . then series = lag_series;
if series = . & lag_series = . then series = lag2_series;
etc.
I am looking to get around this though.
An easier way to copy the header information would be by using the RETAIN statement for each of the header variables. Once they are all filled in you can use proc transpose.
Here's a good walk through
https://www.mwsug.org/proceedings/2009/stats/MWSUG-2009-D14.pdf
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000214163.htm
Thanks!
You can use this macro statement logic to loop through the dataset filling in the blank values for Type/Maturity/Series based on the last observation containing a non-blank value (only for observations where those values are presently blank).
Then you can use the logic that I previously supplied:
%MACRO FILL_LOOP;
%DO %UNTIL("&BLNKSum."="0");
DATA WORK.Have;
SET WORK.Have;
RETAIN BLNKSum;
LAG_Type=LAG(Type);
LAG_Maturity=LAG(Maturity);
LAG_Series=LAG(Series);
IF Type=' ' THEN Type=LAG_Type;
IF Maturity=. THEN Maturity=LAG_Maturity;
IF Series=' ' THEN Series=LAG_Series;
BLNKSum=SUM(CMISS(Type),CMISS(Maturity),CMISS(Series),BLNKSUM);
CALL SYMPUTX('BLNKSum',BLNKSum);
%PUT &=BLNKSum.;
RUN;
DATA WORK.HAVE(DROP=BLNKSum LAG_Type LAG_Maturity LAG_Series); SET WORK.HAVE; RUN;
%END;
%MEND FILL_LOOP;
%FILL_LOOP;
PROC SORT DATA=WORK.Have; BY Type Maturity Series ID; RUN;
PROC TRANSPOSE DATA=WORK.Have out=WORK.WANT (drop=_name_) prefix=ID_;
var ID;
by Type Maturity Series;
run;
Here is an even easier option to clean the data of blank values and then transpose (that doesn't require the macro looping):
DATA WORK.Have (DROP= Type2 Maturity2 Series2);
SET WORK.Have;
RETAIN Type2 Maturity2 Series2;
IF Type NE ' ' THEN Type2=Type;
ELSE IF Type = ' ' THEN Type=Type2;
IF Maturity NE . THEN Maturity2=Maturity;
ELSE IF Maturity = . THEN Maturity=Maturity2;
IF Series NE ' ' THEN Series2=Series;
ELSE IF Series = ' ' THEN Series=Series2;
RUN;
PROC SORT DATA=WORK.Have; BY Type Maturity Series ID; RUN;
PROC TRANSPOSE DATA=WORK.Have out=WORK.WANT (drop=_name_) prefix=ID_;
var ID;
by Type Maturity Series;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.