Hi,
I am trying to split the value in the same cell to different cells but in the same column.
For table 1
ID | Rating_History |
1 | Rating: CCC+Rating Range: Apr-08-2020 to PresentRating: B-Rating Range: Feb-20-2020 to Apr-08-2020 |
2 | Rating: B+Rating Range: Dec-13-2019 to PresentRating: NRRating Range: Apr-13-2018 to Dec-13-2019Rating: BB-Rating Range: Aug-02-2017 to Apr-13-2018 |
I expect to get the following table
ID | Rating | Rating_Range |
1 | CCC+ | Apr-08-2020 to Present |
1 | B- | Feb-20-2020 to Apr-08-2020 |
2 | B+ | Dec-13-2019 to Present |
2 | NR | Apr-13-2018 to Dec-13-2019 |
2 | BB- | Aug-02-2017 to Apr-13-2018 |
All value in the 'Rating_History' variable is made up of at least one strings like 'Rating: XXXRating Range: XXX-00-0000 to XXX-00-0000(or Present)'. Could you please give me some suggestions about this?
Thanks in advance.
run;data table1; infile cards truncover; input ID $1. Rating_History $2000. ; cards; 1 Rating: CCC+Rating Range: Apr-08-2020 to PresentRating: B-Rating Range: Feb-20-2020 to Apr-08-2020 2 Rating: B+Rating Range: Dec-13-2019 to PresentRating: NRRating Range: Apr-13-2018 to Dec-13-2019Rating: BB-Rating Range: Aug-02-2017 to Apr-13-2018 ;;;;
The following code could give you a start.
data table1;
infile cards truncover;
input
ID $1.
Rating_History $2000.
;
cards;
1 Rating: CCC+Rating Range: Apr-08-2020 to PresentRating: B-Rating Range: Feb-20-2020 to Apr-08-2020
2 Rating: B+Rating Range: Dec-13-2019 to PresentRating: NRRating Range: Apr-13-2018 to Dec-13-2019Rating: BB-Rating Range: Aug-02-2017 to Apr-13-2018
;;;;
data want;
set table1;
n=countc(rating_history,':');
do i=2 to n by 2;
rating=tranwrd(scan(rating_history,i,':'),'Rating Range','');
rating_range=tranwrd(scan(rating_history,i+1,':'),'Rating','');
output;
end;
keep id rating rating_range;
run;
It can be done, but it's a bit of a trick. You'd have to parse the text looking for "Rating Range:" and then take the value from the end of "Rating Range:" to the next "Rating Range:" or the end of a line.
I'll put some code that parses text below. This sample code is parsing JSON data, but you could do something like it to solve this task.
Jim
%LET Cmnt = ;
%LET Width = 75;
**------------------------------------------------------------------------------**;
FILENAME In_Data 'C:\Users\jbarbou3\Documents\SAS\Pgm\Training\JSON_Parse\JSON_Data.txt';
**------------------------------------------------------------------------------**;
DATA WORK.Parsed_Data;
DROP _:;
LENGTH _JSON_Data $32767;
LENGTH _Segment $1026;
LENGTH _Var_Name $32;
LENGTH Row_Nbr 4;
LENGTH Var_Nbr 4;
LENGTH Value $1026;
RETAIN _Max_Nbr 0;
IF _N_ = 1 THEN
DO;
CALL MISSING(Var_Nbr, _Var_Name);
DECLARE HASH Vars(ORDERED: 'A');
Vars.DEFINEKEY('_Var_Name');
Vars.DEFINEDATA('Var_Nbr');
Vars.DEFINEDONE();
END;
IF _End_of_Data THEN
DO;
CALL SYMPUTX('Var_Count', PUT(_Max_Nbr, F3.), 'G');
END;
INFILE In_Data LENGTH = _JSON_Length
RECFM = V
END = _End_of_Data
;
INPUT @1 _JSON_Data $VARYING32767. _JSON_Length;
** The first character is a {, so start in position 2. **;
_Position = 2;
SUBSTR(_JSON_Data, _JSON_Length, 1) = ',';
DO WHILE (_Position < _JSON_Length);
_Segment_Length = INDEXC(SUBSTR(_JSON_Data, _Position), ',');
_Segment = SUBSTR(_JSON_Data, _Position, (_Segment_Length - 1));
_Var_Name = COMPRESS(SUBSTR(_Segment, 1, (INDEXC(_Segment, ':') - 1)), '"');
Value = COMPRESS(SUBSTR(_Segment, (INDEXC(_Segment, ':') + 1)), '"');
_Position = _Position + _Segment_Length;
_RC = Vars.FIND();
IF _RC = 0 THEN
DO;
CALL SYMPUTX(CATS('Var_Name', PUT(Var_Nbr, F3.)), _Var_Name, 'G');
IF Var_Nbr > _Max_Nbr THEN
_Max_Nbr = Var_Nbr;
END;
ELSE
DO;
_Max_Nbr + 1;
CALL SYMPUTX(CATS('Var_Name', PUT(_Max_Nbr, F3.)), _Var_Name, 'G');
Var_Nbr = _Max_Nbr;
_RC = Vars.ADD();
IF _RC > 0 THEN
DO;
PUTLOG "&Err2 ";
PUTLOG "&Err2 %Format_Dashes(&Width)";
PUTLOG "&Err1 | Unable to add variable to Hash table. " _RC=;
PUTLOG "&Err2 %Format_Dashes(&Width)";
END;
END;
Row_Nbr = _N_;
OUTPUT;
END;
RUN;
**------------------------------------------------------------------------------**;
&Cmnt.PROC PRINT DATA=WORK.PARSED_DATA;
&Cmnt.RUN;
**------------------------------------------------------------------------------**;
%MACRO List_Vars(Comma=NO);
%IF %QUPCASE(&Comma) = YES %THEN
%DO;
%LET Comma = ,;
%END;
%ELSE
%DO;
%LET Comma = ;
%END;
%LOCAL i;
%DO i = 1 %TO &Var_Count;
%IF &i = &Var_Count %THEN
%LET Comma = ;
&&Var_Name&i &Comma
%END;
%MEND List_Vars;
&Cmnt%PUT &Nte1 &=Var_Count %List_Vars(Comma=YES);
**------------------------------------------------------------------------------**;
%MACRO Equate_Vars;
%LOCAL i;
%DO i = 1 %TO &Var_Count;
IF Var_Nbr = &i THEN
&&Var_Name&i = Value;
%END;
%MEND Equate_Vars;
&Cmnt%PUT &Nte1 %QUOTE(%Equate_Vars);
**------------------------------------------------------------------------------**;
DATA WORK.Final_Data;
DROP Row_Nbr;
DROP Var_Nbr;
DROP Value;
DROP _:;
RETAIN %List_Vars;
SET WORK.PARSED_DATA;
BY Row_Nbr;
%Equate_Vars;
IF LAST.Row_Nbr THEN
DO;
OUTPUT;
CALL MISSING(%List_Vars(Comma=YES));
END;
RUN;
Hi @Alexxxxxxx Assuming I comprehend your pattern well and your sample is truly representative of your real, please try the below-
data table1;
infile cards truncover;
input
ID $1.
Rating_History $2000.
;
cards;
1 Rating: CCC+Rating Range: Apr-08-2020 to PresentRating: B-Rating Range: Feb-20-2020 to Apr-08-2020
2 Rating: B+Rating Range: Dec-13-2019 to PresentRating: NRRating Range: Apr-13-2018 to Dec-13-2019Rating: BB-Rating Range: Aug-02-2017 to Apr-13-2018
;;;;
data want;
set table1;
do _pos=find(Rating_History,'Rating:') by 0 while(_pos);
_r=find(Rating_History,'Rating Range:',_pos);
_s=_pos+7;
_l=_r-_pos-7;
Rating=substr(Rating_History,_s,_l);
_pos =find(Rating_History,'Rating:', _pos + 1) ;
_s=_r+13;
_l=_pos-_r-13;
if _l<=0 then _l=999;
Rating_Range=substr(Rating_History,_s,_l);
output;
end;
keep ID Rating Rating_Range;
run;
proc print noobs;run;
The SAS System |
ID | Rating | Rating_Range |
---|---|---|
1 | CCC+ | Apr-08-2020 to Present |
1 | B- | Feb-20-2020 to Apr-08-2020 |
2 | B+ | Dec-13-2019 to Present |
2 | NR | Apr-13-2018 to Dec-13-2019 |
2 | BB- | Aug-02-2017 to Apr-13-2018 |
Hi @novinosrin ,
Many thanks for your code, it works well for the most part of my data, except the rows in attachment (it is the original data of the wrong rows),
The result of a wrong row looks like below, could you please give me more suggestions about this?
NOTE: Invalid third argument to function SUBSTR at line 1467 column 16. SP_Entity_ID=309861 SP_ISIN=ARPERE010103 SP_CUSIP=- SP_IQID=4597539 SP_COMPANY_NAME=Petrobras Argentina S.A. SP_ECR_History1=Rating: NR Rating Range: Mar-20-2017 to Present Rating: B- Rating Range: Feb-03-2 016 to Mar-20-2017 Rating: CCC- Rating Range: Jun-18-2014 to Feb-03-2016 Rating: CCC+ Rating R ange: Dec-20-2013 to Jun-18-2014 Rating: B- Rating Range: Sep-13-2013 to Dec-20-2013 Rating: B Rating Range: Apr-05-2013 to Sep-13-2013 Rating: B+ Rating Range: Nov-01-2012 to Apr-05-2013 R ating: BB- Rating Range: Dec-17-2010 to Nov-01-2012 Rating: B+ Rating Range: Feb-12-2009 to Dec-1 7-2010 Rating: BB Rating Range: Apr-23-2007 to Feb-12-2009 Rating: B+ Rating Range: Oct-09-200 6 to Apr-23-2007 Rating: B Rating Range: Apr-01-2005 to Oct-09-2006 Rating: B- Rating Range: J ul-11-2003 to Apr-01-2005 Rating: CCC Rating Range: Aug-08-2002 to Jul-11-2003 Rating: SD Rati ng Range: Aug-02-2002 to Aug-08-2002 Rating: CC Rating Range: Jun-12-2002 to Aug-02-2002 Rating : CCC Rating Range: Apr-18-2002 to Jun-12-2002 Rating: CCC+ Rating Range: Mar-26-2002 to Apr-18-2 002 Rating: CCC- Rating Range: Mar-25-2002 to Mar-26-2002 Rating: SD Rating Range: Mar-25-2002 to Mar-25-2002 Rating: CCC+ Rating Range: Dec-05-2001 to Mar-25-2002 Rating: B Rating Range: Nov-01-2001 to Dec-05-2001 Rating: B+ Rating Range: Oct-10-2001 to Nov-01-2001 Rating: BB- Rat ing Range: Jul-13-2001 to Oct-10-2001 Rating: BB Rating Range: Mar-26-2001 to Jul-13-2001 Ratin g: BB+ Rating Range: Jan-11-2001 to Mar-26-2001 Rating: BBB- Rating Range: Apr-22-1997 to Jan-11- 2001 Rating: BB Rating Range: Apr-02-1997 to Apr-22-1997 Rating: BB- Rating Range: Jan-15-1997 to Apr-02-1997 SP_ECR_History2=CreditWatch: Watch Neg CreditWatch Range: Sep-13-2013 to Dec-20-2013 CreditWatch: Watch Neg CreditWatch Range: Nov-01-2012 to Apr-05-2013 CreditWatch: Watch Pos CreditWatch Range: Sep-14-2010 to Dec-17-2010 CreditWatch: Watch Neg CreditWatch Range: Nov-05-2008 to Feb-12-2009 CreditWatch: Watch Neg CreditWatch Range: Oct-03-2005 to Apr-06-2006 CreditWatch: Watch Pos Cre ditWatch Range: Dec-28-2004 to Apr-01-2005 CreditWatch: Watch Neg CreditWatch Range: Mar-26-2001 t o Jun-07-2001 CreditWatch: Watch Neg CreditWatch Range: Mar-19-2001 to Mar-26-2001 CreditWatch: Watch Neg CreditWatch Range: Nov-01-2000 to Jan-11-2001 SP_ECR_History3=Outlook: Stable Outlook Range: Feb-03-2016 to Mar-20-2017 Outlook: Negative Outlo ok Range: Jun-18-2014 to Feb-03-2016 Outlook: Negative Outlook Range: Dec-20-2013 to Jun-18-2014 Outlook: Negative Outlook Range: Apr-05-2013 to Sep-13-2013 Outlook: Negative Outlook Range: Ap r-24-2012 to Nov-01-2012 Outlook: Stable Outlook Range: Dec-17-2010 to Apr-24-2012 Outlook: Sta ble Outlook Range: Feb-12-2009 to Sep-14-2010 Outlook: Stable Outlook Range: Apr-23-2007 to Nov-0 5-2008 Outlook: Stable Outlook Range: Oct-09-2006 to Apr-23-2007 Outlook: Stable Outlook Range : Apr-06-2006 to Oct-09-2006 Outlook: Positive Outlook Range: Apr-01-2005 to Oct-03-2005 Outloo k: Stable Outlook Range: Jul-11-2003 to Dec-28-2004 Outlook: Negative Outlook Range: Aug-08-2002 to Jul-11-2003 Outlook: Negative Outlook Range: Jun-12-2002 to Aug-02-2002 Outlook: Negative O utlook Range: Apr-18-2002 to Jun-12-2002 Outlook: Negative Outlook Range: Mar-26-2002 to Apr-18-20 02 Outlook: Negative Outlook Range: Mar-25-2002 to Mar-26-2002 Outlook: Negative Outlook Range : Dec-05-2001 to Mar-25-2002 Outlook: Negative Outlook Range: Nov-01-2001 to Dec-05-2001 Outloo k: Negative Outlook Range: Oct-10-2001 to Nov-01-2001 Outlook: Negative Outlook Range: Jul-13-200 1 to Oct-10-2001 Outlook: Negative Outlook Range: Jun-07-2001 to Jul-13-2001 Outlook: Stable O utlook Range: Jan-11-2001 to Mar-19-2001 Outlook: Negative Outlook Range: Jul-22-1999 to Nov-01-20 00 Outlook: Stable Outlook Range: Apr-22-1997 to Jul-22-1999 Outlook: Stable Outlook Range: Ap r-02-1997 to Apr-22-1997 Outlook: Stable Outlook Range: Jan-15-1997 to Apr-02-1997 Company_Name=Petrobras Argentina S.A. Exchange_Ticker=- S_P_Entity_ID=309861 S_P_RatingsDirect__Industry=Corporates Geographic_Locations=Latin America and Caribbean (Primary) S_P_Entity_Credit_Rating_History=Rating: NR Rating Range: Mar-20-2017 to Present Rating: B- Ratin g Range: Feb-03-2016 to Mar-20-2017 Rating: CCC- Rating Range: Jun-18-2014 to Feb-03-2016 Ratin g: CCC+ Rating Range: Dec-20-2013 to Jun-18-2014 Rating: B- Rating Range: Sep-13-2013 to Dec-20-2 013 Rating: B Rating Range: Apr-05-2013 to Sep-13-2013 Rating: B+ Rating Range: Nov-01-2012 to Apr-05-2013 Rating: BB- Rating Range: Dec-17-2010 to Nov-01-2012 Rating: B+ Rating Range: Feb -12-2009 to Dec-17-2010 Rating: BB Rating Range: Apr-23-2007 to Feb-12-2009 Rating: B+ Rating Range: Oct-09-2006 to Apr-23-2007 Rating: B Rating Range: Apr-01-2005 to Oct-09-2006 Rating: B- Rating Range: Jul-11-2003 to Apr-01-2005 Rating: CCC Rating Range: Aug-08-2002 to Jul-11-2003 Rating: SD Rating Range: Aug-02-2002 to Aug-08-2002 Rating: CC Rating Range: Jun-12-2002 to Aug- 02-2002 Rating: CCC Rating Range: Apr-18-2002 to Jun-12-2002 Rating: CCC+ Rating Range: Mar-26 -2002 to Apr-18-2002 Rating: CCC- Rating Range: Mar-25-2002 to Mar-26-2002 Rating: SD Rating R ange: Mar-25-2002 to Mar-25-2002 Rating: CCC+ Rating Range: Dec-05-2001 to Mar-25-2002 Rating: B Rating Range: Nov-01-2001 to Dec-05-2001 Rating: B+ Rating Range: Oct-10-2001 to Nov-01-2001 Rating: BB- Rating Range: Jul-13-2001 to Oct-10-2001 Rating: BB Rating Range: Mar-26-2001 to Jul -13-2001 Rating: BB+ Rating Range: Jan-11-2001 to Mar-26-2001 Rating: BBB- Rating Range: Apr-2 2-1997 to Jan-11-2001 Rating: BB Rating Range: Apr-02-1997 to Apr-22-1997 Rating: BB- Rating R ange: Jan-15-1997 to Apr-02-1997 S_P_Entity_Credit_Rating_Histor1=CreditWatch: Watch Neg CreditWatch Range: Sep-13-2013 to Dec-20-2013 CreditWatch: Watch Neg CreditWatch Range: Nov-01-2012 to Apr-05-2013 CreditWatch: Watch Pos C reditWatch Range: Sep-14-2010 to Dec-17-2010 CreditWatch: Watch Neg CreditWatch Range: Nov-05-2008 to Feb-12-2009 CreditWatch: Watch Neg CreditWatch Range: Oct-03-2005 to Apr-06-2006 CreditWatc h: Watch Pos CreditWatch Range: Dec-28-2004 to Apr-01-2005 CreditWatch: Watch Neg CreditWatch Ran ge: Mar-26-2001 to Jun-07-2001 CreditWatch: Watch Neg CreditWatch Range: Mar-19-2001 to Mar-26-200 1 CreditWatch: Watch Neg CreditWatch Range: Nov-01-2000 to Jan-11-2001 S_P_Entity_Credit_Rating_Histor2=Outlook: Stable Outlook Range: Feb-03-2016 to Mar-20-2017 Outlook : Negative Outlook Range: Jun-18-2014 to Feb-03-2016 Outlook: Negative Outlook Range: Dec-20-2013 to Jun-18-2014 Outlook: Negative Outlook Range: Apr-05-2013 to Sep-13-2013 Outlook: Negative Outlook Range: Apr-24-2012 to Nov-01-2012 Outlook: Stable Outlook Range: Dec-17-2010 to Apr-24-201 2 Outlook: Stable Outlook Range: Feb-12-2009 to Sep-14-2010 Outlook: Stable Outlook Range: Apr -23-2007 to Nov-05-2008 Outlook: Stable Outlook Range: Oct-09-2006 to Apr-23-2007 Outlook: Stab le Outlook Range: Apr-06-2006 to Oct-09-2006 Outlook: Positive Outlook Range: Apr-01-2005 to Oct- 03-2005 Outlook: Stable Outlook Range: Jul-11-2003 to Dec-28-2004 Outlook: Negative Outlook Ra nge: Aug-08-2002 to Jul-11-2003 Outlook: Negative Outlook Range: Jun-12-2002 to Aug-02-2002 Out look: Negative Outlook Range: Apr-18-2002 to Jun-12-2002 Outlook: Negative Outlook Range: Mar-26- 2002 to Apr-18-2002 Outlook: Negative Outlook Range: Mar-25-2002 to Mar-26-2002 Outlook: Negati ve Outlook Range: Dec-05-2001 to Mar-25-2002 Outlook: Negative Outlook Range: Nov-01-2001 to Dec- 05-2001 Outlook: Negative Outlook Range: Oct-10-2001 to Nov-01-2001 Outlook: Negative Outlook Range: Jul-13-2001 to Oct-10-2001 Outlook: Negative Outlook Range: Jun-07-2001 to Jul-13-2001 O utlook: Stable Outlook Range: Jan-11-2001 to Mar-19-2001 Outlook: Negative Outlook Range: Jul-22- 1999 to Nov-01-2000 Outlook: Stable Outlook Range: Apr-22-1997 to Jul-22-1999 Outlook: Stable Outlook Range: Apr-02-1997 to Apr-22-1997 Outlook: Stable Outlook Range: Jan-15-1997 to Apr-02-199 7 Primary_ISIN=ARPERE010103 CIK=0000900442; 0001449877 Company_CUSIP=- Excel_Company_ID=IQ4597539 Security_Tickers=- _pos=0 _r=1592 _s=1605 _l=999 Rating=BB- Rating_Range=Jan-15-1997 to Apr-02-1997 _ERROR_=1 _N_=4614
the attachment is the original data of wrong rows
Hi Your source file is not clear. Can you please paste the wrong rows and all possible rows in a more usable form under datalines like you did this-
data table1; infile cards truncover; input ID $1. Rating_History $2000. ; cards; 1 Rating: CCC+Rating Range: Apr-08-2020 to PresentRating: B-Rating Range: Feb-20-2020 to Apr-08-2020 2 Rating: B+Rating Range: Dec-13-2019 to PresentRating: NRRating Range: Apr-13-2018 to Dec-13-2019Rating: BB-Rating Range: Aug-02-2017 to Apr-13-2018 ;;;;
The following code could give you a start.
data table1;
infile cards truncover;
input
ID $1.
Rating_History $2000.
;
cards;
1 Rating: CCC+Rating Range: Apr-08-2020 to PresentRating: B-Rating Range: Feb-20-2020 to Apr-08-2020
2 Rating: B+Rating Range: Dec-13-2019 to PresentRating: NRRating Range: Apr-13-2018 to Dec-13-2019Rating: BB-Rating Range: Aug-02-2017 to Apr-13-2018
;;;;
data want;
set table1;
n=countc(rating_history,':');
do i=2 to n by 2;
rating=tranwrd(scan(rating_history,i,':'),'Rating Range','');
rating_range=tranwrd(scan(rating_history,i+1,':'),'Rating','');
output;
end;
keep id rating rating_range;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.