Hi,
I am trying to create a new variable (i.e., SP_20scale) based on the old variable (i.e., rating) by following codes,
data SP_table2; Length SP_20scale 8. ; set SP_table1; if rating='AAA' then SP_20scale=20; if rating='AA+' then SP_20scale=19; if rating='AA' then SP_20scale=18; if rating='AA-' then SP_20scale=17; if rating='A+' then SP_20scale=16; if rating='A' then SP_20scale=15; if rating='A-' then SP_20scale=14; if rating='BBB+' then SP_20scale=13; if rating='BBB' then SP_20scale=12; if rating='BBB-' then SP_20scale=11; if rating='BB+' then SP_20scale=10; if rating='BB' then SP_20scale=9 ; if rating='BB-' then SP_20scale=8 ; if rating='B+' then SP_20scale=7 ; if rating='B' then SP_20scale=6 ; if rating='B-' then SP_20scale=5 ; if rating='CCC+' then SP_20scale=4 ; if rating='CCC' then SP_20scale=3 ; if rating='CCC-' then SP_20scale=2 ; if rating='C' then SP_20scale=1 ; if rating='SD' then SP_20scale=1 ; if rating='CC' then SP_20scale=1 ; if rating='D' then SP_20scale=1 ; run;
However, the code cannot identify the value in the 'rating' variable, I put a sample of the dataset in the attachment, the 'rating' variable is created by following codes,
rating=strip(tranwrd(scan(SP_ECR_History1,i,':'),'Rating Range',''));
it should not cause by leading and trailing blanks.
Could you please give me some suggestions about this.
Thanks in advance.
Your strings have carriage return and linefeed characters in them.
proc freq ;
tables rating ;
format rating $hex10.;
run;
The FREQ Procedure Cumulative Cumulative rating Frequency Percent Frequency Percent --------------------------------------------------------------- 420D0A2020 7 14.00 7 14.00 422B0D0A20 9 18.00 16 32.00 422D0D0A20 15 30.00 31 62.00 42420D0A20 3 6.00 34 68.00 42422D0D0A 6 12.00 40 80.00 4242420D0A 1 2.00 41 82.00 4242422B0D 1 2.00 42 84.00 4343430D0A 1 2.00 43 86.00 4343432B0D 3 6.00 46 92.00 4E520D0A20 4 8.00 50 100.00
How did you make them? You can use COMPRESS() function to remove them:
rating=compress(rating,'0D0A'x);
Use a custom informat to map a string value to a corresponding numeric value. The INPUT function is used to apply the custom informat. Per @Tom the original rating value needs to be cleaned up by removing 'unprintable' characters. The COMPRESS function defines a printable character as one with an ASCII code >= 32.
Example:
proc format; invalue ratescale 'AAA' = 20 'AA+' = 19 'AA' = 18 'AA-' = 17 'A+' = 16 'A' = 15 'A-' = 14 'BBB+'= 13 'BBB' = 12 'BBB-'= 11 'BB+' = 10 'BB' = 9 'BB-' = 8 'B+' = 7 'B' = 6 'B-' = 5 'CCC+'= 4 'CCC' = 3 'CCC-'= 2 'C' = 1 'SD' = 1 'CC' = 1 'D' = 1 other = . /* NR or unknown ratings */ ; run; data want; set download.sp_table2; rating = compress(rating,,'KW'); * keep only printable characters; scale = input (rating,ratescale.);
format scale 4.; run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.