Hi, I have a data set like:
data test;
input name $10. rate 2. ;
datalines;
I12345 2.3
0.1 1.2
0.2 1.6
0.3 1.5
0.4 1.4
I23456 3.4
0.1 2.3
0.2 1.8
0.3 1.4
0.4 1.5
I34567 2.6
0.1 1.2
0.2 1.5
0.3 1.6
0.4 1.7
;
run;
I want the data set like:
name rate
I12345 2.3
I12345A 1.2
I12345B 1.6
I12345C 1.5
I12345D 1.4
I23456 3.4
I23456A 2.3
I23456B 1.8
I23456C 1.4
I23456D 1.5
I34567 2.6
I34567A 1.2
I34567B 1.5
I34567C 1.6
I34567D 1.7
I appreciate any idea.
Try using a format and then keep any oldname that doesn't begin with a zero.
proc format;
value $append '0.1' = 'A'
'0.2' = 'B'
'0.3' = 'C'
'0.4' = 'D';
run;
data test2;
set test(rename=(name=oldname));
retain last_name;
if oldname^=:'0' then do;
name=oldname;
last_name=oldname;
end;
else name = strip(last_name)||put(oldname,$append.);
drop last_name oldname;
run;
Every time you encounter a name that begins with uppercase "I", you want to retain that value as a name root. For subsequent observations you want name to have that name root, with consecutive suffixes of "A", "B", "C", etc:
data test;
input name :$10. rate ;
datalines;
I12345 2.3
0.1 1.2
0.2 1.6
0.3 1.5
0.4 1.4
I23456 3.4
0.1 2.3
0.2 1.8
0.3 1.4
0.4 1.5
I34567 2.6
0.1 1.2
0.2 1.5
0.3 1.6
0.4 1.7
;
run;
data want (drop=i name_root);
set test;
length name_root $10;
retain name_root;
array suffix {8} $1 _temporary_ (' ','A','B','C','D','E','F','G');
i+1;
if name=: 'I' then do;
i=1;
name_root=name;
end;
name=cats(name_root,suffix{i});
run;
Notes:
Thansk for the reply. But the case is not always right. Those beginning with "I " are questions and not always begin with "I". Some with "I", some with "N", some with "M".
You already have an answer but, if the value (rather than order) of name determines the letter value assigned, I would suggest something like:
data test; input name :$10. rate ; datalines; I12345 2.3 0.1 1.2 0.2 1.6 0.3 1.5 0.4 1.4 M23456 3.4 0.1 2.3 0.2 1.8 0.3 1.4 0.4 1.5 N34567 2.6 0.1 1.2 0.2 1.5 0.3 1.6 0.4 1.7 ; run; data want (drop=name_root); set test; length name_root $10; retain name_root; if anyalpha(name) then do; name_root=name; name=-3.2; end; name=cats(name_root,byte(name*10+64)); run;
Art, CEO, AnalystFinder.com
Hi Art,
Your solution appends an '8' to the name when anyalpha(name)=TRUE. For these cases, I would suggest
name=-3.2;
so that you're appending a space.
@antonbcristina: thanks for catching that. I've changed it in my suggested code.
Art, CEO, AnalystFinder.com
Try using a format and then keep any oldname that doesn't begin with a zero.
proc format;
value $append '0.1' = 'A'
'0.2' = 'B'
'0.3' = 'C'
'0.4' = 'D';
run;
data test2;
set test(rename=(name=oldname));
retain last_name;
if oldname^=:'0' then do;
name=oldname;
last_name=oldname;
end;
else name = strip(last_name)||put(oldname,$append.);
drop last_name oldname;
run;
Thanks for the idea. However, my data not always begin with "I'. It depends on the exam. Different exam has a different suffix. I want a general code to apply all kinds of exams
I saw your comment and added in the:
if oldname^=:'0' then do;
I test you code and it works. Thank you very much, Merry Christmas!
If you need something even fancier, you can try using Perl Regular Expressions to capture the pattern of your expected names. In this case, I just tried to match a non-digit (\D) followed by one or more digits (\d?). prxmatch returns the position where the pattern is found, and we would like it to start in position one.
if prxmatch('/\D\d?/',oldname)=1 then do;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.