Hi everyone.
May I please ask how to I get from dataset A to B?
Your help is greatly appreciated!
A:
id string
1 8"A",1,2,8"C"
2 9"Q",3
3 4,5,9"P",9"Z",8"3/7"
B:
id string comment8 comment9
1 8"A",1,2,8"C" A. C
2 9"Q",3 Q
3 4,5,9"P",9"Z",8"3/7" 3/7 P. Z
Hi, K.C.
This should get you started.
Tom
data have;
length String $255;
input ID String;
cards;
1 8"A",1,2,8"C"
2 9"Q",3
3 4,5,9"P",9"Z",8"3/7"
run;
data inter1(drop = _:);
length _word CommentText $255;
set have;
do _i = 1 to 255 until(missing(_word));
_word = scan(String, _i, ",");
if find(_word, '"') then
do;
CommentNumber = input(scan(_word, 1, '"'), best15.);
CommentText = scan(_word, 2, '"');
output;
end;
end;
run;
proc sort data=inter1 out=inter2;
by ID CommentNumber;
run;
data inter3;
length ConsolidatedCommentText $255;
retain ConsolidatedCommentText;
set inter2;
by ID CommentNumber;
if first.CommentNumber then
call missing(ConsolidatedCommentText);
if missing(ConsolidatedCommentText) then
ConsolidatedCommentText = CommentText;
else ConsolidatedCommentText = catx(".", ConsolidatedCommentText, CommentText);
if last.CommentNumber then
output;
run;
proc transpose data=inter3 out=want(drop = _NAME_) prefix=comment;
by ID String;
id CommentNumber;
var ConsolidatedCommentText;
run;
Using PRX string matching functions :
data have;
length string $64;
input id string &;
datalines;
1 8"A",1,2,8"C"
2 9"Q",3
3 4,5,9"P",9"Z",8"3/7"
;
data want;
set have;
retain prx8 prx9;
length comment8 comment9 $32;
if _n_=1 then do;
prx8 = prxparse('/\b8"[^"]+"/');
prx9 = prxparse('/\b9"[^"]+"/');
end;
start = 1;
do until (position=0);
call prxnext(prx8, start, -1, string, position, length);
comment8 = catx(". ", comment8, substrn(string, position+2, max(0,length-3)));
end;
start = 1;
do until (position=0);
call prxnext(prx9, start, -1, string, position, length);
comment9 = catx(". ", comment9, substrn(string, position+2, max(0,length-3)));
end;
drop start position length prx8 prx9;
run;
proc print; run;
PG
You can scan the string using comma as a delimiter. You can then scan the individual values (or "words") to find index number and the comment value.
The code below will work when the prefix on the words are single digit numbers. If the values could be larger than 9 then you will need to change the array and also the method used to convert the prefix on the word to an index value. If you are positive that prefix values 0 to 7 could never have any comments then it would probably just be easiest to add a drop statement for those variables.
data want ;
set have ;
array comment (0:9) $40 comment0-comment9 ;
do i=1 to countw(string,',');
word = scan(string,i,',');
index= input(word,1.);
comment(index) = catx('. ',comment(index),scan(word,2,'"'));
end;
drop i word index ;
run;
How about this:
data have; length String $255; input ID String; cards; 1 8"A",1,2,8"C" 2 9"Q",3 3 4,5,9"P",9"Z",8"3/7" ; run; data temp(keep=id a b ); set have; do i=1 to countw(string,','); value=scan(string,i,','); a=scan(value,1,'"'); b=scan(value,2,'"'); if find(value,'"') then output; end; run; proc sort data=temp; by id a; run; data x(keep=id a x); set temp; by id a; length x $ 40; retain x; x=catx('.',x,b); if last.a then do; output; call missing(x); end; run; proc transpose data=x out=want(drop=_name_) prefix=comment; by id; id a; var x; run;
Ksharp
Thanks a lot everyone for your help!
Great to know there are many different ways to achieve this.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.