Help using Base SAS procedures

data manipulation - string concatenation

Reply
Regular Contributor
Posts: 228

data manipulation - string concatenation

Hi everyone.

May I please ask how to I get from dataset A to B?

Your help is greatly appreciated! Smiley Happy


A:

id     string

1     8"A",1,2,8"C"

2     9"Q",3

3     4,5,9"P",9"Z",8"3/7"

B:

id     string                         comment8    comment9

1     8"A",1,2,8"C"              A. C            

2     9"Q",3                                                  Q

3     4,5,9"P",9"Z",8"3/7"      3/7                  P. Z

PROC Star
Posts: 1,167

Re: data manipulation - string concatenation

Hi, K.C.

This should get you started.

Tom

data have;
length String $255;
input ID String;
cards;
1     8"A",1,2,8"C"
2     9"Q",3
3     4,5,9"P",9"Z",8"3/7"
run;

data inter1(drop = _Smiley Happy;
length _word CommentText $255;
set have;

do _i = 1 to 255 until(missing(_word));
  _word = scan(String, _i, ",");

  if find(_word, '"') then
   do;
    CommentNumber = input(scan(_word, 1, '"'), best15.);
    CommentText = scan(_word, 2, '"');
    output;
   end;
end;
run;

proc sort data=inter1 out=inter2;
by ID CommentNumber;
run;

data inter3;
length ConsolidatedCommentText $255;
retain ConsolidatedCommentText;
set inter2;
by ID CommentNumber;

if first.CommentNumber then
  call missing(ConsolidatedCommentText);

if missing(ConsolidatedCommentText) then
  ConsolidatedCommentText = CommentText;
else ConsolidatedCommentText = catx(".", ConsolidatedCommentText, CommentText);

if last.CommentNumber then
  output;
run;

proc transpose data=inter3 out=want(drop = _NAME_) prefix=comment;
by ID String;
id CommentNumber;
var ConsolidatedCommentText;
run;

Respected Advisor
Posts: 4,934

Re: data manipulation - string concatenation

Using PRX string matching functions :

data have;
length string $64;
input id string &;
datalines;
1     8"A",1,2,8"C"
2     9"Q",3
3     4,5,9"P",9"Z",8"3/7"
;

data want;
set have;
retain prx8 prx9;
length comment8 comment9 $32;
if _n_=1 then do;
     prx8 = prxparse('/\b8"[^"]+"/');
     prx9 = prxparse('/\b9"[^"]+"/');
     end;
start = 1;
do until (position=0);
     call prxnext(prx8, start, -1, string, position, length);
     comment8 = catx(". ", comment8, substrn(string, position+2, max(0,length-3)));
     end;
start = 1;
do until (position=0);
     call prxnext(prx9, start, -1, string, position, length);
     comment9 = catx(". ", comment9, substrn(string, position+2, max(0,length-3)));
     end;
drop start position length prx8 prx9;
run;

proc print; run;

PG

PG
Super User
Super User
Posts: 7,077

Re: data manipulation - string concatenation

You can scan the string using comma as a delimiter.  You can then scan the individual values (or "words") to find index number and the comment value.

The code below will work when the prefix on the words are single digit numbers. If the values could be larger than 9 then you will need to change the array and also the method used to convert the prefix on the word to an index value.  If you are positive that prefix values 0 to 7 could never have any comments then it would probably just be easiest to add a drop statement for those variables.

data want ;

  set have ;

  array comment (0:9) $40 comment0-comment9 ;

  do i=1 to countw(string,',');

    word = scan(string,i,',');

    index= input(word,1.);

    comment(index) = catx('. ',comment(index),scan(word,2,'"'));

  end;

  drop i word index ;

run;

Super User
Posts: 10,046

Re: data manipulation - string concatenation

How about this:

data have;
length String $255;
input ID String;
cards;
1     8"A",1,2,8"C"
2     9"Q",3
3     4,5,9"P",9"Z",8"3/7"
;
run;

data temp(keep=id a b );
 set have;
  do i=1 to countw(string,',');
   value=scan(string,i,',');
   a=scan(value,1,'"');
   b=scan(value,2,'"');
   if find(value,'"') then output;
  end;
run;
proc sort data=temp;
 by id a;
run;
data x(keep=id a x);
 set temp;
 by id a;
 length x $ 40;
 retain x;
 x=catx('.',x,b);
 if last.a then do;
                  output;
                      call missing(x);
                    end;
run;
proc transpose data=x out=want(drop=_name_) prefix=comment;
 by id;
 id a;
 var x;
run;



Ksharp

Regular Contributor
Posts: 228

Re: data manipulation - string concatenation

Thanks a lot everyone for your help! Smiley Happy

Great to know there are many different ways to achieve this.

Ask a Question
Discussion stats
  • 5 replies
  • 249 views
  • 9 likes
  • 5 in conversation