BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ss59
Obsidian | Level 7

Hi all,

 

My column names of my data looks like following:

TR_ABC_S1_ABC_MNO_S3TR_ABC_S2_ABC_MNO_S3TR_ABC_S10_ABC_S2TR_ABC_S2_ABC_S10TR_ABC_S2_ABC_PQR_S3TR_XYZ_PQR_S1_XYZ_MNO_S3
110231


What I intend to do is split the column name in two, splitting it after the first number. I also intend to remove the 'TR_' that it starts with. The output I'm looking for is:

 

TR_ABC_S1_ABC_MNO_S3TR_ABC_S2_ABC_MNO_S3TR_ABC_S10_ABC_S2TR_ABC_S2_ABC_S10TR_ABC_S2_ABC_PQR_S3TR_XYZ_PQR_S1_XYZ_MNO_S3
ABC_S1ABC_S2ABC_S10ABC_S2ABC_S2XYZ_PQR_S1
MNO_S3MNO_S3ABC_S2ABC_S10ABC_PQR_S3XYZ_MNO_S3

 

I tried doing it using substr and scan for the first number in the string within it, but can't get it right. Could you please help? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input TR_ABC_S1_ABC_MNO_S3	TR_ABC_S2_ABC_MNO_S3	TR_ABC_S10_ABC_S2	TR_ABC_S2_ABC_S10	TR_ABC_S2_ABC_PQR_S3	TR_XYZ_PQR_S1_XYZ_MNO_S3 ;
cards;
1	1	0	2	3	1
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;
data want;
 set temp;
 _name_=prxchange('s/^tr_//i',1,_name_);
pid=prxparse('/[_a-z]+\d+/i');
s=1;e=length(_name_);
call prxnext(pid,s,e,_name_,p,l);
do while(p>0);
  temp=prxchange('s/^_//',1,substr(_name_,p,l));
  output;
  call prxnext(pid,s,e,_name_,p,l);
end;
drop s e p l pid;
run;

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like:

data want (orig res j);
  set have;
  length orig res $200;
  array tr{6) tr_:;
  do i=1 to 6;
    orig=vname(tr{i});
    do j=1 to countw(orig,"_");
      res=scan(orig,i,"_");
      output;
    end;
  end;
run;

This will create a nice long list of original variable names, and then each part of the name split by _.  You can then cat() the ones you want back together, i.e. 2 and 3 for first part, then 5 and 6 for second.

Note, if you post test data in the form of a datastep then working code can be provided.

Second note, transposed is always harder to program with than normalised (going down).  Also don't put data in variable/table names, it causes issues like this.  Use a parameter/result approach in normalised structure - will make your life much easier.

ss59
Obsidian | Level 7

Thanks @RW9.

 

I have managed to get it to columns, so I'm looking to get Name1 and Name2 from Name like the following:

 

NameName1Name2
TR_ABC_S1_ABC_MNO_S3ABC_S1MNO_S3
TR_ABC_S2_ABC_MNO_S3ABC_S2MNO_S3
TR_ABC_S10_ABC_S2ABC_S10ABC_S2
TR_ABC_S2_ABC_S10ABC_S2ABC_S10
TR_ABC_S2_ABC_PQR_S3ABC_S2ABC_PQR_S3
TR_XYZ_PQR_S1_XYZ_MNO_S3XYZ_PQR_S1XYZ_MNO_S3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then:

data want;
  set have;
  length name1 name2 $200;
  name1=catx("_",scan(name,2,"_"),scan(name,3,"_"));
  name2=catx("_",scan(name,5,"_"),scan(name,6,"_"));
run;
ss59
Obsidian | Level 7

I don't think you can put fixed positions like '2', '3' etc since they are shaped differently. 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then loop over them:

data want;
  set have;
array name{2} $200;
in=1;
do i=1 to countw(name,"_");
if lengthn(compress(scan(name,i,"-")," ","d")) > 0 then do;
name{in}=catx('_',scan(name,i-1,"_"),scan(name,i,"_"));
in=in+1;
end;
end; run;  

Note assumes maximum of 2 items with a number in,  I.e. this checks each part, if it contains a digit then cat previous and current.   

Ksharp
Super User
data have;
input TR_ABC_S1_ABC_MNO_S3	TR_ABC_S2_ABC_MNO_S3	TR_ABC_S10_ABC_S2	TR_ABC_S2_ABC_S10	TR_ABC_S2_ABC_PQR_S3	TR_XYZ_PQR_S1_XYZ_MNO_S3 ;
cards;
1	1	0	2	3	1
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;
data want;
 set temp;
 _name_=prxchange('s/^tr_//i',1,_name_);
pid=prxparse('/[_a-z]+\d+/i');
s=1;e=length(_name_);
call prxnext(pid,s,e,_name_,p,l);
do while(p>0);
  temp=prxchange('s/^_//',1,substr(_name_,p,l));
  output;
  call prxnext(pid,s,e,_name_,p,l);
end;
drop s e p l pid;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 3834 views
  • 4 likes
  • 3 in conversation