BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ss59
Obsidian | Level 7

Hi all,

 

My column names of my data looks like following:

TR_ABC_S1_ABC_MNO_S3TR_ABC_S2_ABC_MNO_S3TR_ABC_S10_ABC_S2TR_ABC_S2_ABC_S10TR_ABC_S2_ABC_PQR_S3TR_XYZ_PQR_S1_XYZ_MNO_S3
110231


What I intend to do is split the column name in two, splitting it after the first number. I also intend to remove the 'TR_' that it starts with. The output I'm looking for is:

 

TR_ABC_S1_ABC_MNO_S3TR_ABC_S2_ABC_MNO_S3TR_ABC_S10_ABC_S2TR_ABC_S2_ABC_S10TR_ABC_S2_ABC_PQR_S3TR_XYZ_PQR_S1_XYZ_MNO_S3
ABC_S1ABC_S2ABC_S10ABC_S2ABC_S2XYZ_PQR_S1
MNO_S3MNO_S3ABC_S2ABC_S10ABC_PQR_S3XYZ_MNO_S3

 

I tried doing it using substr and scan for the first number in the string within it, but can't get it right. Could you please help? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input TR_ABC_S1_ABC_MNO_S3	TR_ABC_S2_ABC_MNO_S3	TR_ABC_S10_ABC_S2	TR_ABC_S2_ABC_S10	TR_ABC_S2_ABC_PQR_S3	TR_XYZ_PQR_S1_XYZ_MNO_S3 ;
cards;
1	1	0	2	3	1
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;
data want;
 set temp;
 _name_=prxchange('s/^tr_//i',1,_name_);
pid=prxparse('/[_a-z]+\d+/i');
s=1;e=length(_name_);
call prxnext(pid,s,e,_name_,p,l);
do while(p>0);
  temp=prxchange('s/^_//',1,substr(_name_,p,l));
  output;
  call prxnext(pid,s,e,_name_,p,l);
end;
drop s e p l pid;
run;

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like:

data want (orig res j);
  set have;
  length orig res $200;
  array tr{6) tr_:;
  do i=1 to 6;
    orig=vname(tr{i});
    do j=1 to countw(orig,"_");
      res=scan(orig,i,"_");
      output;
    end;
  end;
run;

This will create a nice long list of original variable names, and then each part of the name split by _.  You can then cat() the ones you want back together, i.e. 2 and 3 for first part, then 5 and 6 for second.

Note, if you post test data in the form of a datastep then working code can be provided.

Second note, transposed is always harder to program with than normalised (going down).  Also don't put data in variable/table names, it causes issues like this.  Use a parameter/result approach in normalised structure - will make your life much easier.

ss59
Obsidian | Level 7

Thanks @RW9.

 

I have managed to get it to columns, so I'm looking to get Name1 and Name2 from Name like the following:

 

NameName1Name2
TR_ABC_S1_ABC_MNO_S3ABC_S1MNO_S3
TR_ABC_S2_ABC_MNO_S3ABC_S2MNO_S3
TR_ABC_S10_ABC_S2ABC_S10ABC_S2
TR_ABC_S2_ABC_S10ABC_S2ABC_S10
TR_ABC_S2_ABC_PQR_S3ABC_S2ABC_PQR_S3
TR_XYZ_PQR_S1_XYZ_MNO_S3XYZ_PQR_S1XYZ_MNO_S3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then:

data want;
  set have;
  length name1 name2 $200;
  name1=catx("_",scan(name,2,"_"),scan(name,3,"_"));
  name2=catx("_",scan(name,5,"_"),scan(name,6,"_"));
run;
ss59
Obsidian | Level 7

I don't think you can put fixed positions like '2', '3' etc since they are shaped differently. 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then loop over them:

data want;
  set have;
array name{2} $200;
in=1;
do i=1 to countw(name,"_");
if lengthn(compress(scan(name,i,"-")," ","d")) > 0 then do;
name{in}=catx('_',scan(name,i-1,"_"),scan(name,i,"_"));
in=in+1;
end;
end; run;  

Note assumes maximum of 2 items with a number in,  I.e. this checks each part, if it contains a digit then cat previous and current.   

Ksharp
Super User
data have;
input TR_ABC_S1_ABC_MNO_S3	TR_ABC_S2_ABC_MNO_S3	TR_ABC_S10_ABC_S2	TR_ABC_S2_ABC_S10	TR_ABC_S2_ABC_PQR_S3	TR_XYZ_PQR_S1_XYZ_MNO_S3 ;
cards;
1	1	0	2	3	1
;
run;
proc transpose data=have(obs=0) out=temp;
var _all_;
run;
data want;
 set temp;
 _name_=prxchange('s/^tr_//i',1,_name_);
pid=prxparse('/[_a-z]+\d+/i');
s=1;e=length(_name_);
call prxnext(pid,s,e,_name_,p,l);
do while(p>0);
  temp=prxchange('s/^_//',1,substr(_name_,p,l));
  output;
  call prxnext(pid,s,e,_name_,p,l);
end;
drop s e p l pid;
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 4090 views
  • 4 likes
  • 3 in conversation