BookmarkSubscribeRSS Feed
rykwong
Quartz | Level 8

Dear friends at SAS community

I have a dataset that has 4000 observations.

 

In this dataset, I have a variable var1 which contains text data like this"[123/121/324/345]".  I need to extract the numbers as separate numeric variables.  For any observation, there can be up to 30 numbers in var1 that I need to extract:

 

so I am trying to extract new1=123, new2=121, new3=324 and new4=345.   I tried the following code:

new1=scan(compress(var1, '[]'),1,'/');

new2=scan(compress(var1, '[]'),2,'/');

new3=scan(compress(var1, '[]'),3,'/');

new4=scan(compress(var1, '[]'),4,'/');

 

but the results do not seem correct.  The last number is missing and never extracted.  Also, when an observation only has one number in var1, e.g. var1="[123]", new1 is missing rather than 123 that I want to have.  

 

so for var1="[123/121/324/345]"; I got new1=123, new2=121, new3=324, new4=. when I want new4=345

     for var1="[123]", I got new1=. when I want new1=123

 

any thoughts of a solution?

 

thanks 

Raymond

4 REPLIES 4
novinosrin
Tourmaline | Level 20

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;
rykwong
Quartz | Level 8

dear novinosrin,

 

thanks so much for your help.  one of such "var1" variables is cardiac_nuclear_tests_year so I ran the code below

data w;
k1=compress(cardiac_nuclear_tests_year,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;

I am still having problems, because this code puts num in a long format, instead of as separate variables year1, year2....year5.  Also other than cardiac_nuclear_tests_year, I have 10 other similar variables to extract from.   Any other thought?  many thanks 

 

 

novinosrin
Tourmaline | Level 20
data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
grp=0;
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
grp+1;
output;
end;
drop k1;
run;

proc transpose data=w out=want prefix=year;
by k;
var num;
id grp;
run;
ballardw
Super User

@rykwong wrote:

dear novinosrin,

 

thanks so much for your help.  one of such "var1" variables is cardiac_nuclear_tests_year so I ran the code below

data w; k1=compress(cardiac_nuclear_tests_year,'[]'); do _n_=1 to countw(k1); num=scan(k1,_n_,'/'); output; end; drop k1; run;

I am still having problems, because this code puts num in a long format, instead of as separate variables year1, year2....year5.  Also other than cardiac_nuclear_tests_year, I have 10 other similar variables to extract from.   Any other thought?  many thanks 

 

 


Perhaps:

data w;
   set have;
   k1=compress(cardiac_nuclear_tests_year,'[]');
   array num{30};
   do i=1 to countw(k1);
      num[i]=scan(k1,i,'/');
   end;
   drop k1 i;
run;

I consider creating a variable _n_ to be suboptimal as SAS has an automatic variable _n_ that indicates the iteration of the data step and could 1) be confusing in code and 2) mess up the count if needed.

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1152 views
  • 0 likes
  • 3 in conversation