BookmarkSubscribeRSS Feed
rykwong
Quartz | Level 8

Dear friends at SAS community

I have a dataset that has 4000 observations.

 

In this dataset, I have a variable var1 which contains text data like this"[123/121/324/345]".  I need to extract the numbers as separate numeric variables.  For any observation, there can be up to 30 numbers in var1 that I need to extract:

 

so I am trying to extract new1=123, new2=121, new3=324 and new4=345.   I tried the following code:

new1=scan(compress(var1, '[]'),1,'/');

new2=scan(compress(var1, '[]'),2,'/');

new3=scan(compress(var1, '[]'),3,'/');

new4=scan(compress(var1, '[]'),4,'/');

 

but the results do not seem correct.  The last number is missing and never extracted.  Also, when an observation only has one number in var1, e.g. var1="[123]", new1 is missing rather than 123 that I want to have.  

 

so for var1="[123/121/324/345]"; I got new1=123, new2=121, new3=324, new4=. when I want new4=345

     for var1="[123]", I got new1=. when I want new1=123

 

any thoughts of a solution?

 

thanks 

Raymond

4 REPLIES 4
novinosrin
Tourmaline | Level 20

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;
rykwong
Quartz | Level 8

dear novinosrin,

 

thanks so much for your help.  one of such "var1" variables is cardiac_nuclear_tests_year so I ran the code below

data w;
k1=compress(cardiac_nuclear_tests_year,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;

I am still having problems, because this code puts num in a long format, instead of as separate variables year1, year2....year5.  Also other than cardiac_nuclear_tests_year, I have 10 other similar variables to extract from.   Any other thought?  many thanks 

 

 

novinosrin
Tourmaline | Level 20
data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
grp=0;
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
grp+1;
output;
end;
drop k1;
run;

proc transpose data=w out=want prefix=year;
by k;
var num;
id grp;
run;
ballardw
Super User

@rykwong wrote:

dear novinosrin,

 

thanks so much for your help.  one of such "var1" variables is cardiac_nuclear_tests_year so I ran the code below

data w; k1=compress(cardiac_nuclear_tests_year,'[]'); do _n_=1 to countw(k1); num=scan(k1,_n_,'/'); output; end; drop k1; run;

I am still having problems, because this code puts num in a long format, instead of as separate variables year1, year2....year5.  Also other than cardiac_nuclear_tests_year, I have 10 other similar variables to extract from.   Any other thought?  many thanks 

 

 


Perhaps:

data w;
   set have;
   k1=compress(cardiac_nuclear_tests_year,'[]');
   array num{30};
   do i=1 to countw(k1);
      num[i]=scan(k1,i,'/');
   end;
   drop k1 i;
run;

I consider creating a variable _n_ to be suboptimal as SAS has an automatic variable _n_ that indicates the iteration of the data step and could 1) be confusing in code and 2) mess up the count if needed.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1206 views
  • 0 likes
  • 3 in conversation