DATA Step, Macro, Functions and more

error extracting numeric from a text

Reply
Contributor
Posts: 53

error extracting numeric from a text

Dear friends at SAS community

I have a dataset that has 4000 observations.

 

In this dataset, I have a variable var1 which contains text data like this"[123/121/324/345]".  I need to extract the numbers as separate numeric variables.  For any observation, there can be up to 30 numbers in var1 that I need to extract:

 

so I am trying to extract new1=123, new2=121, new3=324 and new4=345.   I tried the following code:

new1=scan(compress(var1, '[]'),1,'/');

new2=scan(compress(var1, '[]'),2,'/');

new3=scan(compress(var1, '[]'),3,'/');

new4=scan(compress(var1, '[]'),4,'/');

 

but the results do not seem correct.  The last number is missing and never extracted.  Also, when an observation only has one number in var1, e.g. var1="[123]", new1 is missing rather than 123 that I want to have.  

 

so for var1="[123/121/324/345]"; I got new1=123, new2=121, new3=324, new4=. when I want new4=345

     for var1="[123]", I got new1=. when I want new1=123

 

any thoughts of a solution?

 

thanks 

Raymond

PROC Star
Posts: 1,833

Re: error extracting numeric from a text

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;
Contributor
Posts: 53

Re: error extracting numeric from a text

Posted in reply to novinosrin

dear novinosrin,

 

thanks so much for your help.  one of such "var1" variables is cardiac_nuclear_tests_year so I ran the code below

data w;
k1=compress(cardiac_nuclear_tests_year,'[]');
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
output;
end;
drop k1;
run;

I am still having problems, because this code puts num in a long format, instead of as separate variables year1, year2....year5.  Also other than cardiac_nuclear_tests_year, I have 10 other similar variables to extract from.   Any other thought?  many thanks 

 

 

PROC Star
Posts: 1,833

Re: error extracting numeric from a text

data w;
k="[123/121/324/345]";
k1=compress(k,'[]');
grp=0;
do _n_=1 to countw(k1);
num=scan(k1,_n_,'/');
grp+1;
output;
end;
drop k1;
run;

proc transpose data=w out=want prefix=year;
by k;
var num;
id grp;
run;
Super User
Posts: 13,583

Re: error extracting numeric from a text


@rykwong wrote:

dear novinosrin,

 

thanks so much for your help.  one of such "var1" variables is cardiac_nuclear_tests_year so I ran the code below

data w; k1=compress(cardiac_nuclear_tests_year,'[]'); do _n_=1 to countw(k1); num=scan(k1,_n_,'/'); output; end; drop k1; run;

I am still having problems, because this code puts num in a long format, instead of as separate variables year1, year2....year5.  Also other than cardiac_nuclear_tests_year, I have 10 other similar variables to extract from.   Any other thought?  many thanks 

 

 


Perhaps:

data w;
   set have;
   k1=compress(cardiac_nuclear_tests_year,'[]');
   array num{30};
   do i=1 to countw(k1);
      num[i]=scan(k1,i,'/');
   end;
   drop k1 i;
run;

I consider creating a variable _n_ to be suboptimal as SAS has an automatic variable _n_ that indicates the iteration of the data step and could 1) be confusing in code and 2) mess up the count if needed.

 

Ask a Question
Discussion stats
  • 4 replies
  • 61 views
  • 0 likes
  • 3 in conversation