SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasseln
Obsidian | Level 7

Hey all,

 

I  am trying for a few days already to cut the last 8 digits from a string into a new variable and delete theses digits from the old one. It needs to be done over 100 variables, so it has to be within an array.

 

The original variables look like that:

 

1234b8:r:20060607

627::20071012

382027:l:20071212

182b::20071615

17293b:r:20081281

 

There are 100 of these kind.

I tried scan and substr, that works for a single variable, but somehow within the array, the values with a character between the :: get lost.

 

The result should look like that:

 

Var                Date

1234b8:r       20060607

382027:l       20071212

182b             20071615

17293b:r       20081281

 

 

Probably there is an easy way to do that, but I can't find it.

 

Thank you in advance

All the best!

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

The error shows you the problem:

dat_ops95=  dat_ops96=  dat_ops97=  dat_ops98=  dat_ops99=  dat_ops100=  i=101 _ERROR_=1 _N_=1

 

There is not data in dat_ops95, or 96 etc.  Therefore any substr() evaluation will fail as lengthn(string)=0 then -7 = -7, invalid for substr().

 

Again, post test data exactly as you have it, as a datastep.  Follow this post:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

It is impossible for me to cover data that you have if I don't know about it (i.e. you post on one column, there is 100 in the log).

 

Put an if around the calculation;

data test1;
  set test;
  array ops_ko ops_ko1-ops_ko100;    
  array dat_ops dat_ops1-dat_ops100;
  do i=1 to 100;
    if lengthn(ops_ko{i}) > 0 then do;
      dat_ops{i}=substr(ops_ko{i},lengthn(ops_ko{i})-7,10);
      ops_ko{i}=tranwrd(ops_ko{i},strip(dat_ops{i}),"");
    end;
  end;

run;

View solution in original post

21 REPLIES 21
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like:

data want;
  length thevar var date $50;
  thevar="1234b8:r:20060607";
  date=substr(thevar,lengthn(thevar)-7,10);
  var=tranwrd(thevar,strip(date),"");
run;
Kurt_Bremser
Super User

Try this:

data have;
input string : $20.;
cards;
1234b8:r:20060607
627::20071012
382027:l:20071212
182b::20071615
17293b:r:20081281
;
run;

data want;
set have;
length datestr $8;
format date yymmdd10.;
datestr = scan(string,-1,':','m');
date = input(datestr,yymmdd8.);
string = catx(':',scan(string,1,':','m'),scan(string,2,':','m'));
run;

proc print data=want noobs;
run;

The result:

string      datestr           date

1234b8:r    20060607    2006-06-07
627         20071012    2007-10-12
382027:l    20071212    2007-12-12
182b        20071615             .
17293b:r    20081281             .

Note that you have some invalid dates.

art297
Opal | Level 21

@RW9's code will result in a missing value for var if the first and second strings are the same.

 

The following is a slight modification of his code that corrects for that:

 

data have;
  length thevar $50;
  input;
  thevar=_infile_;
  cards;
1234b8:r:20060607
627::20071012
382027:l:20071212
182b::20071615
17293b:r:20081281
2008128120081281
;

data want;
  set have;
  length var date $50;
  date=substr(thevar,lengthn(thevar)-7,10);
  var=substr(thevar,1,lengthn(thevar)-8);
run;

HTH,

Art, CEO, AnalystFinder.com

 

Astounding
PROC Star

If you can do it for one variable accurately, using arrays shouldn't change the outcome.  One example:

 

data want;

set have;

array original {100} original_1 - original_100;

array dates {100} $ 8 date_1 - date_100;

do _n_=1 to 100;

   dates{_n_} = scan(original{_n_}, -1, ':');

   original{_n_} = substr(original{_n_}, 1, length(original{_n_})-9);

   if substr(original{_n_}, length(original{_n_}), 1) = ':' then substr(original{_n_}, length(original{_n_}), 1) = ' ';

end;

run;

 

It's untested, but I think I got the parentheses in the proper place.

DanielSantos
Barite | Level 11

Hi.

 

Another variation.

 

 

data want;
S1='1234b8:r:20060607';
S2=substr(S1,lengthn(S1)-7);
substr(S1,lengthn(S1)-7)='';
run;

Using the substr function on left of =

 

More on this here: https://v8doc.sas.com/sashtml/lgref/z0212267.htm

 

Daniel Santos @ www.cgd.pt

mansour_ibrahim
Obsidian | Level 7

Good morning,

I propose this solution

 

data test;
input var $50.;
cards;
1234b8:r:20060607
627::20071012
382027:l:20071212
182b::20070615
17293b:r:20081231
;run;


DATA test1 (keep=var: date);
format date yymmdd10.;
set test end=eof;
   length code $10.;
   IF _N_=1 THEN RE = PRXPARSE("/\:\d{8}/");
   retain RE;
   CALL PRXSUBSTR(RE,var,START,LENGTH); 
   IF START GT 0 THEN DO;    
   code = SUBSTRN(var,START + 1,LENGTH - 1); 
   date =input (code,yymmdd10.);
   var2=SUBSTRN(var,1,LENGTH(var) - LENGTH);
   output;
   end; 
run ;
sasseln
Obsidian | Level 7

Thank you all!

But all of the codes make my log file running full when they are used within an array.

Why is that and is there a solution for it?

 

@mansour_ibrahim

Somehow it works fine for the first variable, but when used in an array, in the second variable the date sometimes isn't cut off.

Do you know why?

 

ops_ko{i} is the variable.

 

DATA testila;
set test;

array ops_ko ops_ko1-ops_ko100;    
array dat_ops dat_ops1-dat_ops100;    
do i=1 to 100;    

   if _N_=1 then RE = PRXPARSE("/\:\d{8}/");             
   retain RE;                                            
   call PRXSUBSTR(RE,ops_ko{i},START,LENGTH);              
   if START GT 0 then do;                                
   dat_ops{i} = SUBSTRN(ops_ko{i},START + 1,LENGTH - 1);    
   ops_ko{i} = SUBSTRN(ops_ko{i},1,LENGTH(ops_ko{i}) - 9);   
   output;
   end;

end;
drop i Start Length RE;
run;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then your doing something other than the suggestions posted here.  My datastep above should be printed to the log, followed by three other text rows with note, and two rows of time taken to run.  Nothing more.  Without seeing your log/code we can't tell.  Is it your setup, are you running other things, are you printing something to log etc?

sasseln
Obsidian | Level 7

Thanks for the fast reply!

I am only running this code:


data test1;
  set test;

  array ops_ko ops_ko1-ops_ko100;    
  array dat_ops dat_ops1-dat_ops100;
  do i=1 to 100;

  dat_ops{i}=substr(ops_ko{i},lengthn(ops_ko{i})-7,10);
  ops_ko{i}=tranwrd(ops_ko{i},strip(dat_ops{i}),"");
  end;

run;

 

Thank you!

mansour_ibrahim
Obsidian | Level 7
data test;
input  @1 var $17. @18 var1 $20.;
cards;
1234b8:r:20060607 1234b8:r:20060607
152::20071012     627::20091012
382027:l:20071212 382027:l:20051212
182b::20070615    182b::20010615
17293b:r:20081231 17293b:r:20001231
;run;

DATA test1;
format date date1 yymmdd10.; 
set test;
array vr (*) $ var var1; 
array cd (*) $ code code1;
array dt(*) $ date date1;
array pr (*) $ prt1 prt2;  
do i=1 to dim(vr);
   if _N_=1 then RE = PRXPARSE("/\:\d{8}/");             
   retain RE;                                            
   call PRXSUBSTR(RE,vr(i),START,LENGTH);              
   if START GT 0 then do;                                
   cd{i} = SUBSTRN(vr{i},START + 1,LENGTH - 1);
   dt(i) =input (cd(i),yymmdd10.);
   pr(i)=SUBSTRN(vr(i),1,LENGTH(vr(i)) - LENGTH);
      
   end;
end;
drop i Start Length RE;
run;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Can you post the log?  Are you getting lots of notes or something?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Can you post the log?  Are you getting lots of notes or something?

sasseln
Obsidian | Level 7

Hi RW9,

 

this is the code:

 

data test2;
  set test;
  array ops_ko ops_ko1-ops_ko100;    
  array dat_ops dat_ops1-dat_ops100;
  do i=1 to 100;
  *length ops_ko date $50;
  dat_ops{i}=substr(ops_ko{i},lengthn(ops_ko{i})-7,10);
  ops_ko{i}=substr(ops_ko{i},1,lengthn(ops_ko{i})-8);
  end;
run;

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, you have actually posted the bit of the log right between important bits.  I am specifically looking at notes/errors/warnings.  The reason being is this kind of output occurs when you substr invalid data, or input invalid numeric data or something along those lines.  What SAS does is print out all the data for that observation where that error occurs, hence as all your data seems to have the same problem, i.e. a logic problem, then you are getting large amounts of output in the log.

It is likely down to your array statements:

  array ops_ko ops_ko1-ops_ko100;    

These create numeric arrays, and you are trying to put substring information into them.  Try chaning to something like:

  array ops_ko $200. ops_ko1-ops_ko100; 

 

Also note, this is a very good example of why it is recommended to post test data - in the form of a datastep - so that we can see exactly what you are working with.  I do not see what you do, nor can I run what you are running!  

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 21 replies
  • 2964 views
  • 9 likes
  • 8 in conversation