Solved: Substituting 3rd element in the list

athapa1183 · Posted 12-15-2016 08:56 AM

Hi Everyone,

i have a looping and substitution problem in sas

The Data is

data have;
infile datalines;
input string :$150.;
datalines;
stevdend*kdioror*kdkdkd
SVD*H2459**HC:93000**1~
fakdlkfjajdf*kdkfkldlk*kdkdk
dkfjaokldjfkj*jkdkjfk*kdkkd
woososdlsld*ldlldkdka*lkswkd
SVD*H2459**HC:1090F:8P**0~
qwwkfjajdf*grekfkldlk*grgrrgkdk
xcvxcvaokldjfkj*aaadddrkjfk*qqksodjidf
woososdlsld*ldlldkdka*lkswkd
SVD*H2459**HC:1234K:4P**0~
;

the rows given to us are 2,6,10 (there are 2060 such rows)

so whenever the row numbers are given, I want to populate the 3rd element with *0*

for example in row 2 in the data have, I want this SVD*H2459*0*HC:93000**1~ (The 3rd element is populated with 0 now).

I want to this for all rows in the given list

MY CODE

%let list = 2,6,10;

data want;
set have;
if _N_ in (&list) then do ;
string= tranwrd(string,"**","*0*");
end;
run;

THE problem is it also changes the other blank in the same row to 0, which i don't want.

I also tried the Substr(string,11,1) = '0'; but it didn't help

I also tried to use the scan function but it doesn't recognizes the blank (**) between stars.

Could Someone please help.

Astounding · Posted 12-15-2016 09:22 AM

You're most of the way there. Your program has the right structure to it, and it's only a question of what

goes inside the DO group. Here's one possibility:

found = index(string, '**');

string = substr(string, 1, found) || '0' || substr(string, found+1);

View solution in original post

Reeza · Posted 12-15-2016 09:20 AM

Use SCAN to extract the components and then CATX() to recombine.

athapa1183 · Posted 12-15-2016 10:02 AM

Hi Reeza, As i mentioned the scan does not recognize the ** as a blank and is skipping to the next element with no blanks.

Kurt_Bremser · Posted 12-15-2016 10:06 AM

@athapa1183 wrote:
Hi Reeza, As i mentioned the scan does not recognize the ** as a blank and is skipping to the next element with no blanks.

Note the dsd option in @Reeza's code. That's what causes the input statement to recognize two separators as a missing value.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Astounding · Posted 12-15-2016 09:22 AM

You're most of the way there. Your program has the right structure to it, and it's only a question of what

goes inside the DO group. Here's one possibility:

found = index(string, '**');

string = substr(string, 1, found) || '0' || substr(string, found+1);

athapa1183 · Posted 12-15-2016 10:02 AM

Thank You Astounding..

Kurt_Bremser · Posted 12-15-2016 09:28 AM

First of all, I find it extremely sloppy to run logic per row number. One slight change in the dataset, and you end up with garbage.

I'd rather have the logic controlled by the fact that the first "word" in the string is 'SVD'.

data want;
set have;
length newstring $150;
if scan(string,1,'*') = 'SVD'
then do;
  do i = 1 to countw(string,'*');
    if i = 3
    then newstring = catx('*',trim(newstring),'0');
    else newstring = catx('*',trim(newstring),scan(string,i,'*'));
  end;
  string = newstring;
end;
drop i newstring;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

athapa1183 · Posted 12-15-2016 10:07 AM

Thank you KurtBremser. There are many SVD that we don't need to change that is why we need to go by the given rows. But your code logic is really helpful.

Kurt_Bremser · Posted 12-15-2016 10:09 AM

@athapa1183 wrote:
Thank you KurtBremser. There are many SVD that we don't need to change that is why we need to go by the given rows. But your code logic is really helpful.

There has to be some logic which lines need to be changed. Implement that logic right here in that data step, or you will end up footing yourself in the shoot.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

gamotte · Posted 12-15-2016 09:52 AM

Since the "*" seems to be a separator between the different elements of each row, why don't you use it

to read your data in separate columns ? It makes further processing much simpler.

data want;
infile datalines dlm="*" missover dsd;
format var1-var6  $20.;
input var1-var6;

if var3=" " then var3="0";

datalines;
stevdend*kdioror*kdkdkd
SVD*H2459**HC:93000**1~
fakdlkfjajdf*kdkfkldlk*kdkdk
dkfjaokldjfkj*jkdkjfk*kdkkd
woososdlsld*ldlldkdka*lkswkd
SVD*H2459**HC:1090F:8P**0~
qwwkfjajdf*grekfkldlk*grgrrgkdk
xcvxcvaokldjfkj*aaadddrkjfk*qqksodjidf
woososdlsld*ldlldkdka*lkswkd
SVD*H2459**HC:1234K:4P**0~
;
run;

athapa1183 · Posted 12-15-2016 10:04 AM

Thank you it seems to work but I need to concatanate all 6 columns at end.
it is really insightful.

gamotte · Posted 12-15-2016 10:28 AM

You can concatenate the columns with catx but it won't take missing values into account (ie, you

won't have two consecutive asteriks when a variable is missing).

Otherwise, you can use cat and strip but you have to list each variable. Both cases are illustrated below :

options mlogic mprint;

data want;
infile datalines dlm="*" missover dsd;
format var1-var6  $20. var7 $150.;
input var1-var6;

if var3=" " then var3="0";

/* Other operations ... */

var7=catx("*",of var1-var6);
var8=cat(strip(var1),"*",strip(var2),"*", strip(var3),"*", strip(var4),"*",strip(var5),"*",strip(var6));


datalines;
stevdend*kdioror*kdkdkd
SVD*H2459**HC:93000**1~
fakdlkfjajdf*kdkfkldlk*kdkdk
dkfjaokldjfkj*jkdkjfk*kdkkd
woososdlsld*ldlldkdka*lkswkd
SVD*H2459**HC:1090F:8P**0~
qwwkfjajdf*grekfkldlk*grgrrgkdk
xcvxcvaokldjfkj*aaadddrkjfk*qqksodjidf
woososdlsld*ldlldkdka*lkswkd
SVD*H2459**HC:1234K:4P**0~
;
run;

With the second method, you have to remove the trailing "***" on the non SVD rows if you want to keep the original structure of rows.

athapa1183 · Posted 12-15-2016 03:42 PM

it is clear with the cat & strip option.
Thank you

slchen · Posted 12-15-2016 09:55 AM

It is handy with regular expression.

data _null_;
  string="SVD*H2459**HC:1090F:8P**0~";
  insert=prxchange('s/\*\*/*0*/',1,string);
  put _all_;
run;

athapa1183 · Posted 12-15-2016 10:05 AM

Thank you So much. But the expression differ.

Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Re: Substituting 3rd element in the list

Registration is open

SAS Training: Just a Click Away