BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yoko
Obsidian | Level 7

I recently posted a question and found a solution (Can I use TRANWRD to replace multiple variable values with space in one line?). 

 

I would like to ask further questions regarding that post. 

In reality, I do have a data set x (please see below) and more countries to remove than to keep. 

 

The program below requires making a list of countries to be deleted. It will be troublesome if there are a lot of countries. 

Since countries to be kept are only a few.  Is it possible to make a list of countries to be kept and use it to get the same result?

 

 

 

##### 

data countries(keep=country rename=(country=del));
input country $ 1-20;
datalines;
USA
Canada
;


data x(drop=del);
input country $1-40;
do i = 1 to n;
set countries point=i nobs=n;
country = tranwrd(country, trim(del), ' ');
end;
datalines;
China USA Japan
Korea China
Canada
;
run;

proc print; run;

  

Obs country
1 China Japan
2 Korea China
3 

 ####

 

Thank you, 

 

Yoko

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

It didn't work correctly because my logic was slightly flawed. It was trying to simultaneously search across a string but, at the same time, change the strings contents.

 

I think that the following actually does work and will be easier to follow:

data keeps(keep=country rename=(country=del));
  input country $ 1-20;
  datalines;
Korea
China
;

data x(drop=del);
  array keepers(3) $ /*_temporary_*/;
  retain keepers;
  if _n_ eq 1 then do i = 1 to n;
    set keeps point=i nobs=n;
    keepers(i)=del;
  end;

  input country $1-40;
  original=country;

  i=1;
  do while(scan(original,i,' ') ne '');
    if scan(original,i,' ') not in keepers then
      country = tranwrd(country, scan(original,i,' '), '');
    i+1;
  end;
  datalines;
China USA Japan
Korea China USA
Canada USA Japan
;

Art, CEO, AnalystFinder.com

 

View solution in original post

10 REPLIES 10
art297
Opal | Level 21

You could always just reverse your current logic. e.g.:

data keeps(keep=country rename=(country=del));
  input country $ 1-20;
  datalines;
Korea
China
;

data x(drop=del);
  array keepers(99) $ _temporary_;
  retain keepers;
  if _n_ eq 1 then do i = 1 to n;
    set keeps point=i nobs=n;
    keepers(i)=del;
  end;

  input country $1-40;

  i=1;
  do while(scan(country,i,' ') ne '');
    if scan(country,i,' ') not in keepers then
      country = tranwrd(country, trim(scan(country,i,' ')), ' ');
    i+1;
  end;
  datalines;
China USA Japan
Korea China
Canada
;

Art, CEO, AnalystFinder.com

 

Yoko
Obsidian | Level 7

Hello, 

 

Thank you for the program. I run it and I got the same result as the one using the 'dropping' method. 

 

But, I realized that the data set called 'keeps' has only Korea and China. 

But, Japan was still kept as well. 

The program is a bit more complicated for me to tell why Japan was still kept.  (This is what I want, though)

Would you mind explaining this? 

 

Thank you, 

 

Yoko

 

art297
Opal | Level 21

It didn't work correctly because my logic was slightly flawed. It was trying to simultaneously search across a string but, at the same time, change the strings contents.

 

I think that the following actually does work and will be easier to follow:

data keeps(keep=country rename=(country=del));
  input country $ 1-20;
  datalines;
Korea
China
;

data x(drop=del);
  array keepers(3) $ /*_temporary_*/;
  retain keepers;
  if _n_ eq 1 then do i = 1 to n;
    set keeps point=i nobs=n;
    keepers(i)=del;
  end;

  input country $1-40;
  original=country;

  i=1;
  do while(scan(original,i,' ') ne '');
    if scan(original,i,' ') not in keepers then
      country = tranwrd(country, scan(original,i,' '), '');
    i+1;
  end;
  datalines;
China USA Japan
Korea China USA
Canada USA Japan
;

Art, CEO, AnalystFinder.com

 

Yoko
Obsidian | Level 7

Thank you for your suggestion.  It worked!  I also tried slightly differently to be similar to what I actually have to do.  

This worked too (please see below).  But, when I tried with my real data, the variable equivalent to 'country' was empty. 

Do you have any ideas regarding what I should be careful or should pay attention when applying this to a real data set? 

 

####

DATA myData;

input ID country $40.;
DATALINES;
1 China USA Japan
2 Korea China
3 Canada
;
run;

 

data keeps(keep=country rename=(country=del));
input country $ 1-20;
datalines;
Korea
China
Japan
;
run ;

 

data newData(drop=del);
set myData;
array keepers(3) $;
retain keepers;
if _n_ eq 1 then do i = 1 to n;
set keeps point=i nobs=n;
keepers(i)=del;
end;

original=country;

i=1;
do while(scan(original,i,' ') ne '');
if scan(original,i,' ') not in keepers then
country = tranwrd(country, scan(original,i,' '), '');
i+1;
end;
run;

proc print; run;

###

 

Thank you!

 

Yoko

art297
Opal | Level 21

You'll have to provide some examples of where it doesn't work. Some reasons why it wouldn't work include: capitalization differences (which you can correct with the upcase function), leading or trailing spaces (which you can correct with trim, left and/or strip functions), country names that include more than two words (which you can correct by separating countries with a delimiter other than space), and variable naming differences.

 

Art, CEO, AnalystFinder.com

 

Yoko
Obsidian | Level 7

My real variable has 4 alphabets (e.g., AAAA) instead of country names. 

When I submit the program, there is o error message. Simply, the variable 'country' is emply like this: 

 

Obs ID country keepers1 keepers2 keepers3 original
1     1                JJJJ         CCCC     UUUU JJJJ
2 2 KKKK CCCC KKKK CCCC JJJJ KKKK CCCC
3 3 KKKK CCCC JJJJ DDDD

Yoko
Obsidian | Level 7

My real variable names are 4-character codes like AAAA instead of country names. 

When I submit the program, I do not get any errors.  Simply the country in the results is empty like this (please see below) 

while country for ID1 should be CCCC JJJJ, country for ID2 should be KKKK JJJJ, and country for ID3  should be 'space'. 

The codes are all capital letters and separated by one space if there are multiple codes. 

In reality there are only 3 codes I want to keep just like my country name example, and there are about 50 codes that I want to replace with space. Lastly, my data is huge.  Not sure if these matter..

 

Obs ID country keepers1 keepers2 keepers3 original
1     1                KKKK      CCCC      JJJJ        CCCC UUUU JJJJ
2     2                KKKK      CCCC      JJJJ        KKKK CCCC
3     3                KKKK      CCCC      JJJJ        DDDD

 

Yoko

art297
Opal | Level 21

Post your keepers data set in the form of a data step, as well as post 10 actual records in the form of a data step.

 

Art, CEO, AnalystFinder.com

art297
Opal | Level 21

I would hope you just made a typo, as shouldn't ID2 result in: KKKK CCCC

 

The following appears to be working perfectly for me (p.s. same code as I posted before):

data keeps(keep=country rename=(country=del));
  input country $ 1-20;
  datalines;
KKKK
CCCC
JJJJ
;

data x(drop=del);
  array keepers(3) $ /*_temporary_*/;
  retain keepers;
  if _n_ eq 1 then do i = 1 to n;
    set keeps point=i nobs=n;
    keepers(i)=del;
  end;

  input country $1-40;
  original=country;

  i=1;
  do while(scan(original,i,' ') ne '');
    if scan(original,i,' ') not in keepers then
      country = tranwrd(country, scan(original,i,' '), '');
    i+1;
  end;
  datalines;
CCCC UUUU JJJJ
KKKK CCCC
DDDD
;

Result:Capture.JPG

 

Art, CEO, AnalystFinder.com

 

 

Yoko
Obsidian | Level 7

Yes, it was my typo!  I got the same results too. 

 

And my real data is like NEXJ SELK YVRC SFOA (not real codes, but something like these)..

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 8225 views
  • 0 likes
  • 2 in conversation