BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
triunk
Obsidian | Level 7

Hello

 

I have a problem that cannot wrap my head around it... :

I have this table:

HAVE WANT
ID1 ID2 ID3 ID
    K02563087 K02563087
0.035501      
.0V0339435 K03394375 K03394375
KO3499516 K03499516 K03499516
K03508960 K03339796 K03339796 K03339796
000030164219NCZUTL02 K03415332 K03415332
1E+09 K02748355 K02748355
EWOLFENWAFEN K03584884 K03584884
EBILHOITE    
ERICHARSON115 K03656588 K03656588
K03516138 K01972934 K01972934
K03516140   K03516140

 

The first three columns are what I have and the fourth column is the ideal scenario.

How would you approach this?

 

Takis

 

P.S. The correct format of ID is letter K and then 8 numbers

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

I have added some validity code:

data want;
 set have;
       array idx $ id1-id3;  /* original data */

     /* assuming ID is the last of valid IDs */
      do i=1 to 3;
           idx(i) = upcase(idx(i));    /* ensure capital K */
           if  substr(idx(i),1,1) = 'K' and
               length(compress(idx(i),'K',kd) = 8  
            then id = idx(i);
     end;
run;     

as space is non valid ID then the last valid is choosed.

View solution in original post

11 REPLIES 11
Shmuel
Garnet | Level 18

Try next logic/code:

 

data want;
 set have;
       array idx $ id1-id3;  /* original data */

     /* assuming ID is the last of valid IDs */
      do i=1 to 3;
           if  substr(idx(i),1,1) = 'K' then id = idx(i);
     end;
run;     

Check:

    - can an IDx start with 'K' but have more alphabetic characters ?

     - can an IDx have less then 8 digits after the 'K' ?

     - Is it always capital 'K' ( never 'k') ?

     - Is there any case you will prefer the first IDx instead the last valid ?

triunk
Obsidian | Level 7

Hi Shmuel,

 

Thank you for these input and great questions:

 

Check:

    - can an IDx start with 'K' but have more alphabetic characters ?

No it has to start with K and then have numbers if you see in ID1 one of the IDs has letter O instead of 0

 

     - can an IDx have less then 8 digits after the 'K' ?

Nope

     - Is it always capital 'K' ( never 'k') ?

yes it should be K although people may type it as k

 

     - Is there any case you will prefer the first IDx instead the last valid ?

yes if it is empty the ID3 I would prefere ID1 or ID2 that has this format

 

I hope now it is more clear. Thank you again very much!

Shmuel
Garnet | Level 18

I have added some validity code:

data want;
 set have;
       array idx $ id1-id3;  /* original data */

     /* assuming ID is the last of valid IDs */
      do i=1 to 3;
           idx(i) = upcase(idx(i));    /* ensure capital K */
           if  substr(idx(i),1,1) = 'K' and
               length(compress(idx(i),'K',kd) = 8  
            then id = idx(i);
     end;
run;     

as space is non valid ID then the last valid is choosed.

cgprog
Fluorite | Level 6

Hi Shmuel I was so excited and accepted your solution too soon.

 

When I tried to run it it gives me this error:

 

12813 data want3;
12814 set have;
12815 array idx $ id1-id3; /* original data */
12816
12817 /* assuming ID is the last of valid IDs */
12818 do i=1 to 3;
12819 idx(i) = upcase(idx(i)); /* ensure capital K */
12820 if substr(idx(i),1,1) = 'K' and length(compress(idx(i),'K',kd) = 8

 


------
72
12821 then id = idx(i);
---- -
388 79
202
ERROR 72-185: The LENGTH function call has too many arguments.

ERROR 388-185: Expecting an arithmetic operator.

ERROR 79-322: Expecting a ).

ERROR 202-322: The option or parameter is not recognized and will be ignored.

12822 end;
12823 run;

 

Now as you see that I have one more comment that I did not make clear...

If I have in ID1: KO3524846 (letter O instead of number 0) but nothing on ID2 and ID3 how I can include in the code a correction of ID1 to K03524846 (all numbers)

If I have ID1:K03508960 (good format) and different ID2:K03339796 (good also format) how I can include in the code to take ID1 as correct? 

 

I attach the table here again with those two revised examples:

Shmuel
Garnet | Level 18

Sorry for my late answer, as I was on vacation.

Just in case you did not overcome yourself the error:

12820 if substr(idx(i),1,1) = 'K' and length(compress(idx(i),'K',kd) = 8

 


------
72
12821 then id = idx(i);

the code line should be:

if substr(idx(i),1,1) = 'K' and length(compress(idx(i)) , 'K' , kd ) = 8
then id = idx(i);

I have added a  ) at the right place.

PeterClemmensen
Tourmaline | Level 20
data have;
input (id1-id3)(:$20.);
infile datalines dsd missover dlm=',';
datalines;
,,K02563087
0.035501,,     
.0V0339435,K03394375, 
KO3499516,K03499516, 
K03508960,K03339796,K03339796
000030164219NCZUTL02,K03415332,
1E+09,K02748355, 
EWOLFENWAFEN,K03584884, 
EBILHOITE,,
ERICHARSON115,K03656588, 
K03516138,K01972934, 
K03516140,,
;

data want(drop=i j);
   set have;
   array id{3} $ id1-id3;
   do i=1 to dim(id);
      if char(id[i], 1) ne 'K' then id[i]=" ";
   end;
   do j=dim(id) to 1 by -1;
      if not missing(id[j]) then do;
         newvar=id[j];
         leave;
      end;
   end;
run;
ghosh
Barite | Level 11
WantID = COALESCEC(ID3, ID2, ID1); 

 

Then test for the validity of WantID (ie starting with K followed by 8 numbers)

Docs here for the function: Coalescec function

triunk
Obsidian | Level 7

a good way to check validity thank you! 

ghosh
Barite | Level 11

The validity test is for you to code.
But COALESCEC will give you the ID value you need based on the non blank values of ID3 to ID1. Please note the order of the variables

Shmuel
Garnet | Level 18

@ghosh , please pay attention that according to documentation - COALESCE:

"Returns the first nonmissing value from a list of character arguments."

and that it not what wanted according to given sample.

ghosh
Barite | Level 11

@Shmuel
It's actually the COALESCEC fn

 

From his sample, I understood he wants to pick the last of the IDn values if valid 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 2489 views
  • 6 likes
  • 5 in conversation