BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sathish_jammy
Lapis Lazuli | Level 10

Dear Experts,

 

I share the sample dataset which matches my requirement for the study.

I have two datasets contains 2 columns varX varY. 

data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;

The above code list the alphabets individually refers to C_value.

The below code refers to the combination of alphabets, So according to alphabets I need to match the C_values on varY1.

data have2;
input varX1$;
cards;
abc/def
ghi/mno
jkl/def
;

Required Output like:

abc/defC1/C2
ghi/mnoC3/C5
jkl/defC4/C2

 

Kindly suggest a code to solve the task.

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Hash-object to the rescue ... maybe ...

Not sure that i understood the c0-problem at all, updated have-datesets would be useful.

 

data want;
   if 0 then set have1;

   set have2;

   length varY1 $ 10;

   if _n_ = 1 then do;
      declare hash h(dataset: 'have1');
      h.defineKey('varx');
      h.defineData('vary');
      h.defineDone();
   end;

   varY1 = '';

   do i = 1 to countw(varx1, '/');
      varX = scan(varX1, i, '/');

      if h.find() = 0 then do;
         varY1 = catx('/', varY1, varY);
      end;
      else do;
         varY1 = catx('/', varY1, 'C0');
      end;
   end;

   keep VarX1 VarY1;
run;

View solution in original post

12 REPLIES 12
ed_sas_member
Meteorite | Level 14

Hi @Sathish_jammy 

 

You can try this code:

data _null_;
	set have1;
	call symputx(varX,varY);
run;

data want;
	set have2;
	varY1 = catx("/",symget(scan(varX1,1,"/")),symget(scan(varX1,2,"/")));
run;

 Capture d’écran 2019-12-16 à 10.20.24.png

 

Sathish_jammy
Lapis Lazuli | Level 10

Thank you so much @ed_sas_member  @gamotte  

Both of your code works well.

 

Additionally, I required one more condition for this task.

Assume that I mentioned any new combination of 3alphabets in (have1(varX1)) which not listed in (varX(have))

How could I get the result as 'C0' in varY1.

For example :

abc/mno/uvx  - C1/C5/C0

ghi/pqr/jkl  - C3/C0/C4

 

It required because Once I found the C0 so I could add that new value in my list manually like

uvx - C6

pqr - C7

 

gamotte
Rhodochrosite | Level 12

You can use the coalescec function :

 

proc sql noprint;
    CREATE TABLE want AS
    SELECT a.varX1, catx('/', coalescec(b.varY,'C0'), coalescec(c.varY,'C0'), coalescec(d.varY,'C0')) AS newVar
    FROM have2 a
    LEFT JOIN have1 b
    ON b.varX=scan(a.varX1,1,'/')
    LEFT JOIN have1 c
    ON c.varX=scan(a.varX1,2,'/')
    LEFT JOIN have1 d
    ON d.varX=scan(a.varX1,3,'/')
    ORDER BY a.varX1
    ;
quit;
Sathish_jammy
Lapis Lazuli | Level 10

I appreciate your response @gamotte 

It results in C0 for the new value but also at the end of all observation were it not requires.

like:

abc/def - C1/C2/C0

ghi/mno - C3/C5/C0

Could you please help me to sort out the C0 at the unrequired Observations.

Much Thanks!

gamotte
Rhodochrosite | Level 12

Here is another approach with a format generated by the dataset have1 :

 

data ctrl;
   length label $ 11;
   set have1(rename=(varX=start varY=label)) end=last;
   end=start;
   retain fmtname 'CN' type 'c';
   output;
   if last then do;
      hlo='O';
      label='C0';
      output;
   end;
run;

proc format library=work cntlin=ctrl;
run;


data want;
    set have2;
    length newvar $20;
    do i=1 to countw(varX1,'/');
        call catx('/', newVar, put(scan(varX1,i,'/'),CN.));
    end;
run;
andreas_lds
Jade | Level 19

Hash-object to the rescue ... maybe ...

Not sure that i understood the c0-problem at all, updated have-datesets would be useful.

 

data want;
   if 0 then set have1;

   set have2;

   length varY1 $ 10;

   if _n_ = 1 then do;
      declare hash h(dataset: 'have1');
      h.defineKey('varx');
      h.defineData('vary');
      h.defineDone();
   end;

   varY1 = '';

   do i = 1 to countw(varx1, '/');
      varX = scan(varX1, i, '/');

      if h.find() = 0 then do;
         varY1 = catx('/', varY1, varY);
      end;
      else do;
         varY1 = catx('/', varY1, 'C0');
      end;
   end;

   keep VarX1 VarY1;
run;
gamotte
Rhodochrosite | Level 12

You can generate codes for the missing combinations as follows :

 

data temp;
    if _N_=0 then set have1; /* retrieve column formats */
    set have2;
    keep varX varY;
    do i=1 to countw(varX1,'/');
        varX=scan(varX1,i,'/');
        output;
    end;
run;

proc sql noprint;
    CREATE TABLE new_vals AS 
    SELECT DISTINCT * FROM temp
    WHERE varX NOT IN SELECT varX FROM have1;

    SELECT max(input(substr(varY,2),best.))+1 INTO :maxindex 
    FROM have1;
quit;

data new_vals (drop=index);
    set new_vals;
    if _N_=1 then index=symgetn('maxindex');
    varY=cats('C', index);
    index+1;
run;

proc append base=have1 data=new_vals;
run;
s_lassen
Meteorite | Level 14

Here is another solution: Use SET with KEY= in a datastep:

data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;run;

data have2;
lengh varx1 $20;
input varX1;
cards;
abc/def
ghi/mno
jkl/def
abc/mno/uvx
ghi/pqr/jkl
;run;

proc sql;
  create index varx on have1(varx);
quit;

data want;
  set have2;
  do _N_=1 to countw(varX1,'/');
    varx=scan(varx1,_N_,'/');
    set have1 key=varx/unique;
    length varY1 $20;
    if _iorc_ then do;
      _error_=0;
      call catx('/',varY1,'C0');
      end;
    else call catx('/',VarY1,VarY);
    end;
  keep varX1 VarY1;
  run;
ed_sas_member
Meteorite | Level 14

Hi @Sathish_jammy 

 

I have update the code using macro variables. Let me know if it works!

Best,

 

/* Retrieve the list of words that are mentionned in have1 but not in have1 */
/* Put each value in a macrovariable with value = "C0"  */

data have2_tr (keep = VarX);
	if _n_=1 then set have1 (keep=VarX);
	set have2;
	do i=1 to countw(varX1,"/");
		VarX1_tr = scan(varX1,i,"/");
		VarX = VarX1_tr;
		output;
	end;
run;

proc sort data=have2_tr nodupkey;
	by VarX;
run;

data have2_not_in_have1;
	merge have1 (in=x keep=VarX) have2_tr (in=y);
	by VarX;
	if x=0 and y=1;
	call symputx(VarX,"C0");
run;

/* Retrieve the list of words that are mentionned in have1 			*/
/* Put each value in a macrovariable with the value = varY's value  */

data _null_;
	set have1;
	call symputx(varX,varY);
run;

/* Output the results */

data want;
	set have2;
	varY1 = symget(scan(varX1,1));
	do i = 1 to countw(varX1,"/")-1;
		varY1 = catx("/",varY1,symget(scan(varX1,i+1,"/")));
	end;
	drop i;
run;

 

 

NB: The dataset have2_not_in_have1 will give you the list of values not referenced in have 1 and mentioned as C0 in the final dataset 

 

gamotte
Rhodochrosite | Level 12

Hello,

 

proc sql noprint;
    CREATE TABLE want AS
    SELECT a.varX1, cats(b.varY,'/',c.varY)
    FROM have2 a
    LEFT JOIN have1 b
    ON b.varX=scan(a.varX1,1,'/')
    LEFT JOIN have1 c
    ON c.varX=scan(a.varX1,2,'/')
    ORDER BY a.varX1
    ;
quit;

novinosrin
Tourmaline | Level 20

Sorry, Late to the party

 



data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;

data have2;
input varX1$;
cards;
abc/def
ghi/mno
jkl/def
;

data want ;
 if _n_=1 then do;
 if 0 then set have1;
   dcl hash H (dataset:'have1',ordered:'y') ;
   h.definekey  ("vary") ;
   h.definedata ("varX","varY") ;
   h.definedone () ;
   dcl hiter hi('h');
 end;
set have2;
length want $50;
do while(hi.next()=0);
 if index(varx1, strip(varx))>0 then want=catx('/',want,varY);
end;
keep varx1 want;
run;

 

novinosrin
Tourmaline | Level 20

another Hash direct find

 



data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;

data have2;
input varX1$;
cards;
abc/def
ghi/mno
jkl/def
;

data want ;
 if _n_=1 then do;
 if 0 then set have1;
   dcl hash H (dataset:'have1') ;
   h.definekey  ("varx") ;
   h.definedata ("varY") ;
   h.definedone();
 end;
set have2;
length want $50;
do _n_=1 to countw(varX1,'/');
 _t=scan(varx1,_n_,'/');
 if h.find(key:_t)=0 then want=catx('/',want,varY);
end;
keep varx1 want;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2228 views
  • 5 likes
  • 6 in conversation