BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sathish_jammy
Lapis Lazuli | Level 10

Dear Experts,

 

I share the sample dataset which matches my requirement for the study.

I have two datasets contains 2 columns varX varY. 

data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;

The above code list the alphabets individually refers to C_value.

The below code refers to the combination of alphabets, So according to alphabets I need to match the C_values on varY1.

data have2;
input varX1$;
cards;
abc/def
ghi/mno
jkl/def
;

Required Output like:

abc/defC1/C2
ghi/mnoC3/C5
jkl/defC4/C2

 

Kindly suggest a code to solve the task.

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Hash-object to the rescue ... maybe ...

Not sure that i understood the c0-problem at all, updated have-datesets would be useful.

 

data want;
   if 0 then set have1;

   set have2;

   length varY1 $ 10;

   if _n_ = 1 then do;
      declare hash h(dataset: 'have1');
      h.defineKey('varx');
      h.defineData('vary');
      h.defineDone();
   end;

   varY1 = '';

   do i = 1 to countw(varx1, '/');
      varX = scan(varX1, i, '/');

      if h.find() = 0 then do;
         varY1 = catx('/', varY1, varY);
      end;
      else do;
         varY1 = catx('/', varY1, 'C0');
      end;
   end;

   keep VarX1 VarY1;
run;

View solution in original post

12 REPLIES 12
ed_sas_member
Meteorite | Level 14

Hi @Sathish_jammy 

 

You can try this code:

data _null_;
	set have1;
	call symputx(varX,varY);
run;

data want;
	set have2;
	varY1 = catx("/",symget(scan(varX1,1,"/")),symget(scan(varX1,2,"/")));
run;

 Capture d’écran 2019-12-16 à 10.20.24.png

 

Sathish_jammy
Lapis Lazuli | Level 10

Thank you so much @ed_sas_member  @gamotte  

Both of your code works well.

 

Additionally, I required one more condition for this task.

Assume that I mentioned any new combination of 3alphabets in (have1(varX1)) which not listed in (varX(have))

How could I get the result as 'C0' in varY1.

For example :

abc/mno/uvx  - C1/C5/C0

ghi/pqr/jkl  - C3/C0/C4

 

It required because Once I found the C0 so I could add that new value in my list manually like

uvx - C6

pqr - C7

 

gamotte
Rhodochrosite | Level 12

You can use the coalescec function :

 

proc sql noprint;
    CREATE TABLE want AS
    SELECT a.varX1, catx('/', coalescec(b.varY,'C0'), coalescec(c.varY,'C0'), coalescec(d.varY,'C0')) AS newVar
    FROM have2 a
    LEFT JOIN have1 b
    ON b.varX=scan(a.varX1,1,'/')
    LEFT JOIN have1 c
    ON c.varX=scan(a.varX1,2,'/')
    LEFT JOIN have1 d
    ON d.varX=scan(a.varX1,3,'/')
    ORDER BY a.varX1
    ;
quit;
Sathish_jammy
Lapis Lazuli | Level 10

I appreciate your response @gamotte 

It results in C0 for the new value but also at the end of all observation were it not requires.

like:

abc/def - C1/C2/C0

ghi/mno - C3/C5/C0

Could you please help me to sort out the C0 at the unrequired Observations.

Much Thanks!

gamotte
Rhodochrosite | Level 12

Here is another approach with a format generated by the dataset have1 :

 

data ctrl;
   length label $ 11;
   set have1(rename=(varX=start varY=label)) end=last;
   end=start;
   retain fmtname 'CN' type 'c';
   output;
   if last then do;
      hlo='O';
      label='C0';
      output;
   end;
run;

proc format library=work cntlin=ctrl;
run;


data want;
    set have2;
    length newvar $20;
    do i=1 to countw(varX1,'/');
        call catx('/', newVar, put(scan(varX1,i,'/'),CN.));
    end;
run;
andreas_lds
Jade | Level 19

Hash-object to the rescue ... maybe ...

Not sure that i understood the c0-problem at all, updated have-datesets would be useful.

 

data want;
   if 0 then set have1;

   set have2;

   length varY1 $ 10;

   if _n_ = 1 then do;
      declare hash h(dataset: 'have1');
      h.defineKey('varx');
      h.defineData('vary');
      h.defineDone();
   end;

   varY1 = '';

   do i = 1 to countw(varx1, '/');
      varX = scan(varX1, i, '/');

      if h.find() = 0 then do;
         varY1 = catx('/', varY1, varY);
      end;
      else do;
         varY1 = catx('/', varY1, 'C0');
      end;
   end;

   keep VarX1 VarY1;
run;
gamotte
Rhodochrosite | Level 12

You can generate codes for the missing combinations as follows :

 

data temp;
    if _N_=0 then set have1; /* retrieve column formats */
    set have2;
    keep varX varY;
    do i=1 to countw(varX1,'/');
        varX=scan(varX1,i,'/');
        output;
    end;
run;

proc sql noprint;
    CREATE TABLE new_vals AS 
    SELECT DISTINCT * FROM temp
    WHERE varX NOT IN SELECT varX FROM have1;

    SELECT max(input(substr(varY,2),best.))+1 INTO :maxindex 
    FROM have1;
quit;

data new_vals (drop=index);
    set new_vals;
    if _N_=1 then index=symgetn('maxindex');
    varY=cats('C', index);
    index+1;
run;

proc append base=have1 data=new_vals;
run;
s_lassen
Meteorite | Level 14

Here is another solution: Use SET with KEY= in a datastep:

data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;run;

data have2;
lengh varx1 $20;
input varX1;
cards;
abc/def
ghi/mno
jkl/def
abc/mno/uvx
ghi/pqr/jkl
;run;

proc sql;
  create index varx on have1(varx);
quit;

data want;
  set have2;
  do _N_=1 to countw(varX1,'/');
    varx=scan(varx1,_N_,'/');
    set have1 key=varx/unique;
    length varY1 $20;
    if _iorc_ then do;
      _error_=0;
      call catx('/',varY1,'C0');
      end;
    else call catx('/',VarY1,VarY);
    end;
  keep varX1 VarY1;
  run;
ed_sas_member
Meteorite | Level 14

Hi @Sathish_jammy 

 

I have update the code using macro variables. Let me know if it works!

Best,

 

/* Retrieve the list of words that are mentionned in have1 but not in have1 */
/* Put each value in a macrovariable with value = "C0"  */

data have2_tr (keep = VarX);
	if _n_=1 then set have1 (keep=VarX);
	set have2;
	do i=1 to countw(varX1,"/");
		VarX1_tr = scan(varX1,i,"/");
		VarX = VarX1_tr;
		output;
	end;
run;

proc sort data=have2_tr nodupkey;
	by VarX;
run;

data have2_not_in_have1;
	merge have1 (in=x keep=VarX) have2_tr (in=y);
	by VarX;
	if x=0 and y=1;
	call symputx(VarX,"C0");
run;

/* Retrieve the list of words that are mentionned in have1 			*/
/* Put each value in a macrovariable with the value = varY's value  */

data _null_;
	set have1;
	call symputx(varX,varY);
run;

/* Output the results */

data want;
	set have2;
	varY1 = symget(scan(varX1,1));
	do i = 1 to countw(varX1,"/")-1;
		varY1 = catx("/",varY1,symget(scan(varX1,i+1,"/")));
	end;
	drop i;
run;

 

 

NB: The dataset have2_not_in_have1 will give you the list of values not referenced in have 1 and mentioned as C0 in the final dataset 

 

gamotte
Rhodochrosite | Level 12

Hello,

 

proc sql noprint;
    CREATE TABLE want AS
    SELECT a.varX1, cats(b.varY,'/',c.varY)
    FROM have2 a
    LEFT JOIN have1 b
    ON b.varX=scan(a.varX1,1,'/')
    LEFT JOIN have1 c
    ON c.varX=scan(a.varX1,2,'/')
    ORDER BY a.varX1
    ;
quit;

novinosrin
Tourmaline | Level 20

Sorry, Late to the party

 



data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;

data have2;
input varX1$;
cards;
abc/def
ghi/mno
jkl/def
;

data want ;
 if _n_=1 then do;
 if 0 then set have1;
   dcl hash H (dataset:'have1',ordered:'y') ;
   h.definekey  ("vary") ;
   h.definedata ("varX","varY") ;
   h.definedone () ;
   dcl hiter hi('h');
 end;
set have2;
length want $50;
do while(hi.next()=0);
 if index(varx1, strip(varx))>0 then want=catx('/',want,varY);
end;
keep varx1 want;
run;

 

novinosrin
Tourmaline | Level 20

another Hash direct find

 



data have1;
input varX$ varY$;
cards;
abc C1
def C2
ghi C3
jkl C4
mno C5
;

data have2;
input varX1$;
cards;
abc/def
ghi/mno
jkl/def
;

data want ;
 if _n_=1 then do;
 if 0 then set have1;
   dcl hash H (dataset:'have1') ;
   h.definekey  ("varx") ;
   h.definedata ("varY") ;
   h.definedone();
 end;
set have2;
length want $50;
do _n_=1 to countw(varX1,'/');
 _t=scan(varx1,_n_,'/');
 if h.find(key:_t)=0 then want=catx('/',want,varY);
end;
keep varx1 want;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1098 views
  • 5 likes
  • 6 in conversation