How to perform this multiple by variables merge?

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 137
Accepted Solution

How to perform this multiple by variables merge?

Hey SAS groups,

I am trying to figure out the best way to write a SAS program for my need. I would appreciate if any of you can provide me the simplest and best method to arrive at a best possible solution.

I have a table 1 called Brand_details /*Sample below*/ with Variables Band_Names of all automobiles list and Brand_Code and Brand_Ref:

Brand_Names (includes all models)             Brand_Code                 Brand_Ref

Ford                                                                     F1                              Autos.LMV

Ford escort                                                           F1A                            Autos.LMV

ford fiesta                                                               F2A                         /* This is value is either Autos.LMV or Autos.MMV or Autos.HMV where LMV stands for light motor vehicle, MMV for Mid

ford fusion                                                           F3A                              Motor Vehicle and HMV stands for heavy motor vehicle*/

ford ka                                                                   F4A                            Autos.MMV

ford focus                                                             F5A                             Autos.HMV

ford mondeo                                                         F6A

GM                                                                     GM_101

Hyundai                                                              A1_HYU

hyundai sonata                                                    A2_HYU

hundai accent                                                      HYU_Z2b

hyundai getz                                                       HYU_ 9j

Jaguar                                                               Jag_89j

Fiat                                                                   Fiat_2k

Ferrari                                                               Fer_2k

Lamborgini                                                          Lamb_i12

Suzuki                                                                Suki_45

Nissan                                                                 N18

----------------------------and so on----------------------------------------------------------

Table 2 called LMV_Class:

Brand_Ref                    Hyundai                         Mazda         Ford                     Nissan ....................     /* and so on listing all brands as variables, so this dataset is wide*/

Autos.LMV               Hyundai accent                  Mazda2       ford escort             Nissan quest

Autos.LMV               Hyundai sonata                  Mazda3       ford fusion              Nissan Micra

............_______so on__________

Table 3 called MMV_Class:

Brand_Ref                    Hyundai                        Mazda         Ford                     Nissan ....................     /* and so on listing all brands as variables, so this dataset is wide*/

Autos.MMV               Hyundai getz                  Mazda4           ford ikon             Nissan altima

Autos.MMV               Hyundai EON                  Mazda5          ford ka                Nissan Teana

.........._______so on__________

Table 4 called HMV_Class:

Brand_Ref                    Hyundai                        Mazda             Ford                     Nissan ....................     /* and so on listing all brands as variables, so this dataset is wide*/

Autos.HMV                    Hyundai BlueON             Mazda CX          Ford figo               Nissan Titan

Autos.HMV                    Hyundai i20                     Mazda 6          Ford Mondeo          Nissan Armada

I need to merge Brand_details and  LMV_class to get the Brand_code like:

PROC SQL;

SELECT brand_names, brand_code, brand_ref

  FROM Brand_details 

  RIGHT JOIN

  LMV_Class

  ON brand_details.Brand_names = LMV_Class.Hyundai and brand_details.brand_ref=LMV.Class.Brand_ref;

QUIT;

PROC SQL;

SELECT brand_names, brand_code, brand_ref

  FROM Brand_details 

  RIGHT JOIN

  LMV_Class

  ON brand_details.Brand_names = MMV_Class.Mazda and brand_details.brand_ref=MMV.Class.Brand_ref;;

QUIT;

So it seems like for one LMV_Class I have to do that MANY right joins one by one each time with Hyundai, Mazda and all in the variable list. This would mean I will have to write so many SQL joins just to complete one part that is LMV_Class. And the same process will have to repeat for other classes MMV_CLASS and HMV_Class. Any better solutions please that is more powerful and shorter?

Many thanks,

Charlotte from England


Accepted Solutions
Solution
‎06-26-2013 09:11 AM
Super User
Posts: 10,041

Re: How to perform this multiple by variables merge?

Posted in reply to CharlotteCain

OK. Charlotte,

It is too late in China. I have to go to sleep. If you have other questions , We may have a meet tomorrow.

data Brand_details;
input Brand_Names  & $20.       Brand_Code   & $20.               Brand_Ref      & $20.  ;
cards;
Ford                                                                     F1                              Autos.LMV
Ford escort                                                           F1A                            Autos.LMV
ford fiesta                                                               F2A                    Autos.LMV 
ford fusion                                                           F3A                        Autos.LMV
ford ka                                                                   F4A                            Autos.MMV
ford focus                                                             F5A                             Autos.HMV
ford mondeo                                                         F6A  Autos.LMV
GM                                                                     GM_101  Autos.LMV
Hyundai                                                              A1_HYU  Autos.LMV
Hyundai sonata                                                    A2_HYU  Autos.LMV
Hyundai accent                                                      HYU_Z2b  Autos.LMV
Hyundai getz                                                       HYU_ 9j  Autos.LMV
Jaguar                                                               Jag_89j  Autos.LMV
Fiat                                                                   Fiat_2k  Autos.MMV
Ferrari                                                               Fer_2k   Autos.MMV
Lamborgini                                                          Lamb_i12   Autos.MMV
Suzuki                                                                Suki_45  Autos.HMV
Nissan                                                                 N18  Autos.HMV
;
run;
data LMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.LMV               Hyundai accent                  Mazda2       ford escort             Nissan quest
Autos.LMV               Hyundai sonata                  Mazda3       ford fusion              Nissan Micra
;
run;
data  MMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.MMV               Hyundai getz                  Mazda4           ford ikon             Nissan altima
Autos.MMV               Hyundai EON                  Mazda5          ford ka                Nissan Teana
;
run;
data  HMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.HMV                    Hyundai BlueON             Mazda CX          Ford figo               Nissan Titan
Autos.HMV                    Hyundai i20                     Mazda 6          Ford Mondeo          Nissan Armada
;
run;
data temp;
 set LMV_Class MMV_Class HMV_Class;
run;

data want(keep=Brand_Names Brand_Code Brand_Ref Brand); 
 if _n_ eq 1 then do;
  if 0 then set Brand_details ;
  declare hash ha(dataset:'Brand_details');
   ha.definekey('Brand_Names','Brand_Ref');
   ha.definedata('Brand_Code');
   ha.definedone();
 end;
set temp;
 array x{*} Hyundai -- Nissan;
 do i=1 to dim(x);
  Brand_Names=x{i};     Brand=vname(x{i});
  rc=ha.find();
  output;
  call missing(Brand_Code);
 end;
run;
proc sort data=want;by Brand Brand_Ref;run;
data want;
 set want;
 by Brand Brand_Ref;
 if first.Brand_Ref then n=0;
 n+1;
 Brand_Ref=cats(Brand_Ref,put(n,z2.));
run;

Good Luck.

Ksharp

View solution in original post


All Replies
Super User
Posts: 10,041

Re: How to perform this multiple by variables merge?

Posted in reply to CharlotteCain

What output are you looking for .posting it will better explain your question.

Ksharp

Frequent Contributor
Posts: 137

Re: How to perform this multiple by variables merge?

Hi Ksharp,

Thanks for the response.The output i am looking for is a report that has the following columns:

Brand_ref        Brand_code        Brand_Names

  Autos.LMV       HYU_Z2b           Hyundai accent

1. I want the list to start from LMV that is Autos.LMV, one the LMV list is done, followed in Autos.MMV and the 3rd to follow is Autos.HMV.

2. Please notice the Brand_Names in the Brand_details has entire names of all brands listed for all the 3 classes LMV, MMV and HMV making it a complete descriptor table, however only CERTAIN BRAND NAMES ARE INCLUDED IN THE LMV, MMV AND HMV datasets. For example, if hyundai has 20 models(brand names) that are listed in Brand_details, the idea here is to only include only those Hyundai models(brand_names) in LMV,MMV and HMV datasets that may have only 5,8, or 9 names  corresponds to it in the Brand_details dataset. Therefore the right join i guess, but no point as it seems writing too many joins or merges.

Basically the report is consolidated form of LMV, MMV and HMV. The issue here is Brand_names in Brand details dataset is actually equal to variable values of Hyundai and all other brands in the LMV, MMV and HMV datasets. Hence, these 3 happen to very wide datasets. II hope i am not confusing you, if you wish let me know should i write in a better way you can comprehend better.

Thanks,

Charlotte

Super User
Posts: 10,041

Re: How to perform this multiple by variables merge?

Posted in reply to CharlotteCain

Hi. Charlotte ,

It looks like not too difficult .

data Brand_details;
input Brand_Names  & $20.       Brand_Code   & $20.               Brand_Ref      & $20.  ;
cards;
Ford                                                                     F1                              Autos.LMV
Ford escort                                                           F1A                            Autos.LMV
ford fiesta                                                               F2A                    Autos.LMV 
ford fusion                                                           F3A                        Autos.LMV
ford ka                                                                   F4A                            Autos.MMV
ford focus                                                             F5A                             Autos.HMV
ford mondeo                                                         F6A  Autos.LMV
GM                                                                     GM_101  Autos.LMV
Hyundai                                                              A1_HYU  Autos.LMV
Hyundai sonata                                                    A2_HYU  Autos.LMV
Hyundai accent                                                      HYU_Z2b  Autos.LMV
Hyundai getz                                                       HYU_ 9j  Autos.LMV
Jaguar                                                               Jag_89j  Autos.LMV
Fiat                                                                   Fiat_2k  Autos.MMV
Ferrari                                                               Fer_2k   Autos.MMV
Lamborgini                                                          Lamb_i12   Autos.MMV
Suzuki                                                                Suki_45  Autos.HMV
Nissan                                                                 N18  Autos.HMV
;
run;
data LMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.LMV               Hyundai accent                  Mazda2       ford escort             Nissan quest
Autos.LMV               Hyundai sonata                  Mazda3       ford fusion              Nissan Micra
;
run;
data  MMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.MMV               Hyundai getz                  Mazda4           ford ikon             Nissan altima
Autos.MMV               Hyundai EON                  Mazda5          ford ka                Nissan Teana
;
run;
data  HMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.HMV                    Hyundai BlueON             Mazda CX          Ford figo               Nissan Titan
Autos.HMV                    Hyundai i20                     Mazda 6          Ford Mondeo          Nissan Armada
;
run;
data temp;
 set LMV_Class MMV_Class HMV_Class;
run;

data want(keep=Brand_Names Brand_Code Brand_Ref ); 
 if _n_ eq 1 then do;
  if 0 then set Brand_details ;
  declare hash ha(dataset:'Brand_details');
   ha.definekey('Brand_Names','Brand_Ref');
   ha.definedata('Brand_Code');
   ha.definedone();
 end;
set temp;
 array x{*} Hyundai -- Nissan;
 do i=1 to dim(x);
  Brand_Names=x{i};
  rc=ha.find();
  output;
  call missing(Brand_Code);
 end;
run;

Ksharp

Super Contributor
Posts: 334

Re: How to perform this multiple by variables merge?

gave you a great way to do it with hash objects. Not knowing if the posted data accurately reflects your data, I would also suggest upcase your variables to maximize the match rate. This might not be necessary if all the data is all from the same original source.

EJ

Frequent Contributor
Posts: 137

Re: How to perform this multiple by variables merge?

Hi Ksharp,

Thanks for that awesome approach. I missed out on one more variable to include in the requirement,So  if i am allowed to bother you again, i would add it here. Please accept my apologies. On the output report,I need to assign an ID index variable. For example, on each brand names that is comes from LMV,MMV and HMV datsets  I would need  to assign like:

lets say for Hyundai, if there were 5 brand names in LMV-

ID_INDEX

Autos.LMV.01

Autos.LMV.02

Autos.LMV.03

Autos.LMV.04

Autos.LMV.05

and the same follows for other brands within LMV, MMV and HMV.

I thought of  Id_index= Brand_ref || (_N_,Z2.);

/*unfortunately the above counts though the entire obs, and the need is the count should start and end for each brand within the categories*/

Is it possible if you can modify in that pass or in a new datastep. Your help would really be most appreciated.

Thanks,

Charlotte

Solution
‎06-26-2013 09:11 AM
Super User
Posts: 10,041

Re: How to perform this multiple by variables merge?

Posted in reply to CharlotteCain

OK. Charlotte,

It is too late in China. I have to go to sleep. If you have other questions , We may have a meet tomorrow.

data Brand_details;
input Brand_Names  & $20.       Brand_Code   & $20.               Brand_Ref      & $20.  ;
cards;
Ford                                                                     F1                              Autos.LMV
Ford escort                                                           F1A                            Autos.LMV
ford fiesta                                                               F2A                    Autos.LMV 
ford fusion                                                           F3A                        Autos.LMV
ford ka                                                                   F4A                            Autos.MMV
ford focus                                                             F5A                             Autos.HMV
ford mondeo                                                         F6A  Autos.LMV
GM                                                                     GM_101  Autos.LMV
Hyundai                                                              A1_HYU  Autos.LMV
Hyundai sonata                                                    A2_HYU  Autos.LMV
Hyundai accent                                                      HYU_Z2b  Autos.LMV
Hyundai getz                                                       HYU_ 9j  Autos.LMV
Jaguar                                                               Jag_89j  Autos.LMV
Fiat                                                                   Fiat_2k  Autos.MMV
Ferrari                                                               Fer_2k   Autos.MMV
Lamborgini                                                          Lamb_i12   Autos.MMV
Suzuki                                                                Suki_45  Autos.HMV
Nissan                                                                 N18  Autos.HMV
;
run;
data LMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.LMV               Hyundai accent                  Mazda2       ford escort             Nissan quest
Autos.LMV               Hyundai sonata                  Mazda3       ford fusion              Nissan Micra
;
run;
data  MMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.MMV               Hyundai getz                  Mazda4           ford ikon             Nissan altima
Autos.MMV               Hyundai EON                  Mazda5          ford ka                Nissan Teana
;
run;
data  HMV_Class;
input Brand_Ref       & $20.             Hyundai        & $20.                 Mazda      & $20.   Ford      & $20.             Nissan & $20.;
cards;
Autos.HMV                    Hyundai BlueON             Mazda CX          Ford figo               Nissan Titan
Autos.HMV                    Hyundai i20                     Mazda 6          Ford Mondeo          Nissan Armada
;
run;
data temp;
 set LMV_Class MMV_Class HMV_Class;
run;

data want(keep=Brand_Names Brand_Code Brand_Ref Brand); 
 if _n_ eq 1 then do;
  if 0 then set Brand_details ;
  declare hash ha(dataset:'Brand_details');
   ha.definekey('Brand_Names','Brand_Ref');
   ha.definedata('Brand_Code');
   ha.definedone();
 end;
set temp;
 array x{*} Hyundai -- Nissan;
 do i=1 to dim(x);
  Brand_Names=x{i};     Brand=vname(x{i});
  rc=ha.find();
  output;
  call missing(Brand_Code);
 end;
run;
proc sort data=want;by Brand Brand_Ref;run;
data want;
 set want;
 by Brand Brand_Ref;
 if first.Brand_Ref then n=0;
 n+1;
 Brand_Ref=cats(Brand_Ref,put(n,z2.));
run;

Good Luck.

Ksharp

Frequent Contributor
Posts: 137

Re: How to perform this multiple by variables merge?

Hi Ksharp,

Good Morning from England, It is just 6 AM here. Thank you so much indeed for going the extra mile in helping out with the solution whilst I am extremely sorry that I didn't realise it was pretty late there in China last night considering the time difference. Nonetheless, I sincerely appreciate the much needed amazing help. I am about to get ready for the day's work and I will test your code on my application and will sincerely update you the soonest.

I am basically writing this as a matter of courtesy to thank you more as you worked to give me a  solution compromising your sleep. Awesome Sir!!.

Many Thanks Indeed,

Charlotte from England

Super User
Posts: 10,041

Re: How to perform this multiple by variables merge?

Posted in reply to CharlotteCain

Hi. Charlotte,

You're welcome . Now at China it is 01:00 pm .Hope you can get your job done.

If you need make a new index variable, try this:

data want;
 set want;
 by Brand Brand_Ref;
 if first.Brand_Ref then n=0;
 n+1;
 Index_Brand_Ref=cats(Brand_Ref,put(n,z2.));      /*********/

Ksharp

Message was edited by: xia keshan

Frequent Contributor
Posts: 137

Re: How to perform this multiple by variables merge?

Hi Ksharp,

Good Morning from England,

I ran the solution provided by you perfectly well on my application with some minor adjustment to great success. I can't thank you enough for the rather proactive super fast response that helped at a time when i needed the most. I really appreciate so much for the invaluable help.

Well before we go off this thread, I would like you to leave me and other new aspiring SAS users a point of advice in ways of learning SAS fast to be able to reach a standard that would let me accomplish quality programming competing with my self day by day. That would help me and others who are keen a great deal. To be honest, I have been using SAS for just over a month

I have noticed your name being pretty distinguished earlier when Pierre Gagnon suggested your name among likes of few others who are great solution providers in this forum.

Thanks very much indeed,

Charlotte

Super User
Posts: 10,041

Re: How to perform this multiple by variables merge?

Posted in reply to CharlotteCain

Ha. Charlotte,

Glad to hear it is working.

" point of advice in ways of learning SAS fast to be able to reach a standard that would let me accomplish quality programming competing with my self day by day."

I have no some advice. Just keep learning SAS by reading its documentation which contains everything about SAS. And most important thing is to continue to visit this forum and see & learn other SAS user's question and solution.  I growed up from this excellent forum when I was a rookie, Patrick , data _null_; , Cynthia , Arthur.Carpenter,Peter.C  gave me lots of help ( Actually they all have emerged up when I firstly step into this forum ).

I might be disappeared  again for a while after two weeks, But don't worried I will come back. Here have lots of my fun.

"I have been using SAS for just over a month "

That is not long enough, at least one or two year learning . you will find your sas skill is good enough.

"I have noticed your name being pretty distinguished earlier when Pierre Gagnon suggested your name"

Thanks PG mentioned me. He is a sas expert .

Xia Keshan

From China

Message was edited by: xia keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 762 views
  • 1 like
  • 3 in conversation