Hey SAS groups,
I am trying to figure out the best way to write a SAS program for my need. I would appreciate if any of you can provide me the simplest and best method to arrive at a best possible solution.
I have a table 1 called Brand_details /*Sample below*/ with Variables Band_Names of all automobiles list and Brand_Code and Brand_Ref:
Brand_Names (includes all models) Brand_Code Brand_Ref
Ford F1 Autos.LMV
Ford escort F1A Autos.LMV
ford fiesta F2A /* This is value is either Autos.LMV or Autos.MMV or Autos.HMV where LMV stands for light motor vehicle, MMV for Mid
ford fusion F3A Motor Vehicle and HMV stands for heavy motor vehicle*/
ford ka F4A Autos.MMV
ford focus F5A Autos.HMV
ford mondeo F6A
GM GM_101
Hyundai A1_HYU
hyundai sonata A2_HYU
hundai accent HYU_Z2b
hyundai getz HYU_ 9j
Jaguar Jag_89j
Fiat Fiat_2k
Ferrari Fer_2k
Lamborgini Lamb_i12
Suzuki Suki_45
Nissan N18
----------------------------and so on----------------------------------------------------------
Table 2 called LMV_Class:
Brand_Ref Hyundai Mazda Ford Nissan .................... /* and so on listing all brands as variables, so this dataset is wide*/
Autos.LMV Hyundai accent Mazda2 ford escort Nissan quest
Autos.LMV Hyundai sonata Mazda3 ford fusion Nissan Micra
............_______so on__________
Table 3 called MMV_Class:
Brand_Ref Hyundai Mazda Ford Nissan .................... /* and so on listing all brands as variables, so this dataset is wide*/
Autos.MMV Hyundai getz Mazda4 ford ikon Nissan altima
Autos.MMV Hyundai EON Mazda5 ford ka Nissan Teana
.........._______so on__________
Table 4 called HMV_Class:
Brand_Ref Hyundai Mazda Ford Nissan .................... /* and so on listing all brands as variables, so this dataset is wide*/
Autos.HMV Hyundai BlueON Mazda CX Ford figo Nissan Titan
Autos.HMV Hyundai i20 Mazda 6 Ford Mondeo Nissan Armada
I need to merge Brand_details and LMV_class to get the Brand_code like:
PROC SQL;
SELECT brand_names, brand_code, brand_ref
FROM Brand_details
RIGHT JOIN
LMV_Class
ON brand_details.Brand_names = LMV_Class.Hyundai and brand_details.brand_ref=LMV.Class.Brand_ref;
QUIT;
PROC SQL;
SELECT brand_names, brand_code, brand_ref
FROM Brand_details
RIGHT JOIN
LMV_Class
ON brand_details.Brand_names = MMV_Class.Mazda and brand_details.brand_ref=MMV.Class.Brand_ref;;
QUIT;
So it seems like for one LMV_Class I have to do that MANY right joins one by one each time with Hyundai, Mazda and all in the variable list. This would mean I will have to write so many SQL joins just to complete one part that is LMV_Class. And the same process will have to repeat for other classes MMV_CLASS and HMV_Class. Any better solutions please that is more powerful and shorter?
Many thanks,
Charlotte from England
OK. Charlotte,
It is too late in China. I have to go to sleep. If you have other questions , We may have a meet tomorrow.
data Brand_details; input Brand_Names & $20. Brand_Code & $20. Brand_Ref & $20. ; cards; Ford F1 Autos.LMV Ford escort F1A Autos.LMV ford fiesta F2A Autos.LMV ford fusion F3A Autos.LMV ford ka F4A Autos.MMV ford focus F5A Autos.HMV ford mondeo F6A Autos.LMV GM GM_101 Autos.LMV Hyundai A1_HYU Autos.LMV Hyundai sonata A2_HYU Autos.LMV Hyundai accent HYU_Z2b Autos.LMV Hyundai getz HYU_ 9j Autos.LMV Jaguar Jag_89j Autos.LMV Fiat Fiat_2k Autos.MMV Ferrari Fer_2k Autos.MMV Lamborgini Lamb_i12 Autos.MMV Suzuki Suki_45 Autos.HMV Nissan N18 Autos.HMV ; run; data LMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.LMV Hyundai accent Mazda2 ford escort Nissan quest Autos.LMV Hyundai sonata Mazda3 ford fusion Nissan Micra ; run; data MMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.MMV Hyundai getz Mazda4 ford ikon Nissan altima Autos.MMV Hyundai EON Mazda5 ford ka Nissan Teana ; run; data HMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.HMV Hyundai BlueON Mazda CX Ford figo Nissan Titan Autos.HMV Hyundai i20 Mazda 6 Ford Mondeo Nissan Armada ; run; data temp; set LMV_Class MMV_Class HMV_Class; run; data want(keep=Brand_Names Brand_Code Brand_Ref Brand); if _n_ eq 1 then do; if 0 then set Brand_details ; declare hash ha(dataset:'Brand_details'); ha.definekey('Brand_Names','Brand_Ref'); ha.definedata('Brand_Code'); ha.definedone(); end; set temp; array x{*} Hyundai -- Nissan; do i=1 to dim(x); Brand_Names=x{i}; Brand=vname(x{i}); rc=ha.find(); output; call missing(Brand_Code); end; run; proc sort data=want;by Brand Brand_Ref;run; data want; set want; by Brand Brand_Ref; if first.Brand_Ref then n=0; n+1; Brand_Ref=cats(Brand_Ref,put(n,z2.)); run;
Good Luck.
Ksharp
What output are you looking for .posting it will better explain your question.
Ksharp
Hi Ksharp,
Thanks for the response.The output i am looking for is a report that has the following columns:
Brand_ref Brand_code Brand_Names
Autos.LMV HYU_Z2b Hyundai accent
1. I want the list to start from LMV that is Autos.LMV, one the LMV list is done, followed in Autos.MMV and the 3rd to follow is Autos.HMV.
2. Please notice the Brand_Names in the Brand_details has entire names of all brands listed for all the 3 classes LMV, MMV and HMV making it a complete descriptor table, however only CERTAIN BRAND NAMES ARE INCLUDED IN THE LMV, MMV AND HMV datasets. For example, if hyundai has 20 models(brand names) that are listed in Brand_details, the idea here is to only include only those Hyundai models(brand_names) in LMV,MMV and HMV datasets that may have only 5,8, or 9 names corresponds to it in the Brand_details dataset. Therefore the right join i guess, but no point as it seems writing too many joins or merges.
Basically the report is consolidated form of LMV, MMV and HMV. The issue here is Brand_names in Brand details dataset is actually equal to variable values of Hyundai and all other brands in the LMV, MMV and HMV datasets. Hence, these 3 happen to very wide datasets. II hope i am not confusing you, if you wish let me know should i write in a better way you can comprehend better.
Thanks,
Charlotte
Hi. Charlotte ,
It looks like not too difficult .
data Brand_details; input Brand_Names & $20. Brand_Code & $20. Brand_Ref & $20. ; cards; Ford F1 Autos.LMV Ford escort F1A Autos.LMV ford fiesta F2A Autos.LMV ford fusion F3A Autos.LMV ford ka F4A Autos.MMV ford focus F5A Autos.HMV ford mondeo F6A Autos.LMV GM GM_101 Autos.LMV Hyundai A1_HYU Autos.LMV Hyundai sonata A2_HYU Autos.LMV Hyundai accent HYU_Z2b Autos.LMV Hyundai getz HYU_ 9j Autos.LMV Jaguar Jag_89j Autos.LMV Fiat Fiat_2k Autos.MMV Ferrari Fer_2k Autos.MMV Lamborgini Lamb_i12 Autos.MMV Suzuki Suki_45 Autos.HMV Nissan N18 Autos.HMV ; run; data LMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.LMV Hyundai accent Mazda2 ford escort Nissan quest Autos.LMV Hyundai sonata Mazda3 ford fusion Nissan Micra ; run; data MMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.MMV Hyundai getz Mazda4 ford ikon Nissan altima Autos.MMV Hyundai EON Mazda5 ford ka Nissan Teana ; run; data HMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.HMV Hyundai BlueON Mazda CX Ford figo Nissan Titan Autos.HMV Hyundai i20 Mazda 6 Ford Mondeo Nissan Armada ; run; data temp; set LMV_Class MMV_Class HMV_Class; run; data want(keep=Brand_Names Brand_Code Brand_Ref ); if _n_ eq 1 then do; if 0 then set Brand_details ; declare hash ha(dataset:'Brand_details'); ha.definekey('Brand_Names','Brand_Ref'); ha.definedata('Brand_Code'); ha.definedone(); end; set temp; array x{*} Hyundai -- Nissan; do i=1 to dim(x); Brand_Names=x{i}; rc=ha.find(); output; call missing(Brand_Code); end; run;
Ksharp
Hi Ksharp,
Thanks for that awesome approach. I missed out on one more variable to include in the requirement,So if i am allowed to bother you again, i would add it here. Please accept my apologies. On the output report,I need to assign an ID index variable. For example, on each brand names that is comes from LMV,MMV and HMV datsets I would need to assign like:
lets say for Hyundai, if there were 5 brand names in LMV-
ID_INDEX
Autos.LMV.01
Autos.LMV.02
Autos.LMV.03
Autos.LMV.04
Autos.LMV.05
and the same follows for other brands within LMV, MMV and HMV.
I thought of Id_index= Brand_ref || (_N_,Z2.);
/*unfortunately the above counts though the entire obs, and the need is the count should start and end for each brand within the categories*/
Is it possible if you can modify in that pass or in a new datastep. Your help would really be most appreciated.
Thanks,
Charlotte
OK. Charlotte,
It is too late in China. I have to go to sleep. If you have other questions , We may have a meet tomorrow.
data Brand_details; input Brand_Names & $20. Brand_Code & $20. Brand_Ref & $20. ; cards; Ford F1 Autos.LMV Ford escort F1A Autos.LMV ford fiesta F2A Autos.LMV ford fusion F3A Autos.LMV ford ka F4A Autos.MMV ford focus F5A Autos.HMV ford mondeo F6A Autos.LMV GM GM_101 Autos.LMV Hyundai A1_HYU Autos.LMV Hyundai sonata A2_HYU Autos.LMV Hyundai accent HYU_Z2b Autos.LMV Hyundai getz HYU_ 9j Autos.LMV Jaguar Jag_89j Autos.LMV Fiat Fiat_2k Autos.MMV Ferrari Fer_2k Autos.MMV Lamborgini Lamb_i12 Autos.MMV Suzuki Suki_45 Autos.HMV Nissan N18 Autos.HMV ; run; data LMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.LMV Hyundai accent Mazda2 ford escort Nissan quest Autos.LMV Hyundai sonata Mazda3 ford fusion Nissan Micra ; run; data MMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.MMV Hyundai getz Mazda4 ford ikon Nissan altima Autos.MMV Hyundai EON Mazda5 ford ka Nissan Teana ; run; data HMV_Class; input Brand_Ref & $20. Hyundai & $20. Mazda & $20. Ford & $20. Nissan & $20.; cards; Autos.HMV Hyundai BlueON Mazda CX Ford figo Nissan Titan Autos.HMV Hyundai i20 Mazda 6 Ford Mondeo Nissan Armada ; run; data temp; set LMV_Class MMV_Class HMV_Class; run; data want(keep=Brand_Names Brand_Code Brand_Ref Brand); if _n_ eq 1 then do; if 0 then set Brand_details ; declare hash ha(dataset:'Brand_details'); ha.definekey('Brand_Names','Brand_Ref'); ha.definedata('Brand_Code'); ha.definedone(); end; set temp; array x{*} Hyundai -- Nissan; do i=1 to dim(x); Brand_Names=x{i}; Brand=vname(x{i}); rc=ha.find(); output; call missing(Brand_Code); end; run; proc sort data=want;by Brand Brand_Ref;run; data want; set want; by Brand Brand_Ref; if first.Brand_Ref then n=0; n+1; Brand_Ref=cats(Brand_Ref,put(n,z2.)); run;
Good Luck.
Ksharp
Hi Ksharp,
Good Morning from England, It is just 6 AM here. Thank you so much indeed for going the extra mile in helping out with the solution whilst I am extremely sorry that I didn't realise it was pretty late there in China last night considering the time difference. Nonetheless, I sincerely appreciate the much needed amazing help. I am about to get ready for the day's work and I will test your code on my application and will sincerely update you the soonest.
I am basically writing this as a matter of courtesy to thank you more as you worked to give me a solution compromising your sleep. Awesome Sir!!.
Many Thanks Indeed,
Charlotte from England
Hi. Charlotte,
You're welcome . Now at China it is 01:00 pm .Hope you can get your job done.
If you need make a new index variable, try this:
data want; set want; by Brand Brand_Ref; if first.Brand_Ref then n=0; n+1; Index_Brand_Ref=cats(Brand_Ref,put(n,z2.)); /*********/Ksharp
Message was edited by: xia keshan
Hi Ksharp,
Good Morning from England,
I ran the solution provided by you perfectly well on my application with some minor adjustment to great success. I can't thank you enough for the rather proactive super fast response that helped at a time when i needed the most. I really appreciate so much for the invaluable help.
Well before we go off this thread, I would like you to leave me and other new aspiring SAS users a point of advice in ways of learning SAS fast to be able to reach a standard that would let me accomplish quality programming competing with my self day by day. That would help me and others who are keen a great deal. To be honest, I have been using SAS for just over a month
I have noticed your name being pretty distinguished earlier when Pierre Gagnon suggested your name among likes of few others who are great solution providers in this forum.
Thanks very much indeed,
Charlotte
Ha. Charlotte,
Glad to hear it is working.
" point of advice in ways of learning SAS fast to be able to reach a standard that would let me accomplish quality programming competing with my self day by day."
I have no some advice. Just keep learning SAS by reading its documentation which contains everything about SAS. And most important thing is to continue to visit this forum and see & learn other SAS user's question and solution. I growed up from this excellent forum when I was a rookie, Patrick , data _null_; , Cynthia , Arthur.Carpenter,Peter.C gave me lots of help ( Actually they all have emerged up when I firstly step into this forum ).
I might be disappeared again for a while after two weeks, But don't worried I will come back. Here have lots of my fun.
"I have been using SAS for just over a month "
That is not long enough, at least one or two year learning . you will find your sas skill is good enough.
"I have noticed your name being pretty distinguished earlier when Pierre Gagnon suggested your name"
Thanks PG mentioned me. He is a sas expert .
Xia Keshan
From China
Message was edited by: xia keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.