Hi guys,
I hope you can help me with this.
I AM interested to merge HAVE dataset with HAVE2 dataset, to obtain WANT using proc sql (many many merge) maybe?
data HAVE;
length level $10.;
input level $;
datalines;
COLOR
blue
green
red
METAL
bronze
silver
gold
;
run;
data HAVE2;
length main submain intensity $10.;
length nmain nsubmain 8.;
input main $ submain $ intensity nsubmain nmain;
datalines;
COLOR blue moderate 7 3
COLOR blue mild 2 2
COLOR green moderate 4 3
COLOR green severe 5 5
COLOR green moderate 4 3
COLOR red moderate 5 3
HOUSE bronze severe 3 6
HOUSE bronze mild 2 2
HOUSE gold mild 3 2
HOUSE silver severe 4 6
HOUSE silver mild 5 2
HOUSE silver moderate 5 3
;
run;
***dataset WANT;
COLOR severe 5
COLOR moderate 3
COLOR mild 2
blue moderate 7
blue mild 2
green severe 5
green moderate 4
red moderate 5
HOUSE severe 6
HOUSE moderate 3
HOUSE mild 2
bronze severe 3
bronze mild 2
gold mild 4
silver severe 4
silver moderate 5
silver mild 5
Thanks,
V.
You need to provide variable names for your WANT dataset. Perhaps they should be LEVEL, INTENSITY and COUNT.
Your HAVE dataset was row with "METAL" and your HAVE2 and WANT instead have "HOUSE".
Your HAVE dataset has gold after silver, but in your wanted output it is listed before silver. Why? Is there logic that dictates the order?
If you add a sorting variable to your HAVE dataset so that it can be used to order the results in the way you want. You should also do the same for your INTENSITY variable, either by adding that variable to the HAVE dataset or creating a separate datasets with the relative order for the intensity values.
The logic would be clearer if you started with two separate datasets with summaries of MAIN*INTENSITY and SUBMAIN*INTENSITY rather than the HAVE2 dataset.
data have;
input level :$10. @@;
sortord+1;
cards;
COLOR blue green red HOUSE bronze silver gold
run;
data have1 ;
input intensity :$10. @@;
sortord+1;
cards;
severe moderate mild
run;
proc sql ;
create table want as
select a.level,a.intensity,a.count
from
( select distinct main as level,intensity,nmain as count from have2
UNION
select distinct submain as level,intensity,nsubmain as count from have2
) a
, have b
, have1 c
where a.level = b.level
and a.intensity = c.intensity
order by b.sortord, c.sortord
;
quit;
Obs level intensity count
1 COLOR severe 5
2 COLOR moderate 3
3 COLOR mild 2
4 blue moderate 7
5 blue mild 2
6 green severe 5
7 green moderate 4
8 red moderate 5
9 HOUSE severe 6
10 HOUSE moderate 3
11 HOUSE mild 2
12 bronze severe 3
13 bronze mild 2
14 silver severe 4
15 silver moderate 5
16 silver mild 5
17 gold mild 3
Hi,
Here you go...Hope it helps...
proc sql;
create table want as
select distinct a.level,b.intensity,
case
when level = "COLOR" then b.nmain
else b.nsubmain
end as temp
from have as a right join
have1 as b
on a.level = b.main or
a.level = b.submain;
quit;
proc sort nodupkey data = have2 out = want2(where = (main = "HOUSE"));
by main intensity;
run;
data final(keep = level intensity temp);
set want want2(rename = (main = level nmain = temp));
run;
Dear Urvish Shah...still is not right...
there appeat a have1 in the code, and the family HOUSE is in the end of the dataset, and need to be after the first subfamily, after the row (red).....could you fix this?
Use proc means with a ways statement and then format your output instead. You may also want to introduce some numeric variables to help control the order instead of using character variables.
proc means data=have;
class main submain intensity;
ways;
var nsubmain nmain;
output out=summary1 sum=/autoname;
run;
Dear Reeza, mi idea is to join the two datasets (have) and (have2) to get
the dataset WANT, could you write the all code to see it.
Thanks.
You need to provide variable names for your WANT dataset. Perhaps they should be LEVEL, INTENSITY and COUNT.
Your HAVE dataset was row with "METAL" and your HAVE2 and WANT instead have "HOUSE".
Your HAVE dataset has gold after silver, but in your wanted output it is listed before silver. Why? Is there logic that dictates the order?
If you add a sorting variable to your HAVE dataset so that it can be used to order the results in the way you want. You should also do the same for your INTENSITY variable, either by adding that variable to the HAVE dataset or creating a separate datasets with the relative order for the intensity values.
The logic would be clearer if you started with two separate datasets with summaries of MAIN*INTENSITY and SUBMAIN*INTENSITY rather than the HAVE2 dataset.
data have;
input level :$10. @@;
sortord+1;
cards;
COLOR blue green red HOUSE bronze silver gold
run;
data have1 ;
input intensity :$10. @@;
sortord+1;
cards;
severe moderate mild
run;
proc sql ;
create table want as
select a.level,a.intensity,a.count
from
( select distinct main as level,intensity,nmain as count from have2
UNION
select distinct submain as level,intensity,nsubmain as count from have2
) a
, have b
, have1 c
where a.level = b.level
and a.intensity = c.intensity
order by b.sortord, c.sortord
;
quit;
Obs level intensity count
1 COLOR severe 5
2 COLOR moderate 3
3 COLOR mild 2
4 blue moderate 7
5 blue mild 2
6 green severe 5
7 green moderate 4
8 red moderate 5
9 HOUSE severe 6
10 HOUSE moderate 3
11 HOUSE mild 2
12 bronze severe 3
13 bronze mild 2
14 silver severe 4
15 silver moderate 5
16 silver mild 5
17 gold mild 3
Sorry Tom, it was a mistake..now i corrected..it follow the alphabetic order. thanks.
data HAVE;
length level $10.;
input level $;
datalines;
COLOR
blue
green
red
METAL
bronze
gold
silver
;
run;
If you can sort by the variable content then your do not need the HAVE dataset.
proc sql ;
create table want as
select a.level,a.intensity,a.count
from
( select distinct main, main as level,intensity,nmain as count from have2
UNION
select distinct main, submain as level,intensity,nsubmain as count from have2
) a
order by a.main,a.level,a.intensity descending
;
quit;
proc print data=want ; run;
Hi Tom, , thank you for this, but could you fix the code...I got this mistake...Thank you.
( select main,distinct main as level,intensity,nmain as count from have2
----
22
262 ( select main,distinct main as level,intensity,nmain as count from have2
----
78
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, ',', -, '.', /,
<, <=, <>, =, >, >=, ?, AND, AS, CONTAINS, EQ, EQT, FROM, GE, GET, GT, GTT, LE,
Move the distinct keyword to right after the select. I fixed it in the post above.
Thank you very much Tom, it works.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.