BookmarkSubscribeRSS Feed
buddha_d
Pyrite | Level 9

SAS gurus,

             I want to create a report using these 4 datasets. For this I have to merge all datasets to make a single data set. data two and three has height age gender as common (sorting variables) and data one has height and age (sorting) variables and data four has no common variable. I don't want to loose any common variables from one dataset to another (since merging sometimes eliminates sample variable columns). I want to create a new variable for data three weight_sum for all male weights and female weights. All of these need to be merged to make a single dataset. 

            The tricky part is I need to create a .csv report using data _null_ statement (using file statement to store it in a location). Please let me know if you need any further clarification. I would greatly appreciate if anyone  helps me in this regard. 

Thanks a lot. 

Please look at the attachments for have to want. 

14 REPLIES 14
Reeza
Super User

@buddha_d wrote:

SAS gurus,

             I want to create a report using these 4 datasets. For this I have to merge all datasets to make a single data set. data two and three has height age gender as common (sorting variables) and data one has height and age (sorting) variables and data four has no common variable. I don't want to loose any common variables from one dataset to another (since merging sometimes eliminates sample variable columns). I want to create a new variable for data three weight_sum for all male weights and female weights. All of these need to be merged to make a single dataset. 

            The tricky part is I need to create a .csv report using data _null_ statement (using file statement to store it in a location). Please let me know if you need any further clarification. I would greatly appreciate if anyone  helps me in this regard. 

Thanks a lot. 

Please look at the attachments for have to want. 


What exactly is your question? What part do you need help with?

buddha_d
Pyrite | Level 9

Reeza,

          I need help with the dataset creation in sas then writing code to export the dataset using data _null_ statement (using file statement). 

thanks for asking

Reeza
Super User

Alright, well, this is clearly homework, so my personal policy is to assist, not provide answers. If you post what you've tried and isn't working I'm happy to help debug or identify where you've gone wrong or a direction to go. I'm not going to do your work.

Good Luck. 

 

buddha_d
Pyrite | Level 9

Reeza, How can I merge two datasets when there is no common variables? I have merged 3 datasets that have common variables, but stuck with the last part? 

Patrick
Opal | Level 21

@buddha_d

Please post the code you've already written into:

 Capture.JPG

 

As for merging your data set four:

To which record(s) do you believe should it get merged. Show us the result of your 3 merged data sets and then tell us where the ds four record should get merged to. 

You need first to solve the logical question (what) before we can jump to the how.

Reeza
Super User

So what is the rule for merging?

Shmuel
Garnet | Level 18

What kind of report should it be? Can you display the desired format?

What variables should be on the report?

Why do you think you need to merge the datasets?

Why not just make a concatenation of the four datsets? or only part of them?

 

 

Reeza
Super User

Here's a list of the ways you can combine a SAS dataset and how to accomplish each of them:

http://support.sas.com/documentation/cdl/en/lrcon/69852/HTML/default/viewer.htm#n1tgk0uanvisvon1r26l...

buddha_d
Pyrite | Level 9

my sincerely apologies to all of you who responded. I was out and just checked the messages. 

proc sort data=one out=one1;
by height age gender;
run;
proc sort data=two out=two2;
by height age gender;
run;
data two_concat;
merge two2 one1;
by height age gender;
run;
proc sort data=two_concat out=two_concat1;
by height age;
run;
data concat;
merge three two_concat1;
by height age;
run;

 

 

the last dataset is just a dataset I have to add as information sake. But when i tried to merge with other three datasets, I was not able to do it. Then the question came, how to merge it. I heard we can make indexes and them together. But never worked on it.

               The criteria is to make a single dataset that contains all the observations concatenated. When I run my code I am getting repeated values (the values that are less than the most observations dataset). How can I avoid that to begin with and as i progress along with this I would ask more questions. 

Thanks all for your help

 

      

buddha_d
Pyrite | Level 9

sorry I messed up with dataset names

Shmuel
Garnet | Level 18

Pay attention:

1) In most of your input records height=17, age=45 - are the same.
    By using MERGE - those rows will output in one MALE record and one FEMALE record.

    Is this your intention?

2) Data THREE has same variables names like data ONE: alias, city, zipcode
    that means that by MERGE data will overide and you will get last observation data.

    Is that what you want?

 

Does each observation represent one person? Do height, age and gender make it unique equivalent to an ID ?

 

Maybe next code will fit your needs:

data want;
   set four (rename=(name=manager date=mng_date))
         one 
         two 
         three
    ;
         retain manager title earning mng_date;
rum;
proc sort data=want; 
   by height age gender;
run;
        

 

 

 

buddha_d
Pyrite | Level 9

Shmuel,

1) In most of your input records height=17, age=45 - are the same.
    By using MERGE - those rows will output in one MALE record and one FEMALE record.

    Is this your intention? Yes

2) Data THREE has same variables names like data ONE: alias, city, zipcode
    that means that by MERGE data will overide and you will get last observation data.

    Is that what you want? No

 

Does each observation represent one person?  yes Sir. Do height, age and gender make it unique equivalent to an ID ? No, but that's all I have to mess with. 

 

                Thank you for your response. Yes, you are right, I am getting the repeated observations from last observation.I used this code below. How can I have just same number of observations and missing values for other repeating values of the last observation (which I don't want). Is there any other like proc sql gets me what I want. Please advise. Thanks for responding. 

 

data one;
input height age weight2 gender2$ alias2$ city2$ zipcode2$ store$;
cards;
17 45 25 ganges fakenote wilming 584157 walmart
;
run;

data two;
input height age weight1 gender$ date mmddyy10. alias1$ city1$ zipcode1$;
cards;
17 45 247.96 male 07/21/2017 john1 richmond 12546
17 45 283.13 female 06/15/2017 cindy1 siouxcit 45104
;
run;


data three;
input height age weight gender$ alias$ city$ zipcode$;
cards;
17 45 45.5 male john richmond 12546
17 45 56.48 male shawn Newyork 40534
17 45 57.78 female cindy siouxcit 45104
17 45 45.21 female crawford chicago 15483
17 45 47.14 male jason brooking 57006
17 45 94.18 female gita villa 58461
17 45 98.84 male johnson volga 15581
17 45 85.96 female smita raymond 48634
;
run;

data four;
input name$ title$ earning dollar7. date mmddyy10.;
cards;
Kum&go Manager $65,000 07/29/2017
;
run;

proc sort data=three out=three1(rename=(gender=gender1)) ;
by height age gender;
run;
proc sort data=two out=two2;
by height age gender;
run;
data two_concat;
merge two2 three1;
by height age ;
run;
proc sort data=two_concat out=two_concat1;
by height age;
run;
data concat;
merge one two_concat1;
by height age;
run;

 

Shmuel
Garnet | Level 18

What do you mean by "How can I have just same number of observations " ?

Is it sum of all datasets' number of observation? While merging you get less because of fittings.

 

You don't want to overide data from THREE on ONE - then you need to rename variables to have them uniqu

like you have in TWO.

 

As to FOUR - should data be on every observation the same ? Thats why I set FOUR first with renaming and retaining of its variables.

 

Can you post desired output?

Tom
Super User Tom
Super User

Not sure why you included the code as an attachment instead of using the Insert SAS code button to paste it in.

data one;
input height age weight2 gender2$ alias2$ city2$ zipcode2$ store$;
cards; 
17 45 25 ganges fakenote wilming 584157 walmart
;
run;

data two;
input height age weight1 gender$ date mmddyy10. alias1$ city1$ zipcode1$;
cards;
17 45 247.96 male 07/21/2017 john1 richmond 12546
17 45 283.13 female 06/15/2017 cindy1 siouxcit 45104
;
run;

data three;
input height age weight gender$ alias$ city$ zipcode$;
cards;
17 45 45.5 male john richmond 12546
17 45 56.48 male shawn Newyork 40534
17 45 57.78 female cindy siouxcit 45104
17 45 45.21 female crawford chicago 15483
17 45 47.14 male jason brooking 57006
17 45 94.18 female gita villa 58461
17 45 98.84 male johnson volga 15581
17 45 85.96 female smita raymond 48634
;
run;

data four;
input name$ title$ earning date mmddyy.;
cards;
Kum&Go Manager $65,000 07/29/2017;
run;

Your data files don't really look related to each other.

Perhaps you just want to stack the first three together and then add in the single observation from FOUR so that the MANAGER name is on every observation?

data want ;
  if _n_=1 then set four;
  set one two three;
run;

 Watch out for variables with similar names, but that actually have different content. 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 2798 views
  • 1 like
  • 5 in conversation