I am learning SAS and I am supposed to get this output by sorting and merging my 2 raw datasets.
The problem I keep running into is that I get close, but only the first 3 obs are correct.
This is my code:
filename f1 "C:\Users\Colle\Documents\Grad School\A BINF 5210 Health Data Analytics with SAS\File1-1-1.txt";
filename f2 "C:\Users\Colle\Documents\Grad School\A BINF 5210 Health Data Analytics with SAS\File2-1-1.txt";
run;
data temp;
infile f1 DSD dlm='09'x;
input Salary Location $ ID $ Desc $;
run;
proc sort data=temp;
by ID;
run;
proc print data=temp;
run;
data temp1;
infile f2 DSD dlm='09'x;
input ID $ Gender $ Age Mar_Status $ Weight;
run;
proc sort data=temp1;
by ID;
run;
proc print data=temp1;
run;
data merged_data;
merge temp(in=a) temp1(in=b);
by ID;
if a=1 or b=0;
run;
proc print data=merged_data;
run;
This is my output which obviously doesn't match the one I'm supposed to get. I tried all left and right merge combinations.
What am I doing wrong? Obs 1-3 are what I need but then obs 4-5 are wrong.
I attached my raw data below.
Again I am a beginner so I am not supposed to use anything "fancy" or more advanced than this level of sorting/merging/aliases 🙂
Thank you!!
If you look into your SAS log you will find:
Your source file has 6 lines but you "lose" the last one. This is caused by your source file not having strings for all positions.
Add infile option TRUNCOVER to read the data correctly. As a rule of thumb: Always use truncover unless there is a good reason not to.
data temp;
infile f1 DSD dlm='09'x truncover;
input Salary Location $ ID $ Desc $;
run;
Once you've got correct source data you also need to amend your merge step to only keep rows where the ID's match.
data merged_data;
merge temp(in=a) temp1(in=b);
by ID;
if a=1 and b=1;
run;
If you look into your SAS log you will find:
Your source file has 6 lines but you "lose" the last one. This is caused by your source file not having strings for all positions.
Add infile option TRUNCOVER to read the data correctly. As a rule of thumb: Always use truncover unless there is a good reason not to.
data temp;
infile f1 DSD dlm='09'x truncover;
input Salary Location $ ID $ Desc $;
run;
Once you've got correct source data you also need to amend your merge step to only keep rows where the ID's match.
data merged_data;
merge temp(in=a) temp1(in=b);
by ID;
if a=1 and b=1;
run;
Ah thank you so much!
I didn't use truncover because my professor didn't teach us that, so I wonder what she was expecting us to use?
This works though so I am using your solution 🙂
@Colleen9400 wrote:
Ah thank you so much!
I didn't use truncover because my professor didn't teach us that, so I wonder what she was expecting us to use?
This works though so I am using your solution 🙂
If you are using LIST MODE input and some values are missing then place a period in the source text file to indicate where the missing value. You can use period for either numeric or character variables as the normal character informat will convert a string consisting of a single period to an empty (really all blank) string.
data want;
input name $ age sex $ ;
cards;
Alice 12 F
Joe . M
Sam 13 .
;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.