I am trying to create a dataset that has two variables, one of which is a date field. The code I have is below, but the output only puts a "." in the date column. Can someone help me with this formatting issue?
data Advanced;
input name $ date;
format date mmddyy10;
datalines;
CAMPBELL 12/01/2012
COOK 03/01/2011
EDWARDS 02/01/2013
MORRIS 02/04/2013
PEREZ 06/01/2011
ROGERS 07/01/2012
SANCHEZ 09/01/2012
STEWART 1/19/2013
TURNER 05/01/2011
;
proc sort data=Advanced;
by name;
run;
You can do it like this
data Bonus;
set Both;
Bonus=.01*salary;
if intck('year', advanceddate, today()) le 7 and intck('year', basedate, today()) le 7;
format Bonus dollar10.;
run;
proc print data=Bonus;
run;
Use an appropriate informat and remember a period after the format name like this
data Advanced;
input name $ date:mmddyy10.;
format date mmddyy10.;
datalines;
CAMPBELL 12/01/2012
COOK 03/01/2011
EDWARDS 02/01/2013
MORRIS 02/04/2013
PEREZ 06/01/2011
ROGERS 07/01/2012
SANCHEZ 09/01/2012
STEWART 1/19/2013
TURNER 05/01/2011
;
Thank you! So if I wanted to merge the two datasets, how could I do so since they both have the same "date" variable? I just merged them but some of the dates have a "." in the cell. I've heard of the rename function but haven't been able to quite figure out how to incorporate it into my SAS code.
First off: What two datasets? I see only one?
You can rename the date variable in another data set like this
data Advanced;
input name $ date:mmddyy10.;
format date mmddyy10.;
datalines;
CAMPBELL 12/01/2012
COOK 03/01/2011
EDWARDS 02/01/2013
MORRIS 02/04/2013
PEREZ 06/01/2011
ROGERS 07/01/2012
SANCHEZ 09/01/2012
STEWART 1/19/2013
TURNER 05/01/2011
;
data OtherData;
set Advanced(rename=(date=date2));
run;
The second data set is as follows:
data Base;
input name $ date:mmddyy10.;
format date mmddyy10.;
datalines;
CAMPBELL 01/01/2008
CARTER 11/01/2009
COLLINS 02/01/2005
COOK 06/01/2010
EDWARDS 06/01/2010
EVANS 05/01/2009
MITCHELL 12/01/2010
MORRIS 11/01/2012
NELSON 08/01/2010
PEREZ 02/01/2008
PHILLIPS 01/01/2011
ROGERS 10/01/2008
SANCHEZ 11/01/2009
STEWART 03/01/2012
TURNER 02/01/2006
;
proc sort data=Base;
by name;
run;
I was able to get that date formatted as well using the code you provided (thanks again so much for that!).
I'm ultimately trying to merge three datasets to create a list of employees that have passed both the base and advanced certifications (the base/advanced datasets are the ones already provided). For the life of me I can't seem to figure this one out since the output I have still shows all total employees, not just the ones that have both certifications.
The code I have to merge the three of them is as follows:
data Both;
merge Employee Advanced Base;
by name;
keep name date salary;
run;
If you could provide any assistance on this I would be super grateful!
So you have three data sets you want to put together? 🙂 Then post all three and let's have a look?
Below is all of the code that created the three datasets. What I need to adjust is the last portion that merges the three to show only the people that have both certifications. So a final dataset that shows the employees’ names, both certification dates, and their salaries.
/***Problem #1***/
data Dept1Names;
input name $ empid $;
datalines;
NELSON 254
MITCHELL 362
PEREZ 910
CARTER 935
;
data Dept1Salaries;
input salary id $;
datalines;
103100 254
100200 362
73500 910
88400 935
;
data Dept1;
merge Dept1Names Dept1Salaries;
keep name salary;
run;
/***Problem #2***/
data Dept2;
input name $ salary;
datalines;
ROBERTS 74000
TURNER 89200
PHILLIPS 87000
CAMPBELL 104600
PARKER 87800
EVANS 73100
EDWARDS 72300
COLLINS 100500
;
data Dept3;
input name $ salary;
datalines;
STEWART 102600
SANCHEZ 103300
MORRIS 87200
ROGERS 70700
REED 73000
COOK 88900
;
data Employee;
set Dept1 Dept2 Dept3;
proc sort data = Employee;
by name;
run;
/***Problem #3***/
data Advanced;
input name $ date:mmddyy10.;
format date mmddyy10.;
datalines;
CAMPBELL 12/01/2012
COOK 03/01/2011
EDWARDS 02/01/2013
MORRIS 02/04/2013
PEREZ 06/01/2011
ROGERS 07/01/2012
SANCHEZ 09/01/2012
STEWART 1/19/2013
TURNER 05/01/2011
;
proc sort data=Advanced;
by name;
run;
data Base;
input name $ date:mmddyy10.;
format date mmddyy10.;
datalines;
CAMPBELL 01/01/2008
CARTER 11/01/2009
COLLINS 02/01/2005
COOK 06/01/2010
EDWARDS 06/01/2010
EVANS 05/01/2009
MITCHELL 12/01/2010
MORRIS 11/01/2012
NELSON 08/01/2010
PEREZ 02/01/2008
PHILLIPS 01/01/2011
ROGERS 10/01/2008
SANCHEZ 11/01/2009
STEWART 03/01/2012
TURNER 02/01/2006
;
proc sort data=Base;
by name;
run;
data Both;
merge Employee Advanced Base;
by name;
keep name date salary;
run;
If I understand you correctly, this gives you what you want 🙂
data Both;
merge Employee Advanced(in=ad) Base(in=ba);
by name;
keep name date salary;
if ad & ba;
run;
How would I be able to include both dates (base and advanced certification date)? I tried using the rename function in the Advanced table but had no luck. Including both dates would give me exactly what I'm looking for.
@sasnewbie5 wrote:
How would I be able to include both dates (base and advanced certification date)? I tried using the rename function in the Advanced table but had no luck. Including both dates would give me exactly what I'm looking for.
Basically to keep different values for like named variable RENAME one of them. Either prior to the merge or as a data set option.
data want;
merge set1
set2 (rename=(date=certificationdate))
;
run;
or similar. If you have the same named variable on multiple sets then rename each one that you don't want to overwrite.
Do like this
data Both;
merge Employee Advanced(in=ad rename=(date=advanceddate)) Base(in=ba rename=(date=basedate));
by name;
keep name basedate advanceddate salary;
if ad & ba;
run;
Perfect! Perhaps you can help me with this last part that pertains to this dataset. Based on the newly created dataset "Both", I need to determine which employees are eligible for a 1% bonus.
SAS certifications are only valid for seven years. Any certification tests passed over seven years ago from today’s date are expired. I need to eliminate employees that have expired certifications in either the Base or Advanced tests. Then I need to output the remaining employees’ names, certification dates, and their bonuses to the output window using appropriate date and dollar formats.
Could you assist with this?
I know that only 2 of the employees (Morris and Steward) should get the bonus, but I'm not getting how to write a code to execute this.
You can do it like this
data Bonus;
set Both;
Bonus=.01*salary;
if intck('year', advanceddate, today()) le 7 and intck('year', basedate, today()) le 7;
format Bonus dollar10.;
run;
proc print data=Bonus;
run;
Also, let me point out that the creation of Both is a nice place to take advantage of the hash object. Like this, you surpass both of the PROC SORT steps previously.
data Both_Hash(drop=rc:);
if 0 then set Advanced(rename=(date=advanceddate)) Base(rename=(date=basedate));
if _N_ = 1 then do;
declare hash adv(dataset:'Advanced(rename=(date=advanceddate))');
adv.defineKey('name');
adv.defineData('advanceddate');
adv.defineDone();
declare hash bas(dataset:'Base(rename=(date=basedate))');
bas.defineKey('name');
bas.defineData('basedate');
bas.defineDone();
end;
set Employee;
rc1=adv.find();
rc2=bas.find();
if rc1=0 & rc2=0;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.