08-24-2016 12:52 PM - edited 08-24-2016 12:53 PM
I'm a brand new SAS user. I'm trying to creat a new data set based on an exisitng data set which will combine Observations with the same name and show a sum variable for each of those names for the entire month as a whole. I'm hoping someone can point me in the right direction.
Existing Data Set "A"
New Data Set "B" I want to create
08-24-2016 02:27 PM
This is a standard summation so proc means/univariate/summary are good ways.
proc means data=have sum stackods nway; class name work_date; format work_date monyy7.; var hours_worked; ods output summary=want; run; proc print data=want; run;
08-24-2016 04:04 PM
To provide a little expansion on @Reeza's post: The use of a format with a class variable creates analysis groups based on the formatted value. The underlying variable will still be a date just have a different default appearance. The actual value in the result set is usually the smallest in the values grouped together. The formats is one thing that makes SAS very flexible as a change in format can allow creating different summaries without creating multiple variables. Most procedures will honor this behavior.
For an example change the format to be YEAR4. and you will get the summary by calendary year.
08-24-2016 05:24 PM
And in addition to Reeza's suggestion to use PROC MEANS (Method #1), you also have the option to use PROC TABULATE or PROC REPORT or even PROC SQL.
Here are examples of the TABULATE (#2) and REPORT (#3) methods.
data work_hrs; infile datalines; input Name $ work_date : mmddyy10. hours_worked; datalines; Andy 5/4/2016 2 Andy 5/15/2016 3 Andy 5/22/2016 4 Dennis 7/3/2016 1 Dennis 7/22/2016 2 ; run; proc tabulate data=work_hrs out=dsout_tab(drop=_type_ _table_ _page_); title '2a PROC TABULATE Report'; var hours_worked; class name work_date; format work_date monyy5.; table name*work_date, hours_worked*(n sum); run; proc print data=dsout_tab; title '2b dataset from PROC TABULATE'; format work_date monyy5.; run; proc report data=work_hrs out=dsout_rep(drop=_break_); title '3a PROC REPORT output report'; column name work_date n hours_worked; define name / group; define work_date / group f=monyy5.; define n / 'Number of Days'; define hours_worked / 'Total Hours Worked'; run; proc print data=dsout_rep; title '3b dataset from PROC REPORT'; format work_date monyy5.; run;