Compute Descriptive Statistics of score by review_weekday?
This is my question. Any help on how to do this in code?
Accepted Solutions
Assuming the variable is numeric, you would use PROC MEANS. Here's a fully worked example; copy and paste it into SAS and run it to see the results. If you have categorical data, it's a different process.
*Create summary statistics for a dataset by a 'grouping' variable and store it in a dataset;
*Generate sample fake data;
data have;
input ID feature1 feature2 feature3;
cards;
1 7.72 5.43 4.35
1 5.54 2.25 8.22
1 4.43 6.75 2.22
1 3.22 3.21 7.31
2 6.72 2.86 6.11
2 5.89 4.25 5.25
2 3.43 7.30 8.21
2 1.22 3.55 6.55
;
run;
*ensure sort before means;
proc sort data=have;
by id;
run;
*Create summary data;
proc means data=have noprint;
by id;
var feature1-feature3;
output out=want median= var= mean= /autoname;
run;
*Show for display;
proc print data=want;
run;
*First done here:https://communities.sas.com/t5/General-SAS-Programming/Getting-creating-new-summary-variables-longitudinal-data/m-p/347940/highlight/false#M44842;
*Another way to present data is as follows;
proc means data=have stackods nway n min max mean median std p5 p95;
by id;
var feature1-feature3;
ods output summary=want2;
run;
*Show for display;
proc print data=want2;
run;
https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas
@Goffy123 wrote:
Compute Descriptive Statistics of score by review_weekday?
This is my question any help into what to do in code?
@Goffy123 wrote:
I have categoric data does this mean it is a different way?
What descriptive statistics do you want?
proc sql;
create table want as
select weekday, avg(score) as mean_score
from have
group by weekday;
quit;
Provide more details and I can answer it then. For example, you can't calculate the mean of a categorical variable, so I'm not sure how that would work, given your other response.
EDIT - added last sentence.
@Goffy123 wrote:
I have categoric data does this mean it is a different way?
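If the variable being summarized really is categorical, frequency counts are the usual descriptive statistic. A minimal sketch with PROC FREQ follows, assuming a dataset named have with a character variable weekday (the names are borrowed from the PROC SQL reply in this thread; the data rows and the output dataset name want_freq are made up for illustration):

```sas
*Fake data for illustration only;
data have;
length weekday $ 9;
input weekday $ score;
cards;
Monday 5
Monday 4
Tuesday 3
Tuesday 5
Sunday 2
;
run;

*Counts and percentages for each level of the categorical variable;
proc freq data=have;
tables weekday / out=want_freq;
run;

*Show for display;
proc print data=want_freq;
run;
```

The OUT= dataset holds one row per level with COUNT and PERCENT columns.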
If your only categorical variable is the grouping variable, i.e., weekday, that's fine since you're not summarizing the categorical data.
Did you run the example above? Did it work?
You can align your variables the same way and it should work fine.
If you're having issues, post the code you're using and describe what goes wrong.
@Goffy123 wrote:
My dataset is named Lasvegas and from that dataset I am trying to find the average of score by Review_weekday. The Score variable is 1-5 and the Review_weekday is Monday, Tuesday.... Sunday.
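Applied to the dataset described in the quote, the worked example might look like the sketch below. It assumes Lasvegas is in the WORK library, Score is stored as a numeric variable, and Review_weekday is character; the output dataset name want_weekday is made up. A CLASS statement is used here instead of BY, so no prior sort is needed.

```sas
*Assumes dataset Lasvegas with numeric Score and character Review_weekday;
*Print summary statistics of Score for each weekday;
proc means data=Lasvegas n mean median std min max;
class Review_weekday;
var Score;
run;

*Store the same kind of summary in a dataset, one row per weekday;
proc means data=Lasvegas noprint nway;
class Review_weekday;
var Score;
output out=want_weekday n= mean= std= / autoname;
run;

*Show for display;
proc print data=want_weekday;
run;
```

With a character Review_weekday, the groups appear in alphabetical order (Friday, Monday, ...) rather than calendar order.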