I have a dataset which looks like below:
data have;
infile datalines truncover;
input type $ quarter phase;
datalines;
SS-X01 202001 1
SS-X01 202002 1
SS-X02 202001 1
SS-X02 202002 2
SS-Y02 202102 1
SS-Y02 202103 3
SS-Y01 202102 1
SS-Y01 202103 3
;
I'm trying to do the following things:
I want my dataset to look like this:
type quarter phase total
SS-X 202001 1 2
SS-X 202002 1 1
SS-X 202001 2 0
SS-X 202002 2 1
SS-Y 202102 1 2
SS-Y 202103 1 0
SS-Y 202102 3 0
SS-Y 202103 3 2
Do you need to do this by type? I don't completely follow your logic, please be more specific.
I don't see how you get this result in the output
SS-Y 202102 1 2
as I think the result ought to be a 1 in the final column for this row. Could you please explain?
That means for type Y quarter 202102 I have two 1's. Get it? Im counting how much PHASE occurs.
data intermediate;
set have;
type=substr(type,1,4);
run;
proc freq data=intermediate;
tables type*quarter*phase/list nopercent nocum;
run;
substr(type,1,4)
indicates we want to take part of the value of variable TYPE ... which parts? we start at the first character and continue for until a total of four characters has been selected
So if the value of TYPE is SS-X01, then we take the first four characters, resulting in SS-X
If it said substr(type,2,10), we would select the characters starting with the second character and continue until 10 characters had been selected.
TOTAL is found in the output from PROC FREQ. I believe the default column name used in PROC FREQ is Frequency.
@PaigeMiller Oke but this does not look the same as my desired outcome. Its just a frequency ...I also want it to be a dataset and not a report .
This is not the output I get. Please show the code you are using.
@PaigeMiller This is my code:
data have;
infile datalines truncover;
input type $ quarter phase;
datalines;
SS-X01 202001 1
SS-X01 202002 1
SS-X02 202001 1
SS-X02 202002 2
SS-Y02 202102 1
SS-Y02 202103 3
SS-Y01 202102 1
SS-Y01 202103 3
;
data intermediate;
set have;
type=substr(type,1,4);
run;
proc freq data=intermediate;
tables type*quarter*phase/list nopercent nocum;
run;
This is intermediate:
This is FREQ Procedure:
Two things:
* I want the zero frequencies also.
* I want it to be a dataset not a report because I need to do some other things with it after this.
To create a data set, use
tables type*quarter*phase/list noprint out=outputdatasetname;
Please explain how you would know the entire set of all possible places where a zero would appear. For example, your original output has SS-X and phase 2 with Total = 0, but when you go to SS-Y there is no phase 2 with Total = 0 in the output data set. Why is that?
@PaigeMiller Oké. Can you explain your question regarding the zeros? I didn't understand it
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.