Thank you for reviewing my question; I am really stuck here.
I have a data set with 64 conditions and, by design, 500 replications per condition. However, some conditions have fewer than 500 replications. It is hard for me to identify which conditions are short and by how many, because the missing replications are not recorded as missing values (such as "." or a blank); the rows are simply absent. For example, rep runs like this: 1, 2, 4, 5, 8, 9, 10, ..., 500. (After 2, the replications jump directly to 4, and after 5, they jump directly to 8.)
Could anyone tell me how to identify the conditions with fewer than 500 replications and how many replications are missing for each condition? Thank you!! The data format is like below:
var1 var2 var3 var4 rep
a b c d 1
a b c d 2
a b c d 4
a b c d 5
a b c d 6
......
a b c d 500
a b c e 1
a b c e 2
.....
a b c e 500
a b f d 1
a b f d 3
......
a b f d 500
;
var1, var2, var3, and var4 are manipulated design factors. Each distinct combination of the design factors is one condition, and each condition should have 500 reps. (For example, the combination 'a,b,c,d' is condition 1, 'a,b,c,e' is condition 2, etc.) Hopefully I made my question clear. Thank you again!!
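/* Count the rows (replications) present for each condition */
proc freq data=have;
  table var1*var2*var3*var4 / list out=counts;
run;

/* Keep only the conditions that do not have exactly 500 replications */
data not500;
  set counts;
  where COUNT ne 500;
  MissingObs = 500 - COUNT;  /* number of replications missing */
run;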
As long as VAR1-VAR4 are constant within each set of reps, the code above will work.
I'm assuming the frequency variable that PROC FREQ writes to the OUT= data set is named COUNT, but I can't recall the exact name right now. You may need to fix that part.
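If it would also help to see which rep numbers are missing, and not only how many, here is a minimal sketch of one way to do it. It assumes rep is numeric, that a full condition runs from 1 to 500, and that the input data set is called have as in the code above. The data set names conditions, expected, and missing_reps are made up for illustration.

```sas
/* One row per condition that actually appears in HAVE */
proc sql;
  create table conditions as
  select distinct var1, var2, var3, var4
  from have;
quit;

/* Build the full grid: every observed condition crossed with reps 1-500
   (assumes rep is numeric and the design calls for reps 1 through 500) */
data expected;
  set conditions;
  do rep = 1 to 500;
    output;
  end;
run;

/* Grid rows with no matching row in HAVE are the missing replications */
proc sql;
  create table missing_reps as
  select e.var1, e.var2, e.var3, e.var4, e.rep
  from expected as e
  left join have as h
    on  e.var1 = h.var1
    and e.var2 = h.var2
    and e.var3 = h.var3
    and e.var4 = h.var4
    and e.rep  = h.rep
  where h.rep is missing
  order by e.var1, e.var2, e.var3, e.var4, e.rep;
quit;
```

Each row of missing_reps is one absent replication, so counting rows per condition should agree with MissingObs as long as there are no duplicate reps in have. Note that a condition with no rows at all in have will not show up here or in the PROC FREQ output, because both start from the conditions that are actually present.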
That works like a charm! Thank you!!
Can you show code for how that replication variable was created?
Reeza just provided a correct solution, but thank you all the same.