Hi,
I want to know if it's possible to conditionally delete a row in SAS for a table when it has 0 counts.
The example data is here:
data have;
length name $25;
format subjectn sbj. statusn stus.;
input name $ subjectn statusn ;
datalines;
joe 1 1
jack 2 2
lisa 3 3
mick 2 2
laura 1 1
larry 4 3
;
run;
Explanation of the the goal i'm trying to reach:
There are 6 subjects, but only 4 are in this dataset. The other two were reading and gym.
Let's say there was a student, kevin, who was enrolled but dropped out of the subject gym.
The goal is to have a final table that has only 5 subjects (despite if there is even 0 counts in the cell) in the final table.
The 5 final subjects we want are: math, english, physics, history, and reading.
We want to delete the gym row only, on the condition that it has 0 counts. If there is someone taking gym, then the row stays.
Is there a way to do this?
The following is the code to produce the table above.
proc format;
value sbj
1=math
2=english
3=physics
4=history
5=reading
6=gym
;
value stus
1=finished
2=stopped
3=continuing
;
run;
proc summary data=have nway completetypes;
class subjectn statusn/ preloadfmt order=data missing;
format subjectn sbj. statusn stus.;
output out=counts;
proc freq data=counts order=data;
table subjectn*statusn/ missing norow nocol nocum nopercent out=cnts2 (drop=percent);
weight _freq_/ zeros;
format subjectn sbj. statusn stus.;
run;
proc sort data=cnts2;
by subjectn;
run;
proc transpose data=cnts2 out=cnts2_t (drop=_NAME_ _LABEL_);
by subjectn;
id statusn;
var count;
run;
Does below return what you're after?
proc transpose
data=cnts2
out=cnts2_t
( drop=_NAME_ _LABEL_
where=(not (put(subjectn,sbj.) = 'gym' and finished=0 and stopped=0 and continuing=0))
);
by subjectn;
id statusn;
var count;
run;
@Hello_there wrote:
Hi,
I want to know if it's possible to conditionally delete a row in SAS for a table when it has 0 counts.
The example data is here:
data have; length name $25; format subjectn sbj. statusn stus.; input name $ subjectn statusn ; datalines; joe 1 1 jack 2 2 lisa 3 3 mick 2 2 laura 1 1 larry 4 3 ; run;
Explanation of the the goal i'm trying to reach:
There are 6 subjects, but only 4 are in this dataset. The other two were reading and gym.Let's say there was a student, kevin, who was enrolled but dropped out of the subject gym.
The goal is to have a final table that has only 5 subjects (despite if there is even 0 counts in the cell) in the final table.
The 5 final subjects we want are: math, english, physics, history, and reading.
We want to delete the gym row only, on the condition that it has 0 counts. If there is someone taking gym, then the row stays.
Is there a way to do this?
The following is the code to produce the table above.
proc format; value sbj 1=math 2=english 3=physics 4=history 5=reading 6=gym ; value stus 1=finished 2=stopped 3=continuing ; run; proc summary data=have nway completetypes; class subjectn statusn/ preloadfmt order=data missing; format subjectn sbj. statusn stus.; output out=counts; proc freq data=counts order=data; table subjectn*statusn/ missing norow nocol nocum nopercent out=cnts2 (drop=percent); weight _freq_/ zeros; format subjectn sbj. statusn stus.; run; proc sort data=cnts2; by subjectn; run; proc transpose data=cnts2 out=cnts2_t (drop=_NAME_ _LABEL_); by subjectn; id statusn; var count; run;
What DATA provides the information that "Kevin" and "gym" are supposed to be removed from the final output?
You went to a lot of work to make sure zero counts are in the data, now you are likely to have even more to identify things that need to be removed. But there has to be DATA somewhere to tell which to remove. That will mean identify the values that "should" have been displayed from that extra source and then combine that information with your data containing zero counts to either keep or remove records. Not that it can't be done, but this needs additional data set(s) to accomplish what you want.
And your "reader", who may know there should be a "gym" class would need some explanation as to why it doesn't appear. This may be fun when you start doing, taking the "class and student" example, when one grade/school building has one set of resulting values (i.e. gym appears because there was a student in gym) and this specific group doesn't.
Here is one way provided you can make a data set similar to the Todelete that I make:
data todelete; input subjectn; datalines; 6 ; data excluded; merge cnts2_t todelete (in=indelete) ; by subjectn; if indelete then delete; run;
The Merge combines data sets. The data set option in= creates a temporary variable that indicates whether the data set contributes to the current merged data observation. The values are 1/0 for contributes/doesn't contribute. In this case we only care if the Todelete contributes so only that data set has the option. The BY is to match on the value so only the "gym" record would have the indelete=1. Then we test the value of the indelete variable and if true (SAS treats 1 as true, 0 as false) then the record is deleted.
Hi, thanks for your reply.
The context doesn't really matter so much as i was trying to find a parallel example for my own coding project. The example i gave isn't very good i admit.
The data is a dynamic data set, and it's constantly being updated, and people are being switched into different statuses. The gym row is to represent something temporary unless there was an actual count; in which case it stays.
To accomplish what i was going for i think i figured it out. It involves an additional step, that's based off the transposed one, and an if-then delete statement.
Does below return what you're after?
proc transpose
data=cnts2
out=cnts2_t
( drop=_NAME_ _LABEL_
where=(not (put(subjectn,sbj.) = 'gym' and finished=0 and stopped=0 and continuing=0))
);
by subjectn;
id statusn;
var count;
run;
@Hello_there wrote:
Thanks, i tested this by updating the data set with a gym value and it does what i need. It keeps it if there is a count, and doesn't when there isn't one.
Additionally i experimented and tried an additional data step with an if-then delete statement and that worked also.
Note that if you follow my merge and delete example then the requirement is to build a data set with the value(s) to remove. I suspect that can be made moderately simple with the implied additional data you have. So the whole thing becomes more dynamic and you don't have to modify code to write if/then with literal values.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.