Hi guys,
suppose to have the following:
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index Variable1 Variable2 Variable3;
format Admission date9. Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 1 0 0
166 18MAR2019 25MAR2019 0 0 0 0
166 12APR2020 02JUN2020 0 0 0 0
170 22FEB2017 07MAR2017 1 0 0 0
170 22FEB2017 07MAR2017 0 0 0 0
170 30JAN2019 04MAR2019 0 0 0 0
313 03MAR2016 10MAR2016 1 0 1 0
313 03MAR2016 10MAR2016 0 0 0 0
313 12DEC2019 15DEC2019 0 0 0 0
215 22DEC2014 25DEC2014 1 1 0 0
;
Is there a way to get the following?
data DB1;
input ID :$20. Admission :date09. Discharge :date09. Index Variable1 Variable2 Variable3 Count;
format Admission date9. Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 1 0 0 1
166 18MAR2019 25MAR2019 0 0 0 0 0
166 12APR2020 02JUN2020 0 0 0 0 0
170 22FEB2017 07MAR2017 1 0 0 0 0
170 22FEB2017 07MAR2017 0 0 0 0 0
170 30JAN2019 04MAR2019 0 0 0 0 0
313 03MAR2016 10MAR2016 1 0 1 0 1
313 03MAR2016 10MAR2016 0 0 0 0 0
313 12DEC2019 15DEC2019 0 0 0 0 0
215 22DEC2014 25DEC2014 1 1 0 0 1
;
In other words where Index = 1 if there is at least one record = 1 in Variable1-3 then add a new variable Count = 1 otherwise 0.
Thank you in advance
Are these binary variables? Do you want the COUNT of how many were true (as the variable name implies) or another binary variable?
This seems to express what you want.
First let's make a HAVE dataset that included the expected result.
data have;
input ID :$20. Admission :date. Discharge :date. Index Variable1-Variable3 expect;
format Admission Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 1 0 0 1
166 18MAR2019 25MAR2019 0 0 0 0 0
166 12APR2020 02JUN2020 0 0 0 0 0
170 22FEB2017 07MAR2017 1 0 0 0 0
170 22FEB2017 07MAR2017 0 0 0 0 0
170 30JAN2019 04MAR2019 0 0 0 0 0
313 03MAR2016 10MAR2016 1 0 1 0 1
313 03MAR2016 10MAR2016 0 0 0 0 0
313 12DEC2019 15DEC2019 0 0 0 0 0
215 22DEC2014 25DEC2014 1 1 0 0 1
;
Now we can take advantage of the 0/1 coding to use a simple expression to calculate the COUNT of how many are TRUE when INDEX is TRUE like this:
data want;
set have;
count=index*sum(of variable1-variable3);
run;
If instead of a COUNT you actual want another boolean flag then use the MAX() function instead of the SUM() function. Or replace the * operator with the AND operator instead.
Result:
3162 proc print;
3163 where count ne expect;
3164 run;
NOTE: No observations were selected from data set WORK.WANT.
NOTE: There were 0 observations read from the data set WORK.WANT.
WHERE count not = expect;
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
Seems as if when INDEX=1, you want to test to see if there is a 1 in variable1-variable3. Testing these three variables is a simple series of IF statements, or better yet you can use the SUM function
if sum(of variable1-variable3)>0 then ... ;
else ... ;
Are these binary variables? Do you want the COUNT of how many were true (as the variable name implies) or another binary variable?
This seems to express what you want.
First let's make a HAVE dataset that included the expected result.
data have;
input ID :$20. Admission :date. Discharge :date. Index Variable1-Variable3 expect;
format Admission Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 1 0 0 1
166 18MAR2019 25MAR2019 0 0 0 0 0
166 12APR2020 02JUN2020 0 0 0 0 0
170 22FEB2017 07MAR2017 1 0 0 0 0
170 22FEB2017 07MAR2017 0 0 0 0 0
170 30JAN2019 04MAR2019 0 0 0 0 0
313 03MAR2016 10MAR2016 1 0 1 0 1
313 03MAR2016 10MAR2016 0 0 0 0 0
313 12DEC2019 15DEC2019 0 0 0 0 0
215 22DEC2014 25DEC2014 1 1 0 0 1
;
Now we can take advantage of the 0/1 coding to use a simple expression to calculate the COUNT of how many are TRUE when INDEX is TRUE like this:
data want;
set have;
count=index*sum(of variable1-variable3);
run;
If instead of a COUNT you actual want another boolean flag then use the MAX() function instead of the SUM() function. Or replace the * operator with the AND operator instead.
Result:
3162 proc print;
3163 where count ne expect;
3164 run;
NOTE: No observations were selected from data set WORK.WANT.
NOTE: There were 0 observations read from the data set WORK.WANT.
WHERE count not = expect;
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.
Explore Now →SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.