BookmarkSubscribeRSS Feed
dstone121
Calcite | Level 5

Hello, everyone. I have a long data set with a different number of rows/observations per person. I need to add a new column to a data set that counts the number of times the value "No" occurs in one of the columns PRIOR to a "Yes" occurring. Unfortunately there is not date data in this dataset, so I have to sort it by age_months and then run the count. I am not very proficient in SAS so please explain any help you can provide. The data can be arranged as below, and the CountBeforeYes column is what I am trying to do. You can see that the count will return the same value for each patient. 

 

Data.PNG

6 REPLIES 6
mkeintz
PROC Star

What if a person has no event='Yes' record?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
dstone121
Calcite | Level 5
I apologize for not including that possibility in the data. In the case of event="Yes" record, can we make the count return as missing?
mkeintz
PROC Star

@dstone121 wrote:
I apologize for not including that possibility in the data. In the case of event="Yes" record, can we make the count return as missing?

Why not try the program offered and see what it does?  That's the thing about computers - they're usually waiting for you to learn and retry if neccessary.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
dstone121
Calcite | Level 5

Thank you for the code, but I tried it and it didn't work. I have included some example data lines: 

infile datalines dsd truncover;
  input PatientID:$12. age_months2:32. Index_Visit2:$14.;
  label PatientID="PatientID";
datalines;
0-00000001 66.72 Yes
0-00000002 58.72 Yes
0-00000003 58.96 Yes
0-00000004 23.68 No
0-00000004 31.84 No
0-00000004 41.84 No
0-00000005 30.32 No
0-00000005 57.36 Yes
0-00000006 8.32 No
0-00000006 8.48 No
0-00000006 9.04 No
0-00000006 12.32 No
0-00000006 18.8 No
0-00000006 39.76 Yes
0-00000007 22.32 No
0-00000007 27.36 No
0-00000007 41.76 No
0-00000007 46 Yes
0-00000007 46 No
0-00000008 6.08 No
0-00000008 39.6 Yes

Hopefully I did that right. When I ran the code provided, a couple problems emerged. First, it is counting the "Yes" instances even if there are not any "No" instances first. Second, it is counting the instances of "No" that occur prior to a "Yes" correctly, but in the rows where there is a "Yes" it is not repeating the count from the row prior. I'm not sure what it is doing, actually. Finally, when a "No" occurs after a "Yes", it seems to be repeating the incorrect count from the prior "Yes" row. 

Data.PNG

 

 

mkeintz
PROC Star

This program is untested in the absence of sample data in the form of a working data step:

 

data want (drop=_:);
  set have (in=firstpass) have (in=secondpass);
  by patientid;
  retain countbeforeyes;

  if first.patientid then do;
    _count=0;
    call missing(countbeforeyes);
  end;
*  if event='Yes' and countebeforeyes=. then countbeforeyes=_count;  **Replace per edit note below**;
  if firstpass and event='Yes' and countebeforeyes=. then countbeforeyes=_count;
  _count+1;

  if secondpass;
run;

 

Edit note: the program above is modified as per @Kurt_Bremser's observation 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1122 views
  • 0 likes
  • 3 in conversation