Hi,
I have problem with summarizing my data. I need to count how many distinct patient_Name for each distinct time point if date_completed ne .
I tried my code put I really missed up with it.
I need you to suggest an edit to my code.
proc sort data=readert1; by patient_name TimePoint; run;
data readert2;
set readert1;
by patient_name timepoint;
keep Patient_Name TimePoint Date_Completed ctime;
if date_completed ne . and first.Patient_name then do; ctime=0; end;
ctime+1;
if date_completed ne . and last.timepoint ;
this is the data that I need to summarize
Obs Patient_Name Timepoint Date_Completed1234567891011121314151617181920212223242526272829303132333435363738394041
W1490.1045001 | Baseline | 15DEC2017 |
W1490.1056002 | Baseline | 13NOV2017 |
W1490.1056002 | Baseline | 13NOV2017 |
W1490.1056002 | Week 16 | 15NOV2017 |
W1490.1056002 | Week 16 | 15NOV2017 |
W1490.1056002 | Week 24 | 31JAN2018 |
W1490.1056002 | Week 24 | 31JAN2018 |
W1490.1056002 | Week 8 | 15NOV2017 |
W1490.1056002 | Week 8 | 15NOV2017 |
W1490.1056003 | Baseline | 13NOV2017 |
W1490.1056003 | Week 8 | 13NOV2017 |
W1490.1056005 | Baseline | 13NOV2017 |
W1490.1056005 | Week 16 | 13NOV2017 |
W1490.1056005 | Week 24 | 25DEC2017 |
W1490.1056005 | Week 8 | 13NOV2017 |
W1490.1056006 | Baseline | 12DEC2017 |
W1490.1056006 | Week 16 | 12DEC2017 |
W1490.1056006 | Week 24 | 12JAN2018 |
W1490.1056006 | Week 8 | 12DEC2017 |
W1490.1068001 | Baseline | 13NOV2017 |
W1490.1068001 | Baseline | 13NOV2017 |
W1490.1068001 | Week 16 | 13DEC2017 |
W1490.1068001 | Week 16 | 13DEC2017 |
W1490.1068001 | Week 24 | 25DEC2017 |
W1490.1068001 | Week 24 | 25DEC2017 |
W1490.1068001 | Week 32 | 22FEB2018 |
W1490.1068001 | Week 32 | 22FEB2018 |
W1490.1068001 | Week 8 | 13NOV2017 |
W1490.1068001 | Week 8 | 13NOV2017 |
W1490.1068002 | Baseline | . |
W1490.1068002 | Week 8 | . |
W1490.1089001 | Baseline | 15NOV2017 |
W1490.1089001 | Week 16 | 15NOV2017 |
W1490.1089001 | Week 24 | 18DEC2017 |
W1490.1089001 | Week 32 | 10JAN2018 |
W1490.1089001 | Week 8 | 15NOV2017 |
W1490.1089002 | Baseline | 15DEC2017 |
W1490.1089003 | Baseline | 28DEC2017 |
W1490.1100002 | Baseline | 31JAN2018 |
W1490.1134001 | Baseline | . |
W1490.1134004 | Baseline | 30JAN2018 |
Show us the code you've used so far, and maybe someone can suggest the necessary changes.
Tom
You can find some examples here:
https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas
I need to do it in the data step
@mona4u wrote:
I need to do it in the data step
Why? If a proc available to basically everyone works why the restriction.
You should show what the result for your example data would be. If you want to count names within timepoint then the sort and by processing likely should be BY Timepoint patient_name;
Perhaps something like this untested code.
proc sort data=readert1; by timepoint patient_name ; run; data readert2; set readert1 (where=(date_completed ne .)); by timepoint patient_name ; keep Patient_Name TimePoint Date_Completed ctime; if first.timepoint then patientcount=0; if first.patient then patientcount+1; if last.timepoint; run;
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
this code doesn't summarize that data correctly
@mona4u wrote:
this code doesn't summarize that data correctly
Since you did not show any desired result I don't think that is very responsive.
Show the correct summary for example data.
this code doesn't summarize that data correctly.
appreciate your try
proc freq data=have(where=(date_completed^=.));
tables patient_name/out=_a_;
run;
The number of observations in _a_ is the number of distinct patient_name exist in your data.
that's not right I want the number of distinct patient_name for each distinct timepoint
The SUM statement implies RETAIN.
@mona4u wrote:
I just found a way by using retain statement
Show your actual working result code in the post you marked correct so that someone searching the forum for help can see the approach taken.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.