DATA Step, Macro, Functions and more

Analysis of month to month screening performance by organization

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 77
Accepted Solution

Analysis of month to month screening performance by organization

Hello,

 

I have a dataset of inpatient data with the following relevant variables: date of discharge (dcdeathdate), HCV Screening (HepCScreenigndone), HCV Status (HCVResult), and organization (_OrganizationTax). HCV Screening can be either "Yes"/"No"/Blank and HCV status can be either "Positive"/"Negative"/Blank. I would like to see how well each organization is screening month to month (Nov, Dec, Jan) and would like sas to output something that looks like this. What kind of code can I use to do this level of analysis? Below is a sample dataset. Thank you.

 

Month by month analysis.png

 

data MYDATA.SAMPLEMONTHANALYSIS;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
datalines4;
28NOV16:00:00:00,No,,404908043
12DEC16:00:00:00,Yes,Negative,202193544
30NOV16:00:00:00,Yes,Negative,202453987
23DEC16:00:00:00,No,,202051689
06NOV16:00:00:00,Yes,Positive,211328703
18DEC16:00:00:00,No,,204871594
23NOV16:00:00:00,No,,211328703
22NOV16:00:00:00,No,,202051689
26DEC16:00:00:00,Yes,Negative,404879663
18NOV16:00:00:00,No,,402022253
09DEC16:00:00:00,Yes,Negative,204871781
30NOV16:00:00:00,No,,400153061
17NOV16:00:00:00,Yes,Negative,206120730
27DEC16:00:00:00,Yes,Negative,200010674
28DEC16:00:00:00,Yes,Negative,245428880
04NOV16:00:00:00,Yes,Negative,212806766
10DEC16:00:00:00,Yes,Negative,200010674
04NOV16:00:00:00,No,,404945164
20DEC16:00:00:00,No,,211328703
27DEC16:00:00:00,Yes,Negative,202050840
05DEC16:00:00:00,Yes,Negative,202051876
04NOV16:00:00:00,No,,202051689
16DEC16:00:00:00,Yes,Negative,404981622
19NOV16:00:00:00,Yes,Negative,206063383
10NOV16:00:00:00,No,,211385767
26NOV16:00:00:00,Yes,Negative,200010674
04NOV16:00:00:00,No,,204871594
25DEC16:00:00:00,No,,205165453
18NOV16:00:00:00,Yes,Negative,205005527
25NOV16:00:00:00,Yes,Negative,205005527
31DEC16:00:00:00,Yes,Negative,404866659
01NOV16:00:00:00,No,,211328703
21NOV16:00:00:00,No,,416289947
02NOV16:00:00:00,No,,401945605
02DEC16:00:00:00,No,,202051689
05DEC16:00:00:00,No,Negative,404905821
26DEC16:00:00:00,Yes,Negative,200010674
05NOV16:00:00:00,No,,400115362
15DEC16:00:00:00,Yes,Negative,211385767
02DEC16:00:00:00,Yes,Negative,405001466
14DEC16:00:00:00,Yes,Negative,202051876
04DEC16:00:00:00,Yes,Negative,400153061
10DEC16:00:00:00,Yes,Negative,200010674
28DEC16:00:00:00,Yes,Negative,404866659
30NOV16:00:00:00,No,,400153061
10DEC16:00:00:00,No,Negative,404945217
29NOV16:00:00:00,Yes,Negative,405018831
11DEC16:00:00:00,Yes,Negative,404907730
07NOV16:00:00:00,,,404514762
02DEC16:00:00:00,No,,406131939
18DEC16:00:00:00,No,,202051689
06NOV16:00:00:00,No,,211385767
08DEC16:00:00:00,Yes,Negative,404866123
21DEC16:00:00:00,Yes,Negative,404866659
04NOV16:00:00:00,No,,400153061
04NOV16:00:00:00,Yes,Negative,405032806
29DEC16:00:00:00,Yes,Negative,204927543
21DEC16:00:00:00,Yes,Positive,212153756
05NOV16:00:00:00,Yes,Negative,200199702
30DEC16:00:00:00,No,Negative,404905821
10NOV16:00:00:00,Yes,Negative,205210467
14NOV16:00:00:00,No,,211385767
14NOV16:00:00:00,Yes,Negative,200010674
26NOV16:00:00:00,No,,202051876
22DEC16:00:00:00,Yes,Negative,406055879
21DEC16:00:00:00,Yes,Negative,204927543
18NOV16:00:00:00,No,Negative,200007143
11DEC16:00:00:00,Yes,Negative,405018831
10NOV16:00:00:00,No,,211328703
28NOV16:00:00:00,No,Negative,404945217
07DEC16:00:00:00,Yes,Negative,205284093
17NOV16:00:00:00,No,,202051876
04DEC16:00:00:00,Yes,Negative,202051876
16DEC16:00:00:00,No,,202051689
14NOV16:00:00:00,No,,202051689
14NOV16:00:00:00,Yes,Negative,202453987
13NOV16:00:00:00,No,Negative,218071681
21NOV16:00:00:00,No,,202193544
02DEC16:00:00:00,No,,404945164
03DEC16:00:00:00,No,,211328703
13NOV16:00:00:00,Yes,Negative,404865981
17NOV16:00:00:00,No,Negative,200007143
17NOV16:00:00:00,No,,202051689
06DEC16:00:00:00,Yes,Negative,205250618
03DEC16:00:00:00,Yes,Positive,202051689
14NOV16:00:00:00,Yes,Negative,404865981
25NOV16:00:00:00,No,,202193544
22NOV16:00:00:00,Yes,Negative,205250618
15DEC16:00:00:00,Yes,Negative,218064699
05NOV16:00:00:00,Yes,Positive,201990694
02DEC16:00:00:00,Yes,Negative,211385767
30NOV16:00:00:00,Yes,Negative,406055879
21DEC16:00:00:00,No,Negative,417876711
02NOV16:00:00:00,Yes,Negative,404866659
08NOV16:00:00:00,No,,202907970
05DEC16:00:00:00,No,,202249968
08DEC16:00:00:00,Yes,Negative,404879663
30NOV16:00:00:00,No,,205284093
28NOV16:00:00:00,No,,404879663
03NOV16:00:00:00,Yes,Negative,205250618
09NOV16:00:00:00,Yes,Negative,200010674
26DEC16:00:00:00,Yes,Negative,202453987
17DEC16:00:00:00,No,,202051689
09DEC16:00:00:00,Yes,Negative,204931949
05NOV16:00:00:00,No,,211385767
18NOV16:00:00:00,No,,211327161
30DEC16:00:00:00,Yes,Negative,200209103
09NOV16:00:00:00,Yes,Negative,202193544
23DEC16:00:00:00,Yes,Positive,404865981
15DEC16:00:00:00,Yes,Positive,211328703
09DEC16:00:00:00,Yes,Negative,203827608
27NOV16:00:00:00,No,,202051876
26NOV16:00:00:00,No,,202051876
15NOV16:00:00:00,No,,202051876
23NOV16:00:00:00,Yes,Negative,206047464
19NOV16:00:00:00,Yes,Negative,202193544
26DEC16:00:00:00,Yes,Positive,212153756
30DEC16:00:00:00,No,,209446900
19NOV16:00:00:00,No,,400153061
12DEC16:00:00:00,Yes,Negative,212002580
17NOV16:00:00:00,No,,202051689
31DEC16:00:00:00,Yes,Negative,200010674
26DEC16:00:00:00,Yes,Negative,202442981
01NOV16:00:00:00,Yes,Negative,200010674
30NOV16:00:00:00,No,,204871594
21NOV16:00:00:00,Yes,Negative,402006592
25DEC16:00:00:00,Yes,Negative,404865981
03NOV16:00:00:00,No,,202051876
29NOV16:00:00:00,Yes,Negative,200010674
09DEC16:00:00:00,Yes,Negative,404866123
25DEC16:00:00:00,Yes,Negative,402006592
31DEC16:00:00:00,Yes,Negative,404866659
06DEC16:00:00:00,Yes,Negative,404941827
14NOV16:00:00:00,Yes,Negative,211328703
22DEC16:00:00:00,Yes,Negative,202050840
17NOV16:00:00:00,Yes,Negative,200010674
29NOV16:00:00:00,Yes,Negative,204923707
12NOV16:00:00:00,Yes,Negative,404907730
25NOV16:00:00:00,No,,202051876
23DEC16:00:00:00,Yes,Negative,202051876
11DEC16:00:00:00,Yes,Negative,202193544
27DEC16:00:00:00,Yes,Negative,202948819
17NOV16:00:00:00,No,,404907730
15NOV16:00:00:00,No,,202051876
14DEC16:00:00:00,Yes,Negative,211385767
12NOV16:00:00:00,Yes,Negative,206063383
07NOV16:00:00:00,No,,205284093
03DEC16:00:00:00,Yes,Negative,406160755
06DEC16:00:00:00,No,,209446900
14DEC16:00:00:00,No,,202051689
21NOV16:00:00:00,No,,400153061
11DEC16:00:00:00,Yes,Negative,402006592
03DEC16:00:00:00,Yes,Negative,404941827
16DEC16:00:00:00,No,,209446900
10NOV16:00:00:00,Yes,Negative,404866659
17DEC16:00:00:00,Yes,Positive,202193544
25DEC16:00:00:00,Yes,Negative,211385767
09DEC16:00:00:00,Yes,Negative,204955076
14NOV16:00:00:00,No,,202249968
14DEC16:00:00:00,Yes,Negative,401945605
30NOV16:00:00:00,No,,202193544
16NOV16:00:00:00,No,,202051689
21NOV16:00:00:00,No,,202051689
18NOV16:00:00:00,Yes,Negative,404866123
04NOV16:00:00:00,Yes,Negative,201990694
30NOV16:00:00:00,Yes,Positive,402022253
20NOV16:00:00:00,No,,400153061
02DEC16:00:00:00,Yes,Negative,202463752
20DEC16:00:00:00,No,,202051689
08DEC16:00:00:00,Yes,Negative,202193544
01DEC16:00:00:00,No,,202051689
02DEC16:00:00:00,Yes,Positive,212002580
04DEC16:00:00:00,No,,202051689
10NOV16:00:00:00,No,,202051689
28NOV16:00:00:00,Yes,Negative,200010674
01DEC16:00:00:00,Yes,Negative,205218030
09NOV16:00:00:00,No,,204871594
04DEC16:00:00:00,Yes,Negative,402022253
15NOV16:00:00:00,Yes,Negative,200010674
23NOV16:00:00:00,Yes,Negative,400037875
03NOV16:00:00:00,No,,202249968
10NOV16:00:00:00,No,Negative,404905821
14NOV16:00:00:00,No,Negative,405001466
08DEC16:00:00:00,Yes,Negative,202051876
08DEC16:00:00:00,Yes,Positive,400153061
21DEC16:00:00:00,Yes,Negative,202193544
27NOV16:00:00:00,No,,404879663
18NOV16:00:00:00,No,,202051876
30DEC16:00:00:00,Yes,Negative,202051876
22DEC16:00:00:00,No,,202051689
05DEC16:00:00:00,Yes,Negative,205005527
22NOV16:00:00:00,No,,202051689
29NOV16:00:00:00,Yes,Negative,212153756
09DEC16:00:00:00,No,,206089169
21DEC16:00:00:00,Yes,Negative,202193544
11NOV16:00:00:00,Yes,Negative,404907730
28DEC16:00:00:00,No,,209446900
19DEC16:00:00:00,Yes,Negative,211385767
20DEC16:00:00:00,No,,202051689
07DEC16:00:00:00,Yes,Negative,404866123
;;;;

Accepted Solutions
Solution
‎03-22-2017 04:47 AM
Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization

Misconception here:

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;

A data step with datalines is like a step that reads from external data, only that the data has been placed in-line in the data step itself.

To read your data from an existing data set, you have to use the set statement (what you did), but you have to remove everything that is related to reading from raw input data. So change it to

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization

Try:

data have;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;

proc sort data=have;
by _OrganizationTax month;
run;

proc freq data=have;
by _OrganizationTax month;
tables HepCscreeningdone * HCVresult /norow nopercent nocum;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

Thanks. From what I understand, the first step basically adds another column to record only the month and year. I'm not sure exactly what happened, but I tried it on my main dataset, but it came back with the additional column but no entries:

 

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;
 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data mydata.samplemonthanalysis1;
 63         set mydata.samplemonthanalysis;
 64           infile datalines dsd truncover;
 65           input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
 66           month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
 67           format month yymon7.;
 68         datalines4;
 
 NOTE: Missing values were generated as a result of performing an operation on missing values.
       Each place is given by: (Number of times) at (Line):(Column).
       1 at 66:11   1 at 66:25   
 NOTE: There were 2 observations read from the data set MYDATA.SAMPLEMONTHANALYSIS.
 NOTE: The data set MYDATA.SAMPLEMONTHANALYSIS1 has 1 observations and 5 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.05 seconds
       cpu time            0.03 seconds
       
 70         ;;;;
 
 71         run;
 72         
 73         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 86         
Solution
‎03-22-2017 04:47 AM
Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization

Misconception here:

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;

A data step with datalines is like a step that reads from external data, only that the data has been placed in-line in the data step itself.

To read your data from an existing data set, you have to use the set statement (what you did), but you have to remove everything that is related to reading from raw input data. So change it to

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

Thank you. It seems to work on my sample dataset, but when I apply it to my main dataset it is giving me this error. It also converts all my months into "**"

 

NOTE 484-185: Format $YYMON was not found or could not be loaded.
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
       64:11   
 NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
 WARNING: Limit set by ERRORS= option reached.  Further errors of this type will not be printed.

NOTE: There were 74611 observations read from the data set MYDATA.ORIGINALDATASETDUPIDCLEAN.
 NOTE: The data set MYDATA.ORIGTESTMONTH has 74611 observations and 45 variables.
 NOTE: At least one W.D format was too small for the number to be printed. The decimal may be shifted by the "BEST" format.
 NOTE: DATA statement used (Total process time):
       real time           3.53 seconds
       cpu time            1.94 seconds
Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization

Do you already have a variable called month in your original dataset?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

No, I shouldn't. But let me delete it and rerun the code again to see.

Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization

I asked about month because

NOTE 484-185: Format $YYMON was not found or could not be loaded.

happens when one tries to assign a date format (which is numeric) to a character variable.

My assignment creates months as numeric, though, so it looks like month is already present and is of type character.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

Yeah, you're right. There was month in there. The month variable corresponds to the month of the patient's birthday. Would it be possible to rename month variable that we are creating to DCMonth?

Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization


byeh2017 wrote:

Yeah, you're right. There was month in there. The month variable corresponds to the month of the patient's birthday. Would it be possible to rename month variable that we are creating to DCMonth?


Absolutely. Name it any way you want, I just used the very generic "month" to illustrate what it is.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

Is this the right code?

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  month = intnx('DCmonth',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

proc sort data=mydata.origtestmonth;
by _Organizationname DCmonth;
run;

proc freq data=mydata.origtestmonth;
by _Organizationname DCmonth;
tables HepCscreeningdone * HCVresult /norow nopercent nocum;
run;
Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization


byeh2017 wrote:

Is this the right code?

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  month = intnx('DCmonth',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

proc sort data=mydata.origtestmonth;
by _Organizationname DCmonth;
run;

proc freq data=mydata.origtestmonth;
by _Organizationname DCmonth;
tables HepCscreeningdone * HCVresult /norow nopercent nocum;
run;

Well, that's now up to you to determine:

- does it run without ERRORs/WARNINGs/suspicious NOTEs?

- does it produce the output you desire (at least logically, you might have some "prettying up" to do)

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

[ Edited ]

This is the error that I got

 

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data mydata.origtestmonth;
 63         set mydata.originaldatasetdupidclean;
 64           dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
 65           format month yymon7.;
                           _______
                           484
 NOTE 484-185: Format $YYMON was not found or could not be loaded.

It did create a separate variable called DCmonth but it gave me variables such as "20759, 20789,..."

 

This is the code i used:

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;
Super User
Posts: 6,936

Re: Analysis of month to month screening performance by organization


byeh2017 wrote:

This is the error that I got

 

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data mydata.origtestmonth;
 63         set mydata.originaldatasetdupidclean;
 64           dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
 65           format month yymon7.;
                           _______
                           484
 NOTE 484-185: Format $YYMON was not found or could not be loaded.

It did create a separate variable called DCmonth but it gave me variables such as "20759, 20789,..."

 

This is the code i used:

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

You have to replace every occurence of the variable month in my code with your new name dcmonth, of course.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 77

Re: Analysis of month to month screening performance by organization

sample.png

 

There seems to be an error on the "yymon7" format

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 22 replies
  • 197 views
  • 1 like
  • 4 in conversation