BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
byeh2017
Quartz | Level 8

Hello,

 

I have a dataset of inpatient data with the following relevant variables: date of discharge (dcdeathdate), HCV Screening (HepCScreenigndone), HCV Status (HCVResult), and organization (_OrganizationTax). HCV Screening can be either "Yes"/"No"/Blank and HCV status can be either "Positive"/"Negative"/Blank. I would like to see how well each organization is screening month to month (Nov, Dec, Jan) and would like sas to output something that looks like this. What kind of code can I use to do this level of analysis? Below is a sample dataset. Thank you.

 

Month by month analysis.png

 

data MYDATA.SAMPLEMONTHANALYSIS;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
datalines4;
28NOV16:00:00:00,No,,404908043
12DEC16:00:00:00,Yes,Negative,202193544
30NOV16:00:00:00,Yes,Negative,202453987
23DEC16:00:00:00,No,,202051689
06NOV16:00:00:00,Yes,Positive,211328703
18DEC16:00:00:00,No,,204871594
23NOV16:00:00:00,No,,211328703
22NOV16:00:00:00,No,,202051689
26DEC16:00:00:00,Yes,Negative,404879663
18NOV16:00:00:00,No,,402022253
09DEC16:00:00:00,Yes,Negative,204871781
30NOV16:00:00:00,No,,400153061
17NOV16:00:00:00,Yes,Negative,206120730
27DEC16:00:00:00,Yes,Negative,200010674
28DEC16:00:00:00,Yes,Negative,245428880
04NOV16:00:00:00,Yes,Negative,212806766
10DEC16:00:00:00,Yes,Negative,200010674
04NOV16:00:00:00,No,,404945164
20DEC16:00:00:00,No,,211328703
27DEC16:00:00:00,Yes,Negative,202050840
05DEC16:00:00:00,Yes,Negative,202051876
04NOV16:00:00:00,No,,202051689
16DEC16:00:00:00,Yes,Negative,404981622
19NOV16:00:00:00,Yes,Negative,206063383
10NOV16:00:00:00,No,,211385767
26NOV16:00:00:00,Yes,Negative,200010674
04NOV16:00:00:00,No,,204871594
25DEC16:00:00:00,No,,205165453
18NOV16:00:00:00,Yes,Negative,205005527
25NOV16:00:00:00,Yes,Negative,205005527
31DEC16:00:00:00,Yes,Negative,404866659
01NOV16:00:00:00,No,,211328703
21NOV16:00:00:00,No,,416289947
02NOV16:00:00:00,No,,401945605
02DEC16:00:00:00,No,,202051689
05DEC16:00:00:00,No,Negative,404905821
26DEC16:00:00:00,Yes,Negative,200010674
05NOV16:00:00:00,No,,400115362
15DEC16:00:00:00,Yes,Negative,211385767
02DEC16:00:00:00,Yes,Negative,405001466
14DEC16:00:00:00,Yes,Negative,202051876
04DEC16:00:00:00,Yes,Negative,400153061
10DEC16:00:00:00,Yes,Negative,200010674
28DEC16:00:00:00,Yes,Negative,404866659
30NOV16:00:00:00,No,,400153061
10DEC16:00:00:00,No,Negative,404945217
29NOV16:00:00:00,Yes,Negative,405018831
11DEC16:00:00:00,Yes,Negative,404907730
07NOV16:00:00:00,,,404514762
02DEC16:00:00:00,No,,406131939
18DEC16:00:00:00,No,,202051689
06NOV16:00:00:00,No,,211385767
08DEC16:00:00:00,Yes,Negative,404866123
21DEC16:00:00:00,Yes,Negative,404866659
04NOV16:00:00:00,No,,400153061
04NOV16:00:00:00,Yes,Negative,405032806
29DEC16:00:00:00,Yes,Negative,204927543
21DEC16:00:00:00,Yes,Positive,212153756
05NOV16:00:00:00,Yes,Negative,200199702
30DEC16:00:00:00,No,Negative,404905821
10NOV16:00:00:00,Yes,Negative,205210467
14NOV16:00:00:00,No,,211385767
14NOV16:00:00:00,Yes,Negative,200010674
26NOV16:00:00:00,No,,202051876
22DEC16:00:00:00,Yes,Negative,406055879
21DEC16:00:00:00,Yes,Negative,204927543
18NOV16:00:00:00,No,Negative,200007143
11DEC16:00:00:00,Yes,Negative,405018831
10NOV16:00:00:00,No,,211328703
28NOV16:00:00:00,No,Negative,404945217
07DEC16:00:00:00,Yes,Negative,205284093
17NOV16:00:00:00,No,,202051876
04DEC16:00:00:00,Yes,Negative,202051876
16DEC16:00:00:00,No,,202051689
14NOV16:00:00:00,No,,202051689
14NOV16:00:00:00,Yes,Negative,202453987
13NOV16:00:00:00,No,Negative,218071681
21NOV16:00:00:00,No,,202193544
02DEC16:00:00:00,No,,404945164
03DEC16:00:00:00,No,,211328703
13NOV16:00:00:00,Yes,Negative,404865981
17NOV16:00:00:00,No,Negative,200007143
17NOV16:00:00:00,No,,202051689
06DEC16:00:00:00,Yes,Negative,205250618
03DEC16:00:00:00,Yes,Positive,202051689
14NOV16:00:00:00,Yes,Negative,404865981
25NOV16:00:00:00,No,,202193544
22NOV16:00:00:00,Yes,Negative,205250618
15DEC16:00:00:00,Yes,Negative,218064699
05NOV16:00:00:00,Yes,Positive,201990694
02DEC16:00:00:00,Yes,Negative,211385767
30NOV16:00:00:00,Yes,Negative,406055879
21DEC16:00:00:00,No,Negative,417876711
02NOV16:00:00:00,Yes,Negative,404866659
08NOV16:00:00:00,No,,202907970
05DEC16:00:00:00,No,,202249968
08DEC16:00:00:00,Yes,Negative,404879663
30NOV16:00:00:00,No,,205284093
28NOV16:00:00:00,No,,404879663
03NOV16:00:00:00,Yes,Negative,205250618
09NOV16:00:00:00,Yes,Negative,200010674
26DEC16:00:00:00,Yes,Negative,202453987
17DEC16:00:00:00,No,,202051689
09DEC16:00:00:00,Yes,Negative,204931949
05NOV16:00:00:00,No,,211385767
18NOV16:00:00:00,No,,211327161
30DEC16:00:00:00,Yes,Negative,200209103
09NOV16:00:00:00,Yes,Negative,202193544
23DEC16:00:00:00,Yes,Positive,404865981
15DEC16:00:00:00,Yes,Positive,211328703
09DEC16:00:00:00,Yes,Negative,203827608
27NOV16:00:00:00,No,,202051876
26NOV16:00:00:00,No,,202051876
15NOV16:00:00:00,No,,202051876
23NOV16:00:00:00,Yes,Negative,206047464
19NOV16:00:00:00,Yes,Negative,202193544
26DEC16:00:00:00,Yes,Positive,212153756
30DEC16:00:00:00,No,,209446900
19NOV16:00:00:00,No,,400153061
12DEC16:00:00:00,Yes,Negative,212002580
17NOV16:00:00:00,No,,202051689
31DEC16:00:00:00,Yes,Negative,200010674
26DEC16:00:00:00,Yes,Negative,202442981
01NOV16:00:00:00,Yes,Negative,200010674
30NOV16:00:00:00,No,,204871594
21NOV16:00:00:00,Yes,Negative,402006592
25DEC16:00:00:00,Yes,Negative,404865981
03NOV16:00:00:00,No,,202051876
29NOV16:00:00:00,Yes,Negative,200010674
09DEC16:00:00:00,Yes,Negative,404866123
25DEC16:00:00:00,Yes,Negative,402006592
31DEC16:00:00:00,Yes,Negative,404866659
06DEC16:00:00:00,Yes,Negative,404941827
14NOV16:00:00:00,Yes,Negative,211328703
22DEC16:00:00:00,Yes,Negative,202050840
17NOV16:00:00:00,Yes,Negative,200010674
29NOV16:00:00:00,Yes,Negative,204923707
12NOV16:00:00:00,Yes,Negative,404907730
25NOV16:00:00:00,No,,202051876
23DEC16:00:00:00,Yes,Negative,202051876
11DEC16:00:00:00,Yes,Negative,202193544
27DEC16:00:00:00,Yes,Negative,202948819
17NOV16:00:00:00,No,,404907730
15NOV16:00:00:00,No,,202051876
14DEC16:00:00:00,Yes,Negative,211385767
12NOV16:00:00:00,Yes,Negative,206063383
07NOV16:00:00:00,No,,205284093
03DEC16:00:00:00,Yes,Negative,406160755
06DEC16:00:00:00,No,,209446900
14DEC16:00:00:00,No,,202051689
21NOV16:00:00:00,No,,400153061
11DEC16:00:00:00,Yes,Negative,402006592
03DEC16:00:00:00,Yes,Negative,404941827
16DEC16:00:00:00,No,,209446900
10NOV16:00:00:00,Yes,Negative,404866659
17DEC16:00:00:00,Yes,Positive,202193544
25DEC16:00:00:00,Yes,Negative,211385767
09DEC16:00:00:00,Yes,Negative,204955076
14NOV16:00:00:00,No,,202249968
14DEC16:00:00:00,Yes,Negative,401945605
30NOV16:00:00:00,No,,202193544
16NOV16:00:00:00,No,,202051689
21NOV16:00:00:00,No,,202051689
18NOV16:00:00:00,Yes,Negative,404866123
04NOV16:00:00:00,Yes,Negative,201990694
30NOV16:00:00:00,Yes,Positive,402022253
20NOV16:00:00:00,No,,400153061
02DEC16:00:00:00,Yes,Negative,202463752
20DEC16:00:00:00,No,,202051689
08DEC16:00:00:00,Yes,Negative,202193544
01DEC16:00:00:00,No,,202051689
02DEC16:00:00:00,Yes,Positive,212002580
04DEC16:00:00:00,No,,202051689
10NOV16:00:00:00,No,,202051689
28NOV16:00:00:00,Yes,Negative,200010674
01DEC16:00:00:00,Yes,Negative,205218030
09NOV16:00:00:00,No,,204871594
04DEC16:00:00:00,Yes,Negative,402022253
15NOV16:00:00:00,Yes,Negative,200010674
23NOV16:00:00:00,Yes,Negative,400037875
03NOV16:00:00:00,No,,202249968
10NOV16:00:00:00,No,Negative,404905821
14NOV16:00:00:00,No,Negative,405001466
08DEC16:00:00:00,Yes,Negative,202051876
08DEC16:00:00:00,Yes,Positive,400153061
21DEC16:00:00:00,Yes,Negative,202193544
27NOV16:00:00:00,No,,404879663
18NOV16:00:00:00,No,,202051876
30DEC16:00:00:00,Yes,Negative,202051876
22DEC16:00:00:00,No,,202051689
05DEC16:00:00:00,Yes,Negative,205005527
22NOV16:00:00:00,No,,202051689
29NOV16:00:00:00,Yes,Negative,212153756
09DEC16:00:00:00,No,,206089169
21DEC16:00:00:00,Yes,Negative,202193544
11NOV16:00:00:00,Yes,Negative,404907730
28DEC16:00:00:00,No,,209446900
19DEC16:00:00:00,Yes,Negative,211385767
20DEC16:00:00:00,No,,202051689
07DEC16:00:00:00,Yes,Negative,404866123
;;;;
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Misconception here:

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;

A data step with datalines is like a step that reads from external data, only that the data has been placed in-line in the data step itself.

To read your data from an existing data set, you have to use the set statement (what you did), but you have to remove everything that is related to reading from raw input data. So change it to

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

View solution in original post

22 REPLIES 22
Kurt_Bremser
Super User

Try:

data have;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;

proc sort data=have;
by _OrganizationTax month;
run;

proc freq data=have;
by _OrganizationTax month;
tables HepCscreeningdone * HCVresult /norow nopercent nocum;
run;
byeh2017
Quartz | Level 8

Thanks. From what I understand, the first step basically adds another column to record only the month and year. I'm not sure exactly what happened, but I tried it on my main dataset, but it came back with the additional column but no entries:

 

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;
 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data mydata.samplemonthanalysis1;
 63         set mydata.samplemonthanalysis;
 64           infile datalines dsd truncover;
 65           input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
 66           month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
 67           format month yymon7.;
 68         datalines4;
 
 NOTE: Missing values were generated as a result of performing an operation on missing values.
       Each place is given by: (Number of times) at (Line):(Column).
       1 at 66:11   1 at 66:25   
 NOTE: There were 2 observations read from the data set MYDATA.SAMPLEMONTHANALYSIS.
 NOTE: The data set MYDATA.SAMPLEMONTHANALYSIS1 has 1 observations and 5 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.05 seconds
       cpu time            0.03 seconds
       
 70         ;;;;
 
 71         run;
 72         
 73         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 86         
Kurt_Bremser
Super User

Misconception here:

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  infile datalines dsd truncover;
  input DCDeathDate:DATETIME16. HepCscreeningdone:$3. HCVresult:$8. _OrganizationTax:$9.;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
datalines4;

;;;;
run;

A data step with datalines is like a step that reads from external data, only that the data has been placed in-line in the data step itself.

To read your data from an existing data set, you have to use the set statement (what you did), but you have to remove everything that is related to reading from raw input data. So change it to

data mydata.samplemonthanalysis1;
set mydata.samplemonthanalysis;
  month = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;
byeh2017
Quartz | Level 8

Thank you. It seems to work on my sample dataset, but when I apply it to my main dataset it is giving me this error. It also converts all my months into "**"

 

NOTE 484-185: Format $YYMON was not found or could not be loaded.
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
       64:11   
 NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20759.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
NOTE: Invalid character data, 20789.00 , at line 64 column 11.
 WARNING: Limit set by ERRORS= option reached.  Further errors of this type will not be printed.

NOTE: There were 74611 observations read from the data set MYDATA.ORIGINALDATASETDUPIDCLEAN.
 NOTE: The data set MYDATA.ORIGTESTMONTH has 74611 observations and 45 variables.
 NOTE: At least one W.D format was too small for the number to be printed. The decimal may be shifted by the "BEST" format.
 NOTE: DATA statement used (Total process time):
       real time           3.53 seconds
       cpu time            1.94 seconds
byeh2017
Quartz | Level 8

No, I shouldn't. But let me delete it and rerun the code again to see.

Kurt_Bremser
Super User

I asked about month because

NOTE 484-185: Format $YYMON was not found or could not be loaded.

happens when one tries to assign a date format (which is numeric) to a character variable.

My assignment creates months as numeric, though, so it looks like month is already present and is of type character.

byeh2017
Quartz | Level 8

Yeah, you're right. There was month in there. The month variable corresponds to the month of the patient's birthday. Would it be possible to rename month variable that we are creating to DCMonth?

Kurt_Bremser
Super User

@byeh2017 wrote:

Yeah, you're right. There was month in there. The month variable corresponds to the month of the patient's birthday. Would it be possible to rename month variable that we are creating to DCMonth?


Absolutely. Name it any way you want, I just used the very generic "month" to illustrate what it is.

byeh2017
Quartz | Level 8

Is this the right code?

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  month = intnx('DCmonth',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

proc sort data=mydata.origtestmonth;
by _Organizationname DCmonth;
run;

proc freq data=mydata.origtestmonth;
by _Organizationname DCmonth;
tables HepCscreeningdone * HCVresult /norow nopercent nocum;
run;
Kurt_Bremser
Super User

@byeh2017 wrote:

Is this the right code?

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  month = intnx('DCmonth',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

proc sort data=mydata.origtestmonth;
by _Organizationname DCmonth;
run;

proc freq data=mydata.origtestmonth;
by _Organizationname DCmonth;
tables HepCscreeningdone * HCVresult /norow nopercent nocum;
run;

Well, that's now up to you to determine:

- does it run without ERRORs/WARNINGs/suspicious NOTEs?

- does it produce the output you desire (at least logically, you might have some "prettying up" to do)

byeh2017
Quartz | Level 8

This is the error that I got

 

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data mydata.origtestmonth;
 63         set mydata.originaldatasetdupidclean;
 64           dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
 65           format month yymon7.;
                           _______
                           484
 NOTE 484-185: Format $YYMON was not found or could not be loaded.

It did create a separate variable called DCmonth but it gave me variables such as "20759, 20789,..."

 

This is the code i used:

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;
Kurt_Bremser
Super User

@byeh2017 wrote:

This is the error that I got

 

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data mydata.origtestmonth;
 63         set mydata.originaldatasetdupidclean;
 64           dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
 65           format month yymon7.;
                           _______
                           484
 NOTE 484-185: Format $YYMON was not found or could not be loaded.

It did create a separate variable called DCmonth but it gave me variables such as "20759, 20789,..."

 

This is the code i used:

 

data mydata.origtestmonth;
set mydata.originaldatasetdupidclean;
  dcmonth = intnx('month',datepart(DCDeathDate),0,'begin'); /* normalizes to a month */
  format month yymon7.;
run;

You have to replace every occurence of the variable month in my code with your new name dcmonth, of course.

byeh2017
Quartz | Level 8

sample.png

 

There seems to be an error on the "yymon7" format

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 1236 views
  • 1 like
  • 4 in conversation