My task is to use a PROC MEANS analysis with the output grouped into very specific date ranges. Is it possible to do this through variable formats?
My current code:
PROC FORMAT;
VALUE dtfmt '14MAY2015'd - '14JUL2015'd ="First 3 months"
'15JUL2015'd - '15OCT2015'd ="Middle"
'16OCT2015'd - '16DEC2015'd ="Last 3 months"
'17DEC2015'd - '26JAN2016'd ="Wrap-up";
RUN;
PROC MEANS data=mydata MEAN STD MAXDEC=2;
CLASS thedate;
VAR var1;
FORMAT thedate dtfmt.;
RUN;
But this does not group the output into the format that I've requested and I end up just getting all dates as their own class. I know that I can create another variable in a data step, but this is purely curiosity about whether anyone else has done something like this through PROC FORMAT.
Best, Lauren
I AM SO STUPID. Last year was 2014. Not 2015. ::HEADDESK::
I feel like such a goober.
Thank you all for your assistance!
Try this version of the format:
PROC FORMAT;
VALUE dtfmt '14MAY2015'd - '14JUL2015'd ="First 3 months"
'15JUL2015'd - '15OCT2015'd ="Middle"
'16OCT2015'd - '16DEC2015'd ="Last 3 months"
'17DEC2015'd - '26JAN2016'd ="Wrap-up"
other = [mmddyy10.];
RUN;
It will likely show that you have date variables outside the range defined for DTFMT.
If so you will either need to restrict data using a where clause to the specified date range or add and appropriate OTHER range to the format definition. Possibly OTHER='Out of range'
Another possibility is that your date variable is not a SAS date value.
I used your suggestion to add the "other" category and my output returned that all values were out of range. Which makes no sense because when I go to the dataset's column properties, I see:
variable, type, length, format
mydate, number, 8, DATE9.
which indicates to me that it is definitely a date (DATE9. is the typical format for this variable when not being used in this particular analysis). Also, the values that I see in the data are DEFINITELY within those bounds.
Am I incorrectly indicating those date ranges? Are you even ALLOWED to indicate date ranges?
BEst, Lauren
Message was edited by: Lauren Parlett I added a data file with 20 of my observations.
You might want to post some examples of the maximum and minimum date values displayed with a best. format.
I ran your format, and the modified version I suggested, and proc means on one of my data sets with dates. It correctly identified 193 records as "First 3 months". That data set only has date values from 1/1/2015 to 6/30/2015 but worked fine. So it does not appear to be the actual format as a problem.
Run this code:
proc options group=envfiles; run;
Look in the log for the details of FMTSEARCH. There should be something like:
FMTSEARCH = (WORK SASHELP) . <there may be other libraries shown depending on your install>
If your default library is WORK and it does not appear as above then the format was not found to apply.
if that is the case then:
options append=(fmtsearch= (work));
might fix it.
You might verify that the WORK library has a FORMATS catalog with a DTFMT entry. If not then you need to determine in which library that format was created.
Tip: I keep project related data and formats in the same permanent library. And use the options statement to make the formats available. Either save the source code or use the CNTLOUT option to create a dataset that can be read to recreate the formats in a library, again direct the CNTLOUT to the project library. There are issues around SAS upgrades with compiled format compatibility and the data set may be helpful in the future.
Will the ORDER option for the CLASS statement make a difference?
Unfortunately, I still get all dates as "Out of range".
Are you using SAS datetime values versus SAS date values? Will this format definition do better?
I substituted in your format definition. I get the same output from PROC MEANS that all values are out of range. Thank you for your help with this. It's quite puzzling!
No.. Are your dates in string format? It would be nice to see a bit of your data in some form. A PROC PRINT of several rows, and a PROC CONTENTS perhaps.
I added a snippet of my datafile in a reply above.
PROC PRINT:
Obs mydate var1 1 14MAY2014 180 2 14MAY2014 . 3 15MAY2014 360 4 19MAY2014 120 5 19MAY2014 120 6 19MAY2014 180 7 20MAY2014 120 8 20MAY2014 120 9 20MAY2014 180 10 21MAY2014 300 11 21MAY2014 240 12 21MAY2014 300 13 22MAY2014 120 14 24MAY2014 180 15 24MAY2014 120 16 26MAY2014 180 17 27MAY2014 120 18 27MAY2014 180 19 29MAY2014 240 20 02JUN2014 180
CONTENTS:
The CONTENTS Procedure Data Set Name WORK.TEST Observations 20 Member Type DATA Variables 2 Engine V9 Indexes 0 Created Friday, August 14, 2015 12:37:34 PM Observation Length 16 Last Modified Friday, August 14, 2015 12:37:34 PM Deleted Observations 0 Protection Compressed NO Data Set Type Sorted NO Label Data Representation WINDOWS_64 Encoding wlatin1 Western (Windows) Engine/Host Dependent Information Data Set Page Size 4096 Number of Data Set Pages 1 First Data Page 1 Max Obs per Page 252 Obs in First Data Page 20 Number of Data Set Repairs 0 Filename test.sas7bdat Release Created 9.0301M0 Host Created X64_7PRO Alphabetic List of Variables and Attributes # Variable Type Len Format Label 1 mydate Num 8 DATE9. Visit Date 2 var1 Num 8 Total time for procedure (seconds)
And my code and result:
PROC MEANS data=test N NMISS MEAN STD MEDIAN MAXDEC=1; CLASS mydate; VAR var1; FORMAT mydate dtfmt.; RUN; The MEANS Procedure Analysis Variable : var1 Total time for procedure (seconds) N N Visit Date Obs N Miss Mean Std Dev Median ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Out of range 20 19 1 186.3 71.8 180.0 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
May well be that your date does not have any data between 14May2015 and 26Jan2016. A majority of the values in the ranges for that format are in the future...
What is the MAXIMUM value for the date variable in that set?
I just ran the means on the mini dataset.
My formatting:
FMTSEARCH=(SFORMAT WORK)
Changing that option to FMTSEARCH=(work) did not solve the issue.
Analysis Variable : mydate Visit Date Minimum Maximum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ 19857.00 19876.00 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
The format dtfmt entry: WRONG FORMAT HERE (format from earlier post's suggestion of datetime). SEE POST BELOW.
ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ† ‚ FORMAT NAME: DTFMT LENGTH: 16 NUMBER OF VALUES: 5 ‚ ‚ MIN LENGTH: 1 MAX LENGTH: 40 DEFAULT LENGTH: 16 FUZZ: STD ‚ ‡ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ…ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ…ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‰ ‚START ‚END ‚LABEL (VER. V7|V8 14AUG2015:12:49:32)‚ ‡ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‰ ‚ 1747180800‚ 1752451200‚1-First 3 months ‚ ‚ 1752537600‚ 1760486400‚2-Middle ‚ ‚ 1760572800‚ 1765843200‚3-Last 3 months ‚ ‚ 1765929600‚ 1769385600‚4-Wrap-up ‚ ‚**OTHER** ‚**OTHER** ‚Out of range ‚ Šƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‹ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‹ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
Message was edited by: Lauren Parlett
The fact that the OTHER was working indicated that the FMTSEARCH wasn't an issue, but I didn't see that until after completing the previous post.
It starts to look like the actual data may be the issue. Again, MAXIMUM value of the date variable is???
Ooops! That format was for the datetime suggestion above. Here it is again:
ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ† ‚ FORMAT NAME: DTFMT LENGTH: 16 NUMBER OF VALUES: 5 ‚ ‚ MIN LENGTH: 1 MAX LENGTH: 40 DEFAULT LENGTH: 16 FUZZ: STD ‚ ‡ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ…ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ…ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‰ ‚START ‚END ‚LABEL (VER. V7|V8 14AUG2015:13:11:29)‚ ‡ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‰ ‚ 20222‚ 20283‚1-First 3 months ‚ ‚ 20284‚ 20376‚2-Middle ‚ ‚ 20377‚ 20438‚3-Last 3 months ‚ ‚ 20439‚ 20479‚4-Wrap-up ‚ ‚**OTHER** ‚**OTHER** ‚Out of range ‚ Šƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‹ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ‹ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒŒ
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.