Hi again! Thank you for the help on my previous question, but there seems to be another issue with my code that I have to fix. I have a variable named StateCd for which SAS should read as 'MS'. In that same code, I included a FORMAT for the variable StateCd which I have named $StateCd. This format is supposed to read as 'Mississippi'. Here is my code for the format:
PROC FORMAT;
VALUE $StateCd
'IA' = 'Iowa'
'MS' = 'Mississippi'
'UT' = 'Utah';
RUN;
When I run my entire code, in the output my variable StateCd is read as 'MI' which are the first two letters of Mississippi. I know this is happening because in my code my length statement states that StateCd should be a length of 2. However, my data steps for IA and UT are similar to my MS data step code and I am not having issues with those reading like I want them to. If anyone can provide me with some insight as to what I am doing wrong, I would appreciate it! Let me know if you need to see anything else. Thank you in advance.
Here's my code for MS:
DATA WORK.Contact_MS;
RETAIN SSN Inits City StateCd ZipCd;
LENGTH Inits $3
City $20
StateCd $2;
SET HypImpt.MS_Citizens (RENAME = (SocSecNum = SSN));
LABEL SSN = 'Social Security Number'
Inits = 'Subject Initials'
City = 'City'
StateCd = 'State Code'
ZipCd = 'Zip Code';
City = SCAN(CityState, 1, ' ,');
City = PROPCASE(City);
StateCd = UPCASE(SCAN(CityState, 2, ' ,'), 1);
StateCd = COMPRESS((SCAN(CityState, 2, ',')));
StateCd = UPCASE(StateCd);
Inits = CATS(SUBSTR(FirstInit, 1, 1), SUBSTR(MiddleInit, 1, 1), SUBSTR (LastInit, 1, 1));
FORMAT StateCd $StateCd.; /*if I add this, "Unexpected Value" appears in output*/
KEEP SSN Inits City StateCd ZipCd;
RUN;
Formats have nothing to do with "reading" values; they only control how values are displayed.
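A minimal sketch of that distinction, reusing the $StateCd format from the question. The data set name fmt_demo and the variable Shown are hypothetical names chosen for illustration only. The stored value of StateCd should remain 'MS' even though the format displays it as 'Mississippi'.

```sas
/* Sketch: a format changes what is displayed, not what is stored. */
/* WORK.fmt_demo and Shown are hypothetical names for this example. */
PROC FORMAT;
VALUE $StateCd
'IA' = 'Iowa'
'MS' = 'Mississippi'
'UT' = 'Utah';
RUN;

DATA WORK.fmt_demo;
LENGTH StateCd $2 Shown $11;
StateCd = 'MS';
/* PUT returns the formatted text; StateCd itself still holds 'MS' */
Shown = PUT(StateCd, $StateCd.);
FORMAT StateCd $StateCd.;
RUN;

/* A FORMAT statement with no format name removes the format, */
/* so this print shows the stored value of StateCd */
PROC PRINT DATA = WORK.fmt_demo;
FORMAT StateCd;
RUN;
```

Comparing the StateCd and Shown columns in this print is one way to confirm that the length of 2 applies to the stored code, not to the displayed label.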
You have now asked several questions without actually showing any of your actual data.
The format you show will not display "Unexpected Value". If you see that as a formatted value, then the format is NOT the one you show as code.
So, provide example data in the form of a data step that generates the result you claim.
Also, when you have questions about messages in the log, copy the entire data step or procedure code from the log along with all the messages and notes. Then paste it into a code box opened on the forum with the </> icon to preserve the formatting of the text. SAS often provides diagnostics whose position matters, and the code box preserves that position. The main message windows reformat pasted text.
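For example, a data step along these lines would be enough to reproduce the problem. This is only a sketch: the data set name WORK.MS_Sample, the variable lengths, and every data value are invented, and the variable names are taken from the code posted for HypImpt.MS_Citizens. The real data may be laid out differently.

```sas
/* Hypothetical stand-in for HypImpt.MS_Citizens. */
/* All values, lengths, and the layout of CityState are assumptions. */
DATA WORK.MS_Sample;
INFILE DATALINES DSD TRUNCOVER;
LENGTH SocSecNum $11 FirstInit MiddleInit LastInit $2 CityState $30 ZipCd $5;
INPUT SocSecNum FirstInit MiddleInit LastInit CityState ZipCd;
DATALINES;
000-00-0001,J.,Q.,P.,"jackson, ms",39201
000-00-0002,A.,B.,C.,"TUPELO, MS",38801
;
RUN;
```

Pointing the SET statement at a small data set like this lets others run the exact step and see the same output.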
It's tough for me to explain everything, but hopefully this can give you a better idea of what I want to achieve.
Here's an example of a code that achieves what I want to do for WORK.Contact_MS.
DATA WORK.Contact_IA ;
RETAIN SSN Inits City StateCd ZipCd;
SET HypImpt.IowaResidents
( RENAME = ( ZipCd = ZipCdNum ) );
ZipCd = PUT(ZipCdNum, 5.);
DROP ZipCdNum;
LABEL SSN = 'Social Security Number'
Inits = 'Subject Initials'
City = 'City'
StateCd = 'State Code'
ZipCd = 'Zip Code';
LENGTH Inits $ 3
StateCd $ 2;
City = PROPCASE(City);
StateCd = CAT(SUBSTR(State, 1, 1), SUBSTR(State, 4,1));
Inits = CATS(SUBSTR(Initials, 3, 2), SUBSTR(Initials, 1, 1));
FORMAT StateCd $StateCd.;
KEEP SSN Inits City StateCd ZipCd;
RUN;
My output looks like this for WORK.Contact_IA:
Just to give you an idea of what the original data set looks like:
Right now, my output for WORK.Contact_MS looks like this:
Here is what I see in my log (which doesn't show any errors or warning messages):
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 DATA WORK.Contact_MS;
74 RETAIN SSN Inits City StateCd ZipCd;
75
76 LENGTH Inits $3
77 City $20
78 StateCd $2;
79 SET HypImpt.MS_Citizens (RENAME = (SocSecNum = SSN));
80
81 /*RETAIN SSN
82 Inits
83 City
84 StateCd 'MS'
85 ZipCd;*/
86
87 LABEL SSN = 'Social Security Number'
88 Inits = 'Subject Initials'
89 City = 'City'
90 StateCd = 'State Code'
91 ZipCd = 'Zip Code';
92
93
94 /* FirstInit = COMPRESS(FirstInit, '.');
95 MiddleInit = COMPRESS(MiddleInit, '.');
96 LastInit = COMPRESS(LastInit, '.');
97 Inits = FirstInit||MiddleInit||LastInit;
98 City = SCAN(CityState, 1, ' ,');
99 StateCd = SCAN(CityState, 2, ' ,'); */
100
101 City = SCAN(CityState, 1, ' ,');
102 City = PROPCASE(City);
103 StateCd = COMPRESS((SCAN(CityState, 2, ',')));
104 StateCd = UPCASE(StateCd);
105 Inits = CATS(SUBSTR(FirstInit, 1, 1), SUBSTR(MiddleInit, 1, 1), SUBSTR (LastInit, 1, 1));
106
107 FORMAT StateCd $StateCd.; /*if I add this, "Unexpected Value" appears in output*/
108 KEEP SSN Inits City StateCd ZipCd;
109 RUN;

NOTE: Variable StateCd is uninitialized.
NOTE: There were 207 observations read from the data set HYPIMPT.MS_CITIZENS.
NOTE: The data set WORK.CONTACT_MS has 207 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1119.18k
OS Memory 32940.00k
Timestamp 11/12/2020 02:28:04 AM
Step Count 362 Switch Count 2
Page Faults 0
Page Reclaims 198
Page Swaps 0
Voluntary Context Switches 23
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 272

110
111
112 PROC SORT DATA = WORK.Contact_MS;
113 BY SSN;
114 RUN;

NOTE: There were 207 observations read from the data set WORK.CONTACT_MS.
NOTE: The data set WORK.CONTACT_MS has 207 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 1041.50k
OS Memory 32940.00k
Timestamp 11/12/2020 02:28:04 AM
Step Count 363 Switch Count 2
Page Faults 0
Page Reclaims 113
Page Swaps 0
Voluntary Context Switches 11
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 264

115
116 PROC CONTENTS DATA = WORK.Contact_MS ORDER = VARNUM;
117 RUN;

NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.06 seconds
user cpu time 0.07 seconds
system cpu time 0.00 seconds
memory 3129.90k
OS Memory 33708.00k
Timestamp 11/12/2020 02:28:04 AM
Step Count 364 Switch Count 0
Page Faults 0
Page Reclaims 243
Page Swaps 0
Voluntary Context Switches 2
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 16

118
119 PROC PRINT DATA = WORK.Contact_MS;
120 RUN;

NOTE: There were 207 observations read from the data set WORK.CONTACT_MS.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.30 seconds
user cpu time 0.30 seconds
system cpu time 0.00 seconds
memory 3049.31k
OS Memory 35752.00k
Timestamp 11/12/2020 02:28:05 AM
Step Count 365 Switch Count 0
Page Faults 0
Page Reclaims 643
Page Swaps 0
Voluntary Context Switches 2
Involuntary Context Switches 0
Block Input Operations 0
Block Output Operations 152

121
122 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
134
The note after your data step:
NOTE: Variable StateCd is uninitialized.
Means at that point there is NO VALUE for the StateCd variable.
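As a small illustration of when SAS writes that note, here is a sketch in which a variable is declared and referenced but never given a value. The data set name uninit_demo and the variable Copy are hypothetical; this is not a reconstruction of your step, only one simple way the same note can arise.

```sas
/* Sketch: StateCd is declared and used, but never assigned or read in. */
/* WORK.uninit_demo and Copy are hypothetical names for this example. */
DATA WORK.uninit_demo;
LENGTH StateCd $2 Copy $2;
/* StateCd has no value here, so Copy is blank as well */
Copy = StateCd;
RUN;
```

Running a step like this should write the same "Variable StateCd is uninitialized" note to the log, and StateCd is blank in the output data set.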
Running this code:
PROC FORMAT;
VALUE $StateCd
'IA' = 'Iowa'
'MS' = 'Mississippi'
'UT' = 'Utah';
RUN;
data junk;
input x $;
datalines;
IA
BC
MS
UT
.
;
ods listing;
Proc print data=junk;
format x $statecd.;
run;
Yields this output.
Obs x
1 Iowa
2 BC
3 Mississippi
4 Utah
5
So a "blank" or missing value with that format does NOT create "Unexpected Value" in the output. Either the format that is actually ACTIVE differs from the one you have shown, or something else is happening.
Most likely the format code you actually ran assigned "Unexpected Value" to either a missing value or OTHER, and that makes sense because you have no actual values for the variable.
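A sketch of how that could happen, under the assumption that the format you ran included an OTHER range. The format name $StateChk and the data set name other_demo are hypothetical, chosen so that the example does not overwrite your own $StateCd format. The last step lists whatever definition of $StateCd is currently stored in the WORK.FORMATS catalog, assuming the format was created there.

```sas
/* Sketch: an OTHER range catches blanks and any unlisted code. */
/* $StateChk and WORK.other_demo are hypothetical names. */
PROC FORMAT;
VALUE $StateChk
'IA' = 'Iowa'
'MS' = 'Mississippi'
'UT' = 'Utah'
OTHER = 'Unexpected Value';
RUN;

DATA WORK.other_demo;
LENGTH StateCd $2;
/* A blank value, like an uninitialized character variable */
StateCd = ' ';
FORMAT StateCd $StateChk.;
RUN;

PROC PRINT DATA = WORK.other_demo;
RUN;

/* Show the definition of $StateCd that SAS currently has in WORK.FORMATS */
PROC FORMAT LIBRARY = WORK FMTLIB;
SELECT $StateCd;
RUN;
```

If the FMTLIB listing shows an OTHER or blank range that you did not expect, the active format is not the one posted in the question.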