BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
urban58
Quartz | Level 8

Hello SAS community,

 

I'm trying to get label and format values (e.g. 0=No, 1=Yes) from a dataset. Using dictionary tables I've been able to get the labels but the format values are proving more elusive.

Below is my test dataset and the sql code I've used to get the labels.

I could use some help with the format values.

 

proc format;
value agecat
0-< 10='0 to <10'
10-< 20='10 to <20'
20-< 30='20 to <30'
30-< 40='30 to <40'
40-high='40 and above';
value sexf
1='Male'
2='Female';
value racef
1='White'
2='Black'
3='Asian'
4='Native Hawaiian or Pacific Islander'
5='American Indian or Alaskan Native'
6='Asian Indian or Pacific Islander'
7=Prefer not to answer
8='Other'
9='>1 race';
value maritalf
1='Single'
2='Married'
3='Widowed'
4='Divorced';
value educf
1='High School or Less'
2='2 Yr College'
3='4 Yr College'
4='Graduate Degree';
run;


data have;
input id age gender race marital educ;
format age agecat. gender sexf. race racef. marital maritalf. educ educf.;
label
id='Subject id'
age='Age category'
gender='Reported gender'
race='Reported race'
marital='Marital status'
educ='Highest educational attainment';
cards;
1 9 1 1 1 1
2 25 2 2 2 2
3 35 1 3 2 4
4 27 1 1 1 1
5 18 2 1 3 1
6 41 1 2 4 3
;
run;

 

desired output

Is all this output possible from dictionary tables (adding format just gives agecat., etc.) or maybe proc contents - I couldn't find any resources that would get both the numeric value and it's meaning (e.g. 1=Male) in the same column

Variable

Label

Format Values

id

Subject id

  

 

age

Age category

   0-< 10=0 to <10

10-< 20=10 to <20

20-< 30=20 to <30

30-< 40=30 to <40

40-high=40 and above

gender

Reported gender

1=Male

2=Female

race

Reported race

1=White

 2=Black

 3=Asian

 4=Native Hawaiian or Pacific Islander

 5=American Indian or Alaskan Native

 6=Asian Indian or Pacific Islander

 7=Prefer not to answer

 8=Other

 9=>1 race

marital

Marital status

1=Single

2=Married

3=Widowed

4=Divorced

edu

Highest educational attainment

1=High School or Less

2=2 Yr College

3=4 Yr College

4=Graduate Degree

 

Thanks in advance,

Margaret

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

No need to hard code the values!  They are already in the dataset!

 

For these types of simple formats you could use something like this to populate your FORMAT_VALUE variable.

data want ;
  set formats(keep=fmtname start end label);
  length format_value $50 ;
  if start=end then format_value = catx('=',start,quote(trim(label)));
  else format_value=catx('=',catx('-',start,end),quote(trim(label)));
run;

Tom_0-1669141088357.png

 

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

Why not just copy the text from the program that makes the format?

Do you no longer have that code?

 

If not you can ask PROC FORMAT to convert the format definitions into a dataset.

proc format lib=work.formats out=formats;
run;

You can then use FMTNAME, START, END and LABEL in that dataset to generate the strings you want.  Since you are using the HIGH keyword in at least one of those formats you also might want to include the HLO variable.

Obs    FMTNAME     START          END           HLO    LABEL

  1    AGECAT        0                    10           0 to <10
  2    AGECAT       10                    20           10 to <20
  3    AGECAT       20                    30           20 to <30
  4    AGECAT       30                    40           30 to <40
  5    AGECAT       40      HIGH                 H     40 and above
  6    EDUCF         1                     1           High School or Less
  7    EDUCF         2                     2           2 Yr College
  8    EDUCF         3                     3           4 Yr College
  9    EDUCF         4                     4           Graduate Degree
 10    MARITALF      1                     1           Single
 11    MARITALF      2                     2           Married
 12    MARITALF      3                     3           Widowed
 13    MARITALF      4                     4           Divorced
 14    RACEF         1                     1           White
 15    RACEF         2                     2           Black
 16    RACEF         3                     3           Asian
 17    RACEF         4                     4           Native Hawaiian or Pacific Islander
 18    RACEF         5                     5           American Indian or Alaskan Native
 19    RACEF         6                     6           Asian Indian or Pacific Islander
 20    RACEF         7                     7           Prefer not to answer
 21    RACEF         8                     8           Other
 22    RACEF         9                     9           >1 race
 23    SEXF          1                     1           Male
 24    SEXF          2                     2           Female

urban58
Quartz | Level 8

Thanks Tom - very useful information.

1. I can copy in the format values but with a lot of datasets this is very time consuming and I hoped SAS would have a better solution

2. As you suggested, I converted the format definitions into a dataset 

and then created a variable for the format values

 

data format_value; length format_value $50;
set formats(keep=FMTNAME START END LABEL);
if FMTNAME='SEXF' and start=1 and end=1 and label='Male' then format_value ='1=Male'; else
if FMTNAME='SEXF' and start=2 and end=2 and label='Female' then format_value ='1=Female';

run;

getting for my format_value

 

1=Male

2=Female

 

This really works, not to ask too much but is there a more efficient way to do this?

Really appreciate your fast and spot-on response - this problem has stumped me for a long time!

Margaret

Tom
Super User Tom
Super User

No need to hard code the values!  They are already in the dataset!

 

For these types of simple formats you could use something like this to populate your FORMAT_VALUE variable.

data want ;
  set formats(keep=fmtname start end label);
  length format_value $50 ;
  if start=end then format_value = catx('=',start,quote(trim(label)));
  else format_value=catx('=',catx('-',start,end),quote(trim(label)));
run;

Tom_0-1669141088357.png

 

urban58
Quartz | Level 8
Tom, you’re a super helper, many thanks!
Margaret
Quentin
Super User

It looks like you're planning to build a data dictionary / codebook?  I would try searching lexjansen.com for papers on that subject.  

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
urban58
Quartz | Level 8
First thing I did but I could not find (or maybe understand) how the format information was abstracted

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1257 views
  • 0 likes
  • 3 in conversation