BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pensarchem
Obsidian | Level 7

I used proc format to format a variables, want to change the codes (let's say 10 codes) to 3 categories, A B C;

 

proc format;

 value  change

   1 = 'A'

   2= 'B'

  ....;

run;

 

 

data  data1;

     set  data;

   format codes  change.;

run;

 

proc sql;

    create table example as

   select counts(var), sum(var)

      from table

group by codes;

run;

 

 

It seems that proc format only cover the codes with the format, because it's still numeric variable and

when you group by, you still get 10 groups (corresponding to 10 codes), not three groups.

 

 

How to group by the three categories?   thanks.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
PROC SQL doesn't respect formats, but almost all other procs will. Using PROC FREQ or MEANS will give you what you need. If you want to stick with SQL, then you need to actually recode the variables using PUT() to apply the format and then use that new variable as your grouping variable.

select put(var, change.) as newVar

View solution in original post

14 REPLIES 14
PaigeMiller
Diamond | Level 26

Maybe this is what you want?

 

proc format;
    value change
    1,3,8 = 'A'
    2,7= 'B'
    4,5,6,9,10='C';
run;

If not, then please provide a more clear example that shows the mapping of the 10 codes into three groups.

 

But I don't think SQL will recognize these formatted categories, I think SQL will use the unformatted categories. Now, you can do the formatting inside of SQL, or use PROC SUMMARY to get counts and sum by the formatted levels of codes.

 

data have2;
    set have1;
    format codes change.;
run;
proc summary data=have2;
    class codes;
    var var;
    output out=_stats_ sum=sumvar n=nvar;
run;
    
--
Paige Miller
pensarchem
Obsidian | Level 7

Thanks Paige for your input.

 I want to hardcode or generated another column to group by the three categories A B C.  I like sql more than sas. 

 I just tested the   put(var, format), it's good.  

Reeza
Super User
SQL works for some things, but isn't optimal for all. Especially as you get into multiple variables and statistics. However, what is usually optimal is using what you know.
data_null__
Jade | Level 19

You can get PROC SUMMARY to create/convert a numeric class variable to character using the MLF CLASS statement option. 

 

proc format;
   value change(notsorted)
       1,3,8 = 'A'
       2,7= 'B'
       4,5,6,9,10='C';
   run;
data have;
   do code = 1 to 10;
      y=ranuni(1);
      output;
      end;
   run;
proc summary data=have nway;
   class code / mlf order=data preloadfmt;
   format code change.;
   var y;
   output out=_stats_ sum=sumvar n=nvar;
   run;
proc contents varnum;
   run;
proc print;
   run;

 

Capture.PNG 

novinosrin
Tourmaline | Level 20

Guru @data_null__  been following you for long and the time has come yet again to knock at your intelligence door to bother.

I am unable to understand notsorted in proc format viz. the intuition behind it. When and if you have a moment, some more notes plz.

 

and as always thank you * infinite!

Reeza
Super User

I suspect it's related to the last note here:

 

NOTSORTED

stores values or ranges in the order in which you define them. If you do not specify NOTSORTED, then values or ranges are stored in sorted order by default, and SAS uses a binary searching algorithm to locate the range that a particular value falls into. If you specify NOTSORTED, then SAS searches each range in the order in which you define them until a match is found.

Use NOTSORTED if one of the following is true:

  • You know the likelihood of certain ranges occurring, and you want your format to search those ranges first to save processing time.
  • You want to preserve the order that you define ranges when you print a description of the format using the FMTLIB option.
  • You want to preserve the order that you define ranges when you use the ORDER=DATA option and the PRELOADFMT option to analyze class variables in PROC MEANS, PROC SUMMARY, or PROC TABULATE.
data_null__
Jade | Level 19
Not sorted keeps the range label combinations in the order that you
specify them. That really works well if you’re making a table of
demography in a clinical trial and everywhere else too##- Please type your
reply above this line. No attachments. -##
novinosrin
Tourmaline | Level 20

Thank you Reeza & DN for the responses. Looks like I need some real scenario to get the mind to think along those lines so that I grip the intutitive sense. Nonetheless, I have added to my notes

Reeza
Super User
You'll remember it easily after you've forgotten to use it once or twice and get reports all out of order.
data_null__
Jade | Level 19

@novinosrin wrote:

Guru @data_null__  been following you for long and the time has come yet again to knock at your intelligence door to bother.

I am unable to understand notsorted in proc format viz. the intuition behind it. When and if you have a moment, some more notes plz.

 

and as always thank you * infinite!


@novinosrin I used a few options that are not actually necessary to the point I was trying to make.  I think one of @pensarchem wants was to convert the variable CODE to character with the formatted value.  The conversion to character is accomplished with the CLASS statement option MLF.

 

Capture.PNG

novinosrin
Tourmaline | Level 20

Guru @data_null__  Thank you & Sorry for the late acknowledgement as I was away. Okay, the conversion part is now clear and pretty well comprehended. The institution to notsorted in format is something I am understanding but not practically appreciating it yet. I will have to wait on a real scenario as you know I am outside of clinical domain 

data_null__
Jade | Level 19

@novinosrin you're welcome and thank you.  I think these links to the sample library may be helpful examples of NOTSORTED ORDER=DATA and MULTILABEL formats.

 

http://support.sas.com/kb/12/904.html

http://support.sas.com/kb/45/458.html

http://support.sas.com/kb/23/846.html

http://support.sas.com/kb/23/847.html

 

Reeza
Super User
PROC SQL doesn't respect formats, but almost all other procs will. Using PROC FREQ or MEANS will give you what you need. If you want to stick with SQL, then you need to actually recode the variables using PUT() to apply the format and then use that new variable as your grouping variable.

select put(var, change.) as newVar
pensarchem
Obsidian | Level 7

Thanks Reeza.   It's good. put(var, change.) works beautifully.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1345 views
  • 2 likes
  • 5 in conversation