DingDing
Quartz | Level 8

 

 

Hi guys, 

I am learning how to create a bar chart from a textbook, and it has some example code like this (for the raw data, see attached):

 


DATA chocolate;    
INFILE '....\Choc.txt';   
INPUT AgeGroup $ FavoriteFlavor $ @@; 
RUN;

PROC FORMAT;   
VALUE $AgeGroup 'A' = 'Adult' 'C' = 'Child'; 
RUN; 

* Bar chart for favorite flavor; 
PROC SGPLOT DATA = chocolate;  
VBAR FavoriteFlavor / GROUP = AgeGroup GROUPDISPLAY = CLUSTER;   
FORMAT AgeGroup $AgeGp.;    
LABEL FavoriteFlavor = 'Flavor of Chocolate';  
TITLE 'Favorite Chocolate Flavors by Age';
RUN; 

 

I don't know what the following syntax (in red) means:

PROC FORMAT;
VALUE $AgeGroup 'A' = 'Adult' 'C' = 'Child';

and 

FORMAT AgeGroup $AgeGp.;

 

I deleted these statements to see what would change, but it made no difference and produced the same graph (below). Could anyone tell me what these statements do in the code?

 

 

DATA chocolate;    
INFILE '....\Choc.txt';   
INPUT AgeGroup $ FavoriteFlavor $ @@; 
RUN;



* Bar chart for favorite flavor; 
PROC SGPLOT DATA = chocolate;  
VBAR FavoriteFlavor / GROUP = AgeGroup GROUPDISPLAY = CLUSTER;   
   
LABEL FavoriteFlavor = 'Flavor of Chocolate';  
TITLE 'Favorite Chocolate Flavors by Age';
RUN; 

 

 

Both versions return the same graph, shown below:

2015-11-17_085653.png

6 REPLIES
FreelanceReinh
Jade | Level 19

Hi DingDing,

 

You should make yourself familiar with the important concept of formats in SAS. Beyond the SAS documentation you will find various conference papers on this topic on the web.

 

In your program the PROC FORMAT step assigns labels 'Adult' and 'Child' to character values 'A' and 'C', respectively. The format name $AgeGroup must be used to refer to this assignment later in the program.

 

In your FORMAT statement, however, you misspell that name as "$AgeGp" (a typo in your textbook?). So, at best, SAS could apply a format named $AgeGp (if one existed) to the values of variable AgeGroup.

 

If you correct the typo, you will see that in the legend of your bar chart the abbreviated names of the age groups, 'A' and 'C' (i.e. the original values of variable AgeGroup) are replaced by the more reader-friendly labels 'Adult' and 'Child', respectively.
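
To see the effect of a format outside of the chart, here is a minimal sketch. The data set name demo and its four inline observations are made up for illustration and are not taken from Choc.txt. The format is defined as $AgeGroup and referenced by exactly that name (with a trailing period) in the FORMAT statement:

```sas
/* Define a character format that maps the codes to readable labels */
PROC FORMAT;
   VALUE $AgeGroup 'A' = 'Adult'
                   'C' = 'Child';
RUN;

/* Hypothetical inline sample data (illustrative only) */
DATA demo;
   INPUT AgeGroup $ FavoriteFlavor :$12. @@;
   DATALINES;
A Vanilla C Chocolate A Fruit C Fruit
;
RUN;

/* The name in the FORMAT statement must match the name from PROC FORMAT */
PROC FREQ DATA = demo;
   TABLES AgeGroup;
   FORMAT AgeGroup $AgeGroup.;
RUN;
```

With the names matching, the frequency table should list Adult and Child rather than A and C. The same FORMAT statement inside PROC SGPLOT changes the legend in the same way.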

DingDing
Quartz | Level 8
The example is from "The Little SAS Book", page 231, which is a classic SAS book, so I assume it does not contain a mistake. I confirm that the code is all from the book and that "$AgeGp" is not a misspelling, which is what confuses me.
kannand
Lapis Lazuli | Level 10

Hi Dingding,

 

I see that the variable you are reading is named "AgeGroup". I'd recommend changing the format name in the PROC FORMAT step to "$AgeGp" so that it matches the name used in the FORMAT statement, as shown below:

DATA chocolate; 
INFILE datalines;
INPUT AgeGroup $ FavoriteFlavor $ @@;
datalines;
A VANILLA
A VANILLA
C VANILLA
C CHOCOLATE
C CHOCOLATE
C CHOCOLATE
A CHOCOLATE
A STRAWBERRY
A FRUIT
A FRUIT
A FRUIT
A FRUIT
C FRUIT
A JELLY
A JELLY
C JELLY
;
RUN;

PROC FORMAT;
VALUE $AgeGp 'A' = 'Adult' 'C' = 'Child';
RUN;

* Bar chart for favorite flavor;
PROC SGPLOT DATA = chocolate;
VBAR FavoriteFlavor / GROUP = AgeGroup GROUPDISPLAY = CLUSTER;
FORMAT AgeGroup $AgeGp.;
LABEL FavoriteFlavor = 'Flavor of Chocolate';
TITLE 'Favorite Chocolate Flavors by Age';
RUN;

For your reference, here is the output with the formatted legend. Hope this helps.

 

graph output.png

 

Good luck...!!!

Kannan Deivasigamani
FreelanceReinh
Jade | Level 19

Hi DingDing,

 

Good point from Kannan: Of course, you can also change the format name in the PROC FORMAT step in order to make it consistent with the name used in the FORMAT statement later. If you are not familiar with formats, it may indeed be less confusing to avoid using the same name for variables and formats (although this is permitted, as long as the variable name does not end in a number). The advantage of having the same name (except for the leading $-sign, of course, in the case of character formats) would be that you don't have to remember another name ("Did I call it $AgeGp or $AgeGrp or ...?").

 

In earlier versions of SAS (up to v8, in fact) the length of format names was restricted to 8 characters (including the leading $-sign for character formats). In those days, a format name $AgeGroup would not have been possible and, as a consequence, abbreviated format names such as $AGEGRP were very common.

DingDing
Quartz | Level 8
Thanks, FreelanceReinhard. I reviewed the book and found that the "name" in PROC FORMAT; VALUE name ... is what creates the format name. I had misunderstood it before.
DingDing
Quartz | Level 8
Thanks so much, Kannand, I got what you mean!
