Help using Base SAS procedures

Proc Format with a dataset that has multiple formats

Accepted Solution Solved
Reply
Contributor
Posts: 39
Accepted Solution

Proc Format with a dataset that has multiple formats

Hello,

 

I am trying to take a dataset that I want to use to create multiple formats in a different dataset with proc format. Below is a sample of my data:

 

 

CNTLIN dataset:

 

FMTNAME               START          END           LABEL

Question_1                 1                    1               TEST 

Question_1                 2                    2               NO TEST    

Question_2                 1                    1               MALE TEST

Question_2                 2                    2               FEMALE TEST 

Question_A                 1                    1               YES

Question_A                 2                    2               NO

 

data have=

 

ID             Question_1               Question_2             Question_A
1                    1                                1                             2

2                    2                                2                             2

3                    1                                1                             1

4                    1                                1                             1

 

 

And what I want is:

ID             Question_1               Question_2             Question_A     
1                    TEST                      MALE TEST            NO

2                   NO TEST                 FEMALE TEST       NO

3                    TEST                      MALE TEST            YES

4                    TEST                      MALE TEST            YES

 

 

I put the numbers intentionally at the end because some of my data do have numbers, and I know how to do this one by one using proc format and cntlin if all the FMTNAME are the same, but it's a monthly process that we would like to automate that includes hundreds of formats.

 

Any help would be great, thank you!


Accepted Solutions
Solution
‎11-01-2016 10:07 AM
Super User
Posts: 11,343

Re: Proc Format with a dataset that has multiple formats

Posted in reply to jacob_klimek

IF you mean that EVERY variable has its own format then you use that cntrlin data set to construct a format statement. However there would have to be a pattern to the naming of the formats so that you can strip the bit from the name to get back to the variable name.

 

The following constructs a string for the numeric formats.

data cntlin;
   informat fmtname $25. type $1. label $25.;
   input FMTNAME type  START     END  LABEL;
datalines;
Question_1_Fmt        N    1                    1               TEST 
Question_1_Fmt        N    2                    2               NO TEST    
Question_2_Fmt        N    1                    1               MALE TEST
Question_2_Fmt        N    2                    2               FEMALE TEST 
Question_A_Fmt        N    1                    1               YES
Question_A_Fmt        N    2                    2               NO
;
run;

proc sql noprint;
   select distinct catx(' ',transtrn(fmtname,'_Fmt',''),cats(FMTName,'.')) as string
         Into : NumFormatAssign separated by ' '
   From cntlin
   where type='N'
   ;
quit;

%put &NumFormatAssign;

You would use that by adding a format assignment as needed. I would recommend to the step you create the data but thats your choice. One example would be:

proc print data=yourdata;
   format &NumFormatAssign;
run;

View solution in original post


All Replies
Super User
Posts: 11,343

Re: Proc Format with a dataset that has multiple formats

Posted in reply to jacob_klimek

Do you mean that you are attempting to build a Cntlin data set to build the formats?

If so there are a couple of things to pay attention to. First the Name you supply for the format cannot end in a digit. If you have a lot of these then you may want to end all of the Fmtnames with something like FMT.

FMTNAME               START          END           LABEL
Question_1_Fmt            1                    1               TEST 
Question_1_Fmt            2                    2               NO TEST    
Question_2_Fmt            1                    1               MALE TEST
Question_2_Fmt            2                    2               FEMALE TEST 
Question_A_Fmt            1                    1               YES
Question_A_Fmt            2                    2               NO

 

 

Second is to make sure that you name character starting with $. Makesure that you have Length for the FMTNAME variable large enough to hold the longest expected value, don't forget to count $.

If your dataset is going to create mixed numeric and character formats the the start and end variables with have to be character. Ensure that the length is large enough for the longest expected value.

 

Third, before use of the cntlin data set you need to sort the cntlin dataset by Fmtname. 

You should also include a TYPE variable with values of N (Numeric value),  C (character value), I (numeric Informat) or J(character informat).

 

If you suspect that you may get out of range values you might want to include an "OTHER" value to indicate out or range or such. If so you woud include a character variable name HLO that has the value of O (for other).

 

Anything fancier you might make a short example Proc format statement and use the CNTLOUT option to see all of the variables and options set.

Contributor
Posts: 39

Re: Proc Format with a dataset that has multiple formats

I think I have the right set up for the dataset. Where I am struggling is just in the application part. The only time I have ever used formatting has been for one format at a time applied to one variable at a time, but I want this to do it for hundreds of variables each with their own independent formatting.
Contributor
Posts: 39

Re: Proc Format with a dataset that has multiple formats

I think I have the right set up for the dataset. Where I am struggling is just in the application part. The only time I have ever used formatting has been for one format at a time applied to one variable at a time, but I want this to do it for hundreds of variables each with their own independent formatting.
Solution
‎11-01-2016 10:07 AM
Super User
Posts: 11,343

Re: Proc Format with a dataset that has multiple formats

Posted in reply to jacob_klimek

IF you mean that EVERY variable has its own format then you use that cntrlin data set to construct a format statement. However there would have to be a pattern to the naming of the formats so that you can strip the bit from the name to get back to the variable name.

 

The following constructs a string for the numeric formats.

data cntlin;
   informat fmtname $25. type $1. label $25.;
   input FMTNAME type  START     END  LABEL;
datalines;
Question_1_Fmt        N    1                    1               TEST 
Question_1_Fmt        N    2                    2               NO TEST    
Question_2_Fmt        N    1                    1               MALE TEST
Question_2_Fmt        N    2                    2               FEMALE TEST 
Question_A_Fmt        N    1                    1               YES
Question_A_Fmt        N    2                    2               NO
;
run;

proc sql noprint;
   select distinct catx(' ',transtrn(fmtname,'_Fmt',''),cats(FMTName,'.')) as string
         Into : NumFormatAssign separated by ' '
   From cntlin
   where type='N'
   ;
quit;

%put &NumFormatAssign;

You would use that by adding a format assignment as needed. I would recommend to the step you create the data but thats your choice. One example would be:

proc print data=yourdata;
   format &NumFormatAssign;
run;
Respected Advisor
Posts: 4,919

Re: Proc Format with a dataset that has multiple formats

Posted in reply to jacob_klimek

You could use a single table driven format if you put your data in long form. The advantage would be that you wouldn't have to associate a different format to every table column. For example:

 

data QA;
length SET QUESTION $4 ANSWER $3 LABEL $20;
input SET QUESTION ANSWER LABEL &;
datalines;
1 1  1  TEST 
1 1  2  NO TEST    
1 2  1  MALE TEST
1 2  2  FEMALE TEST 
1 A  1  YES
1 A  2  NO
;

data QA_CNTLIN;
set QA end=done;
FMTNAME = "$QAfmt";
length START $15;
START = cats( "S", SET, "Q", QUESTION, "_", ANSWER, "X");
output;
if done then do;
    HLO = "O";
    LABEL = "---";
    output;
    end;
run;

proc format cntlin=QA_CNTLIN; run;

data RESULTS;
SET = 1;
input ID Question_1 Question_2 Question_A;
array Ans Question_:;
array Q{3} $ _temporary_ ("1" "2" "A");
do i = 1 to dim(Ans);
    QUESTION = Q{i};
    ANSWER = Ans{i};
    result = cats( "S", SET, "Q", QUESTION, "_", ANSWER, "X");
    output;
    end;
keep SET ID QUESTION ANSWER result;
format result $QAfmt.;
datalines;
1   1   1  2
2   2   2  2
3   1   1  1
4   1   1  1
5   1   3  1
;

proc transpose data=results out=rtable(drop=_name_) prefix=Question_;
by set id;
id question;
var result;
run;

proc print data=rtable noobs; run;
PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 351 views
  • 0 likes
  • 3 in conversation