Solved: Re: Proc Format with a dataset that has multiple formats

jacob_klimek · Posted 09-15-2016 05:37 PM

Hello,

I am trying to take a dataset that I want to use to create multiple formats in a different dataset with proc format. Below is a sample of my data:

CNTLIN dataset:

FMTNAME START END LABEL

Question_1 1 1 TEST

Question_1 2 2 NO TEST

Question_2 1 1 MALE TEST

Question_2 2 2 FEMALE TEST

Question_A 1 1 YES

Question_A 2 2 NO

data have=

ID Question_1 Question_2 Question_A
1 1 1 2

2 2 2 2

3 1 1 1

4 1 1 1

And what I want is:

ID Question_1 Question_2 Question_A
1 TEST MALE TEST NO

2 NO TEST FEMALE TEST NO

3 TEST MALE TEST YES

4 TEST MALE TEST YES

I put the numbers intentionally at the end because some of my data do have numbers, and I know how to do this one by one using proc format and cntlin if all the FMTNAME are the same, but it's a monthly process that we would like to automate that includes hundreds of formats.

Any help would be great, thank you!

ballardw · Posted 09-16-2016 11:30 AM

IF you mean that EVERY variable has its own format then you use that cntrlin data set to construct a format statement. However there would have to be a pattern to the naming of the formats so that you can strip the bit from the name to get back to the variable name.

The following constructs a string for the numeric formats.

data cntlin;
   informat fmtname $25. type $1. label $25.;
   input FMTNAME type  START     END  LABEL;
datalines;
Question_1_Fmt        N    1                    1               TEST 
Question_1_Fmt        N    2                    2               NO TEST    
Question_2_Fmt        N    1                    1               MALE TEST
Question_2_Fmt        N    2                    2               FEMALE TEST 
Question_A_Fmt        N    1                    1               YES
Question_A_Fmt        N    2                    2               NO
;
run;

proc sql noprint;
   select distinct catx(' ',transtrn(fmtname,'_Fmt',''),cats(FMTName,'.')) as string
         Into : NumFormatAssign separated by ' '
   From cntlin
   where type='N'
   ;
quit;

%put &NumFormatAssign;

You would use that by adding a format assignment as needed. I would recommend to the step you create the data but thats your choice. One example would be:

proc print data=yourdata;
   format &NumFormatAssign;
run;

View solution in original post

ballardw · Posted 09-15-2016 06:05 PM

Do you mean that you are attempting to build a Cntlin data set to build the formats?

If so there are a couple of things to pay attention to. First the Name you supply for the format cannot end in a digit. If you have a lot of these then you may want to end all of the Fmtnames with something like FMT.

FMTNAME               START          END           LABEL
Question_1_Fmt            1                    1               TEST 
Question_1_Fmt            2                    2               NO TEST    
Question_2_Fmt            1                    1               MALE TEST
Question_2_Fmt            2                    2               FEMALE TEST 
Question_A_Fmt            1                    1               YES
Question_A_Fmt            2                    2               NO

Second is to make sure that you name character starting with $. Makesure that you have Length for the FMTNAME variable large enough to hold the longest expected value, don't forget to count $.

If your dataset is going to create mixed numeric and character formats the the start and end variables with have to be character. Ensure that the length is large enough for the longest expected value.

Third, before use of the cntlin data set you need to sort the cntlin dataset by Fmtname.

You should also include a TYPE variable with values of N (Numeric value), C (character value), I (numeric Informat) or J(character informat).

If you suspect that you may get out of range values you might want to include an "OTHER" value to indicate out or range or such. If so you woud include a character variable name HLO that has the value of O (for other).

Anything fancier you might make a short example Proc format statement and use the CNTLOUT option to see all of the variables and options set.

jacob_klimek · Posted 09-16-2016 09:02 AM

I think I have the right set up for the dataset. Where I am struggling is just in the application part. The only time I have ever used formatting has been for one format at a time applied to one variable at a time, but I want this to do it for hundreds of variables each with their own independent formatting.

jacob_klimek · Posted 09-16-2016 09:03 AM

I think I have the right set up for the dataset. Where I am struggling is just in the application part. The only time I have ever used formatting has been for one format at a time applied to one variable at a time, but I want this to do it for hundreds of variables each with their own independent formatting.

ballardw · Posted 09-16-2016 11:30 AM

IF you mean that EVERY variable has its own format then you use that cntrlin data set to construct a format statement. However there would have to be a pattern to the naming of the formats so that you can strip the bit from the name to get back to the variable name.

The following constructs a string for the numeric formats.

data cntlin;
   informat fmtname $25. type $1. label $25.;
   input FMTNAME type  START     END  LABEL;
datalines;
Question_1_Fmt        N    1                    1               TEST 
Question_1_Fmt        N    2                    2               NO TEST    
Question_2_Fmt        N    1                    1               MALE TEST
Question_2_Fmt        N    2                    2               FEMALE TEST 
Question_A_Fmt        N    1                    1               YES
Question_A_Fmt        N    2                    2               NO
;
run;

proc sql noprint;
   select distinct catx(' ',transtrn(fmtname,'_Fmt',''),cats(FMTName,'.')) as string
         Into : NumFormatAssign separated by ' '
   From cntlin
   where type='N'
   ;
quit;

%put &NumFormatAssign;

You would use that by adding a format assignment as needed. I would recommend to the step you create the data but thats your choice. One example would be:

proc print data=yourdata;
   format &NumFormatAssign;
run;

PGStats · Posted 09-16-2016 01:12 AM

You could use a single table driven format if you put your data in long form. The advantage would be that you wouldn't have to associate a different format to every table column. For example:

data QA;
length SET QUESTION $4 ANSWER $3 LABEL $20;
input SET QUESTION ANSWER LABEL &;
datalines;
1 1  1  TEST 
1 1  2  NO TEST    
1 2  1  MALE TEST
1 2  2  FEMALE TEST 
1 A  1  YES
1 A  2  NO
;

data QA_CNTLIN;
set QA end=done;
FMTNAME = "$QAfmt";
length START $15;
START = cats( "S", SET, "Q", QUESTION, "_", ANSWER, "X");
output;
if done then do;
    HLO = "O";
    LABEL = "---";
    output;
    end;
run;

proc format cntlin=QA_CNTLIN; run;

data RESULTS;
SET = 1;
input ID Question_1 Question_2 Question_A;
array Ans Question_:;
array Q{3} $ _temporary_ ("1" "2" "A");
do i = 1 to dim(Ans);
    QUESTION = Q{i};
    ANSWER = Ans{i};
    result = cats( "S", SET, "Q", QUESTION, "_", ANSWER, "X");
    output;
    end;
keep SET ID QUESTION ANSWER result;
format result $QAfmt.;
datalines;
1   1   1  2
2   2   2  2
3   1   1  1
4   1   1  1
5   1   3  1
;

proc transpose data=results out=rtable(drop=_name_) prefix=Question_;
by set id;
id question;
var result;
run;

proc print data=rtable noobs; run;

PG

Proc Format with a dataset that has multiple formats

Re: Proc Format with a dataset that has multiple formats

Re: Proc Format with a dataset that has multiple formats

Re: Proc Format with a dataset that has multiple formats

Re: Proc Format with a dataset that has multiple formats

Re: Proc Format with a dataset that has multiple formats

Re: Proc Format with a dataset that has multiple formats

Registration is open

SAS Training: Just a Click Away