DATA Step, Macro, Functions and more

Proc format to create 1 million formats

Reply
Occasional Contributor
Posts: 12

Proc format to create 1 million formats

Hi Team,

 

I am trying to create given format for 2 million records. Please help me on this. Format is looks like given below:

 

Proc format;

value $urban

'xyqa1'='1'

'qwas2'='2'

'abcd1'='3'

----------

---------

---------

other='00'

;

 

Like that I have 1 million records. Please help me for that how we can create proc format for 1 million records.

 

Thanks,

Abhishek

Super User
Posts: 7,866

Re: Proc format to create 1 million formats

Posted in reply to abhityagi

With that many lookup values, I'd revert to joining/merging. I doubt that you could specify such a format in proc format code, as the length of the necessary value statement will exceed the maximum length of statements that the SAS interpreter can read (32767 bytes). You might try creating a cntlin file from your data.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Trusted Advisor
Posts: 1,586

Re: Proc format to create 1 million formats

Posted in reply to abhityagi

Do you have already a file containg urban name and urban code ?

If yes - you can create a "cntl" file to be used by:

     proc format lib=any cntlin=<the cntl dataset created>; run;

 

Example:

 

data sample;
    input u_name $ u_code;
cards;
abc 1
def 2
ghi 3
; run;

data cntl;
 set sample end=eof;
       retain fmtname 'urban'  type 'C' hlo ' ';
       start = u_name; end = start;
       label = left(u_code);
       output;

      if eof then do;
         start = '**OTHER**'; end=start;
        hlo = 'O';
        output;
     end;
run;

proc format lib=work cntlin=cntl;
run;

 

Occasional Contributor
Posts: 12

Re: Proc format to create 1 million formats

Thanks Shumel. Yes I have containing urban name and code.

After that I have a field(ZIP) and with the help of this I want to drive a new fiels postal_code.

 

postal_code=put(ZIP,$u_name);

 

I want to assign postal_code=u_code IF Zip is matching with u_name.

 

Please help.

Trusted Advisor
Posts: 1,586

Re: Proc format to create 1 million formats

[ Edited ]
Posted in reply to abhityagi

As @KurtBremser remrked, you may have memory issues and reach the limits.

 

It will be helpful if you post an example of your input (few/several records),

with all relevant variables (urban name, code,  zip ...),

and the output result you expect to achieve.

 

Then we can propose a more eficient code to get the desired results. 

Super User
Posts: 7,866

Re: Proc format to create 1 million formats

Posted in reply to abhityagi

A cntlin dataset for proc format needs to have at least 4 variables:

start - containing the value to be formatted/replaced

label - containing the target value

fmtname - the name of the format to be created

type - either 'C' or 'N', denoting character or numeric format. 'C' when start is character, 'N' when start is numeric.

 

So you need to create the cntlin dataset from your current dataset by renaming two variables to start and label and adding type and fmtname.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 298

Re: Proc format to create 1 million formats

Posted in reply to abhityagi

Your problem can be solved by Hash Objects too.

Yet it depends and seeing your sample datset, it can be confirmed.

Frequent Contributor
Posts: 119

Re: Proc format to create 1 million formats

Posted in reply to abhityagi

I aksed a similar question a number of years back to a senior birdie at SAS. They said above about 250K records and the value of the format drops off. Not sure how applicable it is today but little should have changed. That is what I always use as a line.

 

Agreed with other poster who said use a hash. 

Occasional Contributor
Posts: 12

Re: Proc format to create 1 million formats

Hi Alan.

 

Thanks for reply. I got the solution of this problem. I took all the labels in flat files and take into DATA STEP. We can use CNTLIN option later on for Proc format.

Ask a Question
Discussion stats
  • 8 replies
  • 180 views
  • 6 likes
  • 5 in conversation