DATA Step, Macro, Functions and more

more efficient way to program

Accepted Solution Solved
Reply
Contributor
Posts: 36
Accepted Solution

more efficient way to program

I have a table that is updated daily with the following values.
I need to reference these values as global variables in other sas programs.

ID:         Name:               Type:

101         Data                 Bytes
102         Account            Rand
103         DataValue         Bytes


I currently use the program below and manually add in entries so the problem is that I am not aware of new entries and often overlook it.
Is there a more efficient way to do this??
I want to create 2 variables data_unit and rand_unit based on the name and type.

*********************************************************************************************************
PROC FORMAT LIBRARY=WORK;   
  VALUE $Unit             
    'DATA'   =  '*B'
    'ACCOUNT'=  '*R'
    'DATAVALUE'='*B';
RUN;

%GLOBAL Rand_Unit Data_Unit;

DATA Unit;
   LENGTH Unit_Type $12 Unit_Name $10;
   INPUT Unit_Type 1-12  Unit_Name 13-23 ;
   Datalines;
Data_Unit DATA
Data_Unit DATAVALUE
Rand_Unit ACCOUNT
;;;
run;

PROC SQL Noprint;
SELECT QUOTE(TRIM(Unit_Name))INTO Smiley Very Happyata_Unit separated BY ',' FROM Unit WHERE Unit_Type='Data_Unit';;
SELECT QUOTE(TRIM(Unit_Name))INTO :Rand_Unit separated BY ',' FROM Unit WHERE Unit_Type='Rand_Unit';;
QUIT;

%PUT &Data_Unit;
%PUT &Rand_Unit;
*********************************************************************************************************

Please advise, thank you.


Accepted Solutions
Solution
‎06-28-2012 09:17 AM
Respected Advisor
Posts: 4,173

Re: more efficient way to program

Hi Havi

Shivas actually provided already the code demonstrating how you can create a format based on a data set. So whenever the template dataset changes you just re-create the format.

Below a code variation of what Shivas already posted to re-demonstrate the concept.

/* the changing template data set which is the source */
data template;
  infile datalines dlm=' ' truncover;
  input id name :$10. type :$8.;
datalines;
101 Data Bytes
102 Account Rand
103 DataValue Bytes
104 Datavalue Bytes
;
run;

/* only use distinct key/value pairs as source for format */
proc sql;
  create view DistinctValueList as 
    select distinct strip(upcase(name)) as name, type
    from template
    ;
quit;

/* (re-) create format based on latest entries in template data set: prepare the source data set for the format */
data fmt (keep=FMTNAME START  LABEL TYPE);
  length FMTNAME $30. START  $256.;
  set DistinctValueList;
  FMTNAME     = 'unit' ;
  START       = name;
  END         = START;
  if type     ='Rand' then LABEL='000';
  else LABEL  ='123';
  TYPE        = 'C' ;
run;

/* (re-) create format based on latest entries in template data set: create the format */

proc format cntlin=fmt lib=work;
run;


/*** and now let's test it ***/
data sample;
   input unit $10.;
   recoded=put(strip(upcase(unit)),$unit20.);
   Datalines;
DATA
DATAVALUE
ACCOUNT
OTHERVALUE
;;;
run;

View solution in original post


All Replies
Super Contributor
Posts: 349

Re: more efficient way to program

Hi,

I don't know what is the output required .

I guess this what you want...

DATA Unit (keep=FMTNAME START END LABEL TYPE);

   LENGTH Unit_Type $12 Unit_Name $10;

   INPUT Unit_Type 1-10  Unit_Name 11-23 ;

     FMTNAME = 'Type' ;

      START   = Unit_Name;

      END     = START;

      LABEL   = Unit_Type;

      TYPE    = 'C' ;

   Datalines;

Data_Unit DATA

Data_Unit DATAVALUE

Rand_Unit ACCOUNT

;;;

run;

proc format cntlin=unit lib=work; run;

data want;

input ID         Name $ 5-14            Type $ 15-20;

unit=put(Name,Type.);

cards;

101 DATA      Bytes

102 ACCOUNT   Rand

103 DATAVALUE Bytes

;

run;

Thanks,

Shiva

Contributor
Posts: 36

Re: more efficient way to program

Thank you for your response Shivas.

I actually need to read values from the table in order to create global variables.

Using cards/datalines will mean I have to enter data from the table and will not be aware immediately when a new entry is added.

In order to do this I need to go through each entry in the table and check its type to determine in which group of global variables it belongs.

eg: if type = bytes then it is data_unit

     if type = rand then it is rand_unit

Super Contributor
Posts: 349

Re: more efficient way to program

Hi,

As i dont have tables,I have created using datalines or cards.Just try to execute this on sample data.

/* Original table that you want to read ..You can delete this in your code as you are reading for table*/

DATA original ;

   LENGTH Unit_Type $12 Unit_Name $10;

   INPUT Unit_Type 1-10  Unit_Name 11-23 ;

    Datalines;

Data_Unit DATA

Data_Unit DATAVALUE

Rand_Unit ACCOUNT

;;;

run;

/* format for your original table*/

  data fmt (keep=FMTNAME START END LABEL TYPE);

length FMTNAME $30. START END $256.;

set WORK.original;

      FMTNAME = 'unit' ;

      START   = Unit_Name;

      END     = START;

      LABEL   = Unit_Type;

      TYPE    = 'C' ;

run;

proc format cntlin=fmt lib=work; run;

/*Check or applying formats for the newrecords as well as for existing records*/

data wanttable;

input ID         Name $ 5-14            Type $ 15-20;

unit=put(Name,unit.);

cards;

101 DATA      Bytes

102 ACCOUNT   Rand

103 DATAVALUE Bytes

;

run;

Thanks,

Shiva

Respected Advisor
Posts: 4,173

Re: more efficient way to program

Creating a format to look up values in downstream processes sounds like a good idea to me. Using SAS macro variables to store these values in a list for lookup sounds only like the second best approach (unless this needs to be processed on a DB and the list is short).

May be you show us/explain us how you actually need to look up the values later on for us to give you an adequate solution.

Contributor
Posts: 36

Re: more efficient way to program

Hi Patrick

Thank you for your response, please see below:


/* template.sas*/

ID:          Name:               Type:

101         Data                 Bytes
102         Account            Rand
103         DataValue         Bytes
104         Datavalue          Bytes

I currently use the program below and manually add in entries so the problem is that I am not aware of new entries that
are added in template.sas and often overlook it.

I want to create 2 variables data_unit and rand_unit as shown below:

*********************************************************************************************************
PROC FORMAT LIBRARY=WORK;   
  VALUE $Unit             
    'DATA'   =  '*B'
    'ACCOUNT'=  '*R'
    'DATAVALUE'='*B';
RUN;

%GLOBAL Rand_Unit Data_Unit;

DATA Unit;
   LENGTH Unit_Type $12 Unit_Name $10;
   INPUT Unit_Type 1-12  Unit_Name 13-23 ;
   Datalines;
Data_Unit DATA
Data_Unit DATAVALUE
Rand_Unit ACCOUNT
;;;
run;

PROC SQL Noprint;
SELECT QUOTE(TRIM(Unit_Name))INTO Smiley Very Happyata_Unit separated BY ',' FROM Unit WHERE Unit_Type='Data_Unit';;
SELECT QUOTE(TRIM(Unit_Name))INTO :Rand_Unit separated BY ',' FROM Unit WHERE Unit_Type='Rand_Unit';;
QUIT;

%PUT &Data_Unit;
%PUT &Rand_Unit;


I thereafter include this program in a program that needs the variables &Rand_unit and &data_unit
%include "/home/xxx/global_variables.sas";


This is an example of how I use it:


IF UPCASE(TRIM(unit)) in (&Rand_unit) THEN Prod='000';
ELSE IF UPCASE(TRIM(unit)) in (&Data_unit) THEN Prod='123';

Solution
‎06-28-2012 09:17 AM
Respected Advisor
Posts: 4,173

Re: more efficient way to program

Hi Havi

Shivas actually provided already the code demonstrating how you can create a format based on a data set. So whenever the template dataset changes you just re-create the format.

Below a code variation of what Shivas already posted to re-demonstrate the concept.

/* the changing template data set which is the source */
data template;
  infile datalines dlm=' ' truncover;
  input id name :$10. type :$8.;
datalines;
101 Data Bytes
102 Account Rand
103 DataValue Bytes
104 Datavalue Bytes
;
run;

/* only use distinct key/value pairs as source for format */
proc sql;
  create view DistinctValueList as 
    select distinct strip(upcase(name)) as name, type
    from template
    ;
quit;

/* (re-) create format based on latest entries in template data set: prepare the source data set for the format */
data fmt (keep=FMTNAME START  LABEL TYPE);
  length FMTNAME $30. START  $256.;
  set DistinctValueList;
  FMTNAME     = 'unit' ;
  START       = name;
  END         = START;
  if type     ='Rand' then LABEL='000';
  else LABEL  ='123';
  TYPE        = 'C' ;
run;

/* (re-) create format based on latest entries in template data set: create the format */

proc format cntlin=fmt lib=work;
run;


/*** and now let's test it ***/
data sample;
   input unit $10.;
   recoded=put(strip(upcase(unit)),$unit20.);
   Datalines;
DATA
DATAVALUE
ACCOUNT
OTHERVALUE
;;;
run;

Contributor
Posts: 36

Re: more efficient way to program

Thanks Patrick, makes sense.

The format however will not work as I have duplicate names with different types.

Respected Advisor
Posts: 4,173

Re: more efficient way to program

Then just amend the code so that it works with your data, eg:


/* (re-) create format based on latest entries in template data set: prepare the source data set for the format */
proc sql;
  create table fmt as 
    select distinct
        'unit' as fmtname
      , strip(upcase(name)) as start
      , strip(upcase(name)) as end
      , case(type) when 'Rand' then '000' else '123' end as label
      , 'C' as type
    from template
    ;
quit;

Valued Guide
Posts: 765

Re: more efficient way to program

hi ... another idea ... using Patrick's data set (adding an "OTHER" condition to the format) ...

proc sql;

create view distinctvaluelist as

select distinct '$unit' as fmtname, upcase(name) as start, ' ' as hlo,

    case

       when type eq 'Rand' then '000'

       else '123'

    end as label

from template

union corr

select '$unit' as fmtname, ' ' as start, 'o' as hlo, '999' as label from template;

quit;

proc format cntlin=distinctvaluelist;

run;

data sample;

input unit  :$10. @@;

recoded = put(upcase(unit),$unit.);

datalines;

DATA DATAVALUE ACCOUNT OTHERVALUE

;

Obs    unit          recoded

1     DATA            123

2     DATAVALUE       123

3     ACCOUNT         000

4     OTHERVALUE      999

if you want to stick with Patrick's PROC SQL plus a subsequent data step, maybe you would consider a couple changes (if a variable has the same value for all observations, you can use RETAIN to assign the value rather than an executable assignment that occurs at every data step iteration) ...

proc sql;

create view DistinctValueList as 

select distinct strip(upcase(name)) as start, type as new

from template;

quit;

data fmt / view=fmt;

retain fmtname '$unit';

do until (done);

   set distinctvaluelist end=done;

   label = ifc(new eq 'Rand', '000' , '123');

   output;

end;

hlo = 'o';

label = '999';

output;

run;

proc format cntlin=distinctvaluelist;

run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 414 views
  • 6 likes
  • 4 in conversation