DATA Step, Macro, Functions and more

Retain Statement vs Array statement

Accepted Solution Solved
Reply
Contributor
Posts: 60
Accepted Solution

Retain Statement vs Array statement

Hello,

 

I have the below code that works as a retain statement in SAS. However, I believe it could be done more effectively using an array statement and was wondering if someone could shead some light on how I could do it or even show me.

 

DATA CONCAT ;
SET WEEKS_OWNERSHIPS;
BY SBR_ID;
FORMAT 				RESORT1 RESORT2 RESORT3 $500.;
RETAIN 				   RESORT1 RESORT2 RESORT3 ;

IF FIRST.SBR_ID THEN DO;

RESORT1=''; RESORT2='';RESORT3='';FLAG1=. ;

END; 

IF FLAG1 = . THEN DO
RESORT1=RESORT_ID;
END;

IF FLAG1 = 1 THEN DO
RESORT2=RESORT_ID;
END;

IF FLAG1 = 2 THEN DO
RESORT3=RESORT_ID;
END;

FLAG1+1;

IF LAST.SBR_ID THEN OUTPUT;
RUN;

 

Thanks,

Cam


Accepted Solutions
Solution
‎11-24-2016 10:40 AM
Super User
Posts: 5,516

Re: Retain Statement vs Array statement

[ Edited ]
Posted in reply to CamRutherford

While you have some suggestions, I'm not sure that they increment FLAG properly.  Here's how I would try it:

 

data want;

array resort {5} $ 500;

flag=0;

do until (last.sbr_id);

   set weeks_ownerships;

   by sbr_id;

   flag + 1;

   if flag <= 5 then resort{flag} = resort_id;

end;

drop flag resort_id;

run;

 

By using a DO loop to read all observations for an ID, you don't need to RETAIN resort variables and you don't need to reinitialize them to missing.

 

Also note, PROC TRANSPOSE might be a viable approach (depending on what other variables are part of the data set).

View solution in original post


All Replies
Trusted Advisor
Posts: 1,585

Re: Retain Statement vs Array statement

Posted in reply to CamRutherford

There are more than one way to write a shorter code, thoght not specifically moe efficient:

DATA CONCAT ;
SET WEEKS_OWNERSHIPS;
BY SBR_ID;
FORMAT 	 RESORT1 - RESORT3 $500.;     /* equvilent to RESORT1 RESORT2 RESORT3 */
RETAIN 	 RESORT1 - RESORT3 ;           
array resx $ RESORT1 - RESORT3 ; 

IF FIRST.SBR_ID THEN DO i=1 to 3;
      resx(i) = ' ';
END; 

if flag = . then RESORT1=RESORT_ID; else
resx(flag +1) = RESORT_ID; IF LAST.SBR_ID THEN OUTPUT; RUN;
Super User
Super User
Posts: 7,991

Re: Retain Statement vs Array statement

[ Edited ]
Posted in reply to CamRutherford

Note that this code is not tested, if you want tested code post test data in the form of a datastep:

data concat ;
  set weeks_ownerships;
  by sbr_id;
  array resort{3} $500;
  retain resort:;
if first.sbr_id then call missing(of resort:); resort{sum(flag,1)}=resort_id; if last.sbr_id then output; run;

Note also that writing code all in upcase and with no indentations and such like makes your code very hard to read.

Note I use sum() so that if flag=. it will become 1, if you just use + it wont.

Contributor
Posts: 60

Re: Retain Statement vs Array statement

Hi, 

 

I actually find it easier to read in capitals.

 

Anyway, this is my complete code and does what I need but you're array doesn't seem to do this...

 

DATA CONCAT ;
SET WEEKS_OWNERSHIPS;
BY SBR_ID;
FORMAT 		RESORT1 RESORT2 RESORT3 RESORT4 RESORT5 R1_UNIT R2_UNIT R3_UNIT R4_UNIT R5_UNIT 
						R1_OWN_DATE R2_OWN_DATE R3_OWN_DATE R4_OWN_DATE R5_OWN_DATE R1_BED R2_BED R3_BED R4_BED R5_BED
						R1_OCCUPANCY_MAX R2_OCCUPANCY_MAX R3_OCCUPANCY_MAX R4_OCCUPANCY_MAX R5_OCCUPANCY_MAX
						R1_PRIVACY_MAX R2_PRIVACY_MAX R3_PRIVACY_MAX R4_PRIVACY_MAX R5_PRIVACY_MAX R1_OWN_KEY R2_OWN_KEY
						R3_OWN_KEY R4_OWN_KEY R5_OWN_KEY R1_RESORTNAME R2_RESORTNAME R3_RESORTNAME R4_RESORTNAME R5_RESORTNAME
						R1_RESORTSTAT R2_RESORTSTAT R3_RESORTSTAT R4_RESORTSTAT R5_RESORTSTAT R1_RESORTAREA R2_RESORTAREA R3_RESORTAREA
						R4_RESORTAREA R5_RESORTAREA R1_PRIM_REG R2_PRIM_REG R3_PRIM_REG R4_PRIM_REG R5_PRIM_REG R1_SUB_REG
						R2_SUB_REG R3_SUB_REG R4_SUB_REG R5_SUB_REG R1_FIXED_FLOAT R2_FIXED_FLOAT R3_FIXED_FLOAT R4_FIXED_FLOAT R5_FIXED_FLOAT $25.
						R1_START_DT  R2_START_DT  R3_START_DT  R4_START_DT  R5_START_DT R1_END_DT R2_END_DT R3_END_DT R4_END_DT R5_END_DT	DATE9.;
RETAIN 		   RESORT1 RESORT2 RESORT3 RESORT4 RESORT5 R1_UNIT R2_UNIT R3_UNIT R4_UNIT R5_UNIT 
						R1_OWN_DATE R2_OWN_DATE R3_OWN_DATE R4_OWN_DATE R5_OWN_DATE R1_BED R2_BED R3_BED R4_BED R5_BED
						R1_OCCUPANCY_MAX R2_OCCUPANCY_MAX R3_OCCUPANCY_MAX R4_OCCUPANCY_MAX R5_OCCUPANCY_MAX
						R1_PRIVACY_MAX R2_PRIVACY_MAX R3_PRIVACY_MAX R4_PRIVACY_MAX R5_PRIVACY_MAX R1_OWN_KEY R2_OWN_KEY
						R3_OWN_KEY R4_OWN_KEY R5_OWN_KEY R1_RESORTNAME R2_RESORTNAME R3_RESORTNAME R4_RESORTNAME R5_RESORTNAME
						R1_RESORTSTAT R2_RESORTSTAT R3_RESORTSTAT R4_RESORTSTAT R5_RESORTSTAT R1_RESORTAREA R2_RESORTAREA R3_RESORTAREA
						R4_RESORTAREA R5_RESORTAREA R1_PRIM_REG R2_PRIM_REG R3_PRIM_REG R4_PRIM_REG R5_PRIM_REG R1_SUB_REG
						R2_SUB_REG R3_SUB_REG R4_SUB_REG R5_SUB_REG R1_FIXED_FLOAT R2_FIXED_FLOAT R3_FIXED_FLOAT R4_FIXED_FLOAT R5_FIXED_FLOAT
						R1_START_DT  R2_START_DT  R3_START_DT  R4_START_DT  R5_START_DT R1_END_DT R2_END_DT R3_END_DT R4_END_DT R5_END_DT ;

IF FIRST.SBR_ID THEN DO;

RESORT1=''; RESORT2=''; RESORT3=''; RESORT4=''; RESORT5='';
R1_UNIT=''; R2_UNIT=''; R3_UNIT=''; R4_UNIT=''; R5_UNIT=''; 
R1_OWN_DATE=''; R2_OWN_DATE=''; R3_OWN_DATE=''; R4_OWN_DATE=''; R5_OWN_DATE='';
R1_BED=''; R2_BED=''; R3_BED=''; R4_BED=''; R5_BED='';
R1_OCCUPANCY_MAX=''; R2_OCCUPANCY_MAX=''; R3_OCCUPANCY_MAX=''; R4_OCCUPANCY_MAX=''; R5_OCCUPANCY_MAX='';
R1_PRIVACY_MAX=''; R2_PRIVACY_MAX=''; R3_PRIVACY_MAX=''; R4_PRIVACY_MAX=''; R5_PRIVACY_MAX='';
R1_OWN_KEY=''; R2_OWN_KEY=''; R3_OWN_KEY=''; R4_OWN_KEY=''; R5_OWN_KEY=''; 
R1_RESORTNAME=''; R2_RESORTNAME=''; R3_RESORTNAME=''; R4_RESORTNAME=''; R5_RESORTNAME='';
R1_RESORTSTAT=''; R2_RESORTSTAT=''; R3_RESORTSTAT=''; R4_RESORTSTAT=''; R5_RESORTSTAT='';
R1_RESORTAREA=''; R2_RESORTAREA=''; R3_RESORTAREA=''; R4_RESORTAREA=''; R5_RESORTAREA='';
R1_PRIM_REG=''; R2_PRIM_REG=''; R3_PRIM_REG=''; R4_PRIM_REG=''; R5_PRIM_REG='';
R1_SUB_REG=''; R2_SUB_REG=''; R3_SUB_REG=''; R4_SUB_REG=''; R5_SUB_REG='';
R1_FIXED_FLOAT=''; R2_FIXED_FLOAT=''; R3_FIXED_FLOAT=''; R4_FIXED_FLOAT=''; R5_FIXED_FLOAT='';
R1_START_DT='';  R2_START_DT='';  R3_START_DT='';  R4_START_DT='';  R5_START_DT=''; 
R1_END_DT=''; R2_END_DT=''; R3_END_DT=''; R4_END_DT=''; R5_END_DT='';		FLAG1=. ;
END; 

IF FLAG1 = . THEN DO
RESORT1=RESORT_ID;			R1_OCCUPANCY_MAX=MAX_OCC;			R1_RESORTNAME=RESORT_NAME;			R1_PRIM_REG=PRIM_REGION;			R1_END_DT=ENDDT_FORMATTED;
R1_UNIT=UNIT_NO;			R1_SUB_REG=SUBREG;			R1_RESORTSTAT=RESORT_STAT;			R1_OWN_DATE=OWN_PURCH_DT;			R1_PRIVACY_MAX=PRIV_OCC;
R1_RESORTAREA=RESORT_AREA;			R1_BED=BEDROOM;			R1_FIXED_FLOAT=FIXEDFLOAT;			R1_OWN_KEY=OWNERSHIP_KEY;			R1_START_DT=STARTDT_FORMATTED;
END;

IF FLAG1 = 1 THEN DO
RESORT2=RESORT_ID;			R2_OCCUPANCY_MAX=MAX_OCC;			R2_RESORTNAME=RESORT_NAME;			R2_PRIM_REG=PRIM_REGION;			R2_END_DT=ENDDT_FORMATTED;
R2_UNIT=UNIT_NO;			R2_SUB_REG=SUBREG;			R2_RESORTSTAT=RESORT_STAT;			R2_OWN_DATE=OWN_PURCH_DT;			R2_PRIVACY_MAX=PRIV_OCC;
R2_RESORTAREA=RESORT_AREA;			R2_BED=BEDROOM;			R2_FIXED_FLOAT=FIXEDFLOAT;			R2_OWN_KEY=OWNERSHIP_KEY;			R2_START_DT=STARTDT_FORMATTED;
END;

IF FLAG1 = 2 THEN DO
RESORT3=RESORT_ID;			R3_OCCUPANCY_MAX=MAX_OCC;			R3_RESORTNAME=RESORT_NAME;			R3_PRIM_REG=PRIM_REGION;			R3_END_DT=ENDDT_FORMATTED;
R3_UNIT=UNIT_NO;			R3_SUB_REG=SUBREG;			R3_RESORTSTAT=RESORT_STAT;			R3_OWN_DATE=OWN_PURCH_DT;			R3_PRIVACY_MAX=PRIV_OCC;
R3_RESORTAREA=RESORT_AREA;			R3_BED=BEDROOM;			R3_FIXED_FLOAT=FIXEDFLOAT;			R3_OWN_KEY=OWNERSHIP_KEY;			R3_START_DT=STARTDT_FORMATTED;
END;

IF FLAG1 = 3 THEN DO
RESORT4=RESORT_ID;			R4_OCCUPANCY_MAX=MAX_OCC;			R4_RESORTNAME=RESORT_NAME;			R4_PRIM_REG=PRIM_REGION;			R4_END_DT=ENDDT_FORMATTED;
R4_UNIT=UNIT_NO;			R4_SUB_REG=SUBREG;			R4_RESORTSTAT=RESORT_STAT;			R4_OWN_DATE=OWN_PURCH_DT;			R4_PRIVACY_MAX=PRIV_OCC;
R4_RESORTAREA=RESORT_AREA;			R4_BED=BEDROOM;			R4_FIXED_FLOAT=FIXEDFLOAT;			R4_OWN_KEY=OWNERSHIP_KEY;			R4_START_DT=STARTDT_FORMATTED;
END;

IF FLAG1 = 4 THEN DO
RESORT5=RESORT_ID;			R5_OCCUPANCY_MAX=MAX_OCC;			R5_RESORTNAME=RESORT_NAME;			R5_PRIM_REG=PRIM_REGION;			R5_END_DT=ENDDT_FORMATTED;
R5_UNIT=UNIT_NO;			R5_SUB_REG=SUBREG;			R5_RESORTSTAT=RESORT_STAT;			R5_OWN_DATE=OWN_PURCH_DT;			R5_PRIVACY_MAX=PRIV_OCC;
R5_RESORTAREA=RESORT_AREA;			R5_BED=BEDROOM;			R5_FIXED_FLOAT=FIXEDFLOAT;			R5_OWN_KEY=OWNERSHIP_KEY;			R5_START_DT=STARTDT_FORMATTED;
END;

FLAG1+1;
IF LAST.SBR_ID THEN OUTPUT;
RUN;
Super User
Posts: 5,516

Re: Retain Statement vs Array statement

RW9,

 

I'm not sure why I gave this a second look.  Here are a couple of easily fixable issues related to your program.

 

How does FLAG ever become 2 (or 3 or 4)?  

 

CALL MISSING will also set RESORT_ID to missing.

Solution
‎11-24-2016 10:40 AM
Super User
Posts: 5,516

Re: Retain Statement vs Array statement

[ Edited ]
Posted in reply to CamRutherford

While you have some suggestions, I'm not sure that they increment FLAG properly.  Here's how I would try it:

 

data want;

array resort {5} $ 500;

flag=0;

do until (last.sbr_id);

   set weeks_ownerships;

   by sbr_id;

   flag + 1;

   if flag <= 5 then resort{flag} = resort_id;

end;

drop flag resort_id;

run;

 

By using a DO loop to read all observations for an ID, you don't need to RETAIN resort variables and you don't need to reinitialize them to missing.

 

Also note, PROC TRANSPOSE might be a viable approach (depending on what other variables are part of the data set).

Contributor
Posts: 60

Re: Retain Statement vs Array statement

Posted in reply to Astounding

Hi I get an error...

 

1544       data concat;
1545       array resort {5} $ 500;
1546       flag=0;
1547       do until (last.sbr_id);
1548          set weeks_ownerships;
1549          by sbr_id;
1550          flag + 1;
1551          resort{flag} = resort;
ERROR: Illegal reference to the array resort.
1552       end;
1553       drop flag resort;
                     ______
                     241
ERROR 241-185: The array resort is not allowed in a DROP/KEEP/RENAME context.
Trusted Advisor
Posts: 1,585

Re: Retain Statement vs Array statement

Posted in reply to CamRutherford

Few questions to clarify:

 

1) What is the purpose of assigning RESORT_ID into an array ?

    Should array have 3 entries only ? dynamic amount ? what are the rules ?

 

2) Is FLAG a variable in the input dataset (HAVE) or a temporary variable ?

    @Astounding - within your post, why not use _N_ instead FLAG, as index ? 

Super User
Posts: 5,516

Re: Retain Statement vs Array statement

_N_ wouldn't work here.  It doesn't increment automatically just because the SET statement executes.  It actually counts something different:  how many  times has the DATA step left the DATA statement in order to execute the remaining statements in the DATA step?  So _N_=1 for all observations related to the first ID value.

Trusted Advisor
Posts: 1,585

Re: Retain Statement vs Array statement

[ Edited ]
Posted in reply to CamRutherford

reason for the error - it should be:
resort{flag} = resort_id;

 

and

DROP ...RESORT_ID;

Contributor
Posts: 60

Re: Retain Statement vs Array statement

1544       DATA CONCAT;
1545       ARRAY RESORT {5} $ 500;
1546       FLAG=0;
1547       DO UNTIL (LAST.SBR_ID);
1548          SET WEEKS_OWNERSHIPS;
1549          BY SBR_ID;
1550          FLAG + 1;
1551          RESORT{FLAG} = RESORT_ID;
1552       END;
1553       DROP RESORT_ID;
1554       RUN;

ERROR: Array subscript out of range at line 1551 column 4.

Nope - still doesn't like it

Trusted Advisor
Posts: 1,585

Re: Retain Statement vs Array statement

[ Edited ]
Posted in reply to CamRutherford

@Astounding assumed you are going to create an array as long as number of observations in your input dataset.

I don't think that this is your target. You got the error because you got index > 5 to enter value in the array.

Super User
Posts: 5,516

Re: Retain Statement vs Array statement

Posted in reply to CamRutherford

The latest results indicate that you have more than 5 observations (and therefore more than 5 RESORT_ID values) for the same person.  The real question now becomes what do you want to have happen in that case?

 

One possibility:  add more RESORT# variables.

 

Another possibility:  ignore anything above 5 resorts per person.

 

Another possibility:  let the program figure out how many are needed, and let the program create as many as are needed.

 

Any of these is possible, but first you have to decide on the right outcome.

Contributor
Posts: 60

Re: Retain Statement vs Array statement

Posted in reply to Astounding

Hi,

 

I only want it to pick the first 5.

 

Another possibility:  ignore anything above 5 resorts per person.

 

Thanks,

 

Super User
Posts: 5,516

Re: Retain Statement vs Array statement

Posted in reply to CamRutherford

OK, I added more changes to the original post to use the first 5 only.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 18 replies
  • 459 views
  • 0 likes
  • 5 in conversation