Hello,
I have the below code that works as a retain statement in SAS. However, I believe it could be done more effectively using an array statement and was wondering if someone could shead some light on how I could do it or even show me.
DATA CONCAT ; SET WEEKS_OWNERSHIPS; BY SBR_ID; FORMAT RESORT1 RESORT2 RESORT3 $500.; RETAIN RESORT1 RESORT2 RESORT3 ; IF FIRST.SBR_ID THEN DO; RESORT1=''; RESORT2='';RESORT3='';FLAG1=. ; END; IF FLAG1 = . THEN DO RESORT1=RESORT_ID; END; IF FLAG1 = 1 THEN DO RESORT2=RESORT_ID; END; IF FLAG1 = 2 THEN DO RESORT3=RESORT_ID; END; FLAG1+1; IF LAST.SBR_ID THEN OUTPUT; RUN;
Thanks,
Cam
While you have some suggestions, I'm not sure that they increment FLAG properly. Here's how I would try it:
data want;
array resort {5} $ 500;
flag=0;
do until (last.sbr_id);
set weeks_ownerships;
by sbr_id;
flag + 1;
if flag <= 5 then resort{flag} = resort_id;
end;
drop flag resort_id;
run;
By using a DO loop to read all observations for an ID, you don't need to RETAIN resort variables and you don't need to reinitialize them to missing.
Also note, PROC TRANSPOSE might be a viable approach (depending on what other variables are part of the data set).
There are more than one way to write a shorter code, thoght not specifically moe efficient:
DATA CONCAT ;
SET WEEKS_OWNERSHIPS;
BY SBR_ID;
FORMAT RESORT1 - RESORT3 $500.; /* equvilent to RESORT1 RESORT2 RESORT3 */
RETAIN RESORT1 - RESORT3 ;
array resx $ RESORT1 - RESORT3 ;
IF FIRST.SBR_ID THEN DO i=1 to 3;
resx(i) = ' ';
END;
if flag = . then RESORT1=RESORT_ID; else
resx(flag +1) = RESORT_ID;
IF LAST.SBR_ID THEN OUTPUT;
RUN;
Note that this code is not tested, if you want tested code post test data in the form of a datastep:
data concat ; set weeks_ownerships; by sbr_id; array resort{3} $500; retain resort:;
if first.sbr_id then call missing(of resort:); resort{sum(flag,1)}=resort_id; if last.sbr_id then output; run;
Note also that writing code all in upcase and with no indentations and such like makes your code very hard to read.
Note I use sum() so that if flag=. it will become 1, if you just use + it wont.
Hi,
I actually find it easier to read in capitals.
Anyway, this is my complete code and does what I need but you're array doesn't seem to do this...
DATA CONCAT ; SET WEEKS_OWNERSHIPS; BY SBR_ID; FORMAT RESORT1 RESORT2 RESORT3 RESORT4 RESORT5 R1_UNIT R2_UNIT R3_UNIT R4_UNIT R5_UNIT R1_OWN_DATE R2_OWN_DATE R3_OWN_DATE R4_OWN_DATE R5_OWN_DATE R1_BED R2_BED R3_BED R4_BED R5_BED R1_OCCUPANCY_MAX R2_OCCUPANCY_MAX R3_OCCUPANCY_MAX R4_OCCUPANCY_MAX R5_OCCUPANCY_MAX R1_PRIVACY_MAX R2_PRIVACY_MAX R3_PRIVACY_MAX R4_PRIVACY_MAX R5_PRIVACY_MAX R1_OWN_KEY R2_OWN_KEY R3_OWN_KEY R4_OWN_KEY R5_OWN_KEY R1_RESORTNAME R2_RESORTNAME R3_RESORTNAME R4_RESORTNAME R5_RESORTNAME R1_RESORTSTAT R2_RESORTSTAT R3_RESORTSTAT R4_RESORTSTAT R5_RESORTSTAT R1_RESORTAREA R2_RESORTAREA R3_RESORTAREA R4_RESORTAREA R5_RESORTAREA R1_PRIM_REG R2_PRIM_REG R3_PRIM_REG R4_PRIM_REG R5_PRIM_REG R1_SUB_REG R2_SUB_REG R3_SUB_REG R4_SUB_REG R5_SUB_REG R1_FIXED_FLOAT R2_FIXED_FLOAT R3_FIXED_FLOAT R4_FIXED_FLOAT R5_FIXED_FLOAT $25. R1_START_DT R2_START_DT R3_START_DT R4_START_DT R5_START_DT R1_END_DT R2_END_DT R3_END_DT R4_END_DT R5_END_DT DATE9.; RETAIN RESORT1 RESORT2 RESORT3 RESORT4 RESORT5 R1_UNIT R2_UNIT R3_UNIT R4_UNIT R5_UNIT R1_OWN_DATE R2_OWN_DATE R3_OWN_DATE R4_OWN_DATE R5_OWN_DATE R1_BED R2_BED R3_BED R4_BED R5_BED R1_OCCUPANCY_MAX R2_OCCUPANCY_MAX R3_OCCUPANCY_MAX R4_OCCUPANCY_MAX R5_OCCUPANCY_MAX R1_PRIVACY_MAX R2_PRIVACY_MAX R3_PRIVACY_MAX R4_PRIVACY_MAX R5_PRIVACY_MAX R1_OWN_KEY R2_OWN_KEY R3_OWN_KEY R4_OWN_KEY R5_OWN_KEY R1_RESORTNAME R2_RESORTNAME R3_RESORTNAME R4_RESORTNAME R5_RESORTNAME R1_RESORTSTAT R2_RESORTSTAT R3_RESORTSTAT R4_RESORTSTAT R5_RESORTSTAT R1_RESORTAREA R2_RESORTAREA R3_RESORTAREA R4_RESORTAREA R5_RESORTAREA R1_PRIM_REG R2_PRIM_REG R3_PRIM_REG R4_PRIM_REG R5_PRIM_REG R1_SUB_REG R2_SUB_REG R3_SUB_REG R4_SUB_REG R5_SUB_REG R1_FIXED_FLOAT R2_FIXED_FLOAT R3_FIXED_FLOAT R4_FIXED_FLOAT R5_FIXED_FLOAT R1_START_DT R2_START_DT R3_START_DT R4_START_DT R5_START_DT R1_END_DT R2_END_DT R3_END_DT R4_END_DT R5_END_DT ; IF FIRST.SBR_ID THEN DO; RESORT1=''; RESORT2=''; RESORT3=''; RESORT4=''; RESORT5=''; R1_UNIT=''; R2_UNIT=''; R3_UNIT=''; R4_UNIT=''; R5_UNIT=''; R1_OWN_DATE=''; R2_OWN_DATE=''; R3_OWN_DATE=''; R4_OWN_DATE=''; R5_OWN_DATE=''; R1_BED=''; R2_BED=''; R3_BED=''; R4_BED=''; R5_BED=''; R1_OCCUPANCY_MAX=''; R2_OCCUPANCY_MAX=''; R3_OCCUPANCY_MAX=''; R4_OCCUPANCY_MAX=''; R5_OCCUPANCY_MAX=''; R1_PRIVACY_MAX=''; R2_PRIVACY_MAX=''; R3_PRIVACY_MAX=''; R4_PRIVACY_MAX=''; R5_PRIVACY_MAX=''; R1_OWN_KEY=''; R2_OWN_KEY=''; R3_OWN_KEY=''; R4_OWN_KEY=''; R5_OWN_KEY=''; R1_RESORTNAME=''; R2_RESORTNAME=''; R3_RESORTNAME=''; R4_RESORTNAME=''; R5_RESORTNAME=''; R1_RESORTSTAT=''; R2_RESORTSTAT=''; R3_RESORTSTAT=''; R4_RESORTSTAT=''; R5_RESORTSTAT=''; R1_RESORTAREA=''; R2_RESORTAREA=''; R3_RESORTAREA=''; R4_RESORTAREA=''; R5_RESORTAREA=''; R1_PRIM_REG=''; R2_PRIM_REG=''; R3_PRIM_REG=''; R4_PRIM_REG=''; R5_PRIM_REG=''; R1_SUB_REG=''; R2_SUB_REG=''; R3_SUB_REG=''; R4_SUB_REG=''; R5_SUB_REG=''; R1_FIXED_FLOAT=''; R2_FIXED_FLOAT=''; R3_FIXED_FLOAT=''; R4_FIXED_FLOAT=''; R5_FIXED_FLOAT=''; R1_START_DT=''; R2_START_DT=''; R3_START_DT=''; R4_START_DT=''; R5_START_DT=''; R1_END_DT=''; R2_END_DT=''; R3_END_DT=''; R4_END_DT=''; R5_END_DT=''; FLAG1=. ; END; IF FLAG1 = . THEN DO RESORT1=RESORT_ID; R1_OCCUPANCY_MAX=MAX_OCC; R1_RESORTNAME=RESORT_NAME; R1_PRIM_REG=PRIM_REGION; R1_END_DT=ENDDT_FORMATTED; R1_UNIT=UNIT_NO; R1_SUB_REG=SUBREG; R1_RESORTSTAT=RESORT_STAT; R1_OWN_DATE=OWN_PURCH_DT; R1_PRIVACY_MAX=PRIV_OCC; R1_RESORTAREA=RESORT_AREA; R1_BED=BEDROOM; R1_FIXED_FLOAT=FIXEDFLOAT; R1_OWN_KEY=OWNERSHIP_KEY; R1_START_DT=STARTDT_FORMATTED; END; IF FLAG1 = 1 THEN DO RESORT2=RESORT_ID; R2_OCCUPANCY_MAX=MAX_OCC; R2_RESORTNAME=RESORT_NAME; R2_PRIM_REG=PRIM_REGION; R2_END_DT=ENDDT_FORMATTED; R2_UNIT=UNIT_NO; R2_SUB_REG=SUBREG; R2_RESORTSTAT=RESORT_STAT; R2_OWN_DATE=OWN_PURCH_DT; R2_PRIVACY_MAX=PRIV_OCC; R2_RESORTAREA=RESORT_AREA; R2_BED=BEDROOM; R2_FIXED_FLOAT=FIXEDFLOAT; R2_OWN_KEY=OWNERSHIP_KEY; R2_START_DT=STARTDT_FORMATTED; END; IF FLAG1 = 2 THEN DO RESORT3=RESORT_ID; R3_OCCUPANCY_MAX=MAX_OCC; R3_RESORTNAME=RESORT_NAME; R3_PRIM_REG=PRIM_REGION; R3_END_DT=ENDDT_FORMATTED; R3_UNIT=UNIT_NO; R3_SUB_REG=SUBREG; R3_RESORTSTAT=RESORT_STAT; R3_OWN_DATE=OWN_PURCH_DT; R3_PRIVACY_MAX=PRIV_OCC; R3_RESORTAREA=RESORT_AREA; R3_BED=BEDROOM; R3_FIXED_FLOAT=FIXEDFLOAT; R3_OWN_KEY=OWNERSHIP_KEY; R3_START_DT=STARTDT_FORMATTED; END; IF FLAG1 = 3 THEN DO RESORT4=RESORT_ID; R4_OCCUPANCY_MAX=MAX_OCC; R4_RESORTNAME=RESORT_NAME; R4_PRIM_REG=PRIM_REGION; R4_END_DT=ENDDT_FORMATTED; R4_UNIT=UNIT_NO; R4_SUB_REG=SUBREG; R4_RESORTSTAT=RESORT_STAT; R4_OWN_DATE=OWN_PURCH_DT; R4_PRIVACY_MAX=PRIV_OCC; R4_RESORTAREA=RESORT_AREA; R4_BED=BEDROOM; R4_FIXED_FLOAT=FIXEDFLOAT; R4_OWN_KEY=OWNERSHIP_KEY; R4_START_DT=STARTDT_FORMATTED; END; IF FLAG1 = 4 THEN DO RESORT5=RESORT_ID; R5_OCCUPANCY_MAX=MAX_OCC; R5_RESORTNAME=RESORT_NAME; R5_PRIM_REG=PRIM_REGION; R5_END_DT=ENDDT_FORMATTED; R5_UNIT=UNIT_NO; R5_SUB_REG=SUBREG; R5_RESORTSTAT=RESORT_STAT; R5_OWN_DATE=OWN_PURCH_DT; R5_PRIVACY_MAX=PRIV_OCC; R5_RESORTAREA=RESORT_AREA; R5_BED=BEDROOM; R5_FIXED_FLOAT=FIXEDFLOAT; R5_OWN_KEY=OWNERSHIP_KEY; R5_START_DT=STARTDT_FORMATTED; END; FLAG1+1; IF LAST.SBR_ID THEN OUTPUT; RUN;
RW9,
I'm not sure why I gave this a second look. Here are a couple of easily fixable issues related to your program.
How does FLAG ever become 2 (or 3 or 4)?
CALL MISSING will also set RESORT_ID to missing.
While you have some suggestions, I'm not sure that they increment FLAG properly. Here's how I would try it:
data want;
array resort {5} $ 500;
flag=0;
do until (last.sbr_id);
set weeks_ownerships;
by sbr_id;
flag + 1;
if flag <= 5 then resort{flag} = resort_id;
end;
drop flag resort_id;
run;
By using a DO loop to read all observations for an ID, you don't need to RETAIN resort variables and you don't need to reinitialize them to missing.
Also note, PROC TRANSPOSE might be a viable approach (depending on what other variables are part of the data set).
Hi I get an error...
1544 data concat; 1545 array resort {5} $ 500; 1546 flag=0; 1547 do until (last.sbr_id); 1548 set weeks_ownerships; 1549 by sbr_id; 1550 flag + 1; 1551 resort{flag} = resort; ERROR: Illegal reference to the array resort. 1552 end; 1553 drop flag resort; ______ 241 ERROR 241-185: The array resort is not allowed in a DROP/KEEP/RENAME context.
Few questions to clarify:
1) What is the purpose of assigning RESORT_ID into an array ?
Should array have 3 entries only ? dynamic amount ? what are the rules ?
2) Is FLAG a variable in the input dataset (HAVE) or a temporary variable ?
@Astounding - within your post, why not use _N_ instead FLAG, as index ?
_N_ wouldn't work here. It doesn't increment automatically just because the SET statement executes. It actually counts something different: how many times has the DATA step left the DATA statement in order to execute the remaining statements in the DATA step? So _N_=1 for all observations related to the first ID value.
reason for the error - it should be:
resort{flag} = resort_id;
and
DROP ...RESORT_ID;
1544 DATA CONCAT; 1545 ARRAY RESORT {5} $ 500; 1546 FLAG=0; 1547 DO UNTIL (LAST.SBR_ID); 1548 SET WEEKS_OWNERSHIPS; 1549 BY SBR_ID; 1550 FLAG + 1; 1551 RESORT{FLAG} = RESORT_ID; 1552 END; 1553 DROP RESORT_ID; 1554 RUN; ERROR: Array subscript out of range at line 1551 column 4.
Nope - still doesn't like it
@Astounding assumed you are going to create an array as long as number of observations in your input dataset.
I don't think that this is your target. You got the error because you got index > 5 to enter value in the array.
The latest results indicate that you have more than 5 observations (and therefore more than 5 RESORT_ID values) for the same person. The real question now becomes what do you want to have happen in that case?
One possibility: add more RESORT# variables.
Another possibility: ignore anything above 5 resorts per person.
Another possibility: let the program figure out how many are needed, and let the program create as many as are needed.
Any of these is possible, but first you have to decide on the right outcome.
Hi,
I only want it to pick the first 5.
Another possibility: ignore anything above 5 resorts per person.
Thanks,
OK, I added more changes to the original post to use the first 5 only.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.