deleted_user
Not applicable
I have a repeated measures data set and I would like to change it from a single observation per person to multiple observations per person. I am using the DATA step and an array. My code is something like this:

DATA longdata;
SET mydata;
ARRAY HDL[4];
DO VISIT = 1 to 4;
HiDL = HDL[VISIT];
OUTPUT;
END;
KEEP HiDL;

In my dataset, there are 4 time points for repeated measures. However, in some variables, they were only measured at the first and third time points and variables do not exist at all for the second and last time points. There are also some that were only measured at the first two time points, and no variables exist to represent the last two time points for those variables. When I write the code for these variables, similar to above, it tells me that the ARRAY line is not in the appropriate range. Can anyone help me fix this?? Thank you in advance for any help!
16 REPLIES
sbb
Lapis Lazuli | Level 10
Without additional SASLOG diagnostics, I would expect that you need to declare the SAS variables in your ARRAY statement, explicitly. Have a look at the ARRAY statement discussion in the SAS Language DOC, accessible at http://support.sas.com/ and click on Product Documentation.

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
Hi Scott,

Thanks for the suggestion on where to look. However, I can't seem to find the document you are referring to in the link you posted. Can you be more specific on where to find it? Thanks again for your help!

-tjf
Doc_Duke
Rhodochrosite | Level 12
The product documentation is the third item down on the left. It will get you to the main SAS documentation page. The ARRAY statement is documented under BASE SAS.

You probably need to explicitly specify your array, something like
ARRAY hdl[4] hdl1 hdl2 hdl3 hdl4;
where hdl1-hdl4 contain the "wide" version of your data.

We can't be more specific in our help because your specification of the problem is a bit contradictory. In the first paragraph, you say you have one observation per person and in the second you talk about variables being present or not (which implies something other than one observation per person).

Doc Muhlbaier
Duke

Also, post your question to just one forum. We recognize that you may not know the 'best' one for every question, but it is frustrating for readers to keep up with multiple threads.
deleted_user
Not applicable
I was unable to find an answer to my problem in the SAS docs, but perhaps I will try to explain it again more clearly.

I have a data set which has many different variables as columns, and each row represents one person. The variables in the columns are measurements which were taken at varying times and represent 1 baseline measurement and 3 follow-up measurements. However, not all variables were measured at all 4 time points. Some were only measured at baseline and follow-up three and others were measured at baseline and follow-up one. Here is an example of how the data is structured:
id weight1 weight2 weight3 weight4 bmi1 bmi3 hdl1 hdl2
0 25 26 25 26 28 28 10 10

I want to create a data set from this which will look like this:
id visit weight bmi hdl
0 1 26
0 2 26
0 3
0 4
and I think you can get the idea from there.

My code for this would look like:
DATA datalong;
SET mydata;

ARRAY weight[4];
ARRAY bmi[2];
ARRAY hdl[2];
DO VISIT = 1 to 4;
WT = weight[VISIT];
BoMI= bmi[VISIT];
HiDL = hdl[VISIT];
OUTPUT;
END;
KEEP WT BoMI HiDL VISIT;

This is the error message that I get:
ERROR: Array subscript out of range at line 60 column 17
ERROR: Array subscript out of range at line 61 column 17
And these correspond to these two lines of code:
BoMI= bmi[VISIT];
HiDL = hdl[VISIT];

I know the problem is that these two variables do not have measurements for all 4 visit time points. Is there any way to get around that?

I hope this is more clear albeit long winded. Thanks in advance for any suggestions!!
sbb
Lapis Lazuli | Level 10
Use the DIM function with assignment statements, as needed, and remember to set variable values to missing/blank, such as:

DATA;
retain cv1-cv4 1 xv1-xv2 2;
ARRAY ACV (*) CV1-CV4;
ARRAY AXV (*) XV1-XV2;
DO I=1 TO DIM(ACV);
OUTVAR1 = ACV(I);
IF I LE DIM(AXV) THEN OUTVAR2 = AXV(I);
ELSE OUTVAR2 = .;
OUTPUT;
END;
RUN;


Scott Barry
SBBWorks, Inc.
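
For readers following along, here is a minimal sketch of the same DIM guard applied to the variable names from the earlier post. The one-row input data set and the array names a_weight and a_hdl are assumptions made up for illustration; only weight1-weight4 and hdl1-hdl2 come from the thread.

```sas
/* Hypothetical one-row input using the layout described earlier in the thread */
data mydata;
  input id weight1-weight4 hdl1 hdl2;
  datalines;
0 25 26 25 26 10 10
;
run;

data datalong;
  set mydata;
  array a_weight[*] weight1-weight4;  /* measured at all 4 visits */
  array a_hdl[*]    hdl1-hdl2;        /* measured at visits 1 and 2 only */
  do visit = 1 to dim(a_weight);
    wt = a_weight[visit];
    /* Guard the shorter array so its subscript never exceeds its size */
    if visit le dim(a_hdl) then hidl = a_hdl[visit];
    else hidl = .;
    output;
  end;
  keep id visit wt hidl;
run;
```

The longest array drives the loop, and every shorter array is referenced only while the loop index is within its dimension. Note that this guard assumes the shorter array covers consecutive visits starting at visit 1.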
deleted_user
Not applicable
Scott,

Thank you very much! The code you wrote does exactly what I wanted. It's wonderful!

tjf5004
deleted_user
Not applicable
One more question... I have no problem adding more variables to that code that have less than 4 time points; they print out fine, just as expected. However, I have written in code for those variables that also have 4 time points (other than the first one with the DIM), but in the output, they are all set to missing. I'm not sure how I should write them into this code. Any suggestions?
deleted_user
Not applicable
Also, the variables with time points that aren't consecutive are giving me trouble. I mean for those variables that were measured on the 1st and 3rd visit. When I write the code in the same syntax as those variables that just have time points 1 and 2, it doesn't work. SAS makes all time points for that variable missing. Again, any suggestions?
sbb
Lapis Lazuli | Level 10
You need to share code by pasting the SASLOG, and use a SAS DATA step PUT (or PUTLOG in SAS 9) statement for diagnosis purposes, something like:

PUTLOG ">DIAG99" / _ALL_;

I suspect that you will need to set up a "largest sized ARRAY" and use it to drive your outer DO/END loop with the DIM(array_name) function. Keep in mind that the array references inside the DO loop will only work if the SAS variables are declared in the ARRAY statement, either with an explicit variable list or, if the naming convention allows, with a name-prefix list, something like:

DATA ;
RETAIN XVAR1-XVAR99 0;
ARRAY AXVAR (*) XVAR: ;
DO I=1 TO DIM(AXVAR);
* Create one obs for each var in the array. ;
XVAR = AXVAR(I);
OUTPUT;
END;
RUN;

I would recommend adding the diagnostic code first, debugging your program using the SASLOG info, and then, if necessary, contacting the group for additional assistance. Remember to share your SASLOG output to help explain your situation.

Scott Barry
SBBWorks, Inc.
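
As a rough illustration of the diagnostic advice above, the sketch below writes one labeled line to the SASLOG per loop iteration. The variable names xvar1-xvar3, the array name axvar, and the ">DIAG99" tag are placeholders in the spirit of the examples in this thread, not anything taken from the poster's real data.

```sas
data _null_;
  /* Hypothetical variables, initialized so the step runs without an input data set */
  retain xvar1-xvar3 0;
  array axvar[*] xvar1-xvar3;
  do i = 1 to dim(axvar);
    xvar = axvar[i];
    /* name= syntax prints each variable's name along with its current value */
    putlog ">DIAG99 " i= xvar=;
  end;
run;
```

Placing the PUTLOG inside the loop shows exactly which array element feeds each output row, which makes an array that points at the wrong variables much easier to spot.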
deleted_user
Not applicable
When I add the diagnostic code, my SASLOG says I have 0 errors.

Here is what my code looks like:
DATA newlong;
SET data;
RETAIN CHOLESG1-CHOLESG4 1 BMI1-BMI2 2 WT1-WT2 3 WTK1-WTK2 4 CHOL1 CHOL3 5;
ARRAY ACHOLESG (*) CHOLESG1-CHOLESG4;
ARRAY ABMI (*) BMI1-BMI2;
ARRAY AWT (*) WT1-WT2;
ARRAY AWTK (*) WTK1-WTK2;
ARRAY ACHOL (*) ACHOL1 ACHOL3;
DO VISIT=1 TO DIM(ACHOLESG);
CHOLESTG = ACHOLESG(VISIT);
IF VISIT LE DIM(ABMI) THEN BoMI = ABMI(VISIT);
ELSE BoMI = .;
IF VISIT LE DIM(AWT) THEN WTlb= AWT(VISIT);
ELSE WTlb = .;
IF VISIT LE DIM(AWTK) THEN WTkg = AWTK(VISIT);
ELSE WTkg = .;
IF VISIT LE DIM(ACHOL) THEN CHOLES = ACHOL(VISIT);
ELSE CHOLES = .;
OUTPUT;
END;

My SASLOG shows no error messages and reports everything being read correctly. However, the column for CHOLES, which is created from CHOL1 and CHOL3, has all missing values. The same thing happens when I add another variable that has all 4 time points. I assume there must be something wrong with my syntax, but I'm just not sure what it is. Any suggestions?
sbb
Lapis Lazuli | Level 10
Check your ARRAY statement for ACHOL and compare it to your other ARRAY statements for consistency. There are multiple problems - one hint: where do you initialize ACHOL1 ??

Scott
deleted_user
Not applicable
Even if I change it to be the same as the others (i.e., ARRAY ACHOL (*) ACHOL1-ACHOL3;) the same thing happens. I also changed the RETAIN statement to match. Or did you have another way of writing it in mind?
sbb
Lapis Lazuli | Level 10
Okay, you corrected part #1 with the missing hyphen. Double-check the SAS variable names coded in that same ARRAY statement, ACHOL1 through ACHOL3, and then look at the variables named in your RETAIN statement. Comparing the variable names used should help identify your problem. Adding a SAS PUTLOG _ALL_; to your program will also help with DATA step execution debugging.

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
Wow, I think I have been looking at this way too long. Such a small mistake! I have it corrected and I think everything is running properly now. Thanks for all of your help!
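
For anyone landing on this thread later, one possible way to handle measurements taken at non-consecutive visits (for example, visits 1 and 3 only) is to size every array to all 4 visits. This is a sketch under assumptions, not the poster's final program: the input values and the array names are made up, and it relies on the fact that an ARRAY statement creates any listed variable that does not already exist, as a numeric variable with a missing value. That same behavior is what silently produced all-missing columns when the array listed ACHOL1 instead of CHOL1.

```sas
/* Hypothetical input: CHOLESG at all 4 visits, CHOL at visits 1 and 3 only */
data mydata;
  input id cholesg1-cholesg4 chol1 chol3;
  datalines;
0 180 185 190 195 200 210
;
run;

data newlong;
  set mydata;
  array a_cholesg[4] cholesg1-cholesg4;
  /* chol2 and chol4 are not in mydata, so they are created here and stay missing */
  array a_chol[4]    chol1-chol4;
  do visit = 1 to 4;
    cholestg = a_cholesg[visit];
    choles   = a_chol[visit];
    output;
  end;
  keep id visit cholestg choles;
run;
```

Because each array has one slot per visit, the loop index lines up with the visit number and no DIM guard is needed. With a 2-element array holding CHOL1 and CHOL3, the CHOL3 value would instead be written to the visit 2 row.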


Discussion stats
  • 16 replies
  • 1256 views
  • 0 likes
  • 3 in conversation