BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
femiajumobi1
Quartz | Level 8

A. 

I have this data (short) with 50 date variables DFS_Fam1 - DFS_Fam50. Some observations do not have date value for DFS_Fam1 - DFS_Fam50

I want to create variables that show if a date value is available, it is indicated as 1, and no date value = 0 or missing. 

 

           ID              DFS_Fam1               DFS_Fam2                 DFS_Fam3

           1                01/01/2023              02/01/2023                   03/01/2023 

           2                04/01/2023              05/01/2023                   06/01/2023 

           3                05/01/2023                                 

 

For example for observation 1, if DFS_Fam1, DFS_Fam2, DFS_Fam3 have date values, then New_A1=1, New_A2=1, New_A3=1 as shown (but up to whenever DFS_Fam has dates)

           ID              DFS_Fam1               DFS_Fam2                 DFS_Fam3        New_A1       New_A2   New_A3

           1                01/01/2023              02/01/2023                   03/01/2023         1                   1               1

           2                04/01/2023              05/01/2023                   06/01/2023         1                   1               1

           3                05/01/2023                                                                              1                     

 

This is clinical data, everyone has DFS_Fam1 but not necessary  DFS_Fam2 ,  DFS_Fam3 up to  DFS_Fam50, thus there would be missing dates.

 

How do I go about these without writing 50 lines of code?

 

B. I would like to convert New_A to time points

Data long;

set short;

New=New_A1; time=0; output;

New=New_A2; time=1; output;

New=New_A3; time=2; output;

...        ....          .........     .......

New=New_A50; time=50; output;

 

How do I go about these without writing 50 lines of code?

Thanks.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Use ARRAYs in a SAS data step.

 

UNTESTED CODE

 

data want;
    set have;
    array dfs dfs_:
    array new new_a1-new_a50;
    do i=1 to dim(dfs);
        if not missing(dfs(i)) then new(i)=1;
        else new(i)=0;
    end;
    drop i;
run;
--
Paige Miller

View solution in original post

23 REPLIES 23
PaigeMiller
Diamond | Level 26

Use ARRAYs in a SAS data step.

 

UNTESTED CODE

 

data want;
    set have;
    array dfs dfs_:
    array new new_a1-new_a50;
    do i=1 to dim(dfs);
        if not missing(dfs(i)) then new(i)=1;
        else new(i)=0;
    end;
    drop i;
run;
--
Paige Miller
femiajumobi1
Quartz | Level 8
SAS shows

ERROR Mixing of implicit and explicit array subscription is not allowed.
PaigeMiller
Diamond | Level 26

Try this:

 

    array dfs dfs_fam1-dfs_fam50;
--
Paige Miller
femiajumobi1
Quartz | Level 8
Hi@Paige Miller. I rechecked the code and the 1st approach worked. Thanks.

Kindly assist with Part B of my question above. I need to create time points before I move ahead to GEE. Thanks.
PaigeMiller
Diamond | Level 26

@femiajumobi1 wrote:

Kindly assist with Part B of my question above. I need to create time points before I move ahead to GEE. Thanks.

Its one more line of code using the ARRAY variables, in the DO loop. Perhaps you could give it a try and if you get stuck, show us what you tried.

--
Paige Miller
femiajumobi1
Quartz | Level 8
Please see below but it did not work.

data want;
set have;
array dfa_fam dfa_fam1-dfa_fam50:
array new new_a1-new_a50;
array time t1-t50;
do i=1 to dim( dfa_fam);
if not missing(dfa_fam(i)) then new(i)=1;
else new(i)=0;
if not missing(new(i)) then time(i)=1;
else time(i)=0;
end;
drop i;
run;

Proc transpose data =want out =wanted;
by time;
run;
Reeza
Super User
Which part didn't work? How did it not work?

Rather than by time, I would expect to see the transpose use BY ID.

https://stats.oarc.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
PaigeMiller
Diamond | Level 26

From now on, please put code in a code box, by clicking on the "little running man" icon and pasting your code into the window that appears. Then it looks like this:

 

data want;
set have;
array dfa_fam dfa_fam1-dfa_fam50:
array new new_a1-new_a50;
array time t1-t50;
do i=1 to dim( dfa_fam);
if not missing(dfa_fam(i)) then new(i)=1;
else new(i)=0;
if not missing(new(i)) then time(i)=1;
else time(i)=0;
end;
drop i;
run;

 

You said you wanted code like this for all 50 variables

 

New=New_A3; time=2; output;

 

There's no need (if I am understanding you properly) to have an array for TIME, time is just a single variable not an array. So, I would do this:

 

data want;
set have;
array dfa_fam dfa_fam1-dfa_fam50:
array new new_a1-new_a50;
array time t1-t50;
do i=1 to dim( dfa_fam);
if not missing(dfa_fam(i)) then new(i)=1;
else new(i)=0;
if new(i)>0 then do;
     time=i-1;
     new=new(i);
     output;
end;
drop i;
run;

The new code assigns the value of i-1 to variable TIME, and the value of new(i) to a variable named NEW, which seems to be what you are asking for.

 

The whole process could be simplified, without creating variables new_a1-new_a50, these don't seem to serve any purpose and so you could just compute them without creating these variables, just doing the calculation and using the calculation. You could also use a DROP statement to remove them from data set WANT.

 

--
Paige Miller
femiajumobi1
Quartz | Level 8
data want;
set have;
array dfa_fam dfa_fam1-dfa_fam50:
array new new_a1-new_a50;
do i=1 to dim( dfa_fam);
if not missing(dfa_fam(i)) then new(i)=1;
else new(i)=0;
if new(i)>0 then do;
     time=i-1;
     new=new(i);
     output;
end;
end;
drop i;
run;

@PaigeMiller thanks, back to square one. ERROR message: Mixing of implicit and explicit arrays is not allowed.

femiajumobi1
Quartz | Level 8

@PaigeMiller I want to ensure time is reflected in a longitudinal format, wouldn't I need to have an array? Thanks.

PaigeMiller
Diamond | Level 26

@femiajumobi1 wrote:

@PaigeMiller I want to ensure time is reflected in a longitudinal format, wouldn't I need to have an array? Thanks.


I don't know what you need. Your original post indicated that TIME was a single variable.

--
Paige Miller
PaigeMiller
Diamond | Level 26

Please show me the actual log, all of the log for this data step, meaning I want to see every single line and every single character in the log for this data step. Please copy the log as text and paste it into the window that appears when you click on the </> icon.

 

Insert Log Icon in SAS Communities.png

 

From now on, please provide the log this way EVERY time there are errors or other problems in the log.

--
Paige Miller
femiajumobi1
Quartz | Level 8
data want;
set have;
array dfa_fam dfa_fam1-dfa_fam50:
array new new_a1-new_a50;
do i=1 to dim( dfa_fam);
if not missing(dfa_fam(i)) then new(i)=1;
else new(i)=0;
if new(i)>0 then do;
     time=i-1;
     new=new(i);
     output;
end;
end;
drop i;
run;

@PaigeMiller Here is the log of the code. Thanks.

PaigeMiller
Diamond | Level 26

This is not the log.

--
Paige Miller

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 23 replies
  • 1133 views
  • 3 likes
  • 3 in conversation