BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
gspritom
Fluorite | Level 6

Hi, I have a longitudinal dataset where age, sex, income, residence etc. were collected only for baseline. So, for the follow up visits, there is missing value in these fields.

I want to populate the missing cells with the values at baseline for a particular ID.

 

I tried the following code:

data cx.final;
set cx.final;
%macro create(howmany);
%do i=11001 %to &howmany;
data wave&id;
infile 1&i;
input age=agea;
run;
%end;
%mend create;
%create(62180);

 

but I am not clear where I am stuck based on the logs.

Could you please help me?

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If

  1. Your data are sorted by ID,
  2. The Baseline is the first observation for each ID

then you can use the "last non-missing carried forward" property of the UPDATE statement.  Say your data are in dataset HAVE:

 

data want;
  update have (obs=0)  have;
  by id;
  output;
run;

BTW, if you drop the OUTPUT statement, you would generate only one obs per ID, with the latest non-missing value for each variable.  But with the OUTPUT, you will have the same number of obs per ID as there was to start with.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

"cell" ???

 

Datasets have VARIABLES and OBSERVATIONS.

If you want to carry forward values you use the RETAIN statement.  Normally a new variable is reset to missing at the start of each iteration of a data step.  But a variable that is being RETAINed will keep the value it had at the end of the previous iteration.

 

So let's assume you have a dataset named HAVE with variables named ID, VISIT, and VALUE.  So to create a new variable name BASELINE you could do something like this:

data want;
  set have;
  by id;
  retain baseline;
  if first.id then call missing(baseline);
  if visit='BASELINE' then baseline=value;
run;

Note: Do not bother with trying to use the macro language to generate SAS code until you are proficient enough at writing SAS code that you know what SAS code you want the macro code to generate. 

gspritom
Fluorite | Level 6

Thank you for correcting me,

I meant to say that I want to copy the value of a VARIABLE (e.g. SEX) from the row corresponding to another VISIT=1 and paste it in rows of other values of VISIT, given the ID variable has same value for all of these rows.

Tom
Super User Tom
Super User

@gspritom wrote:

Thank you for correcting me,

I meant to say that I want to copy the value of a VARIABLE (e.g. SEX) from the row corresponding to another VISIT=1 and paste it in rows of other values of VISIT, given the ID variable has same value for all of these rows.


Adapt the code I showed to your dataset.  If you have problems then explain how it did not work.  Share your code and the log.  And share example input data and the desired result for that input.

mkeintz
PROC Star

If

  1. Your data are sorted by ID,
  2. The Baseline is the first observation for each ID

then you can use the "last non-missing carried forward" property of the UPDATE statement.  Say your data are in dataset HAVE:

 

data want;
  update have (obs=0)  have;
  by id;
  output;
run;

BTW, if you drop the OUTPUT statement, you would generate only one obs per ID, with the latest non-missing value for each variable.  But with the OUTPUT, you will have the same number of obs per ID as there was to start with.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
gspritom
Fluorite | Level 6

Thank you for your help.

 

🙂

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 802 views
  • 2 likes
  • 3 in conversation