BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
freshap
Fluorite | Level 6

Hello,

 

Let's say I have this dataset :

 

 

data have;
input id $ var1 var2 var3 maxvar;
datalines;
id1 . . 999 3
id2 . 10 10 3
id3 10 . . 1 ; run;

The . are what I called missing values.

 

I want that, whenever I finda missing value in var1-var3, I go and take the first non-missing value to the right and split it by the amount of columns I went through to find this value.

 

To help myself doing so, I implemented a column that gives the column in which I can find the last non missing value : maxvar

 

If maxvar=2, it means that var2 is the last non missing value of all vars (so var3, var4,... will be . )

 

I guess it will be more clear with an example : using the data set I typed above, here is the result that I would like :

 

idvar1var2var3maxvar
id13333333333
id255103
id310..1

 

I know that witch this example I could do something like "if var1=. and var2<>. then var1=var2/2 and var2=var2/2" etc, but the problem is that the number of var colums is variable and can go up to 100 colums

 

I thought of using array statement and lag function in a data step, but I'm not sure how to manage the variable amount of columns I'll have to go through.

 

Don't hesitate to ask if you want me to be more precise.

 

Thanks !

 

1 ACCEPTED SOLUTION

Accepted Solutions
jarapoch
Obsidian | Level 7

Maybe this code could work.

 

data want;
set have;
array v _numeric_;
n=1;
nmi=1; *non-missing value index;

do i=1 to dim(v);

if missing(v(i)) then
n=n+1;
else
do;

do j=nmi to i;
v[j]=v[i]/n;
end;
n=1;
nmi=i+1;
end;
end;
drop n nmi i j;
run;

View solution in original post

5 REPLIES 5
Rick_SAS
SAS Super FREQ

Do all of the missing values start in the first column? If not, it seems that you would need 

StartVar and EndVar rather than MaxVar. 

 

 

Rick_SAS
SAS Super FREQ

As the problem is currently stated, you can try to use arrays, similar to the following. I assume the MaxVar value for ID='id2' was supposed to be 2.

 

data have;
input id $ var1 var2 var3 maxvar;
datalines;
id1 .  . 999 3
id2 .  10 10 2
id3 10 . . 1
;

data want;
set have;
array v[*] var1-var3;
amount = v[maxvar] / maxvar;
do i = 1 to maxvar;
   v[i] = amount;
end;
run;

proc print;run;
freshap
Fluorite | Level 6

No, maxvar is not 2 in id2. I understand why you think I mistyped 3 into 2 but it's really a 3, as the last non-missing value is 10 in var3.

 

Let me give another exemple. If I had a row with var1-var6 such as :

 

var1var2var3var4var5var6
10.12..60

 

I would like to have this :

 

var1var2var3var4var5var6
1066202020

 

And maxvar here would be 6. I mean, I'm not even sure if maxvar will be of any use but I've got it in my dataset so I was thinking about using it.

 

Some observations/rows will have like maxvar=60 where some maxvar=4, so I thought of stopping the process when exceeding maxvar

jarapoch
Obsidian | Level 7

Maybe this code could work.

 

data want;
set have;
array v _numeric_;
n=1;
nmi=1; *non-missing value index;

do i=1 to dim(v);

if missing(v(i)) then
n=n+1;
else
do;

do j=nmi to i;
v[j]=v[i]/n;
end;
n=1;
nmi=i+1;
end;
end;
drop n nmi i j;
run;

freshap
Fluorite | Level 6
Thank you ! That's exactly what I wanted (I just have to replace the array by the values I want, being var1-var5 in the example I gave).
Learned quite some tricks about data steps thanks to your answer.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 837 views
  • 1 like
  • 3 in conversation