Hi Folks:
I'm trying to replace missing information by existing information in the data with some conditions applied as shown in the image below. I couldn't do it manually and hope there's a way to automate this in SAS?
if prxmatch("m/A|B|C/oi",NAME) > 0 then fill missing in variable I using the
available value. Which is 10 for I, 25 for J and 23 for K variables.
if prxmatch("m/D|E|F/oi",NAME) > 0 then fill missing in variable I using the
available value. Which is 2 for I, 25 for J and 3 for K variables.
I have 10 more batches by prxmatch conditions as shown above.
About 85% of data has no missing as shown by G and H rows. No action needed
if no missing outside the rows specified in the prxmatch clauses.
DATA FILL;
INPUT NAME $ I J K;
CARDS;
A 10 . .
B . 25 .
C . . 23
D 2 . .
E . . 3
F . 25 .
G 1 2 4
H 5 7 8
;
I greatly appreciate your time and help.
DATA FILL;
INPUT NAME $ I J K;
CARDS;
A 10 . .
B . 25 .
C . . 23
D 2 . .
E . . 3
F . 25 .
G 1 2 4
H 5 7 8
;
data have;
set fill;
n+1;
retain group;
if name in ('A' 'B' 'C') then group=1;
else if name in ('D' 'E' 'F') then group=2;
else group+1;
run;
data temp;
update have(obs=0) have;
by group;
output;
run;
proc sort data=temp;
by group descending n;
run;
data want;
update temp(obs=0) temp;
by group;
output;
run;
proc sort data=want;
by group n;
run;
please try the below code, the expected output will be in i1 j1 and k1 variables
DATA FILL;
INPUT NAME $ I J K;
CARDS;
A 10 . .
B . 25 .
C . . 23
D 2 . .
E . . 3
F . 25 .
G 1 2 4
H 5 7 8
;
data have;
set fill;
retain group;
if name in ('A' 'B' 'C') then group=1;
else if name in ('D' 'E' 'F') then group=2;
else group+1;
run;
proc sort data=have;
by group name;
run;
data want;
do until(last.group);
set have;
by group;
retain i1 j1 k1;
array vars1(3) i j k;
array vars(3) i1 j1 k1;
do z = 1 to 3;
if first.group then vars(z)=.;
if vars1(z) ne . then vars(z)=vars1(z);
end;
end;
do until(last.group);
set have;
by group;
output;
end;
run;
From where do you know that A,B,C and D,E,F form a group?
They are city names that are part of a province. Data I merged didn't have city-specific values but province average. That province average is recorded as one of city values. Therefore, known value of a city value (actually province average) is being used for the other cities. I can not remove them just because they're less accurate. Also, I have about 32 variables as each one of these are survey items. Jagadishkatam would do the job but I have to create an exhaustive list of 32 variables and their index variables. Then rename and drop not needed ones et.c. Is there anyway to go around this iterative typing process?
You need to have a LOGICAL RULE which can be converted into code. No rule, no code. At least not code that can be generalized. And writing every single instance into the code defeats the purpose of programming.
So the first thing you need to do is set up a mechanism that assigns those groups automatically. If you need assistance in that, post your original data (or a mock-up that so closely resembles your original data that you can translate our code suggestions).
creating mock data? i'll work on that tomorrow. but i still believe that there is a logic and problem could be solved based on the small mock data I posted assuming that I have many columns about N=32.
@Jagadishkatam's suggestion will work for the data you posted, BUT FOR NOTHING ELSE! Just read that code, it does the grouping with literal values in the code, and this part HAS TO BE REPLACED with something that automates this, or you'll be writing code for every city you encounter.
DATA FILL;
INPUT NAME $ I J K;
CARDS;
A 10 . .
B . 25 .
C . . 23
D 2 . .
E . . 3
F . 25 .
G 1 2 4
H 5 7 8
;
data have;
set fill;
n+1;
retain group;
if name in ('A' 'B' 'C') then group=1;
else if name in ('D' 'E' 'F') then group=2;
else group+1;
run;
data temp;
update have(obs=0) have;
by group;
output;
run;
proc sort data=temp;
by group descending n;
run;
data want;
update temp(obs=0) temp;
by group;
output;
run;
proc sort data=want;
by group n;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.