I have dataset similar to the below
DATA COMP_LIST;
input NUM M1 $ M2 $;
datalines ;
12345 IN .
34556 IN IN
;
run;
What I need to do based on a list provided check if that number exists in the above table.
1. If it does, I need to check if there is a M* column that is NULL. If it is then update it with NULL.
For example 12345 exists in the table above and M1 is not null and has value of IN. So as long the next M column exists update that with 'IN'. so M2 for 12345 has no value so update M2 with 'IN'
2. if there is no M* column that is NULL then add the next M column to the table and update it with value.
For example 34556 exists in the dataset and both M1 and M2 has values of 'IN' . so now add M3 to the dataset and update with 'IN'
Hope this makes sense. What is the best way to achieve this ?
The problem with the WIDE structure is you cannot add a variable to a dataset once the data step has already started running. So you end up having to do some type of conditional code generation.
You can use PROC TRANSPOSE to convert between the tall and wide layouts.
Say you have this original dataset in WIDE format.
data wide ;
length num 8 m1-m3 $10;
input num m1-m3 ;
datalines;
12345 IN-18 IN-19 .
34565 IN-16 IN-24 IN-25
;
And you have this new data to append to it.
data new;
length num 8 m $10;
input num m ;
datalines;
34565 IN-27
;
So first transpose the WIDE dataset, then interleave the two, and then re-transpose back to a new WIDE dataset.
proc transpose data=wide
out=tall(drop=_name_ rename=(col1=M) )
;
by num;
var m: ;
run;
data new_tall;
set tall new ;
by num;
if missing(M) and not first.num then delete;
run;
proc transpose data=new_tall out=want(drop=_name_) prefix=M;
by num;
var m;
run;
Results
Obs num M1 M2 M3 M4 1 12345 IN-18 IN-19 2 34565 IN-16 IN-24 IN-25 IN-27
Not sure I understand what you are trying to do, but it looks to me like it would be a lot easier with a vertical structure instead.
data have;
infile datalines truncover;
input NUM @;
length m $10;
do index=1 by 1 until(missing(M));
input m @;
if not missing(m) then output;
end;
datalines ;
12345 IN .
34556 IN IN
;
data want;
set have;
by num;
output;
if last.num then do;
index+1;
output;
end;
run;
Results:
Obs NUM m index 1 12345 IN 1 2 12345 IN 2 3 34556 IN 1 4 34556 IN 2 5 34556 IN 3
@Tom Think you got what I meant. definitely vertical structure would be easier. But what i need is a horizontal one.
NUM M1 M2 M3
12345 IN-18 IN-19
34565 IN-16 IN-24 IN-25
I need to maintain an output as above. I would be adding a number to the IN. While I can check how many M columns are there by using the dictionary.columns table. What i am struggling to do is find if there is a M column with NULL value that I can update. If yes which one is that.
So in the example above, if I have a new entry for 34565, the should add a new column M4 to the dataset and populate it with say IN-28
NUM M1 M2 M3 M4
12345 IN-18 IN-19
34565 IN-16 IN-24 IN-25 IN-27
The problem with the WIDE structure is you cannot add a variable to a dataset once the data step has already started running. So you end up having to do some type of conditional code generation.
You can use PROC TRANSPOSE to convert between the tall and wide layouts.
Say you have this original dataset in WIDE format.
data wide ;
length num 8 m1-m3 $10;
input num m1-m3 ;
datalines;
12345 IN-18 IN-19 .
34565 IN-16 IN-24 IN-25
;
And you have this new data to append to it.
data new;
length num 8 m $10;
input num m ;
datalines;
34565 IN-27
;
So first transpose the WIDE dataset, then interleave the two, and then re-transpose back to a new WIDE dataset.
proc transpose data=wide
out=tall(drop=_name_ rename=(col1=M) )
;
by num;
var m: ;
run;
data new_tall;
set tall new ;
by num;
if missing(M) and not first.num then delete;
run;
proc transpose data=new_tall out=want(drop=_name_) prefix=M;
by num;
var m;
run;
Results
Obs num M1 M2 M3 M4 1 12345 IN-18 IN-19 2 34565 IN-16 IN-24 IN-25 IN-27
@Tom Transpose seems to be the option then. Thank you very much for your time and solution
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.