SAS Programming

Anuz · Posted 02-19-2022 06:43 PM

I have dataset similar to the below

DATA COMP_LIST;
input NUM M1 $ M2 $;
datalines ;
12345 IN .
34556 IN IN
;
run;

What I need to do based on a list provided check if that number exists in the above table.

1. If it does, I need to check if there is a M* column that is NULL. If it is then update it with NULL.

For example 12345 exists in the table above and M1 is not null and has value of IN. So as long the next M column exists update that with 'IN'. so M2 for 12345 has no value so update M2 with 'IN'

2. if there is no M* column that is NULL then add the next M column to the table and update it with value.

For example 34556 exists in the dataset and both M1 and M2 has values of 'IN' . so now add M3 to the dataset and update with 'IN'

Hope this makes sense. What is the best way to achieve this ?

Tom · Posted 02-20-2022 10:50 AM

The problem with the WIDE structure is you cannot add a variable to a dataset once the data step has already started running. So you end up having to do some type of conditional code generation.

You can use PROC TRANSPOSE to convert between the tall and wide layouts.

Say you have this original dataset in WIDE format.

data wide ;
  length num 8 m1-m3 $10;
  input num m1-m3 ;
datalines;
12345 IN-18 IN-19 .
34565 IN-16 IN-24 IN-25
;

And you have this new data to append to it.

data new;
  length num 8 m $10;
  input num m ;
datalines;
34565 IN-27
;

So first transpose the WIDE dataset, then interleave the two, and then re-transpose back to a new WIDE dataset.

proc transpose data=wide
   out=tall(drop=_name_ rename=(col1=M) )
;
  by num;
  var m: ;
run;

data new_tall;
  set tall new ;
  by num;
  if missing(M) and not first.num then delete;
run;

proc transpose data=new_tall out=want(drop=_name_) prefix=M;
  by num;
  var m;
run;

Results

Obs     num      M1       M2       M3       M4

 1     12345    IN-18    IN-19
 2     34565    IN-16    IN-24    IN-25    IN-27

View solution in original post

Tom · Posted 02-19-2022 07:18 PM

Not sure I understand what you are trying to do, but it looks to me like it would be a lot easier with a vertical structure instead.

data have;
  infile datalines truncover;
  input NUM @;
  length m $10;
  do index=1 by 1 until(missing(M));
    input m @;
    if not missing(m) then output;
  end;
datalines ;
12345 IN .
34556 IN IN
;

data want;
  set have;
  by num;
  output;
  if last.num then do;
    index+1;
    output;
  end;
run;

Results:

Obs     NUM     m     index

 1     12345    IN      1
 2     12345    IN      2
 3     34556    IN      1
 4     34556    IN      2
 5     34556    IN      3

Anuz · Posted 02-20-2022 04:04 AM

@Tom Think you got what I meant. definitely vertical structure would be easier. But what i need is a horizontal one.

NUM M1 M2 M3

12345 IN-18 IN-19

34565 IN-16 IN-24 IN-25

I need to maintain an output as above. I would be adding a number to the IN. While I can check how many M columns are there by using the dictionary.columns table. What i am struggling to do is find if there is a M column with NULL value that I can update. If yes which one is that.

So in the example above, if I have a new entry for 34565, the should add a new column M4 to the dataset and populate it with say IN-28

NUM M1 M2 M3 M4

12345 IN-18 IN-19

34565 IN-16 IN-24 IN-25 IN-27

Tom · Posted 02-20-2022 10:50 AM

The problem with the WIDE structure is you cannot add a variable to a dataset once the data step has already started running. So you end up having to do some type of conditional code generation.

You can use PROC TRANSPOSE to convert between the tall and wide layouts.

Say you have this original dataset in WIDE format.

data wide ;
  length num 8 m1-m3 $10;
  input num m1-m3 ;
datalines;
12345 IN-18 IN-19 .
34565 IN-16 IN-24 IN-25
;

And you have this new data to append to it.

data new;
  length num 8 m $10;
  input num m ;
datalines;
34565 IN-27
;

So first transpose the WIDE dataset, then interleave the two, and then re-transpose back to a new WIDE dataset.

proc transpose data=wide
   out=tall(drop=_name_ rename=(col1=M) )
;
  by num;
  var m: ;
run;

data new_tall;
  set tall new ;
  by num;
  if missing(M) and not first.num then delete;
run;

proc transpose data=new_tall out=want(drop=_name_) prefix=M;
  by num;
  var m;
run;

Results

Obs     num      M1       M2       M3       M4

 1     12345    IN-18    IN-19
 2     34565    IN-16    IN-24    IN-25    IN-27

Anuz · Posted 02-20-2022 02:21 PM

@Tom Transpose seems to be the option then. Thank you very much for your time and solution

SAS Programming

Find column with NULL

Re: Find column with NULL

Re: Find column with NULL

Re: Find column with NULL

Re: Find column with NULL

Re: Find column with NULL

How to find a date that goes from Not Null to Null

Find Columns with Missing Values in a list of tables

Datagrid output column names are always uppercase

Strange find behavior with scan/find

finding a text in various columns

Follow Us

What is...

SAS Programming

Special offer for SAS Communities members

SAS Training: Just a Click Away

Follow Us

What is...