BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Anuz
Quartz | Level 8

I have dataset similar to the below 

 

DATA COMP_LIST;
input NUM M1 $ M2 $;
datalines ;
12345 IN .
34556 IN IN
;
run;

 

What I need to do based on a list provided check if that number exists in the above table.

 

1. If it does, I need to check if there is a M* column that is NULL. If it is then update it with NULL.

For example 12345 exists in the table above and M1 is not null and has value of IN. So as long the next M column exists update that with 'IN'. so M2 for 12345 has no value so update M2 with 'IN'

2. if there is no M* column that is NULL then add the next M column to the table and update it with value.

For example 34556 exists in the dataset and both M1 and M2 has values of 'IN' . so now add M3 to the dataset and update with 'IN'

 

Hope this makes sense. What is the best way to achieve this ? 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The problem with the WIDE structure is you cannot add a variable to a dataset once the data step has already started running. So you end up having to do some type of conditional code generation.

 

You can use PROC TRANSPOSE to convert between the tall and wide layouts.

Say you have this original dataset in WIDE format.

data wide ;
  length num 8 m1-m3 $10;
  input num m1-m3 ;
datalines;
12345 IN-18 IN-19 .
34565 IN-16 IN-24 IN-25
;

And you have this new data to append to it.

data new;
  length num 8 m $10;
  input num m ;
datalines;
34565 IN-27
;

So first transpose the WIDE dataset, then interleave the two, and then re-transpose back to a new WIDE dataset.

proc transpose data=wide
   out=tall(drop=_name_ rename=(col1=M) )
;
  by num;
  var m: ;
run;

data new_tall;
  set tall new ;
  by num;
  if missing(M) and not first.num then delete;
run;

proc transpose data=new_tall out=want(drop=_name_) prefix=M;
  by num;
  var m;
run;

Results

Obs     num      M1       M2       M3       M4

 1     12345    IN-18    IN-19
 2     34565    IN-16    IN-24    IN-25    IN-27

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

Not sure I understand what you are trying to do, but it looks to me like it would be a lot easier with a vertical structure instead.

data have;
  infile datalines truncover;
  input NUM @;
  length m $10;
  do index=1 by 1 until(missing(M));
    input m @;
    if not missing(m) then output;
  end;
datalines ;
12345 IN .
34556 IN IN
;

data want;
  set have;
  by num;
  output;
  if last.num then do;
    index+1;
    output;
  end;
run;

Results:

Obs     NUM     m     index

 1     12345    IN      1
 2     12345    IN      2
 3     34556    IN      1
 4     34556    IN      2
 5     34556    IN      3

Anuz
Quartz | Level 8

@Tom Think you got what I meant. definitely vertical structure would be easier. But what i need is a horizontal one. 

 

NUM     M1       M2        M3

12345   IN-18    IN-19

34565   IN-16    IN-24    IN-25

 

I need to maintain an output as above. I would be adding a number to the IN. While I can check how many M columns are there by using the dictionary.columns table. What i am struggling to do is find if there is a M column with NULL value that I can update. If yes which one is that. 

 

So in the example above, if I have a new entry for 34565, the should add a new column M4 to the dataset and populate it with say IN-28

 

 

NUM     M1       M2        M3        M4

12345   IN-18    IN-19

34565   IN-16    IN-24    IN-25    IN-27

 

Tom
Super User Tom
Super User

The problem with the WIDE structure is you cannot add a variable to a dataset once the data step has already started running. So you end up having to do some type of conditional code generation.

 

You can use PROC TRANSPOSE to convert between the tall and wide layouts.

Say you have this original dataset in WIDE format.

data wide ;
  length num 8 m1-m3 $10;
  input num m1-m3 ;
datalines;
12345 IN-18 IN-19 .
34565 IN-16 IN-24 IN-25
;

And you have this new data to append to it.

data new;
  length num 8 m $10;
  input num m ;
datalines;
34565 IN-27
;

So first transpose the WIDE dataset, then interleave the two, and then re-transpose back to a new WIDE dataset.

proc transpose data=wide
   out=tall(drop=_name_ rename=(col1=M) )
;
  by num;
  var m: ;
run;

data new_tall;
  set tall new ;
  by num;
  if missing(M) and not first.num then delete;
run;

proc transpose data=new_tall out=want(drop=_name_) prefix=M;
  by num;
  var m;
run;

Results

Obs     num      M1       M2       M3       M4

 1     12345    IN-18    IN-19
 2     34565    IN-16    IN-24    IN-25    IN-27
Anuz
Quartz | Level 8

@Tom Transpose seems to be the option then. Thank you very much for your time and solution 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2276 views
  • 2 likes
  • 2 in conversation