BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
benjamin_2018
Fluorite | Level 6

Hello. I have a dataset of specifications for certain items, and each item has an old model and a new model. The data looks something like this. 

 

data items;
input Item $ SpecA $ SpecB $ SpecC $ ;
datalines;
ABC A3 B2 C5
ABC A3 B3 .
DEF .  B2 C4
DEF A1 .  C9
GHI A4 B9 C6
GHI A1 B9 .
;
run;

I want to, for each Item in the first column, compare SpecA, SpecB, and SpecC.

I want the output to have the following:

1) If there is a change from missing to not missing, or a change in value, then return the value of the 2nd row in the by group (the new model).

2) If there is a change from not missing to missing, then return the value from the 1st row in the by group (the old model).

3) If there is no change, then return a blank value. 

So here is what the output for the above would look like:

 

Item  SpecA  SpecB  SpecC

ABC  .           B3        C5

DEF  A1       B2         C9

GHI   A1      .             C6

 

Thank you for any suggestions on the best approach. I have tried using a array but I'm not too experienced with that feature. I typically use SAS Enterprise Guide 7.15 and SAS 9.4

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @benjamin_2018  I'm curious to know why does the sample always have sets of 2 items. Is that really a good representative of your real?

 

Anyways, for what it's worth-->


data items;
input Item $ SpecA $ SpecB $ SpecC $ ;
datalines;
ABC A3 B2 C5
ABC A3 B3 .
DEF .  B2 C4
DEF A1 .  C9
GHI A4 B9 C6
GHI A1 B9 .
;
run;

data want;
 set items;
 by item;
 array s  SpecA  SpecB  SpecC;
 array l(3) $ _temporary_;
 do over s;
  if first.item then l(_i_)=s;
  else do;
   if l(_i_)>' ' and s=' ' then s=l(_i_);
   else if l(_i_)>' ' and s=l(_i_) then s=' ';
  end;
 end;
 if last.item;
run; 

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

Hi @benjamin_2018  I'm curious to know why does the sample always have sets of 2 items. Is that really a good representative of your real?

 

Anyways, for what it's worth-->


data items;
input Item $ SpecA $ SpecB $ SpecC $ ;
datalines;
ABC A3 B2 C5
ABC A3 B3 .
DEF .  B2 C4
DEF A1 .  C9
GHI A4 B9 C6
GHI A1 B9 .
;
run;

data want;
 set items;
 by item;
 array s  SpecA  SpecB  SpecC;
 array l(3) $ _temporary_;
 do over s;
  if first.item then l(_i_)=s;
  else do;
   if l(_i_)>' ' and s=' ' then s=l(_i_);
   else if l(_i_)>' ' and s=l(_i_) then s=' ';
  end;
 end;
 if last.item;
run; 
benjamin_2018
Fluorite | Level 6

@novinosrinThank you so much for the input. I will test out this code next week. This dataset has 2 items because I am examining a substitution from an old model to a new model and identifying the specifications that change between models. The items are grouped by similar type, for example, breads, cars, sewing machines, etc. 

benjamin_2018
Fluorite | Level 6

I got this to work! Thank you. I just added a length to the elements in the temporary array because in my real dataset the lengths for all character variables are 64. 

novinosrin
Tourmaline | Level 20

Hi @benjamin_2018  I'm glad. Yes Indeed, I did assume you would be able to adjust the length. Good 🙂 Have a good day and stay safe

mkeintz
PROC Star

This would probably benefit from use a conditional SET with POINT=  and a bunch of renames:

 

data items;
input Item $ SpecA $ SpecB $ SpecC $ ;
datalines;
ABC A3 B2 C5
ABC A3 B3 .
DEF .  B2 C4
DEF A1 .  C9
GHI A4 B9 C6
GHI A1 B9 .
run;


data want (drop=frst_:);
  set items;
  by item;
  if first.item then set items (rename=(speca=frst_speca specb=frst_specb specc=frst_specc)) point=_n_; 
  array spc  spec:;
  array fsp  frst_spec: ;
  if last.item;
  do over spc ;
    if spc=fsp then spc=' '; else
    if missing(spc) then spc=fsp; 
  end;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
benjamin_2018
Fluorite | Level 6

@mkeintzOk, thank you! I've not used the point= option on the set statement so I will need to look at that. At least I was on the right path by using 2 arrays, a first. and last. automatic variable with the by group, and a do loop. I will test this option out when I'm back at work next week. Thank you very much! Ben

mkeintz
PROC Star

You can also use a self-merge with offset as below - no extra SET statement needed:

 

data items;
input Item $ SpecA $ SpecB $ SpecC $ ;
datalines;
ABC A3 B2 C5
ABC A3 B3 .
DEF .  B2 C4
DEF A1 .  C9
GHI A4 B9 C6
GHI A1 B9 .
run;
data want (drop=frst_:);
  merge items (rename=(item=frst_item speca=frst_speca specb=frst_specb specc=frst_specc))
        items (firstobs=2);
  if item=frst_item;
  array spc  spec:;
  array fsp  frst_spec: ;
  do over spc ;
    if spc=fsp then spc=' '; else
    if missing(spc) then spc=fsp; 
  end;
run;

But note you can NOT use a BY statement with this merge.  Because if you do, the initial offset of the two streams of data from HAVE will be lost when the second BY-group is encountered.  I.e. the first output obs would have data from incoming obs1 and obs2, but the second output obs would pause the second data stream (as it had already reached the end of the by group.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1270 views
  • 0 likes
  • 3 in conversation