BookmarkSubscribeRSS Feed
xxformat_com
Barite | Level 11

Datasets to append can have different variables.

Sometimes, it can be useful to identify in the output dataset when values are missing before the variable was not in the original dataset.

 

data a;
    a='A';
    b='B';
    x=1;
    y=1;
run;

data b;
    a='A';
    c='C';
    x=2;
    z=2;
run;

How would you get a slash when the character variable was missing and a .A when the numeric variable was missing?

xxformat_com_0-1727582653828.png

 

8 REPLIES 8
xxformat_com
Barite | Level 11

I used a set statement only.

 

The solution I've found is using arrays.

I'm just assuming that there is a better way of doing it.

Kurt_Bremser
Super User

@xxformat_com wrote:

I used a set statement only.

 

The solution I've found is using arrays.

I'm just assuming that there is a better way of doing it.


This can not result in what you posted.

Code I ran on SAS ODA:

data a;
    a='A';
    b='B';
    x=1;
    y=1;
run;

data b;
    a='A';
    c='C';
    x=2;
    z=2;
run;

data want;
set a b;
run;

proc print data=want;
run;

Result:

Beob.	a	b	x	y	c	z
1	A	B	1	1	 	.
2	A	 	2	.	C	2

As you can see, the order of the variables is different, and the display of the missing values is as it should be.

So you must have run different code, and additional things (e.g. formats assigned to the variables, or a custom setting of OPTIONS MISSING.

As already said, POST YOUR COMPLETE CODE, and add the result of PROC CONTENTS from your result dataset, and the setting of the MISSING option.

xxformat_com
Barite | Level 11
data a;
    a='A';
    b='B';
    x=1;
    y=1;
run;

data b;
    a='A';
    c='C';
    x=2;
    z=2;
run;

data z (drop=i j);
    set a b;
    array charvar {*} _character_; 
    array numvar  {*} _numeric_;
    do i=1 to dim(charvar);
        if charvar{i} = ' ' then charvar{i}='/';
    end;
    do j=1 to dim(numvar);
        if numvar{j} = . then numvar{j}=.A;
    end;
run;

proc print data=z;
   var a b c x y z;
run;

But I'm assuming there is an easier approach.

Actually, I added more data and an extra layer of difficulty after that in my code because any missing value of an existing variable should not be reported with / or .A

data a;
    a='A';
    b='B';
    w=.;
    x=1;
    y=1;
run;

data b;
    a='A';
    c='C';
    d=' ';
    x=2;
    z=2;
run;

At the end, my excel report changes the background of the cells only when the variables were not in the original datasets (my datasets are coming from .csv file I import) and I have an extra layer of probably unneeded complication using a loop for call define in proc report.

 

Tom
Super User Tom
Super User

Set all of the variables to your special values BEFORE the SET.  That way the values read will overwrite the special values.  Use IF 0 THEN (an obviously false condition) to allow the data step compiler to see all the datasets so all of the variables are define so the ARRAY statement can find them.

But you need to use a separate SET statement for each dataset. Otherwise the SET operation will clear (set to missing) the variables that do not come from that particular dataset.  When setting each dataset separately the SET statement will not do that since its won't know anything about the other dataset(s) that are not mentioned.

data a;
  input a $ b $ x y ;
cards;
A1 B1 1 1
A2 . 3 .
;

data b;
  input a $ c $ x z ;
cards;
A3 C1 2 2
. C2 . 4 
;

data want;
  if 0 then set a b ;
  array _char_ _character_;
  array _num_ _numeric_;
  do over _char_ ;
    _char_='/';
  end;
  do over _num_;
    _num_=.A;
  end;
  if not eof then set a end=eof;
  else set b ;
run;

Result

Obs    a     b     x    y    c     z

 1     A1    B1    1    1    /     A
 2     A2          3    .    /     A
 3     A3    /     2    A    C1    2
 4           /     .    A    C2    4

 

Patrick
Opal | Level 21

If this is just about display/print then a format could do the job.

data a;
  a='A';
  b='B';
  w=.;
  x=1;
  y=1;
  v=.;
run;

data b;
  a='A';
  c='C';
  d=' ';
  x=2;
  z=2;
  v=.;
run;

proc format;
  value numMiss
    . = .A
    ;
  value $charMiss
    ' ' = '/'
    ;
run;

data all;
  set a b;
  format _character_ $charMiss. _numeric_ numMiss.;
run;

proc print data=all;
run;

Patrick_1-1727659652637.png

I've added a variable V to your sample data to demonstrate that a missing doesn't tell you if a source variable existed in both source tables or not. It can just be missing in both source tables.

PaigeMiller
Diamond | Level 26

@xxformat_com wrote:

I used a set statement only.


Show us the code.

--
Paige Miller
mkeintz
PROC Star

This program preserves original missing values in A and B, but gives the new designated "missing" values  when the variable lists are expanded to include all vars not originally in both datasets.  

 

data notina(drop=_left--_right);
  retain _Left .;
  set a (obs=1);
  retain _Right ' ';
  set b;
  array _n _numeric_;
  array _c _character_;
  do over _n;  _n=.A;  end;
  do over _c;  _c='/'; end;
run;
data notinb(drop=_left--_right);
  retain _Left .;
  set b (obs=1);
  retain _Right ' ';
  set a;
  array _n _numeric_;
  array _c _character_;
  do over _n;  _n=.A;  end;
  do over _c;  _c='/'; end;
run;

data want;
  set a (in=ina)  b  (in=inb);
  retain _p 1;
  if ina then set notina point=_p;
  if inb then set notinb point=_p;
run;  
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 454 views
  • 5 likes
  • 6 in conversation