Question for SET statement

Accepted Solution Solved
Reply
N/A
Posts: 1
Accepted Solution

Question for SET statement

Hi folks,

I have a script to combine 2 dataset but has unexpected result.

=================================================================

data TEMP1;

input IDNO $16.;

cards;

1

2

;

data TEMP2;

input CUS_ID $16.;

cards;

3

4

5

;

data TEMP;

set temp1 temp2;

if IDNO='' then IDNO=CUS_ID;

run;

=================================================================

I expect TEMP datasets has IDNO value as "1 2 3 4 5" for 5 observations. However, it turns out "1 2 3 3 3".

Why is the missing IDNO cannot get the value from CUS_ID for 3rd~5th observation?

Thanks.


Accepted Solutions
Solution
‎08-26-2015 01:42 AM
Valued Guide
Posts: 3,206

Re: Question for SET statement

Back to the PDV SAS(R) 9.3 Language Reference: Concepts, Second Edition

It sets variables to missing when belonging to the PDV before reading.    Using a SET will not set the other variables of the other dataset to missing.
For the more advanced usage of how to benefit from that http://support.sas.com/resources/papers/proceedings09/038-2009.pdf  the DOW loop

---->-- ja karman --<-----

View solution in original post


All Replies
Valued Guide
Posts: 854

Re: Question for SET statement

You should complete the dataset before making this change:

data compile;

set temp1 temp2;

/*if missing(IDNO) then IDNO=CUS_ID;*/

run;

data temp;

set compile;

if missing(IDNO) then IDNO=CUS_ID;

run;

Esteemed Advisor
Posts: 6,685

Re: Question for SET statement

Since the data step has no records to read for IDNO (from temp1) after iteration 2, it does not reset IDNO to missing after iteration 3, and therefore implicitly retains the value "3".

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Esteemed Advisor
Esteemed Advisor
Posts: 7,232

Re: Question for SET statement

Perhaps you mean something like this:

data temp1;

  input idno $16.;

cards;

1

2

;

run;

data temp2;

  input cus_id $16.;

cards;

3

4

5

;

run;

data temp;

  set temp1 (in=a) temp2 (in=b);

  if b then idno=cus_id;

run;

Super User
Super User
Posts: 6,364

Re: Question for SET statement

Because that's how it works.  Smiley Happy

You could force the values to missing.  Add an OUTPUT statement to write the current observations. Then add a CALL MISSING() statement to set all values to missing. Then the next time through the data step only datasets that are still contributing will change the variable's values.

data TEMP;

   set temp1 temp2;

   CUS_ID = coalescec(cus_id,idno);

   OUTPUT;

   CALL MISSING(of _ALL_);

run;


But it looks like what you really want is a RENAME dataset option.


data temp;

  set temp1 (rename=(IDNO=CUS_ID)) temp2;

run;


And perhaps even a MERGE instead of a SET?


data temp;

  merge temp1 (rename=(IDNO=CUS_ID)) temp2;

  by cus_id ;

run;


Grand Advisor
Posts: 9,584

Re: Question for SET statement

Haha. Very interesting Question.

Because variable IDNO has been retained.

after read all obs from TEMP1, SAS will set IDNO be missing .So your condition(IDNO=' ') is right, when sas read the first obs from TEMP2, and you got IDNO=3.After that SAS will NOT set IDNO be missing due to it is from dataset TEMP1,so your condition(IDNO=' ') will never right ,and you got 3 always.

66       

67         data TEMP;

68         set temp1 temp2;

69         put 'Before:' _all_;

70         if IDNO=' ' then IDNO=CUS_ID;

71         put 'After:' _all_ ;

72         put /;

73         run;

Before:IDNO=1 CUS_ID=  _ERROR_=0 _N_=1

After:IDNO=1 CUS_ID=  _ERROR_=0 _N_=1

Before:IDNO=2 CUS_ID=  _ERROR_=0 _N_=2

After:IDNO=2 CUS_ID=  _ERROR_=0 _N_=2

Before:IDNO=  CUS_ID=3 _ERROR_=0 _N_=3

After:IDNO=3 CUS_ID=3 _ERROR_=0 _N_=3

Before:IDNO=3 CUS_ID=4 _ERROR_=0 _N_=4

After:IDNO=3 CUS_ID=4 _ERROR_=0 _N_=4

Before:IDNO=3 CUS_ID=5 _ERROR_=0 _N_=5

After:IDNO=3 CUS_ID=5 _ERROR_=0 _N_=5

66        

67         data TEMP;

68         set temp1 temp2;

69         put 'Before:' _all_;

70         if IDNO=' ' then IDNO=CUS_ID;

71         put 'After:' _all_ ;

72        

73         output;

74         call missing(of _all_);

75         run;

Before:IDNO=1 CUS_ID=  _ERROR_=0 _N_=1

After:IDNO=1 CUS_ID=  _ERROR_=0 _N_=1

Before:IDNO=2 CUS_ID=  _ERROR_=0 _N_=2

After:IDNO=2 CUS_ID=  _ERROR_=0 _N_=2

Before:IDNO=  CUS_ID=3 _ERROR_=0 _N_=3

After:IDNO=3 CUS_ID=3 _ERROR_=0 _N_=3

Before:IDNO=  CUS_ID=4 _ERROR_=0 _N_=4

After:IDNO=4 CUS_ID=4 _ERROR_=0 _N_=4

Before:IDNO=  CUS_ID=5 _ERROR_=0 _N_=5

After:IDNO=5 CUS_ID=5 _ERROR_=0 _N_=5

消息编辑者为:xia keshan

Solution
‎08-26-2015 01:42 AM
Valued Guide
Posts: 3,206

Re: Question for SET statement

Back to the PDV SAS(R) 9.3 Language Reference: Concepts, Second Edition

It sets variables to missing when belonging to the PDV before reading.    Using a SET will not set the other variables of the other dataset to missing.
For the more advanced usage of how to benefit from that http://support.sas.com/resources/papers/proceedings09/038-2009.pdf  the DOW loop

---->-- ja karman --<-----
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 336 views
  • 7 likes
  • 7 in conversation