Help using Base SAS procedures

how to combine 2 dataset by common variable with out replacing missing values for other variables

Reply
Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

Here for example,

one dataset is

serial no  start date

101           jan2008

101          feb2008

101           march 2008

.

.

.

.

102            feb 2007

102            march2007

102            april2007

.

.

.

secon dataset is

serial no     end date

101            july2007

101            aug2007

101            sept2007

.

.

.

102           may 2008

102           june 2008

102           july2008

.

.

.

How can I get dataset like bellow from above 2 datasets by using common variable serial no

serial no    start date   end date

101               .             july2007

101               .             aug2007

101               .             sept2007

.

.

.

101             jan2008         .

101           feb 2008         .

101           march              .

.

.

.

102         feb2007             .

102         mar2007            .

102         april2007           .

.

.

.

102                .                may2008

102                .                june2008

102                .                july2008

.

.

.

PROC Star
Posts: 7,363

how to combine 2 dataset by common variable with out replacing missing values for other variables

You didn't explain what missing data you don't want to modify, but I think you either want to use the UPDATE statement in a datastep or write your own rules using a MERGE of the files in a datastep.  The update statement is described at: http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001329151.htm

Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

As in result dataset i do not want to modify missing data for start date and end date values as i want to keep both as per timeline and if value do not exit for that perticular month for any of them I want to keep it same way

Super User
Posts: 17,829

how to combine 2 dataset by common variable with out replacing missing values for other variables

From the want data set it looks like you want a set statement rather than a merge.

Does the following do what you want?

data want;

set table1 table2;

run;

If you need to keep the missing in their position you may need to create another variable within serial no that allows you to sort by that.

HTH,

Reeza

Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

As i mention i am creating dataset from 2 dataset  using serial no as common variable and only  values common in both dataset based on serial no will be kept .

so i can not use set satement .

PROC Star
Posts: 7,363

how to combine 2 dataset by common variable with out replacing missing values for other variables

It sounds like you simply want to do a merge of the two datasets.  You do that either in a data step or, in proc sql, with a full join.

Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

here i want to merge 2 dataset but with common varibale with matching values record only,,,,,and during this combination ,,I do not want to apply values to missing values for compared columns and want to retain values and apply missing value where value is not available.

Super User
Posts: 17,829

how to combine 2 dataset by common variable with out replacing missing values for other variables

Do the periods under the serial no indicate missing serial no or a continuation of the previous serial no. Because otherwise all missing serial no's would be included according to your critieria.

Given that you're not overwriting or actually adding in a column it really sounds like you need an append or a set statement with a condition that the dataset contains only records in both files. This is different than a merge in my opinion.

You can try sql something like the following (untested code)

proc sql;

create table want as

select * from (select * from table1 where serial_no in (select serial_no from table2)

               union all

                    select * from table2 where serial_no in (select serial_no from table1)) a;

quit;

If you actually have serial no missing that you need they may cause problems as I mentioned above and you'll have to identify them first somehow.

Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

here period indicates continuation,,,,not missing serial no,,,,

and i think concept is right but ,,,,union all can not be use as columns in both data set is not same ,,,

and not able to get result through it,,

PROC Star
Posts: 7,363

how to combine 2 dataset by common variable with out replacing missing values for other variables

Do you have duplicate serial number, enddate combinations in either dataset?

Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

yes serial no value is repeated sevral time as per ex. tables but enddate value is different for each one

Super User
Posts: 17,829

how to combine 2 dataset by common variable with out replacing missing values for other variables

What about two steps then.

data want1;

set table1 (in=a) table2(in=b);

if a then source='a';

else if b then source='b';

run;

proc sql;

createt able want2

select * from want1

group by serial_no

where max(source) ne min(source); *check for both ids in there not just 1;

quit;

Again untested, but closer I think.

Occasional Contributor
Posts: 19

how to combine 2 dataset by common variable with out replacing missing values for other variables

REEZA,

it did not work as it was using where and it not valid data step,,,,as per log error

Super User
Posts: 17,829

how to combine 2 dataset by common variable with out replacing missing values for other variables

Show the code you used and the log error...

PROC Star
Posts: 7,363

how to combine 2 dataset by common variable with out replacing missing values for other variables

Does the following example match what you want to accomplish?:

data one;

  informat start_date date9.;

  format start_date date9.;

  input serial_no  start_date;

  cards;

101 1jan2008

101 1feb2008

101 1mar2008

102 1feb2007

102 1mar2007

102 1apr2007

;

data two;

  informat end_date date9.;

  format end_date date9.;

  input serial_no  end_date;

  cards;

101 1jul2007

101 1aug2007

101 1sep2007

102 1feb2007

102 1may2008

102 1jun2008

102 1jul2008

;

data twoprep;

  set two;

  start_date=end_date;

run;

data want;

  merge one (in=a) twoprep (in=b);

  by serial_no start_date;

  if b and not(a) then

   call missing(start_date);

run;

Ask a Question
Discussion stats
  • 20 replies
  • 238 views
  • 6 likes
  • 5 in conversation