10-25-2011 01:09 PM
Here for example,
one dataset is
serial no start date
101 march 2008
102 feb 2007
secon dataset is
serial no end date
102 may 2008
102 june 2008
How can I get dataset like bellow from above 2 datasets by using common variable serial no
serial no start date end date
101 . july2007
101 . aug2007
101 . sept2007
101 jan2008 .
101 feb 2008 .
101 march .
102 feb2007 .
102 mar2007 .
102 april2007 .
102 . may2008
102 . june2008
102 . july2008
10-25-2011 01:24 PM
You didn't explain what missing data you don't want to modify, but I think you either want to use the UPDATE statement in a datastep or write your own rules using a MERGE of the files in a datastep. The update statement is described at: http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001329151.htm
10-25-2011 01:33 PM
As in result dataset i do not want to modify missing data for start date and end date values as i want to keep both as per timeline and if value do not exit for that perticular month for any of them I want to keep it same way
10-25-2011 01:35 PM
From the want data set it looks like you want a set statement rather than a merge.
Does the following do what you want?
set table1 table2;
If you need to keep the missing in their position you may need to create another variable within serial no that allows you to sort by that.
10-25-2011 01:50 PM
As i mention i am creating dataset from 2 dataset using serial no as common variable and only values common in both dataset based on serial no will be kept .
so i can not use set satement .
10-25-2011 01:57 PM
It sounds like you simply want to do a merge of the two datasets. You do that either in a data step or, in proc sql, with a full join.
10-25-2011 02:58 PM
here i want to merge 2 dataset but with common varibale with matching values record only,,,,,and during this combination ,,I do not want to apply values to missing values for compared columns and want to retain values and apply missing value where value is not available.
10-25-2011 02:16 PM
Do the periods under the serial no indicate missing serial no or a continuation of the previous serial no. Because otherwise all missing serial no's would be included according to your critieria.
Given that you're not overwriting or actually adding in a column it really sounds like you need an append or a set statement with a condition that the dataset contains only records in both files. This is different than a merge in my opinion.
You can try sql something like the following (untested code)
create table want as
select * from (select * from table1 where serial_no in (select serial_no from table2)
select * from table2 where serial_no in (select serial_no from table1)) a;
If you actually have serial no missing that you need they may cause problems as I mentioned above and you'll have to identify them first somehow.
10-25-2011 02:53 PM
here period indicates continuation,,,,not missing serial no,,,,
and i think concept is right but ,,,,union all can not be use as columns in both data set is not same ,,,
and not able to get result through it,,
10-25-2011 03:00 PM
yes serial no value is repeated sevral time as per ex. tables but enddate value is different for each one
10-25-2011 03:11 PM
What about two steps then.
set table1 (in=a) table2(in=b);
if a then source='a';
else if b then source='b';
createt able want2
select * from want1
group by serial_no
where max(source) ne min(source); *check for both ids in there not just 1;
Again untested, but closer I think.
10-25-2011 03:53 PM
it did not work as it was using where and it not valid data step,,,,as per log error
10-25-2011 03:12 PM
Does the following example match what you want to accomplish?:
informat start_date date9.;
format start_date date9.;
input serial_no start_date;
informat end_date date9.;
format end_date date9.;
input serial_no end_date;
merge one (in=a) twoprep (in=b);
by serial_no start_date;
if b and not(a) then