Here for example,
one dataset is
serial no start date
101 jan2008
101 feb2008
101 march 2008
.
.
.
.
102 feb 2007
102 march2007
102 april2007
.
.
.
secon dataset is
serial no end date
101 july2007
101 aug2007
101 sept2007
.
.
.
102 may 2008
102 june 2008
102 july2008
.
.
.
How can I get dataset like bellow from above 2 datasets by using common variable serial no
serial no start date end date
101 . july2007
101 . aug2007
101 . sept2007
.
.
.
101 jan2008 .
101 feb 2008 .
101 march .
.
.
.
102 feb2007 .
102 mar2007 .
102 april2007 .
.
.
.
102 . may2008
102 . june2008
102 . july2008
.
.
.
You didn't explain what missing data you don't want to modify, but I think you either want to use the UPDATE statement in a datastep or write your own rules using a MERGE of the files in a datastep. The update statement is described at: http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001329151.htm
As in result dataset i do not want to modify missing data for start date and end date values as i want to keep both as per timeline and if value do not exit for that perticular month for any of them I want to keep it same way
From the want data set it looks like you want a set statement rather than a merge.
Does the following do what you want?
data want;
set table1 table2;
run;
If you need to keep the missing in their position you may need to create another variable within serial no that allows you to sort by that.
HTH,
Reeza
As i mention i am creating dataset from 2 dataset using serial no as common variable and only values common in both dataset based on serial no will be kept .
so i can not use set satement .
It sounds like you simply want to do a merge of the two datasets. You do that either in a data step or, in proc sql, with a full join.
here i want to merge 2 dataset but with common varibale with matching values record only,,,,,and during this combination ,,I do not want to apply values to missing values for compared columns and want to retain values and apply missing value where value is not available.
Do the periods under the serial no indicate missing serial no or a continuation of the previous serial no. Because otherwise all missing serial no's would be included according to your critieria.
Given that you're not overwriting or actually adding in a column it really sounds like you need an append or a set statement with a condition that the dataset contains only records in both files. This is different than a merge in my opinion.
You can try sql something like the following (untested code)
proc sql;
create table want as
select * from (select * from table1 where serial_no in (select serial_no from table2)
union all
select * from table2 where serial_no in (select serial_no from table1)) a;
quit;
If you actually have serial no missing that you need they may cause problems as I mentioned above and you'll have to identify them first somehow.
here period indicates continuation,,,,not missing serial no,,,,
and i think concept is right but ,,,,union all can not be use as columns in both data set is not same ,,,
and not able to get result through it,,
Do you have duplicate serial number, enddate combinations in either dataset?
yes serial no value is repeated sevral time as per ex. tables but enddate value is different for each one
What about two steps then.
data want1;
set table1 (in=a) table2(in=b);
if a then source='a';
else if b then source='b';
run;
proc sql;
createt able want2
select * from want1
group by serial_no
where max(source) ne min(source); *check for both ids in there not just 1;
quit;
Again untested, but closer I think.
REEZA,
it did not work as it was using where and it not valid data step,,,,as per log error
Show the code you used and the log error...
Does the following example match what you want to accomplish?:
data one;
informat start_date date9.;
format start_date date9.;
input serial_no start_date;
cards;
101 1jan2008
101 1feb2008
101 1mar2008
102 1feb2007
102 1mar2007
102 1apr2007
;
data two;
informat end_date date9.;
format end_date date9.;
input serial_no end_date;
cards;
101 1jul2007
101 1aug2007
101 1sep2007
102 1feb2007
102 1may2008
102 1jun2008
102 1jul2008
;
data twoprep;
set two;
start_date=end_date;
run;
data want;
merge one (in=a) twoprep (in=b);
by serial_no start_date;
if b and not(a) then
call missing(start_date);
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.