hi, i need heellpp!
I need to achieve the following:
data master;
input id $ x y z;
datalines;
1001 340 2 6
1001 2670 4 8
1002 990 10 6
1002 1050 2 1
1002 887 4 5
1003 5896 3 2
;
run;
data update;
input id $ x y z;
datalines;
1001 340 4 9
1001 390 2 1
1003 9900 10 6
1003 1050 2 1
1004 887 4 5
1009 5896 3 2
;
run;
data want;**-----> add (or update) records by id (if id are present in master. i.e id=1004 or 1009 dont interest me);
input id $ x y z;
datalines;
1001 340 2 6
1001 2670 4 8
1001 340 4 9
1001 390 2 1
1002 990 10 6
1002 1050 2 1
1002 887 4 5
1003 5896 3 2
1003 9900 10 6
1003 1050 2 1
;
run;
Thankss!
Alan
data master;
input id $ x y z;
datalines;
1001 340 2 6
1001 2670 4 8
1002 990 10 6
1002 1050 2 1
1002 887 4 5
1003 5896 3 2
;
run;
data new_data;
input id $ x y z;
datalines;
1001 340 4 9
1001 390 2 1
1003 9900 10 6
1003 1050 2 1
1004 887 4 5
1009 5896 3 2
;
run;
proc sql;
create table want as
select *
from master
union all corr
select *
from new_data
where id in (select id from master)
order by id;
quit;
something like this . I just changed your table name from update to updated
proc sql; create table temp_table as select * from updated a where exists(select * from master b where a.id =b.id); insert into master select * from temp_table; quit;
or
proc sql;
create table temp_table as
select * from updated
where id in(select id from master);
insert into master
select * from temp_table;
quit;
Couple of issues with what you post, dont call a dataste "update", and also specify what the by groups are here as x y z is non descriptive. I assumed that as ID has multiple rows that X is the other grouping variable. SAS provides the update statement in datastep specifically for these master/transaction processes, docs here:
And for your example:
data master; input id $ x y z; datalines; 1001 340 2 6 1001 2670 4 8 1002 990 10 6 1002 1050 2 1 1002 887 4 5 1003 5896 3 2 ; run; data new_data; input id $ x y z; datalines; 1001 340 4 9 1001 390 2 1 1003 9900 10 6 1003 1050 2 1 1004 887 4 5 1009 5896 3 2 ; run; proc sort data=master; by id x; run; proc sort data=new_data; by id x; run; data want; update master new_data; by id x; run;
Here's a DATA step approach that assumes both data sets are sorted:
data want;
set master (in=in1) update;
by id;
if first.id then keep_flag = in1;
retain keep_flag;
if keep_flag=1;
drop keep_flag;
run;
In SAS terminology, we would usually call this interleaving (rather than merging or updating). You have the added twist of wanting to eliminate ID values that are not in the master data set.
data master;
input id $ x y z;
datalines;
1001 340 2 6
1001 2670 4 8
1002 990 10 6
1002 1050 2 1
1002 887 4 5
1003 5896 3 2
;
run;
data new_data;
input id $ x y z;
datalines;
1001 340 4 9
1001 390 2 1
1003 9900 10 6
1003 1050 2 1
1004 887 4 5
1009 5896 3 2
;
run;
proc sql;
create table want as
select *
from master
union all corr
select *
from new_data
where id in (select id from master)
order by id;
quit;
Hi Ksharp, thank you very much for your reply, its works!
Can I ask you another question?
I have the following situation:
data have;
input id $ weight1 weight2 age year $ group $;
datalines;
1001 340 330 35.7 1990 2
1001 267 289 35.7 1988 2
1001 300 297 35.7 1989 2
1002 450 438 33.2 1995 1
1002 299 279 33.2 1993 1
1002 390 385 33.2 1994 1
1003 519 512 39.5 2000 4
1003 450 443 39.5 1998 4
1003 329 316 39.5 1997 4
1003 310 301 39.5 1995 4
;
run;
I need to generate a new dataset ("want") which adds an observation by id putting zeroes (0) to two variables ("weight1" & "weight2") and retaining the original values of the others ("age", "year" & "group").
like as:
data want;
input id $ weight1 weight2 age year $ group $;
datalines;
1001 0 0 35.7 1990 2
1001 340 330 35.7 1990 2
1001 267 289 35.7 1988 2
1001 300 297 35.7 1989 2
1002 0 0 33.2 1995 1
1002 450 438 33.2 1995 1
1002 299 279 33.2 1993 1
1002 390 385 33.2 1994 1
1003 0 0 39.5 2000 4
1003 519 512 39.5 2000 4
1003 450 443 39.5 1998 4
1003 329 316 39.5 1997 4
1003 310 301 39.5 1995 4
;
run;
it is understood?
thanks!
OK. Here is .
data have;
input id $ weight1 weight2 age year $ group $;
datalines;
1001 340 330 35.7 1990 2
1001 267 289 35.7 1988 2
1001 300 297 35.7 1989 2
1002 450 438 33.2 1995 1
1002 299 279 33.2 1993 1
1002 390 385 33.2 1994 1
1003 519 512 39.5 2000 4
1003 450 443 39.5 1998 4
1003 329 316 39.5 1997 4
1003 310 301 39.5 1995 4
;
run;
data want;
set have;
by id;
if first.id then do;
w1=weight1;w2=weight2;
weight1=0;weight2=0;output;
weight1=w1;weight2=w2;output;
end;
else output;
drop w1 w2;
run;
Thanks Ksharp!!
Another query please, is there any function in SAS that allows to invoke "all variables"?
For example: If "all variables" ne weight1 or weight2 then "all variables" = .;
Do you understand what I want to do?
In my situation above (if first.id then do), if I wanted to put all other variables missing except for weight1 or weight2 (latter = 0) ...
because in my original file I have many variables.
thanks, again!
No . There is no such function . But you can use ARRAY to get what you want .
Ok.
I ask one last help:
I'm building a macro that allows you to keep observations for 2 conditions:
1- if the frequencies of &contgroup are greater than or equal to 3;
2- if the frequencies of &sire are greater than or equal to 3;
The complex thing is that eliminating observations by 1- changes the frequencies to 2-, and vice versa. It is understood?
Here is an example of my data file:
data have;
input id $ sire $ contgroup $;
datalines;
9914089 9613073 81999
9914096 9512799 81999
9914108 100671 81999
9914115 100690 81999
9914119 100690 81999
9914124 100689 81999
9914130 9613166 81999
9914136 9613048 81999
9914137 100671 91999
9914148 9512799 91999
9914158 9613166 91999
9914163 100671 91999
9914164 9512782 91999
;
run;
I have tried several ways but I always have the same problem, deleting/retaining observations for one of the conditions, change the frequencies of the other variable.
How could the macro do?
Thanks!
Alan
Sorry. I couldn't understand what you mean.
Maybe you should start a new session. Let others see your question.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.