BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alan_maxs
Calcite | Level 5

hi, i need heellpp!

I need to achieve the following:

 

data master;
 input id $ x y z;
 datalines;
 1001 340 2 6
 1001 2670 4 8
 1002 990 10 6
 1002 1050 2 1
 1002 887 4 5
 1003 5896 3 2
 ;
run;

data update;
 input id $ x y z;
 datalines;
 1001 340 4 9
 1001 390 2 1
 1003 9900 10 6
 1003 1050 2 1
 1004 887 4 5
 1009 5896 3 2
 ;
run;

data want;**-----> add (or update) records by id (if id are present in master. i.e id=1004 or 1009 dont interest me); 
 input id $ x y z;
 datalines;
 1001 340 2 6
 1001 2670 4 8
 1001 340 4 9
 1001 390 2 1 
 1002 990 10 6
 1002 1050 2 1
 1002 887 4 5
 1003 5896 3 2
 1003 9900 10 6
 1003 1050 2 1
 ;
run;

Thankss!

Alan

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data master;
 input id $ x y z;
 datalines;
 1001 340 2 6
 1001 2670 4 8
 1002 990 10 6
 1002 1050 2 1
 1002 887 4 5
 1003 5896 3 2
 ;
run;

data new_data;
 input id $ x y z;
 datalines;
 1001 340 4 9
 1001 390 2 1
 1003 9900 10 6
 1003 1050 2 1
 1004 887 4 5
 1009 5896 3 2
 ;
run;
proc sql;
create table want as
 select *
  from master
union all corr
 select *
  from new_data
   where id in (select id from master)
order by id;
quit;

View solution in original post

13 REPLIES 13
kiranv_
Rhodochrosite | Level 12

something like this . I just changed your table name from update to updated

 

proc sql;
create table temp_table as 
select * from updated a
where exists(select * from master b
where a.id =b.id);
insert into master
select * from temp_table;
quit;

or

 

proc sql;
create table temp_table as
select * from updated
where id in(select id from master);
insert into master
select * from temp_table;
quit; 

alan_maxs
Calcite | Level 5
thanks! its works! 🙂
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Couple of issues with what you post, dont call a dataste "update", and also specify what the by groups are here as x y z is non descriptive.  I assumed that as ID has multiple rows that X is the other grouping variable.  SAS provides the update statement in datastep specifically for these master/transaction processes, docs here:

http://support.sas.com/documentation/cdl/en/basess/68381/HTML/default/viewer.htm#n0s3jm3mkzvz5qn1mri...

 

And for your example:

data master;
 input id $ x y z;
 datalines;
 1001 340 2 6
 1001 2670 4 8
 1002 990 10 6
 1002 1050 2 1
 1002 887 4 5
 1003 5896 3 2
 ;
run;

data new_data;
 input id $ x y z;
 datalines;
 1001 340 4 9
 1001 390 2 1
 1003 9900 10 6
 1003 1050 2 1
 1004 887 4 5
 1009 5896 3 2
 ;
run;

proc sort data=master;
  by id x;
run;

proc sort data=new_data;
  by id x;
run;
  
data want;
  update master new_data;
  by id x;
run;
Astounding
PROC Star

Here's a DATA step approach that assumes both data sets are sorted:

 

data want;

set master (in=in1) update;

by id;

if first.id then keep_flag = in1;

retain keep_flag;

if keep_flag=1;

drop keep_flag;

run;

 

In SAS terminology, we would usually call this interleaving (rather than merging or updating).  You have the added twist of wanting to eliminate ID values that are not in the master data set.

alan_maxs
Calcite | Level 5
thanks!!
I sorry for the terminology, i'm new!
Ksharp
Super User
data master;
 input id $ x y z;
 datalines;
 1001 340 2 6
 1001 2670 4 8
 1002 990 10 6
 1002 1050 2 1
 1002 887 4 5
 1003 5896 3 2
 ;
run;

data new_data;
 input id $ x y z;
 datalines;
 1001 340 4 9
 1001 390 2 1
 1003 9900 10 6
 1003 1050 2 1
 1004 887 4 5
 1009 5896 3 2
 ;
run;
proc sql;
create table want as
 select *
  from master
union all corr
 select *
  from new_data
   where id in (select id from master)
order by id;
quit;
alan_maxs
Calcite | Level 5

Hi Ksharp, thank you very much for your reply, its works!

Can I ask you another question?

I have the following situation:

 

data have;
 input id $ weight1 weight2 age year $ group $;
 datalines;
 1001 340 330 35.7 1990 2
 1001 267 289 35.7 1988 2
 1001 300 297 35.7 1989 2  
 1002 450 438 33.2 1995 1
 1002 299 279 33.2 1993 1
 1002 390 385 33.2 1994 1
 1003 519 512 39.5 2000 4
 1003 450 443 39.5 1998 4
 1003 329 316 39.5 1997 4
 1003 310 301 39.5 1995 4
 ;
run;

I need to generate a new dataset ("want") which adds an observation  by id putting zeroes (0) to two variables ("weight1" & "weight2") and retaining the original values ​​of the others ("age", "year" & "group").

 

like as:

data want;
 input id $ weight1 weight2 age year $ group $;
 datalines;
 1001 0 0 35.7 1990 2
 1001 340 330 35.7 1990 2
 1001 267 289 35.7 1988 2
 1001 300 297 35.7 1989 2
 1002 0 0 33.2 1995 1
 1002 450 438 33.2 1995 1
 1002 299 279 33.2 1993 1
 1002 390 385 33.2 1994 1
 1003 0 0 39.5 2000 4
 1003 519 512 39.5 2000 4
 1003 450 443 39.5 1998 4
 1003 329 316 39.5 1997 4
 1003 310 301 39.5 1995 4
 ;
run;

it is understood?

 

thanks!

Ksharp
Super User

OK. Here is .

 

data have;
 input id $ weight1 weight2 age year $ group $;
 datalines;
 1001 340 330 35.7 1990 2
 1001 267 289 35.7 1988 2
 1001 300 297 35.7 1989 2  
 1002 450 438 33.2 1995 1
 1002 299 279 33.2 1993 1
 1002 390 385 33.2 1994 1
 1003 519 512 39.5 2000 4
 1003 450 443 39.5 1998 4
 1003 329 316 39.5 1997 4
 1003 310 301 39.5 1995 4
 ;
run;
data want;
 set have;
 by id;
 if first.id then do;
  w1=weight1;w2=weight2;
  weight1=0;weight2=0;output;
  weight1=w1;weight2=w2;output;
 end;
 else output;
 drop w1 w2;
run;
alan_maxs
Calcite | Level 5

Thanks Ksharp!!

Another query please, is there any function in SAS that allows to invoke "all variables"?

For example: If "all variables" ne weight1 or weight2 then "all variables" = .;
Do you understand what I want to do?

In my situation above (if first.id then do), if I wanted to put all other variables missing except for weight1 or weight2 (latter = 0) ...
because in my original file I have many variables.

 

thanks, again!

Ksharp
Super User

No . There is no such function . But you can use ARRAY to get what you want .

alan_maxs
Calcite | Level 5

Ok.

I ask one last help:

I'm building a macro that allows you to keep observations for 2 conditions:
1- if the frequencies of &contgroup are greater than or equal to 3;
2- if the frequencies of &sire are greater than or equal to 3;

The complex thing is that eliminating observations by 1- changes the frequencies to 2-, and vice versa. It is understood?

Here is an example of my data file:

 

data have;
 input id $ sire $ contgroup $;
 datalines;
9914089	9613073	81999	
9914096	9512799	81999	
9914108	100671	81999	
9914115	100690	81999	
9914119	100690	81999	
9914124	100689	81999	
9914130	9613166	81999	
9914136	9613048	81999	
9914137	100671	91999	
9914148	9512799	91999	
9914158	9613166	91999	
9914163	100671	91999	
9914164	9512782	91999
;
run;	

I have tried several ways but I always have the same problem, deleting/retaining observations for one of the conditions, change the frequencies of the other variable.

How could the macro do?

 

Thanks!

Alan

Ksharp
Super User

Sorry. I couldn't understand what you mean.

Maybe you should start a new session. Let others see your question.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1302 views
  • 1 like
  • 5 in conversation