BookmarkSubscribeRSS Feed
littlestone
Fluorite | Level 6
suppose I have two data sets which are "Main" and "Support":

Main Dataset:

ID Variable_1
1..........0
2..........0
3..........0
4..........0
5..........0
6..........0
7..........0
8..........0
9..........0

Support Dataset:

ID Variable_1
2..........False
4..........True
5..........False
6..........True
9..........True

I want to update the Main data set in a way that the Variable_1 will be re-valued based on information from Support, For example, for ID 4, because the corresponding Variable_1 in the Support data set is "True", the Variable_1 in the Main data set will be re-valued to 1; similarly, for ID 2, because the corresponding Variable_1 in the Support data set is "False", the Variable_1 in the Main data set will be kept 0.

So the new updated Main should be:

ID Variable_1
1..........0
2..........0
3..........0
4..........1
5..........0
6..........1
7..........0
8..........0
9..........1

What could be an efficient Data step or Proc SQL to accomplish such task?

Thanks.
6 REPLIES 6
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Littlestone,

This is a possible solution:
[pre]
data m;
input ID Variable_1;
datalines;
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
run;
data s;
input ID Variable_1 $;
datalines;
2 False
4 True
5 False
6 True
9 True
run;
proc SQL;
create table r as
select a.ID,
case
when UPCASE(b.Variable_1) = "TRUE" then 1
when UPCASE(b.Variable_1) = "FALSE" or UPCASE(b.Variable_1) = "" then 0
else -1
end as Variable_1
from m as a left join s as b
on a.ID=b.ID
;quit;
[/pre]
Sincerely,
SPR
littlestone
Fluorite | Level 6
Thank you very much. It is just what I need!
littlestone
Fluorite | Level 6
Hello, SPR

Sorry to bother you again. I noticed that in your code:


.......proc SQL;
create table r as
select a.ID,
case........



You used "select a.ID", what if there are many variables in dataset a? Can I use "select a.* ?
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Littlestone,

In general you can use a.* but in your request it was necessary to update variable_1.
If you say a.* it means that variable_1 (without any changes) is also included, so it will be necessary to create some variable_2 according to requirements in your initial post.

Sincerely,
SPR
littlestone
Fluorite | Level 6
thank you all for great help
Ksharp
Super User
[pre]




data m;
input ID Variable_1;
datalines;
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
;
run;
data s;
input ID Variable_1 $;
datalines;
2 False
4 True
5 False
6 True
9 True
;
run;
[/pre]


proc format;



 invalue        varr



        
'False'=0



        
'True'=1;



run;



 



proc sql;



 create table result as



  select coalesce(s.id,m.id) as id ,coalesce(input(s.variable_1,varr.)
,m.variable_1)
as var



   from s right join m
on s.id = m.id



   ;



quit;



 



 



 



 



 



Ksharp



Message was edited by: Ksharp

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 812 views
  • 0 likes
  • 3 in conversation