BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
leehsin
Quartz | Level 8

Hi, I encountered an issue when using the following code to get a combined data xy from data x and y:

 

data sdp;
input name $ score1	score2	gender $ stu_id	score4;
datalines;
Tim	88	80	m	1	98
Sharon	95	90	f	2	90
Sean	45	78	m	3	.
Mike	89	.	m	4	.
Tina	54	69	f	5	60
Sara	67	78	f	6	80
Noah	89	78	m	12	85
;;;

data sda;
input name $ score1	score2	score3	gender $ stu_id;
datalines;
Tim	88	80	99	m	1
Sharon	95	90	90	f	2
Sean	45	78	56	m	3
Mike	89	.	77	m	4
Tina	54	69	69	f	5
Sara	67	78	74	f	6
Sam	98	82	78	m	7
James	.	96	89	m	8
Jaden	98	78	89	m	9
Jack	69	79	70	m	11
Will	92	89	98	m	10
;;;

data x (keep = stu_id name score4 );
set sdp;
run;

data y (index=(Sid_name=(stu_id name)));
set sda;
run;

data xy;
   set x;
   set y key=Sid_name;
run;

In the final data after combing, xy, the last record (#7) which has no matching record in y initially has no values for score1- score3 and gender, but in the combined date xy it comes with values and the same values for columns score1- score3 and gender are the same as the record #6. What could be the reason? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The first set statement brings in ALL the variables from the data set X.

So those values are all ready there when the second Set takes place. If the second does not have values to replace the values of variables already present guess what, they stay.

 

You do note that your LOG shows a data error don't you?

93   data xy;
94      set x;
95      set y key=Sid_name;
96   run;

name=Noah stu_id=12 score4=85 score1=67 score2=78 score3=74 gender=f _ERROR_=1 _IORC_=1230015
_N_=7

The question is what do you expect for a result.

 

This uses the error, with a very big assumption, to set the scores to missing. There will still be an error message though.

data xy;
   set x ;
   set y key=Sid_name ;
   if _error_ then call missing(of score1-score3);
run;

View solution in original post

3 REPLIES 3
Reeza
Super User
You may want to reformat your question. Remember to put code in a code box.
ballardw
Super User

The first set statement brings in ALL the variables from the data set X.

So those values are all ready there when the second Set takes place. If the second does not have values to replace the values of variables already present guess what, they stay.

 

You do note that your LOG shows a data error don't you?

93   data xy;
94      set x;
95      set y key=Sid_name;
96   run;

name=Noah stu_id=12 score4=85 score1=67 score2=78 score3=74 gender=f _ERROR_=1 _IORC_=1230015
_N_=7

The question is what do you expect for a result.

 

This uses the error, with a very big assumption, to set the scores to missing. There will still be an error message though.

data xy;
   set x ;
   set y key=Sid_name ;
   if _error_ then call missing(of score1-score3);
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 426 views
  • 1 like
  • 3 in conversation