Hi, I encountered an issue when using the following code to get a combined data xy from data x and y:
data sdp; input name $ score1 score2 gender $ stu_id score4; datalines; Tim 88 80 m 1 98 Sharon 95 90 f 2 90 Sean 45 78 m 3 . Mike 89 . m 4 . Tina 54 69 f 5 60 Sara 67 78 f 6 80 Noah 89 78 m 12 85 ;;; data sda; input name $ score1 score2 score3 gender $ stu_id; datalines; Tim 88 80 99 m 1 Sharon 95 90 90 f 2 Sean 45 78 56 m 3 Mike 89 . 77 m 4 Tina 54 69 69 f 5 Sara 67 78 74 f 6 Sam 98 82 78 m 7 James . 96 89 m 8 Jaden 98 78 89 m 9 Jack 69 79 70 m 11 Will 92 89 98 m 10 ;;; data x (keep = stu_id name score4 ); set sdp; run; data y (index=(Sid_name=(stu_id name))); set sda; run; data xy; set x; set y key=Sid_name; run;
In the final data after combing, xy, the last record (#7) which has no matching record in y initially has no values for score1- score3 and gender, but in the combined date xy it comes with values and the same values for columns score1- score3 and gender are the same as the record #6. What could be the reason? Thanks!
The first set statement brings in ALL the variables from the data set X.
So those values are all ready there when the second Set takes place. If the second does not have values to replace the values of variables already present guess what, they stay.
You do note that your LOG shows a data error don't you?
93 data xy; 94 set x; 95 set y key=Sid_name; 96 run; name=Noah stu_id=12 score4=85 score1=67 score2=78 score3=74 gender=f _ERROR_=1 _IORC_=1230015 _N_=7
The question is what do you expect for a result.
This uses the error, with a very big assumption, to set the scores to missing. There will still be an error message though.
data xy; set x ; set y key=Sid_name ; if _error_ then call missing(of score1-score3); run;
The first set statement brings in ALL the variables from the data set X.
So those values are all ready there when the second Set takes place. If the second does not have values to replace the values of variables already present guess what, they stay.
You do note that your LOG shows a data error don't you?
93 data xy; 94 set x; 95 set y key=Sid_name; 96 run; name=Noah stu_id=12 score4=85 score1=67 score2=78 score3=74 gender=f _ERROR_=1 _IORC_=1230015 _N_=7
The question is what do you expect for a result.
This uses the error, with a very big assumption, to set the scores to missing. There will still be an error message though.
data xy; set x ; set y key=Sid_name ; if _error_ then call missing(of score1-score3); run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.