Hello everyone,

I'm pretty stuck on a question in my practice exam; any help explaining it would be very much appreciated.

I've been given 2 data sets, cert.input08a and cert.input08b. Both data sets contain a common numeric variable named ID. I'm not sure how else to provide the data, but the variables in cert.input08a are:

exa1 through exa22 (i.e. exa1, exa2, exa3, etc.), ValA, J, ID

while cert.input08b has the variables:

exb1 through exb45 (i.e. exb1, exb2, exb3, etc.), excess1 through excess45 (i.e. excess1, excess2, excess3, etc.), ValB, X, ID

With this, I've been tasked to write a program that uses a SAS DATA step to:

- Combine the 2 tables by matching values of the ID variable.
- Write only the observations that are in both data sets to a new data set named results.match08.
- Write all other non-matching observations from either data set to a new data set named results.nomatch08.
- Exclude all variables that begin with "ex" from results.nomatch08.

I started by sorting both data sets by the common variable:

proc sort data = cert.input08a out=sorted_input08a ;
  by ID ;
run ;

proc sort data = cert.input08b out=sorted_input08b ;
  by ID ;
run ;

Then I wrote a DATA step to merge them using the now-sorted tables:

data results.match08 results.nomatch08 ;
  merge sorted_input08a (in=inPut8a) sorted_input08b (in=inPut8b) ;
  by ID ;
  if inPut8a = 1 and inPut8b = 1 then output results.match08 ;
  else output results.nomatch08 ;
run ;

When I run this code (the DATA step, after running the sort steps), I receive no error in the log, but I do get a note that isn't familiar: "MERGE statement has more than one data set with repeats of BY values."

The output for results.nomatch08 is 2 rows and 117 columns, while results.match08 is 1200 rows and 117 columns.
input08a.sas7bdat originally has 1200 rows and 25 columns.
input08b.sas7bdat originally has 1202 rows and 93 columns.

Also, the only variable the two data sets have in common is ID, so shouldn't the new data set results.match08 have only ID as a column?

One last question: to exclude variables that begin with "ex" from results.nomatch08, do you use a drop= option? Like this?

data results.match08 results.nomatch08 (drop = "ex%") ;

Thanks in advance for the help,
Alexander