Hello all! SAS noob here. I have a dataset with thousands of observations. There are quite a few variables, e.g. Group, Subgroup, Title. There is one variable called the "Code". I have a list of about 1000 codes that I want to include in the dataset. I would like to create a new dataset in which all the "undesirable" observations (e.g. the observations whose codes are NOT on my list) are deleted. How can I go about this?
Thank you so much! Any help or tips would be greatly appreciated.
A simple SQL query will do that
proc sql;
create table validData as
select * from myData
where Code in (select Code from myCodes);
quit;
PG
A simple SQL query will do that
proc sql;
create table validData as
select * from myData
where Code in (select Code from myCodes);
quit;
PG
Hello! Thanks so much for replying. This is what the SAS log says when I run your proc sql code:
6852 proc sql;
6853 create table validData as
6854 select * from LIBRARY.NOCTRY4
6855 where CODE in (select CODE from NOC75);
NOTE: Table WORK.VALIDDATA created, with 1614 rows and 13 columns.
6856 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
The only problem is that the dataset NOC75 (which is the list of codes I want to include) has only 1106 observations, but the new table "validData" has 1614 observations. The original data set, NOCTRY4, has 1666 observations, so I can infer that not all of the undesirable NOCs were not deleted.
Did you actually verify your input and output results to confirm the inference? Possibly there are/were duplicates? Consider using PROC FREQ to analyze both files to confirm as desk-checking the results, based on some known input.
There must exist many records in LIBRARY.NOCTRY4 that share the same CODE. To find them out, try running :
proc sql;
create table duplicates as
select *, count(CODE) as n
from LIBRARY.NOCTRY4
group by CODE
having count(CODE) > 1;
quit;
PG
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.