Hello all, I am about intermediate level with SAS programming and beginner level with SQL. I am having difficulty subsetting my dataset by certain criteria. The dataset contains patients who have had cancer either once or multiple times. However, if the patients FIRST cancer was not pancreatic, then they need to be deleted. This part I have done. The second step is to get rid of the remaining patients who are in the dataset with any type of cancer, but never had a first cancer listed as pancreatic cancer. One goal of this project is to take a look at patients who had pancreatic cancer as their first cancer, and for those who have had multiple cancers, find out what other types of cancers they got after their pancreatic cancer diagnosis. One way I was trying to do this was by Patient_ID, but I am not aware of any DISTINCT equivalency statements in SAS and I don't know SQL that well. Below is the code I used to get rid of those whose first cancer was not pancreatic. It is also subset by histology type as I do not want patients with certain histological types of pancreatic cancer: DATA pancreas_temp1; SET pancreas;
IF Histologic_Type_ICD_O_3 not in (8150, 8151, 8152, 8153, 8155, 8240, 8241, 8246)
AND
Sequence_number IN ("One primary only", "1st of 2 or more primaries")
then DELETE;
IF Primary_Site___labeled not =: 'C25'
AND
Sequence_number IN ("One primary only", "1st of 2 or more primaries")
then DELETE;
RUN; Here is a sample of my dataset. I have sorted it by Patient_ID which is what revealed the problem - patients with no "Only one primary" or "1st of 2 or more primaries" are in this dataset: Patient_ID Sequence_number Primary_site__labeled Histologic_Type_ICD_O_3 16 2nd of 2 or more primaries C20.9 - Rectum, NOS 8140 18 2nd of 2 or more primaries C38.1-Anterior mediastinum 8801 22 2nd of 2 or more primaries C42.1-Bone marrow 9732 22 3rd of 3 or more primaries C50.9-Breast, NOS 8500 25 One primary only C25.2-Tail of pancreas 8246 31 4th of 4 or more primaries C50.4-Upper-outer quadrant of breast 8500 31 5th of 5 or more primaries C34.1-Upper lobe, lung 8070 32 1st of 2 or more primaries C25.0-Head of pancreas 8155 Here is what I want: Patient_ID Sequence_number Primary_site__labeled Histologic_Type_ICD_O_3 25 One primary only C25.2-Tail of pancreas 8246 32 1st of 2 or more primaries C25.0-Head of pancreas 8155 32 2nd of 2 or more primaries C64.9-Kidney, NOS 8312 Patients 16, 18, 22, and 31 need to go because it is apparent that their first cancer is not listed so we can't tell if it was pancreatic or not. Any help is appreciated. Thank you!
... View more