Hello everyone! I'm using SAS version 9.4. I have two data sets with overlapping participants, and I have already tried importing them two different ways (PROC IMPORT and an INFILE statement) to try to solve my problem. I am confused as to why I'm getting the following errors, since when I use PROC CONTENTS and PROC PRINT I get all 233 and 860 observations along with all of the variables that are supposed to be there.

NOTE: Updated analytical products:
      SAS/STAT 14.1
      SAS/ETS 14.1
      SAS/OR 14.1
      SAS/IML 14.1
      SAS/QC 14.1
NOTE: Additional host information:
      X64_8HOME WIN 6.2.9200 Workstation
NOTE: SAS initialization used:
      real time 0.71 seconds
      cpu time 0.62 seconds

1   /* IMPORT CSV FILE*/
2   data WORK.HCVSTEP;
3   infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVSTEP_LINKED.CSV' firstobs=2 dlm=',' dsd ;
4   length SEQN 8 LINKED $3 AGECD 8 GENDER $4 RACE $5 IDUCD $3 HOUSECD $50 PCPCD $3 INSURCD $3
5   EDUCD $50 INCCD $3 PRIMLANG $20 RAPIDCD $20 VIRLD $20 REFTO1 $50 REFTO2 $50 COINFCD $3 VIRLDB 8 INSTYPE $50;
6   input SEQN LINKED $ AGECD GENDER $ RACE $ IDUCD $ HOUSECD $ PCPCD $ INSURCD $
7   EDUCD $ INCCD $ PRIMLANG $ RAPIDCD $ VIRLD $ REFTO1 $ REFTO2 $ COINFCD $ VIRLDB INSTYPE $;
8   label SEQN = 'Sequence Number'
9   LINKED = " Linked to Hepatitis C Care"
10  AGECD = 'Age'
11  GENDER = 'Gender'
12  RACE = 'Race'
13  IDUCD = 'Injection Drug Use'
14  HOUSECD = 'Housing Status'
15  PCPCD = 'Primary Care Provider'
16  INSURCD = 'Insurance Status'
17  EDUCD = 'Education'
18  INCCD = 'Income'
19  PRIMLANG = 'Primary Language'
20  RAPIDCD = 'Rapid Test Result'
21  VIRLD = ' Viral Load'
22  REFTO1 = 'First Location Referred to'
23  REFTO2 = ' Second Location Referred to'
24  COINFCD = 'Co-infection with HIV'
25  VIRLDB = 'Viral Load at Baseline'
26  INSTYPE = 'Insurance Type';
27  run;

NOTE: The infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVSTEP_LINKED.CSV' is:
      Filename=C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVSTEP_LINKED.CSV,
      RECFM=V,LRECL=32767,File Size (bytes)=24443,
      Last Modified=06Mar2019:14:53:25,
      Create Time=26Feb2019:02:42:54
NOTE: 233 records were read from the infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVSTEP_LINKED.CSV'.
      The minimum record length was 49.
      The maximum record length was 141.
NOTE: The data set WORK.HCVSTEP has 233 observations and 19 variables.
NOTE: DATA statement used (Total process time):
      real time 0.04 seconds
      cpu time 0.04 seconds

28  /* IMPORT CSV FILE*/
29  data WORK.HVCFLOW2;
30  infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVFLOW2.CSV' firstobs=2 dlm=',' dsd ;
31  length SEQN 8 LINKED $3 AGECD 8 GENDER $4 RACE $5 IDUCD $3 HOUSECD $50 PCPCD $3 INSURCD $3
32  EDUCD $50 INCCD $3 PRIMLANG $20 RAPIDCD $20 VIRLD $20 REFTO1 $50 REFTO2 $50 COINFCD $3 VIRLDB 8;
33  input SEQN LINKED $ AGECD GENDER $ RACE $ IDUCD $ HOUSECD $ PCPCD $ INSURCD $
34  EDUCD $ INCCD $ PRIMLANG $ RAPIDCD $ VIRLD $ REFTO1 $ REFTO2 $ COINFCD $ VIRLDB;
35  label SEQN = 'Sequence Number'
36  LINKED = " Linked to Hepatitis C Care"
37  AGECD = 'Age'
38  GENDER = 'Gender'
39  RACE = 'Race'
40  IDUCD = 'Injection Drug Use'
41  HOUSECD = 'Housing Status'
42  PCPCD = 'Primary Care Provider'
43  INSURCD = 'Insurance Status'
44  EDUCD = 'Education'
45  INCCD = 'Income'
46  PRIMLANG = 'Primary Language'
47  RAPIDCD = 'Rapid Test Result'
48  VIRLD = ' Viral Load'
49  REFTO1 = 'First Location Referred to'
50  REFTO2 = ' Second Location Referred to'
51  COINFCD = 'Co-infection with HIV'
52  VIRLDB = 'Viral Load at Baseline'
53  INSTYPE = 'Insurance Type';
54  run;

NOTE: Variable INSTYPE is uninitialized.
NOTE: The infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVFLOW2.CSV' is:
      Filename=C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVFLOW2.CSV,
      RECFM=V,LRECL=32767,File Size (bytes)=73995,
      Last Modified=06Mar2019:14:52:56,
      Create Time=03Mar2019:00:09:17
NOTE: 860 records were read from the infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVFLOW2.CSV'.
      The minimum record length was 37.
      The maximum record length was 128.
NOTE: The data set WORK.HVCFLOW2 has 860 observations and 18 variables.
NOTE: DATA statement used (Total process time):
      real time 0.04 seconds
      cpu time 0.04 seconds

55  /* RENAME DATA SET */
56  data HCVSTEP;
57  set WORK.HCVSTEP;
58  run;

NOTE: There were 233 observations read from the data set WORK.HCVSTEP.
NOTE: The data set WORK.HCVSTEP has 233 observations and 19 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

59
60  /* RENAME DATA SET */
61  data HCVFLOW2;
62  set WORK.HVCFLOW2;
63  run;

NOTE: There were 860 observations read from the data set WORK.HVCFLOW2.
NOTE: The data set WORK.HCVFLOW2 has 860 observations and 18 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

64  /* CHECK DATA SET */
65  proc contents data = HCVSTEP;
NOTE: Writing HTML Body file: sashtml.htm
66  run;

NOTE: PROCEDURE CONTENTS used (Total process time):
      real time 0.32 seconds
      cpu time 0.28 seconds

67  proc print data = HCVSTEP label;
68  run;

NOTE: There were 233 observations read from the data set WORK.HCVSTEP.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.14 seconds
      cpu time 0.12 seconds

69  /* CHECK DATA SET */
70  proc contents data = HCVFLOW2;
71  run;

NOTE: PROCEDURE CONTENTS used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds

72  proc print data = HCVFLOW2 label;
73  run;

NOTE: There were 860 observations read from the data set WORK.HCVFLOW2.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.42 seconds
      cpu time 0.42 seconds

74  proc sort data= HCVSTEP; by SEQN; run;

NOTE: There were 233 observations read from the data set WORK.HCVSTEP.
NOTE: The data set WORK.HCVSTEP has 233 observations and 19 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds

75  proc format;
76  value GENDER
77  1 = 'Male'
78  2 = 'Female'
79  3 = 'MTF'
80  4 = 'FTM';
NOTE: Format GENDER has been output.
81  value RACE
82  1 = 'Latino'
83  2 = 'White'
84  3 = 'African American'
85  4 = 'Native American'
86  5 = 'Mixed'
87  6 = 'Unknown';
NOTE: Format RACE has been output.
88  value AGECAT
89  1 = '18 - 29'
90  2 = '30 - 39'
91  3 = '40 - 49'
92  4 = '50 - 59'
93  5 = 'Greater than 60';
NOTE: Format AGECAT has been output.
94  value agedi
95  1 = 'Younger'
96  2 = 'Older';
NOTE: Format AGEDI has been output.
97  value VIRLDB
98  1 = 'LOW' /* IF LESS THAN 800,000 IU/ml*/
99  2 = 'HIGH';
NOTE: Format VIRLDB has been output.
99 ! /* IF MORE THAN 800,000 IU/ml*/
100 run;

NOTE: PROCEDURE FORMAT used (Total process time):
      real time 0.04 seconds
      cpu time 0.04 seconds

101 DATA HCVSTEP;
102 if AGECD <18 and AGECD ge 100 then delete;
103 /*RENAMING VARIABLES FOR EASE OF USE*/
104 rename AGECD=AGE IDUCD= IDU HOUSECD= HOUSING
105 PCPCD= PRIMARYCARE INSURCD= INSURANCE EDUCD= EDUCATION INCCD= INCOME
106 PRIMLANG= LANGUAGE RAPIDCD= RAPID COINFCD= COINFECTION;
107 /*MAKING AGE GROUPS*/
108 if AGECD ge 18 AND AGECD < 30 then AGECAT=1;
109 else if AGECD ge 30 AND AGECD < 40 then AGECAT=2;
110 else if AGECD ge 40 AND AGECD < 50 then AGECAT=3;
111 else if AGECD ge 50 AND AGECD < 60 then AGECAT=4;
112 else if AGECD ge 60 then AGECAT=5;
113 if AGECD le 50 then AGEDI=1;
114 else AGEDI=2;
115 /*FORMATTING VARIABLES*/
116 /*MAKING GENDER GROUPS*/
117 if GENDER = 'M' then GENDCAT= 1;
118 else if GENDER = 'F' then GENDCAT=2;
119 else if GENDER = 'MTF' then GENDCAT=3;
120 else if GENDER = 'FTM' then GENDCAT=4;
121 if VIRLDB < 800000 then VIRLDB=1;
122 else if VIRLDB > 800000 then VIRLDB=2;
123 keep SEQN AGE AGECAT AGEDI GENDER RACE IDU HOUSING PRIMARYCARE INSURANCE
124 EDUCATION INCOME LANGUAGE RAPID COINFECTION VIRLD REFTO1 REFTO2 VIRLDB INSTYPE;
125 run;

NOTE: Variable AGECD is uninitialized.
NOTE: Variable GENDER is uninitialized.
WARNING: The variable SEQN in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable AGE in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable RACE in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable IDU in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable HOUSING in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable PRIMARYCARE in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable INSURANCE in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable EDUCATION in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable INCOME in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable LANGUAGE in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable RAPID in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable COINFECTION in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable VIRLD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable REFTO1 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable REFTO2 in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable INSTYPE in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable AGECD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable IDUCD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable HOUSECD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable PCPCD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable INSURCD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable EDUCD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable INCCD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable PRIMLANG in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable RAPIDCD in the DROP, KEEP, or RENAME list has never been referenced.
WARNING: The variable COINFCD in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: The data set WORK.HCVSTEP has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time 0.04 seconds
      cpu time 0.03 seconds

The code that I'm using is the following:

/* IMPORT CSV FILE*/
data WORK.HCVSTEP;
    infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVSTEP_LINKED.CSV' firstobs=2 dlm=',' dsd ;
    length SEQN 8 LINKED $3 AGECD 8 GENDER $4 RACE $5 IDUCD $3 HOUSECD $50 PCPCD $3 INSURCD $3
           EDUCD $50 INCCD $3 PRIMLANG $20 RAPIDCD $20 VIRLD $20 REFTO1 $50 REFTO2 $50 COINFCD $3 VIRLDB 8 INSTYPE $50;
    input SEQN LINKED $ AGECD GENDER $ RACE $ IDUCD $ HOUSECD $ PCPCD $ INSURCD $
          EDUCD $ INCCD $ PRIMLANG $ RAPIDCD $ VIRLD $ REFTO1 $ REFTO2 $ COINFCD $ VIRLDB INSTYPE $;
    label SEQN = 'Sequence Number'
          LINKED = " Linked to Hepatitis C Care"
          AGECD = 'Age'
          GENDER = 'Gender'
          RACE = 'Race'
          IDUCD = 'Injection Drug Use'
          HOUSECD = 'Housing Status'
          PCPCD = 'Primary Care Provider'
          INSURCD = 'Insurance Status'
          EDUCD = 'Education'
          INCCD = 'Income'
          PRIMLANG = 'Primary Language'
          RAPIDCD = 'Rapid Test Result'
          VIRLD = ' Viral Load'
          REFTO1 = 'First Location Referred to'
          REFTO2 = ' Second Location Referred to'
          COINFCD = 'Co-infection with HIV'
          VIRLDB = 'Viral Load at Baseline'
          INSTYPE = 'Insurance Type';
run;

/* IMPORT CSV FILE*/
data WORK.HVCFLOW2;
    infile 'C:\Users\Tanya Renteria\Desktop\ILE EVERYTHING\DATABASES FOR ILE\HCVFLOW2.CSV' firstobs=2 dlm=',' dsd ;
    length SEQN 8 LINKED $3 AGECD 8 GENDER $4 RACE $5 IDUCD $3 HOUSECD $50 PCPCD $3 INSURCD $3
           EDUCD $50 INCCD $3 PRIMLANG $20 RAPIDCD $20 VIRLD $20 REFTO1 $50 REFTO2 $50 COINFCD $3 VIRLDB 8;
    input SEQN LINKED $ AGECD GENDER $ RACE $ IDUCD $ HOUSECD $ PCPCD $ INSURCD $
          EDUCD $ INCCD $ PRIMLANG $ RAPIDCD $ VIRLD $ REFTO1 $ REFTO2 $ COINFCD $ VIRLDB;
    label SEQN = 'Sequence Number'
          LINKED = " Linked to Hepatitis C Care"
          AGECD = 'Age'
          GENDER = 'Gender'
          RACE = 'Race'
          IDUCD = 'Injection Drug Use'
          HOUSECD = 'Housing Status'
          PCPCD = 'Primary Care Provider'
          INSURCD = 'Insurance Status'
          EDUCD = 'Education'
          INCCD = 'Income'
          PRIMLANG = 'Primary Language'
          RAPIDCD = 'Rapid Test Result'
          VIRLD = ' Viral Load'
          REFTO1 = 'First Location Referred to'
          REFTO2 = ' Second Location Referred to'
          COINFCD = 'Co-infection with HIV'
          VIRLDB = 'Viral Load at Baseline'
          INSTYPE = 'Insurance Type';
run;

/* RENAME DATA SET */
data HCVSTEP;
    set WORK.HCVSTEP;
run;

/* RENAME DATA SET */
data HCVFLOW2;
    set WORK.HVCFLOW2;
run;

/* CHECK DATA SET */
proc contents data = HCVSTEP;
run;
proc print data = HCVSTEP label;
run;

/* CHECK DATA SET */
proc contents data = HCVFLOW2;
run;
proc print data = HCVFLOW2 label;
run;

proc sort data= HCVSTEP; by SEQN; run;
proc sort data= HCVFLOW2; by SEQN; run;

proc format;
    value GENDER
        1 = 'Male'
        2 = 'Female'
        3 = 'MTF'
        4 = 'FTM';
    value RACE
        1 = 'Latino'
        2 = 'White'
        3 = 'African American'
        4 = 'Native American'
        5 = 'Mixed'
        6 = 'Unknown';
    value AGECAT
        1 = '18 - 29'
        2 = '30 - 39'
        3 = '40 - 49'
        4 = '50 - 59'
        5 = 'Greater than 60';
    value agedi
        1 = 'Younger'
        2 = 'Older';
    value VIRLDB
        1 = 'LOW'    /* IF LESS THAN 800,000 IU/ml*/
        2 = 'HIGH';  /* IF MORE THAN 800,000 IU/ml*/
run;

DATA HCVSTEP;
    if AGECD <18 and AGECD ge 100 then delete;
    /*RENAMING VARIABLES FOR EASE OF USE*/
    rename AGECD=AGE IDUCD= IDU HOUSECD= HOUSING
           PCPCD= PRIMARYCARE INSURCD= INSURANCE EDUCD= EDUCATION INCCD= INCOME
           PRIMLANG= LANGUAGE RAPIDCD= RAPID COINFCD= COINFECTION;
    /*MAKING AGE GROUPS*/
    if AGECD ge 18 AND AGECD < 30 then AGECAT=1;
    else if AGECD ge 30 AND AGECD < 40 then AGECAT=2;
    else if AGECD ge 40 AND AGECD < 50 then AGECAT=3;
    else if AGECD ge 50 AND AGECD < 60 then AGECAT=4;
    else if AGECD ge 60 then AGECAT=5;
    if AGECD le 50 then AGEDI=1;
    else AGEDI=2;
    /*FORMATTING VARIABLES*/
    /*MAKING GENDER GROUPS*/
    if GENDER = 'M' then GENDCAT= 1;
    else if GENDER = 'F' then GENDCAT=2;
    else if GENDER = 'MTF' then GENDCAT=3;
    else if GENDER = 'FTM' then GENDCAT=4;
    if VIRLDB < 800000 then VIRLDB=1;
    else if VIRLDB > 800000 then VIRLDB=2;
    keep SEQN AGE AGECAT AGEDI GENDER RACE IDU HOUSING PRIMARYCARE INSURANCE
         EDUCATION INCOME LANGUAGE RAPID COINFECTION VIRLD REFTO1 REFTO2 VIRLDB INSTYPE;
run;

I don't understand how I'm ending up with 1 observation and 4 variables when everything was there when I checked the data sets before the last DATA step. Any help will be greatly appreciated. Thank you in advance.
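
One possible reading of the log, offered as a sketch rather than a confirmed fix: the last DATA step has no SET statement, so it never reads WORK.HCVSTEP. A DATA step with no input runs once with every variable missing, which would match both the "uninitialized" notes and the single observation. The version below assumes a SET statement was intended. The output name HCVSTEP_CLEAN is hypothetical, and the age filter uses OR in place of AND, on the assumption that the goal was to drop ages outside 18 to 99 (no value can be both under 18 and 100 or over).

```sas
/* Sketch only: assumes the final step was meant to read WORK.HCVSTEP. */
/* HCVSTEP_CLEAN is a hypothetical output name, not from the post.     */
data HCVSTEP_CLEAN;
    set HCVSTEP;                                  /* read the imported observations */

    /* Assumed intent: drop ages outside 18-99. A missing AGECD compares */
    /* as less than 18 in SAS, so rows with a missing age are dropped too. */
    if AGECD < 18 or AGECD ge 100 then delete;

    /* Age groups; AGECD is known to be in [18, 100) at this point */
    if      AGECD < 30 then AGECAT = 1;
    else if AGECD < 40 then AGECAT = 2;
    else if AGECD < 50 then AGECAT = 3;
    else if AGECD < 60 then AGECAT = 4;
    else                    AGECAT = 5;
run;
```

If the RENAME and KEEP statements are added back to this step, it may be worth checking the KEEP list: KEEP is applied before RENAME in a DATA step, so the list would need the original names (AGECD, IDUCD, and so on) rather than the new ones.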