Hi All,
Please tell me what is wrong with this code.
DATA satratha.Merged1718_Q;
SET satratha.Merged1718_P;
IF HIV_EIABLOOD ="N" THEN GISP_HIVRESULT ="0";
IF HIV_EIABLOOD ="P" THEN GISP_HIVRESULT ="1";
IF HIV_EIABLOOD ="" THEN GISP_HIVRESULT ="7";
IF HIV_PCRN_A ="N" THEN GISP_HIVRESULT ="0";
IF HIV_PCRN_A ="P" THEN GISP_HIVRESULT ="1";
IF HIV_PCRN_A ="" THEN GISP_HIVRESULT ="7";
IF HIV_1_2_ABBLOOD ="N" THEN GISP_HIVRESULT ="0";
IF HIV_1_2_ABBLOOD ="P" THEN GISP_HIVRESULT ="1";
IF HIV_1_2_ABBLOOD ="" THEN GISP_HIVRESULT ="7";
IF HIV_EIA_RPBLOOD ="N" THEN GISP_HIVRESULT ="0";
IF HIV_EIA_RPBLOOD ="P" THEN GISP_HIVRESULT ="1";
IF HIV_EIA_RPBLOOD ="" THEN GISP_HIVRESULT ="7";
IF HIV_COMBON_A ="N" THEN GISP_HIVRESULT ="0";
IF HIV_COMBON_A ="P" THEN GISP_HIVRESULT ="1";
IF HIV_COMBON_A ="" THEN GISP_HIVRESULT ="7";
IF EIAN_A ="N" THEN GISP_HIVRESULT ="0";
IF EIAN_A ="P" THEN GISP_HIVRESULT ="1";
IF EIAN_A ="" THEN GISP_HIVRESULT ="7";
RUN;
PROC FREQ DATA=satratha.Merged1718_Q;
TABLE EIAN_A HIV_EIABLOOD HIV_PCRN_A HIV_1_2_ABBLOOD HIV_EIA_RPBLOOD HIV_COMBON_A GISP_HIVRESULT;
RUN;
Log says:
DATA satratha.Merged1718_Q;
1902 SET satratha.Merged1718_P;
1903 IF HIV_EIABLOOD ="N" THEN GISP_HIVRESULT ="0";
1904 IF HIV_EIABLOOD ="P" THEN GISP_HIVRESULT ="1";
1905 IF HIV_EIABLOOD ="" THEN GISP_HIVRESULT ="7";
1906 IF HIV_PCRN_A ="N" THEN GISP_HIVRESULT ="0";
1907 IF HIV_PCRN_A ="P" THEN GISP_HIVRESULT ="1";
1908 IF HIV_PCRN_A ="" THEN GISP_HIVRESULT ="7";
1909 IF HIV_1_2_ABBLOOD ="N" THEN GISP_HIVRESULT ="0";
1910 IF HIV_1_2_ABBLOOD ="P" THEN GISP_HIVRESULT ="1";
1911 IF HIV_1_2_ABBLOOD ="" THEN GISP_HIVRESULT ="7";
1912 IF HIV_EIA_RPBLOOD ="N" THEN GISP_HIVRESULT ="0";
1913 IF HIV_EIA_RPBLOOD ="P" THEN GISP_HIVRESULT ="1";
1914 IF HIV_EIA_RPBLOOD ="" THEN GISP_HIVRESULT ="7";
1915 IF HIV_COMBON_A ="N" THEN GISP_HIVRESULT ="0";
1916 IF HIV_COMBON_A ="P" THEN GISP_HIVRESULT ="1";
1917 IF HIV_COMBON_A ="" THEN GISP_HIVRESULT ="7";
1918 IF EIAN_A ="N" THEN GISP_HIVRESULT ="0";
1919 IF EIAN_A ="P" THEN GISP_HIVRESULT ="1";
1920 IF EIAN_A ="" THEN GISP_HIVRESULT ="7";
1921 RUN;
NOTE: There were 9346 observations read from the data set SATRATHA.MERGED1718_P.
NOTE: The data set SATRATHA.MERGED1718_Q has 9346 observations and 190 variables.
NOTE: DATA statement used (Total process time):
real time 1.40 seconds
cpu time 0.24 seconds
1922 PROC FREQ DATA=satratha.Merged1718_Q;
1923 TABLE EIAN_A HIV_EIABLOOD HIV_PCRN_A HIV_1_2_ABBLOOD HIV_EIA_RPBLOOD HIV_COMBON_A
1923! GISP_HIVRESULT;
1924 RUN;
NOTE: There were 9346 observations read from the data set SATRATHA.MERGED1718_Q.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.94 seconds
cpu time 0.04 seconds
The results show:
EIAN_A | Frequency | Percent | Cumulative Frequency | Cumulative Percent
N | 70 | 85.37 | 70 | 85.37
P | 12 | 14.63 | 82 | 100.00
Frequency Missing = 9264
7 | 100.00 | 7 | 100.00 |
4 | 100.00 | 4 | 100.00 |
864 | 99.31 | 864 | 99.31 |
6 | 0.69 | 870 | 100.00 |
70 | 0.75 | 70 | 0.75 |
12 | 0.13 | 82 | 0.88 |
9264 | 99.12 | 9346 | 100.00 |
There's nothing wrong with the code. But your data is complete garbage.
Two of your variables have no values on any observation: HIV_EIABLOOD and HIV_PCRN_A.
Your other variables contain unexpected values that are not covered by the IF THEN statements, so they are likely wrong as well.
Fix the data.
I don't see an actual question.
I can see potential problems with your coding for the variable though. Consider this bit of your code:
IF HIV_EIABLOOD ="N" THEN GISP_HIVRESULT ="0";
IF HIV_EIABLOOD ="P" THEN GISP_HIVRESULT ="1";
IF HIV_EIABLOOD ="" THEN GISP_HIVRESULT ="7";
IF HIV_PCRN_A ="N" THEN GISP_HIVRESULT ="0";
IF HIV_PCRN_A ="P" THEN GISP_HIVRESULT ="1";
IF HIV_PCRN_A ="" THEN GISP_HIVRESULT ="7";
If your HIV_EIABLOOD is "P" and your HIV_PCRN_A is "" (missing) what do you want for the GISP_HIVRESULT?
Your code as currently written will have '7', at least at the end of those 6 lines of code with that condition.
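To see the overwrite in isolation, here is a minimal sketch on a made-up one-row data set. The data set name work.overwrite_demo and the two values are assumptions for illustration only, not taken from your data.

```sas
/* Hypothetical one-row example: HIV_EIABLOOD is "P", HIV_PCRN_A is missing */
DATA work.overwrite_demo;
  LENGTH HIV_EIABLOOD HIV_PCRN_A GISP_HIVRESULT $1;
  HIV_EIABLOOD = "P";
  HIV_PCRN_A   = "";

  /* Independent IF statements: every one is evaluated, the last true one wins */
  IF HIV_EIABLOOD = "N" THEN GISP_HIVRESULT = "0";
  IF HIV_EIABLOOD = "P" THEN GISP_HIVRESULT = "1";  /* sets "1" */
  IF HIV_EIABLOOD = ""  THEN GISP_HIVRESULT = "7";
  IF HIV_PCRN_A   = "N" THEN GISP_HIVRESULT = "0";
  IF HIV_PCRN_A   = "P" THEN GISP_HIVRESULT = "1";
  IF HIV_PCRN_A   = ""  THEN GISP_HIVRESULT = "7";  /* overwrites "1" with "7" */

  PUT GISP_HIVRESULT=;  /* writes GISP_HIVRESULT=7 to the log */
RUN;
```

The positive result from HIV_EIABLOOD is lost because the later check on HIV_PCRN_A runs regardless of what was assigned before it.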
You should clearly state what the recode rules are and whether there is a hierarchy. If a value has been assigned based on HIV_EIABLOOD, do you actually want to reassign it based on the other variable(s)? Or do you want to assign '0' when any of HIV_EIABLOOD, HIV_PCRN_A, HIV_1_2_ABBLOOD, HIV_EIA_RPBLOOD, HIV_COMBON_A or EIAN_A is "N"?
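Once the rule is written down, an array can express an "any of these variables" rule without repeating the same IF for every test. The sketch below assumes one possible hierarchy that is NOT confirmed anywhere in this thread: any "P" gives '1', otherwise any "N" gives '0', otherwise '7'. The output name work.hiv_any_rule and the temporary variables _i, _any_p and _any_n are hypothetical, and all six test variables are assumed to be character.

```sas
/* Sketch only: the precedence (P over N over missing) is an assumed rule */
DATA work.hiv_any_rule;
  SET satratha.Merged1718_P;
  ARRAY tests {*} HIV_EIABLOOD HIV_PCRN_A HIV_1_2_ABBLOOD
                  HIV_EIA_RPBLOOD HIV_COMBON_A EIAN_A;

  /* Flag whether any test is "P" or "N" on this observation */
  _any_p = 0;
  _any_n = 0;
  DO _i = 1 TO DIM(tests);
    IF tests{_i} = "P" THEN _any_p = 1;
    ELSE IF tests{_i} = "N" THEN _any_n = 1;
  END;

  /* Apply the assumed hierarchy exactly once */
  IF _any_p THEN GISP_HIVRESULT = "1";
  ELSE IF _any_n THEN GISP_HIVRESULT = "0";
  ELSE GISP_HIVRESULT = "7";

  DROP _i _any_p _any_n;
RUN;
```

If your rule is different (for example, "N" should win, or one test should take priority over the others), only the final IF / ELSE IF block needs to change.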
What @ballardw said, but I also wanted to add that you should probably use ELSE IF.
IF HIV_EIABLOOD ="N" THEN GISP_HIVRESULT ="0";
Else IF HIV_EIABLOOD ="P" THEN GISP_HIVRESULT ="1";
Else IF HIV_EIABLOOD ="" THEN GISP_HIVRESULT ="7";
If HIV_EIABLOOD is "N", it can't also be "P" or missing, so there is no need to run those checks. Sure, you have fewer than 10,000 rows, but it's still good practice to keep the processing to a minimum when you can. And typing ELSE does not take that long. 🙂
Thank you all for trying to help me. I figured it out; this is how I did it:
DATA satratha.Merged1718_Q;
SET satratha.Merged1718_P;
IF HIV_EIABLOOD ="N" THEN GISP_HIVRESULT ="0";
ELSE IF HIV_EIABLOOD ="P" THEN GISP_HIVRESULT ="1";
ELSE IF HIV_PCRN_A ="N" THEN GISP_HIVRESULT ="0";
ELSE IF HIV_PCRN_A ="P" THEN GISP_HIVRESULT ="1";
ELSE IF HIV_1_2_ABBLOOD ="N" THEN GISP_HIVRESULT ="0";
ELSE IF HIV_1_2_ABBLOOD ="P" THEN GISP_HIVRESULT ="1";
ELSE IF HIV_EIA_RPBLOOD ="N" THEN GISP_HIVRESULT ="0";
ELSE IF HIV_EIA_RPBLOOD ="P" THEN GISP_HIVRESULT ="1";
ELSE IF HIV_COMBON_A ="N" THEN GISP_HIVRESULT ="0";
ELSE IF HIV_COMBON_A ="P" THEN GISP_HIVRESULT ="1";
ELSE IF EIAN_A ="N" THEN GISP_HIVRESULT ="0";
ELSE IF EIAN_A ="P" THEN GISP_HIVRESULT ="1";
ELSE GISP_HIVRESULT ="9";
RUN;
It worked!
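One way to double-check a recode like this is to list every combination of the source variables next to the derived value. This is only a suggested check, not something run in this thread; it uses the data set and variable names from the step above.

```sas
/* List each observed combination of test results with its recoded value */
PROC FREQ DATA=satratha.Merged1718_Q;
  /* MISSING keeps blank values as their own level instead of dropping them */
  TABLES HIV_EIABLOOD*HIV_PCRN_A*HIV_1_2_ABBLOOD*HIV_EIA_RPBLOOD
         *HIV_COMBON_A*EIAN_A*GISP_HIVRESULT / LIST MISSING;
RUN;
```

Each row of the LIST output is one combination, so any combination that ends up with an unexpected GISP_HIVRESULT is easy to spot.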