BookmarkSubscribeRSS Feed
mdakkak
Fluorite | Level 6

Hi SAS communities,

 

I am currently creating a temporary data set that contains the most recent visit for each person. I have this code: 

PROC SORT DATA= HypTabs.Vitals;
BY SSN VISITDT;
RUN;

DATA WORK.Vital_Last;
	SET HYPTABS.Vitals;
	BY SSN;
	IF LAST.SSN;
RUN;

This gives me 6 columns and 598 observations. The goal is to get to 573 observations. 

So then I merged 4 different data sets together and created the HypAnalysis2 data with this code:

DATA HYPANL.HYPANALYSIS2;
	RETAIN SSN StateCd GenderCd EthCd RaceCd AgeAtVisit SBP DBP WtLb HypRelDeathInd;
	MERGE 	WORK.Vital_Last		(IN = InVitals)
			HypTabs.NDI 		(IN = InNDI)
			HypTabs.Contact 	(IN = InContact)
			HypTabs.DEMOG 		(IN = InDemog);
	BY SSN;
	IF 	InContact 		= 1 
		AND InVitals 	= 1 
		AND InDemog 	= 1;
	IF LAST.SSN;
	AgeAtVisit=yrdif(BirthDt, VisitDt, 'AGE');	
	FORMAT 	BirthDt YYMMDD10.
			AgeAtVisit 4.0;
			
/* 		IF InNDI = 0 THEN HypRelDeathInd = 0;		 */
	IF InNDI = . THEN HypRelDeathInd = 0;
	ELSE IF InNDI = 0 THEN HypRelDeathInd = 0;
	ELSE IF InNDI = 1 THEN HypRelDeathInd = 1;
	FORMAT 	HypRelDeathInd 	IndVbl. 
			StateCd 		$StateCd.;

	FORMAT HypRelDeathInd IndVbl. StateCd $StateCd.;
	DROP VisitDt HtIn DeathDt ICD10 CODCd City Inits ZipCd EthRaceCd BirthDt;
RUN;

Once I run that code, I get 403 observations (which should be 573) and 10 columns (which is what I want). In those data sets that were merged, there were 3 states, Utah, Iowa and Mississippi. When looking at the StateCd, I noticed that none of the observations where from Utah. Which leads me to assume that my Utah Vitals data is incomplete/incorrect. If needed, I can share my Utah Vitals data but I am not sure if that is the problem, or if the problem is within the code above.

 

Thanks in advance for any help!

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

This gives me 6 columns and 598 observations. The goal is to get to 573 observations.

 

This requires a lot more explanation. How do you want to get down to 573? Which 573? What is the logic?

--
Paige Miller
mdakkak
Fluorite | Level 6
Here are the instructions on how to get the 573 observations.

The data set should be comprised of all people (identified by SSN) who have data in all 3 of the ‘Contact’, ‘Demog’, and ‘Vitals’ data sets. Next, consider that most of the people in those data sets have not died and thus will not have a matching observation in the ‘NDI’ data set. Conversely, most of the people in the ‘NDI’ data set are not in our cohort (so we don’t want to include those observations). Therefore, when performing a match-merge with the ‘NDI’ data set, do not include observations from the ‘NDI’ data set which don’t have a matching row in the other 3 data sets (use several IN= data set options!). The resulting data set should have 573 observations (see the values shown in the table above).
PaigeMiller
Diamond | Level 26

I'm afraid that without having your data sets, there's no way we can help you get to the proper 573 records. But, if you'd like to send us small portions of these data sets, and tell us the right number of records to get when using these small portions of the data, then we might have a chance. Please provide the small portions of the data sets using these instructions, and not using any other method.

--
Paige Miller
yeji6144
Calcite | Level 5

Hi!

 

I had a similar issue with the Utah dataset. I realized that some of the SSN from the Utah dataset were missing a leading 0 and were therefore only 8 digits long. I went back to Prog 2 and added this to where we created Contact_UT which fixed it for me.

IDLength=Lengthn(IDCHAR);
IF IDLength=8 THEN IDCHAR=CAT(0,IDCHAR);

It's odd that you are missing all data for Utah though, maybe the Utah SSNs are not matching up to the other SSNs for some reason. If you want to share your code from Prog 2 where you created Contact_UT I can take a look for you.

 

Also, I'm not sure if you need this whole thing:

/* 		IF InNDI = 0 THEN HypRelDeathInd = 0;		 */
	IF InNDI = . THEN HypRelDeathInd = 0;
	ELSE IF InNDI = 0 THEN HypRelDeathInd = 0;
	ELSE IF InNDI = 1 THEN HypRelDeathInd = 1;

 If you're trying to get HypRelDeathInd to be 0 where it's missing data, all you would need is:

	IF HypRelDeathInd = . THEN HypRelDeathInd = 0;
JamesPatton01
Calcite | Level 5

Hello mdakkak! 

 

I am wondering if the problem is occurring during your IF and IF ELSE statements. For example, it could be possible that you are "overmatching" your data and, as a result, too many observations are being removed. Since you are non-match merging these data sets, you are dealing with gaps in the data sets which is why SAS is listing less observations and indicating that some observations do not exist.

 

I would consider trying to use an IF statement in conjunction with an IF NOT statement in replace of the code that you currently have for the merge. I think the solution is that you need to specify the InDemog and InVitals automatic variables to show up. In addition to this, you would need to manually populate the value for HypRelDeathInd using conditional logic. You can use the keyword "NOT" to call people who don't show up in your dataset and then manually fill in the indicator variable to be 0. I have listed an example for you to look at here: 

 
IF InDemog AND InVitals;
IF NOT InNDI THEN HypRelDeathInd = 0;
 
Hope that this helps! 
- James 

 

kristiepauly1
Fluorite | Level 6

As others have mentioned, it has to do with leading zero's in some of your observations ssn's. Go to your Utah demog dataset from the previous project and add this: ID = PUT(NewID,z9.); before you string together your SSN using CatX. 

OR 

If that doesn't work the other code to try using Z9 in the SUBSTR as follows: 

SSN = CATX('-', SUBSTR(PUT(ID,Z9.),1,3), SUBSTR(PUT(ID,Z9.), 4,2), SUBSTR (PUT(ID,Z9.),6,4));

Good Luck. 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1635 views
  • 0 likes
  • 5 in conversation