BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ChuksManuel
Pyrite | Level 9

Hello,

 

I have this dataset with observations sorted by 6 hospitals - 518 546 548 573 672 673.

 

I am trying to determine the number of people within each hospital and overall that have a diagnostic code of '311' through '31199' and i have used the retain function for this.

Here are my codes.

Any help will be appreciated. Data is attached

 

data seven; set one;
by hospital id ;
retain hospital;
array ya icd1-icd6;
if first.id then do; hosp=0; end;
do over ya;
if '311' le ya le '31199' then hosp=1;
end;
if last.id then output;
proc print; run;

1 ACCEPTED SOLUTION

Accepted Solutions
heffo
Pyrite | Level 9

You could use an informat.

proc format;
	invalue $icdCode "311" - "31199" = "Depression"
				other = "Other";
run;
data have;
	*Using informat to transform the data in this case, but you could use put(icd1,$icdCode.);
	input icd :$icdCode.;
datalines;
31125
32000
25000
311
31199
;;;
run;
data want;
	set have;
	if icd = "Depression" then hosp =1;	
	else hosp = 0;
run;

 

With a format, instead of an informat. 

proc format;
	value $icdCode "311" - "31199" = "Depression"
				other = "Other";
run;
data have;
	length icd $12;
	input icd $;
datalines;
31125
32000
25000
311
31199
;;;
run;
data want;
	set have;
	icd_formated = put(icd,$icdCode.);
	if icd_formated = "Depression" then hosp =1;	
	else hosp = 0;
run;

 

View solution in original post

11 REPLIES 11
ScottBass
Rhodochrosite | Level 12

Post your code as a self-contained data step containing the datalines statement.  Very few people are going to download your attached SAS dataset.

 

Post your code using the Insert SAS Code icon so it's formatted correctly.

 

data seven; set one;
by hospital id ;
retain hospital;
array ya icd1-icd6;
if first.id then do; hosp=0; end;
do over ya;
if '311' le ya le '31199' then hosp=1;
end;
if last.id then output;
proc print; run;

 

Is that your problem?

 

the number of people within each hospital and overall

 

I'd consider PROC FREQ or PROC SUMMARY and possibly PROC TRANSPOSE to get one row per ICD, then use a where clause with PROC FREQ/SUMMARY.  You could then merge back with your data if required, or re-transpose.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
ChuksManuel
Pyrite | Level 9

Thank you.

Please any correction in the logic of the statements can help.

I have 6 hospitals and i am trying to find the count in each hospital using the retain the statement.

 

Reeza
Super User
Are you sure that this statement works?

if '311' le ya le '31199' then hosp=1;

That would be a character comparison so you could get weird results and I'd highly suggest an alternative method of coding this line. Unless you're 100% sure it works but I wouldn't be....
ChuksManuel
Pyrite | Level 9

You are right. It didn't work. 

Is there a way i can recode that? I just want ICD codes that fall between 311 and 31199. The ICD codes were coded as a character variable in our dataset.

 

Tom
Super User Tom
Super User

Fix the RETAIN statement to reference the variable you want to retain HOSP instead of the variable that is already in your dataset HOSPITAL.

ChuksManuel
Pyrite | Level 9

Thank you. I revised it. But i get stuck on the character vs numeric variable issue.

 

data one; set two;
by hospital id visitdate;
retain depression;
array ya icd1-icd6;
if first.id then do depression=0; end;
do over ya;
if '311' le ya le '31199' then depression=1;
end;
if last.id then output;
proc print; run;

 

 

Tom
Super User Tom
Super User

What character vs numeric issue?

You should NOT be storing ICD codes in numeric variables. First of all they are CODEs and not numbers. Also there are code where the leading zeros are important.  But even worse there are many ICD9 codes that have letters in them.

ChuksManuel
Pyrite | Level 9

You are right. They are stored as character variables. How do i find out which codes lie between '311' and '31199' in each hospital. I guess i would need to convert the character codes to numeric right? Because i tried running the code in character form and it didn't give me what i was looking for.

ScottBass
Rhodochrosite | Level 12

How do i find out which codes lie between '311' and '31199' in each hospital

 

Define "between".  Is the list long?  Can you just use an IN() list.

 

And post self-contained sample code so we can help you better.

 


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Tom
Super User Tom
Super User

@ChuksManuel wrote:

You are right. They are stored as character variables. How do i find out which codes lie between '311' and '31199' in each hospital. I guess i would need to convert the character codes to numeric right? Because i tried running the code in character form and it didn't give me what i was looking for.


Codes between '311' and '31199' will work with clean data.  Although I don't think there are any valid codes in that range other than '311'.  Also make sure you haven't mistakenly stored codes that start with zero without the leading zero.  Otherwise you might include codes that should have started with '031'  

031 Diseases due to other mycobacteria

Show the data that is causing the trouble. Make a small self contained example.

 

Your data step is just reducing the data to one observation per hospital and id combination. If you want the total per hospital you will still need to count/add them.

heffo
Pyrite | Level 9

You could use an informat.

proc format;
	invalue $icdCode "311" - "31199" = "Depression"
				other = "Other";
run;
data have;
	*Using informat to transform the data in this case, but you could use put(icd1,$icdCode.);
	input icd :$icdCode.;
datalines;
31125
32000
25000
311
31199
;;;
run;
data want;
	set have;
	if icd = "Depression" then hosp =1;	
	else hosp = 0;
run;

 

With a format, instead of an informat. 

proc format;
	value $icdCode "311" - "31199" = "Depression"
				other = "Other";
run;
data have;
	length icd $12;
	input icd $;
datalines;
31125
32000
25000
311
31199
;;;
run;
data want;
	set have;
	icd_formated = put(icd,$icdCode.);
	if icd_formated = "Depression" then hosp =1;	
	else hosp = 0;
run;

 

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1799 views
  • 2 likes
  • 5 in conversation