BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
bonnie27
Calcite | Level 5

I am working with CMS data and trying to convert it so that each facility has only one row, but I'm having a difficult time with one specific aspect. 

 

The original data in formatted like (I removed some unnecessary columns for ease of reading):

ObsFacility_IDFacility_NameMeasure_IDMeasure_NameCompared_to_NationalScore
1100001SHANDS JACKSONVILLEHAI_1_CILOWERCentral Line Associated Bloodstream Infection (ICU + select Wards): Lower Confidence LimitWorse than the National Benchmark1.081
2100001SHANDS JACKSONVILLEHAI_5_CILOWERMRSA Bacteremia: Lower Confidence LimitWorse than the National Benchmark1.595
3100001SHANDS JACKSONVILLEHAI_1_SIRCentral Line Associated Bloodstream Infection (ICU + select Wards)Worse than the National Benchmark1.62

 

Each hospital has about 40 rows with a separate measure in each.

I am trying to get data that looks like below, for all HAI_x_SIR measures.

 

ObsFacility_IDStateHAI_1_SIRCompared_1HAI_2_SIRCompared_2HAI_3_SIRCompared_3
1100001FL1.62Better than average0.832Better than average0.904No different than average
2100002FL1.685Worse than average1.106No different than average1.197Worse than average
3100006FL0.495No different than average0.286Better than average0.966Better than average

 

 

data hai_test;
	set hai;
	if facility_id = . then delete;
	if score="Not Available" then delete;
run;

proc sort data=hai;
    by Facility_ID;
run;

proc print data=work.hai_test (obs=100);
run;

proc transpose data=hai_test out=hai_test_row (keep=facility_id 
		state HAI_1_SIR HAI_2_SIR HAI_3_SIR 
		HAI_4_SIR HAI_5_SIR);
	by facility_ID state;
	var score;
	id measure_id;
run;

 

 

With this code, I have been able to get data that looks like this:

 

ObsFacility_IDStateHAI_1_SIRHAI_2_SIRHAI_3_SIRHAI_4_SIRHAI_5_SIR
1100001FL1.620.8320.9040.3862.322
2100002FL1.6851.1061.19721.433
3100006FL0.4950.2860.9661.514

1.247

 

 

proc transpose data=hai_test out=hai_test_row (keep=facility_id 
		state HAI_1_SIR HAI_2_SIR HAI_3_SIR 
		HAI_4_SIR HAI_5_SIR);
	by facility_ID state;
	var score compared_to_national;
	id measure_id;
run;

 

 

Using this code instead, including compared_to_national, gives the following result:

 

ObsFacility_IDStateHAI_1_SIRHAI_5_SIRHAI_4_SIRHAI_2_SIRHAI_3_SIR
1100001FL1.622.3220.3860.8320.904
2100001FLWorse than the National BenchmarkWorse than the National BenchmarkNo Different than National BenchmarkNo Different than National BenchmarkNo Different than National Benchmark
3100002FL1.6851.433 1.1061.197
4100002FLWorse than the National BenchmarkNo Different than National Benchmark No Different than National BenchmarkNo Different than National Benchmark
5100006FL0.4951.2471.5140.2860.966
6100006FLBetter than the National BenchmarkNo Different than National BenchmarkNo Different than National BenchmarkBetter than the National BenchmarkNo Different than National Benchmark

 

Which is obviously still not correct, but I think may be an intermediary step? Or I may be completely off base. I have no idea how to proceed with what I've been able to get so far. I'm thinking an array may be necessary or maybe retain, but I don't even know what to search for to find advice on that for this scenario. 

 

Thank you so much!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So both of your PROC TRANSPOSE steps will create variables with the same names because you are only using the ID statement to name them.  If you want the numeric and the character variables to have different names you could add the PREFIX= or SUFFIX= option to the PROC statement.

 

What is the purpose of the KEEP= dataset option in your code? Are you just trying to get rid of the _NAME_ variable?  It might be easier to use a DROP= option instead.

 

proc transpose data=hai_test out=hai_test_numbers (drop=_name_);
  by facility_ID state;
  id measure_id;
  var score;
run;
proc transpose data=hai_test suffix=_change out=hai_test_strings(drop=_name_);
  by  facility_ID state;
  id measure_id;
  var compared_to_national;
run;

data want;
  merge hai_test_numbers hai_test_strings;
  by facility_ID state;
run;

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

So both of your PROC TRANSPOSE steps will create variables with the same names because you are only using the ID statement to name them.  If you want the numeric and the character variables to have different names you could add the PREFIX= or SUFFIX= option to the PROC statement.

 

What is the purpose of the KEEP= dataset option in your code? Are you just trying to get rid of the _NAME_ variable?  It might be easier to use a DROP= option instead.

 

proc transpose data=hai_test out=hai_test_numbers (drop=_name_);
  by facility_ID state;
  id measure_id;
  var score;
run;
proc transpose data=hai_test suffix=_change out=hai_test_strings(drop=_name_);
  by  facility_ID state;
  id measure_id;
  var compared_to_national;
run;

data want;
  merge hai_test_numbers hai_test_strings;
  by facility_ID state;
run;
bonnie27
Calcite | Level 5

The purpose of KEEP= was to drop the many other measure variables that aren't relevant to my final project (HAI_5_ELIGCASES for example). 

Your solution worked though! Thank you so much, you're my hero. 🙂

Tom
Super User Tom
Super User

Since the MEASURE_ID variable is used to NAME the resulting variables you can use a WHERE on the INPUT to remove them rather than a DROP or KEEP on the OUTPUT.  Should make it run faster too.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1389 views
  • 0 likes
  • 2 in conversation