BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

New to SAS and would appreciate advice and help on how best to handle this data mangement situation.

I have a dataset in which each observation represents a client. Each client has a "description" variable which could include either a comprehensive assessment, treatment or discharge. I have created 3 new variables to flag each observation if they contain one of these.

 

So for example:

treat_yes = 1 if description contains "tx", "treatment" dc_yes = 1 if description contains "dc", "d/c" or "discharge" ca_yes = 1 if desciption contains "comprehensive assessment" or "ca" or "comprehensive ax"

 

My end goal is to have a new dataset of clients that have gone through a Comprehensive Assessment, Treatment and Discharge.

I'm a little stumped as to what my next move should be here. I have all my variables flagged for clients. But there could be duplicate observations just because a client could have come in many times. So for example:

Client_id    treatment_yes    ca_yes   dc_yes
1234               0            1        1
1234               1            0        0
1234               1            0        1

All I really care about is if for a particular client the variables treatment_yes, ca_yes and dc_yes DO NOT equal 0 (i.e., they each have at least one "1". They could have more than one "1" but as long as they are flagged at least once).

I was thinking my next step might be to collapse the data (how do you do this?) for each unique client ID and sum treatment_yes, dc_yes and ca_yes for each client.

 

Does that work?

 

If so, how the heck do I accomplish this? Where do I start?

thanks everyone

1 ACCEPTED SOLUTION

Accepted Solutions
christinagting0
Quartz | Level 8

Thank you both!

 

For some reason when I was trying both of your proc sql codes I kept on getting a syntax error.

 

What I ended up using was this:

 

proc sql;
create table temp_collapse as
select distinct mrn,
sum(tx_yes) as tx_yes,
sum(ca_yes) as ca_yes,
sum(dc_yes) as dc_yes
from collapsed_data
group by mrn;
quit;

 

thanks!

View solution in original post

11 REPLIES 11
Reeza
Super User

You have a few options. 

 

Proc SQL is a good one, proc means is another. 

 

Proc sql;
Create table want as
Select id, max(treatment_yes) as treatment, max(ca_yes) as ca, max(dc_yes) as dc, sum( calculated treatment, calculated ca, calculated dc) as number_vars
from have
Group by id
Having calculated number_vars=3;
Quit;
Kurt_Bremser
Super User
proc sql;
create table want as
  select
    client_id,
    max(treatment_yes) as treatment_yes
    max(ca_yes) as ca_yes,
    max(dc_yes) as dc_yes
  from have
  group by client_id
;
quit;

 

christinagting0
Quartz | Level 8

Thank you both!

 

For some reason when I was trying both of your proc sql codes I kept on getting a syntax error.

 

What I ended up using was this:

 

proc sql;
create table temp_collapse as
select distinct mrn,
sum(tx_yes) as tx_yes,
sum(ca_yes) as ca_yes,
sum(dc_yes) as dc_yes
from collapsed_data
group by mrn;
quit;

 

thanks!

christinagting0
Quartz | Level 8

Thanks Kurt!

 

I actually couldn't figure out where the comma was left out. I tried many variations of your code. 

 

Just so I can learn, where should the comma have been? It looks identical to the code I used in terms of syntax the only difference being I used the word DISTINCT. I thought maybe this was why you code wasn't working.

 

 

Reeza
Super User

In my code you would need WHERE instead of HAVING.

Kurt_Bremser
Super User

In SQL, commas are needed to separate members of lists.

In a select, the elements need to be separated by a comma, but at the end (before the keyword from) there must be no comma, as a different part of the statement begins.

In my style of writing (every element on its own line) there should be a comma on every line of the select part, with the exception of the last.

christinagting0
Quartz | Level 8

weird...my code didn't require a comma after each select line (as you can see from above in my solution).

 

Anyways, thanks very much!

Reeza
Super User

@christinagting0 

Each item in select statement has a comma except for last. 

Your solution does have this. 

Kurt_Bremser
Super User

From your code snippet:

select distinct mrn,
sum(tx_yes) as tx_yes,
sum(ca_yes) as ca_yes,
sum(dc_yes) as dc_yes

You can see the commas, and the (correctly) missing comma in the last line.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 2525 views
  • 5 likes
  • 3 in conversation