BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mystik
Obsidian | Level 7

I have been working around this since Friday, can't seem to get around it.

Let me present my case again.

 

I have over 232million observations.

I need to categorize my IDs into 3 groups. Namely Whyz, Emfy and Pheno based on codes which an ID has in hltcode.

 

ID's have more than one observations for hltcode.

 

If an ID have only one of the following codes (35, 37, 38, 39, 484, 7827, 2728, 2539, 2218, 2721912) in hltcode then it is Whyz.

If an ID have only one of the following codes (728, 7298, 722, 27218, 2721821, 27282, 27218) in hltcode then it is Emfy.

    However if an ID have both Whyz code AND Emfy code in hltcode then that ID needs to categorized as Pheno.

 

Also, If any ID have any of these codes (637383, 35363, 9830) then that ID also have to be categorized as Pheno.

 

Now, there are other 5 groups, called Klig, Uttr, Thaip, Ghait, Dondj

 

the hltcode for Klig are (6373, 3633, 73392, 11928, 638383, 362892, 3691101)

                         Uttr are (7748, 373229, 372382, 363763, 36373)

                         Thaip are (872920,647474, 3731738, 646)

                         Ghait are (77747, 6377, 36787, 46478498, 738991) 

                         Dondj are (7733, 63599, 37383, 3663, 37137)

 

if any ID have a hltcode for Whyz AND Klig then that ID also have to be categorized also as Pheno.

if any ID have a hltcode for Whyz AND Klig AND Dondj then that ID also have to be categorized also as Pheno.

if any ID have a hltcode for Whyz AND Thaip AND Dondj then that ID also have to be categorized also as Pheno.

 

if any ID have a hltcode for Emfy AND Thaip then that ID also have to be categorized also as Pheno.

if any ID have a hltcode for Emfy AND Utter AND Thaip then that ID also have to be categorized also as Pheno.

if any ID have a hltcode for Emfy AND Dondj AND Thaip then that ID also have to be categorized also as Pheno.

if any ID have a hltcode for Emfy AND Klig AND Dondj then that ID also have to be categorized also as Pheno.

 

This was what i tried doing but realized that was not appropriate based on suggestions in SAS community. 

 

libname obs_rad "C:\obs_rad";

data obs_rad.radft;

set obs_rad.radft;

 

data obs_rad.SOB;

set obs_rad.radft;

if hltcode in (262, 2727, 26252, 227272) then alc_grp="Whyz";

if hltcode in (626228, 227, 22772, 26252, 7383) then alc_grp="Emfy";

if hltcode in (637383, 35363, 9830) then alc_grp="Pheno";

if hltcode in (636, 6363, 4484, 37393, 383839, 393, 3737) and hltcode in (636, 2262, 2627, 26272) then alc_grp="Pheno";

if hltcode in (262, 2727, 26252, 227272) and hltcode in (636, 6363, 4484, 37393, 383839, 393, 3737) and hitcode in 636, 2262, 2627, 26272) then alc_grp="Pheno";

 

proc freq data=obs_rad.SOB;

table alc_grp;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Mystik
Obsidian | Level 7

I was able to get around it.

Firstly i created new variables for all my groups of interest.

Then using IF, OR and THEN statements i was able to create my Pheno group.

Finally, defining a new variable and using IF statement, i was able to create my three exclusive groups.

 

Thanks!

View solution in original post

4 REPLIES 4
Shmuel
Garnet | Level 18

I understand that relevant input variables are just two: ID and HLTCODE and that there may be more than one observation per ID.

 

Instead of using if hltcode in (...) I would prefer build a format:

proc format lib=work;   /* or any library to save the format */
   value grp
       35,37-39,484,7827, ...     = 'Whyz'
       728,7298,722,2721821 .... = 'Emfy'
      ...........
  ;
run;

next step will be to assign value to alc_grp 

data temp;
 set have;
      alc_grp = put(hltcode, grp. );
run;

last step is to check is there a single value of alc_grp per ID or more and create output with one observation per ID

containing ID and alt_grp as only variables. That can be done by sql or by a data step:

data want;
set temp (keep=ID alt_grp);
  by ID;
      length group $5; 
      retain group;
      drop group;

      if first.ID then group = alt_grp; else
      if alt_grp NE group then group = 'Pheno';
      if last.ID then do;
        alt_grp = group;
        output;
     end;
run;
Mystik
Obsidian | Level 7

Thank you.
I was able to execute the 1st step however,
i got an Error: the Format GRP was not found or could not be loaded.

Can you pls give me some clarifications in the 2nd step;

 

data "temp";

set "have";

 

what specifically should "temp" and "have" denote?

 

Shmuel
Garnet | Level 18

1st step creates the format grp. in work.formats catalog.

 

2nd step add the alt_grp as required per observation.

Please post your full log with the exact error message.

 

3rd step alters alt_grp depending on values exist in same ID observations.

Mystik
Obsidian | Level 7

I was able to get around it.

Firstly i created new variables for all my groups of interest.

Then using IF, OR and THEN statements i was able to create my Pheno group.

Finally, defining a new variable and using IF statement, i was able to create my three exclusive groups.

 

Thanks!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1304 views
  • 0 likes
  • 2 in conversation