BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
V_27
Obsidian | Level 7

 

Hello,

 

I am new to SAS and I am trying to create an array which can calculate the total count of particular observations value that appears in 3 different variables.
Example: O/P = count of observation value "D80.1" in Var1 + "D80.1" in Var2 + "D80.1" in Var3

Similarly for all other values.

 

Please find the sample attachment of input data and required output

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Thank you @V_27 , I am just curious, if you just want to read the variables you need to compute the counts, I think you should still have your desired counts output.

 

For example, 

 

/*Reading only the vars you need from have 
i.e rand_id and diag group*/
proc transpose data=have(keep=rand_id diag:) out=w1;
by rand_id;
var diag:;
run;
/*vertical*/
/*This should give you counts for each diag*/
proc freq data=w1(where=(col1 ne '9999')) noprint;
tables col1/out=want(drop=PERCENT);
run;

The above doesn't mess up your original dataset "have"  and the "want" has your outputs you need.  Am i missing something beyond this?

View solution in original post

13 REPLIES 13
novinosrin
Tourmaline | Level 20

I am unable to follow your output count of 29,149 etc for your input

V_27
Obsidian | Level 7

 

Hi Novinosrin,

 

The attachment is just a overview of the whole data file. There are 1000+ observation. I created this output with a basic logic like below.

 

if Var1="D80.0" or Var2="D80.0" or Var3="D80.0" then NewVar_D80.0=1;

if Var1="D80.1" or Var2="D80.1" or Var3="D80.1" then NewVar_D80.1=1;

and so on for all such different observation values.

 

Then I used proc tabulate procedure to get the total number as output.

 

I want to create an array instead of writing individual if-then statement for all observations.

novinosrin
Tourmaline | Level 20

Not sure if I understood your req well, but here is an attempt

 

/*k=_n_ a by variable for transpose*/

data have;
k=_n_;
input (Var1	Var2	Var3) ($);
cards;
D80.1	9999	9999
D83.9	D83.0	D80.6
D80.3	9999	9999
D83.9	9999	9999
D82.9	9999	9999
D83.9	9999	9999
D83.9	9999	9999
D83.9	9999	9999
D83.0	9999	9999
D80.6	9999	9999
D83.9	9999	D80.4
D83.8	D89.9	9999
D80.1	9999	9999
D83.9	9999	9999
D83.9	9999	9999
D83.9	9999	9999
D80.1	9999	9999
D80.1	D83.9	D80.4
D83.9	9999	9999
D83.9	9999	9999
D80.1	D83.9	D83.9
D84.9	D80.3	9999
D80.9	9999	9999
D83.9	9999	9999
D83.9	D83.8	9999
D83.8	9999	9999
D80.1	D80.8	D89.9
D80.6	9999	9999
D80.1	9999	9999
D80.1	9999	9999
D83.9	9999	9999
D80.1	9999	9999
D83.9	9999	9999
D83.9	9999	9999
D83.9	D80.3	9999
D83.9	9999	D80.3
D83.9	9999	9999
;


proc transpose data=have out=w1;
by k;
var var:;
run;
/*vertical*/
proc freq data=w1(where=(col1 ne '9999')) noprint;
tables col1/out=w2;
run;

/*hortizontal*/
proc transpose data=w2 out=want;
id col1;
var count;
run;
V_27
Obsidian | Level 7

 

Hi Novinosrin,

 

Thank you for the help.

This code worked but due to transpose, my other variable observations are repeating which I don't want as its affecting the other results due to increase in total number of observations.

 

For example: let say there are another variable ID. Then my new data set is changed to shown below

 

ID0003Var1D80.1
ID0003Var29999
ID0003Var39999
ID0004Var1D83.9
ID0004Var29999
ID0004Var39999
ID0007Var1D80.3
ID0007Var29999
ID0007Var39999
ID0012Var1D83.9
ID0012Var29999
ID0012Var39999
ID0019Var1D82.9
ID0019Var2D83.9
ID0019Var39999
ID0021Var1D83.9
ID0021Var29999
ID0021Var39999
ID0022Var1D83.9
ID0022Var29999
ID0022Var39999
ID0023Var1D83.9
ID0023Var29999
ID0023Var39999
ID0029Var1D83.9
ID0029Var29999
ID0029Var39999
ID0031Var1D83.0
ID0031Var29999
ID0031Var39999
ID0032Var1D80.6
ID0032Var29999
ID0032Var39999
ID0035Var1D83.9
ID0035Var29999
ID0035Var39999
ID0036Var1D83.8
ID0036Var29999
ID0036Var39999
ID0040Var1D80.1
ID0040Var29999
ID0040Var39999
ID0042Var1D83.9
ID0042Var29999
ID0042Var39999
ID0043Var1D83.9
ID0043Var29999
ID0043Var39999
ID0046Var1D83.9
ID0046Var29999
ID0046Var39999
ID0047Var1D80.1
ID0047Var2D80.2
ID0047Var3D80.9
ID0049Var1D80.1
ID0049Var29999
ID0049Var39999
ID0053Var1D83.9
ID0053Var29999
ID0053Var39999
ID0056Var1D83.9
ID0056Var29999
ID0056Var39999
ID0058Var1D83.9
ID0058Var29999
ID0058Var39999
ID0060Var1D80.1
ID0060Var29999
ID0060Var39999
ID0061Var1D84.9
ID0061Var29999
ID0061Var39999
ID0062Var1D80.9
ID0062Var29999
ID0062Var39999
ID0063Var1D83.9
ID0063Var29999
ID0063Var39999
ID0066Var1D83.9
ID0066Var29999
ID0066Var39999
ID0068Var1D83.8
ID0068Var29999
ID0068Var39999
ID0074Var1D80.1
ID0074Var29999
ID0074Var39999
ID0077Var1D80.6
ID0077Var29999
ID0077Var39999
ID0079Var1D80.1
ID0079Var29999
ID0079Var39999
ID0080Var1D80.1
ID0080Var29999
ID0080Var39999
ID0087Var1D83.9
ID0087Var29999
ID0087Var39999
ID0089Var1D80.1
ID0089Var29999
ID0089Var39999
ID0093Var1D83.9
ID0093Var29999
ID0093Var39999
ID0095Var1D83.9
ID0095Var29999
ID0095Var39999
ID0099Var1D83.9
ID0099Var29999
ID0099Var39999
ID0100Var1D83.9
ID0100Var2D83.8
ID0100Var39999
ID0101Var1D83.9
ID0101Var29999
ID0101Var39999
ID0102Var1D83.9
ID0102Var29999
ID0102Var39999
ID0103Var1D80.1
ID0103Var2D83.0
ID0103Var39999
ID0116Var1D83.0
ID0116Var29999
ID0116Var39999
ID0117Var1D83.8
ID0117Var29999
ID0117Var39999
ID0118Var1D80.6
ID0118Var29999
ID0118Var39999
ID0122Var1D83.9
ID0122Var29999
ID0122Var39999
ID0123Var1D83.9
ID0123Var29999
ID0123Var39999
ID0127Var1D83.9
ID0127Var29999
ID0127Var39999
ID0128Var1D83.9
ID0128Var29999
ID0128Var39999
ID0134Var1D83.9
ID0134Var29999
ID0134Var39999
ID0139Var1D83.9
ID0139Var29999
ID0139Var39999
ID0140Var1D83.9
ID0140Var29999
ID0140Var39999

 

Is there any way I can avoid this situation? As I was also trying to use transpose only before but I can't figure out how to avoid other repeating values.
Thanks in advance again.

novinosrin
Tourmaline | Level 20

Can you please post a good representative sample for us so that we can avoid going back and forth? That will help us test better with the right input sample. 

V_27
Obsidian | Level 7

 

 

Attached is a part of the original dataset

novinosrin
Tourmaline | Level 20

Thank you @V_27 , I am just curious, if you just want to read the variables you need to compute the counts, I think you should still have your desired counts output.

 

For example, 

 

/*Reading only the vars you need from have 
i.e rand_id and diag group*/
proc transpose data=have(keep=rand_id diag:) out=w1;
by rand_id;
var diag:;
run;
/*vertical*/
/*This should give you counts for each diag*/
proc freq data=w1(where=(col1 ne '9999')) noprint;
tables col1/out=want(drop=PERCENT);
run;

The above doesn't mess up your original dataset "have"  and the "want" has your outputs you need.  Am i missing something beyond this?

V_27
Obsidian | Level 7

I understood what you were saying. The missing part was, you created new dataset "w1" after transpose which does not included variable infusion_1-infusion_4 and I was also asked to create another output table(apart from the output I asked in the question) which contains total diag count during different number of infusions which I was unable to as dataset w1 has repeated rand_id due to transpose.

 

But in the end I figured out. I used proc transpose method only and merged the new transposed dataset to original one by rand_id which I can use to generate different output tables with different variables.

 

Thank you so much for your help. 🙂

Is there a way I can get in touch with you? @novinosrin

novinosrin
Tourmaline | Level 20

I am glad, you got it to work. Feel free to open up any number of threads/questions. We(you included) are all part of the SAS family. Don't ever hesitate. The pleasure would all me mine to help. Of course, this is the best platform for all us to share and learn. I try to get here almost everyday. 🙂

 

PS IF anything urgent, besides your open thread here, PM me. I will try to look into it for sure. Nevertheless, on the open thread you are bound to get responses from champs like John King, Pierre Gagnon, Tom, Paul Dorfman, Xia keshan, Mkeintz, Art T etc  with their super answers. These people are synonymous with this gorgeous language of SAS. I tend to copy theirs. haha

 

Tom
Super User Tom
Super User

If it appears in two of the variables on the same observation does that count as one or two?

V_27
Obsidian | Level 7

 

Hi Tom,

 

Yes, it should count as two.

Tom
Super User Tom
Super User

So just transpose the data so that you have one column and run frequencies on that.

Here is one method using an ARRAY, but you should also look into just using PROC TRANSPOSE.

data for_analysis;
  set have ;
  array dxlist var1-var3 ;
  do i=1 to dim(dxlist);
    dx=dxlist(i);
    output;
  end;
run;

proc freq data=for_analysis;
  tables dx;
run;

 

V_27
Obsidian | Level 7

Hi Tom,

 

Thanks for the solution.

I already tried this code before but the problem is my other variable observations are repeating which I don't want as its affecting the other results due to increase in total number of observations like I said to Novinosrin.

 

For example: let say there is another variable ID. Then my new data set is changed to shown below

 

ID0003Var1D80.1
ID0003Var29999
ID0003Var39999
ID0004Var1D83.9
ID0004Var29999
ID0004Var39999
ID0007Var1D80.3
ID0007Var29999
ID0007Var39999
ID0012Var1D83.9
ID0012Var29999
ID0012Var39999
ID0019Var1D82.9
ID0019Var2D83.9
ID0019Var39999
ID0021Var1D83.9
ID0021Var29999
ID0021Var39999
ID0022Var1D83.9
ID0022Var29999
ID0022Var39999
ID0023Var1D83.9
ID0023Var29999
ID0023Var39999
ID0029Var1D83.9
ID0029Var29999
ID0029Var39999
ID0031Var1D83.0
ID0031Var29999
ID0031Var39999
ID0032Var1D80.6
ID0032Var29999
ID0032Var39999
ID0035Var1D83.9
ID0035Var29999
ID0035Var39999
ID0036Var1D83.8
ID0036Var29999
ID0036Var39999
ID0040Var1D80.1
ID0040Var29999
ID0040Var39999
ID0042Var1D83.9
ID0042Var29999
ID0042Var39999
ID0043Var1D83.9
ID0043Var29999
ID0043Var39999
ID0046Var1D83.9
ID0046Var29999
ID0046Var39999
ID0047Var1D80.1
ID0047Var2D80.2
ID0047Var3D80.9
ID0049Var1D80.1
ID0049Var29999
ID0049Var39999
ID0053Var1D83.9
ID0053Var29999
ID0053Var39999
ID0056Var1D83.9
ID0056Var29999
ID0056Var39999
ID0058Var1D83.9
ID0058Var29999
ID0058Var39999
ID0060Var1D80.1
ID0060Var29999
ID0060Var39999
ID0061Var1D84.9
ID0061Var29999
ID0061Var39999
ID0062Var1D80.9
ID0062Var29999
ID0062Var39999
ID0063Var1D83.9
ID0063Var29999
ID0063Var39999
ID0066Var1D83.9
ID0066Var29999
ID0066Var39999
ID0068Var1D83.8
ID0068Var29999
ID0068Var39999
ID0074Var1D80.1
ID0074Var29999
ID0074Var39999
ID0077Var1D80.6
ID0077Var29999
ID0077Var39999
ID0079Var1D80.1
ID0079Var29999
ID0079Var39999
ID0080Var1D80.1
ID0080Var29999
ID0080Var39999
ID0087Var1D83.9
ID0087Var29999
ID0087Var39999
ID0089Var1D80.1
ID0089Var29999
ID0089Var39999
ID0093Var1D83.9
ID0093Var29999
ID0093Var39999
ID0095Var1D83.9
ID0095Var29999
ID0095Var39999
ID0099Var1D83.9
ID0099Var29999
ID0099Var39999
ID0100Var1D83.9
ID0100Var2D83.8
ID0100Var39999
ID0101Var1D83.9
ID0101Var29999
ID0101Var39999
ID0102Var1D83.9
ID0102Var29999
ID0102Var39999
ID0103Var1D80.1
ID0103Var2D83.0
ID0103Var39999
ID0116Var1D83.0
ID0116Var29999
ID0116Var39999
ID0117Var1D83.8
ID0117Var29999
ID0117Var39999
ID0118Var1D80.6
ID0118Var29999
ID0118Var39999
ID0122Var1D83.9
ID0122Var29999
ID0122Var39999
ID0123Var1D83.9
ID0123Var29999
ID0123Var39999
ID0127Var1D83.9
ID0127Var29999
ID0127Var39999
ID0128Var1D83.9
ID0128Var29999
ID0128Var39999
ID0134Var1D83.9
ID0134Var29999
ID0134Var39999
ID0139Var1D83.9
ID0139Var29999
ID0139Var39999
ID0140Var1D83.9
ID0140Var29999
ID0140Var39999

 

Is there any way I can avoid this situation? As I was also trying to use transpose only before but I can't figure out how to avoid other repeating values.
Thanks again.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2510 views
  • 1 like
  • 3 in conversation