How to Tell PROC CORR to Run Until the First Zero For Each Participant

Posted 05-10-2016 08:34 PM

Howdy folks,

I am running both Pearson and Spearman correlations for a large dataset (approx. 7,000 sets of data), and am **wondering whether there is a way to program PROC CORR to only run the analysis up to the first instance of "0" for each participant**. All sets of data begin at a non-zero number and theoretically drop to zero. However, the point at which a set of data reaches zero differs between participants, so I can't simply delete or replace all columns following the first-occurring zero. For example, see the following three example sets of data (note that each cell indicates the number of items purchased at that price):

| Participant / Price | $1 | $2 | $3 | $4 | $5 | $6 | $7 | $8 | $9 | $10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 10 | 8 | 6 | 4 | 2 | **0** | 0 | 0 | 0 | 0 |
| 2 | 25 | 25 | 25 | 20 | 18 | 15 | 10 | **0** | 0 | 0 |
| 3 | 2 | **0** | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Notice that as price goes up, the number of purchases decreases. What I need PROC CORR to do is read each set of observations and analyze only the observations through the first zero (for example, I highlighted these zeroes in the above set), without considering any zeroes after the first one. This needs to work for both the basic Pearson PROC CORR and the SPEARMAN-enabled PROC CORR statement.

Theoretically, I could simply work through the code and delete (or replace) all zeroes after the first-occurring zero, but it would be difficult to do so for 7,000 sets of observations, and I believe that SAS is capable of executing this task.

I had previously used an ARRAY statement to replace all instances of zero with "." to mark them as missing. However, the analysis requires that the first-occurring zero be included, and the ARRAY code I was using would delete the first-occurring zero as well.

Any advice? Many thanks for your time!

1 ACCEPTED SOLUTION


Which two variables are you going to use to calculate the correlation coefficient? Two participants, or two prices?

And why not just set the zeros that follow the first zero to missing?

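```
/* Sample data: one row per participant; _1-_10 hold purchases at $1-$10 */
data have;
  infile cards expandtabs truncover;
  input Participant _1-_10;
  cards;
1 10 8 6 4 2 0 0 0 0 0
2 25 25 25 20 18 15 10 0 0 0
3 2 0 0 0 0 0 0 0 0 0
;
run;

/* Keep values through the first zero; set everything after it to missing */
data want;
  set have;
  array x{*} _1-_10;
  do i=1 to dim(x);
    if found then x{i}=.;      /* already past the first zero */
    if x{i}=0 then found=1;    /* first zero seen; it stays in the data */
  end;
  drop i found;                /* found is reset to missing for each row */
run;

proc print data=want; run;
```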
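
Because `want` is still wide (one row per participant), it cannot go straight into a per-participant PROC CORR. One possible way to reshape it is sketched below. It assumes the dataset `want` from the step above, sorted by `Participant`; the names `long`, `pricevar`, `price` and `number` are made up for illustration.

```sas
/* Sketch only: turn the wide, trimmed data into one row per participant and price */
proc transpose data=want out=long(rename=(col1=number)) name=pricevar;
  by Participant;            /* assumes want is sorted by Participant */
  var _1-_10;
run;

data long;
  set long;
  price = input(substr(pricevar, 2), 8.);  /* "_3" becomes 3 */
  if not missing(number);                  /* drop values blanked after the first zero */
  drop pricevar;
run;
```

The resulting dataset keeps the first zero for each participant and can then be analyzed with a BY statement on `Participant`.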

9 REPLIES

What are you correlating: price with number of purchases, or number of purchases between participants?

PG


One way to do this is to stack all variables' names and values then use by processing in proc corr for values greater than 0. Something like this:

```
/* One row per price, one column per participant */
data have;
  input Price Participant1 Participant2 Participant3;
  datalines;
1 10 25 0
2 8 25 0
3 6 25 0
4 4 20 0
5 2 18 0
6 0 15 0
7 0 10 0
8 0 0 0
9 0 0 0
10 0 0 0
;

/* Stack the participant columns: one row per price and participant */
data want(keep=variable price value);
  set have;
  array p(*) Participant:;
  do i=1 to dim(p);
    value=p(i);
    variable=vname(p(i));
    output;
  end;
run;

proc sort data=want;
  by variable;
run;

/* Correlate price with purchases per participant, excluding all zeros */
proc corr data=want(where=(value>0));
  by variable;
  var price;
  with value;
run;
```


Thanks for the reply. I see that I should have been clearer in the original post - I need to include the first instance of zero (for each participant) in the proc corr. Would this syntax maintain the first instance of zero?


Please try the following syntax that will maintain the first instance of zero for proc corr:

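```
/* Stack the participant columns: one row per price and participant */
data want(keep=variable price value);
  set have;
  array p(*) Participant:;
  do i=1 to dim(p);
    value=p(i);
    variable=vname(p(i));
    output;
  end;
run;

proc sort data=want;
  by variable;
run;

/* Within each participant, output rows up to and including the first zero */
data corr(drop=flag);
  do until(last.variable);
    set want;
    by variable;
    if not flag then output;
    if value=0 then flag = 1;   /* flag is reset for each BY group */
  end;
run;

proc corr data=corr;
  by variable;
  var price;
  with value;
run;
```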
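
A quick way to confirm that the first zero survived is to count the retained rows per participant. The sketch below assumes the dataset `corr` built above; nothing else is needed. With the sample data from this post, the counts should come out as 6, 8 and 1 for Participant1 through Participant3, each with a minimum of 0.

```sas
/* Sanity check: rows kept per participant, and the smallest value kept */
proc means data=corr n min;
  class variable;
  var value;
run;
```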


As an addendum, is there any SAS code that could modify the dataset with which I'm working so as to simply *delete* or mark as missing all values after the first 0, rather than simply teaching the PROC CORR to only read up to the first zero? Having the data sets pruned to this point would help quite a bit with future analyses.


Make your data long instead of wide:

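```
data long;
  set wide;
  array A n1-n10;   /* purchases at prices $1-$10 */
  /* One row per price; UNTIL is checked after OUTPUT, so the first zero is kept */
  do price = 1 to dim(A) until(A{price}=0);
    number = A{price};
    output;
  end;
  keep participant price number;
run;
```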

PG
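
Once the data are long and stop at the first zero, both correlations asked about in the original post can be requested in a single PROC CORR call. This is a minimal sketch that assumes the dataset `long` with the variables `participant`, `price` and `number` from the step above.

```sas
/* BY-group processing needs the data sorted by participant */
proc sort data=long;
  by participant price;
run;

/* PEARSON and SPEARMAN request both coefficients in one run */
proc corr data=long pearson spearman;
  by participant;
  var price;
  with number;
run;
```

A participant whose very first value is zero contributes only one row, so no correlation can be computed for that BY group.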



@Ksharp wrote: Which two variables are you going to use to calculate the correlation coefficient? Two participants, or two prices?

And why not just set the zeros that follow the first zero to missing?


This looks like **exactly** what I'm looking for! Thanks for sending this. Will this syntax work with data libraries that have already been imported to SAS? That is, can I just replace the highlighted dataset name (`want`) below with my libname.refname?

```
data want;
set have;
array x{*} _1-_10;
do i=1 to dim(x);
if found then x{i}=.;
if x{i}=0 then found=1;
end;
drop i found;
run;
proc print;run;
```


Yes. You can do that as long as your data has been turned into a SAS dataset.

```
data yourlib.want;
set yourlib.have;
```
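
Put together, the full step might look something like the sketch below. The library path and the dataset names `purchases` and `purchases_trimmed` are hypothetical placeholders, and the price columns are assumed to be named `_1-_10` as in the sample data.

```sas
/* Hypothetical path; point this at the folder holding your SAS datasets */
libname yourlib "/path/to/your/sas/datasets";

/* Write to a new dataset so the original stays untouched */
data yourlib.purchases_trimmed;
  set yourlib.purchases;
  array x{*} _1-_10;           /* assumed column names */
  do i=1 to dim(x);
    if found then x{i}=.;      /* blank everything after the first zero */
    if x{i}=0 then found=1;
  end;
  drop i found;
run;
```

Writing the result under a different name than the input is a deliberate choice here: the untrimmed values remain available if a later analysis needs them.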
