Hello,
I'm having trouble trying to get a counting variable to decreased based on the rules of other variables given to me by a client.
currently, i have the "HAVE" column, but want to achieve the "WANT"
we know there are 5 unique date changed for this ID, and ignoring the duplicate value on the same day - this would then show my i have 5 unqiue value changes over time.
ID | DATE_OF_VALUE | VALUE | HAVE | WANT |
109999999 | 05/12/2017 | 42376 | 1 | 5 |
109999999 | 05/12/2017 | 42376 | 2 | . |
109999999 | 13/12/2017 | 42521 | 1 | 4 |
109999999 | 13/12/2017 | 42521 | 2 | . |
109999999 | 19/03/2018 | 43395 | 1 | 3 |
109999999 | 19/03/2018 | 43395 | 2 | . |
109999999 | 19/03/2018 | 43395 | 3 | . |
109999999 | 19/03/2018 | 43395 | 4 | . |
109999999 | 04/06/2018 | 43953 | 1 | 2 |
109999999 | 04/06/2018 | 43953 | 2 | . |
109999999 | 17Sep2018 | 44899 | 1 | 1 |
109999999 | 17Sep2018 | 44899 | 2 | . |
the code i have for the "HAVE" column looks like this
can someone please help me achieve the "WANT" column, as i'm having issues with sorting and the BY statement
data TEST1;
set TEST;
by ID VALUE DATE_OF_VALUE;
if first.VALUE and first.DATE_OF_VALUE then do;
HAVE=1;
end;
else do ;
HAVE+1;
end;
run;
i want to try and achieve this in one step if possible
apolozies for this data please try this code
data test;
length id 8 value 8 have 8;
infile datalines4 dlm=',' missover dsd;
input id date_of_value :ddmmyy10. value have ;
format date_of_value date9.;
datalines4;
109999999,05/12/2017,42376,1
109999999,05/12/2017,42376,2
109999999,13/12/2017,42521,1
109999999,13/12/2017,42521,2
109999999,19/03/2018,43395,1
109999999,19/03/2018,43395,2
109999999,19/03/2018,43395,3
109999999,19/03/2018,43395,4
109999999,04/06/2018,43953,1
109999999,04/06/2018,43953,2
109999999,17/09/2018,44899,1
109999999,17/09/2018,44899,2
109999998,21/04/2018,4788,1
109999998,21/04/2018,4788,2
109999998,01/06/2018,4788,3
109999998,01/06/2018,4788,4
109999998,23/09/2018,4718,1
109999998,23/09/2018,4718,2
109999997,09/07/2018,8717,1
109999997,09/07/2018,8717,2
109999997,01/09/2018,8750,1
109999997,01/09/2018,8750,2
109999995,15/02/2018,22868,1
109999995,15/02/2018,22868,2
109999995,07/09/2018,22631,1
109999995,07/09/2018,22631,2
109554995,05/12/2017,92376,1
109554995,05/12/2017,92376,2
109554995,13/12/2017,92521,1
109554995,13/12/2017,92521,2
109554995,19/03/2018,93395,1
109554995,19/03/2018,93395,2
109554995,19/03/2018,93395,3
109554995,19/03/2018,93395,4
109554995,04/06/2018,93953,1
109554995,04/06/2018,93953,2
109554995,11/07/2018,94953,1
;;;;
run;
proc sort data=test;
by id descending DATE_OF_VALUE have;
run;
data want;
set test;
by id descending DATE_OF_VALUE have;
if first.id then _want=.;
if first.DATE_OF_VALUE then _want+1;
if first.DATE_OF_VALUE and _want ne . then want=_want;
drop _want;
run;
proc sort data=want;
by id DATE_OF_VALUE ;
run;
To start with, dont code in upppercase. Anyways this should work:
data test1;
set test;
by id value date_of_value;
retain want;
if first.id then want=5;
else if first.date_of_value then want=want-1;
run;
Also, post test data in the form of a datastep in future, not here to type that in.
Please post data as datastep using datalines and add at least a second id, so that we can verify our ideas.
I would start by sorting by id and decreasing "date_of_value", than start with want=1 and increment it when "value" changes. You can either add a variable and retain its value, or you can use lag-function to get older values of "value".
thank you for the replies, i'm just getting the datalines done and a few more ID's to show change
each date and value can be duplicated but the same value can repeat to a different date too.
DATA TEST;
LENGTH
ID 8 DATE_OF_VALUE $10 VALUE 8 HAVE 8 WANT 8 ;
FORMAT
ID 9. DATE_OF_VALUE $10. VALUE 5. HAVE 1. WANT 1. ;
INFORMAT
ID 9. DATE_OF_VALUE $10. VALUE 5. HAVE 1. WANT 1. ;
INFILE DATALINES4
DLM=','
MISSOVER
DSD ;
INPUT
ID : 9. DATE_OF_VALUE : $10. VALUE: 5. HAVE: 1.WANT : 1. ;
DATALINES4;
109999999,05/12/2017,42376,1,5
109999999,05/12/2017,42376,2,.
109999999,13/12/2017,42521,1,4
109999999,13/12/2017,42521,2,.
109999999,19/03/2018,43395,1,3
109999999,19/03/2018,43395,2,.
109999999,19/03/2018,43395,3,.
109999999,19/03/2018,43395,4,.
109999999,04/06/2018,43953,1,2
109999999,04/06/2018,43953,2,.
109999999,17/09/2018,44899,1,1
109999999,17/09/2018,44899,2,.
109999998,21/04/2018,4788,1,2
109999998,21/04/2018,4788,2,.
109999998,01/06/2018,4788,3,.
109999998,01/06/2018,4788,4,.
109999998,23/09/2018,4718,1,1
109999998,23/09/2018,4718,2,.
109999997,09/07/2018,8717,1,2
109999997,09/07/2018,8717,2,.
109999997,01/09/2018,8750,1,1
109999997,01/09/2018,8750,2,.
109999995,15/02/2018,22868,1,2
109999995,15/02/2018,22868,2,.
109999995,07/09/2018,22631,1,1
109999995,07/09/2018,22631,2,.
109554995,05/12/2017,92376,1,5
109554995,05/12/2017,92376,2,.
109554995,13/12/2017,92521,1,4
109554995,13/12/2017,92521,2,.
109554995,19/03/2018,93395,1,3
109554995,19/03/2018,93395,2,.
109554995,19/03/2018,93395,3,.
109554995,19/03/2018,93395,4,.
109554995,04/06/2018,93953,1,2
109554995,04/06/2018,93953,2,.
109554995,11/07/2018,94953,1,1
;;;;
Its just good practice not to shout code, hence using lower case. As for the data you present, it doesn't seem to match your original request. You first start by saying you know there is 5 occurrences, however this test data clearly shows less than 5 instances in some cases? So which is it? Do you need to find out how many there are first, say a quick SQL count(distinct), then merge that information back on? So assuming want is present and the first per id shows total (note your test data also seems to show various levels of changes, i.e. obs15, should change as the date is different, but doesnt?):
data test; length id 8 date_of_value $10 value 8 have 8 want 8; infile datalines4 dlm=',' missover dsd; input id date_of_value $ value have want; datalines4; 109999999,05/12/2017,42376,1,5 109999999,05/12/2017,42376,2,. 109999999,13/12/2017,42521,1,4 109999999,13/12/2017,42521,2,. 109999999,19/03/2018,43395,1,3 109999999,19/03/2018,43395,2,. 109999999,19/03/2018,43395,3,. 109999999,19/03/2018,43395,4,. 109999999,04/06/2018,43953,1,2 109999999,04/06/2018,43953,2,. 109999999,17/09/2018,44899,1,1 109999999,17/09/2018,44899,2,. 109999998,21/04/2018,4788,1,2 109999998,21/04/2018,4788,2,. 109999998,01/06/2018,4788,3,. 109999998,01/06/2018,4788,4,. 109999998,23/09/2018,4718,1,1 109999998,23/09/2018,4718,2,. 109999997,09/07/2018,8717,1,2 109999997,09/07/2018,8717,2,. 109999997,01/09/2018,8750,1,1 109999997,01/09/2018,8750,2,. 109999995,15/02/2018,22868,1,2 109999995,15/02/2018,22868,2,. 109999995,07/09/2018,22631,1,1 109999995,07/09/2018,22631,2,. 109554995,05/12/2017,92376,1,5 109554995,05/12/2017,92376,2,. 109554995,13/12/2017,92521,1,4 109554995,13/12/2017,92521,2,. 109554995,19/03/2018,93395,1,3 109554995,19/03/2018,93395,2,. 109554995,19/03/2018,93395,3,. 109554995,19/03/2018,93395,4,. 109554995,04/06/2018,93953,1,2 109554995,04/06/2018,93953,2,. 109554995,11/07/2018,94953,1,1 ;;;; run; data test1; set test; by id date_of_value notsorted; retain count; if first.id then count=want; else if first.date_of_value then count=count-1; run;
data have;
infile cards truncover;
input ID $ DATE_OF_VALUE :ddmmyy10. VALUE HAVE;
format DATE_OF_VALUE date9.;
cards;
109999999 05/12/2017 42376 1
109999999 05/12/2017 42376 2
109999999 13/12/2017 42521 1
109999999 13/12/2017 42521 2
109999999 19/03/2018 43395 1
109999999 19/03/2018 43395 2
109999999 19/03/2018 43395 3
109999999 19/03/2018 43395 4
109999999 04/06/2018 43953 1
109999999 04/06/2018 43953 2
109999999 17/09/2018 44899 1
109999999 17/09/2018 44899 2
;
proc sort data=have;
by id descending DATE_OF_VALUE have;
run;
data want;
set have;
by id descending DATE_OF_VALUE have;
if first.DATE_OF_VALUE then _want+1;
if first.DATE_OF_VALUE and _want ne . then want=_want;
drop _want;
run;
proc sort data=want;
by id DATE_OF_VALUE ;
run;
i wanted to show an example of my workings
Hi, its seems where the numbers are placed are correct just the values are not as they should be
apolozies for this data please try this code
data test;
length id 8 value 8 have 8;
infile datalines4 dlm=',' missover dsd;
input id date_of_value :ddmmyy10. value have ;
format date_of_value date9.;
datalines4;
109999999,05/12/2017,42376,1
109999999,05/12/2017,42376,2
109999999,13/12/2017,42521,1
109999999,13/12/2017,42521,2
109999999,19/03/2018,43395,1
109999999,19/03/2018,43395,2
109999999,19/03/2018,43395,3
109999999,19/03/2018,43395,4
109999999,04/06/2018,43953,1
109999999,04/06/2018,43953,2
109999999,17/09/2018,44899,1
109999999,17/09/2018,44899,2
109999998,21/04/2018,4788,1
109999998,21/04/2018,4788,2
109999998,01/06/2018,4788,3
109999998,01/06/2018,4788,4
109999998,23/09/2018,4718,1
109999998,23/09/2018,4718,2
109999997,09/07/2018,8717,1
109999997,09/07/2018,8717,2
109999997,01/09/2018,8750,1
109999997,01/09/2018,8750,2
109999995,15/02/2018,22868,1
109999995,15/02/2018,22868,2
109999995,07/09/2018,22631,1
109999995,07/09/2018,22631,2
109554995,05/12/2017,92376,1
109554995,05/12/2017,92376,2
109554995,13/12/2017,92521,1
109554995,13/12/2017,92521,2
109554995,19/03/2018,93395,1
109554995,19/03/2018,93395,2
109554995,19/03/2018,93395,3
109554995,19/03/2018,93395,4
109554995,04/06/2018,93953,1
109554995,04/06/2018,93953,2
109554995,11/07/2018,94953,1
;;;;
run;
proc sort data=test;
by id descending DATE_OF_VALUE have;
run;
data want;
set test;
by id descending DATE_OF_VALUE have;
if first.id then _want=.;
if first.DATE_OF_VALUE then _want+1;
if first.DATE_OF_VALUE and _want ne . then want=_want;
drop _want;
run;
proc sort data=want;
by id DATE_OF_VALUE ;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.