DATA Step, Macro, Functions and more

Numbering with in groups and missing data

Reply
Contributor
Posts: 29

Numbering with in groups and missing data

Hi Again,

 

I always get great help from this Group.

I have a dataset where I want to number with in Groups. I have done this before with out problems however this time is complicated for some reason I cannot understand.

 

I have

 

{

data have; input Record_id date_of_admission ddmmyy10. admssion var1 $ var2 $ var3 $ var4 $ date_of_ques ddmmyy10. var5 $ ;

datalines;

 

1 07-01-2010 1 A B D . . .

1 09-01-2010 2 F D G X 10-01-2010 B

1 09-02-2010 3 A D G X 09-03-2010 A

1 09-02-2010 3 A D G X 12-03-2010 A

1 09-02-2010 3 A D G X 14-03-2010 A

1 09-03-2010 4 F D G . . .

1 09-03-2017 5 X T V . . .

2 07-02-2009 1 C R V X 14-07-2009 C

2 03-09-2010 2 C T V . . .

;

run;

 

 

What I want is number with in Groups defined by record_id date of admission, but only for the lines where var4=x.

 

datalines;

 

1 07-01-2010 1 A B D . . .

1 09-01-2010 2 F D G X 10-01-2010 B 1

1 09-02-2010 3 A D G X 09-03-2010 A 1

1 09-02-2010 3 A D G X 12-03-2010 A 2

1 09-02-2010 3 A D G X 14-03-2010 A 3

1 09-03-2010 4 F D G . . .

1 09-03-2017 5 X T V . . .

2 07-02-2009 1 C R V X 14-07-2009 C 1

2 03-09-2010 2 C T V . . .

;

run; }

 

I have tried the following however this only produces numbers with record_id.

 

data WANT;

set have;

NUMMERDRUGc + 1;

by record_id date_of_admission var4;

if first.record_id and first.date_of_admission and first.var4 then NUMMERDRUGc = 1;

run;

 }

Any suggestions.

 

Kind regards

 

Solvej

Super User
Super User
Posts: 9,227

Re: Numbering with in groups and missing data

Just split the data, add the count for data where var4=X, then add back to other data and sort:

data have; 
  input record_id date_of_admission ddmmyy10. admission var1 $ var2 $ var3 $ var4 $ date_of_ques ddmmyy10. var5 $;
datalines;
1 07-01-2010 1 A B D . . .
1 09-01-2010 2 F D G X 10-01-2010 B
1 09-02-2010 3 A D G X 09-03-2010 A
1 09-02-2010 3 A D G X 12-03-2010 A
1 09-02-2010 3 A D G X 14-03-2010 A
1 09-03-2010 4 F D G . . .
1 09-03-2017 5 X T V . . .
2 07-02-2009 1 C R V X 14-07-2009 C
2 03-09-2010 2 C T V . . .
;
run;
data x1;
  set have (where=(var4="X"));
  retain cnt 0;
  by record_id date_of_admission admission;
  if first.admission then cnt=cnt+1;
run;

data want;
  set have (where=(var4 ne "X")) x1;
run; 

Please note how I use the code window - its the {i} above post area - to post code.

Contributor
Posts: 29

Re: Numbering with in groups and missing data

HI Again

 

Thank you for your answer.

 

When I run the code I get this result.

 

 

Obs Record_id date_of_admission admission var1 var2 var3 var4 date_of_ques var5 cnt123456789

1182691ABD . .
1182712FDGX18272B1
1183023ADGX18330A2
1183023ADGX18333A2
1183023ADGX18335A2
1183304FDG . .
1208875XTV . .
2179351CRVX18092C3
2185082CTV . .

 

 

This is not exactly what I had in  mind. However I have tried to exchange the admission variable with a multiple of variables but still not getting it right.

 

What am I missing?

 

Solvej

Super User
Super User
Posts: 9,227

Re: Numbering with in groups and missing data

Sorry, I can't really tell what it is you want:

1 09-01-2010 2 F D G X 10-01-2010 B 1

1 09-02-2010 3 A D G X 09-03-2010 A 1

1 09-02-2010 3 A D G X 12-03-2010 A 2

 

Why does it only change on row 3?  Sounds like lagged result or something, i.e. if the previous result is the first one then add 1.

Contributor
Posts: 29

Re: Numbering with in groups and missing data

Hi Again

 

The reason for the change in row three is that row two and three two different admissions. Thus I want to number the questionnaires based on the dates of questionnaires within each admission.

 

Kind regards

 

Solvej

Super User
Posts: 10,621

Re: Numbering with in groups and missing data

data have; 
  input record_id date_of_admission : $20. admission var1 $ var2 $ var3 $ var4 $ date_of_ques : $20. var5 $;
datalines;
1 07-01-2010 1 A B D . . .
1 09-01-2010 2 F D G X 10-01-2010 B
1 09-02-2010 3 A D G X 09-03-2010 A
1 09-02-2010 3 A D G X 12-03-2010 A
1 09-02-2010 3 A D G X 14-03-2010 A
1 09-03-2010 4 F D G . . .
1 09-03-2017 5 X T V . . .
2 07-02-2009 1 C R V X 14-07-2009 C
2 03-09-2010 2 C T V . . .
;
run;
data want;
 set have;
 by record_id var5 notsorted;
 if first.var5 then n=0;
 n+1;
 if missing(var5) then call missing(n);
 run;
 
 proc print noobs;run;
Contributor
Posts: 29

Re: Numbering with in groups and missing data

Thank you Ksharp.

 

I see that you programme Works, but I see that my example is bad. I apologize for this.

 

I also needs to work when my data looks like this

 

{i}

 

data have; input Record_id date_of_admission ddmmyy10. admission var1 $ var2 $ var3 $ var4 $ date_of_ques ddmmyy10. var5 $ ;

datalines;

 

1 07-01-2010 1 A B D . . .

1 09-01-2010 2 F D G X 10-01-2010 B

1 09-02-2010 3 A D G X 09-03-2010 A

1 09-02-2010 3 A D G X 12-03-2010 G

1 09-02-2010 3 A D G X 14-03-2010 B

1 09-03-2010 4 F D G . . .

1 09-03-2017 5 X T V . . .

2 07-02-2009 1 C R V X 14-07-2009 C

2 03-09-2010 2 C T V . . .

;

run;

 

 

Contributor
Posts: 29

Re: Numbering with in groups and missing data

I changed your code to this and now it seems to work. Thank you.

 

{i}

 

data want;

set have;

by record_id admission notsorted;

if first.admission then n=0;

n+1;

if missing(var5) then call missing(n);

run;

 

 

proc print noobs;run;

 

Ask a Question
Discussion stats
  • 7 replies
  • 107 views
  • 0 likes
  • 3 in conversation