
create new dataset

Contributor QLi
Posts: 57

How can I create this kind of dataset from data a?
The requirement is one row per unique code, with the smaller score value and also the non-missing ss.

data a;
input code score ss $;
cards;
1944 .
1944 . 412
959 548 417
959 .
170 665 408
170 659
171 581
171 580 183
176 665
176 . 414
;
run;


Thanks,
Super Contributor
Posts: 3,174

Re: create new dataset

You might want to explain the layout of your CARDS (instream) data: how the individual records are organized and how you expect the output to appear, in detail, both as read in and after any data manipulation needed to meet your objectives.

I suggest you reply to your post and paste a sample of the WORK.A file and also what you expect *AFTER* any additional SAS processing (possibly explaining your DATA step flow, which will help you develop your SAS program).

Scott Barry
SBBWorks, Inc.
Contributor QLi
Posts: 57

Re: create new dataset

Thanks for your response.

The original data looks like this:
input code score ss $;

1944 .
1944 . 412
959 548 417
959 .
170 665 408
170 659
171 581
171 580 183
176 665
176 . 414


I want to get a dataset like this:
959 548 417
170 659 408
171 580 183
176 665 414

The 1st column is code. Each code appears twice, and I want to keep one row per unique code.

The 2nd column is score. There are two scores for each duplicated code, and I want to pick the smaller one.

The 3rd column is ss. There is one value and one missing for each duplicated code, and I want to pick the non-missing value.

Could you please help me figure it out?

Thanks very much!

Qing
Super Contributor
Posts: 3,174

Re: create new dataset

What happens to the input values for code 1944, which do not appear in your expected output? Also, I suggest you label your output columns so that no assumptions are made about your processing.

Using a DATA step, you will likely need to input a field at a time and assign it to either a SAS CHARACTER or NUMERIC variable, possibly in a continuous DO/END loop to avoid needing a RETAIN statement. You can use PUTLOG _ALL_ statements in your code to inspect what SAS is doing as you develop your program. And since you have multiple measurement values on one record, you will need an explicit OUTPUT statement.

Then look at PROC MEANS or SUMMARY to generate one output row for each unique CODE value, with the non-missing MIN value.
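A minimal sketch of that DO-loop idea (often called a DOW loop), assuming WORK.A has already been read correctly with CODE and SCORE numeric and SS character. The names a_sorted, want_dow, score_in and ss_in are made up for illustration, and dropping codes with no score at all (such as 1944) is an assumption based on the expected output posted above.

```sas
/* Sketch only: group the rows so BY-group processing can be used */
proc sort data=a out=a_sorted;
  by code;
run;

data want_dow(keep=code score ss);
  /* Read all rows of one CODE group inside a single DATA step iteration. */
  /* SCORE and SS are not read from the input, so they start each         */
  /* iteration as missing and no RETAIN statement is needed.              */
  do until (last.code);
    set a_sorted(rename=(score=score_in ss=ss_in));
    by code;
    score = min(score, score_in);            /* MIN ignores missing arguments */
    if not missing(ss_in) then ss = ss_in;   /* keep a non-missing SS */
    putlog _all_;                            /* trace the variables while developing */
  end;
  /* Assumption: a code whose scores are all missing is not wanted */
  if not missing(score) then output;
run;
```

The explicit OUTPUT after the loop writes one row per CODE. Remove the PUTLOG statement once the logic looks right.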

No question that this can be done with a DATA step approach alone, but it may also be more straightforward to rely on a suitable SAS PROC, like MEANS.
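As a rough sketch of the PROC route, assuming WORK.A holds CODE and SCORE as numeric and SS as character: PROC SUMMARY can produce the minimum SCORE per CODE, but SS is character, so it cannot go on the VAR statement and is merged back in separately here. The dataset names min_score, ss_rows and want_summary are hypothetical, and the final subsetting IF (dropping codes with no score) is an assumption drawn from the expected output.

```sas
/* Minimum non-missing SCORE for each CODE */
proc summary data=a nway;
  class code;
  var score;
  output out=min_score(drop=_type_ _freq_) min(score)=score;
run;

/* Rows that carry a non-missing SS, sorted for the merge */
proc sort data=a(keep=code ss where=(ss is not missing)) out=ss_rows;
  by code;
run;

/* Combine the two pieces: one row per CODE if each code has one SS value */
data want_summary;
  merge min_score ss_rows;
  by code;
  if not missing(score);   /* assumption: drop codes with no score at all */
run;
```

If a code could have more than one non-missing SS, this merge would return one row per SS value, so that case would need its own rule.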

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic/post:

data step programming introduction site:sas.com

using proc summary site:sas.com
N/A
Posts: 0

Re: create new dataset

First, look at the documentation for the infile statement.
You do not have all fields in all lines so you will need an infile option like MISSOVER or TRUNCOVER.
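For example, a sketch of the original DATA step with TRUNCOVER added (MISSOVER would behave the same way for this list input). Without one of these options, SAS moves to the next data line to find a value for ss whenever a line is short.

```sas
data a;
  /* TRUNCOVER: a short line leaves the remaining variables missing */
  /* instead of reading them from the next line                     */
  infile cards truncover;
  input code score ss $;
  cards;
1944 .
1944 . 412
959 548 417
959 .
170 665 408
170 659
171 581
171 580 183
176 665
176 . 414
;
run;
```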

Second, I would recommend you use a three step approach.

1) Data step to read in the data.
if missing(ss) then delete;

2) PROC SORT data=_______; by u_code s_value ss; run;

3) Data step to produce the desired set.

Data last_step;
set in_data;
by u_code s_value ss;
if first.u_code then output;
run;