Simon80
Calcite | Level 5

I'm trying to create a data set by entering values with the datalines statement in a data step. I have multiple observations with the same value, and I was wondering whether there is a shorthand way of entering these values without having to type them one by one.

Example:

data students;
     input Age;
     datalines;
     20
     20
     20
     20
     and so on...
     ;

Is there a function or an operator that I can use to generate repeated values? I know that in certain programming languages you can take a value, follow it with the multiplication sign (*) and a number, and get that many copies. Is there something similar in SAS? Thanks for your response and help!

Accepted Solutions
PGStats
Opal | Level 21

This requires a little programming:

data students(keep=age);
     infile datalines missover;         /* a short line leaves repeat missing */
     input Age repeat;
     do i = 1 to coalesce(repeat, 1);   /* a missing repeat means one copy */
          output;
     end;
datalines;
20 4
10
15 2
55
;

proc print; run;

But don't forget that most data analysis procedures allow a FREQ statement that names a variable containing observation frequencies. Thus, you could keep one record per distinct value and use:

data students(keep=age repeat);
     infile datalines missover;
     input Age repeat;
     repeat = coalesce(repeat, 1);
datalines;
20 4
10
15 2
55
;

proc univariate data=students; var age; freq repeat; run;

PG


8 REPLIES
Simon80
Calcite | Level 5

PGStats, thank you for your quick response! I tried the second method and it works beautifully! I also tried a very minimal data step based on yours, shown below, that also seems to work. Would you mind explaining a few things to me, as I'm relatively new to SAS? Your code has an infile statement (infile datalines missover). What is this for? And why did you use the coalesce function? I know it returns the first non-missing value, but I don't understand what it does in your data step. Thanks for your help!

My simplified code:

data students(keep=age repeat);
     input Age repeat;
datalines;
20 4
10
15 2
55
;

proc univariate data=students; var age; freq repeat; run;

PGStats
Opal | Level 21

When you run that code, you get the following error messages:

NOTE: LOST CARD.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7---
10         ;
Age=55 repeat=. _ERROR_=1 _N_=3
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.STUDENTS has 2 observations and 2 variables.

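The reason is that, by default, when SAS reaches the end of a line before all the input variables have been read, it goes to the next line to read the remaining ones. In your data step, the line containing only 10 therefore takes 15 from the following line as its repeat value, and the last line (55) has no following line to read from, hence the LOST CARD note and only 2 observations. The statement infile datalines missover tells SAS to set the remaining variables to missing when the end of the line is reached instead. The coalesce function then says that when repeat is missing, it should be taken to mean 1.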
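As a rough sketch of that difference, the two steps below read the same four lines with and without MISSOVER. The data set names flow and miss are made up for this illustration; everything else comes from the code earlier in the thread.

```sas
/* Default behaviour: a short line makes INPUT continue on the next line. */
data flow;
     input Age repeat;
datalines;
20 4
10
15 2
55
;

/* MISSOVER: a short line leaves repeat missing; COALESCE turns that into 1. */
data miss;
     infile datalines missover;
     input Age repeat;
     repeat = coalesce(repeat, 1);
datalines;
20 4
10
15 2
55
;

proc print data=flow; run;
proc print data=miss; run;
```

Going by the log shown above, flow should end up with only 2 observations, while miss should contain all 4 lines, with repeat set to 1 wherever no count was typed.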

PG

Simon80
Calcite | Level 5

Thanks again PGStats! I think I understand now.

ballardw
Super User

Depending on the analysis you're going to do, I suggest adding a count variable to each value instead of creating multiple records, and then using the WEIGHT statement in the analysis with the count as the weight variable.

Simon80
Calcite | Level 5

ballardw, thanks for your input! I haven't seen the WEIGHT option before. Can you please tell me what it does? Thank you!

ballardw
Super User

Actually, with your data the FREQ statement is probably the better choice. It is available in many procs and tells the procedure to use the count variable and treat each record as representing N records.

For example, with PROC UNIVARIATE, add the statement

freq countvariablename;

Weights are similar but need not be integers and affect calculations of some statistics a bit differently.
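To see the difference in practice, here is a minimal sketch that runs PROC MEANS twice on the students data from earlier in the thread, once with FREQ and once with WEIGHT. The choice of PROC MEANS and of the N, MEAN and STD statistics is just for illustration.

```sas
data students(keep=age repeat);
     infile datalines missover;
     input Age repeat;
     repeat = coalesce(repeat, 1);
datalines;
20 4
10
15 2
55
;

/* FREQ: each record stands for repeat identical observations. */
proc means data=students n mean std;
     var Age;
     freq repeat;
run;

/* WEIGHT: still four observations, but with unequal weights. */
proc means data=students n mean std;
     var Age;
     weight repeat;
run;
```

With FREQ, N should be reported as 8 (4 + 1 + 2 + 1), whereas with WEIGHT it should stay at 4. The means should agree, but the standard deviations should differ because the default divisor depends on the number of observations.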
