BookmarkSubscribeRSS Feed
annypanny
Quartz | Level 8

Hi All,

I want to create a dummy data set in that data set I want coloumn names as number, date(20042020 this format) and performance. I want rows in performance initially populate as 0.

example:

Number  Date           Performance

1A            30042020  0

2B             31032015  0

After that there is another dataset which is already created in which there is Number coloumn, Date coloumn and  Performace coloumn are already there

Example:

Number Date            Performace

1A           30042020    5.254

1A           31032020    7.458

2B           31032015    7.569

2A           30042015    9.546

In already created dataset the date coloumn have 200 rows but with identical number rows. so I have to distinct only 100 rows to handle the duplicate rows and there should be 61 rows in my dummy dataset which I have to create starting from 31 march 2015 to 30 april 2020 last date of every month and there is 61 month only from today. I want to match  both datasets number and date which can be same in both the dataset. If its match then the performace coloumn from dummy dataset change from the performace value which already created dataset contains.

edited

12 REPLIES 12
PaigeMiller
Diamond | Level 26

SAS can do this match/combining data sets without dummy variables.

 

Use the UPDATE statement in a DATA step. The link contains examples.

--
Paige Miller
annypanny
Quartz | Level 8

I want to create dummy dataset

PaigeMiller
Diamond | Level 26

Sorry for the misunderstanding.

 

How would we know how to create a dummy data set with 61 rows? What 61 rows?

--
Paige Miller
annypanny
Quartz | Level 8

there are 200 rows there but I want to select only 61 rows i.e., from march 31 2015 to 30 april 2020, 61 rows for 61 months and date contains only month's last date value

PaigeMiller
Diamond | Level 26

@annypanny wrote:

there are 200 rows there but I want to select only 61 rows i.e., from march 31 2015 to 30 april 2020, 61 rows for 61 months and date contains only month's last date value


Please explain further. Provide meaningful example data sets (not data sets like you have provided with only two rows, both of which match the other data set)

--
Paige Miller
annypanny
Quartz | Level 8

alredy created dataset:

Number       performance      Date

A000810A  1.85538             30092019
A000810A  4.37                   30062019
A000825A  1.34265             30092019
A000825A  3.04                   30062019
A000826A  1.32974             30092019
A000826A  3.03                   30062019

annypanny
Quartz | Level 8

Dummy dataset I want to create:

Number       performance      Date

A000810A  0                     30092019
A000810A  0                     30062019
A000825A  0                     30092019
A000825A  0                     30062019
A000826A  0                     30092019
A000826A  0                     30062019

Kurt_Bremser
Super User

@annypanny wrote:

Dummy dataset I want to create:

Number       performance      Date

A000810A  0                     30092019
A000810A  0                     30062019
A000825A  0                     30092019
A000825A  0                     30062019
A000826A  0                     30092019
A000826A  0                     30062019


Now that's simple:

data want;
set have;
performance = 0;
run;
PaigeMiller
Diamond | Level 26

So I still think your examples don't match your wording. However using the data sets shown

 

data want; /* This is the dummy data set */
    set have; /* this is the data set already created */
    performance=0;
run;
--
Paige Miller
Kurt_Bremser
Super User

Since creating usable example data seems to be beyond your current SAS skills, I have done it for you:

data have1;
input Number :$2. Date :ddmmyy8. Performance;
format date yymmddd10.;
datalines;
1A 30042020 0
2B 31032015 0
;

data have2;
input Number :$2. Date :ddmmyy8. Performance;
format date yymmddd10.;
datalines;
1A 30042020 5.254
1A 31032020 7.458
2B 31032015 7.569
2A 30042015 9.546
;

What do you expect as output from these two tables?

annypanny
Quartz | Level 8
if the values from have 1 matches with have 2 then performace observation of have 2 should overwrite performance observation of have 1. In your case 5.254 will replace 0 in first row of performace in have 1
Kurt_Bremser
Super User

Please don't make vague and incomplete descriptions, SHOW the expected result.

Without that, I can only think you want an update:

data have1;
input Number :$2. Date :ddmmyy8. Performance;
format date yymmddd10.;
datalines;
1A 30042020 0
2B 31032015 0
;

data have2;
input Number :$2. Date :ddmmyy8. Performance;
format date yymmddd10.;
datalines;
1A 30042020 5.254
1A 31032020 7.458
2B 31032015 7.569
2A 30042015 9.546
;

proc sort data=have1;
by number date;
run;

proc sort data=have2;
by number date;
run;

data want;
update
  have1
  have2
;
by number date;
run;

Result:

1	1A	2020-03-31	7.458	
2	1A	2020-04-30	5.254	
3	2A	2015-04-30	9.546	
4	2B	2015-03-31	7.569

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2009 views
  • 3 likes
  • 3 in conversation