BookmarkSubscribeRSS Feed
paulschm
Fluorite | Level 6

Hi,

I am using PROC SQL to create a table with the distinct variable combinations I am interested in:

PROC SQL;
	CREATE TABLE want AS
	SELECT DISTINCT var1, var2, var3
    FROM have;
QUIT;

However, I would like to add a column to that created table with a running number (and not use a subsequent data step to do so). Is there an elegant way to do that?

Thanks

9 REPLIES 9
LinusH
Tourmaline | Level 20

Try the undocumented/unsupported function monotonic() (I havn't tried with distinct - so no guarantees).

Why is a subsequent data step so bad - do you got tons of data?

Data never sleeps
paulschm
Fluorite | Level 6

I just appreciate short code.
But yeah, I read about that function while googling. However, I could not get it to work in terms of creating a table. Could you give an example code?

LinusH
Tourmaline | Level 20

Just 

monotonic() as running_no

 

Data never sleeps
paulschm
Fluorite | Level 6

Hmm, the moment I run this:

PROC SQL;
	CREATE TABLE want AS
	SELECT DISTINCT var1, var2, var3, monotonic() as running_no
    FROM have;
QUIT;

the DISTINCT counts every single running number as a distinct combination of those variables and just gives back the entire dataset "have" with an added running-number-column.

LinusH
Tourmaline | Level 20

No guarantees, remember 🙂

Ok, then you need to nest your original query as an in-line view to the outer query, in which you assign your running_no.

Data never sleeps
paulschm
Fluorite | Level 6

I remember Man Happy

Uhm, sorry but I have never even heard inner and outer query. Do you mind suggesting it as an example once more?

LinusH
Tourmaline | Level 20
select a.*, monotonic() as running_no
from (select distinct.... from have) as a
Data never sleeps
RW9
Diamond | Level 26 RW9
Diamond | Level 26

As @LinusH has mentioned, monotonic() couts each observation as its processed.  However SQL is not an ordered processing like SAS, it can be affected by many different things internally, such as sorts, distinct, joins etc.  I wouldn't recommend using it in this instance, as the numerb assigned to a distinct value is unlikely to be the sequence you intend, unless you happen to know offhand what internal procesuders, and what threads each is assigned to.  Why do you need to add this?  Am thinking if you do need this, then why not combine it into another step as datastep already has _n_ for position of observation in the dataset, however I always find its better to fix it yourself by hand, e.g.

proc sort data=have nodupkey;

  by value;

run;

data have;

  set have;

  code=_n_;

run;

 

Then use that coded list to merge an identifier onto the data, otherwise if your list changes programmatically as you do now, then your output may change accordingly and further code might not even run.  If you can give a proper example of what you are trying to do...

Ksharp
Super User

Do not use MONOTONIC() , Since it is not documented , that would generate unexpected result .

 

 

ods _all_ close;
ods output sql_results=want;
PROC SQL number ;
	SELECT DISTINCT name,sex,age
    FROM sashelp.class;
QUIT;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2110 views
  • 6 likes
  • 4 in conversation