About m1986MM

m1986MM · ‎03-30-2015

Yes, I tried compress.

m1986MM · ‎03-30-2015

Hello everyone, I am working with a giant dataset of 1.5 billion observations and 45 variables. It contains time series data, so one of the variables is a date. Since it takes too much time to run even a simple command like proc means, I am trying to find a way to work with my dataset more efficiently. One way is to reshape my long dataset into a wider one; in that case I would have about 700 variables and 1 million observations. In general, is it more efficient to have more observations or more variables? I have also thought about improving the hardware. Would an SSD help? Is there any other way to decrease the running time? I appreciate any suggestion.

m1986MM · ‎02-20-2015

Thank you.

m1986MM · ‎02-20-2015

Hello everyone, I have two datasets that share a variable (Uni_Id). I want to create a subset of dataset1 that contains only the observations whose Uni_Id values also appear in dataset2. I appreciate any suggestion.
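One way to read this request is as a semi-join, which can be sketched with a PROC SQL subquery. The output name subset1 is a placeholder, and the sketch assumes Uni_Id has the same type in both datasets.

```sas
/* Keep the rows of dataset1 whose Uni_Id also appears in dataset2. */
/* subset1 is a hypothetical output name.                           */
proc sql;
  create table subset1 as
  select *
  from dataset1
  where Uni_Id in (select distinct Uni_Id from dataset2);
quit;
```

A subquery is used here rather than a join so that repeated Uni_Id values in dataset2 cannot multiply the rows taken from dataset1.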

m1986MM · ‎02-02-2015

Actually the dataset is very large. So I can't easily find all values, like 22 or 33. I'm looking for an algorithm that can find all the values belonging to a certain student_ID, and then keep those observations that have the same value.

m1986MM · ‎02-02-2015

Hi everyone, I want to create a subset from a dataset. Suppose I have a table like the one below. I want to keep all observations that have the same Class_ID value as student 1111. In this case, that means a subset of all observations whose Class_ID is equal to 22 or 33.

Student_ID  Class_ID  Score
1111        22        A
1111        33        A
2520        22        A
2520        44        A
5148        33        A
5148        66        A
6251        55        A

I appreciate any suggestion.
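A minimal sketch of one possible approach, using the sample table from the post: look up the Class_ID values of the reference student in a subquery, then keep every row that has one of those values. The dataset names have and want are placeholders, and Student_ID is assumed to be numeric.

```sas
/* Sample data from the post; have and want are hypothetical names. */
data have;
  input Student_ID Class_ID Score $;
  datalines;
1111 22 A
1111 33 A
2520 22 A
2520 44 A
5148 33 A
5148 66 A
6251 55 A
;
run;

/* Keep every row whose Class_ID is one of student 1111's classes. */
proc sql;
  create table want as
  select *
  from have
  where Class_ID in
        (select Class_ID from have where Student_ID = 1111);
quit;
```

Because the class list comes from the subquery, the values 22 and 33 never have to be typed by hand, which matters when the dataset is too large to inspect. On the sample data this keeps four rows: both rows for 1111, the Class_ID 22 row for 2520, and the Class_ID 33 row for 5148.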

m1986MM · ‎01-29-2015

Hello everyone, I tried to import a dataset with the guessing rows set to 20,000. However, the formats of the variables are not what they should be, so I need to define the formats of the variables myself. The following is my primary code.

proc import datafile="F:\Address of the file.txt"
    dbms=dlm out=test replace;
    delimiter='|';
    getnames=yes;
    Guessingrows=20000;
run;

I used the following code to define the variables' formats, but it didn't work.

proc import datafile="E:\Mobina\wo23370_LSU\Deposits\depositRateData_1998_03.txt"
    dbms=dlm out=depositRateData_1998_03 replace;
    format rate 5.3;
    format apy 5.3;
    format datesurveyed yyyymmdd10.;
    format comment $30;
    delimiter='|';
    getnames=yes;
run;

I appreciate any suggestion.

m1986MM · ‎11-30-2014

Thanks.

m1986MM · ‎11-30-2014

Thanks Patrick. Could you please explain the statements below from the code?

1. Date='01Jan2012'd;
2. cost=date/1000*_n_;

Thanks.
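For readers without the original reply, the two statements can be tried in isolation. This is only an illustration, assuming the sashelp.class sample table is available. '01Jan2012'd is a SAS date literal, stored as the number of days since 01Jan1960, and _n_ is the automatic counter of DATA step iterations, so the second statement presumably just generates a cost value that differs on every row.

```sas
/* Illustration only: demo is a hypothetical dataset name. */
data demo;
  set sashelp.class(obs=3 keep=name);  /* three rows, so _n_ runs 1, 2, 3 */
  Date = '01Jan2012'd;                 /* date literal, stored as 18993   */
  cost = Date / 1000 * _n_;            /* 18.993 times the row number     */
  format Date date9.;
run;

proc print data=demo;
run;
```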

m1986MM · ‎11-29-2014

Hello everyone, I have a huge dataset with more than 30m observations. The main structure of the dataset is similar to the table below. However, the number of variables is 46 (not all of them useful).

Company_Nm size Brach product date cost
X X1 A
X X2 B
X X3 C
X X4 D
Y Y1 A
Y Y2 B
Y Y3 C
Y Y4 D
Y Y5 E
Z Z1 A

In this dataset, what I want to analyze is the cost of the different products offered by the different branches of each company. The cost data is available weekly since 1995 (not on the same day of the week for all observations). What I want to do is take the average cost of each product, offered by each branch, for every month. In this way I can both reduce the number of observations and analyze the price variable cross-sectionally. The date variable is in this format: YYYY-MM-DD. Because of the large number of variables, I prefer not to use a CLASS statement. However, even if I have to use CLASS, how can I write the code to take the average for every month and create a dataset that has date (YYYY-MM) and Ave_price instead of the last two variables in the current dataset? I greatly appreciate any and all suggestions.
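A minimal sketch of the monthly average without a CLASS statement, using PROC SQL. It assumes the input dataset is called have (a placeholder), that date is a numeric SAS date displayed as YYYY-MM-DD, and that the variable names match the table header above. If date is actually a character string, it would first need converting, for example with input(date, yymmdd10.).

```sas
/* Monthly average cost per company, branch, and product.          */
/* have and monthly_avg are hypothetical dataset names.            */
proc sql;
  create table monthly_avg as
  select Company_Nm,
         Brach,
         product,
         /* map every date to the first day of its month */
         intnx('month', date, 0, 'b') as month format=yymmd7.,
         mean(cost) as Ave_price
  from have
  group by Company_Nm, Brach, product, calculated month;
quit;
```

The yymmd7. format displays the month as YYYY-MM while the underlying value stays a real SAS date, so the result can still be sorted and compared by date.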

m1986MM · ‎11-28-2014

Hello, I need to compare the values of two variables, both of them character. For example:

data X;
    set Y;
    if Surv_city=city;
run;

As you can see, I have two variables, Surv_city and city, in dataset Y, and I want to keep only those observations where Surv_city and city hold the same name. However, the result has zero observations. I think the problem occurs because the city names do not use uppercase and lowercase in the same way. How can I fix this problem? I greatly appreciate any suggestion.
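If letter case is indeed the cause, one common approach is to normalize both values before comparing them. The sketch below also strips leading and trailing blanks, which is an extra assumption about the data rather than something stated in the post.

```sas
/* Case-insensitive comparison of the two city variables. */
data X;
  set Y;
  /* upcase() removes case differences; strip() removes stray blanks */
  if upcase(strip(Surv_city)) = upcase(strip(city));
run;
```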

m1986MM · ‎11-28-2014

Hello, I've merged two datasets. However, the merged dataset contains many duplicate observations. How can I keep just one of the observations that have the same value for all the variables? I appreciate your help in advance.
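One way to drop rows that are identical on every variable is PROC SORT with the NODUPRECS option. The dataset names merged and merged_nodup are placeholders.

```sas
/* Remove rows that duplicate another row on every variable.       */
/* Sorting by _all_ places identical rows next to each other,      */
/* which NODUPRECS needs because it only compares adjacent rows.   */
proc sort data=merged out=merged_nodup noduprecs;
  by _all_;
run;
```

An alternative sketch with the same intent is a `select distinct *` query in PROC SQL.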

m1986MM · ‎11-27-2014

Thanks.

m1986MM · ‎11-26-2014

Hi again, I have a question about how this code works and would appreciate it if you could help me. When I merge the datasets using proc sql, as in this specific code, are the observations in table 1 that have a missing value for the table 2 variable (Field_of_interest) deleted or not? Thanks.

m1986MM · ‎11-24-2014

Yes, that's right.

Date Last Visited	‎08-29-2020 09:15 PM

Re: adjusting the axis values in proc sgscatter

adjusting the axis values in proc sgscatter

Re: Reshape from wide to long

Reshape from wide to long

Re: Difference between merging by Proc sql in SAS and merge m:m in Sta...

Difference between merging by Proc sql in SAS and merge m:m in Stata

Put the dummy variable equal to one if the value is equal to a variabl...

Re: Filling the missing value

Re: Error in Libname statement

Re: Is it more efficient to work with a longer dataset or a wider one?

Is it more efficient to work with a longer dataset or a wider one?

Re: Creating a subset of observations

Creating a subset of observations

Re: Finding the observations whose variable's values are the same as a...

Finding the observations whose variable's values are the same as a spe...

Defining the format

Re: Taking the average for each specific period in a time series datas...

Taking the average for each specific period in a time series dataset

A problem with lowercase and uppercase

How to delete the replicated data in a dataset

Re: How to merge part of a dataset with another dataset
