Newsgroup: comp.soft-sys.sas

Old 07-13-2012, 05:51 PM
divergent.tseries@gmail.com
Minimum data size

I have one variable (or, alternatively, five binary variables) that I want to use while minimizing memory usage. Either it is a single variable with five statuses (stored as a single character, c, p, m, o, or b, so that a human can remember them), or it is five binary variables, c through b, each equal to 0 or 1.

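If I store it as a character variable, then when I need the counts later I will have to run proc summary five times,

/* Count the observations with status 'c'. No VAR statement is used: */
/* variable is character, and without VAR proc summary returns the   */
/* observation count in _FREQ_.                                      */
proc summary data=dataset (where=(variable='c'));
output out=output_for_c (keep=_freq_ rename=(_freq_=N_C));
run;

once for each status, and then merge the results.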

Alternatively with the five binary variables I would end up with

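/* Each flag is 0 or 1, so its sum is the count for that status. */
proc summary data=dataset sum;
var c p m o b;
/* sum= with autoname yields c_Sum, p_Sum, m_Sum, o_Sum, b_Sum. */
output out=output_for_all sum= / autoname;
run;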


Any suggestions to minimize memory usage?

Thanks!
There are 50 to 100 million observations.
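
For a sense of scale, here is a rough sketch of the two layouts described above. The data set names status_char and status_flags and the sample values are made up for illustration. It assumes uncompressed data sets on a Windows or UNIX host, where the shortest numeric length is 3 bytes. Under those assumptions the single $1 variable takes 1 byte per observation (about 100 MB at 100 million observations), while five flags take at least 15 bytes per observation (about 1.5 GB), or 40 bytes (about 4 GB) at the default numeric length of 8.

```sas
/* Layout A: one single-character status variable, 1 byte per observation. */
data status_char;
  length variable $1;
  input variable $ @@;
  datalines;
c p m o b c c p
;
run;

/* Layout B: five 0/1 flags at numeric length 3, 15 bytes per observation. */
data status_flags;
  set status_char;
  length c p m o b 3;
  c = (variable = 'c');
  p = (variable = 'p');
  m = (variable = 'm');
  o = (variable = 'o');
  b = (variable = 'b');
  drop variable;
run;

/* Compare the reported observation length of the two layouts. */
proc contents data=status_char;
run;
proc contents data=status_flags;
run;
```

These figures ignore page overhead and any other variables in the data set, so treat them as an order-of-magnitude comparison only.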
Old 07-13-2012, 10:02 PM
hlschreier@gmail.com
Re: Minimum data size

Use PROC FREQ:

proc freq data=dataset noprint ;
tables variable / out=output_for_all ;
run ;
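
PROC FREQ makes a single pass over the data and writes one row per status, with the count in COUNT. If a single row with one column per status is wanted, as the merged PROC SUMMARY outputs would have given, one possible follow-up is to transpose that small output. This is only a sketch: the data set name counts_wide and the N_ prefix are assumptions chosen to resemble the N_C naming in the question.

```sas
/* One pass over the data: one output row per status value. */
proc freq data=dataset noprint;
  tables variable / out=output_for_all (drop=percent);
run;

/* Turn the (at most five) rows into one row with columns N_c, N_p, ... */
proc transpose data=output_for_all
               out=counts_wide (drop=_name_ _label_)
               prefix=N_;
  id variable;
  var count;
run;
```

The transpose step only touches the few summary rows, so its cost is negligible next to the pass over 50 to 100 million observations.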

All times are GMT.

