Newsgroup comp.lang.tcl

#1
02-09-2010, 05:18 PM
GizmoGorilla
Saving TABS/CRLF's from text widget. Best Way?

I'm working on a project using a text widget. I've noticed that
after saving the data from the text widget that the data file
format is exactly as it was displayed in the text widget. So
TABS and CRLFs end up taking a huge amount of space. My data
file looks like...

Some text                  tabbed to some column








several CRLF's down the page


So the data file ends up taking 10 lines of data in this
example. Is there any way to "compress" this for saving
and "decompress" for loading, retaining the original
tabbing & CRLF's without using so much space in the
data file??

Thanks!

GG

#2
02-09-2010, 06:30 PM
Will Duquette

On Feb 9, 9:18 am, GizmoGorilla <gizmogori...@hotmail.com> wrote:
> I'm working on a project using a text widget. I've noticed that
> after saving the data from the text widget that the data file
> format is exactly as it was displayed in the text widget. So
> TABS and CRLFs end up taking a huge amount of space. My data
> file looks like...
>
> Some text                  tabbed to some column
>
> several CRLF's down the page
>
> So the data file ends up taking 10 lines of data in this
> example. Is there any way to "compress" this for saving
> and "decompress" for loading, retaining the original
> tabbing & CRLF's without using so much space in the
> data file??
>
> Thanks!
>
> GG


I'm missing something. Why is this a problem? Each TAB is one
character, and each newline is one character (two characters on the
disk, on Windows). How much more can you compress it?

You could choose to replace each tab and each CRLF with some other
printable string on output, e.g., "\t" and "\n", using [string map],
and convert them back on input; but that's not going to decrease the
actual number of bytes.
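
The [string map] round trip described above might look something like this. It is only a sketch: the proc names escapeText and unescapeText are made up for illustration, and the backslash itself is escaped as well so that a literal backslash in a comment survives the round trip. As noted, this puts each record on one printable line but does not shrink it.

```tcl
# Hypothetical helper names; only [string map] itself comes from the post.
# Each backslash, tab and newline becomes a two-character printable escape.
proc escapeText {text} {
    return [string map [list \\ \\\\ \t \\t \n \\n] $text]
}

# Reverse the mapping. [string map] scans left to right and never rescans
# its own output, so an escaped backslash is consumed as a pair before the
# character after it is looked at.
proc unescapeText {line} {
    return [string map [list \\\\ \\ \\t \t \\n \n] $line]
}

set original "name\tvalue\n\nsecond line"
set stored   [escapeText $original]
puts [string length $original]   ;# characters in the widget text
puts [string length $stored]     ;# one more per tab, newline or backslash
puts [expr {[unescapeText $stored] eq $original}]
```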

#3
02-09-2010, 07:26 PM
GizmoGorilla

On 2010-02-09 1:30 PM, Will Duquette wrote:
> I'm missing something. Why is this a problem? Each TAB is one
> character, and each newline is one character (two characters on the
> disk, on Windows). How much more can you compress it?
>
> You could choose to replace each tab and each CRLF with some other
> printable string on output, e.g., "\t" and "\n", using [string map],
> and convert them back on input; but that's not going to decrease the
> actual number of bytes.


The tab is inserting blanks up to the column that was tabbed to.
CRLF's will leave a blank line if there's no text on the line.
That's fine for the user but not for the data file.
If I crlf 5 times with no text on the line, I get 5 blank lines
in my data file. This is a huge waste of space; that's why it's a
problem. Data is being saved exactly as it is viewed in the text
widget, WYSIWYG. So perhaps when I save, for example...

(users view)
some text                            tab to here
\n
\n
\n
\n

...it could be saved as...

(data view)
some text [5x\t] tab to here
[4x\n]

and then expanded on a load. This is what I mean by compressing,
removing the redundant data to save space.

This would remove the spaces inserted by the tabs, and remove
the blank lines, so my data now uses 2 lines, not 5.
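
A run-length scheme along these lines could be sketched as below. Everything here is an assumption made for illustration: the proc names, the minRun threshold, and the marker format, which uses the control character U+0001 followed by a count and a kind letter rather than the bracketed "[5x\t]" notation, on the assumption that U+0001 never appears in a comment. Only runs of at least minRun spaces, tabs or newlines are replaced, so a marker is never longer than the run it stands for.

```tcl
# Hypothetical run-length encoder for runs of spaces, tabs and newlines.
# Marker format (an assumption): U+0001, decimal count, then s, t or n.
proc rleEncode {text {minRun 4}} {
    set kinds [dict create " " s \t t \n n]
    set out ""
    set i 0
    set n [string length $text]
    while {$i < $n} {
        set ch [string index $text $i]
        # Find the end of the run of identical characters starting at i.
        set j $i
        while {$j < $n && [string index $text $j] eq $ch} { incr j }
        set run [expr {$j - $i}]
        if {$run >= $minRun && [dict exists $kinds $ch]} {
            append out \u0001 $run [dict get $kinds $ch]
        } else {
            append out [string repeat $ch $run]
        }
        set i $j
    }
    return $out
}

# Expand every marker back into the run it replaced.
proc rleDecode {data} {
    set kinds [dict create s " " t \t n \n]
    set out ""
    set pos 0
    while {[regexp -indices -start $pos {\u0001(\d+)([stn])} $data whole cnt kind]} {
        lassign $whole first last
        append out [string range $data $pos [expr {$first - 1}]]
        set count [string range $data {*}$cnt]
        set ch [dict get $kinds [string range $data {*}$kind]]
        append out [string repeat $ch $count]
        set pos [expr {$last + 1}]
    }
    append out [string range $data $pos end]
    return $out
}

set view "some text[string repeat { } 28]tab to here\n\n\n\n\n"
set data [rleEncode $view]
puts "[string length $view] -> [string length $data] characters"
puts [expr {[rleDecode $data] eq $view}]
```

This needs Tcl 8.5 or later for dict, lassign and {*}. The stored form is no longer plain readable text, and the saving is only worthwhile when long runs are common.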

I was hoping there was an easier way to do this, other than
converting during save/load.

I hope that clarifies my post...

Thanks!

#4
02-09-2010, 08:55 PM
Bryan Oakley

On Feb 9, 1:26 pm, GizmoGorilla <gizmogori...@hotmail.com> wrote:
>
> The tab is inserting blanks up to the column that was tabbed to.


... so, you don't actually have tabs in the files, you have blocks of
consecutive spaces. Make sure when you describe a problem that you
describe it accurately.

> CRLF's will leave a blank line if there's no text on the line.
> That's fine for the user but not for the data file.
> If I crlf 5 times with no text on the line, I get 5 blank lines,
> in my data file. This is a huge waste of space,


five *bytes* is a huge waste of space? *bytes*? Even if you're talking
about one thousand blank lines, that's still only 1K. That's a
minuscule amount of disk space in this day and age, probably smaller
than the disk block size. Are you on a system that is seriously
constrained in space?

If you're concerned about disk space you should consider using a
standard compression scheme. Just run your data through zip/gzip and
that way any other wasteful characters will also get compressed.
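
As a rough sketch of the gzip route from inside Tcl itself: this assumes Tcl 8.6 or later, where the zlib command is built in, and the proc names packText and unpackText are invented for the example.

```tcl
# Assumes Tcl 8.6+ (built-in zlib command). Helper names are hypothetical.
proc packText {text} {
    # Convert to bytes first; zlib works on binary data.
    return [zlib gzip [encoding convertto utf-8 $text]]
}

proc unpackText {bytes} {
    return [encoding convertfrom utf-8 [zlib gunzip $bytes]]
}

set small "ok\n\n\n\n\n"
set large [string repeat "some text          tab to here\n\n\n\n\n" 200]
foreach sample [list $small $large] {
    puts "[string length $sample] characters -> [string length [packText $sample]] bytes"
}
```

The gzip format carries a fixed header and trailer of at least 18 bytes, so a short comment compressed on its own comes out larger than it went in. Compression is more likely to pay off on a whole data file than on individual records. The packed bytes would need to be written through a channel configured with -translation binary.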

Of course, you're doing this all to the inconvenience of the user, who
will no longer be able to edit your text files with anything but your
tool (or will be forced to unzip the file before editing it with some
other tool).

I don't mean to sound preachy, but it sounds like you're making a
beginner's mistake of premature optimization. Don't worry about
compression until it actually proves to be a problem. Otherwise you'll
spend way too much time on something that just doesn't matter.



#5
02-09-2010, 09:20 PM
Will Duquette

On Feb 9, 11:26 am, GizmoGorilla <gizmogori...@hotmail.com> wrote:
> The tab is inserting blanks up to the column that was tabbed to.
> CRLF's will leave a blank line if there's no text on the line.
> That's fine for the user but not for the data file.
> If I crlf 5 times with no text on the line, I get 5 blank lines,
> in my data file. This is a huge waste of space, that's why its a
> problem. Data is being saved exactly as it is viewed in the text
> widget, WYSIWYG. So perhaps when I save, for example...
>
> (users view)
> some text                            tab to here
> \n
> \n
> \n
> \n
>
> ...it could be saved as...
>
> (data view)
> some text [5x\t] tab to here
> [4x\n]
>
> and then expanded on a load. This is what I mean by compressing,
> removing the redundant data to save space.
>
> This would remove the spaces inserted by the tabs, and remove
> the blank lines, So my data now uses 2 lines, not 5.
>
> I was hoping there was an easier way to do this, other than
> converting during save/load.
>
> I hope that clarifies my post...
>
> Thanks!


No, converting during save/load is the way to do it.

But why does the number of lines in the data file matter? You say
it's a waste of space; but in what sense is the space being wasted?
Are you simply concerned about the amount of screen space the record
consumes when you view the file in an editor? Unless you're dealing
with truly vast amounts of data, you're not going to save enough
bytes to make the disk space consumption matter one way or another.
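
For reference, a plain save/load pair with no conversion at all might be sketched like this. The proc names and the file name are placeholders. Setting -translation lf keeps each newline to a single byte on disk on every platform, which is about as compact as uncompressed text gets.

```tcl
# Hypothetical helpers. In a Tk application the text would come from
# something like:  set text [.comment get 1.0 end-1c]
# (.comment is an assumed widget path; end-1c skips the newline the
# text widget always keeps at the end of its contents).

proc saveText {path text} {
    set f [open $path w]
    # lf: write each newline as a single byte on every platform
    fconfigure $f -translation lf -encoding utf-8
    puts -nonewline $f $text
    close $f
}

proc loadText {path} {
    set f [open $path r]
    fconfigure $f -translation lf -encoding utf-8
    set text [read $f]
    close $f
    return $text
}

set comment "some text\ttab to here\n\n\n\n\nlast line"
saveText comment.txt $comment
puts [file size comment.txt]                      ;# bytes on disk
puts [expr {[loadText comment.txt] eq $comment}]  ;# round trip check
file delete comment.txt
```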

Will

#6
02-10-2010, 12:49 PM
GizmoGorilla

On 2010-02-09 3:55 PM, Bryan Oakley wrote:

[8<]

> I don't mean to sound preachy, but it sounds like you're making a
> beginners mistake of premature optimization. Don't worry about
> compression until it actually proves to be a problem. Otherwise you'll
> spend way too much time on something that just doesn't matter.


...comments appreciated, and I will try to be more precise next
time...

Yes, you're right that these days space is cheap & abundant,
but that is not always the case. I'm developing this software
to be run on an old, slow, networked Solaris server where space
is not as abundant and CPU speed is not blazing. This I have
no control over. The text widget will be used to link a
user comment to a record in a db, of which there may be tens
of thousands of records, so I'm trying to be efficient given my
constraints. I will take the posted comments into
consideration. Thanks.

Norm [GG]







#7
02-10-2010, 03:29 PM
Will Duquette

On Feb 10, 4:49 am, GizmoGorilla <gizmogori...@hotmail.com> wrote:
> On 2010-02-09 3:55 PM, Bryan Oakley wrote:
>
> [8<]
>
> > I don't mean to sound preachy, but it sounds like you're making a
> > beginners mistake of premature optimization. Don't worry about
> > compression until it actually proves to be a problem. Otherwise you'll
> > spend way too much time on something that just doesn't matter.

>
> ...comments appreciated, and I will try to be more precise next
> time...
>
> Yes you're right that in this day, space is cheap & abundant,
> but that is not always the case. I'm developing this software
> to be run on an old, slow, Solaris networked server where space
> is not as abundant and cpu speed is not blazing. This, I have
> no control over. The text widget will be used to link a
> user comment to a record in a db, of which there may be 10's
> of 000's of records so Im trying to be efficient given my
> constraints. I will take the posted comments into
> consideration. Thanks.
>
> Norm [GG]


OK; but man, it's been a long time since I've worried about the space
taken up by *text*.

What kind of database are you using?

Will

#8
02-10-2010, 06:00 PM
GizmoGorilla


[8<]


> OK; but man, it's been a long time since I've worried about the space
> taken up by *text*.
>
> What kind of database are you using?


Everything is custom made here using a simple
flat file system. SQL would be nicer but this
is what I have to work with. The DB is used to
track about 1000 students per semester for
equipment assignment and account related issues.
Each student could potentially have a couple
dozen entries in a semester, each entry with a
comment field...

Thanks again for your input.

Norm

#9
02-11-2010, 03:07 PM
drscrypt@gmail.com

GizmoGorilla wrote:
> widget, WYSIWYG. So perhaps when I save, for example...
>
> (users view)
> some text                            tab to here
> \n
> \n
> \n
> \n
>
> ...it could be saved as...
>
> (data view)
> some text [5x\t] tab to here
> [4x\n]



> This would remove the spaces inserted by the tabs, and remove
> the blank lines, So my data now uses 2 lines, not 5.



Do you realize by doing that you are actually increasing your file size
and not compressing it in any way?

% string length "\n\n\n\n"
4
% string length "\[4x\\n\]"
6

Especially with tabs, where you are more likely to have one here and
there, or 2-3 in consecutive order, you are increasing the size of the
file, not to mention the extra processing that you must do at
load/save. Plus, how many 1x commands are you planning to define?

I would suggest to either trim tabs/newlines away if they are multiples,
or leave things as is.
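
To make both points concrete, here is a small sketch. collapseRuns is a made-up name for the "trim multiples" option and it is lossy: the original layout cannot be restored afterwards. The loop then compares a run of newlines against the six-character bracketed marker, which only comes out shorter once the run reaches seven.

```tcl
# Hypothetical helper: squeeze every run of tabs or newlines down to one.
# This discards the user's blank lines for good.
proc collapseRuns {text} {
    regsub -all {\t+} $text "\t" text
    regsub -all {\n+} $text "\n" text
    return $text
}

puts [string length [collapseRuns "a\t\t\tb\n\n\n\nc"]]

# Compare a raw run of newlines with the bracketed marker for it.
foreach n {2 4 6 8} {
    set marker "\[${n}x\\n\]"
    puts "$n newlines: [string length [string repeat \n $n]] vs marker [string length $marker]"
}
```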


DrS

#10
02-12-2010, 12:32 PM
GizmoGorilla

On 10.02.11 7:07, drscrypt@gmail.com wrote:
> Do you realize by doing that you are actually increasing your file size
> and not compressing it in any way?


[8<]

>
> I would suggest to either trim tabs/newlines away if they are multiples,
> or leave things as is.


Yeah, I do realize that, and in retrospect, a little more
thought should have been exercised prior to my posting.
I will leave things as they are for now.
Once the software goes live I'll see how it performs...

Thanks for the feedback...

Norm
