01-05-2010, 10:00 PM
PerlFAQ Server
FAQ 6.15 How can I print out a word-frequency or line-frequency summary?

This is an excerpt from the latest version of perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

6.15: How can I print out a word-frequency or line-frequency summary?

To do this, you have to parse out each word in the input stream. We'll
pretend that by word you mean a chunk of alphabetics, hyphens, or
apostrophes, rather than the non-whitespace chunk idea of a word given
in the previous question:

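    my %seen;
    while (<>) {
        # A letter, then any run of word characters, apostrophes, or
        # hyphens. The * (rather than +) lets one-letter words such
        # as "a" count too.
        while ( /(\b[^\W_\d][\w'-]*\b)/g ) {   # misses "`sheep'"
            $seen{$1}++;
        }
    }

    while ( my ($word, $count) = each %seen ) {
        print "$count $word\n";
    }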
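
To see what the word pattern picks up without feeding a file on standard input, here is a minimal sketch that runs it over a couple of sample strings. The helper name count_words and the sample text are made up for illustration and are not part of the FAQ. The sketch also makes two assumptions of its own: it uses * instead of + after the first letter so that one-letter words are counted, and it folds case with lc so that "The" and "the" share a count.

```perl
use strict;
use warnings;

# count_words is a hypothetical helper, not part of the FAQ.
# It takes a list of lines and returns a reference to a hash
# mapping each (lowercased) word to the number of times it was seen.
sub count_words {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        # A letter, then any run of word characters, apostrophes, or hyphens.
        while ( $line =~ /(\b[^\W_\d][\w'-]*\b)/g ) {
            $seen{ lc $1 }++;    # assumption: case differences do not matter
        }
    }
    return \%seen;
}

my $counts = count_words( "The cat's hat", "the cat sat on a mat" );

# Print in alphabetical order so the output is repeatable.
for my $word ( sort keys %$counts ) {
    print "$counts->{$word} $word\n";
}
```

Note that "cat's" and "cat" are counted as different words, because the apostrophe is treated as part of the word.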

If you wanted to do the same thing for lines, you wouldn't need a
regular expression:

    my %seen;
    while (<>) {
        $seen{$_}++;    # $_ keeps its trailing newline
    }

    while ( my ($line, $count) = each %seen ) {
        print "$count $line";    # no "\n" needed: $line already ends in one
    }
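
One thing to watch with the line version: the hash keys include the trailing newline, so a final line with no newline is counted separately from an otherwise identical line earlier in the input. The following sketch, using made-up sample data in @lines, shows one possible way around that by chomping a copy of each line before counting. Whether that normalization is wanted depends on the input.

```perl
use strict;
use warnings;

# Sample input (an assumption for illustration); the last line lacks a newline.
my @lines = ( "red\n", "blue\n", "red" );

my %seen;
for my $line (@lines) {
    chomp( my $copy = $line );    # strip the newline from a copy, if present
    $seen{$copy}++;
}

for my $line ( sort keys %seen ) {
    print "$seen{$line} $line\n";    # the newline is added back on output
}
```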

If you want these output in a sorted order, see perlfaq4: "How do I sort
a hash (optionally by value instead of key)?".
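
As a rough sketch of what sorted output might look like, the following orders a small hand-made %seen hash by descending count, breaking ties alphabetically. The sample counts are invented for illustration, and the tie-breaking rule is an assumption; perlfaq4 remains the place to look for the full discussion.

```perl
use strict;
use warnings;

# Sample counts (made up); in practice %seen would be filled from input.
my %seen = ( apple => 3, pear => 1, fig => 3 );

# Highest count first; equal counts fall back to alphabetical order.
my @by_count = sort { $seen{$b} <=> $seen{$a} or $a cmp $b } keys %seen;

for my $word (@by_count) {
    print "$seen{$word} $word\n";
}
```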



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much relevant information as possible in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.