C++ code for parsing syllables?
m...@privacy.net  
Newsgroups: comp.lang.c++
From: m...@privacy.net
Date: Wed, 17 Mar 2010 20:25:41 -0500
Local: Thurs, Mar 18 2010 4:25 am
Subject: C++ code for parsing syllables?
I'm pulling my hair out trying to figure out code for
parsing and counting syllables in simple English
sentences.

Can someone throw the dog a bone on where to start?


Michael Angelo Ravera  
Newsgroups: comp.lang.c++
From: Michael Angelo Ravera <marav...@prodigy.net>
Date: Wed, 17 Mar 2010 18:51:12 -0700 (PDT)
Local: Thurs, Mar 18 2010 4:51 am
Subject: Re: C++ code for parsing syllables?
On Mar 17, 6:25 pm, m...@privacy.net wrote:

> I'm pulling my hair out trying to figure out code for
> parsing and counting syllables in simple English
> sentences.

> Can someone throw the dog a bone on where to start?

This isn't really a C++ question, but a Computational Linguistics
question.

The first step is recognizing vowel groups. Once you recognize
vowel groups, you can try to determine whether each group forms one or
more syllables.
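
A minimal sketch of that first step, assuming a vowel group is any run of consecutive vowel letters and that 'y' always counts as a vowel (both are simplifying assumptions, and the function names are made up for illustration). It only counts the groups; deciding whether a group forms one syllable or more is left open.

```cpp
#include <cctype>
#include <string>

// Letters treated as vowels in this sketch. Always counting 'y' is an
// assumption that is wrong for words such as "yes".
bool is_vowel(char c) {
    switch (std::tolower(static_cast<unsigned char>(c))) {
        case 'a': case 'e': case 'i': case 'o': case 'u': case 'y':
            return true;
        default:
            return false;
    }
}

// Counts maximal runs of consecutive vowel letters in a single word.
int count_vowel_groups(const std::string& word) {
    int groups = 0;
    bool in_group = false;
    for (char c : word) {
        const bool v = is_vowel(c);
        if (v && !in_group) ++groups;  // a new run starts here
        in_group = v;
    }
    return groups;
}
```

The group count is only a first guess at the syllable count: "cooperation" has four vowel groups by this definition but five syllables, because "oo" is pronounced as two separate vowels.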


Sam  
Newsgroups: comp.lang.c++
From: Sam <s...@email-scan.com>
Date: Wed, 17 Mar 2010 21:00:18 -0500
Local: Thurs, Mar 18 2010 5:00 am
Subject: Re: C++ code for parsing syllables?

m...@privacy.net writes:
> I'm pulling my hair out trying to figure out code for
> parsing and counting syllables in simple English
> sentences.

> Can someone throw the dog a bone on where to start?

You can start by defining the actual algorithm you want to use. Then,
proceed with implementing it in C++.

If you have no idea what kind of an algorithm you need, then you need to
look elsewhere. Try asking in alt.english.usage, perhaps.

Daniel T.  
Newsgroups: comp.lang.c++
From: "Daniel T." <danie...@earthlink.net>
Date: Wed, 17 Mar 2010 23:08:51 -0400
Local: Thurs, Mar 18 2010 6:08 am
Subject: Re: C++ code for parsing syllables?

m...@privacy.net wrote:
> I'm pulling my hair out trying to figure out code for
> parsing and counting syllables in simple English
> sentences.

> Can someone throw the dog a bone on where to start?

Google is your friend:
http://english.glendale.cc.ca.us/phonics.rules.html

Now all you have to do is code it up, and worry about all the
exceptions. The exceptions remind me of a joke by Emo Philips.

   Most states do not end in the letter "a." The only ones that do are
   Alabama, Georgia, Florida, Louisiana, Oklahoma, Arizona, California,
   Nevada, Alaska, Montana, Nebraska, South Dakota, North Dakota,
   Minnesota, Iowa, Indiana, Pennsylvania, North Carolina, South
   Carolina, West Virginia, east Virginia, and Missouri.


Andy Champ  
Newsgroups: comp.lang.c++
From: Andy Champ <no....@nospam.invalid>
Date: Thu, 18 Mar 2010 21:19:53 +0000
Local: Fri, Mar 19 2010 12:19 am
Subject: Re: C++ code for parsing syllables?
Daniel T. wrote:
> m...@privacy.net wrote:

>> I'm pulling my hair out trying to figure out code for
>> parsing and counting syllables in simple English
>> sentences.

>> Can someone throw the dog a bone on where to start?

> Google is your friend:
> http://english.glendale.cc.ca.us/phonics.rules.html

<snip>

Pay special attention to rule 1.

The rhythm can be foretold by looking at where the vowels are, right?
So "rhythm" has ... err... two syllables, because it's split by the Y
which counts as a vowel, whereas "foretold" obviously has three
syllables, centred around the three vowels.  Or is that centered?

This web site

http://www.wordcalc.com/

seems to do what you want.  Except... "The rhythm of life" contains two
syllables.  Half a syllable per word.

Good luck.  This is a hard problem.

Andy


Kai-Uwe Bux  
Newsgroups: comp.lang.c++
Followup-To: comp.lang.c++
From: Kai-Uwe Bux <jkherci...@gmx.net>
Date: Thu, 18 Mar 2010 23:52:06 +0100
Local: Fri, Mar 19 2010 1:52 am
Subject: Re: C++ code for parsing syllables?

Maybe from a linguistic point of view, it is hard. But algorithmically, it
seems somewhat easy: English has about 1,000,000 words (with very inclusive
counting) and the number of syllables in each of them is known. So just do a
table look-up. This algorithm also has the advantage of being applicable to
any language (and it will be easier as English has a huge vocabulary).

It's a finite problem and in fact smaller than, say, the problem of finding
phone numbers based on name and address. The interesting part would be to
use frequency information about words to make the look-up fast; or to find a
good data structure to reduce memory consumption.

Of course, there is the issue of words being added to the language. However,
a rule based algorithm should not be expected to cope with the new words
either: its rules are just designed to deal with the known words.
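
A rough sketch of the table look-up, assuming the word list is already available as (word, syllable count) pairs. A sorted vector with binary search is just one possible memory-lean layout, not necessarily the best one; `Entry`, `make_table` and `lookup` are names invented for this example.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, int>;  // word, syllable count

// Sorts the entries once so that later look-ups can use binary search.
std::vector<Entry> make_table(std::vector<Entry> entries) {
    std::sort(entries.begin(), entries.end());
    return entries;
}

// Returns the stored syllable count, or -1 if the word is not in the table.
int lookup(const std::vector<Entry>& table, const std::string& word) {
    auto it = std::lower_bound(
        table.begin(), table.end(), word,
        [](const Entry& e, const std::string& w) { return e.first < w; });
    if (it != table.end() && it->first == word) return it->second;
    return -1;
}
```

A sorted vector stores the entries contiguously with no per-node overhead, at the cost of O(log n) look-ups and expensive insertions.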

Best

Kai-Uwe Bux


Daniel Pitts  
Newsgroups: comp.lang.c++
From: Daniel Pitts <newsgroup.spamfil...@virtualinfinity.net>
Date: Thu, 18 Mar 2010 16:33:55 -0700
Local: Fri, Mar 19 2010 2:33 am
Subject: Re: C++ code for parsing syllables?
On 3/18/2010 3:52 PM, Kai-Uwe Bux wrote:

How about a hash-map for both of those.

Actually, with only 1 million words, the entirety of the data structure
can easily fit in memory on even the cheapest of today's desktop/server
machines (mobile/embedded are a different story).  Making look up
extremely fast.
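
Something along these lines, as a sketch only: it assumes a hypothetical text file with one "word count" pair per line, and counts a sentence by summing the table entries for its words. Words missing from the table are reported rather than guessed. All names here are invented for illustration.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

using SyllableTable = std::unordered_map<std::string, int>;

// Reads "word count" pairs separated by whitespace (assumed file format).
SyllableTable load_table(std::istream& in) {
    SyllableTable table;
    std::string word;
    int count = 0;
    while (in >> word >> count) table[word] = count;
    return table;
}

// Sums the syllables of every word found in the table. Words that are not
// in the table are tallied in `unknown` instead of being guessed.
int count_sentence(const SyllableTable& table, const std::string& sentence,
                   int& unknown) {
    std::istringstream words(sentence);
    std::string token;
    int total = 0;
    unknown = 0;
    while (words >> token) {
        std::string key;  // lower-cased token with non-letters stripped
        for (char c : token) {
            const unsigned char u = static_cast<unsigned char>(c);
            if (std::isalpha(u)) key.push_back(static_cast<char>(std::tolower(u)));
        }
        if (key.empty()) continue;
        auto it = table.find(key);
        if (it != table.end()) total += it->second;
        else ++unknown;
    }
    return total;
}
```

With a table containing the 1, rhythm 2, of 1 and life 1, the sentence "The rhythm of life." comes out as 5 syllables with no unknown words.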

--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>


m...@privacy.net  
Newsgroups: comp.lang.c++
From: m...@privacy.net
Date: Fri, 19 Mar 2010 11:07:36 -0500
Local: Fri, Mar 19 2010 7:07 pm
Subject: Re: C++ code for parsing syllables?

"Daniel T." <danie...@earthlink.net> wrote:
> The exceptions remind me of a joke by Emo Philips.

>   Most states do not end in the letter "a." The only ones that do are
>   Alabama, Georgia, Florida, Louisiana, Oklahoma, Arizona, California,
>   Nevada, Alaska, Montana, Nebraska, South Dakota, North Dakota,
>   Minnesota, Iowa, Indiana, Pennsylvania, North Carolina, South
>   Carolina, West Virginia, east Virginia, and Missouri.

That's funny!

I live in MissourA as well!!


James Kanze  
Newsgroups: comp.lang.c++
From: James Kanze <james.ka...@gmail.com>
Date: Fri, 19 Mar 2010 12:35:30 -0700 (PDT)
Local: Fri, Mar 19 2010 10:35 pm
Subject: Re: C++ code for parsing syllables?
On Mar 18, 9:19 pm, Andy Champ <no....@nospam.invalid> wrote:

> Daniel T. wrote:
> > m...@privacy.net wrote:
> >> I'm pulling my hair out trying to figure out code for
> >> parsing and counting syllables in simple English
> >> sentences.
> >> Can someone throw the dog a bone on where to start?
> > Google is your friend:
> > http://english.glendale.cc.ca.us/phonics.rules.html
> <snip>
> Pay special attention to rule 1.
> The rhythm can be foretold by looking at where the vowels are,
> right?  So "rhythm" has ... err... two syllables, because it's
> split by the Y which counts as a vowel,

The y is the only possible vowel, so rhythm can't have more than
one syllable.  Except that as I hear it (and according to
dictionaries), it has two: in this case, the m acts as a
syllable.

> whereas "foretold" obviously has three syllables, centred
> around the three vowels.
> Or is that centered?

Rule 7 and the second point under 1 in the Basic Syllable Rules
do imply that silent e's don't count:-).  (Of course, they don't
give any hint as to how a program is to determine whether an e
is silent or not.)
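
To make the difficulty concrete, here is a rough sketch of one possible heuristic, which is my own assumption and not taken from the linked rules: count vowel groups, then drop a trailing 'e' that follows a consonant, unless the word ends in consonant + "le" (as in "table"). It does nothing about a silent e inside a word.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

char lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// 'y' is always treated as a vowel here, which is a simplification.
bool is_vowel(char c) {
    switch (lower(c)) {
        case 'a': case 'e': case 'i': case 'o': case 'u': case 'y':
            return true;
        default:
            return false;
    }
}

// Vowel-group count with a guess at a silent trailing 'e'.
int count_syllables_heuristic(const std::string& word) {
    int groups = 0;
    bool in_group = false;
    for (char c : word) {
        const bool v = is_vowel(c);
        if (v && !in_group) ++groups;
        in_group = v;
    }
    const std::size_t n = word.size();
    // A final 'e' after a consonant is assumed silent ("life"), except in
    // consonant + "le" endings ("table"), and never if it is the only vowel.
    if (groups > 1 && n >= 2 && lower(word[n - 1]) == 'e' && !is_vowel(word[n - 2])) {
        const bool consonant_le =
            n >= 3 && lower(word[n - 2]) == 'l' && !is_vowel(word[n - 3]);
        if (!consonant_le) --groups;
    }
    return groups;
}
```

This gives 1 for "life" and 2 for "table", but still reports 3 for "foretold" and 1 for "rhythm", so it illustrates the problem more than it solves it.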

> This web site
> http://www.wordcalc.com/
> seems to do what you want.  Except... "The rhythm of life"
> contains two syllables.  Half a syllable per word.
> Good luck.  This is a hard problem.

To put it mildly.  Compare "cooper" with the beginning of
"cooperation".

And that's without internationalization: the rules will be
distinctly different in French or in German than in English.

For starters, you'll probably want to see
http://tug.org/docs/liang/.  To my knowledge, no one has done
better since (and it works for all, or at least most languages,
with a simple replacement of machine generated tables).
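
A compact sketch of the pattern-matching idea as I understand it, under several assumptions: patterns are letters with digits marking inter-letter positions, the word is wrapped in '.' markers, the highest digit wins at each position, and odd values allow a break. The real tables are machine generated and far larger; `Pattern`, `parse_pattern` and `hyphenate` are illustrative names only.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Pattern {
    std::string letters;      // the pattern with its digits stripped
    std::vector<int> scores;  // scores[i] applies to the gap before letters[i]
};

// Splits a raw pattern such as "hy3ph" into its letters and gap scores.
Pattern parse_pattern(const std::string& raw) {
    Pattern p;
    p.scores.push_back(0);
    for (char c : raw) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            p.scores.back() = c - '0';
        } else {
            p.letters.push_back(c);
            p.scores.push_back(0);
        }
    }
    return p;
}

// Inserts '-' at every gap whose highest matching score is odd.
std::string hyphenate(const std::string& word, const std::vector<Pattern>& patterns) {
    const std::string dotted = "." + word + ".";
    std::vector<int> gap(dotted.size() + 1, 0);  // gap[j] sits before dotted[j]
    for (const Pattern& p : patterns) {
        for (std::size_t pos = dotted.find(p.letters); pos != std::string::npos;
             pos = dotted.find(p.letters, pos + 1)) {
            for (std::size_t i = 0; i < p.scores.size(); ++i)
                gap[pos + i] = std::max(gap[pos + i], p.scores[i]);
        }
    }
    std::string out;
    for (std::size_t k = 0; k < word.size(); ++k) {
        if (k > 0 && gap[k + 1] % 2 == 1) out.push_back('-');  // word[k] is dotted[k + 1]
        out.push_back(word[k]);
    }
    return out;
}
```

With the nine toy patterns hy3ph, he2n, hena4, hen5at, 1na, n2at, 1tio, 2io and o2n, hyphenate("hyphenation", ...) yields "hy-phen-ation". Note that hyphenation points are chosen conservatively, so they are not the same thing as syllable boundaries: that result has three pieces, while the word has four syllables.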

--
James Kanze


Andy Champ  
Newsgroups: comp.lang.c++
From: Andy Champ <no....@nospam.invalid>
Date: Sat, 20 Mar 2010 15:26:09 +0000
Local: Sat, Mar 20 2010 6:26 pm
Subject: Re: C++ code for parsing syllables?

Daniel Pitts wrote:
> How about a hash-map for both of those.

> Actually, with only 1 million words, the entirety of the data structure
> can easily fit in memory on even the cheapest of today's desktop/server
> machines (mobile/embedded are a different story).  Making look up
> extremely fast.

Thinking about it even a lookup won't fix the problem.  There are a few
words where the spelling is the same, and the pronunciation is
different.  So you'll have to perform a linguistic analysis.

e.g.:

moped (as in sulked) 1 syllable; (as in small motorcycle) 2 syllables.
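
One way a table could at least represent such words, sketched under the assumption that each spelling maps to a list of possible syllable counts: a look-up then returns a range instead of a single number, and choosing within the range still needs the linguistic analysis. The names are invented for illustration.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Each spelling maps to every syllable count it can have.
using Readings = std::unordered_map<std::string, std::vector<int>>;

// Returns {min, max} syllables for a spelling, or {-1, -1} if it is unknown.
std::pair<int, int> syllable_range(const Readings& table, const std::string& word) {
    auto it = table.find(word);
    if (it == table.end() || it->second.empty()) return std::make_pair(-1, -1);
    auto mm = std::minmax_element(it->second.begin(), it->second.end());
    return std::make_pair(*mm.first, *mm.second);
}
```

For a table in which "moped" maps to {1, 2}, the result is the pair (1, 2); for an unambiguous word both ends of the range are equal.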

I think we've lost the C++ content now!

Andy


Pete Becker  
Newsgroups: comp.lang.c++
From: Pete Becker <p...@versatilecoding.com>
Date: Sat, 20 Mar 2010 15:59:11 -0400
Local: Sat, Mar 20 2010 10:59 pm
Subject: Re: C++ code for parsing syllables?

See footnote 274 ([lib.streambuf.virt.get]) in the current standard.

--
   Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

