CPC G10L 15/1815 (2013.01) [G06F 16/951 (2019.01); G10L 15/22 (2013.01); G10L 21/06 (2013.01); G10L 2015/223 (2013.01); G10L 2015/227 (2013.01); G10L 2015/228 (2013.01); H04M 2250/74 (2013.01)]
AS A RESULT OF REEXAMINATION, IT HAS BEEN DETERMINED THAT:
Claims 1 and 3-7 are cancelled.
New claims 9-14 are added and determined to be patentable.
Claims 2 and 8 were not reexamined.
[ 9. The system of claim 1, wherein the system further comprises a plurality of servers distributed over public and private networks, the plurality of servers capable of communicating with a plurality of speech-based units and with a plurality of mobile phones, wherein the command or the request is directed to a payment service agent; and wherein the payment service agent is configured to provide all of the following voice services:
a secure payment wallet;
remote ordering of products by using each of user location information, one or more user preferences, and a user order history;
providing a first option to secure payments with a personal identification number; and
providing a second option to secure payments with a voiceprint.]
[ 10. The system of claim 1, wherein the system further comprises a plurality of servers distributed over public and private networks, the plurality of servers configured to communicate with a plurality of mobile phones, and wherein the processors in the system are further caused to:
receive natural language utterances from the plurality of mobile phones via both local and wide-area wireless networks; and
provide the determined command or request to a payment service agent that is configured to provide all of the following voice services in connection with credit and debit accounts owned by users of the plurality of mobile phones:
payment services;
account informational services;
transaction history services; and
account balance services.]
[ 11. The system of claim 10, wherein the identifying, based on the one or more rank scores, of the one or more context entries from among the plurality of context entries further comprises:
utilizing a personalized cognitive model corresponding to a user who uttered the natural language utterance, wherein the system is configured to maintain a plurality of personalized cognitive models for a plurality of users, track one or more determined actions of the plurality of users, and update each of the plurality of personalized cognitive models based on the tracking;
utilizing a general cognitive model, wherein the general cognitive model corresponds to one or more interaction patterns for the plurality of users with the system;
utilizing an environmental model, wherein the environmental model comprises data reflecting the location of the user who uttered the natural language utterance, wherein such data was determined by a global positioning system component within a mobile phone associated with the user who uttered the natural language utterance; and
generating, based on the utilization of the personalized cognitive model, the general cognitive model, and the environmental model, a request directed to an agent, wherein the request comprises first data comprising credit card account information, second data comprising a textual transcription of the natural language utterance, and third data comprising non-speech information in a text-based format.]
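The ranking step recited in claim 11 can be illustrated with a short sketch. This is a hypothetical rendering for exposition only: the data structures, weight tables, and the multiplicative scoring rule are assumptions, not the patented implementation. The personalized and general cognitive models are modeled here as per-topic weight dictionaries, and the environmental model as a GPS-derived location boost.

```python
# Hypothetical sketch of the claim-11 ranking step. All names, weights,
# and the scoring rule are illustrative assumptions, not from the patent.
from dataclasses import dataclass

@dataclass
class ContextEntry:
    topic: str
    keywords: set

def rank_with_models(entries, words, personal_weights, general_weights, location):
    """Score each context entry by word overlap, then adjust the score with
    a per-user weight (personalized cognitive model), a population-wide
    weight (general cognitive model), and a location boost (environmental
    model, e.g. from a GPS component in the user's mobile phone)."""
    scored = []
    for entry in entries:
        base = len(entry.keywords & set(words))            # word-overlap score
        personal = personal_weights.get(entry.topic, 1.0)  # personalized cognitive model
        general = general_weights.get(entry.topic, 1.0)    # general cognitive model
        env = 1.5 if location in entry.keywords else 1.0   # environmental model
        scored.append((base * personal * general * env, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Return only entries with a nonzero rank score, best first.
    return [entry for score, entry in scored if score > 0]
```

Under these assumptions, an utterance whose words overlap a payment-related context entry, uttered by a user whose personalized model favors payments and whose location matches the entry, would rank that entry first; the resulting request to the agent would then bundle account data, the transcription, and non-speech information as separate fields.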
[ 12. A system for processing a natural language utterance, the system including one or more processors executing one or more computer program modules which, when executed, cause the one or more processors to:
generate a context stack comprising context information that corresponds to a plurality of prior utterances, wherein the context stack includes a plurality of context entries;
receive the natural language utterance, wherein the natural language utterance is associated with a command or is associated with a request;
determine one or more words of the natural language utterance by performing speech recognition on the natural language utterance;
identify, from among the plurality of context entries, one or more context entries that correspond to the one or more words, wherein the context information includes the one or more context entries, wherein identifying the one or more context entries comprises:
comparing the plurality of context entries to the one or more words;
generating, based on the comparison, one or more rank scores for individual context entries of the plurality of context entries; and
identifying, based on the one or more rank scores, the one or more context entries from among the plurality of context entries; and
determine, based on the determined one or more words and the context information, the command or the request associated with the natural language utterance,
wherein the system further comprises a plurality of servers distributed over public and private networks, the plurality of servers configured to communicate with a plurality of mobile phones, and wherein the processors in the system are further caused to:
receive the natural language utterance from at least one of the plurality of mobile phones via both local and wide-area wireless networks;
send the command or the request associated with the natural language utterance to an agent for processing;
provide, to the mobile phone from which the natural language utterance was received, a response to the command or the request received from the agent; and
receive an update to the context stack from the agent;
wherein the processors are further caused to:
determine, in the speech recognition, that one or more components of the natural language utterance are unrecognized;
generate, in response to the determination that one or more components of the natural language utterance are unrecognized, an unrecognized event; and
prompt, in response to the unrecognized event and via the mobile phone from which the natural language utterance was received, a user who uttered the natural language utterance to provide a rephrased natural language utterance; and
wherein the processors are further caused to:
determine, in response to the determination that one or more components of the natural language utterance are unrecognized, one or more tuning parameters for performing the speech recognition, wherein the one or more tuning parameters are personalized to the user who uttered the natural language utterance.]
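The overall flow recited in claim 12 can be sketched as follows. This is a minimal illustration under stated assumptions: the context stack is modeled as a list of keyword-tagged entries, the rank score as simple word overlap, and the "unrecognized" condition as a confidence threshold; none of these specifics appear in the claim itself.

```python
# Illustrative sketch of the claim-12 flow: compare context entries to the
# recognized words, rank them, and either determine the command or raise an
# unrecognized event that re-prompts the user. The scoring rule and the 0.5
# confidence threshold are assumptions for exposition.

def identify_context(context_stack, words):
    """Compare each context entry's keywords to the recognized words,
    generate a rank score per entry, and return the top-ranked matches."""
    scores = [(len(set(entry["keywords"]) & set(words)), entry)
              for entry in context_stack]
    best = max(score for score, _ in scores) if scores else 0
    return [entry for score, entry in scores if score == best and score > 0]

def process_utterance(context_stack, recognized_words, confidence):
    """Determine the command or request, or generate an 'unrecognized'
    event that the caller turns into a rephrase prompt on the mobile
    phone from which the utterance was received."""
    if confidence < 0.5:  # one or more components unrecognized
        return {"event": "unrecognized",
                "prompt": "Sorry, could you rephrase that?"}
    matches = identify_context(context_stack, recognized_words)
    command = {"words": recognized_words,
               "context": [m["topic"] for m in matches]}
    return {"event": "command", "command": command}
```

In a fuller system along the claim's lines, the resulting command would be sent to an agent, the agent's response relayed back to the originating phone, a context-stack update received from the agent, and, after an unrecognized event, personalized tuning parameters applied before re-running speech recognition.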
[ 13. The system of claim 1, wherein the processors are further caused to:
provide a first option to store a voice-annotated digital photograph in a first memory of a computer server;
provide a second option to store the voice-annotated digital photograph in a second memory of a mobile phone; and
provide a third option to store the voice-annotated digital photograph in the first memory of the computer server and the second memory of the mobile phone.]
[ 14. The system of claim 13, wherein the processors are further caused to:
establish a work group comprising a shared workspace; and
share the voice-annotated digital photograph with a plurality of mobile devices associated with the work group.]