US7069220B2 - Method for determining and maintaining dialog focus in a conversational speech system - Google Patents

Method for determining and maintaining dialog focus in a conversational speech system

Publication number
US7069220B2
Authority
US
United States
Prior art keywords
events
modal
command
recited
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime, expires
Application number
US09/374,374
Other versions
US20030014260A1 (en
Inventor
Daniel M. Coffman
Popani Gopalakrishnan
Ganesh N. Ramaswamy
Jan Kleindienst
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nuance Communications Inc
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US09/374,374
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: COFFMAN, DANIEL L., GOPALAKRISHNAN, POPANI, KLEINDIENST, JAN, RAMASWAMY, GANESH N.
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION CORRECTIVE ASSIGNMENT TO CORRECT ASSIGNOR ON A PREVIOUS RECORDING AT REEL 010276, FRAME 0943 Assignors: COFFMAN, DANIEL M., GOPALAKRISHNAN, POPANI, KLEINDIENST, JAN, RAMASAWMY, GANESH
Priority to GB0019658A (patent GB2357364B)
Publication of US20030014260A1
Application granted
Publication of US7069220B2
Assigned to NUANCE COMMUNICATIONS, INC. reassignment NUANCE COMMUNICATIONS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION
Adjusted expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • G10L2015/223 Execution procedure of a spoken command
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Definitions

  • the present invention relates to dialog systems, and more particularly to management of a dialog within a conversational computer system with multiple input modalities.
  • Conversational systems typically focus on the interaction with a single application at a time.
  • a speaker for a conversational system is only permitted to interact with the active application.
  • This type of interaction is generally referred to as modal interaction or a modal system. That is, the user must specify which application he intends to use, and must finish working with that application before using another. This is disadvantageous in many situations where several applications may be needed or desired to be accessed simultaneously. Further, the conventional modal systems may result in loss of efficiency and time. In many instances, this leads to reduced profitability.
  • a first task must be performed and closed prior to opening a second task and performing the second task.
  • Conventional conversational modal systems are not capable of distinguishing tasks between applications. However, this is not how everyday tasks are generally performed. In an office setting, for example, a worker might begin writing a letter, stop for a moment and place a telephone call, then finish the letter. The conventional modal systems do not provide this flexibility.
  • a method of the present invention which may be implemented with a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for determining and maintaining dialog focus in a conversational speech system, includes presenting a command associated with an application to a dialog manager.
  • the application associated with the command is unknown to the dialog manager at the time it is made.
  • the dialog manager determines a current context of the command by reviewing a multi-modal history of events.
  • At least one method is determined responsive to the command based on the current context.
  • the at least one method is executed responsive to the command associated with the application.
  • the step of presenting a command may include the step of employing at least one multi-modal device for presenting the command.
  • the at least one multi-modal device for presenting the command may include a telephone, a computer, and/or a personal digital assistant (other devices may also be employed).
  • the step of determining a current context of the command by reviewing a multi-modal history of events may include the step of providing a linked list of all events in the multi-modal history.
  • the events in the multi-modal history may include at least one of events linked by time, by type, by transaction, by class and by dialog focus.
  • the step of determining at least one method may include the step of referencing all active applications using a component control to determine the at least one method which is appropriate based on the current context of the command.
  • the command may be presented in a formal language such that a plurality of human utterances represent an action to be taken.
  • the step of determining a current context of the command by reviewing a multi-modal history of events may include the step of maintaining a current dialog focus and a list of expected responses in the dialog manager to provide a reference for determining the current context.
  • the step of querying a user for information needed to resolve the current context and/or information needed to take an appropriate action may also be included.
  • a system for determining and maintaining dialog focus in a conversational speech system includes a dialog manager adapted to receive commands from a user.
  • the dialog manager maintains a current dialog focus and a list of expected responses for determining a current context of the commands received.
  • a multi-modal history is coupled to the dialog manager for maintaining an event list of all events which affected a state of the system.
  • the multi-modal history is adapted to provide input to the dialog manager for determining the current context of the commands received.
  • a control component is adapted to select at least one method responsive to the commands received such that the system applies methods responsive to the commands for an appropriate application.
  • the appropriate application may include an active application, an inactive application, an application with a graphical component and/or an application with other than a graphical component.
  • the commands may be input to the dialog manager by a telephone, a computer, and/or a personal digital assistant.
  • the multi-modal history may include a linked list of all events to associate a given command to the appropriate application.
  • the events in the multi-modal history may include at least one of events linked by time, by type, by transaction, by class and by dialog focus.
  • the control component preferably references all active applications to determine the at least one method which is appropriate based on the current context of the commands.
  • the command is preferably presented in a formal language such that a plurality of human utterances represent an action to be taken.
  • FIG. 1 is a schematic diagram of a conversational system in accordance with the present invention
  • FIG. 2 illustratively depicts a multi-modal history in accordance with the present invention
  • FIG. 3 illustratively depicts a dialog manager in accordance with the invention.
  • FIG. 4 is a block/flow diagram of a system/method for determining and maintaining dialog focus in a conversational speech system in accordance with the present invention.
  • the present invention relates to the management of multiple applications and input modalities through a conversational system.
  • the conversational system manipulates information from applications, presents this to a user, and converses with the user when some aspects of this manipulation are ambiguous.
  • the present invention provides for many applications to be active at any time and for the system itself to deduce the intended object of a user's action.
  • the invention provides a method for determining dialog focus in a conversational speech system with multiple modes of user input and multiple backend applications.
  • the invention permits interaction with desktop applications which are not the subject of current graphical focus, or which do not even have a visual component.
  • the methods provided by the invention achieve this focus resolution through an examination of the context of the user's command.
  • the command may be entered through any one of the several input modalities, examples of which include a spoken input, a keyboard input, a mouse input, etc.
  • a detailed history is maintained of the commands the user has previously performed. The final resolution proceeds through knowledge of any application specific aspects of the command, where the command is made from (i.e., from a telephone, computer, etc.) and an investigation of this history.
  • FIGS. 1–4 may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements are implemented in software on one or more appropriately programmed general purpose digital computers having a processor and memory and input/output interfaces.
  • Referring to FIG. 1 , a block/flow diagram is shown for a system/method for the implementation of dialog management for a multiple client conversational system 8 in accordance with the present invention.
  • client devices such as a personal computer (PC), telephone, or personal digital assistant (PDA) (or other devices) may all be used as clients.
  • the architecture for multi-client access is described in commonly assigned U.S. application Ser. No. 09/374,026 entitled “METHOD AND SYSTEM FOR MULTI-CLIENT ACCESS TO A DIALOG SYSTEM,” filed concurrently herewith and incorporated herein by reference.
  • the PC may have a keyboard, mouse, and microphone; the telephone may have a microphone and numeric keypad; the PDA may have a stylus.
  • any of these devices may be used to initiate a new command to the system 8 or to respond to a query from the system 8 .
  • the conversational system 8 further supports the use of any application the user desires.
  • an electronic mail (e-mail) application might be active simultaneously with a calendar application and a spreadsheet application. The application the user interacts with need not be explicitly selected.
  • this application need not be in the foreground or graphical focus, or indeed even visible.
  • the intended action is clear. If the user pushes a button on the PC with his mouse, for example, the user's intention is obvious because of the constraints placed on the user by the application's design. The button can only perform one action. Similar constraints apply for the PDA's stylus and the numeric keypad of the telephone. However, a spoken interface presents no such constraints.
  • a user communicates with a spoken interface in much the same way the user would with a human.
  • the user describes actions much more complex than those possible with an input device such as a mouse.
  • the user also is able to speak in a natural manner with the system deciding what the user intends, carrying out this action, if possible, and prompting the user if more information is needed.
  • An intended target of a spoken command may not be at all obvious.
  • each application may be capable of responding to the same spoken command.
  • the target is determined dynamically, on an utterance-by-utterance basis.
  • the target may be one of the active applications if the utterance represents a command, but if it represents the response to a query from the system itself for more information, the target will be the pending action which generated the query.
  • a concept related to, but distinct from, the target is that of dialog focus. This is the application with which the user is currently interacting. As such it represents the best hypothesis of the target of a command.
  • the application with dialog focus is usually examined first to determine whether it can accept the command.
  • This dialog focus may be implicitly or deliberately changed. If the user launches a new application, it will be granted dialog focus in the assumption that the user wishes to interact with the new application. The user may also request to bring a different application into the foreground and it will then be granted dialog focus.
  • a multi-modal system permits user input through a variety of modalities. In many cases, a spoken command will be superior, but there are certainly cases where, for example, a single mouse click may be more efficient or more to the user's liking. These non-speech inputs often change the context of the system, and the conversational system should be made aware of this. If, for example, the user starts a new application by using his mouse, the conversational system should know this to direct spoken commands to the new application. To this end, this invention presents a mechanism for capturing and maintaining a complete history of all events concerning the system 8 , i.e., speech or non-speech events, the result of user input or of system output.
  • a multi-modal history 16 is created in accordance with the invention. This multi-modal history 16 plays a role in deducing a target 18 of spoken commands.
  • FIG. 1 shows those components of the conversational system 8 used to determine the target 18 of a spoken command or response (block 12 ).
  • This command or response 12 is presented to a dialog manager 14 for processing.
  • what is given to the dialog manager 14 is not the actual spoken command, but rather an element of a formal language representing the meaning of the command 12 .
  • the dialog manager 14 may be capable of handling direct human utterances, for example, by including a speech recognition system.
  • the dialog manager 14 examines the formal language, extracts the command, and locates a corresponding method. In one embodiment of the present invention, these methods are implemented using independent decision networks, as described in commonly assigned U.S. application Ser. No. 09/374,744 entitled “METHOD AND SYSTEM FOR MODELESS OPERATION OF A MULTI-MODAL USER INTERFACE THROUGH IMPLEMENTATION OF INDEPENDENT DECISION NETWORKS,” filed concurrently herewith and incorporated herein by reference.
  • the determination of the correct target 18 proceeds through examination of the nature of the command and the current context of the system 8 . This context may be obtained from the multi-modal history 16 .
  • a component control 20 acts as a “switch yard”. Component control 20 maintains a reference to all currently active applications. Component control 20 is described in greater detail in “METHOD AND SYSTEM FOR MULTI-CLIENT ACCESS TO A DIALOG SYSTEM,” previously incorporated by reference.
  • the target 18 determined by the dialog manager 14 is of an abstract nature. That is, the target 18 refers to a type of application, not its implementation.
  • the dialog manager 14 may, for example, determine that the target 18 is a calendar component, but it has no knowledge of which particular application implements a calendar. This degree of abstraction permits a suite of applications currently active to be modified dynamically, at the user's discretion, with no modification to the dialog manager 14 needed.
  • the multi-modal history 16 is illustratively presented in greater detail.
  • the multi-modal history 16 is a list of all events which have influenced the state of the system 8 as a whole, and the system's response to those events.
  • the entries in the history 16 may be of several types. These may include user input of all types including both speech and non-speech inputs, responses from the system including results of queries, and prompts for more information, all changes of dialog focus and a descriptor of all successfully completed actions.
  • the multi-modal history 16 relies upon a linked list 22 . All events 24 concerning the system 8 as a whole are maintained in the order received, but the history makes use of additional forward and backward links 26 .
  • the events 24 are linked by time, event type, transaction identifier, and event class.
  • the event types included for this invention are “SET_DIALOG_FOCUS”, “GUI_ACTION”, and “COMPLETED_ACTION”.
  • the event type “SET_DIALOG_FOCUS” is an indication that dialog focus has been changed, either automatically by the system 8 or deliberately by the user.
  • the event type “GUI_ACTION” indicates that the user has performed some action upon the graphical interface, and the nature of the action is maintained as part of the event.
  • the event list 22 includes a complete history of all steps taken to complete the action, including any elements resolved in the course of the execution. Several steps may be taken during the completion of one action. All of the events generated as a result, share one unique transaction identifier. In the current embodiment, this transaction identifier is derived from the system clock time and date. As events within the history are linked also by this transaction identifier, all events pertaining to a particular action may be removed easily when they are no longer needed or relevant.
  • All events within the history 16 belong to one of several classes. Some examples are “OPEN”, “DELETE”, and “CHECK”. An event belongs to the “OPEN” class when it describes the action of opening an object, such as, for example, a mail message, a calendar entry or an address book entry. All events 24 in the history 16 are also linked by an event type 28 .
  • the numerous links within the history 16 permit efficient searches. If, for example, a request is made for an event of class “OPEN”, a link manager 15 ( FIG. 1 ) in the history 16 will return the most recent event of this type. If this is not the correct event, the previous link by class 30 of the event will provide a reference to the previous event of class “OPEN”. These two events may have been widely separated in time. This process may be repeated until the correct event is located.
  • the dialog manager 14 maintains a reference to a current dialog focus 32 . This is updated each time the dialog focus changes.
  • the dialog manager 14 also maintains a list of expected responses 34 . Each time the system 8 poses a question to the user, the method implementing the action being performed registers its expected response or responses with the dialog manager 14 . In the present implementation, this registration is performed by a decision network.
  • the list of expected responses 34 is implemented as a linked list 35 , much like the multi-modal history 16 .
  • the elements are linked by time 36 , command 38 and requester 40 .
  • the function of this list 35 is easily illustrated through an example. If a method executing a command poses the question “Do you mean Steve Krantz, or Steve Bradner?” to the user, the method expects a response of the form “I mean Steve Bradner” or simply “Steve Bradner”.
  • the method will register the two possible responses with the dialog manager 14 with the commands being “select_object” and “set_object”.
  • each entry will include a field indicating the acceptable argument type is name.
  • the dialog manager 14 decides which of these is correct. In the present implementation, the dialog manager 14 merely uses the most recent requester; however, in a different implementation, the dialog manager 14 could pose a query to the user for clarification.
  • the several components are used in various ways to resolve the intended target 18 depending on the nature of the command.
  • the target is clear from the command itself. If the user were to ask “Do I have anything scheduled for next Monday?” the intended target is clearly a calendar component and no further resolution is necessary.
  • the current dialog focus maintained within the dialog manager is the intended target. If the user says “Change the subject to ‘proposal,’” the user is clearly referring to the application with dialog focus. In such cases, the target 18 is taken to be the current dialog focus 32 , and the formal language statement is dispatched accordingly.
  • Certain commands are extremely ambiguous and are permitted in a conversational system to substantially enhance the quality of the interaction.
  • the user can say, for example, “Close that” and the system must react correctly. However, such an utterance includes no information at all about the intended target. This target is resolved by examining the multi-modal history 16 . In this particular example, the most recent event of type “COMPLETED_ACTION” and class “OPEN” would be fetched from the history 16 . Such an event includes the target 18 of the original command. The target of the new command is taken to be the same as that of the original command and is forwarded to the original target. Hence, if the user says “Close that” the object most recently opened will be closed, be it a calendar entry, spreadsheet cell or other type of object.
  • a further use of the history 16 is made when utterances such as “Undo that” or “Do that again” are received.
  • the most recent event of type “COMPLETED_ACTION” is retrieved from the multi-modal history. Additional fields of such events indicate whether the action can be undone or repeated.
  • the original command is extracted from the “COMPLETED_ACTION” event and, if possible as indicated by these fields, undone or repeated as appropriate.
  • a special case is that of canceling an already proceeding action.
  • the target of the formal language is the method performing this action itself.
  • the most recent event of type “DIALOG_FOCUS,” with the owner of the focus being a method, is fetched from the multi-modal history.
  • the formal language is delivered to the method which will then cease executing its action. Subsequently, all events in the multi-modal history 16 bearing the transaction identifier of this now canceled method are purged from the history 16 .
  • a command associated with an application is presented to a dialog manager.
  • the command may be in a formal language or be a direct utterance.
  • the command or response may be input to the dialog manager by a user from any of a plurality of multi-modal devices, for example, a computer, a personal digital assistant, or a telephone.
  • the application associated with the command is unknown to the dialog manager at the time the command is made, and therefore, the application which the command is intended for should first be deduced.
  • the dialog manager determines a current context of the command by reviewing a multi-modal history of events.
  • the current context of the command is ascertained by reviewing a multi-modal history of events which preferably includes a linked list of all events in the multi-modal history.
  • the events in the multi-modal history may include at least one of events linked by time, by type, by transaction, by class and by dialog focus.
  • a current context of the command is determined by reviewing the multi-modal history of events, a current dialog focus maintained in the dialog manager and a list of expected responses also maintained in the dialog manager to provide a reference for determining the current context.
  • At least one method is determined responsive to the command based on the current context.
  • the method is determined based on all active applications, which are referenced using a component control to determine the method(s) which are appropriate based on the current context of the command. If a method cannot be determined or more information is needed, a query is sent to the user for information needed to resolve the current context or information needed to take an appropriate action.
  • the method(s) are executed responsive to the command or response to the query associated with the application. This means the present invention automatically associates the command given to an application which is active or inactive depending on the context of the command or response.
  • a record is maintained in the dialog manager and in the multi-modal history of any changes to states which the system has undergone. Records which are no longer relevant may be removed.
  • This invention illustratively presents a method and system for determining and maintaining dialog focus in a conversational speech system with multiple modes of user input and multiple backend applications.
  • the focus resolution is achieved through an examination of the context of the user's command.
  • the command may be entered through any one of the several input modalities.
  • a detailed history is maintained of the commands the user has previously performed.
  • the final resolution proceeds through knowledge of any application specific aspects of the command and an investigation of this history.
  • This invention thus allows interaction with desktop or other applications which are not the subject of current graphical focus, or which do not even have a visual component.
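The multi-modal history described above (events kept in time order, with additional links by class, type and transaction identifier) can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Event` and `MultiModalHistory` classes, their field names, and the `latest` and `purge` methods are hypothetical names chosen for illustration. Only the event type strings ("COMPLETED_ACTION", "GUI_ACTION") and class strings ("OPEN", "CHECK") come from the text. The example resolves the target of an utterance such as "Close that" by walking the previous-by-class links, then purges a transaction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    """One history entry; all field names are illustrative assumptions."""
    event_type: str            # e.g. "SET_DIALOG_FOCUS", "GUI_ACTION", "COMPLETED_ACTION"
    event_class: str           # e.g. "OPEN", "DELETE", "CHECK"
    transaction_id: str        # shared by all events generated for one action
    target: str                # abstract target, e.g. "calendar"
    prev_by_class: Optional["Event"] = None
    prev_by_type: Optional["Event"] = None


class MultiModalHistory:
    """Time-ordered event list with extra backward links (hypothetical sketch)."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self._last_by_class: Dict[str, Event] = {}
        self._last_by_type: Dict[str, Event] = {}

    def add(self, event: Event) -> None:
        # Link the new event to the previous event of the same class and type.
        event.prev_by_class = self._last_by_class.get(event.event_class)
        event.prev_by_type = self._last_by_type.get(event.event_type)
        self._last_by_class[event.event_class] = event
        self._last_by_type[event.event_type] = event
        self.events.append(event)

    def latest(self, event_class: str, event_type: str) -> Optional[Event]:
        # Follow the previous-by-class links until the type also matches.
        event = self._last_by_class.get(event_class)
        while event is not None and event.event_type != event_type:
            event = event.prev_by_class
        return event

    def purge(self, transaction_id: str) -> None:
        # Drop every event of one transaction, then rebuild the links.
        kept = [e for e in self.events if e.transaction_id != transaction_id]
        self.events = []
        self._last_by_class.clear()
        self._last_by_type.clear()
        for e in kept:
            self.add(e)


history = MultiModalHistory()
history.add(Event("COMPLETED_ACTION", "OPEN", "t1", "calendar"))
history.add(Event("GUI_ACTION", "OPEN", "t2", "mail"))
history.add(Event("COMPLETED_ACTION", "CHECK", "t3", "mail"))

# "Close that": take the target of the most recent completed OPEN action.
opened = history.latest("OPEN", "COMPLETED_ACTION")
```

Here `opened.target` is the calendar component, because the more recent "OPEN" event is a GUI action rather than a completed action. Rebuilding the links on purge keeps the sketch short; a real linked list would instead unlink the affected nodes in place.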

Abstract

A system and method of the present invention for determining and maintaining dialog focus in a conversational speech system includes presenting a command associated with an application to a dialog manager. The application associated with the command is unknown to the dialog manager at the time it is made. The dialog manager determines a current context of the command by reviewing a multi-modal history of events. At least one method is determined responsive to the command based on the current context. The at least one method is executed responsive to the command associated with the application.

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to dialog systems, and more particularly to management of a dialog within a conversational computer system with multiple input modalities.
2. Description of the Related Art
Conversational systems typically focus on the interaction with a single application at a time. A speaker for a conversational system is only permitted to interact with the active application. This type of interaction is generally referred to as modal interaction or a modal system. That is, the user must specify which application he intends to use, and must finish working with that application before using another. This is disadvantageous in many situations where several applications may be needed or desired to be accessed simultaneously. Further, the conventional modal systems may result in loss of efficiency and time. In many instances, this leads to reduced profitability.
To illustrate a conventional modal system, a first task must be performed and closed prior to opening a second task and performing the second task. Conventional conversational modal systems are not capable of distinguishing tasks between applications. However, this is not how everyday tasks are generally performed. In an office setting, for example, a worker might begin writing a letter, stop for a moment and place a telephone call, then finish the letter. The conventional modal systems do not provide this flexibility.
Therefore, a need exists for a system and method for determining dialog focus in a conversational speech system. A further need exists for a system which deduces the intent of a user to open a particular application.
SUMMARY OF THE INVENTION
A method of the present invention, which may be implemented with a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for determining and maintaining dialog focus in a conversational speech system, includes presenting a command associated with an application to a dialog manager. The application associated with the command is unknown to the dialog manager at the time it is made. The dialog manager determines a current context of the command by reviewing a multi-modal history of events. At least one method is determined responsive to the command based on the current context. The at least one method is executed responsive to the command associated with the application.
In other methods, which may be implemented using a program storage device, the step of presenting a command may include the step of employing at least one multi-modal device for presenting the command. The at least one multi-modal device for presenting the command may include a telephone, a computer, and/or a personal digital assistant (other devices may also be employed). The step of determining a current context of the command by reviewing a multi-modal history of events may include the step of providing a linked list of all events in the multi-modal history. The events in the multi-modal history may include at least one of events linked by time, by type, by transaction, by class and by dialog focus. The step of determining at least one method may include the step of referencing all active applications using a component control to determine the at least one method which is appropriate based on the current context of the command. The command may be presented in a formal language such that a plurality of human utterances represent an action to be taken. The step of determining a current context of the command by reviewing a multi-modal history of events may include the step of maintaining a current dialog focus and a list of expected responses in the dialog manager to provide a reference for determining the current context. The step of querying a user for information needed to resolve the current context and/or information needed to take an appropriate action may also be included.
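The list of expected responses mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `ExpectedResponse` and `ExpectedResponses` names, the `requester` strings, and the `match` method are hypothetical. The sketch assumes that each entry records a formal-language command, an acceptable argument type and the requesting method, and that the most recently registered matching requester receives the response.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpectedResponse:
    command: str         # formal-language command, e.g. "select_object"
    argument_type: str   # acceptable argument type, e.g. "name"
    requester: str       # hypothetical identifier of the method that posed the query


class ExpectedResponses:
    """Entries kept in time order, newest last (illustrative sketch)."""

    def __init__(self) -> None:
        self._entries: List[ExpectedResponse] = []

    def register(self, entry: ExpectedResponse) -> None:
        self._entries.append(entry)

    def match(self, command: str, argument_type: str) -> Optional[ExpectedResponse]:
        # Search newest first so the most recent requester wins.
        for entry in reversed(self._entries):
            if entry.command == command and entry.argument_type == argument_type:
                return entry
        return None


expected = ExpectedResponses()
expected.register(ExpectedResponse("select_object", "name", "send_mail_method"))
expected.register(ExpectedResponse("set_object", "name", "send_mail_method"))
expected.register(ExpectedResponse("select_object", "name", "add_meeting_method"))

# A reply such as "I mean Steve Bradner" arrives as select_object with a name argument.
winner = expected.match("select_object", "name")
unmatched = expected.match("select_object", "date")
```

In this run `winner.requester` is the later of the two methods that registered "select_object", and `unmatched` is `None`, which is the point at which a system could instead query the user for clarification.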
A system, in accordance with the invention, for determining and maintaining dialog focus in a conversational speech system includes a dialog manager adapted to receive commands from a user. The dialog manager maintains a current dialog focus and a list of expected responses for determining a current context of the commands received. A multi-modal history is coupled to the dialog manager for maintaining an event list of all events which affected a state of the system. The multi-modal history is adapted to provide input to the dialog manager for determining the current context of the commands received. A control component is adapted to select at least one method responsive to the commands received such that the system applies methods responsive to the commands for an appropriate application.
In alternate embodiments, the appropriate application may include an active application, an inactive application, an application with a graphical component and/or an application with other than a graphical component. The commands may be input to the dialog manager by a telephone, a computer, and/or a personal digital assistant. The multi-modal history may include a linked list of all events to associate a given command to the appropriate application. The events in the multi-modal history may include at least one of events linked by time, by type, by transaction, by class and by dialog focus. The control component preferably references all active applications to determine the at least one method which is appropriate based on the current context of the commands. The command is preferably presented in a formal language such that a plurality of human utterances represent an action to be taken.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
BRIEF DESCRIPTION OF DRAWINGS
The invention will be described in detail in the following description of preferred embodiments with reference to the following figures wherein:
FIG. 1 is a schematic diagram of a conversational system in accordance with the present invention;
FIG. 2 illustratively depicts a multi-modal history in accordance with the present invention;
FIG. 3 illustratively depicts a dialog manager in accordance with the invention; and
FIG. 4 is a block/flow diagram of a system/method for determining and maintaining dialog focus in a conversational speech system in accordance with the present invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
The present invention relates to the management of multiple applications and input modalities through a conversational system. The conversational system manipulates information from applications, presents this to a user, and converses with the user when some aspects of this manipulation are ambiguous. The present invention provides for many applications to be active at any time and for the system itself to deduce the intended object of a user's action. The invention provides a method for determining dialog focus in a conversational speech system with multiple modes of user input and multiple backend applications. The invention permits interaction with desktop applications which are not the subject of current graphical focus, or which do not even have a visual component. The methods provided by the invention achieve this focus resolution through an examination of the context of the user's command. The command may be entered through any one of the several input modalities, examples of which include a spoken input, a keyboard input, a mouse input, etc. A detailed history is maintained of the commands the user has previously performed. The final resolution proceeds through knowledge of any application specific aspects of the command, where the command is made from (i.e., from a telephone, computer, etc.) and an investigation of this history.
It should be understood that the elements shown in FIGS. 1–4 may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements are implemented in software on one or more appropriately programmed general purpose digital computers having a processor and memory and input/output interfaces. Referring now to the drawings in which like numerals represent the same or similar elements and initially to FIG. 1, a block/flow diagram is shown for a system/method for the implementation of dialog management for a multiple client conversational system 8 in accordance with the present invention. In block 10, various client devices such as a personal computer (PC), telephone, or personal digital assistant (PDA) (or other devices) may all be used as clients. The architecture by which this is accomplished is described in greater detail in commonly assigned U.S. application Ser. No. 09/374,026 entitled “METHOD AND SYSTEM FOR MULTI-CLIENT ACCESS TO A DIALOG SYSTEM,” filed concurrently herewith and incorporated herein by reference. Each of these devices of block 10 has different input modalities. For example, the PC may have a keyboard, mouse, and microphone; the telephone may have a microphone and numeric keypad; the PDA may have a stylus. In block 12, any of these devices may be used to initiate a new command to the system 8 or to respond to a query from the system 8. The conversational system 8 further supports the use of any application the user desires. For example, an electronic mail (e-mail) application might be active simultaneously with a calendar application and a spreadsheet application. The application the user interacts with need not be explicitly selected. In the case of the PC, this application need not be in the foreground or graphical focus, or indeed even visible. In the case of most of the input modalities described above, the intended action is clear. 
If the user pushes a button on the PC with his mouse, for example, the user's intention is obvious because of the constraints placed on the user by the application's design. The button can only perform one action. Similar constraints apply for the PDA's stylus and the numeric keypad of the telephone. However, a spoken interface presents no such constraints.
In accordance with the invention, a user communicates with a spoken interface in much the same way the user would with a human. The user describes actions much more complex than those possible with an input device such as a mouse. The user also is able to speak in a natural manner with the system deciding what the user intends, carrying out this action, if possible, and prompting the user if more information is needed.
An intended target of a spoken command may not be at all obvious. In a system with several applications active simultaneously, each application may be capable of responding to the same spoken command. Thus, the target is determined dynamically, on an utterance-by-utterance basis. In a conversational system, the situation is even more complicated. The target may be one of the active applications if the utterance represents a command, but if it represents the response to a query from the system itself for more information, the target will be the pending action which generated the query. A concept related to, but distinct from, the target is that of dialog focus. This is the application with which the user is currently interacting. As such it represents the best hypothesis of the target of a command. When resolving the target of a command, the application with dialog focus is usually examined first to determine whether it can accept the command. This dialog focus may be implicitly or deliberately changed. If the user launches a new application, it will be granted dialog focus on the assumption that the user wishes to interact with the new application. The user may also request to bring a different application into the foreground and it will then be granted dialog focus.
A multi-modal system permits user input through a variety of modalities. In many cases, a spoken command will be superior, but there are certainly cases where, for example, a single mouse click may be more efficient or more to the user's liking. These non-speech inputs often change the context of the system, and the conversational system should be made aware of this. If, for example, the user starts a new application by using his mouse, the conversational system should know this to direct spoken commands to the new application. To this end, this invention presents a mechanism for capturing and maintaining a complete history of all events concerning the system 8, i.e., speech or non-speech events, the result of user input or of system output. A multi-modal history 16 is created in accordance with the invention. This multi-modal history 16 plays a role in deducing a target 18 of spoken commands.
FIG. 1 shows those components of the conversational system 8 used to determine the target 18 of a spoken command or response (block 12). This command or response 12 is presented to a dialog manager 14 for processing. In one embodiment, what is given to the dialog manager 14 is not the actual spoken command, but rather an element of a formal language representing the meaning of the command 12. In this manner, there may be many human utterances which convey the same meaning to the dialog manager 14. The actual form of this formal language may be “command(argument1=value1, . . . , argumentj=valuej)” where “command” represents the nature of the action to be taken or response, and “argument1=value1” represents a qualifier to this command. In this manner the utterance “Do I have anything scheduled for tomorrow?” would be transformed into the formal language “query_calendar(day=tomorrow)”. Alternately, the dialog manager 14 may be capable of handling direct human utterances, for example, by including a speech recognition system.
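The formal language form described above can be illustrated with a minimal Python sketch that splits a statement into its command and qualifiers. The function name parse_statement and the regular expression are assumptions made for illustration only; the specification does not define a grammar beyond the form “command(argument1=value1, . . . )”, and this sketch does not handle values that themselves contain commas.

```python
import re
from typing import Dict, Tuple

# Hypothetical pattern: a command name followed by a parenthesized argument list.
_STATEMENT = re.compile(r"^\s*(\w+)\s*\((.*)\)\s*$")


def parse_statement(statement: str) -> Tuple[str, Dict[str, str]]:
    """Split 'command(arg1=value1, ...)' into (command, {arg: value})."""
    match = _STATEMENT.match(statement)
    if match is None:
        raise ValueError(f"not a formal language statement: {statement!r}")
    command, arg_text = match.groups()
    arguments: Dict[str, str] = {}
    for pair in arg_text.split(","):
        if not pair.strip():
            continue  # tolerate an empty argument list
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"argument without '=': {pair!r}")
        arguments[name.strip()] = value.strip()
    return command, arguments
```

With this sketch, several different utterances that a front end maps to “query_calendar(day=tomorrow)” would all reach the dialog manager as the same command and qualifier pair.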
One purpose of the dialog manager 14 is to identify the intended target 18 of the command and a method for completing the command. The dialog manager 14 examines the formal language, extracts the command, and locates a corresponding method. In one embodiment of the present invention, these methods are implemented using independent decision networks, as described in commonly assigned U.S. application Ser. No. 09/374,744 entitled “METHOD AND SYSTEM FOR MODELESS OPERATION OF A MULTI-MODAL USER INTERFACE THROUGH IMPLEMENTATION OF INDEPENDENT DECISION NETWORKS,” filed concurrently herewith and incorporated herein by reference. The determination of the correct target 18 proceeds through examination of the nature of the command and the current context of the system 8. This context may be obtained from the multi-modal history 16.
A component control 20 acts as a “switch yard”. Component control 20 maintains a reference to all currently active applications. Component control 20 is described in greater detail in “METHOD AND SYSTEM FOR MULTI-CLIENT ACCESS TO A DIALOG SYSTEM,” previously incorporated by reference. The target 18 determined by the dialog manager 14 is of an abstract nature. That is, the target 18 refers to a type of application, not its implementation. The dialog manager 14 may, for example, determine that the target 18 is a calendar component, but it has no knowledge of which particular application implements a calendar. This degree of abstraction permits a suite of applications currently active to be modified dynamically, at the user's discretion, with no modification to the dialog manager 14 needed.
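The abstraction described above, in which the dialog manager names only a type of application and the component control maps that type to whatever application currently implements it, may be sketched as follows. The class name ComponentControl, its methods, and the use of plain callables as applications are hypothetical choices for illustration, not the interface of the referenced application.

```python
from typing import Callable, Dict


class ComponentControl:
    """Hypothetical registry mapping abstract target types to active applications."""

    def __init__(self) -> None:
        self._active: Dict[str, Callable[[str], str]] = {}

    def register(self, target_type: str, handler: Callable[[str], str]) -> None:
        # Starting or swapping an application only touches this registry;
        # the dialog manager keeps referring to the abstract type.
        self._active[target_type] = handler

    def unregister(self, target_type: str) -> None:
        self._active.pop(target_type, None)

    def dispatch(self, target_type: str, statement: str) -> str:
        handler = self._active.get(target_type)
        if handler is None:
            raise LookupError(f"no active application of type {target_type!r}")
        return handler(statement)
```

Because the dialog manager only ever passes the string naming the type (for example “calendar”), one calendar application can be replaced by another at run time without any change to the dialog manager.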
Referring to FIG. 2, the multi-modal history 16 is illustratively presented in greater detail. The multi-modal history 16 is a list of all events which have influenced the state of the system 8 as a whole, and the system's response to those events. The entries in the history 16 may be of several types. These may include user input of all types including both speech and non-speech inputs, responses from the system including results of queries, and prompts for more information, all changes of dialog focus and a descriptor of all successfully completed actions.
In the embodiment shown in FIG. 2, the multi-modal history 16 relies upon a linked list 22. All events 24 concerning the system 8 as a whole are maintained in the order received, but the history makes use of additional forward and backward links 26. In particular, the events 24 are linked by time, event type, transaction identifier, and event class. Among the event types included for this invention are “SET_DIALOG_FOCUS”, “GUI_ACTION”, and “COMPLETED_ACTION”. The event type “SET_DIALOG_FOCUS” is an indication that dialog focus has been changed, either automatically by the system 8 or deliberately by the user. The event type “GUI_ACTION” indicates that the user has performed some action upon the graphical interface, and the nature of the action is maintained as part of the event. When an action is completed successfully, a “COMPLETED_ACTION” event is placed in the history. The event list 22 includes a complete history of all steps taken to complete the action, including any elements resolved in the course of the execution. Several steps may be taken during the completion of one action. All of the events generated as a result, share one unique transaction identifier. In the current embodiment, this transaction identifier is derived from the system clock time and date. As events within the history are linked also by this transaction identifier, all events pertaining to a particular action may be removed easily when they are no longer needed or relevant.
All events within the history 16 belong to one of several classes. Some examples are “OPEN”, “DELETE”, and “CHECK”. An event belongs to the “OPEN” class when it describes the action of opening an object, such as, for example, a mail message, a calendar entry or an address book entry. All events 24 in the history 16 are also linked by an event type 28.
The numerous links within the history 16 permit efficient searches. If, for example, a request is made for an event of class “OPEN”, a link manager 15 (FIG. 1) in the history 16 will return the most recent event of this type. If this is not the correct event, the previous link by class 30 of the event will provide a reference to the previous event of class “OPEN”. These two events may have been widely separated in time. This process may be repeated until the correct event is located.
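The linked history and the search by class described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the names Event and MultiModalHistory and their fields are hypothetical, and only backward links are kept, whereas the specification also mentions forward links.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass(eq=False)
class Event:
    event_type: str       # e.g. "COMPLETED_ACTION"
    event_class: str      # e.g. "OPEN"
    transaction_id: str
    target: str
    prev_by_time: Optional["Event"] = None
    prev_by_type: Optional["Event"] = None
    prev_by_class: Optional["Event"] = None


class MultiModalHistory:
    """Events kept in arrival order, with extra backward links per type and class."""

    def __init__(self) -> None:
        self._latest: Optional[Event] = None
        self._latest_by_type: Dict[str, Event] = {}
        self._latest_by_class: Dict[str, Event] = {}

    def append(self, event: Event) -> None:
        # Link the new event to its predecessors before making it the newest.
        event.prev_by_time = self._latest
        event.prev_by_type = self._latest_by_type.get(event.event_type)
        event.prev_by_class = self._latest_by_class.get(event.event_class)
        self._latest = event
        self._latest_by_type[event.event_type] = event
        self._latest_by_class[event.event_class] = event

    def by_class(self, event_class: str) -> Iterator[Event]:
        # Yield the most recent event of the class first, then follow the
        # previous-by-class links, skipping unrelated events entirely.
        event = self._latest_by_class.get(event_class)
        while event is not None:
            yield event
            event = event.prev_by_class
```

A caller looking for an “OPEN” event would take the first item yielded by by_class and continue iterating only if that event turns out not to be the correct one.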
Referring to FIG. 3, the dialog manager 14 is shown in greater detail in accordance with the present invention. The dialog manager 14 maintains a reference to a current dialog focus 32. This is updated each time the dialog focus changes. The dialog manager 14 also maintains a list of expected responses 34. Each time the system 8 poses a question to the user, the method implementing the action being performed is permitted to register its expected response or responses with the dialog manager 14. In the present implementation, this registration is performed by a decision network.
The list of expected responses 34 is implemented as a linked list 35, much like the multi-modal history 16. In this case, the elements are linked by time 36, command 38 and requester 40. The function of this list 35 is easily illustrated through an example. If a method executing a command poses the question “Do you mean Steve Krantz, or Steve Bradner?” to the user, the method expects a response of the form “I mean Steve Bradner” or simply “Steve Bradner”. The formal language translation of the first response is “select_object (name=Steve Bradner)” and of the latter response “set_object (name=Steve Bradner)”. The method will register the two possible responses with the dialog manager 14 with the commands being “select_object” and “set_object”. In addition, each entry will include a field indicating the acceptable argument type is name. The process of resolution of the target 18 of a command makes use of these various components in several ways. First, each time a formal language statement is presented to the dialog manager 14, the dialog manager 14 extracts the command portion and examines the list of expected responses 34 to discover if any pending action can make use of the command. If so, the dialog manager 14 also examines the acceptable arguments. In the previous example, the formal language statement “select_object (name=Steve Bradner)” would be found to match one of the expected responses whereas “select_object (object=next)” would not. If a matching expected response is found, the target 18 is taken to be the requester and the formal language statement is forwarded to the requester. Subsequently, all expected responses from this requester are purged from the list of expected responses 34. If more than one requester has registered the same expected response, the dialog manager 14 decides which of these is correct.
In the present implementation, the dialog manager 14 merely uses the most recent requester; however, in a different implementation, the dialog manager 14 could pose a query to the user for clarification.
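The matching of a formal language statement against the list of expected responses may be sketched as below. The names ExpectedResponse and ExpectedResponses, and the use of a string to identify the requester, are hypothetical; the sketch uses a plain Python list rather than the linked list 35, and it adopts the most-recent-requester rule described for the present implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ExpectedResponse:
    command: str         # e.g. "select_object"
    argument_type: str   # e.g. "name"
    requester: str       # hypothetical identifier of the pending method


class ExpectedResponses:
    def __init__(self) -> None:
        self._entries: List[ExpectedResponse] = []  # oldest first

    def register(self, entry: ExpectedResponse) -> None:
        self._entries.append(entry)

    def resolve(self, command: str, arguments: Dict[str, str]) -> Optional[str]:
        """Return the requester that expects this statement, or None."""
        # Search newest first so that the most recent requester wins.
        for entry in reversed(self._entries):
            if entry.command == command and entry.argument_type in arguments:
                requester = entry.requester
                # Purge every expected response registered by this requester.
                self._entries = [e for e in self._entries if e.requester != requester]
                return requester
        return None
```

In this sketch a statement such as “select_object (object=next)” returns None because its argument type is not the registered one, so resolution would fall through to the other mechanisms.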
If no expected responses match the formal language statement, the several components are used in various ways to resolve the intended target 18 depending on the nature of the command. In certain cases, the target is clear from the command itself. If the user were to ask “Do I have anything scheduled for next Monday?” the intended target is clearly a calendar component and no further resolution is necessary. Often the current dialog focus maintained within the dialog manager is the intended target. If the user says “Change the subject to ‘proposal,’” the user is clearly referring to the application with dialog focus. In such cases, the target 18 is taken to be the current dialog focus 32, and the formal language statement is dispatched accordingly.
Certain commands are extremely ambiguous and are permitted in a conversational system to substantially enhance the quality of the interaction. The user can say, for example, “Close that” and the system must react correctly. However, such an utterance includes no information at all about the intended target. This target is resolved by examining the multi-modal history 16. In this particular example, the most recent event of type “COMPLETED_ACTION” and class “OPEN” would be fetched from the history 16. Such an event includes the target 18 of the original command. The target of the new command is taken to be the same as that of the original command and is forwarded to the original target. Hence, if the user says “Close that” the object most recently opened will be closed, be it a calendar entry, spreadsheet cell or other type of object. A further use of the history 16 is made when utterances such as “Undo that” or “Do that again” are received. The most recent event of type “COMPLETED_ACTION” is retrieved from the multi-modal history. Additional fields of such events indicate whether the action can be undone or repeated. The original command is extracted from the “COMPLETED_ACTION” event and, if possible as indicated by these fields, undone or repeated as appropriate.
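The handling of “Close that” and “Undo that” described above can be sketched as a backward search through the history. The record HistoryEvent, its can_undo and can_repeat fields, and the function names are assumptions for illustration; a plain list stands in for the linked history 16.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoryEvent:
    event_type: str           # e.g. "COMPLETED_ACTION"
    event_class: str          # e.g. "OPEN"
    target: str               # target of the original command
    command: str              # original formal language statement
    can_undo: bool = False    # hypothetical field names
    can_repeat: bool = False


def most_recent(history: List[HistoryEvent], event_type: str,
                event_class: Optional[str] = None) -> Optional[HistoryEvent]:
    """Return the newest event of the given type (and class, if given)."""
    for event in reversed(history):
        if event.event_type == event_type and event_class in (None, event.event_class):
            return event
    return None


def resolve_close_that(history: List[HistoryEvent]) -> Optional[str]:
    # "Close that" goes to the target of the most recently completed OPEN.
    event = most_recent(history, "COMPLETED_ACTION", "OPEN")
    return event.target if event else None


def resolve_undo_that(history: List[HistoryEvent]) -> Optional[str]:
    # "Undo that" returns the original command only if it is marked undoable.
    event = most_recent(history, "COMPLETED_ACTION")
    return event.command if event and event.can_undo else None
```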
A special case is that of canceling an already proceeding action. In this case, the target of the formal language is the method performing this action itself. The most recent event of type “DIALOG_FOCUS,” with the owner of the focus being a method, is fetched from the multi-modal history. The formal language is delivered to the method which will then cease executing its action. Subsequently, all events in the multi-modal history 16 bearing the transaction identifier of this now canceled method are purged from the history 16.
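The cancellation case can be sketched as follows: locate the most recent focus event owned by a method, then drop every event that shares its transaction identifier. The Event tuple, the owner_is_method flag, and the function name cancel_pending are hypothetical; the event type string follows the wording of the paragraph above.

```python
from typing import List, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    event_type: str
    transaction_id: str
    owner: str
    owner_is_method: bool = False  # hypothetical flag: focus held by a method


def cancel_pending(history: List[Event]) -> Tuple[Optional[str], List[Event]]:
    """Return (method to notify, history with that method's events purged)."""
    for event in reversed(history):
        if event.event_type == "DIALOG_FOCUS" and event.owner_is_method:
            # Every step of the canceled action carries the same transaction id.
            kept = [e for e in history if e.transaction_id != event.transaction_id]
            return event.owner, kept
    return None, list(history)
```

Because all steps of one action share a transaction identifier, the purge is a single filter and leaves events from other actions untouched.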
Referring to FIG. 4, a block/flow diagram is shown, which may be implemented with a program storage device, for determining and maintaining dialog focus in a conversational speech system. In block 102, a command associated with an application is presented to a dialog manager. The command may be in a formal language or be a direct utterance. The command or response may be input to the dialog manager by a user from any of a plurality of multi-modal devices, for example, a computer, a personal digital assistant, a telephone, etc. The application associated with the command is unknown to the dialog manager at the time the command is made, and therefore, the application which the command is intended for should first be deduced. In block 104, the dialog manager determines a current context of the command by reviewing a multi-modal history of events. The current context of the command is ascertained by reviewing a multi-modal history of events which preferably includes a linked list of all events in the multi-modal history. The events in the multi-modal history may include at least one of events linked by time, by type, by transaction, by class and by dialog focus. A current context of the command is determined by reviewing the multi-modal history of events, a current dialog focus maintained in the dialog manager and a list of expected responses also maintained in the dialog manager to provide a reference for determining the current context.
In block 106, at least one method is determined responsive to the command based on the current context. The method is determined based on all active applications referenced using a component control to determine the method(s) which are appropriate based on the current context of the command. If a method cannot be determined or more information is needed, a query is sent to the user for information needed to resolve the current context or information needed to take an appropriate action. In block 108, the method(s) are executed responsive to the command or response to the query associated with the application. This means the present invention automatically associates the command given to an application which is active or inactive depending on the context of the command or response. In block 110, a record is maintained in the dialog manager and in the multi-modal history of any changes to states which the system has undergone. Records which are no longer relevant may be removed.
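The overall resolution of a target across blocks 104 and 106 may be summarized in a short Python sketch. The tables COMMAND_TARGETS and AMBIGUOUS, the function name resolve_target, and the "ask_user" sentinel are all assumptions for illustration, as is the exact ordering of the fallbacks after the expected-response check; the specification describes these mechanisms without fixing one strict sequence.

```python
from typing import Dict, List, Optional

# Hypothetical table of commands whose target is clear from the command itself.
COMMAND_TARGETS: Dict[str, str] = {"query_calendar": "calendar"}
# Hypothetical set of commands that carry no information about their target.
AMBIGUOUS = {"close_object"}


def resolve_target(command: str,
                   expected: Dict[str, str],
                   dialog_focus: Optional[str],
                   recent_targets: List[str]) -> str:
    """Pick a target for a command; 'ask_user' means a query is needed."""
    if command in expected:
        return expected[command]          # a pending action awaits this response
    if command in COMMAND_TARGETS:
        return COMMAND_TARGETS[command]   # target implied by the command
    if command in AMBIGUOUS:
        # Fall back to the history: reuse the most recent relevant target.
        return recent_targets[-1] if recent_targets else "ask_user"
    if dialog_focus is not None:
        return dialog_focus               # best hypothesis: current focus
    return "ask_user"
```

Here expected maps a command to its registered requester, and recent_targets stands in for the targets recovered from the multi-modal history, oldest first.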
This invention illustratively presents a method and system for determining and maintaining dialog focus in a conversational speech system with multiple modes of user input and multiple backend applications. The focus resolution is achieved through an examination of the context of the user's command. The command may be entered through any one of the several input modalities. A detailed history is maintained of the commands the user has previously performed. The final resolution proceeds through knowledge of any application specific aspects of the command and an investigation of this history. This invention thus allows interaction with desktop or other applications which are not the subject of current graphical focus, or which do not even have a visual component.
Having described preferred embodiments of a system and method for determining and maintaining dialog focus in a conversational speech system (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments of the invention disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described the invention with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

Claims (25)

1. A method for determining and maintaining dialog focus in a conversational speech system comprising the steps of:
maintaining a multi-modal history of events that result from user interaction with one or more user applications of a multi-modal computing system, wherein the events are maintained in chronological order, and wherein the events are linked by event type, wherein an event type includes a change of dialog focus;
receiving and processing a user speech command directed to one of a plurality of the user applications of the multi-modal computing system;
a dialog manager determining a target of the speech command by determining a current context of the speech command by reviewing the multi-modal history of events;
determining at least one method responsive to the command based on the determined current context; and
executing the at least one method responsive to the command.
2. The method as recited in claim 1, wherein the dialog manager determines a target application type from the current context which is responsive to the command, wherein the target application type is abstracted from the user applications supported by the multi-modal system.
3. The method as recited in claim 1, comprising receiving input commands from a user interacting with the multi-modal system using a plurality of modalities including a telephone, a computer, and a personal digital assistant.
4. The method as recited in claim 1, wherein the step of determining a current context of the command by reviewing a multi-modal history of events includes searching the multi-modal history of events by traversing linked events.
5. The method as recited in claim 1, wherein the events in the multi-modal history of events are linked by time, transaction and class.
6. The method as recited in claim 1, wherein the step of determining at least one method includes the step of referencing the user applications of the multi-modal computing system using a component control to determine the at least one method of a user application which is appropriate based on the current context of the speech command.
7. The method as recited in claim 1, wherein the speech command is presented in a formal language such that a plurality of human utterances represent an action to be taken.
8. The method as recited in claim 1, wherein the step of determining a current context of the command by reviewing a multi-modal history of events includes the dialog manager maintaining a current dialog focus and a list of expected responses, which is used by the dialog manager to provide a reference for determining the current context.
9. The method as recited in claim 1, further comprising the step of querying a user for one of information needed to resolve the current context and information needed to take an appropriate action.
10. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for determining and maintaining dialog focus in a conversational speech system, the method steps comprising:
maintaining a multi-modal history of events that result from user interaction with one or more user applications of a multi-modal computing system, wherein the events are maintained in chronological order, and wherein the events are linked by event type, wherein an event type includes a change of dialog focus;
receiving and processing a user speech command directed to one of a plurality of the user applications of the multi-modal computing system;
a dialog manager determining a target of the speech command by determining a current context of the speech command by reviewing the multi-modal history of events;
determining at least one method responsive to the command based on the determined current context; and
executing the at least one method responsive to the command.
11. The program storage device as recited in claim 10, wherein the dialog manager determines a target application type from the current context which is responsive to the command, wherein the target application type is abstracted from the user applications supported by the multi-modal system.
12. The program storage device as recited in claim 10, comprising instructions for receiving input commands from a user interacting with the multi-modal system using a plurality of modalities including a telephone, a computer, and a personal digital assistant.
13. The program storage device as recited in claim 10, wherein the instructions for determining a current context of the command by reviewing a multi-modal history of events includes instructions for searching the multi-modal history of events by traversing linked events.
14. The program storage device as recited in claim 10, wherein the events in the multi-modal history of events are linked by time, transaction and class.
15. The program storage device as recited in claim 10, wherein the instructions for determining at least one method includes the step of referencing the user applications of the multi-modal computing system using a component control to determine the at least one method of a user application which is appropriate based on the current context of the speech command.
16. The program storage device as recited in claim 10, wherein the speech command is presented in a formal language such that a plurality of human utterances represent an action to be taken.
17. The program storage device as recited in claim 10, wherein the instructions for determining a current context of the command by reviewing a multi-modal history of events includes instructions for the dialog manager maintaining a current dialog focus and a list of expected responses, which is used by the dialog manager to provide a reference for determining the current context.
18. The program storage device as recited in claim 10, further comprising instructions for querying a user for one of information needed to resolve the current context and information needed to take an appropriate action.
19. A multi-modal computing system comprising:
a multi-modal history of events to maintain meta information of events that result from user interaction with one or more user applications of the multi-modal computing system, wherein the events are maintained in chronological order, and wherein the events are linked by event type, wherein an event type includes a change of dialog focus;
a dialog manager that maintains a current dialog focus for one of a plurality of user applications and a list of expected responses, wherein the dialog manager determines a target of a speech input event by determining a current context of the speech input event using the multi-modal history of events or the list of expected responses; and
a control component adapted to select at least one method responsive to the speech input event based on the target of the speech input event as determined by the dialog manager.
20. The system as recited in claim 19, wherein the control component selects a method associated with an active user application, an inactive application, an application with a graphical component and an application with other than a graphical component.
21. The system as recited in claim 19, wherein the multi-modal history of events maintains meta information of input events including speech and non-speech input events and output events generated by the multi-modal system.
22. The system as recited in claim 19, wherein the dialog manager determines a current context of the input speech event by traversing linked events in the multi-modal history of events.
23. The system as recited in claim 19, wherein events in the multi-modal history of events are further linked by time, transaction and class.
24. The system as recited in claim 19, wherein the control component references the user applications to determine the at least one method which is appropriate based on the current context of an input speech command.
25. The system as recited in claim 19, wherein an input speech event is presented to the dialog manager in a formal language such that a plurality of human utterances represent an action to be taken.
US09/374,374 1999-08-13 1999-08-13 Method for determining and maintaining dialog focus in a conversational speech system Expired - Lifetime US7069220B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US09/374,374 US7069220B2 (en) 1999-08-13 1999-08-13 Method for determining and maintaining dialog focus in a conversational speech system
GB0019658A GB2357364B (en) 1999-08-13 2000-08-11 Method and system of dialog management in a conversational computer system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09/374,374 US7069220B2 (en) 1999-08-13 1999-08-13 Method for determining and maintaining dialog focus in a conversational speech system

Publications (2)

Publication Number Publication Date
US20030014260A1 (en) 2003-01-16
US7069220B2 (en) 2006-06-27

Family

Family ID: 23476534

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
US09/374,374 Expired - Lifetime US7069220B2 (en) 1999-08-13 1999-08-13 Method for determining and maintaining dialog focus in a conversational speech system

Country Status (2)

Country Link
US (1) US7069220B2 (en)
GB (1) GB2357364B (en)

Cited By (157)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030144845A1 (en) * 2002-01-29 2003-07-31 Samsung Electronics Co., Ltd. Voice command interpreter with dialog focus tracking function and voice command interpreting method
US20040001100A1 (en) * 2002-06-27 2004-01-01 Alcatel Method and multimode user interface for processing user inputs
US20060106614A1 (en) * 2004-11-16 2006-05-18 Microsoft Corporation Centralized method and system for clarifying voice commands
US20070043574A1 (en) * 1998-10-02 2007-02-22 Daniel Coffman Conversational computing via conversational virtual machine
US20070174057A1 (en) * 2000-01-31 2007-07-26 Genly Christopher H Providing programming information in response to spoken requests
US20080244442A1 (en) * 2007-03-30 2008-10-02 Microsoft Corporation Techniques to share information between application programs
US20090150156A1 (en) * 2007-12-11 2009-06-11 Kennewick Michael R System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US20090172573A1 (en) * 2007-12-31 2009-07-02 International Business Machines Corporation Activity centric resource recommendations in a computing environment
US20100145700A1 (en) * 2002-07-15 2010-06-10 Voicebox Technologies, Inc. Mobile systems and methods for responding to natural language speech utterance
US20100204994A1 (en) * 2002-06-03 2010-08-12 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US20110231182A1 (en) * 2005-08-29 2011-09-22 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US8145489B2 (en) 2007-02-06 2012-03-27 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US8150694B2 (en) 2005-08-31 2012-04-03 Voicebox Technologies, Inc. System and method for providing an acoustic grammar to dynamically sharpen speech interpretation
US8326634B2 (en) 2005-08-05 2012-12-04 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
US8332224B2 (en) 2005-08-10 2012-12-11 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition conversational speech
US8515765B2 (en) 2006-10-16 2013-08-20 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US8589161B2 (en) 2008-05-27 2013-11-19 Voicebox Technologies, Inc. System and method for an integrated, multi-modal, multi-device natural language voice services environment
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US9171541B2 (en) 2009-11-10 2015-10-27 Voicebox Technologies Corporation System and method for hybrid processing in a natural language voice services environment
US9190062B2 (en) 2010-02-25 2015-11-17 Apple Inc. User profiling for voice input processing
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US9305548B2 (en) 2008-05-27 2016-04-05 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9502025B2 (en) 2009-11-10 2016-11-22 Voicebox Technologies Corporation System and method for providing a natural language content dedication service
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US20160358603A1 (en) * 2014-01-31 2016-12-08 Hewlett-Packard Development Company, L.P. Voice input command
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9626703B2 (en) 2014-09-16 2017-04-18 Voicebox Technologies Corporation Voice commerce
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9632650B2 (en) 2006-03-10 2017-04-25 Microsoft Technology Licensing, Llc Command searching enhancements
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9747896B2 (en) 2014-10-15 2017-08-29 Voicebox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US20170301348A1 (en) * 2016-04-19 2017-10-19 International Business Machines Corporation Smart launching mobile applications with preferred user interface (ui) languages
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US20180012601A1 (en) * 2013-11-18 2018-01-11 Amazon Technologies, Inc. Dialog management with multiple applications
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9898459B2 (en) 2014-09-16 2018-02-20 Voicebox Technologies Corporation Integration of domain information into state transitions of a finite state transducer for natural language processing
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US9990433B2 (en) 2014-05-23 2018-06-05 Samsung Electronics Co., Ltd. Method for searching and device thereof
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US10140985B2 (en) 2013-07-02 2018-11-27 Samsung Electronics Co., Ltd. Server for processing speech, control method thereof, image processing apparatus, and control method thereof
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10235990B2 (en) 2017-01-04 2019-03-19 International Business Machines Corporation System and method for cognitive intervention on human interactions
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10318639B2 (en) 2017-02-03 2019-06-11 International Business Machines Corporation Intelligent action recommendation
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US10331784B2 (en) 2016-07-29 2019-06-25 Voicebox Technologies Corporation System and method of disambiguating natural language processing requests
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US10373515B2 (en) 2017-01-04 2019-08-06 International Business Machines Corporation System and method for cognitive intervention on human interactions
DE112009001779B4 (en) 2008-07-30 2019-08-08 Mitsubishi Electric Corp. Voice recognition device
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10431214B2 (en) 2014-11-26 2019-10-01 Voicebox Technologies Corporation System and method of determining a domain and/or an action related to a natural language input
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US10474439B2 (en) 2016-06-16 2019-11-12 Microsoft Technology Licensing, Llc Systems and methods for building conversational understanding systems
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US10607140B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10614799B2 (en) 2014-11-26 2020-04-07 Voicebox Technologies Corporation System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance
US10621166B2 (en) 2017-03-23 2020-04-14 International Business Machines Corporation Interactive dialog in natural language using an ontology
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
US11314826B2 (en) 2014-05-23 2022-04-26 Samsung Electronics Co., Ltd. Method for searching and device thereof
US11366864B2 (en) 2017-02-09 2022-06-21 Microsoft Technology Licensing, Llc Bot integration in a web-based search engine
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US11599332B1 (en) * 2007-10-04 2023-03-07 Great Northern Research, LLC Multiple shell multi faceted graphical user interface

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6964023B2 (en) * 2001-02-05 2005-11-08 International Business Machines Corporation System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input
EP1293963A1 (en) * 2001-09-07 2003-03-19 Sony International (Europe) GmbH Dialogue management server architecture for dialogue systems
US7869998B1 (en) 2002-04-23 2011-01-11 At&T Intellectual Property Ii, L.P. Voice-enabled dialog system
US8645122B1 (en) 2002-12-19 2014-02-04 At&T Intellectual Property Ii, L.P. Method of handling frequently asked questions in a natural language dialog service
US20060036438A1 (en) * 2004-07-13 2006-02-16 Microsoft Corporation Efficient multimodal method to provide input to a computing device
US7778821B2 (en) * 2004-11-24 2010-08-17 Microsoft Corporation Controlled manipulation of characters
US8185399B2 (en) 2005-01-05 2012-05-22 At&T Intellectual Property Ii, L.P. System and method of providing an automated data-collection in spoken dialog systems
US20060149553A1 (en) * 2005-01-05 2006-07-06 At&T Corp. System and method for using a library to interactively design natural language spoken dialog systems
US8478589B2 (en) 2005-01-05 2013-07-02 At&T Intellectual Property Ii, L.P. Library of existing spoken dialog data for use in generating new natural language spoken dialog systems
JP5887253B2 (en) * 2012-11-16 2016-03-16 本田技研工業株式会社 Message processing device
US9472196B1 (en) * 2015-04-22 2016-10-18 Google Inc. Developer voice actions system
US20200005118A1 (en) * 2018-06-28 2020-01-02 Microsoft Technology Licensing, Llc Offtrack virtual agent interaction session detection
US10580176B2 (en) 2018-06-28 2020-03-03 Microsoft Technology Licensing, Llc Visualization of user intent in virtual agent interaction
US11005786B2 (en) 2018-06-28 2021-05-11 Microsoft Technology Licensing, Llc Knowledge-driven dialog support conversation system
US10831442B2 (en) * 2018-10-19 2020-11-10 International Business Machines Corporation Digital assistant user interface amalgamation
US11741951B2 (en) * 2019-02-22 2023-08-29 Lenovo (Singapore) Pte. Ltd. Context enabled voice commands
DE102020102982A1 (en) 2020-02-05 2021-08-05 Böllhoff Verbindungstechnik GmbH Joining element, connection structure with the joining element, manufacturing method of the joining element and corresponding connection method
US20220374109A1 (en) * 2021-05-14 2022-11-24 Apple Inc. User input interpretation using display representations

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5265014A (en) * 1990-04-10 1993-11-23 Hewlett-Packard Company Multi-modal user interface
US6125347A (en) * 1993-09-29 2000-09-26 L&H Applications Usa, Inc. System for controlling multiple user application programs by spoken input
US5748841A (en) * 1994-02-25 1998-05-05 Morin; Philippe Supervised contextual language acquisition system
US5748974A (en) * 1994-12-13 1998-05-05 International Business Machines Corporation Multimodal natural language interface for cross-application tasks
US5892813A (en) * 1996-09-30 1999-04-06 Matsushita Electric Industrial Co., Ltd. Multimodal voice dialing digital key telephone with dialog manager
US5897618A (en) 1997-03-10 1999-04-27 International Business Machines Corporation Data processing system and method for switching between programs having a same title using a voice command
US6246981B1 (en) * 1998-11-25 2001-06-12 International Business Machines Corporation Natural language task-oriented dialog manager and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Lamel et al., "The LIMSI ARISE System for Train Travel Information," International Conference on Acoustics, Speech and Signal Processing, Phoenix, Arizona, Mar. 1999.
Papineni et al., "Free-Flow Dialog Management Using Forms," Eurospeech, Budapest, Hungary, Sep. 1999.
Ward et al., "Towards Speech Understanding Across Multiple Languages," International Conference on Spoken Language Processing, Sydney, Australia, Dec. 1998.

Cited By (268)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090313026A1 (en) * 1998-10-02 2009-12-17 Daniel Coffman Conversational computing via conversational virtual machine
US20070043574A1 (en) * 1998-10-02 2007-02-22 Daniel Coffman Conversational computing via conversational virtual machine
US8082153B2 (en) * 1998-10-02 2011-12-20 International Business Machines Corporation Conversational computing via conversational virtual machine
US7729916B2 (en) * 1998-10-02 2010-06-01 International Business Machines Corporation Conversational computing via conversational virtual machine
US8374875B2 (en) * 2000-01-31 2013-02-12 Intel Corporation Providing programming information in response to spoken requests
US20070174057A1 (en) * 2000-01-31 2007-07-26 Genly Christopher H Providing programming information in response to spoken requests
US8805691B2 (en) 2000-01-31 2014-08-12 Intel Corporation Providing programming information in response to spoken requests
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US20030144845A1 (en) * 2002-01-29 2003-07-31 Samsung Electronics Co., Ltd. Voice command interpreter with dialog focus tracking function and voice command interpreting method
US8155962B2 (en) 2002-06-03 2012-04-10 Voicebox Technologies, Inc. Method and system for asynchronously processing natural language utterances
US8731929B2 (en) 2002-06-03 2014-05-20 Voicebox Technologies Corporation Agent architecture for determining meanings of natural language utterances
US20100204994A1 (en) * 2002-06-03 2010-08-12 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US20100286985A1 (en) * 2002-06-03 2010-11-11 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US8112275B2 (en) 2002-06-03 2012-02-07 Voicebox Technologies, Inc. System and method for user-specific speech recognition
US8140327B2 (en) 2002-06-03 2012-03-20 Voicebox Technologies, Inc. System and method for filtering and eliminating noise from natural language utterances to improve speech recognition and parsing
US7490173B2 (en) * 2002-06-27 2009-02-10 Alcatel Method and multimode user interface for processing user inputs
US20040001100A1 (en) * 2002-06-27 2004-01-01 Alcatel Method and multimode user interface for processing user inputs
US20100145700A1 (en) * 2002-07-15 2010-06-10 Voicebox Technologies, Inc. Mobile systems and methods for responding to natural language speech utterance
US9031845B2 (en) * 2002-07-15 2015-05-12 Nuance Communications, Inc. Mobile systems and methods for responding to natural language speech utterance
US10748530B2 (en) 2004-11-16 2020-08-18 Microsoft Technology Licensing, Llc Centralized method and system for determining voice commands
US8942985B2 (en) * 2004-11-16 2015-01-27 Microsoft Corporation Centralized method and system for clarifying voice commands
US9972317B2 (en) 2004-11-16 2018-05-15 Microsoft Technology Licensing, Llc Centralized method and system for clarifying voice commands
US20060106614A1 (en) * 2004-11-16 2006-05-18 Microsoft Corporation Centralized method and system for clarifying voice commands
US9263039B2 (en) 2005-08-05 2016-02-16 Nuance Communications, Inc. Systems and methods for responding to natural language speech utterance
US8326634B2 (en) 2005-08-05 2012-12-04 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US8849670B2 (en) 2005-08-05 2014-09-30 Voicebox Technologies Corporation Systems and methods for responding to natural language speech utterance
US9626959B2 (en) 2005-08-10 2017-04-18 Nuance Communications, Inc. System and method of supporting adaptive misrecognition in conversational speech
US8620659B2 (en) 2005-08-10 2013-12-31 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition in conversational speech
US8332224B2 (en) 2005-08-10 2012-12-11 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition conversational speech
US8447607B2 (en) 2005-08-29 2013-05-21 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US8849652B2 (en) 2005-08-29 2014-09-30 Voicebox Technologies Corporation Mobile systems and methods of supporting natural language human-machine interactions
US9495957B2 (en) 2005-08-29 2016-11-15 Nuance Communications, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US8195468B2 (en) * 2005-08-29 2012-06-05 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US20110231182A1 (en) * 2005-08-29 2011-09-22 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US8150694B2 (en) 2005-08-31 2012-04-03 Voicebox Technologies, Inc. System and method for providing an acoustic grammar to dynamically sharpen speech interpretation
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US9632650B2 (en) 2006-03-10 2017-04-25 Microsoft Technology Licensing, Llc Command searching enhancements
US9117447B2 (en) 2006-09-08 2015-08-25 Apple Inc. Using event alert text as input to an automated assistant
US8942986B2 (en) 2006-09-08 2015-01-27 Apple Inc. Determining user intent based on ontologies of domains
US8930191B2 (en) 2006-09-08 2015-01-06 Apple Inc. Paraphrasing of user requests and results by automated digital assistant
US10755699B2 (en) 2006-10-16 2020-08-25 Vb Assets, Llc System and method for a cooperative conversational voice user interface
US10297249B2 (en) 2006-10-16 2019-05-21 Vb Assets, Llc System and method for a cooperative conversational voice user interface
US8515765B2 (en) 2006-10-16 2013-08-20 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US11222626B2 (en) 2006-10-16 2022-01-11 Vb Assets, Llc System and method for a cooperative conversational voice user interface
US9015049B2 (en) 2006-10-16 2015-04-21 Voicebox Technologies Corporation System and method for a cooperative conversational voice user interface
US10515628B2 (en) 2006-10-16 2019-12-24 Vb Assets, Llc System and method for a cooperative conversational voice user interface
US10510341B1 (en) 2006-10-16 2019-12-17 Vb Assets, Llc System and method for a cooperative conversational voice user interface
US11080758B2 (en) 2007-02-06 2021-08-03 Vb Assets, Llc System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements
US8145489B2 (en) 2007-02-06 2012-03-27 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US10134060B2 (en) 2007-02-06 2018-11-20 Vb Assets, Llc System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements
US8527274B2 (en) 2007-02-06 2013-09-03 Voicebox Technologies, Inc. System and method for delivering targeted advertisements and tracking advertisement interactions in voice recognition contexts
US8886536B2 (en) 2007-02-06 2014-11-11 Voicebox Technologies Corporation System and method for delivering targeted advertisements and tracking advertisement interactions in voice recognition contexts
US9406078B2 (en) 2007-02-06 2016-08-02 Voicebox Technologies Corporation System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements
US9269097B2 (en) 2007-02-06 2016-02-23 Voicebox Technologies Corporation System and method for delivering targeted advertisements and/or providing natural language processing based on advertisements
US20080244442A1 (en) * 2007-03-30 2008-10-02 Microsoft Corporation Techniques to share information between application programs
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US11599332B1 (en) * 2007-10-04 2023-03-07 Great Northern Research, LLC Multiple shell multi faceted graphical user interface
US8140335B2 (en) 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US20090150156A1 (en) * 2007-12-11 2009-06-11 Kennewick Michael R System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US10347248B2 (en) 2007-12-11 2019-07-09 Voicebox Technologies Corporation System and method for providing in-vehicle services via a natural language voice user interface
US9620113B2 (en) 2007-12-11 2017-04-11 Voicebox Technologies Corporation System and method for providing a natural language voice user interface
US8326627B2 (en) 2007-12-11 2012-12-04 Voicebox Technologies, Inc. System and method for dynamically generating a recognition grammar in an integrated voice navigation services environment
US8370147B2 (en) 2007-12-11 2013-02-05 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US8452598B2 (en) 2007-12-11 2013-05-28 Voicebox Technologies, Inc. System and method for providing advertisements in an integrated voice navigation services environment
US8719026B2 (en) 2007-12-11 2014-05-06 Voicebox Technologies Corporation System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US8983839B2 (en) 2007-12-11 2015-03-17 Voicebox Technologies Corporation System and method for dynamically generating a recognition grammar in an integrated voice navigation services environment
US20090172573A1 (en) * 2007-12-31 2009-07-02 International Business Machines Corporation Activity centric resource recommendations in a computing environment
US10650062B2 (en) * 2007-12-31 2020-05-12 International Business Machines Corporation Activity centric resource recommendations in a computing environment
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US8589161B2 (en) 2008-05-27 2013-11-19 Voicebox Technologies, Inc. System and method for an integrated, multi-modal, multi-device natural language voice services environment
US10089984B2 (en) 2008-05-27 2018-10-02 Vb Assets, Llc System and method for an integrated, multi-modal, multi-device natural language voice services environment
US10553216B2 (en) 2008-05-27 2020-02-04 Oracle International Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US9305548B2 (en) 2008-05-27 2016-04-05 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US9711143B2 (en) 2008-05-27 2017-07-18 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
DE112009001779B4 (en) 2008-07-30 2019-08-08 Mitsubishi Electric Corp. Voice recognition device
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
US10553213B2 (en) 2009-02-20 2020-02-04 Oracle International Corporation System and method for processing multi-modal device interactions in a natural language voice services environment
US9570070B2 (en) 2009-02-20 2017-02-14 Voicebox Technologies Corporation System and method for processing multi-modal device interactions in a natural language voice services environment
US9105266B2 (en) 2009-02-20 2015-08-11 Voicebox Technologies Corporation System and method for processing multi-modal device interactions in a natural language voice services environment
US9953649B2 (en) 2009-02-20 2018-04-24 Voicebox Technologies Corporation System and method for processing multi-modal device interactions in a natural language voice services environment
US8719009B2 (en) 2009-02-20 2014-05-06 Voicebox Technologies Corporation System and method for processing multi-modal device interactions in a natural language voice services environment
US8738380B2 (en) 2009-02-20 2014-05-27 Voicebox Technologies Corporation System and method for processing multi-modal device interactions in a natural language voice services environment
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10475446B2 (en) 2009-06-05 2019-11-12 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10795541B2 (en) 2009-06-05 2020-10-06 Apple Inc. Intelligent organization of tasks items
US11080012B2 (en) 2009-06-05 2021-08-03 Apple Inc. Interface for a virtual digital assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US9502025B2 (en) 2009-11-10 2016-11-22 Voicebox Technologies Corporation System and method for providing a natural language content dedication service
US9171541B2 (en) 2009-11-10 2015-10-27 Voicebox Technologies Corporation System and method for hybrid processing in a natural language voice services environment
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US8903716B2 (en) 2010-01-18 2014-12-02 Apple Inc. Personalized vocabulary for digital assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10706841B2 (en) 2010-01-18 2020-07-07 Apple Inc. Task flow identification based on user intent
US9548050B2 (en) 2010-01-18 2017-01-17 Apple Inc. Intelligent automated assistant
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10984326B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10607141B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US11410053B2 (en) 2010-01-25 2022-08-09 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10984327B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10607140B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US9190062B2 (en) 2010-02-25 2015-11-17 Apple Inc. User profiling for voice input processing
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US10140985B2 (en) 2013-07-02 2018-11-27 Samsung Electronics Co., Ltd. Server for processing speech, control method thereof, image processing apparatus, and control method thereof
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US11688402B2 (en) 2013-11-18 2023-06-27 Amazon Technologies, Inc. Dialog management with multiple modalities
US10706854B2 (en) * 2013-11-18 2020-07-07 Amazon Technologies, Inc. Dialog management with multiple applications
US20180012601A1 (en) * 2013-11-18 2018-01-11 Amazon Technologies, Inc. Dialog management with multiple applications
US10978060B2 (en) * 2014-01-31 2021-04-13 Hewlett-Packard Development Company, L.P. Voice input command
US20160358603A1 (en) * 2014-01-31 2016-12-08 Hewlett-Packard Development Company, L.P. Voice input command
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US11314826B2 (en) 2014-05-23 2022-04-26 Samsung Electronics Co., Ltd. Method for searching and device thereof
US11157577B2 (en) 2014-05-23 2021-10-26 Samsung Electronics Co., Ltd. Method for searching and device thereof
US10223466B2 (en) 2014-05-23 2019-03-05 Samsung Electronics Co., Ltd. Method for searching and device thereof
US9990433B2 (en) 2014-05-23 2018-06-05 Samsung Electronics Co., Ltd. Method for searching and device thereof
US11080350B2 (en) 2014-05-23 2021-08-03 Samsung Electronics Co., Ltd. Method for searching and device thereof
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US11734370B2 (en) 2014-05-23 2023-08-22 Samsung Electronics Co., Ltd. Method for searching and device thereof
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US10904611B2 (en) 2014-06-30 2021-01-26 Apple Inc. Intelligent automated assistant for TV user interactions
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US11087385B2 (en) 2014-09-16 2021-08-10 Vb Assets, Llc Voice commerce
US10216725B2 (en) 2014-09-16 2019-02-26 Voicebox Technologies Corporation Integration of domain information into state transitions of a finite state transducer for natural language processing
US9898459B2 (en) 2014-09-16 2018-02-20 Voicebox Technologies Corporation Integration of domain information into state transitions of a finite state transducer for natural language processing
US10430863B2 (en) 2014-09-16 2019-10-01 Vb Assets, Llc Voice commerce
US9626703B2 (en) 2014-09-16 2017-04-18 Voicebox Technologies Corporation Voice commerce
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10229673B2 (en) 2014-10-15 2019-03-12 Voicebox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US9747896B2 (en) 2014-10-15 2017-08-29 Voicebox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US10431214B2 (en) 2014-11-26 2019-10-01 Voicebox Technologies Corporation System and method of determining a domain and/or an action related to a natural language input
US10614799B2 (en) 2014-11-26 2020-04-07 Voicebox Technologies Corporation System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US11556230B2 (en) 2014-12-02 2023-01-17 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US10229677B2 (en) * 2016-04-19 2019-03-12 International Business Machines Corporation Smart launching mobile applications with preferred user interface (UI) languages
US20170301348A1 (en) * 2016-04-19 2017-10-19 International Business Machines Corporation Smart launching mobile applications with preferred user interface (UI) languages
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10474439B2 (en) 2016-06-16 2019-11-12 Microsoft Technology Licensing, Llc Systems and methods for building conversational understanding systems
US10331784B2 (en) 2016-07-29 2019-06-25 Voicebox Technologies Corporation System and method of disambiguating natural language processing requests
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10373515B2 (en) 2017-01-04 2019-08-06 International Business Machines Corporation System and method for cognitive intervention on human interactions
US10235990B2 (en) 2017-01-04 2019-03-19 International Business Machines Corporation System and method for cognitive intervention on human interactions
US10902842B2 (en) 2017-01-04 2021-01-26 International Business Machines Corporation System and method for cognitive intervention on human interactions
US10318639B2 (en) 2017-02-03 2019-06-11 International Business Machines Corporation Intelligent action recommendation
US11366864B2 (en) 2017-02-09 2022-06-21 Microsoft Technology Licensing, Llc Bot integration in a web-based search engine
US10621166B2 (en) 2017-03-23 2020-04-14 International Business Machines Corporation Interactive dialog in natural language using an ontology
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services

Also Published As

Publication number Publication date
US20030014260A1 (en) 2003-01-16
GB0019658D0 (en) 2000-09-27
GB2357364A (en) 2001-06-20
GB2357364B (en) 2003-09-17

Similar Documents

Publication Title
US7069220B2 (en) Method for determining and maintaining dialog focus in a conversational speech system
EP1076288B1 (en) Method and system for multi-client access to a dialog system
US9081590B2 (en) Multimodal input using scratchpad graphical user interface to edit speech text input with keyboard input
JP3212618B2 (en) Dialogue processing device
US6208972B1 (en) Method for integrating computer processes with an interface controlled by voice actuated grammars
US7917843B2 (en) Method, system and computer readable medium for addressing handling from a computer program
US7188067B2 (en) Method for integrating processes with a multi-faceted human centered interface
US5748974A (en) Multimodal natural language interface for cross-application tasks
US6615176B2 (en) Speech enabling labeless controls in an existing graphical user interface
US7447638B1 (en) Speech input disambiguation computing method
US5893063A (en) Data processing system and method for dynamically accessing an application using a voice command
US20130103391A1 (en) Natural language processing for software commands
US7496854B2 (en) Method, system and computer readable medium for addressing handling from a computer program
EP1650744A1 (en) Invalid command detection in speech recognition
US5801696A (en) Message queue for graphical user interface
JP3265131B2 (en) Event generation distribution method
US5897618A (en) Data processing system and method for switching between programs having a same title using a voice command
US20080147403A1 (en) Multiple sound fragments processing and load balancing
US7082391B1 (en) Automatic speech recognition
US7430511B1 (en) Speech enabled computing system
US20090019273A1 (en) Exception-based error handling in an array-based language
US7814092B2 (en) Distributed named entity recognition architecture
US20240070193A1 (en) Reducing metadata transmitted with automated assistant requests
EP1171836B1 (en) Function key for computer data handling
US20030229491A1 (en) Single sound fragment processing

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:COFFMAN, DANIEL L.;GOPALAKRISHNAN, POPANI;RAMASWAMY, GANESH N.;AND OTHERS;REEL/FRAME:010276/0943

Effective date: 19990914

AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT ASSIGNOR ON A PREVIOUS RECORDING AT REEL 010276, FRAME 0943;ASSIGNORS:COFFMAN, DANIEL M.;GOPALAKRISHNAN, POPANI;RAMASAWMY, GANESH;AND OTHERS;REEL/FRAME:010709/0097

Effective date: 19990914

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: NUANCE COMMUNICATIONS, INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:022354/0566

Effective date: 20081231

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553)

Year of fee payment: 12