G10L15/12

Leveraging Natural Language Processing

A system, computer program product, and method are provided to automate a natural language processing system so that an artificial intelligence platform can define a relationship between dialogue and post dialogue activity. Dialogue is detected and analyzed, including identification of key words and phrases within the dialogue. Post dialogue actions, including physical actuation of a hardware device and the temporal proximity of the action to the dialogue, are monitored. The hardware device receives an instruction from a processing unit that relates to the analyzed dialogue, and the hardware device changes state and/or actuates another hardware device. The system constructs a hypothesis, i.e., a relationship between the identified key phrase drawn from the analyzed dialogue and the monitored post dialogue action. A dialogue tree containing identified terms and associated post dialogue actions is dynamically modified with one or more new identified terms and the associated post dialogue actions.
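
The dialogue tree update described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `DialogueTree` class, the `update` method, and the 30-second `max_gap_s` window are hypothetical names and values, not taken from the abstract, and the tree is flattened to a phrase-to-action count table for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueTree:
    # Hypothetical structure: key phrase -> {post dialogue action -> times observed}.
    actions_by_phrase: dict = field(default_factory=dict)

    def update(self, phrase, action, dialogue_time, action_time, max_gap_s=30.0):
        """Record a phrase/action hypothesis if the action closely follows the dialogue.

        Times are in seconds. The max_gap_s window is an assumed stand-in for
        the "temporal proximity" mentioned in the abstract.
        """
        gap = action_time - dialogue_time
        if gap < 0 or gap > max_gap_s:
            return False  # action is not close enough in time to the dialogue
        counts = self.actions_by_phrase.setdefault(phrase, {})
        counts[action] = counts.get(action, 0) + 1
        return True


tree = DialogueTree()
tree.update("turn on the light", "lamp_on", dialogue_time=0.0, action_time=5.0)
tree.update("turn on the light", "door_open", dialogue_time=0.0, action_time=120.0)
print(tree.actions_by_phrase)
```

In this sketch the second call is rejected because the action happens too long after the dialogue, so only the first phrase/action pair is stored.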

Method and apparatus for encoding and decoding audio signal

Provided are an apparatus and a method for encoding and decoding audio signals, in which when determining a masking threshold according to a psychoacoustic model, accurate results may be obtained for a short window-based audio signal as well as for a long window-based audio signal. The apparatus for encoding audio signals according to the present invention comprises a masking threshold determining unit configured to determine, on the basis of a frame length of a first window having a divided audio signal, a masking threshold for a second window that has a different frame length from that of the first window.

Methods and Systems For Automated Interactive Quran Education

Methods and systems for giving feedback to users about the correct pronunciation of phrases in Quranic recitation are disclosed. The method comprises playing an exemplary pronunciation on a sound device, optionally displaying it on a display device, and automatically starting sound recording for a response. After recording, the response is analyzed with an automated speech recognition system and a comparison mechanism. Based on predefined correctness criteria, the method then advances to the next target phrase if the response is correct or repeats the same phrase if it is incorrect, without any extra user interaction.
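
The advance-or-repeat loop described above might look like the following minimal sketch. It assumes the speech recognizer has already turned each recorded response into a text transcript; the `is_correct` and `drill` functions and the 0.8 similarity threshold are hypothetical, and a plain string similarity from Python's `difflib` stands in for the unspecified comparison mechanism.

```python
import difflib


def is_correct(recognized, target, threshold=0.8):
    # Assumed correctness criterion: string similarity at or above a threshold.
    ratio = difflib.SequenceMatcher(None, recognized, target).ratio()
    return ratio >= threshold


def drill(targets, responses, threshold=0.8):
    """Return the index of the target phrase presented on each turn.

    `responses` is an iterable of recognized transcripts, one per turn.
    A correct response advances to the next phrase; an incorrect one
    repeats the same phrase, with no extra user interaction.
    """
    presented = []
    index = 0
    for recognized in responses:
        if index >= len(targets):
            break  # all target phrases completed
        presented.append(index)
        if is_correct(recognized, targets[index], threshold):
            index += 1
    return presented


targets = ["bismillah", "alhamdulillah"]
responses = ["xxxx", "bismillah", "alhamdulillah"]
print(drill(targets, responses))
```

Here the first phrase is presented twice because the first response does not match it.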

System for grasping keyword extraction based speech content on recorded voice data, indexing method using the system, and method for grasping speech content
10304441 · 2019-05-28

Disclosed are a system for grasping keyword extraction based speech content on recorded voice data, an indexing method using the system, and a method for grasping speech content. An indexing unit receives voice data, performs per-frame voice recognition with reference to a phoneme to form a phoneme lattice, generates divided indexing information for a frame of a limited time configured with a plurality of frames, and stores the same in an indexing database, the divided indexing information including a phoneme lattice formed for each frame of the limited time. A searcher uses a keyword input by a user as a search word, compares it with the divided indexing information stored in the indexing database with reference to a phoneme, searches for a phoneme string matching the search word, and finds a voice portion corresponding to the search word through a precise acoustic analysis of the matching phoneme string. A grasper grasps a representative word from the search result returned by the searcher and outputs it to the user so that the user can grasp the speech content of the voice data.
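
The phoneme-level search step can be illustrated with a deliberately simplified sketch. It assumes each indexed segment stores a single best phoneme sequence rather than a full phoneme lattice, and it omits the later acoustic analysis; the `find_keyword` function and the index layout are hypothetical.

```python
def find_keyword(index, keyword_phonemes):
    """Find exact occurrences of a keyword's phoneme string in an index.

    `index` is a list of (segment_start_seconds, phoneme_list) pairs, an
    assumed simplification of the divided indexing information.
    Returns (segment_start_seconds, phoneme_offset) for each match.
    """
    n = len(keyword_phonemes)
    if n == 0:
        return []
    hits = []
    for start_time, phonemes in index:
        # Slide a window of the keyword's length across the segment.
        for i in range(len(phonemes) - n + 1):
            if phonemes[i:i + n] == keyword_phonemes:
                hits.append((start_time, i))
    return hits


index = [
    (0.0, ["h", "e", "l", "o"]),
    (10.0, ["dh", "e", "k", "a", "t"]),
]
print(find_keyword(index, ["k", "a", "t"]))
```

A lattice-based index would match against alternative phoneme hypotheses per frame instead of one sequence, which this sketch does not attempt.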

METHOD AND APPARATUS FOR SEARCHING FOR GEOGRAPHIC INFORMATION USING INTERACTIVE VOICE RECOGNITION

An apparatus for searching for geographic information using interactive voice recognition includes: a receiver configured to receive a voice signal; a voice recognition unit configured to recognize the voice signal; a result analysis processing unit configured to search for geographic information on the basis of the recognized voice signal, and analyze a search result of the geographic information; and a question generating unit configured to generate a question in response to the result of determination. A method for searching for geographic information using interactive voice recognition includes: receiving a voice signal, and recognizing the voice signal; searching for geographic information on the basis of the recognized voice signal; analyzing a search result of the geographic information; and generating a question in response to the result of determination.
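
One way to picture the question generating unit is the sketch below. The abstract does not say how the search result is analyzed, so the rule used here (ask again when nothing matches, ask for a choice when several places match) is purely an assumption, as are the `generate_question` name and the wording of the questions.

```python
def generate_question(results):
    """Return a follow-up question for a geographic search, or None if none is needed.

    `results` is a list of place names returned for the recognized utterance.
    The branching rule is an illustrative assumption, not the patented method.
    """
    if not results:
        return "No matching place was found. Could you say the name again?"
    if len(results) == 1:
        return None  # a unique result needs no clarification
    options = ", ".join(results)
    return f"Which one do you mean: {options}?"


print(generate_question(["Springfield, IL", "Springfield, MA"]))
```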

System and method of automated evaluation of transcription quality
10147418 · 2018-12-04

Systems and methods automatically evaluate transcription quality. Audio data is obtained. The audio data is segmented into a plurality of utterances with a voice activity detector operating on a computer processor. The plurality of utterances are transcribed into at least one word lattice with a large vocabulary continuous speech recognition system operating on the processor. A minimum Bayes risk decoder is applied to the at least one word lattice to create at least one confusion network. At least one conformity ratio is calculated from the at least one confusion network.
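
For readers unfamiliar with confusion networks, the sketch below shows the data structure in its simplest form: a sequence of slots, each mapping competing words to posterior probabilities, from which a best word per slot can be read off. The `consensus_path` function and the example probabilities are hypothetical, and the sketch does not compute the conformity ratio, which the abstract does not define.

```python
def consensus_path(confusion_network):
    """Pick the highest-posterior word in each slot of a confusion network.

    `confusion_network` is a list of dicts mapping word -> posterior
    probability; this layout is an assumed, simplified representation.
    """
    return [max(slot, key=slot.get) for slot in confusion_network]


network = [
    {"the": 0.9, "a": 0.1},
    {"cat": 0.6, "cap": 0.4},
    {"sat": 0.7, "sad": 0.3},
]
print(consensus_path(network))
```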