CALLER IDENTIFICATION IN A SECURE ENVIRONMENT USING VOICE BIOMETRICS
20200389554 · 2020-12-10
Inventors
CPC classification
H04M3/2281
ELECTRICITY
H04M3/537
ELECTRICITY
International classification
H04M3/42
ELECTRICITY
Abstract
A method for passive enrollment and identification of a telephone caller to a called telephone number, comprising the steps of audio recording a telephone call; identifying and separating any multiple speakers on the telephone call and specifying one of the multiple speakers; creating a net speech portion of the telephone call by trimming portions of audio recording from the beginning and end of the audio recording; processing the net speech portion against an existing Biometric Voice Print (BVP) database; creating a new BVP for the at least one of the multiple speakers if no match of the net speech portion against the BVP database is found in the processing step; comparing subsequent calls against the BVP, whether existing or created, to identify the at least one of the multiple speakers; and associating in a cluster all subsequent calls having voice prints matching the BVP.
Claims
1. A method for passive enrollment and identification of a telephone caller in a telephone call to a called telephone number, comprising the steps of: a) selecting one of said speakers; b) designating a net speech portion of speech by said selected speaker by trimming portions of said speech from the beginning and end of said telephone call; c) creating a biometric voice print of said net speech portion; d) processing said biometric voice print against an existing biometric voice print database; e) creating a new biometric voice print for said selected speaker if no match of said biometric voice print against said database is found in said processing step; and f) iterating said processing step a plurality of times during the remainder of said telephone call to detect changes in identity of said selected speaker.
2. A method in accordance with claim 1 comprising the further step of comparing voice prints of callers on subsequent telephone calls against said BVP, whether existing or created, to identify again said selected speaker.
3. A method in accordance with claim 1 comprising the further step of audio recording said telephone call including the voice of said speaker prior to said designating step.
4. A method in accordance with claim 1 comprising the further step of associating in a cluster all subsequent calls having voice prints matching said BVP.
5. A method in accordance with claim 2 comprising the further steps of: a) identifying telephone numbers associated with said subsequent calls; and b) clustering all such telephone numbers in association with said called number.
6. A method in accordance with claim 1 wherein said creating step comprises the steps of: a) computing the AIP/AIS of said caller; and b) triggering creation of a BVP only if said AIP/AIS exceeds a predetermined threshold.
7. A method in accordance with claim 1 wherein said processing step is carried out if said net speech portion is at least seven seconds long.
8. A method in accordance with claim 1 wherein said creating step is carried out if said net speech portion is at least 30 seconds long.
9. A method in accordance with claim 1 comprising the further steps of: a) downloading a plurality of recordings of calls from said caller; b) processing said plurality of recordings to yield a plurality of net speech portions; c) generating a plurality of voice prints from said plurality of net speech portions; and d) creating an enhanced BVP from said new BVP and said plurality of voice prints.
10. A method in accordance with claim 1 wherein said caller is selected from the group consisting of an inmate within a secure facility and a civilian outside said secure facility.
11. A method in accordance with claim 1 wherein both the caller and the person being called can be identified from a single recording comprising speech of both persons.
12. A method in accordance with claim 1 comprising the further steps of: a) determining whether the caller is a person of interest; b) alerting the facility that a person of interest is leaving a voice mail; c) identifying instances of said caller using multiple telephone numbers to leave voice mail messages for the same or multiple inmates.
13. A method in accordance with claim 1, comprising the further steps of: a) associating a Billing Telephone Number with said telephone caller; b) preprocessing said call to normalize volume, suppress silence, and reduce ambient noise; c) isolating said caller's voice from any other voices on said telephone call; d) trimming the beginning and end of said call; e) extracting at least seven seconds of net speech of said caller; f) creating a voice print of said net speech; g) comparing said voice print of said caller against the BVP database.
14. A method in accordance with claim 13, comprising the further steps of: a) establishing a test window length within said telephone call; b) continuously analyzing speaker voice input during said call; c) generating a first test voice identification score at the conclusion of said first test window; d) starting a second test window before the end of said first test window such that said second test window overlaps a portion of said first test window; e) starting one or more additional test windows wherein each of said additional test windows overlaps a portion of the immediately preceding test window.
15. A method in accordance with claim 14 wherein each of said test windows is equal in length.
16. A method in accordance with claim 13 wherein said first test window is between 7 and 30 seconds in length, and wherein said second test window begins between 3.5 and 15 seconds into said call.
17. A method in accordance with claim 13 comprising the further step of continuing to monitor the identity of said caller throughout said call.
18. A method in accordance with claim 13 comprising the further step of identifying all existing voice prints matching said incoming caller voice prints.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0036] The disclosure, both as to organization and method of practice, together with further objects and advantages thereof, may best be understood by reference to the following description taken in connection with the accompanying drawings in which:
[0037]
[0038]
[0039]
[0040]
DETAILED DESCRIPTION OF THE INVENTION
[0041] A novel system and method are disclosed to passively enroll and authenticate individuals inside secure facilities and persons-of-interest (POI) outside such facilities during both live telephone calls and pre-recorded calls (phone message), and to monitor continuously the identity of one or more speakers in a telephone call. The enrollment and authentication process, as described in greater detail hereinbelow, is completely invisible to the speakers.
[0042] The system cuts down on costs significantly since there is no need to formally enroll persons inside the facility, which typically requires substantial supervisor time during the enrollment process. Further, the process is text and language independent. The only required element to create an initial BVP for a speaker is at least one call of more than 30 net seconds of speech for that particular speaker, although additional calls are necessary for the creation of a high-quality BVP as described below.
[0043] A unique aspect of the system not found in the prior art is the ability to automatically create BVPs of POIs outside the facility. A system limited to authenticating POIs inside the secure facility alone would offer limited incremental value over other forms of authentication. The secure facility knows the general location of any inmate at all times and a caller from within the facility can only be one among a very limited set of possible candidates, depending on the housing breakdown. On the other hand, a real need for the facility is the ability to identify the individuals receiving outgoing calls from within the facility who could possibly be involved in a criminal activity in collusion with the inmates. The current disclosure describes an automated process to create BVPs for the caller as well as the called party in calls originating either inside or outside a facility.
[0044] In a world where a telephone number is largely meaningless for identification purposes, being able to authenticate callers by voice alone is a critical feature. Previous methods of the prior art, requiring a formal enrollment process of all callers, make this impossible. The process of the current disclosure makes authenticating by voice alone a reality by being able to use a recording to create a BVP of a new caller from as little as 30 seconds of net speech, and then to monitor caller identities during the remainder of a call.
[0045] 1. Passive Enrollment of Speakers
[0046] In a currently preferred embodiment, a high quality BVP can be generated from processing recordings of multiple calls. The number of calls is not critical, although it has been found that three is a satisfactory number of calls with which to work, provided that at least 30 seconds of net speech is available for processing.
[0047] In a currently preferred embodiment, the system can process calls recorded at different times and from different numbers. This method captures different characteristics of the caller's voice, training the system to recognize the caller in different circumstances, and produces a high-quality BVP.
[0048] 2. Creation of a BVP without Formal Enrollment
[0049] In a currently preferred embodiment, the system of the disclosure relies on a plurality of pre-recorded calls to create a BVP of a target speaker without formal enrollment. The BVP can also be produced during a live call.
[0050] Referring to
[0056] 3. Authentication of Incoming or Outgoing Speaker with Existing BVP
[0057] If a caller, e.g., a civilian caller 29 as shown in
[0066] The system is able to identify one or both speakers based upon as little as seven seconds of net speech in a single call, which permits the system to operate in real time and to continue to confirm speaker identification throughout the duration of the call.
[0067] In another embodiment, the process may also be run in real time via Continuous Window Processing to determine throughout the length of a call whether any of the speakers have changed. An occasion where this is useful is if the inmate is on a watch list and the agency wants to know who is leaving him a voice mail in real time.
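The overlapping test windows used in Continuous Window Processing, as recited in claims 14 to 16, can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `window_bounds`, the 50% overlap default, and the representation of times as floating-point seconds are all hypothetical. The source states only that a test window is between 7 and 30 seconds long and that each window overlaps part of the preceding one; a 50% overlap is merely consistent with the second window starting between 3.5 and 15 seconds into the call.

```python
def window_bounds(call_length_s, window_s=7.0, overlap_fraction=0.5):
    """Return (start, end) times, in seconds, of overlapping test windows.

    Hypothetical helper: every window after the first begins before the
    previous window ends, so an identification score could be produced at
    the end of each window and compared with the preceding score.
    """
    if not 0.0 < overlap_fraction < 1.0:
        raise ValueError("overlap_fraction must be strictly between 0 and 1")
    step = window_s * (1.0 - overlap_fraction)  # offset between window starts
    bounds = []
    start = 0.0
    # Only emit windows that fit entirely inside the call.
    while start + window_s <= call_length_s:
        bounds.append((start, start + window_s))
        start += step
    return bounds
```

With equal-length windows, a change of speaker partway through the call would fall inside at least one window in full, which is the reason for overlapping them rather than placing them end to end.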
[0068] Referring to
[0069] 4. Passive Enrollment of Outside Parties
[0070] One aspect of the current disclosure is the ability to create a BVP of a called party, whether inside or outside a facility, without the need for formal enrollment. In a preferred embodiment, the voice print of every speaker is processed against the voice print database. If no match is found, a trigger may automatically generate the creation of a new BVP. That person will be assigned by default the name of the called party, if known.
[0071] In another embodiment, a trigger for producing a BVP is based on the output from a data mining algorithm whose output is an Actionable Intelligence Potential (AIP) or Actionable Intelligence Score (AIS). The AIP/AIS is generated by mining the connections between the individual that has been called and other inmates/calls, emails, communications, financial transactions, etc. The trigger is based on one or more thresholds that can be adjusted as a function of the probability that the target speaker is a known Person of Interest (POI).
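A threshold trigger of this kind might look like the sketch below. The disclosure does not specify how the AIP/AIS is computed or how the threshold is adjusted, so everything here is an assumption made for illustration: the function name `should_create_bvp`, the 0-to-1 score scale, the default threshold values, and the linear interpolation between them.

```python
def should_create_bvp(ais_score, poi_probability,
                      base_threshold=0.8, min_threshold=0.5):
    """Decide whether an AIP/AIS score should trigger BVP creation.

    Hypothetical rule: the threshold falls linearly from base_threshold
    to min_threshold as the probability that the speaker is a known
    Person of Interest rises from 0 to 1.
    """
    if not 0.0 <= poi_probability <= 1.0:
        raise ValueError("poi_probability must be between 0 and 1")
    threshold = (base_threshold * (1.0 - poi_probability)
                 + min_threshold * poi_probability)
    # Creation is triggered only when the score exceeds the threshold.
    return ais_score > threshold
```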
[0072] In another preferred embodiment, a trigger to create a BVP is based on certain keywords identified either automatically or manually by an investigator listening to a phone call. The investigator can then request that the system create a BVP for the individual, if a BVP does not already exist, and begin the process of searching for those calls based on the individual's voice.
[0073] The identification process is as follows:
[0074] a) uploading the call of the called party;
[0075] b) preprocessing the call to normalize volume, suppress silences, and reduce ambient noise;
[0076] c) separating the speakers in each recording through a speaker separation process;
[0077] d) trimming the beginning and end of the call to remove telephone system prompts;
[0078] e) extracting at least seven seconds of net speech of the called party;
[0079] f) processing the net speech of the called party against the BVP database;
[0080] g) matching the called party to a BVP;
[0081] h) if no match is found, triggering the creation of a new BVP from the call;
[0082] i) if less than 30 seconds of net speech is available from the call, searching the call database for additional calls involving the called party;
[0083] j) if no other calls involving the called party can be found, then setting an alarm against the called party to use future calls to trigger creation of a new BVP; and running the call database against the newly created BVP.
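The branching in steps e) through j) above can be summarized as a small decision function. This is a sketch only: the names `Action` and `next_action` are hypothetical, the audio processing and database matching are assumed to have happened elsewhere, and only the 7-second and 30-second figures come from the disclosure.

```python
from enum import Enum

MIN_IDENTIFY_S = 7.0   # net speech needed to attempt identification (from the text)
MIN_ENROLL_S = 30.0    # net speech needed to create a new BVP (from the text)


class Action(Enum):
    INSUFFICIENT_SPEECH = "insufficient_speech"
    MATCHED = "matched"
    CREATE_BVP = "create_bvp"
    SEARCH_MORE_CALLS = "search_more_calls"
    SET_ALARM = "set_alarm"


def next_action(net_speech_s, match_found, other_calls_available):
    """Map the outcome of one processed call to the next step.

    Hypothetical helper: match_found stands for the result of comparing
    the net speech against the BVP database, and other_calls_available
    for the result of searching the call database.
    """
    if net_speech_s < MIN_IDENTIFY_S:
        return Action.INSUFFICIENT_SPEECH   # step e) cannot be satisfied
    if match_found:
        return Action.MATCHED               # step g)
    if net_speech_s >= MIN_ENROLL_S:
        return Action.CREATE_BVP            # step h)
    if other_calls_available:
        return Action.SEARCH_MORE_CALLS     # step i)
    return Action.SET_ALARM                 # step j)
```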
[0084] This procedure may be followed in real time, defined herein as being on a live telephone call rather than a recorded call. The call is processed through a SIP server which analyzes the call. After 7 seconds of speech, the system can identify the caller.
[0085] In another preferred embodiment, a cross-reference is kept of every incoming and outgoing telephone number against the BVPs of all inmates. Biometrics on voice mail recordings can be used, e.g., as follows:
[0086] 1) identify the caller;
[0087] 2) determine whether the caller is an ex-inmate/parolee (a POI);
[0088] 3) alert the facility of a person of interest/under investigation who is leaving a voice mail;
[0089] 4) identify instances of one caller using multiple numbers to leave VMs for the same inmate, or for multiple inmates.
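The last use listed above, finding one caller who leaves voice mails from several telephone numbers, amounts to grouping voice mail records by the biometrically identified caller. The sketch below assumes that identification has already produced a caller identifier for each voice mail; the function name `callers_using_multiple_numbers` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def callers_using_multiple_numbers(voicemails):
    """Group voice mails by identified caller and report multi-number callers.

    voicemails: iterable of (caller_id, phone_number, inmate_id) tuples,
    where caller_id is assumed to come from a prior voice print match.
    """
    numbers = defaultdict(set)
    inmates = defaultdict(set)
    for caller_id, phone_number, inmate_id in voicemails:
        numbers[caller_id].add(phone_number)
        inmates[caller_id].add(inmate_id)
    # Keep only callers seen on more than one telephone number.
    return {
        caller_id: {
            "numbers": sorted(used),
            "inmates": sorted(inmates[caller_id]),
        }
        for caller_id, used in numbers.items()
        if len(used) > 1
    }
```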
[0090] 5. Enhancement of BVPs
[0091] The ability to enhance an existing BVP over time is an important factor for ensuring that the identification accuracy of the BVP is as high as possible. A poor quality BVP will result in more false positive and false negative results. A BVP can be of poor quality for a number of reasons: one of the calls used to generate the BVP may erroneously include another speaker's voice, or the calls used were not of high quality to begin with, or more audio is needed to ensure that the BVP is of the highest quality. A proprietary algorithm automatically identifies BVPs that could use enhancement if they are consistently receiving poor identification scores. The system expects that if the correct person is being run against the BVP a certain threshold score should be attained; if it is not, the system flags the BVP for enhancement.
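The flagging rule is described as proprietary and is not disclosed, so the following is only one plausible reading of "consistently receiving poor identification scores": a BVP is flagged when its most recent scores all fall below the expected threshold. The function name `needs_enhancement`, the score scale, and both default values are assumptions.

```python
def needs_enhancement(recent_scores, expected_score=0.7, min_samples=3):
    """Flag a BVP whose latest identification scores are all below threshold.

    Hypothetical rule: with fewer than min_samples scores there is not
    enough evidence, so the BVP is not flagged.
    """
    if len(recent_scores) < min_samples:
        return False
    latest = recent_scores[-min_samples:]
    return all(score < expected_score for score in latest)
```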
[0092] The enhancement can occur using multiple techniques. One way that the algorithm enhances the BVP is by using a call that has been through the identification process and has been assigned an exceptionally high score. This indicates that this call is an excellent representative sample of the person that is being identified and, as such, should be used to improve the BVP. This enhancement also improves future identifications, since the BVP is what drives the accuracy of the system. This method of improvement can be applied repeatedly to improve the BVP.
[0093] An additional method is a scheduled process whereby the system, after a designated period of time (e.g., week, month, three months, six months, etc.) selects the highest scoring identified calls and uses them to improve the appropriate BVPs (a call identified with a very high score for a particular individual is used to improve that individual's BVP). This process automatically occurs during the designated periods, ensuring that the BVP continues to improve and be of high quality.
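The selection step of such a scheduled process could be sketched as below. The scheduling itself and the actual BVP update are outside the sketch, and the function name `select_enhancement_calls`, the record layout, the minimum score, and the per-BVP limit are illustrative assumptions rather than values taken from the disclosure.

```python
def select_enhancement_calls(identified_calls, min_score=0.95, per_bvp=3):
    """Pick the highest-scoring identified calls for each BVP.

    identified_calls: iterable of (bvp_id, call_id, score) tuples for
    calls identified during the designated period.
    Returns a dict mapping bvp_id to call ids, best score first.
    """
    by_bvp = {}
    for bvp_id, call_id, score in identified_calls:
        if score >= min_score:  # only very high scores qualify
            by_bvp.setdefault(bvp_id, []).append((score, call_id))
    return {
        bvp_id: [call_id for _, call_id in sorted(scored, reverse=True)[:per_bvp]]
        for bvp_id, scored in by_bvp.items()
    }
```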
[0094] 6. Cluster Matching of BVP
[0095] There are circumstances where calls have been assigned by default to a particular target. This would be, for instance, when an outside party calls in and is assigned to a particular calling number. The system may not be able to identify the caller from the database. At a later date, the same person may be calling in and be personally identified. The original call assigned to the calling number is then reassigned to the newly identified caller, forming a call cluster. Additional calls from this or other numbers as identified are added to the cluster.
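The reassignment described above can be illustrated with a short sketch. It assumes each call record is a dictionary whose `caller_id` is `None` while the call is assigned only by default to a number, and that a caller-supplied predicate `voice_matches` stands in for the voice print comparison; these names and the record layout are hypothetical.

```python
def build_cluster(calls, caller_id, voice_matches):
    """Reassign default-assigned calls to a newly identified caller.

    calls: list of dicts with 'call_id', 'number', and 'caller_id' keys.
    voice_matches: function taking a call dict and returning True when
    the call's voice print matches the newly identified caller's BVP.
    Returns the call ids that now form the caller's cluster.
    """
    cluster = []
    for call in calls:
        if call["caller_id"] is None and voice_matches(call):
            call["caller_id"] = caller_id  # reassign the default-assigned call
        if call["caller_id"] == caller_id:
            cluster.append(call["call_id"])
    return cluster
```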
[0096] The introduction of a biometric identification for the person leaving the voice mail is invaluable for intelligence personnel and agencies. Often, the facility has a general idea of who should be tied to that number since the friends and family members often deposit funds for their loved ones in the prison and, in order to do so, must confirm their identity. This gives the process a starting point to match the person's voice to a telephone number. However, oftentimes there is no place to start in terms of identifying the caller. This is where proprietary software comes into use. Having the ability to voice print both sides of a call, the software can already have a voice associated with that number from previous calls to or from that number. The system then checks to see if the new voice print is a positive match. (Additionally, a voice mail is an excellent basis from which to establish a voice print, as such a call is simply the person talking without interruption.) The system can create a BVP for the caller based on a VM and use it to identify the caller in the future. The biometric process can occur after the VM has been completed in an offline process. This can be the most efficient means of identifying the callers, as each voice mail recording is run against the database of BVPs, and the identity with the accompanying score is returned.