Developers, Get Ready to Have Fun!

Voice biometrics is cool! It's a thrill to programmatically analyze someone's voice and get back information from that voice. And fortunately, voice biometrics is easy to embed within your existing application. The VBG API works the same whether you use the VBG Cloud or we install our software on your premises. And it works the same whether you use it as part of a mobile application, an IVR telephony application, or a web page.

You can use your programming language of choice to access the services provided through our RESTful APIs. If you want to use VoiceXML from within your IVR, you can do that too, although the RESTful API is the preferred approach. If you don't have an IVR but need one, we offer a third-party solution, either in our Cloud or installed on your premises. But an IVR is not necessary. Any audio source, including embedded native mobile applications, will work provided you deliver us audio in a standard .WAV format.
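Before sending audio anywhere, it can help to confirm what your capture path actually produces. The sketch below uses Python's standard wave module to report the basic properties of a .WAV recording. The helper names and the 8000 Hz mono test signal are illustrative assumptions; the sample rate and encoding that VBG expects are not stated on this page.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read the header of an in-memory WAV recording and summarize it."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }


def make_silence(seconds: int = 1, rate: int = 8000) -> bytes:
    """Create a silent 16-bit mono WAV, used here only as test input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * (rate * seconds))
    return buf.getvalue()


info = describe_wav(make_silence())
print(info)
```

Running the same check on a real recording from your IVR or mobile app shows whether it is mono or stereo, its bit depth, and its length before you submit it.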

We encourage you to contact us for a free consultation and educational session on voice biometrics. But if you want to get better acquainted with voice biometrics on your own, this page will give you a high-level understanding of the programming flow so you can begin to assess the scope of work required on your side. If you are new to voice biometrics, we strongly recommend going through our tutorial on the overall considerations of authentication and fraud detection. If you then determine that the programming task and voice biometrics technology match your needs, request and execute an electronic Non-disclosure Agreement. Once we have an NDA, we’ll send you detailed documentation and get you on a path to a fully capable Free Trial of our service.

Use Your Language of Choice

As long as you can make HTTPS requests -- and what language doesn't allow that? -- then you can access voice biometric capabilities. Once we launch your trial, you will receive an Account ID and an Access Key. Those two items are the credentials you need to make requests to VBG's services through our designated URL.
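As a rough illustration of how small the client side can be, the sketch below builds an HTTPS request that carries the two credentials, using only Python's standard library. The base URL, the path, the header names, and the payload fields are all placeholders invented for this example; the real values come from the documentation you receive under NDA.

```python
import json
import urllib.request

# Placeholder only: VBG supplies the designated URL when your trial starts.
BASE_URL = "https://vbg.example.com/api"


def build_request(account_id: str, access_key: str, path: str, payload: dict):
    """Build (but do not send) a JSON POST request with both credentials.

    The header names below are hypothetical, not the documented VBG API.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Account-Id": account_id,   # hypothetical header name
            "X-Access-Key": access_key,   # hypothetical header name
        },
    )


req = build_request(
    "my-account-id",
    "my-access-key",
    "transactions",                        # hypothetical path
    {"type": "enroll", "user_id": "crm-1001"},
)
# Sending it would be: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to unit test your integration without touching the network.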

The Basics

The core of what you do with your HTTPS requests is send a sample of someone's voice, a speech sample, to the voice biometrics platform for analysis and comparison, get a result back, and act on the result. The rest of what happens in a voice biometrics solution is the necessary detail: setting up users in the voice biometrics platform database, providing an identifier of whose speech you are sending, and setting whether the speech sample is to be used as part of the voiceprint template (i.e., "enrolling") or compared against an existing voiceprint or voiceprints (i.e., verifying or identifying).

Think Transactions

To deal with the different reasons you are sending speech samples and also to impose security restrictions, we organize the back and forth between your application and the VBG Voice Biometrics platform around the concept of a transaction. For example, you request to create a transaction to enroll a user and get back a transaction id. Using that transaction id, you then submit multiple speech samples. Once you have finished submitting samples, using that same transaction id, you request to make a voiceprint from those samples and then close the transaction. For security purposes, we impose a time limit on how long the transaction id remains valid. If your application abandons the transaction, we automatically close it.
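One way to mirror the transaction idea in client code is a context manager that always closes the transaction and refuses to use an ID past a local deadline. This is a minimal sketch: the open and close callables stand in for whatever requests your integration makes, and the 60-second lifetime is an invented value, since the actual time limit is not stated here.

```python
import time
from contextlib import contextmanager


class TransactionExpired(Exception):
    """Raised locally when the transaction ID is used after its deadline."""


@contextmanager
def transaction(open_fn, close_fn, ttl_seconds, clock=time.monotonic):
    """Open a transaction, yield an ID getter, and always close on exit."""
    txn_id = open_fn()
    deadline = clock() + ttl_seconds

    def current_id():
        # Call before each submission so a stale ID is caught client-side.
        if clock() > deadline:
            raise TransactionExpired(txn_id)
        return txn_id

    try:
        yield current_id
    finally:
        close_fn(txn_id)  # runs even if the body raised


closed = []
with transaction(lambda: "txn-42", closed.append, ttl_seconds=60) as current_id:
    used = current_id()  # submit speech samples with this ID

print(used, closed)
```

The platform still closes abandoned transactions on its own; the local deadline only spares you a request that is likely to be rejected.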

The full process for a typical enrollment, assuming you are capturing speech samples from a telephone call using the RandomPIN™ use case, is as follows:

  1. If not already created, request VBG to create a user id that matches how you want to track this particular caller. You may use an account ID from your CRM or a phone number or anything else you choose.
  2. Request a transaction ID for enrollment for a particular user and store that in a variable. VBG will also send you the digits to prompt the caller to speak.
  3. Prompt the user to speak the prescribed string of digits and record the voice sample.
  4. Send the sample to VBG using the transaction ID along with what the caller was supposed to have spoken.
  5. Get a response back that tells you if the sample is accepted, or if the sample has too much noise, or if it does not have enough speech, and so forth, along with the next prompt to speak, which may be the same one as before if the sample was rejected.
  6. If not ok, then retry up to a reasonable number of times, typically up to three attempts.
  7. If ok, then have the caller speak the next set of digits.
  8. Repeat the prompting and recording until enough voice samples are collected for your use case.
  9. Close the transaction and confirm that the enrollment was either ok or not ok.

That is the basic logic for enrollment for RandomPIN™. Details come into play with the parameters you specify, for example whether or not to check the spoken content. Verification for RandomPIN™ is even simpler: typically you create a transaction, collect and submit a single sample, get a response, retry if necessary, and then close the transaction.
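The enrollment steps above can be sketched as a single loop. Everything platform-facing here is a stand-in: StubPlatform is an in-memory fake, its method names and response fields are hypothetical, and the number of samples needed is an assumed value that depends on your use case.

```python
from dataclasses import dataclass, field

MAX_RETRIES = 3      # "typically up to three attempts"
SAMPLES_NEEDED = 3   # assumed value; depends on your use case


@dataclass
class StubPlatform:
    """In-memory fake of the platform; all method names are hypothetical."""
    prompts: list = field(default_factory=lambda: ["4815", "1623", "4208"])
    accepted: int = 0

    def create_user(self, user_id):
        return user_id

    def start_enrollment(self, user_id):
        return "txn-1", self.prompts[0]

    def submit_sample(self, txn_id, audio, expected_digits):
        ok = len(audio) > 0  # stub rule: empty audio means not enough speech
        if ok:
            self.accepted += 1
        # A rejected sample gets the same prompt back, as in step 5.
        nxt = self.prompts[min(self.accepted, len(self.prompts) - 1)]
        return {"status": "ok" if ok else "not_enough_speech", "next_prompt": nxt}

    def close_enrollment(self, txn_id):
        return self.accepted >= SAMPLES_NEEDED


def enroll(platform, user_id, record):
    """Run steps 1 to 9; `record(prompt)` returns the caller's audio bytes."""
    platform.create_user(user_id)                          # step 1
    txn_id, prompt = platform.start_enrollment(user_id)    # step 2
    collected, failures = 0, 0
    while collected < SAMPLES_NEEDED:                      # step 8
        audio = record(prompt)                             # step 3
        reply = platform.submit_sample(txn_id, audio, prompt)  # step 4
        prompt = reply["next_prompt"]                      # step 5
        if reply["status"] == "ok":                        # step 7
            collected, failures = collected + 1, 0
        else:                                              # step 6
            failures += 1
            if failures >= MAX_RETRIES:
                break
    return platform.close_enrollment(txn_id)               # step 9


print(enroll(StubPlatform(), "crm-1001", lambda prompt: b"fake-wav-bytes"))
```

Passing the platform and the recorder in as arguments lets you swap the stub for real HTTPS calls and a real IVR recording step without changing the loop.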

Interpreting Results

When you submit a voice sample, either for enrollment or verification, our platform first tests the audio for acceptable quality. For example, the audio must not have too much background noise or be too quiet. Further, in most applications, the content of the voice sample must match what the person was asked to say. If any of these conditions is not satisfied, your program will receive a corresponding error message. The logical next step is to collect another sample and submit again until you either collect a sample of sufficient quality or determine that the person is in a place that is too noisy or is not complying. This leads to a branch in your application code.

If the audio quality is not sufficient even after retries, then we recommend instructing the speaker to start over after moving to a location that is less noisy and where the signal is stronger. Or, if the speaker continues to fail to match what they are supposed to say, they will need assistance from a human.
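The branch described above might look like the following. The error codes are invented for illustration, since the platform's actual error messages are not listed on this page, and the choice to decide based on the most recent failure is one possible policy among several.

```python
# Hypothetical error codes; the real ones come from the VBG documentation.
AUDIO_ERRORS = {"too_noisy", "too_quiet", "not_enough_speech"}


def next_action(errors, max_attempts=3):
    """Pick the next step from the list of rejections seen so far.

    `errors` holds one code per failed attempt, oldest first.
    """
    if len(errors) < max_attempts:
        return "retry"
    if errors[-1] in AUDIO_ERRORS:
        # Audio quality never recovered: ask the speaker to move and restart.
        return "ask_to_relocate"
    # Content kept mismatching (or the error is unknown): hand off to a person.
    return "transfer_to_agent"


print(next_action(["too_noisy"]))
print(next_action(["too_noisy", "too_noisy", "too_noisy"]))
print(next_action(["too_noisy", "content_mismatch", "content_mismatch"]))
```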

In most cases, the audio quality will be sufficient. Assuming the audio quality is sufficient, the VBG platform will then process the voice sample. In the case of enrollment, you will get an OK response that the sample was accepted. The VBG platform then combines the new voice sample with previously submitted voice samples. In the case of verification, your program will receive a response from the VBG platform indicating whether the voice sample passes or fails. You also receive a score. Your program can further query the VBG platform to receive historical statistics on the score levels for the claimed identity. This enables you to implement additional logic depending on the absolute and relative value of the score. We have a few other settings and methods that we will share with you under non-disclosure.
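As one example of logic built on both the absolute and the relative value of the score, the sketch below accepts, rejects, or asks for an extra check. The score scale, both thresholds, and the three outcome names are assumptions made for this example; the platform's actual score range and recommended settings are part of the documentation shared under NDA.

```python
from statistics import mean, stdev


def decide(passed, score, history, min_score=0.0, max_drop_sigmas=2.0):
    """Combine the pass/fail flag, the raw score, and the user's past scores.

    min_score and max_drop_sigmas are illustrative thresholds, not VBG values.
    """
    # Absolute check: the platform's verdict and a floor on the score.
    if not passed or score < min_score:
        return "reject"
    # Relative check: is this score unusually low for this claimed identity?
    if len(history) >= 2:
        spread = stdev(history)
        if spread > 0 and (mean(history) - score) / spread > max_drop_sigmas:
            return "step_up"  # e.g. ask a security question as well
    return "accept"


past_scores = [5.1, 4.9, 5.2, 5.0]
print(decide(True, 5.0, past_scores))
print(decide(True, 3.0, past_scores))
print(decide(False, 9.0, past_scores))
```

With no history or only a single past score, the function falls back to the absolute check alone.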

Design Based on Application

For verification requests, by far the bulk of what most applications do, VBG's system responds in sub-second time. The biggest delay is usually the transport across the internet between our Cloud and your site. Enrollments are also fast, usually within a one-second processing window after all samples are submitted. Therefore, Authentication applications are virtually all real-time.

For Identification applications, on the other hand, the response time varies based on how many voiceprints you want to search. Are you scanning a blacklist with 100 entries or 5,000? The larger the scan, the more computations necessary. The more computations, the more hardware necessary to achieve the computations in a given amount of time. Achieving real-time for Identification may be possible, provided you are willing to spend more on the hardware and system design. Consult with us on your business process. We will help you optimize your design for real-time or offline computation as a trade-off against other parameters in your business case.
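A back-of-the-envelope estimate can make this trade-off concrete. The sketch below assumes that the work grows linearly with the number of voiceprints and splits perfectly across parallel workers, which is a simplification. The 2 ms per comparison is a made-up figure used only to show the arithmetic; measure the real cost during your trial.

```python
def workers_needed(voiceprints, ms_per_comparison, budget_ms):
    """Estimate parallel workers needed to scan a list within a time budget.

    All arguments are whole numbers; times are in milliseconds.
    """
    total_ms = voiceprints * ms_per_comparison
    # Ceiling division on integers, with at least one worker.
    return max(1, -(-total_ms // budget_ms))


# Hypothetical cost of 2 ms per comparison and a 1-second response target.
print(workers_needed(100, 2, 1000))    # → 1
print(workers_needed(5000, 2, 1000))   # → 10
```

If the estimate exceeds the hardware you are willing to run, offline (batch) identification is the other side of the trade-off.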

Get Started Quickly

We make it easy to try voice biometrics using our Free Trial. Assuming you have your own audio capture, you can get started quickly with our web-based API tool to become familiar with the API without having to write any code.

Another very fast way to get started is to leverage our Cloud IVR. Instead of submitting audio samples yourself, you simply ask us to make a phone call, give us the phone number and user ID, and tell us whether this is an enrollment or a verification. Our ready-built application will do the rest. Check out this short video to see how you can quickly create a Proof-Of-Concept with this approach.