Are You A Developer?

Adding voice biometrics to your applications is fun and easy!

Get Ready to Have Fun!

Voice biometric software is leading-edge technology that is fortunately very easy to integrate into your existing applications. The VBG API works the same whether you use the VBG Cloud or we install our software on your premises. It also works the same whether you use it as part of a mobile application, an enterprise application, an IVR system, a call center environment, a web page, an IoT device, or a wearable device.

You can use your programming language of choice to access the services provided through our RESTful API. If you want to use VoiceXML from within your IVR, you can do that too, although the RESTful API is the preferred approach and it provides significantly more development power and flexibility for your applications. If you don't have an IVR, but need one, VBG can provide a third-party solution in our Cloud or installed on your premises. But IVR is not necessary. Any audio source, including embedded native mobile applications, will work provided you deliver us audio in a standard .WAV format.
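
If your application captures raw audio rather than ready-made files, you will need to wrap it in a WAV container before sending it. The following is a minimal sketch using only Python's standard library. The 8 kHz, 16-bit, mono defaults are assumptions chosen for illustration, not VBG requirements, so check the audio parameters your account actually expects.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 8000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV (RIFF) container.

    The defaults (8 kHz, mono, 16-bit) are illustrative assumptions.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the length of a WAV payload in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

Checking the duration before you submit a sample is a cheap way to catch empty or truncated recordings on your side.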

We encourage you to get in touch with us for a free consultation and educational session on voice biometrics. However, if you want to get better acquainted with voice biometrics on your own, this page will give you a high-level understanding of the programming flow so you can begin to assess the scope of work required on your side. If you are new to voice biometrics, we strongly recommend going through our Considering Voice Biometrics guide.

Develop in Your Language of Choice

The minimum requirements to use the VBG Platform are very simple:

  • Must be able to capture speech samples from within your application (if not using VBG's IVR)
  • Must be able to send your speech samples to VBG in an acceptable audio format
  • Must be able to communicate via HTTPS to a third-party service (like the VBG Platform).

As long as your development environment supports HTTPS requests to third-party services, you can access the power of the VBG Platform. Once we set up a trial account for you, you will receive an Account ID, an API Access Key, and a secure URL for the VBG Platform. Then you'll be ready to start developing.
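
To give a feel for what "communicate via HTTPS" means in practice, here is a minimal sketch that prepares an authenticated request with Python's standard library. The header names (X-Account-Id, X-Api-Key), the /transactions path, and the JSON body are all hypothetical placeholders. The real names come from the API Reference you receive with your account.

```python
import json
import urllib.request


def build_request(base_url: str, account_id: str, api_key: str,
                  path: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) an HTTPS POST carrying account credentials.

    Header names and the JSON body shape are assumptions for illustration.
    """
    if not base_url.lower().startswith("https://"):
        raise ValueError("the platform URL must use HTTPS")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Account-Id": account_id,  # hypothetical header name
            "X-Api-Key": api_key,        # hypothetical header name
        },
    )


# Usage: urllib.request.urlopen(request) would send it to your secure URL.
request = build_request("https://vbg.example.invalid/api", "acct-123",
                        "key-abc", "/transactions", {"type": "enroll"})
```

Any language with an HTTPS client can do the same thing. Nothing here is specific to Python.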

And GOOD NEWS, you can use whatever development language you like, just as long as it satisfies the minimum requirements outlined above.

The Basics

VBG's RESTful API has only 6 main request types and only about 20 total methods. Most developers only need 10-12 API calls to create well-behaved and fully-functioning applications.

As a developer, the very core of what you need to do with your HTTPS requests is send a sample of someone's speech to the voice biometrics platform for distillation into a voiceprint, or for analysis and comparison to one or more existing voiceprints. You will get a result back, and then act on the result as appropriate. The rest of what happens in a typical voice biometrics solution is the necessary detail, such as setting up unique user ids in the database, tagging sample submission requests with identifiers for a specific user, telling the system whether the speech sample is to be used as part of the voiceprint template (i.e., "enrolling") or compared against one or more existing voiceprints (i.e., verifying or identifying), supplying content-checking information, and so on.

The VBG Platform is extremely powerful, yet simple for most developers having experience with speech-related applications or web services. It is not unusual for developers to have fully-functioning systems up and running in hours or days -- not weeks or months!

Our Transaction Model

To deal with the different reasons you are sending speech samples, and to also impose security restrictions, we organize the back and forth between your application and the VBG Platform around the concept of a "transaction". You can think of this as a single session or authentication event with your end-user.

For example, you request to create a transaction to enroll a user (create their voiceprint) and the VBG Platform responds with a transaction id. Using that transaction id, you then submit multiple speech samples. Once you have finished submitting samples, using that same transaction id, you request to make a voiceprint from those samples and then close the transaction. For security purposes, we impose a time limit on how long the transaction id remains valid. If your application abandons the transaction, we automatically close it and categorize it as "failed".

The full process for a typical enrollment, assuming you are capturing speech samples during an IVR session using the RandomPIN™ use case, is as follows:

  • If not already created, request VBG to create a user id that matches how you want to track the unique caller. You may use an account ID from your CRM or any alphanumeric string of your choosing. However, you cannot use anything categorized as Personally Identifiable Information (PII). For this reason, we recommend "hashed ids" that equate to a foreign key in your database.
  • Request an enrollment transaction ID for the unique caller (user id) and store that in a variable. VBG will also send you the digits to prompt the caller to speak.
  • Prompt the user to speak the prescribed string of digits and record the voice sample.
  • Send the sample to VBG using the transaction ID along with what the caller was supposed to have spoken.
  • Get a response back that tells you if the sample is accepted, or if the sample has too much noise, or if it does not have enough speech, and so forth, along with the next prompt to speak, which may be the same one as before if the sample was rejected.
  • If not ok, then allow the user to retry by submitting a new sample (three total attempts is recommended).
  • If ok, then have the caller speak the next set of digits.
  • Repeat the prompting and recording until enough voice samples are collected for your use case.
  • Request that the system build the voiceprint from the collected samples.
  • Close the transaction and confirm that the enrollment was either ok or not ok.
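
The steps above can be sketched as a single loop. In this sketch, `call` stands in for one HTTPS request to the platform and `record` stands in for your prompt-and-capture logic. Both are supplied by you. The action names, field names, and the default of three samples are hypothetical and only illustrate the flow. The one detail taken from this page is that response code 0 means the sample was accepted. A keyed hash is shown as one possible way to derive a non-PII user id.

```python
import hashlib
import hmac
from typing import Callable, Dict

Call = Callable[..., Dict]        # hypothetical: one platform request -> reply
Record = Callable[[str], bytes]   # prompts the caller, returns WAV audio


def hashed_user_id(crm_account_id: str, secret: bytes) -> str:
    """Derive a stable, non-PII user id from an internal account id."""
    digest = hmac.new(secret, crm_account_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def enroll(call: Call, record: Record, user_id: str,
           samples_needed: int = 3, max_attempts: int = 3) -> bool:
    """Run one enrollment transaction; return True if it closed ok."""
    call("create_user", user_id=user_id)
    start = call("start_enrollment", user_id=user_id)
    txn, prompt = start["transaction_id"], start["prompt"]

    accepted = 0   # samples accepted so far
    attempts = 0   # consecutive rejections for the current sample
    while accepted < samples_needed:
        audio = record(prompt)
        reply = call("submit_sample", transaction_id=txn,
                     audio=audio, expected=prompt)
        prompt = reply["next_prompt"]  # may repeat after a rejection
        if reply["code"] == 0:
            accepted += 1
            attempts = 0
        else:
            attempts += 1
            if attempts >= max_attempts:
                # Give up, but still close so the transaction is not abandoned.
                call("close_transaction", transaction_id=txn)
                return False

    call("build_voiceprint", transaction_id=txn)
    result = call("close_transaction", transaction_id=txn)
    return result["code"] == 0
```

Passing `call` and `record` in as functions keeps the flow independent of your HTTP client and audio source, so the same logic works from an IVR, a mobile app, or a test harness.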

That is the basic logic for enrollment using our RandomPIN™ process. Other details come into play with the parameters you specify, for example whether or not to check the spoken content. Verification for RandomPIN™ is even simpler. Typically, you create a verification transaction for the specified user id, collect and submit a single sample, get a response, retry if necessary, and then close the transaction with the score and corresponding pass or fail result.
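
A verification flow along these lines might look like the sketch below. As before, `call` and `record` are stand-ins that you supply, and the action and field names are hypothetical. The sketch also assumes that the pass or fail result and the score arrive with the accepted sample's reply. Where they actually appear in the API is something to confirm in the API Reference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Call = Callable[..., Dict]        # hypothetical: one platform request -> reply
Record = Callable[[str], bytes]   # prompts the caller, returns WAV audio


@dataclass(frozen=True)
class VerificationResult:
    passed: bool
    score: float


def verify(call: Call, record: Record, user_id: str,
           max_attempts: int = 3) -> Optional[VerificationResult]:
    """Run one verification transaction.

    Returns the result, or None if no sample of acceptable quality was
    collected within max_attempts.
    """
    start = call("start_verification", user_id=user_id)
    txn, prompt = start["transaction_id"], start["prompt"]
    try:
        for _ in range(max_attempts):
            audio = record(prompt)
            reply = call("submit_sample", transaction_id=txn,
                         audio=audio, expected=prompt)
            if reply["code"] == 0:
                return VerificationResult(reply["passed"], reply["score"])
            prompt = reply["next_prompt"]
        return None
    finally:
        # Always close, so the transaction is never left to time out.
        call("close_transaction", transaction_id=txn)
```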

Interpreting Results

When you submit a speech sample, whether for enrollment or verification, the VBG Platform first tests to make sure the sample is of acceptable quality. For example, the sample must not have too much background noise or be too quiet. And for most active use cases, the content of the voice sample must match what the person was asked to say. If any of these conditions are not satisfied, your program will receive a corresponding error response code. The logical next step is for you to collect another speech sample and submit it again until you either collect a speech sample of sufficient quality or determine that the person is in a place that is too noisy or the person is not complying. This leads to a branch in your application code.

If the speech sample quality is still not sufficient after retries, we recommend instructing the speaker to start over after moving to a location that is less noisy or has a stronger signal. If the user continues to fail to match what they are supposed to say, they will need assistance from your support team.
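
The branch described above can be expressed as a small decision function. The rejection categories below are hypothetical labels, not actual VBG response codes, and the "more than half" rule for content mismatches is just one reasonable way to read "continues to fail". Map the real error codes from the API Reference onto whatever categories suit your application.

```python
from enum import Enum
from typing import Sequence


class Rejection(Enum):
    # Hypothetical categories; map the platform's real error codes to these.
    TOO_NOISY = "too_noisy"
    TOO_QUIET = "too_quiet"
    NOT_ENOUGH_SPEECH = "not_enough_speech"
    CONTENT_MISMATCH = "content_mismatch"


def next_step(rejections: Sequence[Rejection], max_attempts: int = 3) -> str:
    """Decide what to do after the rejections seen so far in this session."""
    if len(rejections) < max_attempts:
        return "retry"
    mismatches = sum(1 for r in rejections if r is Rejection.CONTENT_MISMATCH)
    if mismatches * 2 > len(rejections):
        # The speaker keeps saying something other than the prompt.
        return "escalate_to_support"
    # Mostly audio-quality problems: a different location may help.
    return "ask_to_relocate"
```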

In most cases, the speech sample quality will be sufficient. Assuming this is the case, the VBG Platform will then process the speech sample. In the case of enrollment, you will get an OK response that the sample was accepted (response code = 0). The VBG Platform then combines the new speech sample with previously submitted speech samples (if available), and the process repeats until all desired samples are collected. In the case of verification, your program will receive a response from the VBG Platform indicating whether the voice sample passes or fails. You also receive a score. Your program can further query the VBG Platform to receive historical statistics on the score levels for the claimed identity. This enables you to implement your own rules engine and logic depending on the absolute and relative value of the score.
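
As an illustration of such a rules engine, the sketch below combines the platform's pass or fail result (the absolute check) with a comparison against the user's own past scores (the relative check). The score scale, the thresholds, and the "step_up" outcome are all assumptions made for this example. The right rules depend on your use case and risk tolerance.

```python
import statistics
from typing import Sequence


def decide(passed: bool, score: float, past_scores: Sequence[float],
           min_history: int = 5, max_drop_sigmas: float = 2.0) -> str:
    """Return "accept", "step_up", or "reject" for one verification.

    "step_up" means: ask for an additional authentication factor.
    All thresholds here are illustrative assumptions.
    """
    if not passed:
        return "reject"
    if len(past_scores) < min_history:
        # Not enough history for a meaningful relative comparison.
        return "accept"
    mean = statistics.fmean(past_scores)
    spread = statistics.stdev(past_scores)
    # Flag a pass that is unusually low for this particular user.
    if spread > 0 and (mean - score) / spread > max_drop_sigmas:
        return "step_up"
    return "accept"
```

For example, a user who normally scores around 0.90 and suddenly passes with 0.85 could be asked a security question, while the same 0.85 from a user with more variable scores is simply accepted.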

Design Based on Application

For verification requests, the VBG Platform responds in sub-second time. The biggest delay is usually due to Internet transport time between the VBG Cloud and your site. Enrollments are also fast, especially for active use cases, and usually complete within one second after all samples are submitted. Because of the relative speed of active enrollment and verification in the VBG Platform, the majority of authentication applications operate in near real-time.

On the other hand, identification requests take longer to respond, and response times can vary widely based on the length and content of the sample and how many voiceprints you want to search. Are you scanning a blacklist with 100 entries or 5,000? Does your speech sample contain 10 seconds of speech, or 3 minutes? The larger the sample and more voiceprints to compare against, the more computations necessary. And, the more computations that are needed, the more hardware necessary to achieve the computations in a given amount of time.

Achieving a real-time response for identification requests may be possible, provided you are willing to spend more on the hardware and system design. Please discuss your specific needs with VBG. We will help you optimize your design for real-time or offline computation as a trade-off against other parameters in your business case.

Get Started Quickly

We make it easy to try voice biometrics with a free trial of VBG Enterprise™ or by using our self-serve product, VBG Pro™. Assuming you have your own audio captured already, you can also get started quickly with our web-based API tool to become familiar with the API without having to write any code. This is part of the Visual Demo Center™ module, which also includes an online API Reference, Developer Guide, and other useful documents.

Another fast and easy way to get started is to leverage our Cloud IVR. Instead of submitting audio samples yourself, you simply ask us to make a phone call, give us the phone number and user id, and tell us whether this is an enrollment or verification. Our ready-built application will do the rest.

Contact Us

Do You Have Any Questions?

Please let us know how we can help you and we'll respond promptly!
