Vapi Voice AI Review: A Comprehensive Analysis and Key Takeaways

By Kaloyan Yankulov

AI conversations are everywhere these days. The technology has a lot of potential, but it’s not perfect. Voicebots can lag, struggle with interruptions, or even go completely off track in the middle of a conversation.

Vapi (also called Vapi AI) is trying to change that. It gives users more control over how voice assistants function to help smooth out some of the biggest pain points. You don’t even need to be a developer to use it. Its intuitive features make it easy for anyone to create and manage voice assistants without much technical expertise.

From my experience, Vapi improves conversations in many ways. But it also comes with its own set of challenges. In this Vapi review, I’ll walk you through my hands-on testing - covering the setup, key features, and overall performance - so you can decide if Vapi is the right fit for you.

What Is Vapi?

Vapi is a developer-centric voice AI platform designed to help you create, test, and deploy sophisticated voice assistants. It boasts a comprehensive suite of tools, including a robust voice API (it's where the name Vapi comes from), an intuitive dashboard, and a ton of customizations. With Vapi, you can rapidly launch voice apps that simulate the flow of natural human conversations.

Despite being marketed primarily as a programming tool, Vapi has a well-packed, no-code user interface (UI). This allows marketers and other non-technical people to create fully functional AI voice assistants without bothering the developers on the team.

Some of the features that make Vapi stand out in the field of voice bots include:

  • Faster Response Times: Thanks to its latency optimizations, like improved processing, smart data storage (caching), and smooth audio streaming, Vapi’s assistants have the potential to be snappier than many other voicebots on the market.

  • Natural Cadence of Conversations: Your assistants can now pause when you interrupt them while also being polite enough not to cut you off when you're speaking. This is a huge improvement over competitors like Synthflow.

  • Unmatched Scalability: Vapi boasts the capability to handle more than 1 million concurrent calls, which makes it ideal for businesses of any size.

  • Support for 100+ Languages: Vapi lets you create voice agents that speak your users' languages, like English, Spanish, French, German, Hindi, Portuguese, and many others.

  • Advanced Functions for Developers: You can supercharge your voice assistant with custom tools that handle tasks like booking appointments, looking up information, and filling out forms.

  • Integration With Make: You can connect your Make scenarios via a webhook and trigger them during Vapi voice conversations.

How Does Vapi Work?

Before we create our first voice assistant in Vapi, let's see how the platform works and operates behind the scenes.

Vapi’s unique selling point lies in the way it connects and orchestrates three different types of AI models to enable efficient, human-like conversations:

  1. Listen – Transcriber Module or Speech to Text (STT). When you speak to your device, the audio gets recorded and transcribed by the model.

  2. Intelligence – AI Model or Large Language Model (LLM). The transcript text will then get fed into a prompt and run through an LLM. The LLM is the core intelligence that simulates a real person.

  3. Speak – Voice Module or Text to Speech (TTS). The text the LLM outputs (the prompt answers) is converted into audio and played back on your device.
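
To make the three stages concrete, here is a minimal Python sketch of a single conversational turn. The functions `transcribe`, `generate_reply`, and `synthesize` are hypothetical stand-ins for whichever STT, LLM, and TTS providers you configure. They are not part of Vapi's API and only show how data flows from one stage to the next.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    transcript: str   # what the Listen (STT) stage heard
    reply_text: str   # what the Intelligence (LLM) stage answered
    audio: bytes      # what the Speak (TTS) stage will play back


def transcribe(audio_in: bytes) -> str:
    """Listen: stand-in STT that pretends the audio is UTF-8 text."""
    return audio_in.decode("utf-8")


def generate_reply(transcript: str) -> str:
    """Intelligence: stand-in LLM that returns a canned answer."""
    return f"You said: {transcript}"


def synthesize(text: str) -> bytes:
    """Speak: stand-in TTS that encodes the text instead of producing audio."""
    return text.encode("utf-8")


def run_turn(audio_in: bytes) -> Turn:
    """Run one turn through the three stages, in order."""
    transcript = transcribe(audio_in)
    reply_text = generate_reply(transcript)
    return Turn(transcript, reply_text, synthesize(reply_text))


turn = run_turn(b"I'd like some dumplings")
print(turn.reply_text)
```

Each stage only depends on the output of the previous one, which is why a platform can swap providers for one stage without touching the others.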

This setup is not unique to Vapi. It’s a common infrastructure for all AI voice platforms. However, two things distinguish Vapi: its extensive support for AI models and its orchestration models.

Unlimited Support for AI Models

Unlike other platforms that only support a handful of models, Vapi lets you switch between and combine models and tools from providers like OpenAI (ChatGPT), Anthropic (Claude), ElevenLabs, and more. It supports a very wide range of providers, including the option to bring your own custom models. This ensures you can use what best suits your needs or project.

The best part is that you don’t need to set up and connect external accounts, as Vapi natively supports most models. For instance, if you want to switch from OpenAI (the default LLM provider) to Claude, you're free to connect your own Claude API keys, but you're not forced to. When you change the model inside the Vapi interface, the platform switches to Claude internally.

Orchestration Models

Orchestration models are AI add-ons that run on top of the core functionality and make conversations more lifelike and engaging. The Vapi platform is unique in its ability to improve and fine-tune the standard AI models with these add-ons.

The models are:

  • Endpointing: Endpointing means detecting when you have finished speaking to your voice assistant. Instead of using a timeout (the standard method), Vapi uses a custom fusion audio-text model to detect when you’ve stopped speaking. This helps reduce latency and make conversations more natural.
  • Interruptions: The ability to recognize when you're trying to interject, allowing the assistant to pause and listen. Many AI voice assistants lack this feature, which can make the conversation feel unnatural.
  • Background Noise and Voice Filtering: Vapi enhances call clarity by filtering out background noise and focusing on the speaker’s voice. This helps ensure accurate transcription and a smoother conversation, even in noisy environments.
  • Backchanneling: A more sophisticated way for the assistant to understand when verbal fillers (or backchannel responses) like “Ah,” “Yeah,” “Uh-oh,” etc., are intended to prompt an action from the assistant or are just there as fillers. You can even write a prompt so your bot can inject (use) such fillers to make its speech sound more natural.
  • Emotion Detection: Vapi can analyze the speaker’s tone to detect emotions like happiness, frustration, or urgency. This allows the assistant to respond in a more empathetic and context-aware manner.
vapi orchestration models include endpointing, interruptions, background noise and voice filtering, backchanneling, emotion detection. Image shows how these models are connected, what they do, and the models/providers associated with them
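
For contrast, the standard timeout method mentioned under Endpointing can be sketched in a few lines of Python. This is the naive baseline, not Vapi's fusion model. The per-frame energy values, the silence threshold, and the 800 ms timeout are all illustrative assumptions.

```python
from typing import Optional, Sequence


def find_endpoint(
    frame_energies: Sequence[float],
    frame_ms: int = 20,
    silence_threshold: float = 0.1,
    timeout_ms: int = 800,
) -> Optional[int]:
    """Return the time (in ms) at which the speaker is declared done, or None.

    The speaker counts as done once `timeout_ms` of consecutive frames fall
    below `silence_threshold`, and only after some speech has been heard.
    """
    silent_ms = 0
    heard_speech = False
    for index, energy in enumerate(frame_energies):
        if energy >= silence_threshold:
            heard_speech = True
            silent_ms = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_ms += frame_ms
            if silent_ms >= timeout_ms:
                return (index + 1) * frame_ms
    return None


# One second of speech followed by one second of silence, in 20 ms frames.
frames = [0.5] * 50 + [0.0] * 50
endpoint_ms = find_endpoint(frames)
print(endpoint_ms)
```

With this approach, the assistant always waits out the full timeout after the caller stops talking. That fixed wait is the delay a smarter endpointing model tries to shorten.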

The primary purpose of these models, apart from making your assistants more believable, is to reduce voice-to-voice latency. “Voice-to-voice” is the time between a user finishing their speech and the first chunk of the AI assistant’s speech being played back on the user’s device. Vapi’s goal is for voice-to-voice latency to clock in at 500-700 ms or less.
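
One way to reason about that target is as a budget shared between the stages of a turn. The sketch below adds up per-stage delays and compares the total with the upper end of the 500-700 ms goal. The stage names and millisecond figures are hypothetical values chosen for illustration, not measurements from Vapi.

```python
TARGET_MS = 700  # upper end of the 500-700 ms goal mentioned in this review


def voice_to_voice_ms(stage_latencies: dict) -> int:
    """Total delay from end of user speech to the first audio played back."""
    return sum(stage_latencies.values())


# Hypothetical per-stage figures, for illustration only.
stages = {
    "endpointing": 150,
    "transcription": 100,
    "llm_first_token": 250,
    "tts_first_chunk": 120,
    "network": 60,
}

total = voice_to_voice_ms(stages)
print(total, "ms,", "within target" if total <= TARGET_MS else "over target")
```

Seen this way, a slow model in any single stage can use up the whole budget on its own.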

I haven’t tracked my assistants’ response times, but the difference is striking compared to other apps like Synthflow. Vapi’s assistants responded noticeably faster and felt snappier than similar solutions. However, there's a catch, which we’ll explore in the assistant testing section.

With that out of the way, let’s create our first Vapi assistant and see how it performs.

Getting Started With Vapi

Vapi’s core product is its assistants, which are automated voice assistants (or bots) that can make or take calls. For this review, we'll focus on the user interface instead of using software development kits (SDKs).

1. Creating Your First Vapi AI Assistant

In Vapi, you only pay for the minutes you use. So, similar to Synthflow, you can create an unlimited number of assistants. However, if you want to use the AI assistant through a phone, you have to purchase a separate phone number for each assistant.

Your first step is to select whether you want to create an assistant from scratch or choose one of the existing templates. There are only four template options available:

  • Appointment Setter: An inbound assistant for dental practices to handle scheduling, answer questions, and provide service details.
  • Customer Support: A balanced template combining empathy and technical expertise for efficient support.
  • Inbound Q/A: Designed for an interior design agency to offer detailed product support and troubleshooting.
  • Game NPC (non-player character): An in-game assistant, Elenya, that provides guidance, lore, and insights into the game world.
Creating Vapi Assistant Template

If you're new to chat prompting, creating your first voice assistant in Vapi can be quite challenging. To make things worse, all templates are for inbound assistants, so there’s nothing to grab onto if you want to build an outbound (cold outreach) assistant.

2. Configuring Your Assistant

I started with Mary, the appointment setter assistant. The setup screen is loosely organized into three main areas, from top to bottom: a breakdown of cost and latency, tabs to switch between the assistant's main configurations (Model, Transcriber, Voice, etc.), and the actual setup area with forms to enter prompts, change providers, etc.

Configuring Vapi Assistant Template

Costs and Latency Breakdown

Vapi strongly emphasizes delivering fast, responsive assistants while maintaining clear and transparent pricing. This approach is obvious in the first area of the assistant setup - we can see a breakdown of cost and latency that changes as we experiment with different models. Keep in mind these are only estimates.

The models you select for your assistant can make a huge difference in costs, as well as in latency. In the first example below, I selected OpenAI’s GPT-4o real-time preview model, which resulted in a cost of $0.22 per minute with a latency of 700 ms - Vapi's recommended level.

However, switching to the o1 preview model caused the latency to spike dramatically to 8000 ms (8 seconds), making it far too slow for a natural conversation. Apart from the models, the “mode” - web or over the phone (Twilio or Vonage) - can also impact the latency and costs. Keeping track of these factors is crucial to maintaining an optimal balance between price and performance when designing your assistants.

Vapi cost breakdown for GPT 4o shows a cost of $0.22/min and a latency of 700 ms
Vapi cost breakdown for GPT o1 preview model shows a cost of $0.19/minute and 8000 ms latency
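
The trade-off between price and latency can be framed as a simple selection problem: pick the cheapest option that still fits your latency budget. The sketch below uses the two estimates quoted in this section. The 900 ms budget and the `cheapest_within_budget` helper are my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    model: str
    cost_per_min: float  # USD per minute
    latency_ms: int


# Figures quoted in this review; the dashboard presents them as estimates.
ESTIMATES = [
    Estimate("GPT-4o realtime preview", 0.22, 700),
    Estimate("o1 preview", 0.19, 8000),
]


def cheapest_within_budget(estimates, max_latency_ms):
    """Return the cheapest estimate at or under the latency budget, or None."""
    candidates = [e for e in estimates if e.latency_ms <= max_latency_ms]
    return min(candidates, key=lambda e: e.cost_per_min, default=None)


choice = cheapest_within_budget(ESTIMATES, max_latency_ms=900)
print(choice.model if choice else "no model fits the budget")
```

The o1 preview model is slightly cheaper per minute, but it never makes the shortlist because its latency rules it out first.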

Model, Transcriber, Voice, and Others

Next, you get to select which aspect of your assistant you want to configure. The first three options - Model, Transcriber, and Voice - are essential for setting up your assistant. The remaining three - Functions, Advanced, and Analysis - offer more advanced capabilities you may not need for your first assistant.

Configuring Vapi voice assistant Model, Transcriber, and Voice

Setting Mary aside, I’ll start from scratch and build Jade, my very own inbound assistant, to handle order-taking for our Chinese restaurant, The Golden Wok.

The steps we need to take:

  1. Create the Assistant: We'll build an assistant and provide it with instructions on how to handle calls for our restaurant.
  2. Get a Phone Number: We can either use an existing number or purchase one directly through Vapi.
  3. Link the Assistant: We'll assign the assistant to the phone number so it can start answering calls.
  4. Make a Call and Test the Assistant: Finally, we'll dial the number and interact with our assistant.

3. Choosing Your Assistant’s Model (LLM)

In the first chapter of our review, we talked about the three core modules of the infrastructure - Listen, Intelligence, and Speak. These are the key components you’ll configure in your assistant’s first three tabs:

  • Model (Intelligence) - The AI model (LLM) that processes and generates responses.
  • Transcriber (Listen) - Converts spoken language into text.
  • Voice (Speak) - Transforms text responses into natural-sounding speech.

We’ll start with the first one, the Model, where you get to select the LLM and write your prompt.

The default model is OpenAI’s GPT-3.5 Turbo, which is a great starting point because it is fast and provides a relatively good interaction experience. However, you can natively select between 35+ models from 16 different providers.

While this plethora of options is fantastic for people looking for next-level customization and flexibility, it can also be intimidating and create decision paralysis for those not familiar with AI models.

In that regard, the platform is better suited for developers than novices. I wish Vapi had highlighted recommended models or even provided an in-app wizard that suggests the best model based on your needs. That said, Vapi highlights the fastest and cheapest model for each provider, which, as of this review, is the GPT 4o Mini Cluster for OpenAI.

List of LLM models and providers available on Vapi. Breakdown includes latency and pricing for each model.

*OpenAI o1 models are still in beta and not recommended for production use. System Prompts and Tool Calls are currently not supported by o1 models, and latency is significantly higher than in traditional models.

Advanced Options for Configuring Your AI Assistant

Vapi has many levels of customization, so let's get into the weeds a bit and look at some of the more advanced configuration options and how they work.

Vapi Advanced Configuration includes configuring Knowledge Base, Temperature, Max Tokens, and Detect Emotion

Knowledge Base

The model configuration allows you to plug in custom documents with information on specific topics in order to provide more accurate and informative responses to user queries. For example, we can import our Chinese restaurant menu, hours, and other relevant information.

Temperature

The temperature controls the randomness of the assistant’s answers. When you set it higher, you'll get more random outputs. When you set it lower, towards 0, the outputs are more predictable.

While testing this setting, I initially expected that setting the value to zero would keep the assistant strictly on-topic, preventing it from going off-track. However, that wasn’t the case. When I asked Jade to tell me a joke, she always responded politely and humorously, but the content of the jokes varied based on the temperature setting.

  • At a value of zero, the jokes were highly relevant to the restaurant theme. For example: "Why did the dumpling go to school? Because it wanted to be a wanton!"
  • At a value of 1.5, the jokes became more generalized, covering broader topics related to cooks and food rather than just Chinese cuisine.

However, when I pushed the setting to 2 (the highest value), the assistant completely broke down and responded with nonsensical gibberish.

Fortunately, with some prompting, I was able to get Jade to stay on track. This does show the value in thoroughly testing your assistant, though!
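
For readers curious why temperature has this effect, the sketch below shows how temperature is commonly applied when a language model samples its next word: scores are divided by the temperature before being turned into probabilities. The three scores are hypothetical, and I'm assuming the providers behind Vapi follow this common scheme. This review did not verify it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.1]
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature, almost all of the probability goes to the top-scoring word. At a high temperature, the distribution flattens and unlikely words get picked far more often, which fits the gibberish I saw at the maximum setting.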

Max Tokens

This sets the maximum number of tokens the assistant can generate per turn in a conversation. It directly affects model costs, so setting a limit helps keep responses concise and cost-efficient. To manage expenses, it’s best to keep this value at 250 or lower.

Detect Emotion

Enable this feature to detect users' emotions - such as anger, joy, and frustration - and use them as additional context for the model. In my experience, I didn’t notice a major difference with this setting turned on or off. The choice of AI model had a much bigger impact on how well Jade handled conversations. For example, more advanced models like GPT-4o responded to my complaints about being starving with greater empathy, while GPT-3.5 felt noticeably less attentive.

4. Writing Your AI Prompt

Your assistant's prompt serves as a guide, outlining the rules and instructions it will follow during conversations. If you've used ChatGPT before, the process will feel familiar. You set the prompt for your Vapi assistant under the “Model” tab.

Creating a custom assistant in Vapi. Image shows AI prompt for the Golden Wok, a Chinese restaurant

A tiny but important thing I appreciate about Vapi is the field where you can enter the “First Message.” This is especially critical with outbound calls and something I struggled to get right with the Synthflow bots.

"Hello, this is Jade from The Golden Wok. Can I take your order?"

Things you should include in your prompt:

  • Assistant basics and introduction
  • Business information
  • Client information (if relevant)
  • Role and primary goals of the assistant
  • Conversation instructions and script
  • Voice and tone
  • Additional instructions and limitations

Let’s break down each one in my prompts:

Introduction and Business Information

You are a voice assistant for The Golden Wok, a Chinese restaurant located at 456 Dragon Street, San Francisco, California. The restaurant operates Monday through Saturday from 11 a.m. to 10 p.m. and is closed on Sundays. The Golden Wok offers a variety of delicious Chinese dishes to the local community, including popular items like dumplings, fried rice, kung pao chicken, and chow mein.

Goal and Primary Instructions

The main purpose of the assistant:

Your primary role is to take customer orders, answer basic questions about the menu, and provide information about restaurant hours and services. If a caller wants to place an order, your goal is to gather all necessary details in a friendly, efficient, and engaging way.

Followed by the call script:

Here’s how you should handle it:

  1. Take their order: Ask what they'd like to order and confirm any specific preferences (e.g., spicy level, add-ons, etc.).
  2. Gather delivery or pickup details: Ask if they’d like delivery or pickup, and if it’s a delivery, collect their address.
  3. Confirm contact details: Politely ask for their name and phone number to ensure the order is correct.
  4. Review and confirm: Read back the order, delivery/pickup details, and provide the estimated wait time.

Tone and Style

Make your assistant sound on-brand with specific conversational style requests:

  • “Be casual, fun, and a little witty - think friendly diner vibes, not a formal call center.
  • Keep responses short and conversational, using phrases like 'Umm…', 'Gotcha!', 'Sounds delicious!', and 'Alright, let's make it happen!'
  • Don't talk too much - make it feel like a natural chat, not a monologue.
  • If they ask about menu items, highlight popular dishes or specials enthusiastically, e.g., 'Ooh, the kung pao chicken is a fan favorite!'
  • If they're unsure what to order, suggest popular combos or ask about their food preferences.
  • If you don't know an answer, keep it light - 'Hmm, good question! Let me check on that for you.'"

Additional Considerations

  • If they ask about allergens, let them know that dishes may contain soy, gluten, and nuts and that they should check with the restaurant for specifics.
  • If they request something that’s not on the menu, gently guide them to similar available options.
  • End each call with a cheerful closing - 'Thanks for calling The Golden Wok! Your order will be ready soon. Enjoy your meal!'
  • With your friendly and engaging personality, you’ll make ordering from The Golden Wok a fun and seamless experience!

Additionally, I added the following restrictions:

Stay on track and avoid any off-topic conversations by all means.

This small adjustment made a huge difference in keeping my assistant focused and avoiding off-topic conversations.

For example, when I asked Jade to tell me a joke, she politely steered the conversation back to its main purpose: ordering delicious Chinese food. I was thrilled with this result, especially since I hadn’t been able to achieve the same level of focus with the assistants I built using Synthflow AI. I highly recommend keeping a similar restriction in your prompt to minimize inefficient calls and unnecessary costs.
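
If you maintain prompts for several assistants, it can help to keep each part as a separate section and assemble them in code. Below is a minimal sketch using shortened excerpts from my Golden Wok prompt. The section titles and the `build_prompt` helper are my own convention and not something Vapi requires.

```python
SECTIONS = {
    "Introduction and business information": (
        "You are a voice assistant for The Golden Wok, a Chinese restaurant "
        "located at 456 Dragon Street, San Francisco, California."
    ),
    "Goal": (
        "Your primary role is to take customer orders, answer basic questions "
        "about the menu, and provide information about restaurant hours and services."
    ),
    "Tone and style": "Be casual, fun, and a little witty.",
    "Restrictions": "Stay on track and avoid any off-topic conversations by all means.",
}


def build_prompt(sections: dict) -> str:
    """Join non-empty sections into one prompt, each under its own heading."""
    parts = [
        f"{title}:\n{body.strip()}"
        for title, body in sections.items()
        if body.strip()
    ]
    return "\n\n".join(parts)


system_prompt = build_prompt(SECTIONS)
print(system_prompt)
```

Keeping the restriction in its own section makes it easy to reuse across assistants and hard to forget.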

Pro tip: To create your own prompt, you can take my example (or any template from Vapi), input it into ChatGPT, and ask it to generate a tailored system prompt based on your industry and specific use case.

I know - a prompt to create a prompt, inspired by another prompt. So meta!

5. Setting up the Transcriber

In Vapi, the transcription module is responsible for converting spoken language into text, enabling the voice assistant to process and understand user inputs effectively, as well as transcribe your calls.

Setting up the transcriber in Vapi AI, which can support 100+ languages

The provided models support transcription in more than 100 languages.

6. Giving Your Assistant a Voice

The Voice module is the third crucial component of the Vapi infrastructure, and it’s responsible for converting the AI assistant's text-based responses (which come from the LLM) into spoken audio. It acts as the text-to-speech (TTS) engine, allowing the assistant to communicate naturally with users through voice.

Vapi offers a wide range of voices with different accents and tones to make conversations feel more natural. It works with top TTS providers like ElevenLabs and Deepgram, giving you plenty of options to find the right voice for your brand.

Latency and pricing vary by model, so testing a few will help you find the best balance between cost and quality. Want to hear the voices? You can try one on Vapi’s homepage.

Vapi AI assistant voice configuration screen shows options for selecting the provider and voice

I was genuinely impressed by the variety of voices available. Whether you need a laid-back New Yorker or an aristocratic noble princess, there’s a voice for every need. Vapi also provides an amazing Voice Library section that you can open in a separate tab to preview the voices and even search for a specific gender and accent.

Vapi Voice Library shows a range of voices, languages, and accents

One thing that isn't immediately clear is whether a voice supports the same language as the prompt and transcriber. I experimented by changing my prompt to a couple of different languages, and the voices I tested handled them well. That said, the only reliable way to confirm compatibility is through trial and error.

7. Functions

“Functions” or “Tools” (Vapi seems to use both interchangeably) enable your assistants to perform custom actions and tasks during the call. You can add these Tools from the Tools Library (a separate page on the platform).

Setting up custom predefined functions for a Vapi AI assistant

There are a few types of Tools:

Predefined Tools

Currently, there are three available:

  • Enable End Call Function: Allows the assistant to end the call on its own. (Best for GPT-4 and larger models.)
  • Dial Keypad: The assistant can enter digits on the keypad.
  • Forwarding Phone Number: This number is used to transfer calls from the assistant. (Only applicable to phone calls, not web calls.) The forwarding number can be any number - it doesn’t have to be a Vapi-registered number. It’s also recommended to include a line in your prompt, such as: If needed, forward any calls to [your phone number].

Custom Tools

This is a developer feature that lets you build your own custom actions through an API. For instance, you can collect user information during the call and send it to a server.
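
To give a feel for the server side of such a tool, here is a minimal sketch of an endpoint that receives order details as JSON and stores them. The payload fields (`name`, `phone`, `items`), the port, and the response shape are assumptions made up for this example. Vapi's actual tool request format is not covered in this review, so check the official documentation before building on this.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

ORDERS = []  # in-memory store, for illustration only


def handle_order(payload: dict) -> dict:
    """Validate a hypothetical order payload and record it."""
    missing = [key for key in ("name", "phone", "items") if not payload.get(key)]
    if missing:
        return {"ok": False, "error": f"missing fields: {', '.join(missing)}"}
    ORDERS.append(payload)
    return {"ok": True, "order_number": len(ORDERS)}


class OrderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON body; fall back to an empty payload on errors.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = {}
        result = handle_order(payload if isinstance(payload, dict) else {})
        body = json.dumps(result).encode("utf-8")
        self.send_response(200 if result["ok"] else 400)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start the server. Call this manually; it blocks until interrupted."""
    HTTPServer(("", port), OrderHandler).serve_forever()
```

The validation logic lives in `handle_order`, separate from the HTTP handler, so it can be tested without starting a server.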

Integrations

You can connect your Make or GoHighLevel accounts through a webhook URL.

Keep in mind the integrations currently rely on webhooks rather than being fully native. While this isn’t necessarily a drawback, it’s worth noting the setup might require slightly more technical expertise.

8. Advanced Settings

As the name suggests, this tab allows you to set up various advanced configurations like privacy settings, conversation fine-tuning, and messages that the assistant can send.

Vapi AI advanced settings show options to set privacy, fine tune conversation and messages assistant can send

Overall, I was pleasantly surprised by the range of available features. It's clear that Vapi truly excels in offering robust customization options for your assistant compared to other alternatives.

Privacy

This panel allows you to disable the recording of calls and videos. This is especially important for EU-based customers.

Pro tip: EU users, remember that if you plan to record your calls, you must include a notice in your opening message to inform the customer.

Vapi privacy settings panel shows option to enable HIPAA compliance, enable or disable audio and video recording

Start and Stop Speaking Instructions

These panels allow you to fine-tune your assistant's waiting times and interruptions during interactions.

Based on my testing, the Smart Endpointing feature improved the natural flow of the conversation. It reduced the awkward interruptions from the assistant, so I recommend that you keep it on. Of course, you ultimately have to test your voicebot before going live. We go into more detail on that in the last section.

Vapi voice speaking instructions screen shows settings for how and when the assistant should start and stop speaking

Call Timeout Settings

Here, you can set parameters for when the assistant should end a call, whether due to client silence or reaching a maximum call duration. This is crucial for keeping costs under control.

Vapi call timeout settings options show settings for silence timeout and maximum call duration
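
The logic behind these two settings, and why they matter for costs, can be sketched as follows. The 30-second silence timeout and the 10-minute maximum duration are example values I picked, not Vapi defaults. The $0.22 per minute rate is the GPT-4o estimate quoted earlier in this review.

```python
def should_end_call(elapsed_s, silent_s, silence_timeout_s=30, max_duration_s=600):
    """Return the reason to hang up, or None to keep the call going."""
    if elapsed_s >= max_duration_s:
        return "max_duration"
    if silent_s >= silence_timeout_s:
        return "silence_timeout"
    return None


def worst_case_cost(max_duration_s, cost_per_min):
    """Upper bound on what a single call can cost, in USD."""
    return max_duration_s / 60 * cost_per_min


print(should_end_call(elapsed_s=120, silent_s=35))
print(round(worst_case_cost(600, 0.22), 2))
```

A maximum call duration effectively puts a price ceiling on every call, which is useful when an assistant gets stuck talking to a voicemail system or a caller who has walked away.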

Messages

Lastly, we have settings for messages your assistant can send, including Voicemail, End Call Messages, and Idle Messages (e.g. “Are you still there?”). You also have settings for sending messages programmatically to your server (for developers only).

Vapi messages settings screen shows settings for sending voicemail, end call messages and what to say if the call is idle

9. Analysis

The last section of the assistant configurator allows you to specify prompts and settings for the analysis of the call, including call summary prompt, success criteria, and structured data extraction.

Vapi AI call analysis configuration allows you to set up a prompt for the AI to evaluate the client's behavior during the call

The Success Evaluation and Structured Data Extraction will be especially important for sales calls and lead qualification. Together, they can be used to score leads.

An example Success Evaluation system prompt could look like this:

"Evaluate the client’s behavior during the call based on:

  1. Engagement: Did they participate actively and show interest?
  2. Clarity: Did they clearly communicate their needs or goals?
  3. Receptiveness: Were they open to suggestions and solutions?
  4. Objection Handling: Were they cooperative when addressing concerns?
  5. Decision-Making: Did they show readiness to take the desired action?
  6. Provide a brief breakdown of strengths, weaknesses, and suggestions for improving client engagement if needed."

You can even specify the evaluation rubric for the prompt - the framework that sets out criteria for evaluation:

Vapi success evaluation rubric allows you to set out the criteria for evaluating a call and scoring leads
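
Once the evaluation returns structured ratings, turning them into a lead score is straightforward. The sketch below assumes each of the five criteria from the example prompt is rated on a 1-5 scale and that leads scoring 0.7 or higher count as qualified. The scale, the field names, and the threshold are all hypothetical choices.

```python
CRITERIA = (
    "engagement",
    "clarity",
    "receptiveness",
    "objection_handling",
    "decision_making",
)


def score_lead(ratings: dict, qualify_at: float = 0.7) -> dict:
    """Average 1-5 ratings into a 0-1 score and flag qualified leads."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    for criterion in CRITERIA:
        if not 1 <= ratings[criterion] <= 5:
            raise ValueError(f"{criterion} must be between 1 and 5")
    # Map each rating from 1-5 onto 0-4, then normalize to the 0-1 range.
    score = sum(ratings[c] - 1 for c in CRITERIA) / (4 * len(CRITERIA))
    return {"score": round(score, 2), "qualified": score >= qualify_at}


result = score_lead({
    "engagement": 5,
    "clarity": 4,
    "receptiveness": 4,
    "objection_handling": 3,
    "decision_making": 5,
})
print(result)
```

In practice, you would tune the threshold against calls you have reviewed by hand rather than trusting the first number you pick.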

10. Choosing a Phone Number

Phone numbers are required in Vapi to make or receive calls over the phone.

You can purchase U.S. and Canadian phone numbers directly from Vapi for $2 per month per phone number or import your numbers from Twilio or Vonage by entering your Twilio/Vonage String Identifier (SID).

The feature to purchase numbers natively is quite limited at the moment. You can only buy U.S. and Canadian numbers, and you must manually enter the local area code to find a number. Also, you can’t make outbound calls to any other country with a native phone number. In other words, you must use the import feature if you operate outside these two countries or make calls to international numbers. This is quite restrictive and a significant step down compared to the Synthflow interface for purchasing numbers.

When you buy the number, you have two options:

  • Inbound Settings: You can attach your number to an inbound assistant. When people call that number, your AI assistant will answer the calls.
  • Outbound Settings: You can have your assistant call a specific outbound number. Unfortunately, the platform doesn’t offer a batch campaign feature (like Synthflow does), which makes outbound calls via the interface rather impractical. That said, you can still automate this process using the API.
Vapi phone numbers screen shows inbound and outbound settings and numbers

11. Testing and Publishing Your Assistant

Once you set up your assistant, you're finally ready to do some final tests and publish it live. Vapi gives you $10 of free credits for testing. You can track their usage on your billing page.

You can either call the assistant through the web browser or use the phone number option to make inbound or outbound calls on the phone.

With my custom prompt and GPT 4o Mini as the underlying model, Jade performed exceptionally well, maintaining a coherent and smooth conversation. The welcome message functioned perfectly (something I had trouble achieving with Synthflow), and the restrictions on off-topic conversations worked better than I hoped.

That said, I noticed that reducing latency below 750 ms made the assistant feel unnatural. As the saying goes, "Too much of a good thing can be bad," and this also applies to your assistant's speed. The assistant responded too quickly, frequently interrupting and overlapping my speech. This could be especially problematic when serving slower-paced audiences, such as the elderly or non-native speakers. The sweet spot for me was anywhere between 750 ms and 900 ms, which, fortunately, was very easy to fine-tune with the number of options and models available.

Phone number testing worked flawlessly, but I was disappointed to find out there’s no web embed option similar to Synthflow. If you want to roll out your assistant over the web, you can currently achieve this only programmatically.

Another downside of testing is the absence of a text chat feature. In Synthflow, you can interact with your assistants via text, simulating a real call without using the phone/web calls, which helps save credits during testing.

Advanced Features of Vapi

Vapi has some more advanced features we haven’t covered yet. Let’s take a look.

Create Multistep Processes with Blocks

The Blocks feature in Vapi is an advanced visual workflow builder that offers powerful customization and automation capabilities for your voice assistants. With Blocks, you can design and connect a series of steps, combining conversational steps and external tools to create a seamless customer experience. This can be used for multi-step conversations, forwarding, error handling, visual logic, and programmatic interactions with your server and database.

For our Chinese restaurant, a potential workflow might look like this:

  1. Greet the customer and ask for their order ID.
  2. Use an API block to query your database for the order details.
  3. Provide the customer with their order status.
  4. Offer them the option to speak to a representative if further help is needed.
Vapi AI multistep blocks feature allows you to configure step by step process
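
The four steps of this workflow translate naturally into a short linear script. The sketch below uses an in-memory dictionary in place of a real database and returns the lines the assistant would say. The order IDs, statuses, and function names are invented for illustration and have nothing to do with how Blocks are defined inside Vapi.

```python
# Stand-in for the restaurant's order database.
ORDER_DB = {"1001": "being prepared", "1002": "out for delivery"}


def lookup_order(order_id: str):
    """Stand-in for the API step that queries the database."""
    return ORDER_DB.get(order_id)


def order_status_flow(order_id: str) -> list:
    """Return the lines the assistant would say, in order."""
    order_id = order_id.strip()
    lines = ["Hello, this is Jade from The Golden Wok. What's your order ID?"]
    status = lookup_order(order_id)
    if status is None:
        lines.append("Hmm, I couldn't find that order.")
    else:
        lines.append(f"Order {order_id} is {status}.")
    lines.append("Would you like to speak to a representative?")
    return lines


for line in order_status_flow("1001"):
    print(line)
```

The branch for a missing order is the kind of error handling that is easy to overlook when a workflow is designed only around the happy path.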

Run a Well-Oiled Machine of Assistants with Squads

Vapi's Squads feature allows seamless collaboration between multiple assistants to create a more dynamic and efficient call-handling system. This functionality enables call forwarding between assistants when one is unavailable and also helps simulate a comprehensive multi-step process, such as lead research, qualification, and closing deals. The best part is you can call the Squad and test the whole assistant team.

Vapi Squads shine in scenarios where multiple assistants are required to handle different stages of a process. For instance:

  1. Lead Research: The first assistant gathers key information about a prospect, such as their business, needs, and contact details.
  2. Lead Qualification: A second assistant evaluates the lead’s suitability by asking targeted questions and determining if they fit your product or service criteria.
  3. Lead Data Recording: A Tool records the prospect data to your server and CRM.
  4. Closing the Deal: The third assistant handles the final stage, addressing specific objections, explaining pricing, or even processing an order through a Tool.
Vapi AI Blocks Feature

Enrich Assistant Knowledge with Files

One of Vapi’s standout features is its ability to import files as a "Knowledge Base." This significantly enhances the assistant’s ability to provide accurate and detailed responses. Just upload relevant documents directly and they become instantly referenceable.

For our Chinese restaurant, The Golden Wok, I scraped a menu from my favorite local Asian restaurant’s website into a text file. Then, I imported it into Vapi as the assistant's Knowledge Base. The process was quick and straightforward:

First, I uploaded the menu file in the Files section:

Vapi Files screen lets you upload data to the knowledge base, such as a menu for our Chinese restaurant

Then, I selected it as a Knowledge Base under the assistant’s settings:

Files uploaded to Vapi can be selected under an assistant's knowledge base. Here, a menu for the Golden Wok is being added to our assistant, Jade's, Knowledge Base

Within moments, the assistant processed the information and made it accessible.

When I tested it, I asked, "What’s on the menu?" Jade responded accurately and effortlessly, listing all the items as they appeared in the file. She even recommended specific dishes based on my taste and listed the ingredients of particular items.

Vapi Pricing

Vapi does its best to help you understand the per-minute cost of voice calls. It provides a comprehensive breakdown of each assistant's cost structure, including the Vapi margin. Most importantly, Vapi's overall price is lower than that of comparable alternatives.

Vapi Pricing Example

The cost per minute depends on four moving components:

  • AI Models: More advanced models like GPT-4 are more expensive than lighter options. Costs range from less than $0.01 to $0.32 per minute.
  • Voice Providers: Costs vary between text-to-speech providers, such as ElevenLabs, ranging from $0.001 to $0.65 per minute.
  • Listen Module: Costs for a speech-to-text provider like Deepgram range from $0.008 to $0.017 per minute.
  • Vapi: A fixed platform fee of $0.05 per minute.

As you can see, overall costs vary a lot. You can expect a total cost per call minute of roughly $0.07 at the low end and about $1.04 at the high end. Keep in mind that these are estimates, not the exact price you’ll pay. The good news is that you still get high-quality calls even with the cheap models.
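
The total range is just the sum of the cheapest and the most expensive option in each of the four components. Here is a small Python sketch of that arithmetic, using the figures listed above. The dictionary keys and function name are my own, and "less than $0.01" for the cheapest model is approximated as $0.01.

```python
# Per-minute price ranges (low, high) in USD, taken from the component list.
# Assumption: "less than $0.01" for the cheapest AI model is treated as 0.01.
COMPONENT_RANGES = {
    "ai_model": (0.01, 0.32),
    "voice_provider": (0.001, 0.65),
    "listen_module": (0.008, 0.017),
    "vapi_fee": (0.05, 0.05),  # fixed, so low and high are the same
}


def cost_per_minute_range(components=COMPONENT_RANGES):
    """Sum the low ends and the high ends to get the total per-minute range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return round(low, 3), round(high, 3)


if __name__ == "__main__":
    low, high = cost_per_minute_range()
    print(f"Estimated cost per minute: ${low} to ${high}")
```

Swapping in the actual rates of the model, voice, and transcriber you picked gives a quick estimate for your own assistant.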

On your dashboard, you can keep track of your actual spending and average cost per call. Note that this is the cost per call, not per minute, but you can calculate the per-minute cost by dividing the total spent by the total call minutes. You can also test your assistant to get a sense of what your cost per call will be.
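
Cost per minute is the total spend divided by the total call minutes, and cost per call is the total spend divided by the number of calls. Here is a short Python sketch of both; the function names and the sample totals are made up for illustration and are not real dashboard data.

```python
def cost_per_minute(total_spent: float, total_minutes: float) -> float:
    """Average cost per call minute: total spend divided by total minutes."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be greater than zero")
    return total_spent / total_minutes


def cost_per_call(total_spent: float, total_calls: int) -> float:
    """Average cost per call: total spend divided by the number of calls."""
    if total_calls <= 0:
        raise ValueError("total_calls must be greater than zero")
    return total_spent / total_calls


if __name__ == "__main__":
    # Hypothetical totals: $12.50 spent over 100 minutes across 40 calls.
    print(cost_per_minute(12.50, 100))
    print(cost_per_call(12.50, 40))
```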

Vapi AI Dashboard shows call minutes, number of calls, pricing, and other statistics

Final Verdict

Vapi is a fantastic tool for those seeking deep customization, offering smooth, low-latency conversations and impressive flexibility. While it might be intimidating for beginners unfamiliar with AI models, its powerful features make it a standout.

However, it does lack some UI options, like chat texting, and the phone number purchasing interface could be more intuitive. Additionally, embeddable options are missing. That said, its affordable pricing and scalability make it a great choice for businesses looking to scale their call operations efficiently, especially if you're not afraid to experiment with AI models.

Pros

  • Customizable models
  • Maximum flexibility when customizing and finetuning conversation
  • Transparent price breakdown
  • Overall lower price compared to alternatives
  • Low-latency voice bots
  • Smooth, natural conversations
  • Prompt instructions work perfectly
  • Robust voice library
  • Robust API and overall the most complete platform from a developer standpoint

Cons

  • No batch campaigns for mass outbound campaigns
  • No chat texting
  • Lackluster predefined templates
  • No embeddable widget for web deployment of assistant
  • Chatbots with extremely low latency (<700ms) can be difficult to interact with
  • You can only purchase U.S. and Canadian phone numbers natively, although Twilio and Vonage numbers can be imported
  • Model and finetuning options might be overwhelming for beginners

Vapi Alternatives

Synthflow AI

Synthflow AI is a strong alternative to Vapi, especially for those seeking an intuitive platform to build AI-driven workflows without deep technical expertise. It offers a no-code interface, making it accessible to users with minimal coding experience while still providing powerful customization options. The biggest difference is the set of features available in the interface (and therefore available to non-developers). These include batch campaigns (for mass outbound campaigns), embeddable widgets, and data extraction.

Bland AI

Bland.ai is an advanced, enterprise-focused alternative to Vapi. Unlike Vapi, which is more accessible thanks to its no-code option, Bland.ai focuses on providing an even higher level of flexibility for developers. The platform is packed with enterprise features like SOC 2 Type II security, PCI DSS-compliant payments over the phone, and more.

Retell AI

Retell AI is focused on helping you effortlessly deploy AI voice agents. Like Synthflow, it puts most of its emphasis on the platform's UI. It offers native features for booking meetings (through Cal.com), automated syncing of your knowledge base, call forwarding, and more.

FAQ

Can you use Vapi if you are not a developer?

Yes, Vapi offers a fully functional UI, but it lacks some features compared to the developer-friendly API, such as an interface for data extraction and an embeddable widget for the assistant.

Who is the founder of Vapi?

Vapi was founded by Jordan Dearsley and Nikhil Gupta in 2023. It’s a San Francisco-based company.

Is Vapi open source?

No, Vapi is not open source. It is a commercial platform. However, it does provide extensive customization and integration options through its API, including open-source options.

What is the open-source alternative to Vapi?

Currently, there is no fully open-source solution available on the market. However, if you have the development resources and bandwidth, you can build your own stack using open-source models.

I'm a co-founder of a marketing automation platform and obsessed with all things related to marketing and SaaS growth. In my free time I love to go to the gym and play video games.