Unlock the knowledge in audio via a simple search API

December 21, 2022
The Objective Team

Audio as a medium contains a wealth of knowledge. Podcasts, audiobooks, interviews, speeches, radio, and more all contribute to our collective knowledge. However, that information has historically been hard to access, and finding something stored in audio has boiled down to manually scrubbing through recordings. We built podsearch.page to give podcasters a search page that goes deep into their episodes and understands the underlying audio natively. The engine powering this site is available as an API, so any developer can build audio-native search into their app.

👋

Have audio or other content you’d like to make searchable? We’d love to help. Please get in touch!

Loading data is as simple as making a POST request to the ingestions endpoint with the source to ingest, which in the podcast example is an RSS feed:

$ curl -X 'POST' \
  'https://api.kailualabs.com/v1/catalogs/podcasts-inside-round/ingestions?sources=https://anchor.fm/s/66b9cac4/podcast/rss' \
  -H "Apikey: $your_api_key"
{
  "msg": "created",
  "status": 201
}
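For readers calling the API from application code rather than a shell, the ingestion request above could be sketched in Python roughly as follows. The endpoint path, catalog name, feed URL, and `Apikey` header are taken from the curl example; the helper name `build_ingestion_request` is hypothetical, and percent-encoding the feed URL assumes the API decodes standard query-string encoding. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as it appears in the curl example.
API_BASE = "https://api.kailualabs.com/v1"


def build_ingestion_request(catalog: str, feed_url: str, api_key: str) -> Request:
    """Build (but do not send) the POST request that asks the API to ingest a feed.

    Hypothetical helper: the function name and signature are illustrative only.
    """
    # Percent-encode the feed URL so it travels as a single query-string value.
    # Assumption: the API accepts the encoded form; the curl example passes it raw.
    query = urlencode({"sources": feed_url})
    url = f"{API_BASE}/catalogs/{catalog}/ingestions?{query}"
    return Request(url, method="POST", headers={"Apikey": api_key})


request = build_ingestion_request(
    catalog="podcasts-inside-round",
    feed_url="https://anchor.fm/s/66b9cac4/podcast/rss",
    api_key="your-api-key",  # placeholder, replace with a real key
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and reading the JSON body, which per the example above should report `"msg": "created"` with status 201.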

Our platform is modality-agnostic: it pulls in every element of the feed and makes it searchable. We start with the text in the titles, descriptions, and other metadata, and our multimodal engine also ingests the audio itself and indexes its contents. When users search, the engine returns the most relevant results for their query based on all the available content in each moment of each episode, not just the information highlighted in the metadata.

Once the ingestion has finished, you can query your catalog with natural language, e.g. “How do I find product market fit?”:

$ curl 'https://api.kailualabs.com/v1/demo/catalog/search?query=How%20do%20I%20find%20product%20market%20fit%3F&limit=10&extra_fields=document_matches,images,title,description,attributes,url' \
  -H 'Apikey: your-api-key'
{
    "results": [
        {
            "document_matches": [
                {
                    "reference_type": "text",
                    "doc_id": "8919316a90644db2b3aa5e77cec85e57",
                    "media_identifier": "transcript",
                    "position": {
                        "start_char": 20062,
                        "end_char": 20272,
                        "start_timestamp": 4259,
                        "end_timestamp": 6105
                    },
                    "highlight": {
                        "text": "What advice would you give from your own experience to founders for that stage of your company and are thinking through <b>how to find product market fit</b>? Yeah. I think there are probably two separate pieces here."
                    }
                },
                ...
}
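As a rough illustration of consuming this endpoint from code, the Python sketch below builds the same search URL and walks a response to pull out each highlighted passage with its timestamps. The URL, parameters, and field names come from the example above. The helper names (`build_search_url`, `iter_matches`, `strip_bold`) are hypothetical, the sample response is a trimmed copy with its elided parts closed off by assumption, and the unit of the timestamp values is not stated in this post, so the code passes them through unchanged.

```python
import re
from urllib.parse import quote, urlencode

# Search endpoint as it appears in the curl example.
SEARCH_URL = "https://api.kailualabs.com/v1/demo/catalog/search"


def build_search_url(query: str, limit: int = 10) -> str:
    """Return the search URL with the natural-language query percent-encoded."""
    params = {
        "query": query,
        "limit": limit,
        "extra_fields": "document_matches,images,title,description,attributes,url",
    }
    # quote_via=quote encodes spaces as %20 (as in the curl example);
    # safe="," leaves the commas in extra_fields unescaped.
    return f"{SEARCH_URL}?{urlencode(params, quote_via=quote, safe=',')}"


def iter_matches(response: dict):
    """Yield (highlight_text, start_timestamp, end_timestamp) per document match."""
    for result in response.get("results", []):
        for match in result.get("document_matches", []):
            position = match.get("position", {})
            yield (
                match.get("highlight", {}).get("text", ""),
                position.get("start_timestamp"),
                position.get("end_timestamp"),
            )


def strip_bold(text: str) -> str:
    """Remove the <b>...</b> markers that wrap the matched phrase."""
    return re.sub(r"</?b>", "", text)


# Trimmed copy of the response shown above. The highlight text is shortened
# here, and the closing brackets are an assumption about the elided structure.
sample = {
    "results": [
        {
            "document_matches": [
                {
                    "reference_type": "text",
                    "doc_id": "8919316a90644db2b3aa5e77cec85e57",
                    "media_identifier": "transcript",
                    "position": {
                        "start_char": 20062,
                        "end_char": 20272,
                        "start_timestamp": 4259,
                        "end_timestamp": 6105,
                    },
                    "highlight": {
                        "text": "thinking through <b>how to find product market fit</b>?"
                    },
                }
            ]
        }
    ]
}

print(build_search_url("How do I find product market fit?"))
for text, start, end in iter_matches(sample):
    print(start, end, strip_bold(text))
```

Because each match carries both character offsets and timestamps, a player UI could use the timestamps to jump straight to the matching moment in an episode, which is the "no manual scrubbing" experience this post describes.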

The engine will take your query and retrieve relevant pieces of content based on their meaning, not just string matching. It will try to find segments that are the most relevant to what you are searching for, even if the keywords don’t match exactly.

Not long ago, building something like this would have required a PhD and writing a ton of machine learning code. We think that building search that understands opaque content like audio should be as simple as making a couple of API calls. We’re building a search platform that enables these user experiences, and it’s what powers the site you see here. If you have audio or other content you’d like to make searchable, we’d love to help. Please get in touch!
