November 11, 2004

Searching Metadata on Podcasts Isn't Enough

Harold Gilchrist writes:

I'm not sure why Napster is a part of this discussion. As far as I know, podcasting right now tends to be not-always-well-known people recording small talk-radio-like segments and sending them out via RSS to subscribers. The difference between the podcasts I know of and, say, the last 50 years of music is that you had already been exposed to some of that music through radio, friends, commercials, and so on. Podcasts are new; we haven't been exposed to the content yet, and they need some way to be searchable. If we only have metadata such as the name of the person talking, the subject categories, the title, and the interviewees if there are any, then searches will all focus on that metadata, and people will be driven to content created by those they already know instead of also discovering content on subjects they care about because the words looked interesting before they listened.

So, we need to search the transcript: to find information not covered in the metadata, and to find creators who may be unknown but are doing something interesting.

So why not try to make podcasting different and better than P2P networks that rely just on metadata, early on in the development of this medium, while we still can? Search on P2P networks sucks, so why not try for something better here? Searching the transcripts would be much like searching key words across blogs with any of the blog search services.
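The idea is the same as blog search: index every word in the transcript, not just the metadata fields. Here is a minimal sketch in Python (all episode names and transcript text are invented for illustration):

```python
# Minimal sketch: keyword search over podcast transcripts, the same way
# blog search services index post text. Episode names and transcript
# text below are made up for illustration.
from collections import defaultdict

transcripts = {
    "episode-001.mp3": "today we talk about wireless spectrum policy",
    "episode-002.mp3": "a rant about coffee and laptop batteries",
    "episode-003.mp3": "more on spectrum auctions and radio history",
}

def build_index(transcripts):
    """Map each word to the set of episodes whose transcript contains it."""
    index = defaultdict(set)
    for episode, text in transcripts.items():
        for word in text.lower().split():
            index[word].add(episode)
    return index

def search(index, query):
    """Return the episodes matching every word in the query."""
    words = query.lower().split()
    results = [index.get(w, set()) for w in words]
    return set.intersection(*results) if results else set()

index = build_index(transcripts)
print(sorted(search(index, "spectrum")))  # matches episodes 001 and 003
```

A real system would add stemming, ranking, and incremental updates, but even this crude inverted index surfaces episodes by what was actually said, not by who said it.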

Okay, so let this new system be more democratic, and searchable, by creating a standard transcript system to make the full content more searchable. What's wrong with that? Why not try to actively disrupt the power laws that will form around a metadata-only search that will push people to listen mostly to known podcasters? This is new, and different, as Howard says. So let's figure out a good way to discover new voices in podcasting. Searching the content is one way, but I'm sure there are others. So what do people recommend to accomplish this power-law disruption for podcasts?

Posted by Mary Hodder at November 11, 2004 08:41 AM | TrackBack
Comments

We're in a very familiar loop here, not at all unlike the beginning of blogs, which grew even though it was often said that we needed directories (and we had them, but eventually the supply of blogs completely outstripped the ability to maintain the directories).

We actually have a podcast directory, on ipodder.org, and it's working pretty well; it's distributed and decentralized. Unfortunately, and ironically, even the existence of this directory is hard to discover, making it ever harder to discover the nodes in that directory, and harder still to discover the limited metadata for the programs, and then the individual podcasts.

In other words, if you ask me, news of new content is going to flow the usual way, human to human, with certain nodes being recommenders to many. This time I'm not likely to be one of them, Adam is the best at this, it's just like what he did at MTV.

Posted by: Dave Winer at November 11, 2004 09:14 AM

Both Dave and Curry seem happy with MP3 opacity. This just won't wash in the medium run. Google et al. need to deconstruct a blog down to the post level, and permalinks must point to specific moments in an audio or video file.

Most of the current podcasters are groovin on the freedom to just speak. No editing. No typing. No html or crafting links in a post. Just you and the microphone.

Not good enough.

At a minimum, we need closed captioning for accessibility and to find the parts of a file we want to hear. (Dave and Adam are momentarily under the impression that each moment of each podcast is so compelling you just can't wait for the next moment. This will pass.) WE NEED RECORDING, ENCODING, PLAYER, AND SEARCH/NAVIGATION TOOLS TO TREAT AUDIO AND VIDEO FILES AS COLLECTIONS OF MICROCONTENT, NOT OPAQUE BLOBS. Then we can cite a great 10-second quip ("We're going to Wyoming! And Minnesota!"), click a blogged link to go straight to it, and listen to the cited sound bite or section.
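One way to make that kind of citation work is a time-coded transcript: each line carries a start offset, and a citation becomes the audio URL plus a timecode. A rough sketch, with invented segment data (the `#t=` fragment syntax here is illustrative, not an established standard):

```python
# Sketch: a time-coded transcript that turns a quoted phrase into a
# deep link with a start offset, so a blog post can cite a 10-second
# quip instead of the whole opaque MP3. The segment data is invented.
segments = [
    (0.0,  "welcome back to the show"),
    (12.5, "we're going to wyoming and minnesota"),
    (22.0, "now about that new gadget"),
]

def cite(audio_url, segments, phrase):
    """Find the segment containing the phrase and return a timecoded link.
    The '#t=' fragment syntax is illustrative, not a current standard."""
    for start, text in segments:
        if phrase.lower() in text:
            return "%s#t=%.1f" % (audio_url, start)
    return None

print(cite("http://example.com/show.mp3", segments, "wyoming"))
# -> http://example.com/show.mp3#t=12.5
```

A player that understood such links could seek straight to 12.5 seconds in, which is exactly the "listen to the cited sound bite" behavior described above.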

Conversely, when we listen to a transparent show, new behaviors are available to us. Just think of what TiVo does. It lets us rate individual programs Thumbs Up or Thumbs Down, so it can learn our preferences. It lets us subscribe to (watch later) items mentioned in commercials, like another show, or see info and stats on the current program, including reviews and summaries from other sources. With the Internet at our disposal, microcontent creates a whole platform for developers.

(It's not as if Dave and Adam are recording in 30-second snips; they're doing 20 to 60 minutes at a time. Even NPR has shorter segments. Dave argues that it's a continuous sequence of ideas that build on each other, so you must start at the beginning. Perhaps, but few podcasts are rhetorical ladders, and most listeners benefit from some idea of how Dave and Adam intend for them to spend the next hour: before they download, to make the listening decision, and during the audition, to navigate to the sections of higher relevance, as determined by the listener, not the podcaster.)

Microcontent means things like time-coded transcripts (even bad speech-to-text will be a boon to search), location names (the Melrose Starbucks) and other geocoding, the languages spoken, credits, links to people/sites mentioned, tip-jar info, MPAA-style ratings (PG-13 for language?), and maybe even card-catalog category info.
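For illustration, that bundle of microcontent might look something like the record below. Every field name here is invented for the sketch, not a proposed standard:

```python
# Sketch of a per-episode microcontent record: time-coded transcript
# lines plus descriptive metadata. All field names and values are
# invented for illustration.
episode = {
    "title": "Morning Coffee Notes",
    "language": "en",
    "rating": "PG-13",                    # e.g. for language
    "locations": ["Melrose Starbucks"],
    "credits": ["Host: A. Podcaster"],
    "links_mentioned": ["http://example.org/"],
    "transcript": [
        {"t": 0.0,  "text": "good morning everyone"},
        {"t": 31.2, "text": "let's talk about rss enclosures"},
    ],
}

def transcript_words(episode):
    """Flatten the time-coded transcript into a set of searchable words."""
    return {w for line in episode["transcript"] for w in line["text"].split()}

print("rss" in transcript_words(episode))  # True
```

Once records like this exist, search, captioning, and deep-linking all become straightforward queries over the same structure.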

Microcontent is part and parcel of shifting power from the producer to the information consumer.

See you at the cabal.

Posted by: Phil Wolff at November 12, 2004 04:27 AM

The most promising tool I have seen in that area is HP's SpeechBot (http://speechbot.research.compaq.com/), which does speech recognition on spoken-word recordings. Unfortunately, it is still a research project: they only index input from a limited number of radio broadcasts. But this technology in full-fledged grab-all-you-can-get-on-the-web search mode? Wow!
(see http://www.forret.com/blog/2004/10/google-is-listening-searching-audio.html)

Posted by: Peter Forret at December 4, 2004 12:25 PM

If voice recognition accuracy can ever be made a function of computing power, then many podcasts could be batch-processed by an entity like Google (Auggle?).

Otherwise, NGH.

Anybody who thinks the majority of podcasters are going to add tons of metadata themselves is smoking crack. Semantic-network-flavored crack (the worst kind).

Not gonna happen.

People will find interesting podcasts (and -casters) the same way they find interesting TV shows and web sites: word of mouth, directories, links from web mavens and connoisseurs, Google-wandering, etc.


I question the use case Phil mentioned. I really don't believe people consume podcasts sentence-by-sentence or even topic-by-topic. Most of the ones I listen to are not really data-driven; they are personality / rant / story driven.

Interacting personalities don't break up into sound bites meaningfully, plus these guys' conversations jump all over the place. You just have to want to swim in that flow if you're going to be a regular listener.

Is a rant on Whole Wheat Radio worth tagging at a tiny atomic level? JimBob hardly thinks they're worth posting at all (even though I like to listen to them).

I also question Phil's assertion that most podcasts are shoot-from-the-hip with no accompanying web site. The ones I've seen usually have a text summary on their parent site.

Posted by: willc2 at December 17, 2004 05:38 PM