July 30, 2005
An Answer! Chris Anderson Writes About Sellers at the End of the Long Tail
Chris wrote a post a couple of days ago, in answer to a series of questions I wrote in the fall. I asked about sellers further down the curve: not Ebay, iTunes and Amazon, who sell tail content but are themselves at the top of the curve, but the sellers who themselves exist down the curve and sell content down the curve. I could only think of a couple of examples.
Chris answers here, where he says that most of the sellers are either at the top of the power law as sellers of tail content, or they are slicing bits of their businesses into small subdivisions of themselves to sell content. He suggests that there are:
1. Long Tail aggregators (that include both the head and tail of content and products)
2. Niche suppliers/producers (who get aggregated by someone else)
3. Filters (which help people find what they want)
Update: Chris notes here that there are some sellers operating down the niche including companies like CDBaby, CountryTracks.com, TheLoveOfMetal.com, DownloadPop.com and AudioFader.com. Chris makes the case for those kinds of businesses here.
Nice!
Blogher: Women who want to fund, build & sell things
I'm leading a session at Blogher called Women who want to fund, build & sell things. I will be there with Denise Howell and Patricia Nakache. We will discuss with the audience questions about support and information they need to fund their own businesses. We want to answer questions about the venture market (what's getting funded, what's not), the venture process (both lifecycle of funding as well as in-the-meeting pointers), funding strategies (e.g., angel vs institutional), and typical investment decision criteria, and see where the discussion leads us. Please join us if you are thinking about starting a business, have started a company and are navigating the funding process or would like to contribute what you've learned in your work. We are looking forward to a rich discussion.
Tag: blogher
July 29, 2005
Blogher Speaker's Session
I'm here in the Blogher Speaker's session, held the day before the conference. Notes are posted on Donna Mill's blog: SoCalMom.
I must say, this is impressive. I know of no other conference that holds speaker sessions (there may be some but I've never heard of one, so please leave comments if there are some). Almost all the speakers came, and the presentations by Elisa Camahort and Lisa Stone were great. I'll update this post with notes from Donna's post, but frankly, every conference I attend could use something like this, where the organizers give the results of the survey from attendees as they registered (context about who's coming and what they care about.. talk about knowing your audience), common sense tips about speaking and the guidelines for running sessions.
The handout, which is not yet online, but will be, should be given to speakers everywhere.
One other tip during the session on how to live blog: don't wear organic deodorant on days like tomorrow, use the ones with aluminum .. just for tomorrow.
And.. they quoted Doc Searls' post about Jonathan Schwartz at Always On, who was too even and needed to be more animated, have more fun. The suggestion to us was: have energy, have fun!
Tag: blogher
July 28, 2005
Campfires and Story telling
Britt Blaser wrote an interesting piece on corporate blogging, extending the metaphor of campfire story telling, and the properties he sees in it as why we trust its "talk to accept its steady, unadorned, agenda-free tone as trustworthy."
It's something I've been thinking about, as I started writing about it (offline) a couple of months ago to describe why people need to tell stories and what the relationships, expectations and processes are that cause us to speak in a certain way. But I was thinking about it in terms of individual users and their desire to share their experiences and connect to each other in deeper and more real ways. Britt makes the analogy about corporate blogging, where he suggests that our ability to share information at low transaction costs online means that we won't accept a stilted style or broadcast mode. We'll just change the channel to someone more authentic and conversational. He suggests that the style of speaking around the campfire is deeply embedded in our primal brains and that when people, corporate bloggers or otherwise, take that tone, we are much more prone to listen.
I think it's more than tone, but a process we engage in with talk that is conversational, where everyone has an opportunity to speak, there is not so much a competitive flavor but rather a contemplative one in the interaction. Also, we take turns and listen to others, which also reinforces the egalitarian nature of the experience. The ideas and stories are more about shared experiences, regardless of whether the teller was with the others when the experience occurred. I'm not sure the actual details of the story matter in the long term, except in the moment of the telling, as listeners half listen for truth or in the case of a narrative willingly suspend disbelief and half listen to the development of the story, emotional buildup, and shared connection to the speaker. The details serve to support the moments of entertainment, emotion and connection, but in memory, it is the shared connection and warm emotional experience that we remember, more than most of the details.
If corporate blogging can succeed in revealing who people are inside the company, really, with all of our foibles and quirky eccentricities that make us authentically human, then we might want to sit with people in companies to share our stories together. But part of the success of the campfire is equality in the sharing, as much as the tone during the telling of the story. And I'm not so sure that corporations on the whole would be willing to show that unkempt, unmanaged side of themselves as they share information campfire style with outsiders who themselves are revealing this sort of thing.
July 27, 2005
Feedster and AOL
Well, the blog search news is hopping this week, as Feedster has now announced their deal with AOL, which will have RSS search and a news reader, along with some other stuff.
Congrats, guys. Nice work!
July 26, 2005
For the Vox Populi, Part II: A Comparison of How Some Blog Aggregation and RSS Search Tools Work for Keyword Search
This is part II continuing from a post on URL search I did yesterday. It follows a post on how URL search and link counts are done by Bloglines (as an information search tool, not a news reader tool), Blogpulse, Feedster, Pubsub and Technorati. The third will cover subscription search (watchlist) performance, the fourth will look at special services and the fifth will look at spam and controls for it. The sixth will summarize and make recommendations about how to best use the services.
Keyword search
Keyword search is very different than URL search, from the user perspective.
URL lookups are more straightforward in terms of user motivation: a user looks up a URL to see everything that links to it. No matter the search motivation, the user still wants to see all links, and all five blog aggregation tools that I reviewed in the last post give results in reverse chronological order, with some kind of link count. So whether users are doing an ego surf (looking up their own URLs) or checking out who links to a client, their own company, the New York Times or the post of a blogger whose influence or conversationalness they are interested in checking, the results in that chronological form satisfy these different needs, together with the number of links, at least at first. It may be that a user then wants to sort the results list to see more 'authoritative' posts first, but that is more often a secondary need, and not currently offered by any of these five services. Since blogging and RSS search are very much about the 'aliveness' of the activity, serving results in reverse chron order does satisfy most user needs (this observation is based on user testing I did that looked at search results on URLs, keywords and topic browsing -- the last of which doesn't exist yet, though I built a front end system that did it, and tested it repeatedly with users).
Conversely, keyword search is harder to get right in terms of user expectations. First, user motivations for searching are different and spread more evenly across a more complicated set of goals than with URL search. Does the user want a quick taste of what is out there around a particular topic? Or do they want every instance of a keyword match, with an accurate count of those instances? Do they want to see only the most relevant posts that use the keywords, or the most recent? How each result is laid out is also an issue in satisfying users' search needs, because we have all been trained to map our information expectations, consciously or not, to Google keyword search results. They have us viewing keyword search results in very particular ways. And yet, the part of the web that is alive is very different than that which is static. So keyword search results for live data are naturally different, and yet we find ourselves first looking through our Google eyes that see relevance first, before we refocus on the live, chronologically ordered results, which may be formatted a little differently.
Below is an image with screenshots of Google, in order to compare it to Technorati, Feedster, Blogpulse and Bloglines (again, Pubsub does not have historical search, so they have no results to compare for this test). Each of the four blog search tools was captured with the first result for the term: napsterization. Google results give context about the number of results, but there is no indication of time. All four blog search results note the number of matches they see, but also include the time of a post. Bloglines has a "subscribe" button to pull the feed into its newsreader and Blogpulse and Feedster offer XML buttons for putting the feed into the user's newsreader. Blogpulse offers to track the conversation and provides a profile of the blogger for additional context.
But note the format of all of them. Everyone maps to a format that places a link at the top of the entry that corresponds to the excerpt below with the match. Feedster and Technorati bold the search term, as does Google. Bloglines and Blogpulse don't. Below the matching excerpt, they all place a link to the blog, and follow with context they find relevant. Google shows the size of the page, the others follow with dates and times of posting, and Technorati gives the number of links to the blog as an indicator of authority.
But the point is, blog search results are similar to web search results, with some additional information presented. The order of the results is different, though, in an attempt to meet most users' expectations and goals with the information, and more closely match the results with what is interesting about blog information. Google serves what they believe is the most relevant information, based on page rank. Blog search companies give what they see as most relevant, which is results in reverse chronological order.
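To make the ordering difference concrete, here is a minimal sketch in Python. It assumes each result carries a posting time and a single relevance number; the `Result` type, the `relevance` field and the sample titles are hypothetical stand-ins, and none of this is how Google or any of the blog search services actually rank.

```python
from datetime import datetime
from typing import NamedTuple


class Result(NamedTuple):
    title: str
    posted: datetime   # when the post was published
    relevance: float   # hypothetical score; higher means more relevant


def by_relevance(results):
    """Web-search style ordering: most relevant first."""
    return sorted(results, key=lambda r: r.relevance, reverse=True)


def by_recency(results):
    """Blog-search style ordering: newest first (reverse chronological)."""
    return sorted(results, key=lambda r: r.posted, reverse=True)


# Made-up sample data, for illustration only.
results = [
    Result("Old but heavily linked essay", datetime(2005, 3, 1, 9, 0), 0.92),
    Result("Post from this morning", datetime(2005, 7, 26, 8, 30), 0.40),
    Result("Post from last week", datetime(2005, 7, 19, 17, 45), 0.65),
]

for r in by_recency(results):
    print(r.posted.isoformat(), r.title)
```

The same three results come back either way; only the sort key changes, which is why the two kinds of result pages can look so alike and still feel so different.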
The picture below is simply a way to visually compare results:

Additionally, I've included another PDF chart comparing how the five blog search companies handle keyword search. Note that Pubsub does prospective search, so their value is in feeds of keyword searches and not historical information. I'll also update this post and chart if additional information comes out, as I've been doing with my first post in this series. Future posts will summarize and give more sense of the value of these search tools, but here I'm focusing on how these things are done, and on the longer term results of search quality.
KeyWord Search Comparison Chart
Comparing search completeness, timeliness and cleanness
Right now, Doc Searls is running a short term look at blog search, including URLs and Keywords: The Sam Test and Your Findage May Vary. He is looking for the speed and inclusiveness of delivered results across several services. But comparing just the services I can review with historical search, I can see that Blogpulse, Feedster and Technorati seem to have very high, accurate results, with little duplication. Feedster has more results in the same period (48 hours) because they mix del.icio.us results into their search results instead of separating them as Technorati does. I'll be curious to see what the counts are, based on Doc's criteria. But my own counts show about 45 results talk about this in the past couple of days, and it appears that Blogpulse, Feedster and Technorati each have almost all the results, and Doc reports that Technorati was able to grab them the fastest.
Compare this to a very specific survey I did last fall that encompassed a week of results (not dependent on the speed at which posts were pulled into each search service but rather looking at duplication and coverage of the activity across posts). In that survey, Blogpulse won hands down. But since then, Technorati has redone keyword search, removed the 7 day result limit, and while results only go back to last October (the start of the redone database) their new keyword search is doing very well and shows dramatic improvement. In Doc's recent test, Technorati wins in the speed/coverage test. In my tests, which were based on completeness and cleanness, Technorati and Blogpulse both did well, but if it's true that Technorati was faster, I didn't notice because enough time had passed that Blogpulse had parity, with similarly complete and similarly clean results. Also, Technorati has reduced duplication over the past 10 months, bringing it more in line with Blogpulse's cleaner result set from last year. But it appears from Doc's test that Technorati is now faster than Blogpulse, a few hours after an event, to pull in posts. Overall, Technorati's keyword service shows marked improvement and does very well.
The best interface, with most interesting options again has to go to Blogpulse, which has links to tracking the conversation (links to a post instead of just links to a blogger) and profiles for bloggers, as well as a clean easy-to-read style.
Additionally, in various test searches, Feedster still appears to mix blog posts, top down news and del.icio.us data together, though it did not appear to be as much of a problem as it was 10 months ago. Depending on your goals, this practice may or may not be helpful. And as the keyword survey below shows, it was not helpful for that particular use case, because repeated Associated Press articles cluttered the results so badly. However, Feedster did get the originating post in last year's survey, which was difficult, and which no other service got.
In contrast to the current tests, below is the 10 month old survey of two searches for: kryptonite lock, and kryptonite bike lock, across 4 blog search services. Searches compared results to see which service provided the best user experience and results. The user task was to grok where this meme had started and to get a sense of what people were saying.
Search Comparison Summary dated 9/19/04
Kryptonite locks became a major story for bloggers this past week, when a bike rider (whose bike was stolen because his kryptonite lock was compromised with a bic pen), made a video of how to pick the lock. Bloggers picked up the story with speed by Tuesday 9/14/04, though things appear to have originated from a biking forum post from 9/12/04 that then was blogged on 9/13/04 (appears to be this one). That video appears on many blogs starting around Tuesday of this week. So keyword search for "kryptonite lock" and "kryptonite bike lock" (searches were NOT done with quotes) works across all services to find out what's going on with this story in the blogosphere. Also, by 9/17/04 (Thursday), the story appeared in newspapers across the country, both original stories by those news outlets, as well as by AP and Reuters stories.
All four searches returned results in reverse chronological order. Only Feedster got the originating blog post from Anarchocyclist. Blogpulse had the cleanest result set, with the fullest set of returns, though not complete. Many of the major blogger's posts were missing from the Blogpulse set, but they were also missing from Technorati's returns.
Blogpulse had by far the cleanest result sets (only three duplicate posts out of 160 and 241), the cleanest and easiest to read presentation, and no top down news posts (Feedster had around 70% of results from AP, Reuters and other news services and it was hard to distinguish them), so grokking the blogosphere's take on this topic was easiest with them. Blogpulse had all results listed by blog post title, hyperlinked to the posts, and the links that were spotchecked were correct. Blogpulse gave the best overall experience and returned results data, despite missing some posts.
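The two "cleanness" measures used in this survey, duplicate posts and the share of top down news stories, were counted by hand. As a rough sketch of how the same tally could be done in Python, assuming each result has already been labeled with its permalink and whether it came from a blog or a news service (the `summarize` function, the labels and the sample URLs are all hypothetical):

```python
from collections import Counter


def summarize(results):
    """Tally a result set.

    results: list of (permalink, source_type) pairs, where source_type
    is "blog" or "news". Two entries with the same permalink count as
    one unique post plus one duplicate.
    """
    per_post = Counter(permalink for permalink, _ in results)
    total = len(results)
    unique = len(per_post)
    news = sum(1 for _, kind in results if kind == "news")
    return {
        "total": total,
        "unique": unique,
        "duplicates": total - unique,
        "news_share": news / total if total else 0.0,
    }


# Made-up sample data, for illustration only.
sample = [
    ("http://example.com/a", "blog"),
    ("http://example.com/a", "blog"),   # same post listed twice
    ("http://example.com/b", "blog"),
    ("http://example.org/wire-1", "news"),
]

print(summarize(sample))
```

In practice the hard part is the labeling, since the services did not reliably mark which entries were news stories.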
When blog posts across the four services were isolated from the news stories that were listed in, say, Feedster, as just another post, all four services did give a picture of what was happening across blogs regarding people's thoughts about the locks. However, the easiest and fastest way to do that was on Blogpulse, as stated. Feedster's extreme mixing of top down news stories with blog posts may satisfy some searchers, because the order is most recent first and the news stories are more recent (followed by blog posts). But the actual experience was that the Feedster returns for this kind of search, where results might produce both types of entries, were not great, because the results pages resembled Yahoo news, and therefore getting the blogosphere take was much more difficult.
Technorati is giving a suboptimal experience because of the limited results returned due to forced phrase search, the extreme amount of duplication of posts, the lack of title hyperlinks that returned the correct post every single time, and the presentation.
Results in detail:
1. Technorati returns 41 results for -- kryptonite lock -- and 18 results for -- kryptonite bike lock --
Of the 41 results, 40 are actually shown (41 was somehow missing), 2 were exact repeats of nanopublishing.weblogs.inc and 12 were from badassgeek.com from "1 day 13 hours ago" and 9 were from "20 hours and 8 minutes ago." Those 21 "posts" were identical and obvious by skimming through them. The net result was that of the 40 shown posts, 19 were unique and 21 were repeats of 2 of the 19 unique posts. Because the search default is for phrases, these searches, not put into quotes, still return results in quotes and this severely limits the value of the search. As an example, the originating blog post that appears to have started this meme in the blogosphere uses the term: kryptonite U-locks, and Technorati's search would not therefore pick up this post for the phrase: kryptonite lock, though other search services did. Just to test, kryptonite U-locks did not return the Anarchocyclist post from 9/13, which it should have, but it did bring up 3 results that did not appear in the other two searches.
Earliest posts are from 2 or 3 days ago for the 'kryptonite bike lock' search and 4 to 5 days ago for the 'kryptonite lock' search because of the phrase search limitations. Posts do not show where the meme originated, and start well into when the blogosphere discovered this issue. Posts are shown 20 per page so paging through was not onerous.
2. Feedster returns 410 results for -- kryptonite bike lock -- and 663 results for -- kryptonite lock --
300 of the results for kryptonite bike lock were from Tuesday 9/14 forward and all posts before this were not about the bike lock picking issue. Feedster is also showing major news stories mixed with blog posts, and the only way to tell is either from the icon they use (a large blue square with a white "i" in the middle) or by reading each source. However, they use this icon for many informational blog posts as well, and so reading the source became a time consuming task to figure out what is coming from a blog and what comes from top down media. In fact, more than 70% of the "posts" were actually mainstream media articles, and if the goal was to find out what the blogosphere was saying, on an issue that appears in both places with lots of "press," Feedster was making this goal difficult. The returned results looked very similar to a search on news.yahoo.com for the same terms. Percentages were similar for the 663 returned results on kryptonite lock. Feedster allows any instances of the searched words appearing in any order to be included in the returned results. Spotchecking post title links produced some posts that were not the ones Feedster returned on results pages.
Earliest posts go back months because they do not limit search to 7 days, but the older posts are not about the lock picking story. Posts were shown 10 per page, so paging through 30-50 pages was onerous to get to the beginning of the blog meme.
3. Bloglines returns 212 results for -- kryptonite lock -- and 129 results for -- kryptonite bike lock --
Numerous duplicates of posts were found: 4 of one post, 12 of another. Because there were so many duplicates and so many pages of posts, I didn't do an exact count, but every page of results had numerous repeat posts. One post from Slashdot had over 40 entries as individual posts, and I would guess 40-50% of the entries were duplicates. Also, Bloglines has some AP and Reuters stories mixed into results, though not the 70% of returned results that Feedster did. Bloglines had about 25% news stories, though they did not distinguish between these "posts" and blog posts, and it was even more difficult to tell them apart than in the Feedster results. Bloglines allows any instances of the words appearing in any order to be counted as a result. Spot checking the title hyperlinks produced posts that were not the ones Bloglines listed.
Bloglines' first story is from Eyebeam on Tuesday 9/14/04 for a blog post on breaking the locks with a pen. This was not the first post, but was very early in the meme. Posts are shown 20 per page so paging through was not onerous.
4. Blogpulse returns 241 results for -- kryptonite lock -- and 160 results for -- kryptonite bike lock --
Blogpulse had very few duplicate posts (maybe 3, which was amazing compared to the other three services), and no top down news stories. 143 of the 160 returned results for kryptonite bike lock, and 224 of the 241 posts for kryptonite lock were all blog posts, all on the bic pen topic, all matching their respective searches. Presentation was clean and easiest to read. Blogpulse had all results listed by blog post title, hyperlinked to the posts. All spotchecked post titles were correctly linked to the actual post returned in their results.
Blogpulse did not get the originating post for this meme, nor did the search for Kryptonite Bike Locks get the Engadget posts that kicked off the top down news and many of the hundred or so blog posts that followed this week. However the kryptonite lock search did pull the Engadget posts. Lots of Live Journal results; maybe 50% of entries are from Live Journal bloggers.
Earliest posts go back months before the bic pen results. Blogpulse returns 10 posts per page, so paging through 16 and 24 pages was a bit onerous.
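The gap between Technorati's forced phrase search and the any-order matching described for Feedster and Bloglines can be illustrated with a small Python sketch. This is not any service's actual matching code: the tokenizer and both functions are hypothetical, there is no stemming (so "locks" does not match "lock"), and whether the any-order services require all of the words or just some of them is an assumption here.

```python
import re


def tokens(text):
    """Lowercase word tokens; a hyphen splits words, so 'U-locks' gives 'u', 'locks'."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(query, text):
    """True only if the query words appear next to each other, in order."""
    q, t = tokens(query), tokens(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


def all_words_match(query, text):
    """True if every query word appears somewhere in the text, in any order."""
    return set(tokens(query)) <= set(tokens(text))


# Made-up post text, for illustration only.
post = "My Kryptonite bike lock was opened with a pen"

print(phrase_match("kryptonite lock", post))       # phrase search misses it
print(all_words_match("kryptonite lock", post))    # any-order search finds it
print(phrase_match("kryptonite bike lock", post))  # exact phrase is present
```

A post that says "kryptonite bike lock" never contains the exact phrase "kryptonite lock", which goes some way toward explaining why a forced phrase search returned so few results for the shorter query.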
July 25, 2005
Check out Newsweek: Getting down with the masses
Just saw this on Newsweek:

Scroll down.. on their main page to see it on the right side. Pulling in bloggers to contrast their thoughts with Newsweek's reporting, and showing what bloggers write about Newsweek articles and writers is a very cool idea. Nice!
Also, congratulations to Technorati on providing data to Newsweek.
July 24, 2005
For the Vox Populi: A Comparison of How Some Blog Aggregation and RSS Search Tools Work
UPDATED: Recently, there has been some blogosphere discussion about different blog search services. People have been asking me for a year and a half to compare them, and I've been reluctant. However, after last week's confusion, I decided that if folks like Robert Scoble are having difficulty comparing the search results of different services that we've been using for some time, we really needed to get a few things clear for users. Also, Doc Searls suggested that it was about time. And the other day, he said it again in person.
I'm going to do this as a six part series, the first of which is below, on how services track links to blogs. The second will be on key word search, the third will cover subscription search (watchlist) performance, the fourth will look at special services and the fifth will look at spam and controls for it. The sixth will summarize and make recommendations about how to best use the services. I picked the five services I look at every day: Technorati, Feedster, Bloglines, Blogpulse and Pubsub, and so I'm familiar with them over time. I see watchlists or alerts via RSS feeds from all but Bloglines, of both URL and keyword searches, many of which are duplicate searches that allow me to also track how the services do with their searches. Note that I'm not reviewing Bloglines as a newsreader, partly because I use Netnewswire for the most part, with Bloglines as one of my backup readers, and partly because there is no comparison to the other services because they are not news readers at all.
Additionally, Blogpulse had this write up in Marketing Vox, suggesting it might be a Technorati Killer in the estimation of a blogger they were quoting. However, because what Blogpulse covers is fundamentally different, and their philosophies about how to age information are different, they are not so similar when comparing results of URL searches for inbound links. Depending on the user's needs, one or the other service may suit those needs better. However, with some of the additional features Blogpulse is now offering, it is doing some of the things that bloggers and others really want from other blog aggregation companies, yet aren't being offered, like rank, citations and recent posts. So in this sense, they are different and more interesting, if Blogpulse information is what you are looking for about you or others you want to analyze.
Finally, Adam Pennenberg notes that these kinds of services are like public utilities, so it seems like a good time to compare and contrast the services.
This exercise is an attempt to give readers and users of the services a comparison of how the services work so that they can take best advantage of the strengths and avoid the weaknesses in order to track URLs, keywords, other special services, and alerts or subscriptions or watchlists (the services each use different terminology in order to differentiate themselves but users tell me the terminology is just terribly confusing and they wish that as an industry we would settle on one term and use it across all the services and then get on to figuring out how to provide the service better).
Matt Hurst of Intelliseek (parent to Blogpulse) has a post on evaluating blog search services which is very informative. It includes information on search generally.. which I think applies very much to evaluating key word search, which will be covered in my next post. URL search is a little more straightforward in that people want to see everything that is linking to the URL they are looking up. But he makes some excellent points.
In disclosure, I should say that to one degree or another, I'm friends with people at all of these companies, as well as having worked for Technorati in the past, and currently a member of its advisory board.
URL Searches for Inbound Links
Two weeks ago, Scoble compared the inbound link counts for Dave Sifry's blog on Technorati (735 links at the time of Scoble's post) and Bloglines (2,644 links at the time of Scoble's post). However, the two counts, as contrasted, aren't actually comparable. First, Technorati's count is actually for inbound sites or sources. In other words, you can have 10 links from a blog, but that blog counts once as a 'site'. So Dave has 735 blogs that have linked to him at least once, at this moment in time. Technorati also only counts links and sites from blogs that have a link on the front page. Therefore, if a blogger blogs, which bloggers tend to do, their old posts scroll off the front page and the links in those old posts drop out of the Technorati count at the same time. Blogroll links stay in the counts because they are permanently on the front pages of blogs, but if a blogger's post links to another blog, that link only gets counted so long as it's on the linking blog's top home page.
Bloglines, on the other hand, gives a total link count for all of Bloglines' history. If a blogger is linked to 10 times in the history of Bloglines' aggregation of links, those links count as ten towards Dave's Bloglines total. Bloglines doesn't give a base count of sources doing the linking. Also, Bloglines shows you everything since they started tracking blogs, so Dave's first link goes back to a post on August 22 2002. Technorati would age that post off their link counts, since that blog no longer shows the post on the front page (it long ago scrolled off the page). However, I wasn't able to look at Dave's first link on Technorati, because the service kept returning error messages about high search volumes, so I can't compare their first result to Bloglines' first result.
Note also that Bloglines' total for Dave's blog is now, two weeks after Scoble's post, 2730 links, versus Technorati's total of 712 sources (each blog counts once). Bloglines is higher than two weeks ago, because it has an aggregate count of all links. Technorati is lower, because some blog posts have scrolled off those blogs' front pages, and until new links are made, Dave's source count might continue to fall. And based on each company's information philosophy, this is actually as it should be, and is correctly counted using each methodology. In fact, the difference is very useful, because one can compare Dave's current activities, blogroll and post links at 712 from Technorati to his historical link count at Bloglines of 2730, maybe discounted a little for duplication of posts. My assessment might be that Dave is currently a heavily linked to blogger, but three years ago, didn't have so many links, and has grown over time, in an upswing to say, around 2000 links total over the history of his blog. Probably this has occurred because of the growth of Technorati: since Dave is its CEO and this is the place he blogs about Technorati, his blog has had its link counts grow as more attention is paid to Technorati.
On the other hand, my blog has 1012 links from Bloglines over the past couple of years (discount 20% for dups) but 205 site links in Technorati. My assessment might then be that Napsterization is more of a steady blog.. with 800 links over the past two or so years, and since I already know that the blog had similar link counts a year ago.. that it's more conversational, linking out and in at similar rates over the past year or so. Not much upswing but a steady conversation ongoing.
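Here is a minimal Python sketch of the two counting philosophies as I've described them, not either company's actual code. The `Link` record, its `on_front_page` flag and the sample blogs are hypothetical; the point is only that the same set of links yields two different, equally correct numbers.

```python
from typing import NamedTuple


class Link(NamedTuple):
    source_blog: str     # the blog that contains the link
    on_front_page: bool  # is the linking post or blogroll still on the front page?


def source_count(links):
    """Technorati-style count: distinct blogs with a link still on the front page."""
    return len({link.source_blog for link in links if link.on_front_page})


def total_link_count(links):
    """Bloglines-style count: every link ever seen, repeat linkers included."""
    return len(links)


# Made-up sample data, for illustration only.
links = [
    Link("blog-a.example", True),    # blogroll link
    Link("blog-a.example", True),    # recent post on the same blog
    Link("blog-a.example", False),   # old post, scrolled off
    Link("blog-b.example", False),   # old post, scrolled off
    Link("blog-c.example", True),    # recent post
]

print(source_count(links), total_link_count(links))
```

In this sample, blog-b drops out of the source count entirely once its post scrolls off, while the total link count only ever grows. The 20% duplicate discount on my own Bloglines number is plain arithmetic: 1012 x 0.8 is about 810, which I round to 800.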
Below, in chart form, is a comparison of Technorati, Bloglines (as an information search tool, not a news reader tool), Feedster, Blogpulse and Pubsub. The chart is a PDF (blog software doesn't render html charts so well... but if you have a suggestion about getting this data into my post, please email me at mary@hodder.org) but as feedback for this post comes in, I will update both the post and chart and note the updated time and date. I'm going to treat this survey as somewhat of a wiki, so that I can incorporate feedback to make this the most accurate survey possible.
Please note the footnotes, as they explain additional information about specific categories of information and how specific services work in those categories of activities. Also, note that some services perform poorly in the URL lookup category, but their usefulness will become apparent in the keyword category, or for subscription search or for other special services. Please don't write anyone off due to a poor showing here in the URL section. All five of these services are very valuable, as they each show us different things, and frankly, for my information needs, I want and use all of them each day to track myself, my projects, companies I consult for, and all of my areas of interest, which are numerous. Often, the combination is the only way to get an accurate picture of what is happening online across blogs and RSS feeds.
NOTE: I've updated the file just now to take into account revised and clarified information about Blogpulse and Technorati. Blogpulse has a bug in their URL search, wherein, if the http:// is not at the front of the URL, very little information is returned. And so rather than 9 links for napsterization, there are 477. And Technorati, I wanted to point out, does not count links in its link counts that have scrolled off the front pages of blogs, but they do still show search results that match keywords that have scrolled off. So users may see older results, but not see them in link counts.
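Until a bug like this is fixed, the workaround on the searcher's side is to always put the scheme on the front of the URL before running a URL search. A minimal sketch, assuming nothing about how Blogpulse itself works: the helper name `with_scheme` and the example address are hypothetical.

```python
def with_scheme(url: str, scheme: str = "http") -> str:
    """Return the URL with a scheme in front, adding one only if it is missing."""
    url = url.strip()
    if "://" in url:
        # Already has a scheme (http://, https://, ...), leave it alone.
        return url
    return f"{scheme}://{url}"


# Hypothetical blog address, typed the way people usually type it.
print(with_scheme("example.org/blog"))
print(with_scheme("http://example.org/blog"))
```

Running every lookup through a step like this means the same query form goes to each service, which makes the link counts easier to compare.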
An additional update regarding Bloglines. They noted that they only serve results for searches from blogs that at least one subscriber has in their list of subscriptions. This has been added to the chart under information philosophy.
PDF file of the comparison of how blog search services work.
Also, please use the comments below to tell me about areas that need more information, or suggestions. I'd like this to be as accurate as possible and will correct or update with information as I find it, or it's sent to me. Thanks very much for suggestions.
Oh.. and you have to answer a question to comment.. so please remember to do that, or the comment system gives you that obtuse answer that your comment is 'of questionable content' which isn't really true.. just that you haven't answered the question. Thanks!
July 21, 2005
Blogher is sold out!
I'm leading a discussion on investing and entrepreneurship with Denise Howell and Patricia Nakache. I was hoping the investor panel at Always On yesterday would prompt Denise and me to figure out a discussion post to get the conversation going before our discussion on the 30th at Blogher. Denise and I conferred afterwards and decided the panel was so unpalatable (trying to be generous here) that we wanted to make sure what happened at AO didn't happen at our discussion. We'll keep working on a post to put up to get the discussion going online previous to the conference.
Lisa Stone writes about the list of speakers at Always On, color coding it to show that the number of women speakers is really low, and pointing out why Blogher is one solution to the problem of conference organizers being unable to think about speakers other than men.
I think another solution is to come up with a list of interesting women doing amazing things. I've been working on that list, but would love to get comments below with suggestions. I'll try to post the list this weekend.
The Digital Media Exposure Scale
I'm at Always On, and there are interesting hallway conversations going on. Dave Sifry and I were talking about exposure, or, how much you expose online the people you come into contact with in person. The other night in the EFF panel discussion, I said that if I know someone is online in a medium, I have no problem putting them online on my blog or using Flickr or whatever the appropriate thing is. In other words, if someone is online in text, I will talk about them by name on my blog in text. If someone puts themselves online in pictures, I will too, by name. Same with rich text. If they aren't online, I might put them up, but not attach their full names or information that would make it possible to find them.
Additionally, I noted that people think of media reuse differently depending on the type of media. Text is least likely to be a problem if cut and pasted, photo reuse is a little more of an issue, but sound and video are most concerning for those putting their media online. And so we should use some judgment around the ways we reuse each other's media. However, I also think this will shift as we see more examples of remixing, and get comfortable with having our stuff remixed, even in ways we don't like, and realize the remix is a reflection of those remixing, and not those who made the original media, and cease to care so much. In other words, the richer the media, the more we are concerned about our own images or how others reuse our media.
This came up because Dave walked up and we chatted about some of the AO sessions, and he shot a little video of me describing a point from a session yesterday. And we talked about how we each assume that we can do this with the other, because we are already online and put ourselves out there.
Dave made an interesting point that those of us with companies doing social media need to think about what we will do, what happens when we have our first big scare. Some stalker does something bad with the information we put up online, using some service put up by these companies, to do something uncool that is scary for people. As more people beyond the early adopter crowd take to blogging, social photo sharing, vlogging, podcasting, etc., we are more exposed. The good part is, people in these companies are all pretty connected to each other, so we can quickly talk about it, and hopefully adjust for the bad actor behavior to solve the problem. But we haven't had our first big scare yet, and that will happen, and cause us to rethink our online behaviors and the services that are out there helping us filter information. It will even out, but we are still early and naive in this business, and we need to be sensitive to these issues.
July 20, 2005
July Event: Planetwork FOCUS on DIGITAL IDENTITY TOOLS
Planetwork (I'm dying to say planetwerk...) is doing this event next week on identity systems:
Thursday, July 28th doors at 6, program at 7
CIIS, Namaste Hall, 3rd Floor
1453 Mission St. San Francisco (2 blocks from Civic Center BART)
Kaliya Hamlin of Identity Woman has curated this line up that provides a great opportunity to learn more about some of the latest tools for next generation digital identity.
Light Weight Identity - LID
Johannes Ernst http://netmesh.info/jernst
NetMesh Inc. http://lid.netmesh.org/ .
Light-Weight Identity(tm)-- LID(tm)-- a new and very simple digital identity protocol that puts users in control of their own digital identities, without reliance on a centralized party and without approval from an "identity provider".
OpenID
Brad Fitzpatrick http://bradfitz.com/
Six Apart, Ltd. http://www.sixapart.com/
OpenID, a decentralized identity system, but one that's actually decentralized and doesn't entirely crumble if one company turns evil or goes out of business. An OpenID identity is just a URL.
Sun Single Sign On
Pat Patterson http://blogs.sun.com/superpat/
Sun Microsystems http://opensso.dev.java.net/
Sun is announcing the intention to open source web single sign-on. This project, called Open Web Single Sign-On, or OpenSSO, gives developers access to the source code for these basic identity services and allows them to focus on innovations that solve more urgent problems, such as securely connecting partner networks, ensuring user privacy, and proving compliance.
Opinity, Inc
Ted Cho http://www.opinity.com
Opinity provides what might be called open reputation for end users. It is a young start up offering free online reputation management related services so that individuals can authenticate, aggregate, and mobilize their website (eBay, Amazon, etc.) reputations. Opinity also offers reputation management tools so that individuals can monitor, build, and work to enhance their own reputation going forward. Individuals can also review other individuals at the Opinity website.
_______
Planetwork has been hosting monthly networking forums in the Bay Area for the last 3 years. We are a unique network sitting at the nexus of technology use for social and environmental good. To support the monthly forums we invite voluntary donations (in a basket on the food table).
If you would like to join our mailing list to get more information about upcoming events please go to this page and get a planetwork i-name http://www.planetwork.net/community/index.html
July 19, 2005
EFF Announced Blog-A-Thon !!
Tell your stories about the first time you knew your rights online were important, and win a T-Shirt!
The Electronic Frontier Foundation, to celebrate its 15th Anniversary, is having a party tonight at 111 Minna in SF (where I'll be speaking with Kurt Opsahl, Violet Blue, danah boyd, Dan Gillmor and Jackson West), and I think ms. boyd has my most favorite quote on the event:
- ::gasp::bounce:: They're letting me out in public again! Mooo ha ha ha!
They are also having a Blog-A-Thon!

EFF says, "we want to hear about your "click moment" -- the very first step you took to stand up for your digital rights -- whether it was blogging about an issue you care about, participating in a demonstration, writing your representatives, or getting involved with EFF. As a thank you, we've enlisted an independent panel of judges to choose from among your posts for "Most Inspirational," "Most Humorous," and "Best Overall." At the end of the Blog-a-thon, we'll announce the names of the three bloggers with the best posts on our website and in our weekly newsletter, EFFector. We'll also publish the three best posts on our site and send the authors a blogging "kit" as an extra thank you: an EFF bloggers' rights T-shirt, special EFF-branded blogger pajama pants, a pound of coffee, and a pair of fuzzy slippers."
Well.. get to it!
And look here or here for entries using the Blog-a-thon tag: EFF15 keyword or tag.
(Thanks Joe, I corrected the address of the party!)
Vlogpod
Doc Searls talks about his first podcast (it's my video and Doc's podcast mixed):
(3.29 minutes, 320x240, quicktime format, iMovie, shot on a Canon SD300)
July 18, 2005
Did EVERYBODY Get Up on the Wrong Side of the Bed This Past Week?
Okay, so I go to Chicago for a few days last week and don't have much internet access, get a little backed up on my aggregator, and return to a lot of work in CA, only to find out that a bunch of people are really saying some very, very strange things.
It started with Silicon Valley Watcher who reported Peter Hirsberg's remarks at an event on 7/8/05. Peter apparently mentioned that Technorati would sell filtered blog data to companies interested in tracking themselves and their competitors. Of course, anyone can make free watchlists now, but these would be more sophisticated versions of those filters.
And then Jeremy Zawodny asked whether Technorati was going to share any dough with bloggers. Huh?
Um, my understanding is that Yahoo 'sells' search and filtering. And it makes money doing this. On its website, Yahoo search results showing freely available webpages are valuable because of the filter/search service that Yahoo provides, and those search filters allow them, in exchange for this service, to place ads next to their search results, thereby making them money. They don't share the profits with the makers of the matched sites. Or at least, Yahoo has never sent me a check for serving my content. Cause they ARE NOT selling content. They are selling a service for searching and filtering, and that IS salable.
You can't sell content online. If you do as a publisher, your content becomes unlinkable (see the Wall Street Journal for their *tremendous* online reach and participation in the conversation). You can't sell data like blog information because it's free already. You can sell services that help manage data and content, including filtering, search and aggregation. Service providers like Yahoo, Google, Technorati, Feedster, Pubsub, Bloglines, and many more, who offer free online services with less sophisticated search and filtering, ARE NOT selling data. They are selling convenience and management of data. And they don't owe us creators of free blog posts and websites for selling convenience by filtering our data. More sophisticated filtering and search, that is often highly customized, is a charge service. And providers of those services don't owe us a cut of that either. Because they ARE NOT selling data. It's the filtering and search services that matter.
Please read Information Rules again, where they explain why and how YOU CAN'T SELL CONTENT in the digital world, except in highly unusual circumstances but YOU CAN SELL THE MANAGEMENT of information.
Oh, and let me disclose here: I used to work for Technorati, and I am on the advisory board. I've also been critical of Technorati at times, but this time, I think they were the first to push this conversation out into the view of people, and were unfairly singled out for the 'selling of the blogosphere' which is a complete misnomer and totally inaccurate. But the reality is, lots of other companies before them have given away free services online with lower level customization, put ads next to them, and also sold highly customized services for special purposes to individuals. Technorati is no different, and is actually doing the right thing for the company and their ability to provide the free service in the long term. It is often the case in the digital world that free services are supported by the selling of premium services, and in this case paid filtering can support free filtering offered on the Technorati website.
Other folks that wrote about this include: Those linking to SVW and Doc links to a bunch here. Additionally, lots of other posts about the quality of Technorati's service appeared around the same time, which is a totally different issue, but somehow was conflated with the selling of filtering. I understand that folks think Technorati should make a good service before they start selling it, but it seems to me they are working on that, the service online is free, and no one is forced to use it. There are lots of other choices for finding similar data, and everyone is free to go to whatever site they wish. My own view is that I hold multiple search feeds from all the services, because they return different data on the same searches, based on their data models and databases. But that's a different issue as well, and maybe I should do a post on that, explaining who covers what, and how the results compare, as I've monitored all the services for the past 18 months.
July 17, 2005
Brad Templeton Interviews Esther Dyson at the Hillside Club
Last night in Berkeley. Great talk. Audio should be available soon (will update here with a link).
I took about 3 minutes of video, and edited it into this simple piece. But the complete talk was great. About 70 people came, including Esther's mother, who recently moved to Kensington/Berkeley.
(3.38 minutes, 17.5 MB, 320x240, quicktime format, edited in iMovie)
July 16, 2005
danah makes me ::giggle::
Which evil nation state are you? (similes for Microsoft, Yahoo and Google)

She sez "...it's not so very nice... but"
Check it out. ::giggle::
July 15, 2005
FAB: A missed opportunity to tell a great story, but what incredible assumptions and presumptions people have about this new kind of fabrication of personalized manufacturing
I'm in this book group, called 'netheads' where we read books about networks, network effects, digital culture and new technologies. We've been reading Fab by Neil Gershenfeld, who is a prof at MIT.
He started a class on fabricating your own stuff.. and instead of the 10 students he thought would sign up, 100 signed up, making things using computerized fabrication and CAD tools, and some other things, in a fab-lab. He later got copies of the lab exported around the world. The book relates what he observed, taught and learned, with people basically thinking along the lines of personalized, remix culture (which we usually see in digital situations) applied to what is normally considered to be mass-market manufacturing, in the one-size-fits-all model.
The book is repetitive, and not very well written at times. And the way he tells different students' stories is frustrating, because sometimes people come up with amazing things, like the computer interface for parrots, or the scream machine, and you really want more, in-depth information about how they did it, what the interaction is in the thing they made, how they figured it out exactly, and lots of detail about how they worked with the lab etc. instead of a glossing over in a page or so description. It feels at times that he is jumping from one 'cool' thing to the next, or is focusing on personalities instead of the beginning of a really interesting social phenomenon.
I really wanted deep thinking, connection and comparison between earlier self-manufacturing and this current version, and some analysis about how people make the leap from consumer to self-manufacturer. Instead, I got these quick hit stories where I had to figure it out myself to some degree from the skimpy information, but I know there is much more behind it and it felt missing from the stories and analysis.
But the assumptions the people in the book take on are radical. Just like we (online digital people) wake up every day presuming we can both make code that remixes other code, or remix digital information, or that we can write and rewrite whatever we want online, remixing others' words to create something new, assuming this is normal and possible, and just another everyday thing, so do the folks who have access to fab-labs presume they can remix the material world, and assume it's normal. I was imagining, in reading the stories of people's inventions, finding those things in Walmart. Would never happen. This stuff is too customized, too personal, too handmade, and too useful to find its way into some marketed, glossy mass-production type store.
Also, what is interesting is how people assume they are capable of this. 100 years ago, people manufactured their own things, but at some point, we gave that up. The vast majority would never think of self-manufacturing. But with computerized fabrication labs it's possible to think we can do this, just by watching others in a fab-lab or similar workshop, having them teach us, sharing information one-by-one and thinking in a remix mindset that takes you beyond the consumer, manufactured frames we are born into now.
It's extremely politically subversive stuff, amazing and exciting. It's really too bad that Gershenfeld isn't a great writer, who could transcend his own fascination with being cool, to find what is so changing about these labs, the communities that build up and fall out and build again around projects and creators' personal assumptions and shared information and learning. Because what is in the subtext that he never really brings to the surface is potentially transcendental for our culture.
It also reminded me of what I learned from my father when I was a kid. He was the CEO of a company for 25 or so years, and yet, he would take the opposite of those skills and experiences, in spare time, where he and I would do things like replace a sewage line, or rewire the bathroom, or replace a garbage disposal. Or make a rabbit hutch for my rabbits and guinea pigs, or build a playhouse. I don't know that I could fabricate things yet that they did in the fab-lab, but because of my experiences with my father, I believe I can do a lot of things that I hear others say they can't do. The possibility for me is there, and I think the fab-labs extend that possibility to lots of people for manufactured kinds of things. The fab-labs offer an incredible change in personal assumptions for people.
July 13, 2005
Setting the Scene
The past four days, I've been in Chicago, north of the city actually, in a nice leafy green suburb visiting friends. The original plan was to be there for 2.5 days, but they had a family emergency (the grandmother was diagnosed with pancreatic cancer three weeks ago, and after operating, the doctor said she might live a few months. But yesterday they had an appointment with a world renowned oncologist, and at the same time, the babysitter got sick Monday.. so they asked me to stay for Tuesday while they went with grandma to the research center. The good news is, new doc has some treatment that gives her a fighting chance!!)
So.. the girls and I went out for the afternoon to the forest preserve (a beautiful long narrow preserve, that stretches north / south for miles where we biked for hours), shooting video of them, to edit into a present for their grandmother. They directed the editing process, chose titles and colors, arranged photos and chose the music (we'd already done this Sunday with some beach footage so they were ready to go and very excited about what this new craft could do.)
Here are the results of their work.. they loved the process, where they could direct the story, creating a little movie with whatever they wanted. I also showed them Rain and Octopus, which they insisted on screening over and over. I showed them Flickr, and all my photos, as well as the photos and video I took at the 'parents only' engagement party their parents had for a friend Saturday night at their house. They were taken with the idea that they could see what went on, even though they stayed with their aunt and uncle for the night. They were captivated by all of it.
While I was in Chicago, I missed a couple of things until today when I could really read through my aggregator:
This will be the lore we tell the younguns.. when we're old, about how obtuse some people's reaction to digital media was.. back in the day: Bloggers need not apply.
This is the lore right now: Zits on the tragedy of a single telephone line for the whole family.
And, William Gibson on Remix Culture: God's Little Toys
Our culture no longer bothers to use words like appropriation or borrowing to describe those very activities. Today's audience isn't listening at all - it's participating. Indeed, audience is as antique a term as record, the one archaically passive, the other archaically physical. The record, not the remix, is the anomaly today. The remix is the very nature of the digital.
July 11, 2005
The Train of the Blogosphere
Jessika Hjarta left a comment on my blog here.. which, I have to say, felt a little bit like comment spam because it was so general and resembled the comment spam (which only shows up in the backend as I have a good filter, but there is a ton of it) that spammers try to leave. But I looked at her blog (personal journal and opinion style) and decided she was just saying she liked my blog, which I very much appreciate. But then Doc IM'd me and asked if this comment on his blog was spam. Same type of comment from Jessika, saying 'nice blog,' with a link to hers. We decided to write her a note, to see what was up, and it turns out that she sincerely liked both our blogs, is interested in topics around technology and the web, and she's relatively new to blogging. Next thing we get email asking about what an RSS feed is, as she listened to a Gillmor Gang podcast, and then asked Doc, what's a podcast?
I remember when I first discovered all this stuff online... it was really exciting, and finding people who could help me understand more of it was great too. People online who were blogging were just so nice and interesting, and I'm really happy to pass that help along to someone else who's new and trying to figure out what this all means. And check out her blog. It rocks, has a cool pinup-y style that's really beautiful, and is another great voice for the blogosphere.
July 07, 2005
For the love of wifi and 110
So, I'm in Palo Alto all day for meetings and in between, trying to get on wifi to work.
Started out in Palo Alto at a cafe that has disconnected all its power outlets so people can't plug in (Torrefazzione on University -- bah to you guys). After a while, due to power needs, I had to move. Eventually ended up at a Starbucks, where I helped some people make a shared network so that they could all get on with one account (they were all buying food.. so why not, plus I really hate Starbucks, and they self-identified as 'rookies' for wifi.. so I felt like they could really use the help....) You know, I bought things at all of these places.. so I think the deal should be that if you buy, you get on and can get power.
I'd rather have wifi than food, but I'll buy food to get the wifi.
July 06, 2005
Honor Tags
Dan Gillmor and Bayosphere have worked up an interesting tagging system, to differentiate the types of blog posts people are making, if they choose to self-tag their posts. They plan to pull those tagged items into their site to reflect back activities in certain categories. This has some advantages but also presents some problems, though I think there are community solutions that can moderate the problems.
Benefits of Honor Tags include:
- more fine grained searching, based on the ability to pivot on a couple of tags... where you can find the intersection of a type of post and a topic to see just those that have both kinds of tags
- community affiliation, closeness and participation due to special tag understanding and use
- people will declare their intentions
- there is potential for advertising based on these tags or aggregated groups of posts with tags
- we might find there is better tracking of reputations than we did before
- it might allow users of the tags to have a better shot at legal protection for self proclaimed journalism makers, as they make a kind of journalism
- users of these Honor tags can self-tag one post at a time, so one tag can be about one thing, and another can be another.. it's on a post-by-post basis that we understand these designations instead of by blog or by blogger (in other words, it's not self-tagging as a journalist, but rather as self-tagging a journalism-categorized post).
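The first benefit, pivoting on a couple of tags to find the intersection of a type of post and a topic, can be sketched in a few lines of Python. Everything here is made up for illustration: the sample posts, the tag strings and the function name `posts_with_all` are assumptions, and this is not how Bayosphere or Technorati actually index tags.

```python
# Hypothetical posts, each self-tagged on a post-by-post basis.
SAMPLE_POSTS = [
    {"title": "Cafe pulls its power outlets", "tags": {"personal", "wifi"}},
    {"title": "City council votes on muni wifi", "tags": {"journalism", "wifi"}},
    {"title": "Why muni wifi matters", "tags": {"advocacy", "wifi"}},
    {"title": "Interview with a fab-lab student", "tags": {"journalism", "fabrication"}},
]


def posts_with_all(posts, *tags):
    """Return the posts that carry every one of the given tags (set intersection)."""
    wanted = set(tags)
    return [post for post in posts if wanted <= post["tags"]]


# Pivot on a type-of-post tag plus a topic tag.
for post in posts_with_all(SAMPLE_POSTS, "journalism", "wifi"):
    print(post["title"])
```

Because the tags live on each post rather than on the blog or the blogger, the same author can show up under 'journalism' for one post and 'advocacy' for another.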
Problems that might creep up with this tag system:
- people are often either not honest about themselves, or simply don't classify themselves well because they are not very self-aware, or understand the definitions of the classifications differently, so they may state something different than what the community perceives them to be
- some outside the Bayosphere community may feel it's elitist
- as the benefits/results of self-tagging one kind of post become more visible either in search results or other aggregation pages, some people will game the system, especially if there are advertising dollars at stake
- people will game reputations of themselves and others for a variety of reasons
- people will game the links to be included in search or display systems in malicious ways just because they can
I think the community should moderate the use of these tags, to solve some of the problems that may arise from Honor Tag use. Some things that might help the community do this include:
- Users could make tags that are about self describing an action, at one point in time, in one post (so it's 'journalism' for a post, not 'journalist' as one's status)
- Users of the tags should make it clear that this Honor Tag system is for a particular community, and specifically for certain acts, not defining people, but that anyone can use the tags and is welcome to participate in this community through their blogs and use of these tags
- Users and the community as a whole could help make it clear that people can use any and all tags on a post by post basis.. meaning.. one post is journalism, and another is advocacy, depending on what's in the post
- Users and Bayosphere together could create some community moderation for the tag use so that if the community sees a bad actor, they can report it, and if there is a dispute, allow the community to decide what to do about it, and even how to handle it.
I'm very interested in seeing how people use these tags, and what the results are on Bayosphere aggregation pages (which I hear are coming soon) and through services like Technorati. One thing I'm already noticing is that I'm having trouble deciding which tag or tags to use for this post. I want to use one of the tags, but is this advocacy? or reporting? or personal? It's kind of all three, including the personal since right at this point in the post, I'm discussing my own tagging and classification issue. Humans are messy and we have trouble saying what we are, and sticking to one thing at a time.
I'm also really enjoying the creative ways people are coming up with to tag things, and I hope that Bayosphere and Honor Tags will keep tinkering with the classifications, tag structure and the UI and information meaning of the aggregation pages that collect the tags. These systems and tagging generally are very early stage and need a lot of work, but I'd definitely encourage people to try out Honor Tags and see what happens as their posts get pulled into other sites. I'm sure that the community around Bayosphere will have lots of feedback as they play with tagging. I think it's fantastic that Dan & crew are taking the plunge on this to try to figure out something interesting.
Good Luck!





