Comments: Link Love Lost or How Social Gestures within Topic Groups are More Interesting Than Link Counts

Great post Mary!

I replied here:

Posted by Kevin Burton at August 6, 2005 08:20 PM

My full reply URL is here I think..

Posted by Kevin Burton at August 6, 2005 08:23 PM

Mary, PageRank is not a secret. It's actually described in Larry and Sergey's paper The Anatomy of a Large-Scale Hypertextual Web Search Engine.

Of course, a lot of other stuff that Google does is secret, including other aspects of their search results ranking. PageRank, though, is not a secret.

-ryan king

Posted by ryan king at August 6, 2005 10:57 PM

The url for the PageRank paper is - it didn't make it through on the last comment.

Posted by ryan king at August 6, 2005 11:07 PM

"Part of what we want is a rich user generated ontology resulting in topic groups that is constantly adjusting to find what's delightful, useful, interesting across blogs."

That would be http:/

With the volume of blogs these days, you can find topic-focused blogs with keywords and tag search, which helps get out of the 'one true list' trap.

There's certainly lots of room for experiment here.

Posted by Kevin Marks at August 6, 2005 11:49 PM

This is well done, Mary, but may I add an element that most people seem to miss when tackling this issue. We all agree that the blogosphere is a bottom-up phenomenon, yet rankings are inherently top-down. This, it seems to me, is the most important factor in determining influence, etc.

Every blog is a part of or member of a tribe of individuals with like interests or other social, work or playtime factors that bring people together. No man is an island. We are all connected, and it seems to me that these connections are what need to be determined before any external value factor can be measured. I mean, it's fine to weigh overall rankings among the population, but that's mass marketing stuff and really irrelevant in a world of bottom-up connectivity.

In my work with LOCAL blogospheres, I find remarkable communities of people, so this is one element -- how does a blog or blogger fit with their geographical tribe?

A second factor is, of course, content. Rankings determined by blogger category, for example, will give an entirely different view of where that blog fits within tribes that are determined by interest.

If this truly is a bottom-up phenomenon -- and I believe it is -- then we must start looking at the expanding circles of influence that surround an individual before we can do any sorts of measuring.

Finally, if any of this produces a lust to "get to the top" in any way, then we've shifted focus from bottom-up to top-down. Frankly, I'm very comfortable on the bottom, because that's where the people are.

Keep up the great work. Terry

Posted by Terry Heaton at August 7, 2005 07:34 AM

Hi Ryan,
Thanks for letting me know about Google's PageRank algorithm and search results. I've made a note up in the post to the effect that it is the search result order, not the algorithm, that is secret. But I also wonder: isn't keeping the order of the search result ranking a secret just as problematic? I mean, if one part and not another is secret, there is still a kind of security-by-obscurity that the community may have trouble policing. Comparing that to a way to understand bloggers, I still feel that we need an open algorithm for showing bloggers' influence, especially if the metrics are based on so many factors.

Hi Terry,
Thanks! I do describe factoring in topic communities, which are included in the chart and the description, but had not thought about geographic communities. It's a great idea. But how would it work? Would the same algorithm around small communities that talk with each other be applied to show bloggers' influence locally? That could be really interesting, but we should probably test it against a large data set, because I can also imagine that some bloggers have no connection at all to bloggers in relatively close physical proximity, and therefore the results might be really off. But I definitely want to try it!


Posted by mary hodder at August 7, 2005 08:23 AM

The primary criterion for any metric is utility. Ask the simple question: What's this metric *for*? *Who* will do *what* with it?

Who are the prospective customers for this metric? Who is actually going to *buy* it (so to speak) and actually *use* it (not to mention misuse it)?

I don't want to sound cynical, but is blog ranking just for ego gratification? Or maybe simply a way for SEO consultants to justify that they've accomplished something?

If a business has a blog, don't they have an intended audience in mind? Aren't they most interested in their "reach" and effectiveness *within* that target audience or market?

Aren't most bloggers more interested in their *value* to their "tribe"?

Do we want to encourage niche blogging or encourage broader, "diverse" blogging?

Let's not forget one thing... blogs are about conversations. Comments should count a lot. Cross-blog conversation (not mere linking) should count a lot. Raw links don't seem to capture the quality of a conversation.

Finally, haven't you ever been involved in some conversation that goes back and forth endlessly until some "genius" makes an astute observation and short-circuits the entire conversation? How do you rank the elements of that conversation? The person or persons who resolve the issue deserve a lot of credit (ranking), but their insight may simply have come from carefully listening from a distance, so the "trench warriors" may deserve a lot of credit as well.

Question: Shouldn't comments raise the ranking of the commenter's blog as well?

And how would you rank a blog which had some early success, but then lost its luster? It almost seems like ranking should depend on timeframe as a parameter. Call this "historical ranking."

But even a "current" ranking is by definition somewhat historical. Will someone who spikes up sharply based on the timeliness of an event be ranked above someone who delivers consistent value without spikes over time? Do you want to value spikes over consistency or vice versa, or maybe the poor dumb *user* selects between or weights the balance between the two?
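A minimal sketch of that user-selected balance, under loud assumptions: the weekly inbound-link counts, the function name `blended_score`, and the `spike_weight` parameter are all hypothetical, and using the best week for "spikes" and the median week for "consistency" is just one possible choice, not anything proposed in the post.

```python
from statistics import median


def blended_score(weekly_links, spike_weight=0.5):
    """Blend a blog's best week with its typical week.

    weekly_links: hypothetical list of inbound-link counts, one per week.
    spike_weight: 1.0 rewards only the single best week (spikes),
                  0.0 rewards only the median week (consistency).
    """
    if not 0.0 <= spike_weight <= 1.0:
        raise ValueError("spike_weight must be between 0.0 and 1.0")
    if not weekly_links:
        return 0.0
    peak = max(weekly_links)
    typical = median(weekly_links)
    return spike_weight * peak + (1.0 - spike_weight) * typical


# Made-up data: one blog spikes once, the other is steady.
spiky = [0, 0, 0, 40, 0, 0]
steady = [6, 7, 6, 7, 6, 7]

for weight in (0.0, 0.5, 1.0):
    print(weight, blended_score(spiky, weight), blended_score(steady, weight))
```

With the weight at 0.0 the steady blog comes out ahead, and at 1.0 the spiky one does, so the same data produces different orderings depending on what the user says they value.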

Ultimately you've got a big problem: objectivity versus subjectivity. If you want to come up with global metrics that are inherently "objective", then they'll have little subjective value to each user. And if you focus on tailoring to the needs of each user or class of user, that subjectivity diminishes the global objectivity of the metrics.

Do we really have a handle on what problem we're trying to solve? Show me a robust *problem statement*, and only when there is some consensus about *what* the problem is (and how people will actually *use* the metric(s)) does it make sense to consider solutions.

If the problem is "understanding blogs", I'd suggest that we're talking about some global measure of how *effective* blogs are at reaching out to and engaging in conversations with their target audiences. The keyword there is effectiveness, not quantity or popularity.

If blogs are popular reading, but the conversations are minimal (or non-existent), shouldn't the "reading reach" be discounted by the weakness of the conversations?

How do you measure effectiveness of a conversation?

And what should be the measure or rank of a blog that stimulates angry controversy in the blogosphere without stimulating "useful" conversations within the blog itself? Will "firestarters" be permitted to retain the "fruits" of their ill-gotten gains? Will "frenzy level" tend to be ranked higher than simply offering calm reflections?

One reflection: The complexity of the ranking algorithm will determine the extent of gaming of the ranking system.

There was a little controversy over character blogs some months ago... might your algorithm and ranking system encourage "character communities" and "character tribes"? That reminds me of the need for a more robust "identity" validation system for blog posts and comments.

Note: With advances in software agent technology and multi-agent systems and artificial societies and virtual environments, aren't we just a short time away from a cyberworld in which "artificial bloggers" will be able to dynamically conjure up entire "artificial communities" that may attract "real" users, to form hybrid communities? How are these virtual bloggers to be ranked, especially if you are unable to identify them as "artificial"?

If you want to evaluate the effectiveness of a blog to its audience, I'm not sure you can really do any better than to poll that audience and let them do the ranking (e.g., -10 to +10 in terms of value received.) Even then, the effectiveness of the poll will depend on the willingness of the audience members to actually "vote", and vote responsibly.

Of course, the big remaining problem with such metrics is that so many blogs have an open-ended audience.

Maybe it's like stocks, where you have "growth" stocks, "value" stocks, small-cap stocks, and large-cap stocks, and ranking is relative to your "peer" group. The problem with blogs is that all of the parameters are open-ended, so it can be hard to define boxes that would hold even two blogs.

-- Jack Krupansky

Posted by Jack Krupansky at August 7, 2005 10:21 AM

I wonder whether rank is the wrong presentation, and clouds are right.

Clouds would primarily show the communities that a blogger is in. It may show secondarily the influence strength within that community, but that should be secondary in the presentation.

A cloud presentation might enable navigation along topic axes. For my blog, you'd be able to traverse to social software and Austin clouds.

Influence would be calculated within the cloud. So, Jon Lebkowsky would have separately calculated influence levels within the Austin and environmental blog communities.
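As a rough sketch of calculating influence within each cloud, assuming we already had a list of blog-to-blog links and a map of which clouds each blog belongs to (both hypothetical inputs, as are the function name and the blog names below), one simple option is to count only inbound links that come from fellow members of the same cloud:

```python
from collections import defaultdict


def influence_by_community(links, membership):
    """Count inbound links per blog, separately for each community.

    links: iterable of (source_blog, target_blog) pairs (hypothetical data).
    membership: dict mapping blog name -> set of community names.
    A link counts toward a community only if both blogs belong to it.
    """
    scores = defaultdict(lambda: defaultdict(int))
    for source, target in links:
        if source == target:
            continue  # ignore self-links
        shared = membership.get(source, set()) & membership.get(target, set())
        for community in shared:
            scores[community][target] += 1
    return {name: dict(blogs) for name, blogs in scores.items()}


# Made-up example: blog_a sits in two communities.
membership = {
    "blog_a": {"austin", "environment"},
    "blog_b": {"austin"},
    "blog_c": {"environment"},
    "blog_d": set(),
}
links = [
    ("blog_b", "blog_a"),
    ("blog_c", "blog_a"),
    ("blog_d", "blog_a"),  # outsider link, counted in no community
    ("blog_a", "blog_b"),
]
print(influence_by_community(links, membership))
```

Raw in-community link counts are only a stand-in here; any other measure could be dropped in, as long as it is computed per cloud rather than globally.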

Perhaps the presentation would allow the browser to traverse communities. One could find "blogher", and traverse to the "sepia mutiny" south asia community.

A cloud presentation would avoid the rankism, because it would focus on the community more than the individual, and allow a browser to traverse communities.

Posted by Adina Levin at August 7, 2005 10:28 AM

Actually Mary I just started wondering today if it is the very opacity of Google (and other search engine) results that gives it more credibility.

I think the fact that one can see who links to whom as part of the blog search results, and how it drives up ranking, leads at worst to dishonest, and at best to inevitably distorted, linking behaviors.

Posted by Elisa Camahort at August 8, 2005 11:35 AM

Hm, maybe the whole idea of making a 'just' system is wrong. To some a blog will be interesting because it has many readers, to others because it has many early-adopter-RSS-readers, to others because it covers a niche, to others because of its writing style.

A newspaper with a high circulation might be boring to me because it covers the wrong region of the world.

Different methods of measurement measure different things. Maybe it makes more sense to build separate tools which measure what you suggest and let the READERS decide what "toplist" to pick.

The marketers will optimize their strategy on totally different data anyway (views, PIs, click-through and such).

Posted by OliverG at August 8, 2005 11:36 AM

My trackback is failing for some reason. It's one of the many ways conversations get inhibited when the technology gets in the way.

What I think we are after is using technology to draw up a better measure of an audience. How engaged are we? I do more in my posting at

In the revision being drafted, I will be formulating a problem statement as Jack mentions. It is good to begin to agree on the problem we're solving before we can begin to solve the problem.

Posted by Steve Sherlock at August 8, 2005 05:29 PM

Very helpful post, Mary, thanks.

I may have missed it, but in the table above, should "frequency of posting" and "average length of post" be measurable criteria as well? They're certainly available by the time and date data on most blogging systems, and via word counting software.

Not sure what inferences you'd gather though from frequency of posting, since a lot of great blogs don't post very frequently.

At the same time, I'd venture to say that a majority of the "Top 100" blogs tend to do short, bursty posts several times a day.

How to adjust for this bias in measuring/ranking systems is another important question, methinks.

thanks again.

Posted by michael parekh at August 9, 2005 08:43 AM

I find this urge to find "another" ranking system somewhat puzzling. What is the purpose of such a ranking system? I share your distaste for it, yet you somehow think it is something essential.

I think a close analogy can be made with books. Do you choose what you read by the sales rank of a particular title? Certainly some do, and some stores make shelf space available on this basis, but most of what I read is not driven by any particular rank. It is driven by interest and referrals. Reviews and bibliographies and libraries are useful in this regard. I would think that publishing reasonably good reviews of blogs would be more beneficial than finding another way to assign rank.

Algorithms are great for some things, but not particularly for finding something that is interesting to me.

Posted by Jack Dahlgren at August 9, 2005 02:13 PM

you have really hit the nail on the head. i've been playing around with the idea of a better blog search concept for a few months, and you have done a great job articulating some of the key issues. you may be interested in a couple posts of mine on the topic:

cheers, mark

Posted by Mark Evans at August 10, 2005 10:29 AM

I like the idea of coming up with a more varied set of metrics for judging blogs than just incoming links. But why should we restrict things to making only One True List of bloggers? If we have a varied set of metrics, let each reader create their own list based on the metrics that matter to them. Maybe incoming links work for some people. For others, only personal referrals matter. For others, it may be writing style or frequency of posts. The power of the Internet is that we are no longer restricted to mass media trying to fit everybody - we can create our own media that is completely customized for us, our own personal blogosphere.
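To make the per-reader list concrete, here is a minimal sketch. It assumes each blog already has a handful of metrics scaled to the 0..1 range; the metric names, the blog names, and the function `personal_ranking` are all hypothetical, and a plain weighted sum is only one way a reader's preferences could be applied.

```python
def personal_ranking(blogs, weights):
    """Order blogs by a reader's own weighting of the metrics.

    blogs: dict mapping blog name -> dict of metric name -> value in 0..1
           (hypothetical, pre-normalized data).
    weights: dict mapping metric name -> how much this reader cares about it.
             Metrics the reader does not list count for nothing.
    """
    def score(metrics):
        return sum(weights.get(name, 0.0) * value for name, value in metrics.items())

    return sorted(blogs, key=lambda blog: score(blogs[blog]), reverse=True)


# Made-up metrics for two blogs.
blogs = {
    "blog_a": {"inbound_links": 0.9, "post_frequency": 0.2, "referrals": 0.1},
    "blog_b": {"inbound_links": 0.2, "post_frequency": 0.3, "referrals": 0.9},
}

print(personal_ranking(blogs, {"inbound_links": 1.0}))  # a reader who trusts links
print(personal_ranking(blogs, {"referrals": 1.0}))      # a reader who trusts referrals
```

The two readers get opposite orderings from the same data, which is the point: there is no single list, only the list each set of weights produces.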

I developed these ideas further in a post on my blog, but for some reason, the trackback didn't work, so I'm commenting manually.

Posted by Eric Nehrlich at August 10, 2005 10:46 AM

In many ways measuring blog impact is only the tip of the iceberg in measuring the centrality of a person or an organization and their ideas.

So many good parts of the discussion happen through back channels, side conversations, in person, ad hoc little email lists, and the like that the blog-only measurements are always skewed no matter the algorithm. You simply don't have the data.

Posted by Edward Vielmetti at August 11, 2005 07:36 AM