Rating Blog Posts

Every picture on Picapp has a small control at the bottom of its frame that lets the viewer assign a star rating of 1 to 5 stars.

Picapp Rating


It’s a good idea and I want to have something similar on this blog. Actually there’s a problem, but I’ll talk about that later.

I clicked on the question mark and found out that the rating service is provided by a third party – Outbrain.com.

Picapp Uses Outbrain


Oh no, it’s a platform!

It looked like I could sign up with them and add a similar control to all my blog posts. Then readers can tell me what they think of the post with minimal effort.

BUT I found that …

NOTE: The widget is only available for the installed version, and not for blogs hosted on WordPress.com (yet). If your blog is on WordPress.com, you can help us get approved there by giving your voice.

I went to the link indicated and found out that people have been asking Outbrain to support WordPress hosted blogs for over a year and nothing’s happened.

I added my request to the long list and soon got a reply from an Outbrain employee:

Yaron Galai, (Official Rep), commented 8 hours ago
Hey bangkok photographer – thanks for your comment! I agree with you that this situation is borderline ridiculous. The guys at WP are indeed very nice and we’re very friendly with them, but for some reason it is nearly impossible to get approved by them as a widget for WordPress.com hosted blogs.Our performance is outstanding and we have rarely had any complaints from our tens of thousands of bloggers, so any technical reason would be nothing more than an excuse. I am as mystified as you are by the reason for Outbrain not being approved for WP hosted blogs.

A direct nudge with them would be greatly appreciated. I will try to find out what’s the best way to do that.

Thanks again for your comment here!

My guess is that WordPress is reluctant to support them because it plans to offer ratings itself. But it has been over a year and there’s been no progress.

The Problem With Star Ratings

The problem is that people don’t agree on what the stars mean. In my use of Lightroom I have a policy that pictures get three stars by default.

Five stars are for the best photos I have ever taken.

Four stars are for better than average pictures that I would want to show to others or post on Flickr. I have made Lightroom Smart Collections of family photos, for example, that are four or five star rated.

Two stars are below average but contain something interesting or useful. A one star image is a candidate for deletion unless it contains something unique like the Loch Ness Monster.

I rate every picture and strive for a normal distribution of star ratings with three stars as the mean.
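A discrete bell shape over five stars with a mean of three can be modelled as 1 + Binomial(4, 0.5). This is just a hypothetical sketch of the target distribution I aim for, not anything Lightroom computes:

```python
from math import comb

def target_share(stars):
    """Fraction of photos that would get this rating if the
    distribution were 1 + Binomial(4, 0.5): symmetric, mean 3."""
    k = stars - 1
    return comb(4, k) * 0.5 ** 4

for s in range(1, 6):
    print(f"{s} stars: {target_share(s):.1%}")
```

Under this model roughly 6% of photos are one-star, 25% two-star, 38% three-star, and so on, mirrored on the high side.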

But there are no standards. In Chris Orwig’s Lightroom training he implies that he does not rate all his pictures and only gives them a single star if they are better than average.

So a Chris Orwig one-star might be equivalent to a Bangkok Photographer’s four-star.

Neither of us is right or wrong – each is a private convention.

So Chris might decline to rate the majority of Picapp pictures he uses and give a star to the ones he likes. Whereas I probably would not use a picture that I think is lower than my three-star rating.

Michael Willems describes his Lightroom rating process here. It has some similarities to mine but he starts at 2-stars.



7 Responses to “Rating Blog Posts”

  1. Raanan Bar-Cohen Says:

    Have you looked at enabling the ratings feature: http://en.support.wordpress.com/ratings/ ?


    • BKKPhotographer Says:

      Aargh! I am so stupid! “Ratings” was on my dashboard all along and I did not notice it.

      I turned them on just now.

      I guess that is why WordPress isn’t interested in supporting a third party.

      Thank you. Another Mr Fredrickson moment from me.

    • BKKPhotographer Says:

      One problem, though, is that the ratings widget only appears when you “open” the post. I bet most readers skim through the page and do not click on the title to open it.

  2. Raanan Bar-Cohen Says:

    @BKKPhotographer — we want to enable ratings for the home page / index page as well, and not just on permalink/post pages — just a matter of making it work on all themes. It will happen soon 🙂

    • BKKPhotographer Says:

      Ah yes, themes. The testing issue must be huge. I hadn’t thought of that.

      I wonder if most bloggers use [more …]. I only used it once for a very long post. Otherwise I let people read the whole post on the page.

      Thanks for the info. I also posted to the Outbrain blog.

  3. Jordan Elpern-Waxman Says:

    Ratings calibration is certainly an issue in all ratings systems and schemes, and I’m not sure that there is any clear all-purpose “best” solution. By way of background, I will share one of my own experiences with ratings calibration.

    I had the pleasure of working for a management consulting company that, like all good management consulting companies, was very into measuring and rating as much as possible, including employee performance. The employee review system was incredibly detailed, with dozens of skills and capabilities reviewed on a 1-5 scale, including half points (i.e. actually a nine-point system). Each whole-point score was meticulously defined for each skill at each professional level (i.e. analyst/associate/manager), so that in theory there was an objective definition of what it meant for an analyst to rank a 3 in Excel data manipulation and for a manager to rank a 5 in subordinate time management (I’m making up these examples).

    Despite this comprehensive documentation, and the fact that there were only about 20 reviewers reviewing fewer than 100 people in total, it was considered necessary to provide detailed instructions for reviewers on how to avoid biases that might make a 3 given by manager A actually imply a better performance than a 3 given by manager B. Despite these instructions and the relatively controlled environment I describe (20 reviewers reviewing a total of fewer than 100 employees; all reviewers and employees working together under close conditions and sharing a single corporate culture), it was *still* recognized that biases were inevitable, and even more steps were taken to try to make all ratings directly comparable.

    First, statistical means were used to calibrate ratings, such that if one manager gave everyone a 3 or above while another gave everyone a 3 or below, the numerical ratings were automatically adjusted based on each rater’s history. Second, a full-day meeting of all twenty reviewers was convened once per rating period, at which all reviews were brought up for discussion (with extra attention paid to scores that would imply promotions) and everyone was given a chance to comment on or object to another reviewer’s rating of the same employee, in order to provide a final manual check on rating biases. Still, with all of these quite extraordinary measures taken, no one was under the illusion that the ratings were completely objective and that full calibration had been achieved.
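    The per-rater statistical adjustment described above can be sketched by re-expressing each reviewer’s scores as z-scores against that reviewer’s own history. The manager names and scores below are made up for illustration:

```python
import statistics

# Hypothetical raw 1-5 scores from a lenient and a strict reviewer.
raw = {
    "manager_a": {"emp1": 4.0, "emp2": 4.5, "emp3": 3.5},  # habitual 4-giver
    "manager_b": {"emp4": 2.0, "emp5": 3.0, "emp6": 2.5},  # habitual 2-giver
}

def calibrate(raw_scores):
    """Convert each reviewer's scores to z-scores against that
    reviewer's own mean and spread, so scores from lenient and
    strict reviewers become directly comparable."""
    out = {}
    for rater, scores in raw_scores.items():
        mean = statistics.mean(scores.values())
        sd = statistics.pstdev(scores.values()) or 1.0  # guard against zero spread
        out[rater] = {emp: (s - mean) / sd for emp, s in scores.items()}
    return out

print(calibrate(raw))
```

    After calibration, manager_a’s 4.0 and manager_b’s 2.5 both come out as 0.0 (each reviewer’s own average), which is exactly the equivalence the raw numbers hide.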

    Now imagine trying to provide a universal calibration system that spans millions of blogs/picture sites, billions or trillions of entries, billions of reviewers, and thousands or millions of implicit criteria if not more (what does it even mean to say that I “like” a post or a picture? That it made me laugh? That I learned something? That I would read/view it again? That I would remember it? That I would recommend it to someone else? That I found the prose to be pleasing but the arguments lacking? Or the arguments convincing but the prose to be grating? etc, etc.).

    At the end of the day, the simple star-based rating system provided by Outbrain remains quite useful despite all of these limitations. For the question that you raised, that of calibration, I would guess that as n (the number of reviewers) approaches infinity, each rater’s calibration bias cancels out and you are left with a fairly well calibrated score (assuming that the set of reviewers is the same for each blog, which again becomes more true as n grows) where a 3 on one blog is equivalent to a 3 on the other. You should note that the aggregate rating compiled by Outbrain or Picapp is very different from the type of single-rater system that you mention when you compare your and Chris Orwig’s ratings. In that case you are dealing with a data set (be it a blog or picture album) whose set of reviewers has size n=1, so clearly the calibration will be poor or possibly even meaningless across data sets.
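    The bias-cancellation argument above can be checked with a small Monte Carlo simulation. Everything here is hypothetical: each rater gets a personal bias drawn from a symmetric distribution, and as n grows the average rating converges back toward the post’s “true” quality:

```python
import random

random.seed(42)  # make the simulation repeatable

def average_rating(true_quality, n):
    """Average of n ratings, where each rater adds a personal
    calibration bias plus per-rating noise; scores are clipped
    to the 1-5 star range."""
    total = 0.0
    for _ in range(n):
        bias = random.gauss(0, 1.0)   # rater's calibration bias
        noise = random.gauss(0, 0.5)  # per-rating noise
        score = min(5.0, max(1.0, true_quality + bias + noise))
        total += score
    return total / n

for n in (5, 50, 5000):
    print(f"{n} raters: {average_rating(3.0, n):.2f}")
```

    With a handful of raters the average can land noticeably off 3.0, but by a few thousand raters it sits very close to the true value, which is the “wisdom of the crowd” effect in miniature.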

    This was probably way more than you were expecting but I do think it is a neat topic so I got a little carried away 🙂

    • BKKPhotographer Says:

      Excellent points! Don’t worry about the length. I get carried away too. 😉

      I used to work for a major computer vendor and we had a similar culture of measurement and evaluation. There were guidelines from HR that told managers to approximate a normal distribution for each “performance band” – rank. So in a population of 100 software engineers we managers were told we could only have 5 ranked “exceptional” (5 stars) and had to find 5 to rank “unacceptable” (1 star). It was stressful and caused conflict amongst the first level managers like me. (Who were at the same time being ranked by the second level managers present at the ranking meeting.)

      You are right that a ranking applied by a large number of people will tend to a broad consensus as n increases. I’ve seen it talked about as “the wisdom of the crowd”. There’s a site called hotornot.com that lets people submit their photos to be ranked by others on a 10 point scale. It was addictive. They had many users and you could see the aggregate score and n after you ranked a photo. I recall rarely being surprised by photos that had a large n.

      I like the way the WordPress ranking widget setup allows the blog owner to assign labels to each rank. That means I can tell potential rankers what I think the stars mean. This is a small nudge to consistency.

      For an individual ranking her own work, consistency over time is important. I did a little exercise last night where I looked at the Lightroom Smart Collection of all the pictures I’ve ranked 5-stars over the past year. I asked myself if I would still rank them that way.

      In 99% of cases I agreed when evaluating the subject of the photo. (That means for example that my taste in beautiful women has not changed in a year). But my photographic standard (skill) has (I hope) risen so I am now more critical of the quality of a photo than I was. So I downgraded some for that reason.

      Maybe tonight I’ll go through the photos I ranked 4-star in Lightroom and see if I want to promote any.

      I got carried away too. I say I am “analytical”. Thai people say I “think too much”.
