Sunday, August 4, 2013

What's an Idea Worth?

Today's post is about the value of information!

The January 29 NYT ran an article on just this question.
It presents the story of an iconoclastic accountant who doesn't charge by the hour, but by ... something different: "he endeavored to help his clients make (and save) enough money that they would gladly pay a significant fee without asking about the hours it took him to figure out what to do". The NYT thinks this means he is charging based on what he thinks his "ideas are worth".

Well, wait a second.

There is lots of economics literature that discusses the basis upon which people are paid for their work. The classical distinction was "by the hour" or "by the piece". In the case of services, instead of "by the piece", we have the possibility of performance-based 'merit pay', or commissions.

Separate from that, there is a literature on how services are priced. The prevailing idea in the Marketing literature is that prices are based on one of three things: (1) cost, (2) the customer's willingness to pay, or (3) what the market will bear, i.e. the 'going price'. (I've always thought this Marketing literature was confusing, or confused: the market price is not some kind of third choice, but is orthogonal to the other two, i.e. each single firm might set a price as a function of its costs and/or the client's value, but the absolute level of the price will be affected by competition.) There is also the possibility of contingency fees (think Erin Brockovich's attorney), which set the price as a percentage of whatever the service gains for the client.

In the current case, where the accountant works alone, there is a convergence of the two questions, i.e. how the person gets paid and how the service gets priced. This perhaps allows us to forgive the NYT for mixing two issues that are separate in the economics literature, by comparing the possibility of an hourly wage (a question of how people are paid) with the possibility of a price reflecting value to the client (a question of how price is set). But still, there's actually quite a variety of ways in which the price for a service may be set, and it's not as if hourly rates have been a dominant approach, let alone the only approach, to pricing services.

With this background, we see that the idea of setting a price according to "what it's worth to the client" is not entirely new. But it comes in a few guises. A contingency fee, by definition, sets the price as a percentage of what the service proves to be worth to the client. In addition, the Marketing idea of basing the price on a customer's willingness to pay presumably reflects, to a great extent, the value of the item or service to the client. Is there anything new here?

To me, one interesting new question is the extent to which the service provider and/or the client can estimate in advance what the value to the customer is going to be. The NYT article writes: "He [the accountant] can only figure out what to charge his clients after spending a lot of (unbilled) time talking to them about their needs". Contingency fees reflect the value of the service to the client, so that aspect is not new per se. But contingency fees have always had a "contingent" component, on the grounds that it was impossible to know in advance what the value of a service would turn out to be. What is interesting in the NYT article is the possibility of estimating the value to the client in advance. For lawyers in litigation, it's never possible to know this in advance. But an accountant, who is facing not an adversary and a jury but a Tax Code, can figure out the value of the service to the client in advance. As it turns out, though, estimating this is not easy -- due again to the Tax Code, I'm sure.

So, for me, I assume that the value to the client enters into the price one way or another. But the interesting question for information products or services, is the degree to which the service provider and/or client can estimate in advance what the information's value to the client will turn out to be. 

My colleague Daphne Raban and I have research in progress that explores how question-answering services are priced. In the particular setting we studied, the questioners would set ("offer", in finance terms) the price in advance. We found that we could characterize the price that was going to be offered, based on the type of question that they were asking, under a coarse question typology.

There are many types of information and information services. They all have value to the client, and this value will presumably be reflected in the price somehow. But for some information services, like that of a lawyer, there may never be a way to figure out in advance what the realized value will be (though one can set a price based on an expected value). For other kinds of information and information services, like the accountant's, this value may be estimable in advance, but only after many hours of work, i.e. after beginning to actually do the work whose price one is setting. Finally, there will be some kinds of information services -- like the question-answering services we studied -- where the client is able to specify the value in advance.

One more thought. The accountant can only figure out the service's value to the client after doing the work. This is fascinating. It's like an "experience good" on the production side! heh heh

Monday, July 29, 2013

User-Generated Content... Travel Blogs

Today's musings are about user-generated content, which arises in two NYT articles in the last few days:

The first article is about travel bloggers who get their trips paid for by local tourism boards or other interested parties. In other words, the idea of the eccentric lone traveler with a backpack and an uplink is naive, with travel blogs increasingly resembling advertisements. As reported, "In March, the Federal Trade Commission had seen enough digital content that blurred the line between editorial and advertising that it issued a clarification document stating that disclosures of free trips need to be clear, concise and toward the top of posts". It's reminiscent of the issue of keyword ads versus organic search results.

The question facing readers is how to discern reliable content from content whose objectivity may be compromised by the writer's sources of funding. This is also an interesting research question, within the broader category of how people judge information quality. The article suggests a variety of clues that readers may use, such as the out-links from the blog. One imagines that readers may look for cues such as the number of followers that a blogger has, leading to a rich-get-richer effect of popularity. I personally am researching rich-get-richer effects, which I think are a big story in our Web-based information world.
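To make the rich-get-richer idea concrete, here is a minimal preferential-attachment sketch. It is not drawn from any study mentioned here; the setup and numbers are invented for illustration. Each new reader picks a blogger to follow with probability proportional to that blogger's current follower count, so early popularity compounds:

```python
import random

def simulate_followers(num_bloggers=10, num_readers=1000, seed=0):
    """Toy rich-get-richer dynamic: each new reader follows a blogger
    with probability proportional to that blogger's current follower
    count (everyone starts at 1 to avoid zero probabilities)."""
    rng = random.Random(seed)
    followers = [1] * num_bloggers
    for _ in range(num_readers):
        total = sum(followers)
        r = rng.uniform(0, total)
        cum = 0
        for i, f in enumerate(followers):
            cum += f
            if r < cum:
                followers[i] += 1
                break
    return sorted(followers, reverse=True)

# The resulting distribution is typically highly skewed: a few bloggers
# capture most of the readers, purely through the feedback loop.
print(simulate_followers())
```

Even this toy version produces the skewed popularity distributions that make follower counts such a tempting, and potentially misleading, quality cue.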

The other article is about the seemingly unrelated topic of Wintel-and-PCs versus Android-iPad-and-tablets. It quotes Daniel Huttenlocher of Cornell University's new New York City technology campus, who associates PCs with the function of generating content, and tablets with the function of consuming content. He notes that "There are way more consumers than producers, period, even in a world with lots of user-generated content", much to the chagrin of Wintel.

The IS literature, including the literature on end-user computing, has instruments that distinguish and separately measure information quality from functional effectiveness, but I think the distinction between generating and consuming content may be useful in models of adoption, usage, satisfaction, etc.

So, putting the two together: User-generated content is a big story for Web 2.0 and on. For IS research, the associated business models raise questions about how users figure out what information is objective. And there may be some importance in refining our models to distinguish explicitly between usage for content-generation and usage for content-consumption.

Thursday, July 25, 2013

e-Reader Wars and Teaching Network Effects

Today's post is about David Carr's NYT piece titled "Why Barnes & Noble is Good for Amazon". I am looking at it just from the perspective of teaching about network effects.

In my MBA classes I still use the Harvard case on AOL's Instant Messenger as the basis for discussing network effects. That case describes a bare-bones setting -- 3 big companies fighting about a chat standard -- that routinely stupefies even the brightest students and highlights the key concepts of "cooperate to compete" (a phrase I took from Kenneth Preiss) in a network-externality setting.

In an effort to use a more up-to-date case, I sometimes consider adopting the e-reader wars as a basis for teaching this topic. But to be honest, I find these wars too complicated to neatly teach (or understand) the core concepts. In these e-reader wars, each firm is operating within an ecosystem of device manufacturers, publishers, and (e-)book retailers. For example, suppose a publisher decides to make its books available on Nooks. Then, while it continues to compete directly against other publishers, it will also want to cooperate with those other publishers in trying to establish the Nook as a dominant standard. And so on. I call this "three-dimensional cooperate to compete under externalities" -- a mouthful that I can barely grasp. How -- what -- are we supposed to teach our Executive MBAs about this?

The e-wallet wars only up the ante. Imagine teaching a case with the following actors in direct competition:
Wal-Mart, 7-Eleven, Google, Visa, Mastercard, AT&T, DeutscheTelecom, Starbucks, PayPal, and more. This is the landscape in the e-wallet standards wars. How can anyone possibly wrap their mind around the core issues of such a case? Well, as I said, in my short courses I am satisfied if we can unravel the nature of the strategic decisions in the AOL IM chat case. I believe that that case captures many key concepts about cooperate to compete, which are then applicable to today's more complex ecosystem wars.

One other question about the article: assuming that Amazon also believes that Barnes & Noble is good for Amazon, does anyone expect a situation in which Amazon consciously eases the pressure on B&N to help it survive? Presumably, Amazon would like to see a weakened but extant B&N. In particular, Amazon would love to see B&N figure out a way to make money off the experience it sells, without selling too many actual books. Can we imagine any scenario in which that occurs? Any scenario in which Amazon passively or even actively "helps" B&N to move there? At the same time, Amazon may be moving away from reliance on e-books and orienting itself more towards software and apps. So, assuming that both B&N and Amazon don't want to keep competing directly on the price of an e-book, then either B&N moves towards turf that exploits its physicality, or Amazon moves towards turf that exploits its virtuality.

Or not.

Tuesday, July 23, 2013

Automatic Diagnosis Machine -- FDA debate

Today’s post is about a NYT article describing a new medical device to detect melanomas, the factors that affect the FDA’s decision whether to approve it, and individual doctors’ decisions whether to adopt it.

One thing that struck me about the quotations from various FDA officials and doctors is that academics -- especially in fields such as information systems and industrial engineering -- may have a lot to offer, and that as a community, we may want to think about how to make our knowledge more visible and available to policy makers.

One example that struck me was a thread of quotes about the machine’s rate of false positives. A member of the FDA panel expressed concern that the false-positive rate was too high. But anyone with an understanding of the technology will realize that this rate is trivial to alter, and that the key metric is not false positives or false negatives in isolation -- since either of these can be trivially set to zero -- but some combined measure of them both (e.g. ROC, average precision, etc.). It is hard to imagine -- but seems to be the case -- that the FDA panel did not know this. It also appears that the FDA was not provided with information that properly compares the machine against a human on such a combined measure. This is scary to me.
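To see why either rate in isolation is meaningless, consider a toy sketch (the risk scores below are invented, not from the FDA case). A detector that thresholds a risk score can drive either error rate to zero just by moving the threshold:

```python
def rates(scores_neg, scores_pos, threshold):
    """False-positive and false-negative rates of a score-threshold detector."""
    fp = sum(s >= threshold for s in scores_neg) / len(scores_neg)
    fn = sum(s < threshold for s in scores_pos) / len(scores_pos)
    return fp, fn

# Hypothetical risk scores: benign lesions vs. melanomas (numbers invented).
benign = [0.1, 0.2, 0.3, 0.4, 0.6]
melanoma = [0.5, 0.7, 0.8, 0.9]

# Flag everything: zero false negatives, but every benign case is an alarm.
print(rates(benign, melanoma, 0.0))   # (1.0, 0.0)
# Flag nothing: zero false positives, but every melanoma is missed.
print(rates(benign, melanoma, 1.0))   # (0.0, 1.0)
# Any sensible evaluation must trade the two off, e.g. along an ROC curve.
print(rates(benign, melanoma, 0.55))  # (0.2, 0.25)
```

This is why a complaint that "the false-positive rate is too high" is, by itself, uninformative: the interesting question is what false-negative rate the machine achieves at that operating point, compared with a human at the same point.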

A slightly subtler thread that runs through the article, is about how the device is likely to be used. The argument is raised that doctors may get lazy and rely on the machine, in which case one has merely replaced the person with a machine. But let’s consider this argument in more detail. First, even if this is true, the machine may be better than the human, though I am still surprised that the FDA is asked to approve or reject a device without a clear answer to which (person or machine) works better when acting alone. But there appears to be lurking a second, stronger version of this argument, according to which the result of human+machine is worse than human alone (or machine alone). That is to say, the increased laziness of the human more than offsets the benefit of his/her having a better predictor. Can this be? Is there research that shows a phenomenon such as this?

I have posed this question to a friend who is more expert in this area, and I will post a follow-up....stay tuned

Sunday, July 21, 2013

Web-based information and beliefs

Today's post is about how Web-based information influences beliefs.

The NYT article is about Mormon believers who encounter challenging -- heretical, to them -- information on the Web. We all encounter -- seek -- information on the Web. And there is always the general question of how people update their beliefs in response to information (e.g. under- or over-weighting prior beliefs, as studied by Kahneman and Tversky). But there are also more specific questions. First, is there anything peculiar about the extent to which Web-based information influences our beliefs, as compared with other information sources? Conceptualized more generally, this question becomes: what are the characteristics of information that make it more influential on our beliefs? Prior research gives some ideas. For example, communications research has studied Source Credibility as one such factor. In Information Systems, theories such as the Elaboration Likelihood Model have been used to characterize the process whereby information may influence beliefs.

But what is special about the case described in this article is that it represents the reader's first encounter with alternative views, i.e. alternatives to beliefs that the person had previously taken as axioms. In this case, the information that is encountered does not represent one more piece of information in a perennial stream. It doesn't just carry its informational message. It also carries an implied meta-informational message, namely, that there exist multiple competing points of view on the given issue. It is interesting to consider how this meta-message is processed, and the characteristics of meta-messages such as this that might make them more influential. In other words, are the characteristics that are important for a message to convince us to adopt a particular side or position (e.g. in ads), also important for a meta-message to convince us that an issue *has* debatable sides?

Leaving aside this academic question, the article hits home for many of us who have developed ways of 'coping' with the tensions between information and religious tradition. Here's a personal anecdote. My wife was brought up without any of the belief systems we associate with religion. Soon after we'd moved to Israel and she'd begun to study about Judaism for her later conversion, we visited the Bible Lands Museum, a wonderful private museum that is scientific in its "methodology" but whose content is the study of the lands (e.g. Mesopotamia, etc.) and cultures from Biblical times. Anyhow, my wife came across a timeline display. It was a long horizontal affair with color-coded events (ancient city so-and-so) depicted on glass. Somewhere about halfway through the timeline was the event -- one among many -- "creation of the universe". As someone brought up in a Jewish-observant but modern family, this barely registered with me. But my wife had never had to deal with this kind of dissonance, and I will never forget the unfolding look on her face. Having just begun to open her mind to studying the Bible, and without much practice in -- what to call it? -- constructive ambiguity, she was seriously shaken up; nothing made sense anymore. Until it did again.

Information is not only the stuff of economic decisions. Religion is not science, but it is surely a real phenomenon. And how people with religious beliefs process information -- especially dissonant information, and meta-information -- is amenable to scientific study. More information, as provided by the Internet, is not going to lead to the demise of religious belief. But it may very well influence it to become more questioning, as this NYT article describes.

Thursday, July 18, 2013

Big Data means...

Last week, the NYT ran a piece about technology that tracks users as they walk around a store. The idea is not new, of course. I think that casinos were some of the early adopters of this approach. They'd give you a kind of loyalty card that you'd walk around with to present for discounts, free drinks, etc., but it was actually an RFID card that allowed them to track you and learn about traffic patterns. The technology has progressed so that now a store can do the same just based on your cellphone. That doesn't give them demographic data like the loyalty card does, but they couple the cellphone location data with face-recognition software that reliably determines gender, age, race, etc., and presto!

What I want to talk about is, what makes big data "big"?

Gartner has defined Big Data as “big” in the three V's of Volume, Variety, and Velocity. Boyd and Crawford (2012) raise the notion (attributed to others) that Big Data means having the amount and kind of data that obviates the need for a priori modeling; one may simply let the data talk. This is an interesting, if debatable, perspective.

I'm not satisfied with these definitions. Well, let's take a step back. Why bother with definitions? Because Big Data seems to mean many things, and a good definition can cogently separate the different challenges and opportunities for research and practice. For example, Gartner's 3 V's have been used to frame the technical computer-science challenges that Big Data entails. The Boyd and Crawford definition highlights some technical implications for data mining. But both definitions leave me unsatisfied, partly because they highlight the new technical challenges and solutions that Big Data entails, whereas I'm more interested in characterizing the new opportunities that it brings.

With this in mind, I characterize Big Data as data that is passive in its collection and in its referent. When I say it is passive in its referent, I mean that Big Data doesn’t only capture distinguished episodes that happen to objects, but also their background state. And when I say that Big Data is passive in its collection, I mean that the data is recorded not only in response to an active trigger, but regularly, i.e. due to the mere passage of a pre-defined amount of time. Often, the data capture is done by sensors. I think the more important element is the referent: that Big Data captures an object's state.

In the context of an online consumer, Big Data represents not only a purchase, but each user click, and all the moments of inactivity in between. In a physical-store consumer setting, it represents the full history of the person’s movement through the store over time; this is the technology described in the NYT article. In a logistics context, it represents not only a transition of cargo from one modality to another, but its location, temperature, etc. at all times. In a medical setting, it represents not only episodic measures, but a continuous reading of the patient’s state on certain variables (if it is his/her whole state of DNA, then due to practical limitations it is unlikely to be a continuously updated reading, but a single or occasional snapshot). 
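The event-versus-state distinction can be sketched concretely for the logistics example (all names and numbers below are invented for illustration). Event-referent data records only transitions; state-referent data samples the cargo's background state at regular intervals, whether or not anything notable happened:

```python
from datetime import datetime, timedelta

# Event-referent data: a record exists only when something "happens" to the cargo.
events = [
    {"time": datetime(2013, 7, 1, 8, 0), "event": "loaded_on_truck"},
    {"time": datetime(2013, 7, 1, 14, 30), "event": "transferred_to_ship"},
]

# State-referent data: the cargo's background state, sampled every hour
# regardless of whether anything notable occurred (values are made up).
start = datetime(2013, 7, 1, 8, 0)
states = [
    {"time": start + timedelta(hours=h),
     "location": f"waypoint_{h}",
     "temperature_c": 4.0 + 0.1 * h}
    for h in range(7)
]

# The state stream answers questions the event log cannot,
# e.g. "was the cold chain ever broken between transfers?"
max_temp = max(s["temperature_c"] for s in states)
print(f"max temperature between events: {max_temp:.1f} C")
```

The point is that the two event records alone could never tell us what happened to the cargo in the six and a half hours in between; the state stream can.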

When considering the opportunity presented by Big Data, the question is, what additional opportunity is afforded when moving from data whose referents are pre-defined events to data whose referents describe a state?

I think about this question in light of older Information Systems literature on how to get the greatest benefit from data. A number of research areas within the field of information systems have established that the greatest gains are unleashed when newly available information facilitates entirely new processes, rather than "merely" improving decisions or the execution of existing processes. For example, this has been found in the context of inter-organizational information sharing, as when a manufacturer may use retail-level POS data not only to make better production decisions, but to bypass the distribution network altogether and make direct deliveries (Clemons and Row 1993). A similar idea is the use of supply chain information to allow make-to-order to partly replace make-to-inventory.

The early literature on the opportunity presented by Big Data (I leave completely aside the technical aspects) appears to be more oriented towards informing decisions than re-designing processes (e.g. McAfee and Brynjolfsson 2012). This perspective views Big Data as part of managing and competing by analytics (Davenport 2006).

My inclination is to think that because it represents states and not events, Big Data offers something different: something beyond improved decision-making, yet also different from redesigned business processes.

I just haven't yet figured out what that is.

Wednesday, July 17, 2013

Clinical Medical Trials and Design Science

A recent opinion piece on a seemingly unrelated topic has great methodological importance for research in the "design science" tradition. Design science is essentially engineering research, in which the researcher builds a system -- e.g. a recommender system, a data mining system, etc. -- and tests whether it works better than existing systems. It also includes research such as interface design, in which the researcher isolates and tests the efficacy of a single design element, as opposed to a whole system. We'll get back to both kinds of design science below, and in a later post, I will discuss the important differences between these two versions of design science. But first, the NYT article.

The New York Times article is titled "Do Clinical Trials Work?"

It discusses clinical trials of medicines. The purpose of a clinical trial (a so-called Phase 3 trial) is to test the efficacy of a proposed drug as part of the process of gaining approval from the FDA.

The chief concern expressed in this NYT opinion is that even after hearing the results of a study -- or indeed, of the totality of studies -- one still doesn't know which drug works best. The article implies that the reason for this is that each clinical trial tests the efficacy of a single drug, often against a placebo. This is not quite right. The reason one doesn't know which drug works best is that each drug is tested in a separate study, and it's impossible to compare the magnitude of effect in one study against the magnitude of effect in another study, due to the myriad confounding factors (e.g. different populations, etc.). Distinct from this, the reason one doesn't know which COMBINATION of drugs works best is that each drug is tested against a placebo. I will elaborate, but just to summarize till here:

PROBLEM 1: Don't know which drug works better; REASON: Each drug tested in different setting.
PROBLEM 2: Don't know which combination of drugs works best; REASON: Each drug tested in isolation, against a placebo.

Finally, the article raises a third problem, which is that one doesn't know the circumstances under which one drug may work better than another; this is attributed to the fact that drug efficacy depends crucially on the presence of ("is moderated by", in academic parlance) genetic factors that most large clinical trials don't include.

In the information systems setting the question is: what do we learn from a study that pits system-with-feature-X against a "placebo" system-without-feature-X? Well, we may learn that feature X is helpful. But we don't know whether this feature is better than feature Y, which was studied separately (PROBLEM 1). And more to the point, similar work is being done by dozens or hundreds of other researchers, each studying one or two system features, and demonstrating that they work better than ... nothing, i.e. the placebo. This leaves us with the question of which combination of features works best (PROBLEM 2).

This issue was recently raised in Norbert Fuhr's acceptance speech for his Salton Award, a kind of lifetime achievement award for work done in the field of information retrieval, aka search engines. As he noted, a study by Armstrong et al. (2009) reported that there has not been any upward trend in the overall performance of (laboratory-based) information retrieval systems over the past decade or so, in spite of an endless stream of papers reporting system features that improve performance. Now, the information retrieval field does not suffer badly from PROBLEM 1. The reason is that researchers often use standardized data sets, meaning that even when two researchers each study a different feature, they study them on the same set of documents and queries. This would be akin to two separate medical clinical trials, each testing a different drug for a particular condition, ON THE SAME SET OF PATIENTS. Obviously, this is not practicable in the medical setting, where hundreds of studies are being carried out, each in a different hospital, etc.

But like many fields in engineering, including much work in IS's design science, information retrieval does suffer from PROBLEM 2. What happens is that each year, new design features are suggested, but always in comparison with the same -- call it "placebo" -- baseline, not with respect to a system that includes all previously known good features. The result is that we are left with a sort of inventory of design features, each of which is provenly better than nothing, but with no guidance about which combination of features works best. Armstrong et al. further imply that this essentially means that the studies were "cheating", for if feature Z only works better than a placebo with "no features", but not better than a decent system that includes previously-known-to-work features, then Z cannot be said to "work" in any meaningful sense. At least, that's their view. And it leaves us with the problem of not knowing which combination of features works best. To remedy this, they suggest that each researcher should compare his/her newly proposed design against the best-performing system known to date. In other words, if I propose a new design feature Z, I should test a system that has all the features that lead to the very best performance overall but that does not include Z, against a system that has all those features and ALSO feature Z. Then, if Z adds marginal benefit, we will have learned something.
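Armstrong et al.'s worry can be sketched with a toy model (the scoring function and numbers are invented purely for illustration). When features interact, a feature that beats the placebo can still make the previous-best system worse:

```python
# Toy illustration: each "system" is a set of features, and performance
# includes an interaction term, so beating the placebo does not imply
# improving the state of the art.
def performance(features):
    score = 1.0                      # placebo baseline: no features
    if "X" in features: score += 0.5
    if "Y" in features: score += 0.4
    if "Z" in features: score += 0.3
    if "X" in features and "Z" in features:
        score -= 0.6                 # X and Z interfere with each other
    return score

print(performance({"Z"}))            # Z beats the placebo: 1.3 > 1.0
print(performance({"X", "Y"}))       # previous best system: 1.9
print(performance({"X", "Y", "Z"}))  # adding Z to the best hurts: 1.6
```

A placebo comparison certifies Z in isolation, yet the best full system excludes it; this is exactly the gap between an "inventory of features" and knowing which combinations work.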

I recently wrote a commentary in ACM SIGIR Forum, which diametrically opposes the suggestion that proper design science should test proposed new features as additions to previous-best-performing systems. I argue that the correct remedy to this situation is not to require comparisons against previous-best-performers, but to engage in more conceptual research. Conceptual research is about using theory to guide the invention and definition of variables, and how to measure them, and their relationships with other similarly conceived variables -- NOT ULTIMATE OUTCOMES -- and testing those definitions and relationships in empirical work. This is the a-b-c of scientific work in all fields, except in engineering fields where there is a tendency to test any proposal on ultimate performance measures.

Take an example from maritime engineering. Suppose a researcher proposes a new design element for a ship, e.g. a new material that results in a stronger hull. In the engineering-oriented approach of Armstrong et al. -- which is also implied in the NYT article -- the researcher must create (simulate) a full, best-performing total ship that includes all good design elements that yield the world's single top-performing ship. Then, based on that previous top-performer, he/she should see if the stronger hull adds any improvement to the ultimate outcome e.g. top speed or time between refueling or whatever is the ultimate outcome measure.

By contrast, in conceptual research, a researcher would propose specific, local variables that the increased hull strength is expected to affect. Indeed, the whole notion of "hull strength" would have to first be conceived as a meaningful variable to think about; it would have to be defined, on its own terms and in terms of its expected relationships with other variables. The researcher would propose the other variables that it directly affects, and would not (only) predict how that might affect ultimate outcomes. In academic parlance, a variable's direct relationship with other variables that are not ultimate outcomes, is called the "mechanism" through which the variable affects the ultimate outcome. For example, the researcher might propose that the stronger hull will reduce the ship's wake. A proper test of that hypothesis is a (simulation of) whether such a stronger hull indeed reduces the ship's wake. The importance of such research for shipbuilding is the hope that, under some conditions, the reduced wake might improve an ultimate performance measure; but that would be outside the scope of the described research.

In this conceptual world, it is not only unnecessary, but counter-productive, to test instead whether the stronger hull led to improvement in some ultimate performance measure such as top speed. Unnecessary, because we are trying to learn how things work. And counter-productive, because it might very well be that the so-called "previous best" ship would be better if we had REMOVED one of its supposedly great features, and INSTEAD used the stronger hull. It is the nature of scientific work to study direct connections between local variables, and in this effort, it is perfectly correct to use a placebo as the baseline. This is not "cheating" because the aim is not to show that my system is the winner, or to say that ships with stronger hulls will be better on some ultimate performance criterion. Rather, the aim (in this example) is to see whether a stronger hull actually reduces the ship's wake. Other researchers will do something similar, studying different sets of local variables, such as how wake interacts with wind, or what have you. Armed with these separate understandings of how things work, we may be able to predict which combinations of elements work well, under which circumstances. It's not trivial, but we're in a much better position than if we had conducted experiments that only measure ultimate performance. Each piece of conceptual research contributes insights into how things work. Then we might be able to theorize and hypothesize which combinations make sense together. This is called science, and not all engineering fields are steeped in the tradition of conceptual research.

To summarize, in conceptual research, we learn how things work, and this will ultimately guide us about what combination to expect to work. By contrast, in the horse-racing approach that dominates some engineering fields, we are indeed left with an inventory of features, but no guidance about how they work, and so no guidance about which combinations may be expected to work best.

In the medical field, actually, I think there is a strong tradition of conceptual research to complement the measurement of ultimate outcomes. Clinical trials are like those engineering studies that try to measure an ultimate performance measure. But in medicine, those same clinical trials often also measure the many layers of causal mechanisms (what led to longer life? reduced tumor size during x months; what led to that? increased susceptibility of cancer cells to destruction by X; what led to that? increased Y, supplied by the drug being tested). Thus, in the medical field like in engineering fields, the ultimate performance measure of a single study has limited meaning. X worked better than nothing, but is it better than alternative Y that was studied elsewhere? Hard to know. Is it best to use X in combination with A or in combination with B or neither? Will a regimen of X added to A, offer any benefit compared with A alone? Don't know, based solely on the clinical trial's ultimate performance measures. But the answer is not to require clinical tests to compare the addition of X to the so-called previous best performer. Rather, the answer is to focus -- as the medical field does -- also on the mechanisms, the less-than-ultimate performance measures, which explain how things are working. This yields guidance about what combinations of drugs might work best. I am no expert, but I believe that medical research, including clinical trials, does not limit itself to ultimate performance measures. Therefore, I think the situation in the medical world is not as bleak as the article portrays it. I think they accumulate knowledge of mechanisms, and this serves as the basis for contemplating what combinations might work well. In engineering design science, I am less sanguine that researchers appreciate the benefit of conceptual research.

To summarize, in design science as in all science, the most important research is conceptual research that studies mechanisms, i.e. directly related variables. This is the best way to make sustained progress on the level of whole-systems, because it guides us about which combinations of features make sense together.