The following responds to a request by the Intellectual Property Court of the Russian Federation (IPC) for public comments on the Court’s development of methodological recommendations for the judicial assessment of consumer survey evidence in American trademark infringement cases. Survey results are typically used as evidence that a trademark is recognized as famous, distinctive, or a designation of origin or source, or that the mark is descriptive or generic for particular goods and services. The purpose of developing these recommendations is to determine how such surveys should be conducted and what factors should be considered when evaluating them. The Court is particularly interested in:
- how respondents are selected;
- who are the target consumers;
- what questions are appropriate;
- how questions should be formulated so as not to lead consumers to a particular answer;
- how retrospective surveys are conducted to confirm the existence of circumstances on a certain date (for example, on the priority date of a trademark);
- how correspondence to and from the survey expert should be evaluated in considering the merits of the survey; and
- any other relevant factors to consider in the assessment of consumer surveys.
Introduction. In U.S. courts, a properly conducted survey is “direct” evidence of consumer perception, as opposed to “circumstantial” evidence such as the nature and extent of sales and advertising, testimony from dealers and experts, and unsolicited publicity. Consumer surveys in trademark infringement and false advertising cases are common, methodological best practices are published by the Federal Judicial Center,1 and the leading authority – from which the following comments are derived in whole or part without further attribution – is McCarthy on Trademarks and Unfair Competition.
General Rule. The basic rule is that a professionally conducted survey which is relevant to the facts will be admitted into evidence, and that any deficiencies in the survey will be taken into consideration in balancing its weight and credibility. A proper foundation is established by a showing of (1) relevance and (2) the employment of accepted principles in the conduct of the survey.
Once a survey is admitted, issues of methodology, survey design, reliability, and the merits of its conclusions go to the weight of the survey rather than its admissibility. Appellate courts review survey evidence for its probative value, considering also the questions asked and the experience of the surveyor.
The principal purposes for which surveys are introduced in trademark and false advertising cases in the U.S. courts are to prove or disprove that –
- A mark is generic;
- A mark has acquired distinctiveness or “secondary meaning”;
- The other party’s use of a similar mark is likely to cause consumer confusion;
- A mark is famous for purposes of anti-dilution and is likely to be blurred, tarnished or otherwise diluted by the other party’s mark; or
- An advertisement has a particular meaning to the general public or relevant population.
The main duty of the court when considering survey evidence, as with any scientific evidence, is to ensure that the evidence is based on scientifically valid principles and is relevant to the facts of the case. The Court must function as a “gatekeeper” to ensure the reliability and relevancy of expert testimony and to confirm that the expert, whether basing testimony upon professional studies or personal experience, employs in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field.
These principles appear in Rule 702 of the Federal Rules of Evidence, which provides that “[i]f scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise, if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.”
Notwithstanding the above, the court also has a duty, in exercising its “gatekeeping” function, to exclude a defective survey report from being received into evidence. A survey may be excluded if the person who conducted it was not an expert, or if the survey was so carelessly designed and conducted that it fails the basic test of professionalism and reliability. The exclusion of consumer surveys from evidence is seen most often in jury cases. An appellate court can find that the district court abused its discretion by placing reliance on a deficient survey, and a trial court may order a new jury trial because of a defective survey even after denying a pre-trial motion to exclude the survey from evidence.
The Relevant Universe. The first step in evaluating a survey is to determine whether the relevant “universe” has been studied. Even if the right questions are asked in the right way, the results are irrelevant if the wrong persons are asked. A sample is then selected from that universe. A census puts questions to each member of the universe; a survey instead takes a “sample” by selecting and questioning a subset of the universe.
In most trademark infringement cases, the relevant universe is potential buyers of the junior user’s goods or services, but in a case of “reverse confusion,” the relevant universe is the senior user’s customer base. The universe may be narrowed by geography, area of commerce, market segment, purchaser buying habits, or any other factor based on legal precedent and/or common sense, depending on whose state of mind is at issue.
A survey can be a “probability” or a “non-probability” survey, depending on the mathematical ability to statistically project the results to the universe as a whole with a known degree of error. Almost all consumer surveys in trademark and false advertising cases are nonprobability surveys, and most are conducted over the internet. The burden is on the proponent of the survey to show that the universe has been sampled in conformance with accepted statistical standards. Relevant factors include whether:
- the population was properly chosen and defined;
- the sample was representative of the population;
- the data gathered were accurately reported; and
- the data were analyzed in accordance with accepted statistical principles.
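One conventional way to test the second factor, whether the sample is representative of the population, is a chi-square goodness-of-fit comparison of the sample’s demographic mix against known population shares. The following is a minimal sketch with invented figures (the age groups, shares, and counts are all hypothetical), offered only to illustrate the kind of check a survey expert might perform, not a prescribed method:

```python
# Hypothetical example: does a survey sample's age mix roughly match the
# target consumer population? All figures below are invented.
population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}  # assumed census proportions
sample_counts = {"18-34": 125, "35-54": 160, "55+": 115}        # respondents per group

n = sum(sample_counts.values())

# Chi-square goodness-of-fit statistic: sum over groups of
# (observed - expected)^2 / expected, where expected = n * population share.
chi2 = sum(
    (sample_counts[g] - n * population_share[g]) ** 2 / (n * population_share[g])
    for g in population_share
)

# The 5% critical value for 2 degrees of freedom is about 5.99; a statistic
# below it means the sample mix is statistically consistent with the population.
print(f"n = {n}, chi-square = {chi2:.2f}, consistent = {chi2 < 5.99}")
```

With these invented counts the statistic is small, so the sample’s age mix would not, by itself, undermine representativeness; a large statistic would signal that some group is over- or under-represented.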
Laying the foundation for survey evidence requires expert testimony, and the Court must give consideration well in advance of trial to whether the admission of such evidence should include the underlying data and documentation.
Methodology. While confusion of non-purchasers may be relevant, a survey is typically designed to prove the state of mind of a prospective purchaser. Most surveys do not measure actual confusion. Surveys only give information about a controlled and artificial world from which the jury or Court is asked to draw inferences about the real world.
The court may be able to draw helpful inferences from a survey of randomly selected pedestrians in a shopping mall who are queried about pictures of two products, but that does not mean that their responses are direct proof of the responses of actual consumers. Courts have rejected surveys on the grounds that the interviewee was “not in a buying mood” and that many people “do not take the same trouble to avoid confusion when they are responding to sociological investigators as when they spend their cash.”
Notwithstanding, the vast majority of professionally conducted surveys are admitted into evidence in trademark infringement and false advertising cases, and the more closely the survey methods approximate the conditions in which an ordinary purchaser would encounter the trademark, the more persuasive the survey results.
The classic location for an admissible “intercept” survey is the shopping mall because persons encountered there are supposedly “in a buying mood,” even though that is not a necessary element of an admissible survey. Today, consumer surveys are typically conducted on the Internet. A consumer who clicks on a link to purchase a good is the equivalent of a consumer at a store who chooses a product and takes it to the cashier to buy it.
Telephone surveys, which are less expensive, are also admissible where the inability to visually compare images is not relevant. A telephone survey conducted by an expert must identify (1) the procedures that were used to identify potential respondents; (2) the number of telephone numbers where no contact was made; and (3) the number of contacted potential respondents who refused to participate.
Pilot Results. Before incurring the substantial expense of a typical consumer survey, the expert typically conducts a pre-test or “pilot test” to evaluate the viability of the survey method, structure and questions. In the pilot test, a proposed survey is administered to a small sample (typically 25 to 75 respondents). The eventual survey is typically changed and re-worded as a result of feedback and insights from the pilot test.
Ordinarily the expert does not disclose or discuss the existence or scope of pilot testing in his or her expert report, but the proponent of the survey must prepare to disclose such changes if requested. However, there is no per se negative inference from changes in methodology or questions after the pilot test. As stated by the Federal Judicial Center, “A more appropriate reaction is to recognize that pilot work is a standard and valuable way to improve the quality of a survey and to anticipate that it often results in word changes that increase clarity and correct misunderstandings. Thus, changes may indicate informed survey construction rather than flawed survey design.” On the other hand, if a survey is significantly altered after the pilot test to improve the survey numbers, then the expert must have a credible explanation.
Probability vs. Non-Probability Sample. A sample from a universe may be a probability sample or a nonprobability sample. Although each may be projected to the universe, only a probability sample can achieve that by mathematical and statistical probability models. When speaking of “sampling error” or “margin of error,” the reference is to a probability sample. A probability survey requires the mathematically random selection of subjects from the universe such that each person has a known probability of being selected, resulting in a statistical projection to the universe with a known degree of error.
The classic example of a probability sampling is the telephone survey, in which telephone numbers in a given territory can be randomly selected. However, telephone surveys have been eclipsed by the advent of smartphones. Companies that sell lists of available phone numbers for random dialing exclude cell phone numbers, caller ID and call blocking make it difficult to reach numbers, and even when a cell phone is answered, fewer and fewer people are willing to spend time responding to survey questions. Person-to-person probability surveys, in turn, are difficult, costly, and rarely encountered in trademark infringement and false advertising cases.
The typical survey in a trademark or false advertising case is a nonprobability sampling survey, sometimes called a “mall intercept” survey, which does not require the random selection of respondents or result in a mathematical projection to the universe. Selection of respondents results from their availability at a designated location. The characteristics of respondents are recorded, e.g., gender, age, income level, and education, and interviewing continues until a target quota of persons with specified characteristics is reached. Simply handing out survey forms to anybody who stops by a booth at a shopping mall or a trade show falls short of a reasonable sample.
Nonprobability sampling is the most common type of in-person survey seen in trademark and unfair competition litigation. In most cases they are admitted into evidence under Federal Rule of Evidence 703, which allows an expert opinion to be based on evidence which is not admissible if that evidence is “of a type reasonably relied upon by experts in the particular field in forming opinions or inferences upon the subject.”
Internet Surveys. Surveys conducted over the internet are increasingly common and, like any other survey, are admitted into evidence when conducted according to generally accepted principles. Internet surveys are customarily administered to large panels of volunteers who supply demographic information and data about their use of particular goods and services. The participant responds to periodic surveys of different kinds while he or she is a member of the panel.
As with any survey, an Internet survey must replicate the conditions under which potential purchasers are likely to encounter the trademark, but this can be done with video clips and images. Internet surveys make it possible to rapidly and inexpensively screen large populations and draw upon a geographically diverse population, and to eliminate the potential for interviewer mistakes and intentional or inadvertent biasing of responses. Their principal drawback is that they reduce the ability to probe the respondent’s responses by means of follow-up questions.
Attorney Participation. While it is improper for an attorney to single-handedly design and conduct a survey without the assistance of a professional, attorney cooperation with the expert is essential to the efficacy of the survey. In short, the attorney may and must participate in the design of the survey but is prohibited from conducting it.
Examples of Methodological Defects. Courts in the U.S. have rejected or discounted the weight of survey evidence for a wide variety of reasons, for example, questioning consumers in a supermarket where the interviewee could see, or had just seen, the goods and labels in question, and questioning interviewees who had just purchased the party’s product.
The most common reason stated by U.S. courts for excluding or criticizing survey evidence on methodological grounds is ambiguity or lack of relevance in the questions, for example:
- asking whether the respondent thinks that marks are “too similar” or that the trademark owner will be “successful in its legal fight”;
- asking the respondent to rank “similarities” between marks on a scale of 1 to 7;
- showing photographs that are not true representations of a product;
- questioning respondents about assumptions about the source of goods without showing them the point-of-purchase display cards that are present in stores;
- asking about a logo without showing it as it appeared on the product;
- asking about marks as they appear in print without reference to the surrounding packaging and house marks;
- asking about an unpackaged product that is typically encountered in distinctive packaging;
- giving respondents less information than they would actually receive in a real purchasing situation;
- trying to prove secondary meaning for trade dress in a product feature by showing the whole product, including non-protectable, functional features;
- showing respondents the marks side-by-side where that is not how the products appeared in the marketplace; and
- asking respondents whether they found a term to be “offensive” instead of “disparaging.”
Personnel and Tabulation Problems. Surveys have been rejected by American courts on grounds such as:
- treating inconclusive responses as definite, unqualified responses in final survey tabulations;
- using low-paid, part-time, nonprofessional investigators who may have exercised poor judgment in interpreting ambiguous interviewee responses;
- failing to verify the validity of responses; and
- changing recorded responses and failing to record unfavorable responses.
Bias. The fact that a survey is conducted specifically for the purpose of creating evidence for particular litigation does not lessen its weight in any way. However, American courts have found problems with inducing survey responses by flattering letters and by offers of prizes in return for prompt replies, and by discounts on products.
Copyright Problems. In some survey formats, to replicate actual materials as used in the marketplace, a party will reproduce and use an advertisement or promotional piece of its adversary. It has been held that such a use constitutes a “fair use” that is not an infringement of the adversary’s copyright in the advertisement or promotional materials.
Memory Tests. Surveys have been rejected by American courts for doing little more than testing respondents’ memory of a controlled exposure to a trademark stimulus. In one case, Phase One of the survey involved showing respondents advertisements of four companies, including the plaintiff. Then in Phase Two, respondents were shown advertising promotions of three companies, including defendant and asked, “Was there a product or service in the booklet I showed you [in Phase One] that is from the same source or company as [shown in this exhibit in Phase Two]?” Positive answers linking plaintiff and defendant were counted by the survey taker as evidence of likely confusion. A “noise” or “error” rate was developed which was then deducted. The court found that the results were only of “slightly probative value.”
Insufficient Number of Respondents. Conducting a survey with a number of respondents too small to justify a reasonable extrapolation to the target group at large will lessen the weight of a survey. If the sample is too small, the results may not be able to be projected to the whole universe at issue in the case.
A small sample size increases “sampling error,” the imprecision inherent in a sample due to random variation. This imprecision is often expressed as a “confidence interval.”
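For a probability sample, the confidence interval around an observed proportion can be computed with the standard normal approximation. The sketch below uses invented numbers (a hypothetical 30% rate of confused respondents) purely to illustrate how the interval, and hence the reliability of any extrapolation to the universe, depends on the number of respondents:

```python
import math

def confidence_interval(p, n, z=1.96):
    """Normal-approximation 95% confidence interval for an observed
    proportion p from a probability sample of n respondents."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Invented example: 30% of respondents appeared confused. The interval
# narrows roughly with the square root of the sample size.
for n in (50, 200, 1000):
    low, high = confidence_interval(0.30, n)
    print(f"n={n:4d}: 30% +/- {100 * (high - low) / 2:.1f} points "
          f"({100 * low:.1f}%-{100 * high:.1f}%)")
```

With only 50 respondents the margin exceeds 12 percentage points, so a reported 30% could plausibly reflect anything from under 20% to over 40% in the universe; at 1,000 respondents the margin shrinks to under 3 points. This is why a sample that is too small cannot reasonably be projected to the whole universe.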
Viewing the Stimulus While Questioning. In some surveys the interviewer will remove the stimulus (a “memory test”) and in others will keep it in the respondent’s view while questioning (a “reading test”). Keeping the stimulus in view while questioning is usually only appropriate to assess point-of-sale confusion under circumstances where consumers make involved, deliberate, thoughtful decisions. A “memory test” comes closer to real-life buying conditions for all types of products. Consumers in the real world obtain information on their own prior to purchase without being led by interviewer questions. This can be replicated by the interviewer, prior to removing the stimulus, by instructing the respondent to look at the products, advertisements or marks at issue as they would normally when considering a purchase.
Leading questions. “Demand Effects” in a survey are produced when respondents use cues from the survey procedures and questions to infer the purpose of the survey and identify the “correct” answers. Respondents may then provide what they perceive as the correct or expected answers, to make sure that the results “come out right.” Demand effects can result from leading questions.
Secondary Meaning Questions. Survey questions should not be slanted or leading so as to lead the respondent to a desired response. For example, a survey designed to prove secondary meaning in a designation should not use questions that assume that the designation identifies a single source, since that is the issue that the survey should be designed to test. The question “What brand do you think of when you hear this slogan?” is a leading question where previous questions had already mentioned the critical brand name.
Infringement Questions Which Lead to a Desired Result. Care must be exercised in framing questions designed to see if a respondent links two entities identified by similar marks. In one case the issue was whether defendant’s proposed NOVA magazine was likely to be mistakenly connected to plaintiff’s NOVA science television program. The court held that defendant’s survey asking what the “word” Nova means rather than what the “name” Nova means “seriously undercut whatever validity the survey might otherwise have.” “Name” would have focused attention on plaintiff’s program, whereas the term “word” was said to have diffused that focus.
It is also improperly leading to expose the surveyed person to a desired response before asking the critical question about a connection between the owners of the competing marks. Responses were given little weight by a court in a survey where questions twice exposed the respondent to both parties’ marks before asking the question: “Do you feel that there is some connection between Penta Hotels and Penta Tours or do you feel that there is no connection between these companies?” The court held that the questions were “designed and placed in such a sequence so as to elicit the intended response.”
A critical survey question is improperly slanted if it has been preceded by questions which lead the interviewee “along the garden path” to the desired response. Thus, it was improper to precede the critical question with questions which, if correctly answered, elicited the very trademark that constituted the favorable response to the critical question. Another form of slanting is presenting respondents with particular similarities between the marks rather than letting them respond to the marks as a whole.
An obviously leading question is one directed to sympathetic retailers, telling them about the pending litigation, and asking them to respond to a question like: “Would it be your opinion that the use of the name SLEEX upon girdles for women would import a derogatory effect to slacks sold under the identical trademark for men so that men would not buy the slacks merchandised under that name?”
Questions Implying a Connection Between the Parties. Is it improperly leading and biased to ask in some fashion if the survey respondent thinks that goods or services bearing marks A and B are from the same source or different sources? Such a question may be improperly leading because it implies that there could be a business relationship where the respondent may previously have not thought of any such connection. For example, the question: “Do you think that there may or may not be a business connection between Beneficial Capital Corp. and the Beneficial Finance System Companies?” was rejected as a leading question, “not well suited to eliciting an uninfluenced reaction.” An even more leading question requiring a “yes-no” response was similarly excluded as evidence: “Do you believe that [the accused mark] and [the plaintiff’s mark] are likely to produce confusion in the marketplace for [these products]?”
How to Avoid Leading the Survey Respondent. The challenge in a survey that asks if there is some connection between two marks is to avoid the kind of leading-question “demand effects” noted above. Showing conflicting marks side-by-side to survey respondents can skew the results by suggesting that the question would not be asked if there were not in fact some connection between the marks. However, a “closed-ended” question is not inherently leading and can be accompanied by appropriate controls; all questions lead respondents to think, but that does not make them leading questions. A question is leading only when it leads the respondent to answer one way rather than another.
Open-Ended Questions About Brands. Caution must be exercised in evaluating the results of some open-ended survey questions about brands because respondents who merely guess will likely just play back the names of the best-known and dominant brands. A survey question directed at showing secondary meaning in FENDER guitar body shapes was not sufficiently probative, where the surveyor asked if the person knew “the company or companies that makes” the displayed guitar shape. The survey in that case was rejected because it was unknown if the surveyed person in fact associated that shape with FENDER or if the person just said FENDER because it was such a dominant brand in the market.
The Importance of Phrasing of Questions. The phrasing of a question may determine the content of the answer by the witness; so also with survey questions. To test that proposition, one surveyor asked half of his sample: “Do you think the United States should allow public speeches against democracy?” and the other half: “Do you think the United States should forbid public speeches against democracy?” In the first sample, 56% said no, the U.S. should not “allow” public speeches against democracy, but in the second sample, only 39% said yes, the U.S. should “forbid” such speeches. While to not “allow” and to “forbid” literally carry the same meaning, the drastic, legalistic connotation of “forbid,” compared with the softer import of “not allow,” produced a 17-percentage-point difference in responses.
“Aided Awareness” Survey Questions. As with direct examination of a witness at trial, in framing survey questions the surveyor must resist the temptation to lead responses to a desired end or to “help” the witness or survey respondent to reach the “correct” answer. American courts are quick to spot this defect. For example, after asking a neutral question about the interviewee’s opinion as to products sold under the defendant’s mark, a further “aided awareness” question asked the interviewee to choose from among various named products, with the desired response featured prominently in the choices. The percentage naming products made by the plaintiff jumped from only 7% on the neutral question to over 50% on the “aided awareness” question. The court held that while the responses to the neutral question were supportive of the plaintiff’s position, the responses to the “aided awareness” question were “of little probative value,” as they led the interviewee to the desired response.
The “Eveready” Format and the “Squirt” Format. Two survey formats commonly used to test for confusion of source or connection in American courts are the “Eveready” format and the “Squirt” format. A likelihood of confusion survey that makes use of neither the Eveready nor Squirt formats is likely to be rejected or to require an explanation for its basis.
The Eveready Survey Format. The “Eveready” survey format has become a widely accepted survey format for testing whether confusion is likely. Unlike the “Squirt” format, the “Eveready” survey format does not inform survey respondents what the senior mark is, but assumes that they are aware of the mark from their prior experience. The “Eveready” format is useful when the senior mark is readily recognized by buyers in the relevant universe. In cases involving strong marks, the Eveready test is the best method for fundamental cognitive and marketing reasons.
While commentators have suggested that an Eveready survey is appropriate only if the senior mark is strong and “top-of-mind,” the courts have not adopted such a restriction. Nevertheless, it is risky for a plaintiff to criticize a low percentage revealed by a defendant’s Eveready survey on the ground that plaintiff’s mark is not a “top-of-mind,” well-known trademark. This requires the court to accept that the plaintiff’s mark is not familiar enough to the target group of buyers to be the subject of an Eveready survey while at the same time being so well known to those buyers that the defendant’s similar mark will be likely to cause confusion.
Origin of the Eveready Format. To prove that consumers were likely to confuse the source of the defendant’s EVER-READY lamps with plaintiff Union Carbide’s EVEREADY branded batteries, flashlights and bulbs, Union Carbide introduced the results of a survey with the following questions:
- [Screening question to eliminate persons in the bulb or lamp industries.]
- Who do you think puts out the lamp shown here? (A picture of defendant’s EVER-READY lamp with its mark is shown).
- What makes you think so?
- Please name any other products put out by the same company which puts out the lamp shown here.
The results showed the following numbers of respondents who associated the displayed products with Union Carbide:
- by answering “Union Carbide”: 6 (0.6%);
- by indicating Union Carbide products such as batteries as being put out by the same concern: 551 (54.6%);
- total: 557 (55.2%).
While the trial court stated that the survey was entitled to “little, if any, weight,” the appellate court held that the trial court was clearly erroneous in not crediting the survey, stating, “Those who indicated that they believed other Carbide products were manufactured by the same company that produced the bulbs or lamps shown must be considered cases of confusion.” The questions were held not to be leading, for “this is not a case where the interviewer stated the similar parts of the plaintiff’s name several times in questions and then asked about the defendant company.”
Secondary Meaning. Although the original Eveready survey was not designed to prove secondary meaning, the appellate court held that it did help to prove secondary meaning because it supported the conclusion that “an extremely significant portion of the population associates Carbide’s products with a single anonymous source.” If the survey results are so strong, convincing and conclusive as to establish actual confusion, then some courts will view the results as also being evidence of secondary meaning.
Reverse Confusion Cases. In a reverse confusion case, an Eveready-type question should be asked of potential customers of the plaintiff’s products, not of the defendant’s products.
Affiliation or Connection Questions with an Eveready Survey. An Eveready survey format can be combined with additional questions probing whether there is a likelihood of confusion as to sponsorship, affiliation or approval. While these types of survey questions were not expressly addressed in Eveready, American courts have held that affiliation and connection queries are appropriate.
The Mark to Be Shown in an Eveready Survey. In infringement litigation, the mark to be shown in an Eveready survey should be the accused mark as it appears to shoppers in the marketplace, either online or in brick-and-mortar stores. But if the case involves only the issue of registration, then the mark to be shown in an Eveready survey should be the accused mark as it appears in the challenged application or registration.
The Squirt Format. The “Squirt” format presents a survey respondent with both of the conflicting marks. It does not assume that the respondent is familiar with the senior mark. The method of telling the respondent what the senior mark is can be either direct or subtle. A direct method is to ask in some fashion if the respondent thinks that goods or services bearing the parties’ marks A and B are from the same source or different sources.
Origin of the Squirt Method. In the case that gave rise to the “Squirt” label for this type of survey, the plaintiff alleged that defendant Seven-Up’s QUIRST for a noncarbonated lemonade drink infringed plaintiff’s SQUIRT for a carbonated grapefruit drink. The appellate court affirmed the finding that QUIRST infringed the senior SQUIRT mark. The trial court relied on the plaintiff’s two surveys, one in Chicago (where the defendant’s product was not available), the other in Phoenix (where the defendant’s product had been available for six weeks). In the Phoenix survey, in a grocery store, women 25 and older who had bought a soft drink that day were asked: “Do you think SQUIRT and QUIRST are put out by the same company or by different companies?” A follow-up question was then asked: “What makes you think that?” Of 476 face-to-face interviews, 23% said they thought the two soft drinks were put out by the same company, 34% thought it was two different companies and 43% said they didn’t know. There was no control group.
The “Squirt” survey method will often produce different results from the “Eveready” format for the same competing marks. For a senior user’s mark that is not readily identified by survey respondents, a Squirt survey is more likely to produce a higher level of perception that the marks identify the same or related sources. The Squirt format is the alternative for testing the likelihood of confusion between marks that are weak, but are simultaneously or sequentially accessible in the marketplace for comparison. Some argue that a Squirt survey cannot properly attempt to replicate the real world unless the two marks and products are available or advertised in reasonable proximity in the marketplace. That is, a survey asking respondents to compare two trademarks does not bear a reasonable similarity to the marketplace unless it reflects a significant number of real world situations in which both marks are likely to be seen in the marketplace sequentially or side-by-side.
Is a Squirt Method Survey Inherently Leading? A continuing challenge raised against the Squirt method is that it is improperly leading because it tells the survey respondents about a senior user’s trademark with which they are unfamiliar. A number of American courts have held that presenting survey respondents with the conflicting trademarks and asking if they identify the same or different sources is improperly leading.
Some American courts have held that, compared with the Eveready method, the Squirt approach does not properly replicate marketplace conditions because it artificially tells respondents about a senior user’s mark that they did not know about and then asks about connections with that mark. Other U.S. courts find nothing inherently wrong with a Squirt survey and hold that it should be admitted and weighed the same as any other survey data.
Product Line-Up Squirt Survey Methods. A more subtle form of the Squirt survey is a product line-up in which the survey respondent is shown (either in person or online via a computer display) an array of branded products, including the disputed brands. The respondent is asked questions about the perceived relation between the companies that sell the products with the contesting marks.
Two-Part Format. One often-used two-part format for eliciting responses as to both source confusion and confusion as to sponsorship, affiliation and connection is the following. The respondent is shown the accused product or advertisement and is asked: “What company do you think makes this product?” Responses naming the senior user evidence actual confusion as to source. Respondents who did not name the senior user are then asked a second question: “Do you think this product was approved, licensed or sponsored by another company or not?” If the answer is yes, the respondent is asked: “What company do you think this product is approved, licensed or sponsored by?” Responses naming the senior user evidence actual confusion as to sponsorship, affiliation or connection. Both questions should be followed up by the question: “Why do you say that?” Often, an examination of the respondents’ verbatim responses to the “why” question is the most illuminating and probative part of a survey, for they provide a window into consumer thought processes in a way that mere statistical data cannot, although some commentators disagree with the view that verbatim responses to the “why” question are reliable indicators of consumer perception. In one case, asking respondents if they thought that the accused product was made or sold by a company that makes another line of products yielded a majority of “no” responses, but the court said that did not prove a lack of confusion of affiliation when the respondents encountered both marks.
Leading or Non-Leading Questions? Suggesting to respondents the possibility that some form of affiliation or licensing lies behind a junior user’s operation is not necessarily leading. In one survey, to determine likely confusion, consumers were asked, “Though you may or may not have seen or heard of this restaurant, who do you believe sponsors or promotes MCBAGELS?” This was held not to be a leading question; the court held that it was not unfair to hypothesize a larger entity’s sponsorship of a defendant’s restaurant to determine whether an association with McDonald’s was triggered by the name “McBagel’s.”
“Need to Get Permission?” Questions. In the NFL case, a defendant sold unauthorized football jersey replicas imitating the design, names and colors of those licensed by the National Football League. The NFL’s survey asked several questions probing for consumer beliefs as to the sponsorship or affiliation by the NFL of these jerseys. The survey then asked the critical question: “Did the company that made this jersey have to get authorization or sponsorship, that is permission, to make it?” The court viewed the high percentage of affirmative responses as persuasive evidence of confusion as to sponsorship, affiliation or connection. The surveyor explained that it was worded in this way because it was believed that the younger and lesser educated members of the relevant universe would not understand the legal connotation of the terms “sponsored and authorized.”
Another appellate court, in a case involving the imitation of a golf course, held that the survey question “Did defendant get permission?” was a proper inquiry and was probative of confusion. The court distinguished another judicial opinion which had approved the question “Did defendant need to get permission?”, holding that the “need to get” question was “problematic” because it “allows for the consumer’s misunderstanding of the law.”
A criticism of the “need to get permission” survey question is that some consumers may think that there is a need to get permission in situations where there is no likelihood of confusion. Thus, this question only indirectly measures a level of confusion. Some courts have criticized the “need to get permission” question as asking for “the law,” arguing that what is asked should be the “factual” question: “Was permission obtained?”
The “Mystery Shopper” Method. A survey dubbed the “mystery shopper” method was judicially approved as a method of proving confusion over similar trade dress. In a case in which plaintiff alleged that defendant’s table lamp design was an infringement of plaintiff’s expensive high-style halogen lamp, interviewers masqueraded as shoppers and asked in lamp stores for the identification of a lamp shown in a photograph, under the pretext of wanting to buy such a lamp for a friend. The interviewer showed the sales clerk a photograph of plaintiff’s table lamp and asked the clerk to identify it.
Word Association Questions. American courts sometimes use the term “word association survey” in different senses, leading to confusion as to what is intended. The most obvious meaning is a survey question which merely asks, “What is the first thing that comes to mind when looking at this word?” Without further probing, such a question is meaningless and irrelevant. It is irrelevant because “calling to mind” is not the same as “likelihood of confusion.” However, a “call to mind” survey may be relevant to prove the “association” which is a key ingredient of proving a claim alleging the likelihood of dilution.
Another meaning of “word association survey” is a survey question which shows respondents a stimulus that does not accurately represent the mark in issue in the case. Responses may be so ambiguous and off-the-point as to be of little aid in resolving the issue in the case. However, using a card with a mark shown in block letters as the survey stimulus is appropriate in cases involving the right to register a mark where only the registered or applied-for marks in block-letter format are in issue, without regard to special type or background.
Another possible meaning of a “word association” survey is one in which the respondent is presented with the marks alone, out of context, and asked if there is a connection. This type of survey question can be criticized as deviating too far from actual market conditions to be helpful.
Product Line-ups. A survey method sometimes used to test for likelihood of confusion, especially in trade dress cases, involves some variation of a method in which respondents are shown a “line-up” of products or containers and asked which, if any, of them are made by the same company.
Look, Pause, Look. In one variation, respondents are shown the plaintiff’s trade dress, and then, after a short delay, shown a line-up of other brands, including the accused product. Respondents are asked if any of them are made by the same company that makes the product initially seen. This is an attempt to replicate the marketplace process of advertising exposure to a brand or trade dress, followed by being confronted in the market with both similar and differing brands or trade dresses.
For example, in one case, respondents were first shown plaintiff’s candy wrapper claimed as trade dress, then in a second room were shown a line-up of several other brands, including the accused package. Respondents were then asked if any were the one they saw previously: “Do you think candy number ——— is made by the same company as the candy I showed you a minute or two ago in the other room, or do you think it is made by a different company?” If a brand was named, a follow-up question probed for reason: “What is it that makes you think candy number ——— is made by the same company as the candy I showed you in the other room?”
The court found that results of 48% and 34% linkage of plaintiff’s and defendant’s trade dress were probative of likely confusion and supportive of the finding that a preliminary injunction should issue against the infringing trade dress. Some courts will admit such a survey only if it reflects a significant number of real world situations in which both marks are likely to be seen in the marketplace sequentially or side-by-side.
Look, Then Recall. In another variation, respondents are shown a line-up of products or brands which includes the accused mark or trade dress, but not the plaintiff’s mark. After seeing the line-up, respondents are asked to recall the brand names they had seen. Those who name plaintiff’s mark as one of the brands are counted as evidence of confusion.
An adaptation of this approach is to show respondents in Phase One advertisements of four companies, including the plaintiff. Then in Phase Two, respondents are shown advertising promotions of three companies, including the defendant, and asked, “Was there a product or service in the booklet I showed you [in Phase One] that is from the same source or company as [shown in this exhibit in Phase Two]?” Positive answers linking plaintiff and defendant will be counted by the survey taker as evidence of likely confusion, with a “noise” or “error” rate deducted.
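The tally in a two-phase format of this kind reduces to a simple computation: the percentage of respondents linking plaintiff and defendant, less the percentage who draw the same link in a control cell. All figures in this sketch are hypothetical, offered only to show the deduction step mentioned above.

```python
# Hypothetical tally for a two-phase (look, then recall) survey:
# respondents linking the defendant's Phase Two exhibit back to the
# plaintiff's Phase One advertisement are counted, then a "noise" or
# "error" rate measured in a control cell is deducted.

def linkage_rate(linked: int, total: int) -> float:
    """Share of respondents who linked the two exhibits."""
    return linked / total

test_rate = linkage_rate(96, 400)    # hypothetical: 96 of 400 linked plaintiff and defendant
noise_rate = linkage_rate(28, 400)   # hypothetical control cell with a dissimilar exhibit
net_linkage = test_rate - noise_rate

print(f"net linkage: {net_linkage:.1%}")
```

The control cell substitutes an exhibit that could not plausibly be linked to the plaintiff; whatever linkage it still produces is treated as guessing and subtracted from the test-cell result.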
Look at A, Then Look at B. This type of survey has been used with just two stimuli. In Phase One, respondents are shown the senior user’s mark and it is removed from view. In Phase Two, respondents are shown the junior user’s mark and it is removed from view. For half the group, the order of showing is reversed. Respondents are then questioned: “Do you think that the brand name you saw first and the brand name you saw second come from the same company, different companies, or are you not sure?” Those who answered “different companies” or “not sure” are probed with further questions. All respondents are asked “Why do you feel that way?” A control group is shown a stimulus different from that of the junior user.
Look at All the Products Together. After a jury verdict for plaintiff awarding over $53 million in damages, a trial court ordered a new trial because of the “fundamental unfairness” of a likelihood of confusion survey that showed a product lineup. The survey showed respondents a photo of a display of one of defendant’s accused products placed among 10 of plaintiff’s products. Respondents were asked if they thought that all of the products were from the same company. The court said this survey “did not employ a reliable methodology for measuring likelihood of confusion.”
Summary. Survey evidence is circumstantial and not direct evidence of confusion. The issue with surveys is how strongly they support a finding of likely confusion or other state of consumer perception under the facts. According to McCarthy, the most illuminating and probative parts of a survey are not the numbers and percentages generated by the responses, but the verbatim accounts of the responses. The respondents’ verbatim responses to “why” question may provide a window into consumer thought processes in a way that mere statistical data cannot.
1 Federal Judicial Center, Reference Manual on Scientific Evidence (3d ed. 2011).