Validity and reliability


There are several ways to increase and measure the validity of your survey. Careful preparation of your survey improves its validity before you send it out. You can also build safeguards into the survey itself, such as a control question or an "Other" answer option. Finally, statistical analysis of the survey results allows you to measure how valid they are.

The difference between validity and reliability

Survey results can be reliable but invalid.

For example, if a set of weighing scales consistently measured the weight of an object as 500 grams over the true weight, then the scale would be very reliable, but it would not be valid (as the returned weight is not the true weight). For the scale to be valid, it should return the true weight of an object. This example demonstrates that a perfectly reliable measure is not necessarily valid, but that a valid measure necessarily must be reliable. (source: Wikipedia)

Reliable results mean that the results are consistent. They are reproducible and have a low rate of errors.

Validity is "the degree to which a test measures what it claims, or purports, to be measuring". Even if the results are reliable, your survey can still be invalid if its results are not an answer to the real questions you want to see answered.

Improving validity before you administer your survey

Survey preparation

Your first concern must be: will the survey results tell you what you want to know? To make sure, formulate your questions carefully and ensure the possible answers make sense to the participant. You can only know for certain if you test your survey beforehand.

Selecting the right audience

Think about what you want to know and who you want to know it from. If you want to know how experienced users feel about certain features, recruit experienced users. If you want to know from accounting experts how they feel about your new accounting app's features, recruit accounting experts.

It sounds obvious, but I've seen Kano surveys that were open to everyone and his dog. I don't want to be the manager who makes decisions based on the outcome of those surveys.

Introducing a control question

One way of assuring yourself of the validity of the results is to add a control question to your survey. If there's a feature you're absolutely certain is a Must-Be feature, add it to the survey. Be absolutely sure it is a Must-Be feature, however: don't use it as a control question if you can think of any reason why someone might give a pair of answers that categorizes the feature differently.

If, after you've run the survey, your control question indeed comes out as a Must-Be feature, you'll have more confidence in the validity of the other features' categories. The sketch below shows one way to automate that check.
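
As a minimal sketch, assuming the standard Kano evaluation table and hypothetical answer labels, you could automate the check like this:

```python
from collections import Counter

# The standard Kano evaluation table. Rows: answer to the functional
# question; columns: answer to the dysfunctional question.
# A = Attractive, O = One-Dimensional, M = Must-Be, I = Indifferent,
# R = Reverse, Q = Questionable.
TABLE = {
    "like":     {"like": "Q", "expect": "A", "neutral": "A", "tolerate": "A", "dislike": "O"},
    "expect":   {"like": "R", "expect": "I", "neutral": "I", "tolerate": "I", "dislike": "M"},
    "neutral":  {"like": "R", "expect": "I", "neutral": "I", "tolerate": "I", "dislike": "M"},
    "tolerate": {"like": "R", "expect": "I", "neutral": "I", "tolerate": "I", "dislike": "M"},
    "dislike":  {"like": "R", "expect": "R", "neutral": "R", "tolerate": "R", "dislike": "Q"},
}

def categorize(functional: str, dysfunctional: str) -> str:
    """Map one participant's answer pair to a Kano category."""
    return TABLE[functional][dysfunctional]

def control_question_passes(pairs: list[tuple[str, str]]) -> bool:
    """True if the most common category for the control feature is Must-Be."""
    counts = Counter(categorize(f, d) for f, d in pairs)
    return counts.most_common(1)[0][0] == "M"

# Hypothetical responses to the control question:
pairs = [("expect", "dislike"), ("neutral", "dislike"), ("expect", "dislike")]
print(control_question_passes(pairs))  # True
```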

Using "Other" as a measure of confidence

If you have added "Other" as a sixth choice for your questions, you can use the number of "Other" responses as a measure of confidence. Kano 2001 states that "if the number of 'Other' responses does not exceed 1% for every survey item, it can be certified that the survey results are extremely confident".
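
A quick sketch of that 1% rule; the item names and responses are hypothetical:

```python
# Hypothetical raw responses: one list of answers per survey item.
responses = {
    "feature_1_functional": ["like", "expect", "like", "neutral"],
    "feature_1_dysfunctional": ["dislike", "dislike", "Other", "tolerate"],
}

for item, answers in responses.items():
    share = answers.count("Other") / len(answers)
    status = "OK" if share <= 0.01 else "low confidence"
    print(f"{item}: {share:.1%} Other -> {status}")
```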

Judging the validity of the results

A chi-square test can tell you whether the answers differ meaningfully from a random spread; https://www.mathsisfun.com/data/chi-square-test.html offers an accessible introduction.

Help, the answers are all over the place!

Are they really? Answers that look scattered at first glance can still carry a statistically significant signal; the category strength and significance checks below help you tell the difference.

There are several possible reasons for answers that are genuinely all over the place. A major one is that your audience consists of distinct segments that honestly disagree about the feature. A clustering technique such as k-means can help you discover those segments, as the sketch below shows.
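
A minimal sketch with scikit-learn, using hypothetical answer data. Kano answers are categorical, so they are one-hot encoded first; this is a rough approach, and a dedicated categorical method such as k-modes may fit better:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: one row per participant with the functional and
# dysfunctional answer for a feature. In practice you would concatenate
# the answers for all features into one row per participant.
answers = np.array([
    ["like",    "dislike"],
    ["like",    "dislike"],
    ["neutral", "tolerate"],
    ["neutral", "neutral"],
])

# One-hot encode the categorical answers before clustering.
encoded = OneHotEncoder().fit_transform(answers).toarray()

# Look for two segments; in practice, compare several cluster counts.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(encoded)
print(kmeans.labels_)  # participants with the same label answer alike
```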

Determining validity

Results are invalid if they don't answer the questions you wanted answered in the first place, no matter how consistent they are.

Category reliability

Category statistical significance

You can test whether the answers for a feature are spread over the six categories in a way that differs from chance. A chi-square goodness-of-fit test (see the link above) compares the observed category counts against the counts a purely random distribution would produce.
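
A minimal sketch with SciPy, using hypothetical category counts for one feature:

```python
from scipy.stats import chisquare

# Hypothetical category counts for one feature, in the order
# Must-Be, One-Dimensional, Attractive, Indifferent, Reverse, Questionable.
observed = [9, 10, 3, 2, 1, 0]

# Without explicit expected frequencies, chisquare() tests against a
# uniform (i.e. random) distribution over the six categories.
statistic, p_value = chisquare(observed)
print(f"chi2 = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value (commonly below 0.05) means the spread over the categories is unlikely to be random noise. Note that the chi-square approximation becomes unreliable when expected counts are small; a common rule of thumb is at least five expected responses per category.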

Category strength

Lee and Newcomb (1997) state that there needs to be a minimum difference of 6% between the top two categories for the survey results to be statistically significant. Statistical significance means that the result is not the same as a random distribution, but a usable indication of customer attitudes towards the feature.

Looking back at our example survey, this means that for feature 1, the category strength is 4%. 10 out of 25 participants attributed the One-Dimensional category to the feature (40%), while 9 out of 25 (36%) attributed it the Must-Be category. The difference is 4%, and that's too little to confidently determine the feature's category.
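
A sketch of that check; the split of the remaining six answers over the other categories is hypothetical:

```python
def category_strength(counts: dict[str, int]) -> tuple[str, float]:
    """Return the winning category and the gap, in percentage points,
    between the top two categories."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    (top_cat, top_n), (_, second_n) = ranked[0], ranked[1]
    return top_cat, 100 * (top_n - second_n) / total

# Feature 1: 10 One-Dimensional and 9 Must-Be answers out of 25.
counts = {"O": 10, "M": 9, "A": 3, "I": 2, "R": 1, "Q": 0}
category, strength = category_strength(counts)
print(category, strength)  # O 4.0: below the 6% threshold, too close to call
```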

Answer reliability

Reliability refers to the scoring consistency among groups of participants. Many studies use Cronbach's alpha to measure item (question) consistency.

There is a fundamental problem with this approach. Cronbach's alpha is calculated from the variances of the answers. But the answers on a Kano survey are not points on a scale: you cannot assign a meaningful numerical value to an answer, and one answer is not higher or lower than another. Using Cronbach's alpha to determine consistency and reliability is therefore wrong.

You can, however, apply the test to the customer satisfaction coefficients, since those are numerical.
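
One way to read this (an interpretation, since the calculation isn't spelled out here): compute the satisfaction coefficients separately for several groups of participants and treat each feature as an item. A generic Cronbach's alpha over such a numeric matrix could look like this:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (groups x items) matrix of numeric scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of items (features)
    item_vars = scores.var(axis=0, ddof=1)      # variance per item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the row sums
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical satisfaction coefficients (0..1) for three features,
# computed separately for four groups of participants.
cs = np.array([
    [0.72, 0.41, 0.65],
    [0.70, 0.45, 0.60],
    [0.75, 0.38, 0.68],
    [0.69, 0.44, 0.63],
])
print(f"alpha = {cronbach_alpha(cs):.2f}")
```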
