
Validity (statistics)

Assessing Behavioral Changes: The Importance of Having a Baseline For Comparison

Standards for Educational and Psychological Testing. But each of these, the cause and the effect, has to be translated into real things, into a program or treatment and a measure or observational method.


At the other extreme, any experiment that uses human judgment is always going to come under question. Human judgment can vary wildly between observers, and the same individual may rate things differently depending upon time of day and current mood.
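One common way to quantify how much two observers agree is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the function name and the ratings are hypothetical and do not come from any study discussed here.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected if each rater labelled items independently,
    # using each rater's own label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of the same eight subjects by two observers.
observer_1 = ["high", "high", "low", "low", "high", "low", "low", "high"]
observer_2 = ["high", "low", "low", "low", "high", "low", "high", "high"]
kappa = cohens_kappa(observer_1, observer_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa of 1 means perfect agreement and a value near 0 means agreement no better than chance, so lower values signal the observer variability described above.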

This means that such experiments are more difficult to repeat and are inherently less reliable. Reliability is a necessary ingredient for determining the overall validity of a scientific experiment and enhancing the strength of the results. Debate between social and pure scientists, concerning reliability, is robust and ongoing. Validity encompasses the entire experimental concept and establishes whether the results obtained meet all of the requirements of the scientific research method.

For example, there must have been randomization of the sample groups and appropriate care and diligence shown in the allocation of controls.
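As a rough illustration of randomizing sample groups, the sketch below shuffles a list of participant identifiers and splits it into a treatment group and a control group. The function name, the identifiers, and the seed are assumptions made for the example; real studies often use more elaborate schemes such as blocked or stratified randomization.

```python
import random


def randomize_groups(participants, seed=None):
    """Randomly split participants into (treatment, control) groups."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant identifiers 1..20.
participant_ids = list(range(1, 21))
treatment, control = randomize_groups(participant_ids, seed=42)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because allocation depends only on the shuffle, no characteristic of a participant can systematically influence which group they land in.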

Internal validity dictates how an experimental design is structured and encompasses all of the steps of the scientific research method.

Even if your results are great, sloppy and inconsistent design will compromise your integrity in the eyes of the scientific community. Internal validity and reliability are at the core of any experimental design.

External validity is the process of examining the results and questioning whether there are any other possible causal relationships. Control groups and randomization will lessen external validity problems, but no method can be completely successful. This is why statistical support for a hypothesis is called significant, not absolute truth.

Any scientific research design only puts forward a possible cause for the studied effect. There is always the chance that another unknown factor contributed to the results and findings. This extraneous causal relationship may become more apparent as techniques are refined and honed. If you have designed your experiment to be valid and reliable, then the scientific community is more likely to accept your findings.

Researchers often rely on subject-matter experts to help determine this. In our case, the researchers could turn to experts in depression to consider their questions against the known symptoms of depression e.

In this case, the researchers could have given a questionnaire on a similar construct, such as anxiety, to see if the results were related, as one would expect. Or they could have given a questionnaire on a different construct, such as happiness, to see if the results were the opposite.

The researchers could see how their questionnaire results relate to actual clinical diagnoses of depression among the workers surveyed.

Researchers also need to consider the reliability of a questionnaire. Will they get similar results if they repeat their questionnaire soon after and conditions have not changed? In our case, if the questionnaire was administered to the same workers soon after the first one, the researchers would expect to find similar levels of depression.

However, surveying or carrying out a study with an entire population is almost impossible and very costly. Thus, a sample representative of the population is used, and the data is analyzed and then conclusions are drawn and extrapolated to the population under study.

It is important to have an appropriately sized sample to achieve reliable results and high statistical power, that is, the ability to discern a difference between study groups when a difference truly exists. An insufficient sample size is more likely to produce false negatives and inconsistent results. On the other hand, too large a sample is not recommended because it can be unwieldy to manage, and it is a waste of time and money if an answer can be accurately found from a smaller sample.
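To make the trade-off between sample size and power concrete, the sketch below uses a standard normal-approximation formula for comparing the means of two equally sized groups: n per group = 2 * ((z_alpha + z_power) / d)^2, where d is the standardized effect size. The function name and the default values for alpha and power are assumptions chosen for illustration; exact methods based on the t distribution give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group mean comparison."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller expected effects require much larger samples.
for d in (0.8, 0.5, 0.2):
    print(f"effect size {d}: about {sample_size_per_group(d)} per group")
```

Under these assumptions, halving the expected effect size roughly quadruples the required sample, which is why an undersized study so easily produces false negatives.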

The causes of bias can be related to the manner in which study subjects are chosen, the method in which study variables are collected or measured, the attitudes or preferences of an investigator, and the lack of control of confounding variables… in epidemiologic terms bias can lead to incorrect estimates of association, or, more simply, the observed study results will tend to be in error and different from the true results.

Although bias in research can never be completely eliminated, it can be drastically reduced by carefully considering factors that have the potential to influence results during both the design and analysis phases of a study. The most common types of bias in research studies are selection biases, measurement biases and intervention biases. Selection bias also may occur if a study compares a treatment and control group, but they are inherently different.

If selection bias is present in a study, it is likely to influence the outcome and conclusions of the study. Measurement bias can occur due to leading questions, which in some way unduly favor one response over another, or it may be due to social desirability: most people like to present themselves in a favorable light and therefore will not respond honestly.

When assessing behavioral changes, it is essential to have a baseline or control group for comparison. It is important to evaluate the impact of a program and determine whether the program actually had an impact, or if what happened would have occurred regardless of the implementation of the program.

Such an evaluation could yield useful information to program implementers. But it could not be considered a rigorous evaluation of the effects of the program if there are good reasons to believe that scores might have changed even without the program.
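The role of a comparison group can be illustrated with a simple difference-in-differences calculation: the change in the program group minus the change in a comparison group that did not receive the program. The scores and variable names below are hypothetical, and the sketch assumes the two groups would have followed similar trends without the program.

```python
from statistics import mean


def mean_change(before, after):
    """Change in the group mean from baseline to follow-up."""
    return mean(after) - mean(before)


# Hypothetical scores at baseline and at follow-up.
program_before = [50, 55, 60, 45, 52]
program_after = [62, 66, 70, 58, 63]
comparison_before = [51, 54, 59, 46, 50]
comparison_after = [57, 60, 64, 52, 55]

program_change = mean_change(program_before, program_after)
comparison_change = mean_change(comparison_before, comparison_after)

# Portion of the program group's change not explained by the background trend.
program_effect = program_change - comparison_change

print(f"program group change:    {program_change:.1f}")
print(f"comparison group change: {comparison_change:.1f}")
print(f"estimated program effect: {program_effect:.1f}")
```

Looking only at the program group would attribute its whole change to the program, while the comparison group shows how much of that change would likely have happened anyway.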




Internal validity - the instruments or procedures used in the research measured what they were supposed to measure. Example: As part of a stress experiment, people are shown photos of war atrocities.


Validity: the best available approximation to the truth of a given proposition, inference, or conclusion. The first thing we have to ask is: "validity of what?" When we think about validity in research, most of us think about research components. "Any research can be affected by different kinds of factors which, while extraneous to the concerns of the research, can invalidate the findings" (Seliger & Shohamy, 95). Controlling all possible factors that threaten the research's validity is a primary responsibility of every good researcher.


Internal consistency reliability is a measure of reliability used to evaluate the degree to which different test items that probe the same construct produce similar results. Average inter-item correlation is a subtype of internal consistency reliability. Research validity in surveys relates to the extent to which the survey measures the right elements that need to be measured. In simple terms, validity refers to how well an instrument measures what it is intended to measure.