
Insignificant ‘statistical significance’

from Lars Syll


We recommend dropping the NHST [null hypothesis significance testing] paradigm — and the p-value thresholds associated with it — as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, rather than allowing statistical significance as determined by p < 0.05 (or some other statistical threshold) to serve as a lexicographic decision rule in scientific publication and statistical decision making more broadly as per the status quo, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with the neglected factors [such factors as prior and related evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain] as just one among many pieces of evidence.

We make this recommendation for three broad reasons. First, in the biomedical and social sciences, the sharp point null hypothesis of zero effect and zero systematic error used in the overwhelming majority of applications is generally not of interest because it is generally implausible. Second, the standard use of NHST — to take the rejection of this straw man sharp point null hypothesis as positive or even definitive evidence in favor of some preferred alternative hypothesis — is a logical fallacy that routinely results in erroneous scientific reasoning even by experienced scientists and statisticians. Third, p-value and other statistical thresholds encourage researchers to study and report single comparisons rather than focusing on the totality of their data and results.

Andrew Gelman et al.

As shown over and over again when significance tests are applied, people have a tendency to read ‘not disconfirmed’ as ‘probably confirmed.’ Standard scientific methodology tells us that when there is only, say, a 10% probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more ‘reasonable’ to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give roughly the same 10% result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.

We should never forget that the underlying parameters we use when performing significance tests are model constructions. Our p-values mean nothing if the model is wrong. And most importantly — statistical significance tests DO NOT validate models!
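The point that p-values mean nothing when the model is wrong can be made concrete with a small simulation. In the sketch below (my own illustration, not from the post; the function name and parameters are arbitrary choices), the null hypothesis of zero mean is *true* in every run, but the test's i.i.d. assumption is violated by AR(1)-correlated errors, so the nominal 5% test rejects far more often than 5%:

```python
import numpy as np

rng = np.random.default_rng(0)

def reject_rate(rho, n=100, sims=2000):
    """Fraction of simulations in which a naive z-test rejects H0: mean = 0.

    The true mean is 0 in every simulation, so the null is TRUE; but the
    test assumes i.i.d. observations, which is wrong whenever rho != 0.
    """
    rejections = 0
    for _ in range(sims):
        e = rng.standard_normal(n)
        y = np.empty(n)
        y[0] = e[0]
        for t in range(1, n):
            y[t] = rho * y[t - 1] + e[t]   # AR(1) errors; mean is still 0
        z = y.mean() / (y.std(ddof=1) / np.sqrt(n))
        if abs(z) > 1.96:                  # nominal 5% two-sided test
            rejections += 1
    return rejections / sims

print("iid errors (model right):  ", reject_rate(rho=0.0))  # near 0.05
print("AR(1) errors (model wrong):", reject_rate(rho=0.9))  # far above 0.05
```

With correlated errors the naive standard error badly understates the true variability of the sample mean, so small p-values pour out of a test of a true hypothesis — the p-value is reporting the failure of the model's assumptions, not anything about the world.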

In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p – 1 degrees of freedom in the numerator and n – p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this: if the model is right and the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:
i) An unlikely event occurred.
ii) Or the model is right and some of the coefficients differ from 0.
iii) Or the model is wrong.
So?
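The quoted warning can be illustrated with the well-known "Freedman's paradox" screening experiment: regress pure noise on many pure-noise regressors, keep only the ones that look significant, and rerun the regression. The second-pass F-test then comes out "significant" far more often than its nominal level, even though there is no structure at all. The sketch below is my own minimal version of that experiment (function names, thresholds, and sample sizes are my choices, not the author's):

```python
import numpy as np
from scipy import stats

def t_stats(X, y):
    """OLS t-statistics for a no-intercept regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - X.shape[1]
    sigma2 = resid @ resid / df
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta / se

def screened_f_pvalue(rng, n=100, k=50):
    """One run: y and all k regressors are independent pure noise."""
    X = rng.standard_normal((n, k))
    y = rng.standard_normal(n)
    keep = np.abs(t_stats(X, y)) > 2.0       # screen on first-pass t-stats
    if not keep.any():
        return 1.0                           # nothing survived screening
    Xs = X[:, keep]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    p = Xs.shape[1]
    r2 = 1.0 - (resid @ resid) / (y @ y)     # uncentered R^2 (y has mean ~0)
    F = (r2 / p) / ((1.0 - r2) / (n - p))
    return stats.f.sf(F, p, n - p)           # p-value of the F-test

rng = np.random.default_rng(1)
pvals = [screened_f_pvalue(rng) for _ in range(200)]
share = np.mean([pv < 0.05 for pv in pvals])
print(f"share of pure-noise runs with 'significant' F: {share:.2f}")
```

The "significant" F here corresponds to possibility (i) above amplified by selection: the model is wrong (nothing is related to anything), yet the F-test, which takes the model and the screened regressors as given, routinely declares victory.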

  1. Rhonda Kovac
    January 11, 2019 at 9:30 pm

    That extensive and pervasive errors and misrepresentations, deliberate or otherwise, in the application of these kinds of statistical tests are present — and that such errors are not precluded by our method and its rules — indicates more than just that scholars are not being conscientious, professional or honorable enough.

    It points to a serious deficiency in the method itself. Such things should not be permitted to occur, certainly not in as widespread a fashion as is happening now.

    A further extension/elaboration of scientific method is sorely needed for economics, and for the other social sciences, if they are to be truly valid and productive.

  2. Helen Sakho
    January 13, 2019 at 2:41 am

In addition to the comment just posted, I thoroughly agree. Economics needs to reform itself thoroughly or it will implode from within. And it seems this is already happening as it is increasingly discredited as a science. Cross posting may occur, for which I apologise.

  3. January 15, 2019 at 4:51 am

Ladies – Right you are. However, a more surgical analysis seems in order. Unless I am mistaken, anything less than realistic analysis is a bogus science masking a parasitic scam. The cause of the problem is clearly the pandemic spiritual illness we can call the culture of corruption and pandemic ecocidal delusion. “Further scientific method” lacking qualitative ecometrics to account for the vast majority of human interests, aspirations, intentions & motivations will not cure the kleptocracy of plutonomy and faux-scientific economics (actually plutonomics). Why? The proper scope of a real science of cultural holontology (ethical economics) must be focused on the realities and causal factors of personal activity (the actual micro-level of economics), and on communal activities and interaction among communities. Other than ecological factors, what else determines the course of cultural economies?

