The Ganzfeld Methodology: A Joint Effort in Understanding

Iyace

Member
In this thread, we'll be discussing the methodology of the Ganzfeld experiment, its susceptibility to various experimental biases, the protocols in place to control for those biases, and what we can gain in terms of experimental evidence in light of these factors.

I'm going to ask Andy to keep this thread heavily moderated to stay on topic. We'll begin the thread by introducing various biases, both in the experimental approach and in the GRADE approach. At this point it is senseless to debate these biases, as they will not all necessarily apply to the Ganzfeld; this will simply be as close to a total 'pool' of biases as we can draw from.

The second portion of this thread will deal with which potential biases actually apply to the Ganzfeld. Bear in mind that this section does not mean these biases are necessarily present in the experimental execution, but that, by the very nature of the Ganzfeld methodology, they could account for false positives. This portion should be somewhat open to debate, as there may be some disagreement about what should and should not apply, but too much time should not be spent on this section, as most biases are relatively obvious.

The third portion will be dealing with how the protocol controls for these biases. What safeguards are in effect to attempt to give us accurate results?

The last portion will discuss whether the Ganzfeld gives us high- or low-quality research. This does not mean whether the Ganzfeld was done sloppily or poorly; it has to do with whether there is a high risk of bias in the Ganzfeld. If there is a high risk, this does not necessarily mean that the Ganzfeld can't provide us accurate data, just that it must be controlled far more rigorously than a methodology with a low risk of bias. This will be the most heavily debated of the sections.

Please keep responses tasteful and tactful. When drafting your replies in this thread, assume that everyone is attempting, to the best of their ability, to control their cognitive biases. Likewise, attempt to recognize those biases in yourself when you post a reply. If a term or bias is not clear, attempt to hammer that out via private message, or post a brief question on the thread for someone to PM you and explain. We're going to try to prevent this thread from devolving into a deluge of worldview war.

If there are any questions or concerns about the way this communique will be structured, or any suggestions, feel free to post them now before we get started.
 
I'm going to ask Andy to keep this thread heavily moderated to stay on topic.

I've seen the "Mod+" marker listed as an available prefix when creating a new thread; wouldn't it have been appropriate to have used that? The whole point of the marker is to indicate threads that are meant to be eagle eyed by the administrators...
 
Iyace, why don't you put forward a paper to discuss - we can summarize it and identify areas to focus on vis-a-vis the methodology.
 
I've considered resolving the "which statistical method do we use?" sticking point in the past by creating a simulation that generates test sets with different biases, which would allow testing a particular method against data ranging from "at chance" and "barely above chance" to "very much above chance." Then, base a genetic algorithm around the different operators used in statistical analysis and let the computer try different approaches to identify the most accurate method: one able to distinguish the chance-based results from the "positive" (synthesized) results and indicate what effect size would be needed for a high confidence rating.

I suspect that the correct methods are already being used, since looking up Stouffer Z turns up some examples (and one discussion on Cross Validated suggesting that Stouffer Z tracks real hits better than chi-squared), and the statistical methods appear to check out against standard reference material supporting the validity of those algorithms. I also suspect the best way to address the Ganzfeld is to take the highest performers and replicate their particular quirks (highly creative participants, computer-controlled randomization, as examples) in a single pooled study instead of a meta-analysis. But a computer model has the potential to give a lot of mathematical detail and logs, so it might be worth considering.
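
To sketch what I mean, here's a rough Python illustration (the study sizes and hit rates are made up, and this is a sketch rather than a definitive implementation): generate batches of small studies at chance and above chance, convert each study's hit count into a z-score, and combine them with Stouffer's Z.

```python
import numpy as np
from scipy.stats import norm, binom

rng = np.random.default_rng(42)

def study_z(hits, n, p0=0.25):
    """Convert one study's hit count into a z-score via its exact one-tailed binomial p-value."""
    p_value = binom.sf(hits - 1, n, p0)  # P(X >= hits) under chance
    return norm.isf(p_value)             # equivalent z-score

def stouffer_z(z_scores):
    """Unweighted Stouffer combination: sum of the z-scores divided by sqrt(k)."""
    z = np.asarray(z_scores, dtype=float)
    return z.sum() / np.sqrt(len(z))

def simulate(true_rate, n_studies=30, trials_per_study=40, p0=0.25):
    """Simulate a batch of small ganzfeld-style studies with a given true hit rate."""
    hits = rng.binomial(trials_per_study, true_rate, size=n_studies)
    return stouffer_z([study_z(h, trials_per_study, p0) for h in hits])

print("chance (25%):       Z =", round(simulate(0.25), 2))
print("barely above (28%): Z =", round(simulate(0.28), 2))
print("well above (33%):   Z =", round(simulate(0.33), 2))
```

The same scaffolding could be used to synthesize biased test sets (optional stopping, selective reporting, and so on) and feed whatever combination method you like, genetic-algorithm-driven or otherwise.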
 
Let's stick to just experimental protocol, not statistical analysis. The paper I linked gives the general Ganzfeld procedure.
 
Let's stick to just experimental protocol, not statistical analysis. The paper I linked gives the general Ganzfeld procedure.

Considering that the single largest problem with adoption of the Ganzfeld database is that academics seem to think Wiseman-Milton is the correct analysis, and that arguments still ensue over pooling tests with a meta-analysis (and deciding what counts for inclusion or not) instead of running a single large study with identical design, it seems a little odd to remove the topmost problem from the deck. :eek:

The designs haven't really been much of a problem for a few years or more; it's a matter of removing the variance and running a single design as a pooled study, with identical implementations and different researchers per site. As far as I'm aware, off the top of my head, the largest Ganzfeld has about one hundred participants, while a statistical power calculator indicates at least 384 would be required for 95% confidence and a 5% margin of error. Having the experimenters each run a subset allows cooperation between skeptic and proponent circles without having to directly interfere with each other's labs, and agreeing to the testing protocol ahead of time prevents any goal-post-moving silliness.

Radin et al. have access to true-random generators, so the randomization of display order (which skeptics have argued was flawed in the past) is done and over with; it just requires specifying in the protocol that all decisions are made using the same model of true-random generator. There's no more 'tape degradation' sensory leakage (digital copies don't age) or 'hearing the whirrs' (use a solid-state drive and pre-cache a random number of red herrings; they don't make sound), which means those suggestions by skeptics are even less viable in modern studies. Doing a subset where each experimenter essentially gets the lab to themselves for that number of samples rules out possible arguments for "skeptic doubt" causing losses. Then you come back with the results, take them to the pre-agreed success/analysis criteria, and pool the whole thing under a single report (zero heterogeneity arguments whatsoever, because it's all identical protocol).
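
For reference, I suspect the 384 figure comes from the standard margin-of-error formula for a proportion with the worst-case assumption p = 0.5; a quick Python check, assuming that's what the calculator uses:

```python
import math
from scipy.stats import norm

confidence = 0.95
margin = 0.05
p = 0.5  # worst-case proportion, which maximises the required sample size

z = norm.ppf(1 - (1 - confidence) / 2)    # two-sided critical value, about 1.96
n = (z ** 2) * p * (1 - p) / margin ** 2  # margin-of-error sample size formula
print(round(n, 1))                        # about 384, hence "at least 384"
```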

That you can peel a success or a failure out of picking which tests to meta-analyze is their most prevalent and powerful argument right now; if you had something like the Wiseman-Schlitz paper done for the Ganzfeld as I just described, and if that returned positive results (why wouldn't it?), what choice do they have other than "Game over, anomalous cognition has been modeled and we're out of arguments"? A successful version of what I just described would also be additive, since its validity would add to the total database's validity in meta-analysis and basically state that there is no "sum of small rounding errors" situation going on.

I'll look into the paper in the morning, but I suspect I'll see what I already said: very solid protocol with near-nitpick levels of flaws, plagued by small sample sizes. I need to sleep now.
 
You're missing the point of the thread. We're going to start by all having an understanding of the Ganzfeld protocol, and then we're going to discuss various biases that can exist during experimentation.
 
Iyace,

I think you may have hit the nail on the head here! The Ganzfeld procedure seems so watertight that none of the skeptics want to argue which specific aspect of the procedure supposedly biases the outcome from 25% to over 30%. I'd like to read a convincing explanation as to how this experiment can produce the results that it does because of some flaw!

David
 
I'm happy to discuss the ganzfeld methodology, but my understanding was that Iyace wanted first to discuss experimental biases in general, especially with reference to the GRADE approach (and then, only after that, see how that applies to the Ganzfeld work). Since I know very little about that, I didn't contribute.
 
Same kind of thing for me (although I am very familiar with the GRADE approach). I already outlined the experimental biases in general, looked in detail at which biases the ganzfeld was susceptible to, addressed what protocols were in place and whether they were sufficient (according to the GRADE recommendations), and outlined which additional protocols would be recommended, on the other forum. There didn't seem to be any proponent interest in that discussion (and Andy/Alex deleted all those posts), so I'm a bit puzzled/suspicious about what's going on here. I'm going to wait and see if this goes anywhere, but I don't think anyone should try to draw any conclusions from a lack of non-proponent participation.

Linda
 
Same kind of thing for me (although I am very familiar with the GRADE approach). I already outlined the experimental biases in general, looked in detail at which biases the ganzfeld was susceptible to, addressed what protocols were in place and whether they were sufficient (according to the GRADE recommendations), and outlined which additional protocols would be recommended, on the other forum. There didn't seem to be any proponent interest in that discussion (and Andy/Alex deleted all those posts), so I'm a bit puzzled/suspicious about what's going on here. I'm going to wait and see if this goes anywhere, but I don't think anyone should try to draw any conclusions from a lack of non-proponent participation.

Linda
If you still have that post saved, or fresh in mind, can you repost it here?
 
I PM'd paqart last week asking him to restore fls' Ganzfeld posts - he hasn't responded, but perhaps it would help if others asked as well.
 
If you still have that post saved, or fresh in mind, can you repost it here?

I'd rather wait and see how this thread proceeds in the absence of skeptical nincompoopery.

(If you had any sense, you'd ask me not to post, so Alex doesn't pull the plug on your thread. :eek:)

Linda
 
This study is a meta-analysis with a LOT of issues to discuss. I suggest that, to keep things manageable, we break the issues down and try to deal with them one by one. I'll open the discussion with one issue that I'd like to get a better understanding of: the power of the underlying studies in this meta-analysis and the impact that has on how we should see the results.

I don't think it's controversial that this meta-analysis is made up of underpowered studies. It may be that not every member of this forum understands what that means or why it is important.

Then we should consider how to treat underpowered studies in the context of a meta-analysis. Does combining them address the issues of under-powering, or do the concerns carry over? Are meta-analysis techniques sufficient to address those problems, or does the meta-analysis need to be based on sufficiently powered studies in the first place in order to be considered reliable?

I think we should try and discuss these out of the context of the ganzfeld. Let's build this discussion from the ground up. Let's try and identify the issues and concerns. This isn't about validating or invalidating the results of this paper. It's about trying to figure out how we should view the results of this paper.

Sound reasonable?
 
Alright, we can start with power.

A couple of issues that can come about with power are things like cutting off research too early at the first signs of success, or extending research after an unsuccessful number of trials in the hope of moving away from a null mean (i.e., optional stopping). This would imply that big projects could be combined with smaller studies that cut off the mean effect size before it has the chance to reach a true mean (in this case the null). Naturally, it also gives greater incentive to drop unsuccessful studies, leading to intentional file-drawer effects.
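
To make the first of those concrete, here is a small simulation sketch in Python (the numbers are made up: pure chance data, with the experimenter peeking after every 10 trials) showing how stopping at the first sign of success inflates the false-positive rate above the nominal 5%:

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)

def false_positive_rate(max_trials=100, peek_every=10, p0=0.25, alpha=0.05, reps=5000):
    """Fraction of pure-chance 'studies' declared significant when the experimenter
    peeks at the accumulating data and stops at the first significant result."""
    false_positives = 0
    for _ in range(reps):
        data = rng.random(max_trials) < p0       # chance-level hits only
        for n in range(peek_every, max_trials + 1, peek_every):
            hits = data[:n].sum()
            p_value = binom.sf(hits - 1, n, p0)  # one-tailed P(X >= hits) under chance
            if p_value <= alpha:                 # stop early and declare success
                false_positives += 1
                break
    return false_positives / reps

print("nominal alpha:", 0.05)
print("false-positive rate with optional stopping:", false_positive_rate())
```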

Power analysis, which is what Arouet is referring to, is a way to calculate the sample size needed to reject the null confidently. With a bigger mean effect size, you need a smaller n; likewise, with a smaller effect size, your n will need to be larger.
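
A minimal sketch of that relationship, using the normal-approximation sample-size formula for a one-tailed, one-sample test of proportions (the hit rates below are hypothetical, and exact binomial calculations would differ slightly):

```python
import math
from scipy.stats import norm

def required_n(p1, p0=0.25, alpha=0.05, power=0.80):
    """Approximate trials needed to detect hit rate p1 against chance rate p0
    (one-tailed, normal approximation)."""
    z_a = norm.isf(alpha)      # one-tailed critical value, about 1.645
    z_b = norm.isf(1 - power)  # about 0.84 for 80% power
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

for p1 in (0.30, 0.33, 0.40):
    print(f"hypothesized hit rate {p1:.2f}: roughly {required_n(p1)} trials needed")
```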

Arouet's question is whether the power problem can be correctly addressed via a meta-analysis. Since most if not all Ganzfeld studies are underpowered, meta-analyses are used to draw conclusions from a multitude of those studies. So his question is this: does a meta-analysis serve to accurately address the power problem in most GZ studies?

Is this correct?
 
This would imply that big projects could be combined with smaller studies that cut off the mean effect size before it has the chance to reach a true mean (in this case the null). Naturally, it also gives greater incentive to drop unsuccessful studies, leading to intentional file-drawer effects.

As I understand it, the volatile nature of means with regard to very high and very low input is why medians are sometimes preferred. Medians are more resistant to outliers unless those outliers are sufficiently common to pull the set down, so wouldn't it be more appropriate to use these instead? (I've only recently been reading up on statistics, so I haven't yet seen a suitable explanation of why they aren't used more often in these scenarios.)
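
A trivial Python illustration of that robustness (the per-study effect sizes are made up, with one inflated outlier):

```python
import numpy as np

# Hypothetical per-study effect sizes: mostly near zero, plus one inflated outlier.
effects = np.array([0.01, 0.02, -0.01, 0.00, 0.03, 0.02, 0.45])

print("mean:  ", round(effects.mean(), 3))      # pulled up by the single outlier
print("median:", round(np.median(effects), 3))  # barely moves
```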

Does a meta-analysis serve to accurately address the power problem in most GZ studies?

I don't believe it does, because pooling requires checking for similarity of studies, and any differences within the protocol would be equivalent to flipping variables around mid-study. Up and changing the experimenter or the lab for no apparent reason would be madness, and treating the studies as unified in this way implies you have done just that, so I don't believe it can do more than show you that commissioning a properly sized study is worth doing.

It is my vague understanding that this is how pilot studies normally work: you put out a test protocol, and if it seems legit you then have a hypothesis to test with a properly powered study. Pooling pilot studies seems bizarre.
 
I don't believe it does, because pooling requires checking for similarity of studies, and any differences within the protocol would be equivalent to flipping variables around mid-study. Up and changing the experimenter or the lab for no apparent reason would be madness, and treating the studies as unified in this way implies you have done just that, so I don't believe it can do more than show you that commissioning a properly sized study is worth doing.

It is my vague understanding that this is how pilot studies normally work: you put out a test protocol, and if it seems legit you then have a hypothesis to test with a properly powered study. Pooling pilot studies seems bizarre.

OK, so what does this mean for how we should view this Ganzfeld meta-analysis?

Here's what Kennedy has to say on this issue: http://jeksite.org/psi/jp13a.htm

The minimum power typically recommended in both behavioral and medical research is .8 (e.g., Cohen, 1988, p. 56; Food and Drug Administration, 1998, p. 22). If an effect is real, at least 80% of properly designed confirmatory studies should obtain significant outcomes. This degree of replication provides convincing evidence that the experimenters understand and control the phenomena being investigated. A power of .90 or .95 is preferable when possible.

With regard to the Ganzfeld in particular, he writes:

By the usual methodological standards recommended for experimental research, there have been no well-designed ganzfeld experiments. Based on available data, Rosenthal (1986), Utts (1991), and Dalton (1997b) described 33% as the expected hit rate for a typical ganzfeld experiment where 25% is expected by chance. With this hit rate, a sample size of 201 is needed to have a .8 probability of obtaining a .05 result one-tailed.[1] No existing ganzfeld experiments were preplanned with that sample size. The median sample size in recent studies was 40 trials, which has a power of .22.[2]
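
As a rough sanity check on those figures, here is an exact-binomial power calculation in Python (a sketch only; Kennedy's footnoted method may differ in its details, so the outputs should be treated as approximate):

```python
from scipy.stats import binom

def exact_power(n, p1=0.33, p0=0.25, alpha=0.05):
    """One-tailed exact binomial power: find the smallest hit count whose probability
    under chance is <= alpha, then ask how often a true hit rate p1 reaches it."""
    k_crit = next(k for k in range(n + 1) if binom.sf(k - 1, n, p0) <= alpha)
    return binom.sf(k_crit - 1, n, p1)

print("power at n = 40 (the recent median):      ", round(exact_power(40), 2))
print("power at n = 201 (Kennedy's recommended n):", round(exact_power(201), 2))
```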


He goes into other issues re: the Ganzfeld as well (it's a long paper covering many methodological issues in parapsychology). The gist of it, as I see it, is that he basically considers the work to date to be pilot projects, and what is really needed are properly powered replications.

Keep in mind, Kennedy is a proponent parapsychologist who believes that such studies will show reliably anomalous results. Until that happens, though, don't we have to treat the results of this study as an intriguing justification for doing such a properly powered study, but on its own not sufficient for drawing reliable conclusions about the Ganzfeld results?
 