Once you think of an idea you would like to pursue (and get permission to do so, if necessary), you then need to design a study to conduct. 'Study' here is a little misleading, as it typically refers to some kind of analysis conducted on collected data. However, a study can be anything that progresses an idea: building a platform, collecting data from a specific group of people, generating figures which represent outcomes. As long as you can conduct an investigation of a subject, it can be considered a 'study'. Once you have a study in mind, you should almost always preregister it.
What is a Preregistration?
A preregistration is essentially a document stored on some registry which holds the methodology of a study before the study is actually run. For example, you would upload a preregistration before collecting data within an empirical study, or a tech spec before developing an online platform. Preregistrations are increasingly version controlled, meaning you can view any updates or changes along with when they were made.
Are there any benefits for creating a preregistration?
Yes, though it depends upon your use-case.
Analyses
P-Hacking
XKCD has a great illustration of this using jelly beans (embedded under CC-BY-NC-2.5), so we'll use that as an example:
Let's say you were conducting an analysis of some collected dataset (in this case, about jelly beans) and trying to determine whether your hypothesized effect was significant (do jelly beans cause acne?). You run a statistical test and set your alpha criterion to be 0.05. The test produces a p-value: the probability of seeing results at least as extreme as yours if the null hypothesis (jelly beans do not cause acne) were true. If the p-value falls under 0.05, the result is considered significant by your statistical test, and as such, you conclude there is an effect.
However, an alpha criterion of 0.05 does not mean that a significant result is actually valid. If we reframe the statement differently, it means that when there is no real effect, about 1 out of 20 statistical tests will still say that the results are significant. This is known as a 'false positive' or 'Type I Error' in statistics. Going back to the jelly bean example, if you run the same statistical test on each of the 20 colors of jelly beans, it is likely that at least one of the tests will state something like 'green jelly beans cause acne' when it is actually a false positive.
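To put a number on 'likely', here is a minimal sketch in Python. It assumes the 20 tests are independent and that the null hypothesis is true for every color (no jelly bean causes acne). The function name `familywise_error_rate` is our own, chosen for illustration.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one false positive across independent tests,
    assuming the null hypothesis is true in every one of them."""
    return 1.0 - (1.0 - alpha) ** num_tests


alpha = 0.05
num_colors = 20  # one test per jelly bean color

# Average number of false positives we expect across all 20 tests.
expected_false_positives = alpha * num_colors

# Chance that at least one color looks 'significant' purely by coincidence.
fwer = familywise_error_rate(alpha, num_colors)

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"Chance of at least one false positive: {fwer:.1%}")
```

Under these assumptions, the chance of at least one false positive is roughly 64%, far higher than the 5% the alpha criterion suggests for a single test.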
Choosing to only report the significant results after running numerous tests on the data is known as cherry-picking or p-hacking. As you can see, it can lead to misrepresentations of the data, incorrectly reporting coincidence as truth.
Of course, there are numerous ways to mitigate the issue, such as applying multiple-comparison corrections to the p-values to control for false positives (which you should do anyway when running multiple tests on the same dataset). Preregistrations help mitigate this by requiring the researcher to provide their analysis plan ahead of time, such that they are only confirming whether their planned analysis is significant, known as a confirmatory analysis, rather than cherry-picking whichever result was significant.
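As a rough sketch of what such a correction looks like, the Python below applies a Bonferroni correction, one of the simplest options (others, such as Holm or Benjamini-Hochberg, are less conservative). The p-values are made up purely for illustration, and the helper name `significant_colors` is hypothetical.

```python
def significant_colors(p_values: dict, threshold: float) -> list:
    """Return the names whose p-value is at or below the given threshold."""
    return [name for name, p in p_values.items() if p <= threshold]


alpha = 0.05

# Made-up p-values for 20 colors: only 'green' dips below 0.05.
p_values = {"green": 0.03}
p_values.update({f"color_{i}": 0.50 for i in range(2, 21)})

# Uncorrected: compare every p-value directly against alpha.
uncorrected = significant_colors(p_values, alpha)

# Bonferroni: divide alpha by the number of tests that were run.
bonferroni_alpha = alpha / len(p_values)
corrected = significant_colors(p_values, bonferroni_alpha)

print("Uncorrected:", uncorrected)  # ['green']
print("Bonferroni: ", corrected)    # []
```

With these illustrative numbers, the 'green jelly beans cause acne' result no longer counts as significant once the number of tests is taken into account.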
Hypothesizing After Results are Known (HARKing)
What if, instead, you didn't find any significant results where jelly beans caused acne, but found that there was a link between jelly beans and cancer? If you chose to report this finding instead of the one originally proposed, you are hypothesizing after the results of your research are known, also known as HARKing. This runs into the same issues as p-hacking, where the results could be 'false positives' while obscuring the actual outcomes of the research. Additionally, the initial study was not designed to determine such an effect, so it would be impossible to draw any useful connections.
Once again, preregistrations are useful as the hypothesis made is already provided ahead of time. As such, you are simply measuring the results of the hypothesis without introducing a new one.
Specifications for Development
Technology specifications can be considered a form of preregistration when you are developing a platform or tool. You plan out what you are going to build ahead of time and use the spec to develop it. As preregistrations are version controlled and updatable, you can fix any issues within the specification while also being able to see how it developed over time.
Documentation
Preregistrations move a good portion of the methodological writing for a report or research paper to before the study has started. In most cases, you can refer to or copy a preregistration directly into your report or paper as its methodology. Additionally, it can provide clear information on how your experiment was run such that it can be more easily replicated in the future.
What if I didn't preregister before starting my study?
You can still preregister your study up until you run the analysis or have finished building the system you wish to develop. Of course, the earlier you have an idea, the better. However, you just need to be honest about what context and phase of the study you are in while providing your plan of attack. Remember, you can always update a preregistration later if things change or if you are unsure about what you are trying to accomplish.
Additionally, you can still run analyses that are not in the preregistration! Just make sure to report them as not being preregistered. Typically, preregistered analyses are called confirmatory while non-preregistered analyses are called exploratory.
Is a preregistration required?
It depends on where you are trying to publish the final results. A number of journals require that any outcomes reported be preregistered beforehand. Otherwise, it is up to you whether you want to preregister your study.
How do I preregister a study?
Typically, all it takes is writing a document and storing it in some location. However, there are services like AsPredicted or Open Science Framework (OSF) Registries, which are official sites dedicated to this explicit purpose, and using one is highly recommended. They also come with a bunch of nifty features: templates for whatever study you would like to conduct, version control, and links to the resources used.
Each website already has its own support material, so I will not restate the information here. AsPredicted takes a more hands-on approach by providing the support within the preregistration setup itself. OSF Registries, meanwhile, provides its support through a support page with step-by-step tutorials on how to use the service.
Now, you have created your preregistration!
Some Additional Thoughts
Not the Intended Use
Preregistrations are typically not intended for use outside of empirical studies for confirmatory analysis. I mention otherwise above because the concept of a preregistration can extend to numerous other cases while being called different things in different fields. Now, of course, many of the documents are still different from a preregistration: a technical specification can be drafted after the platform has finished being built. However, they still roughly have the same purpose and use case.
Getting Scooped
Researchers sometimes worry about their research being taken from under their nose and reported by someone else. Preregistrations can, in fact, be embargoed such that they are only made public after a certain amount of time has passed. However, there is nothing wrong with being the second person to achieve the desired results. All research needs to be done by at least two people: one to provide the initial discovery and one to validate the results. Without this, we could have research which reports a significant result but does not replicate when pursued further.
Anonymized Preregistrations
One of the more interesting features of preregistration services is the ability to anonymize a preregistration's metadata. This allows researchers to include a link to their preregistration within their paper submission without violating any blinded review process. Afterwards, they would just need to swap it out with the non-anonymized link. Of course, make sure that any associated content or resources do not contain identifiable information that could undermine the anonymization.