Reproduction of Computational Models in Neuroscience and Understanding
Our joint paper (written with Mateusz Hohol and Witold Hensel) on reproducibility in computational neuroscience has just been assigned to the December issue of the Journal of Computational Neuroscience (open access). In this paper, we argue that assuring the replication of scientific results does not yield to a single solution. Reproducibility is still a problem: as our preliminary study shows, few reproductions are published, and code availability is low in major journals of computational neuroscience.
Most importantly, repeating a model (rerunning the same model by the same researchers) and replicating it (rerunning it by others) require a different set of best practices than reproducing it. Repetition and replication would benefit from minute documentation of the whole modelling process, including noting random seeds, versioning all the scripts, recording all intermediate results, etc. (a proposal to ensure the repeatability and replicability of computational research this way was recently defended by Sandve et al. 2013).
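To make this kind of bookkeeping concrete, here is a minimal sketch in Python of what noting the seed, stamping the script version, and recording intermediate results might look like. It is an illustration only, not code from our paper: the function names `file_sha256` and `run_logged`, the JSON log layout, and the toy random-walk "model" are all assumptions made up for this example, and a file hash is only a crude stand-in for proper version control.

```python
import hashlib
import json
import random
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file (a crude version stamp)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_logged(seed, n_steps, log_path, script_path=None):
    """Run a toy stochastic model and log everything needed to repeat it.

    The "model" is a hypothetical placeholder: a Gaussian random walk.
    """
    rng = random.Random(seed)  # private generator, so the seed fully fixes the run
    state = 0.0
    intermediate = []
    for _ in range(n_steps):
        state += rng.gauss(0.0, 1.0)
        intermediate.append(state)  # record every intermediate result

    record = {
        "seed": seed,
        "n_steps": n_steps,
        # Hash of the script that produced the run, if a path was given.
        "script_sha256": file_sha256(script_path) if script_path else None,
        "intermediate": intermediate,
    }
    Path(log_path).write_text(json.dumps(record, indent=2))
    return record
```

Calling `run_logged(42, 100, "run.json", script_path=__file__)` from a script would store the seed, a hash of that script, and the whole trajectory in `run.json`; rerunning with the same seed and an unchanged script should then give an identical log.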
But for the theoretical purposes of understanding and explanation, such minute documentation can be detrimental. If one wishes to actually reproduce the results by building another model from its theoretical description, all these details may only distract. As we argue in the paper, publications regarding models in computational neuroscience should therefore contain all and only the information relevant to reproducing a model and evaluating its value. Alas, many papers currently published fail in both respects: sometimes they include redundant introductions of the theoretical framework, for example, instead of describing how a particular model was produced, and sometimes they simply fail to make clear how theoretical understanding was operationalized in a model.
Our proposal is, as it turned out, similar in spirit to what Guest and Cooper proposed in their paper. They argue that what they call ‘specification of a model’ is not to be conflated with its implementation. Thus, a theory behind a model should be included in its specification, and implementation details, which sometimes include ad hoc assumptions required to make the model actually run, should not be confused with the theory. A good example of such a confusion is how Pinker and Prince (1988) criticize the influential model of past-tense learning by Rumelhart and McClelland (1986): they take the implementation detail to be theoretically important, while later connectionist work shows that the theoretical account is much more general than a particular choice of problem representation (such as the particular phonological representation in the first model).
Although our approach is similar to theirs, we stress that one should distinguish two practices and ensure not only replication and repetition but also reproduction. The first is better served by open repositories, public code review and the like, while the second is best ensured by good theoretical publications and subsequent attempts at reproduction.