Description
I just uploaded mnist.tar.gz to GitHub file storage in a way that is persistent without being part of the git repository. I recommend against storing binary files in a repository. Over time, they increase download times, and because git can't do useful diffs on binary files, every change to the file means a completely new version must be stored. What's worse, git rm doesn't remove any of the committed versions of the file from the commit history, so the only way to fix the problem is to rewrite history. That essentially means everyone who has a local copy of the repository will need to do a fresh clone, and if that is not practical, then it's probably necessary to set up a new repository.
I recommend either
- using Fortran's intrinsic execute_command_line() subroutine to download and uncompress the file from the above location at runtime if it's missing, or
- providing a short script that users can run to handle basic setup tasks (including downloading data files) and then build and test the repository's software.
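To make the second option concrete, here is a minimal sketch of the download-if-missing step such a script might contain. It is not taken from Caffeine's install.sh. The DATA_URL value is a placeholder for wherever mnist.tar.gz actually lives, and the function names download and fetch_if_missing are made up for illustration.

```shell
#!/bin/sh
# Sketch only: DATA_URL is a placeholder, not the real location of the archive.
DATA_URL="${DATA_URL:-https://example.com/mnist.tar.gz}"

# download URL DEST: fetch URL into DEST with curl or wget, whichever is installed.
download() {
  if command -v curl >/dev/null 2>&1; then
    curl -fL -o "$2" "$1"
  elif command -v wget >/dev/null 2>&1; then
    wget -O "$2" "$1"
  else
    echo "error: neither curl nor wget found" >&2
    return 1
  fi
}

# fetch_if_missing URL ARCHIVE: download ARCHIVE and unpack it into the
# current directory, but only when ARCHIVE is not already present.
fetch_if_missing() {
  if [ -f "$2" ]; then
    echo "$2 already present, skipping download"
    return 0
  fi
  # Download to a temporary name so a failed transfer never looks complete.
  download "$1" "$2.part" && mv "$2.part" "$2" && tar -xzf "$2"
}
```

A setup script could call fetch_if_missing "$DATA_URL" mnist.tar.gz and then run whatever build and test commands the project uses. The same check-then-download logic could instead be issued from Fortran through execute_command_line() if the first option is preferred.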
On the OpenCoarrays project, I found that our downloads went up by a factor of 2-3 soon after I wrote an installation script. More recently, I settled on a much simpler approach to writing an installer for Caffeine that is an order of magnitude smaller than the OpenCoarrays installer, more robust, and much more maintainable. I'll be glad to adapt Caffeine's install.sh script to neural-fortran if you like. When someone can get a package built and tested by typing nothing more than ./install.sh, it saves a lot of time over reading build/test instructions, no matter how simple those instructions are.