The locale-gen script can take some time, especially when generating all supported locales.
Part of the problem is that it runs the actual localedef tool for only one locale at a time, so only one CPU core is ever used.
So what can be done to improve the situation, preferably without touching the distribution's script itself?
This is what I ended up doing when provisioning virtual machines:
    # create a unique temporary directory based on the current PID
    mkdir /tmp/locales.$$

    # now do everything else in a subshell to avoid changing
    # the current shell's environment
    (
      cd /tmp/locales.$$

      # create a local "localedef" wrapper script that just
      # prints the actual command to be run
      echo 'echo; echo /usr/bin/localedef "$@"' > localedef
      chmod a+x localedef

      # make sure our wrapper script is found first
      export PATH=.:$PATH

      # just make sure that we are not going to get warnings
      # about "locale not found" while running "locale-gen";
      # the "C" locale should always be there
      export LANG=C
      export LC_ALL=C

      # now run "locale-gen", filter out the "localedef"
      # commands only (there is no other way to silence
      # the locale-gen progress messages) and feed the printed
      # commands into GNU parallel to utilize all CPU cores
      locale-gen | grep -E "^/usr" | parallel
    )

    # cleanup
    rm -rf /tmp/locales.$$
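The wrapper trick can be tried in isolation, without root and without running locale-gen at all. What follows is only a minimal sketch: the scratch directory from mktemp, the variable names demo and out, and the sample localedef arguments are all made up for illustration and are not taken from any real locale-gen run.

```shell
#!/bin/sh
# scratch directory for the demo (an assumption; the script
# above uses /tmp/locales.$$ instead)
demo=$(mktemp -d)

# same kind of wrapper as above: print the command instead of running it
printf '%s\n' '#!/bin/sh' 'echo; echo /usr/bin/localedef "$@"' > "$demo/localedef"
chmod a+x "$demo/localedef"

# call "localedef" with the demo directory first in PATH; the
# command substitution runs in a subshell, so the PATH change
# does not leak out (the arguments are illustrative only)
out=$(export PATH="$demo:$PATH"; localedef -i en_US -f UTF-8 en_US.UTF-8)

# keep only the command line, the same way the grep in the pipeline does
printf '%s\n' "$out" | grep '^/usr'

# cleanup
rm -rf "$demo"
```

This prints the single line /usr/bin/localedef -i en_US -f UTF-8 en_US.UTF-8, which is the kind of line that gets handed to parallel.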
The prerequisite to this is obviously to have the GNU parallel tool installed.
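That prerequisite can be checked up front, before any temporary files are created. This is just a sketch; have_cmd is a hypothetical helper name, and the package name mentioned in the message is an assumption that may differ between distributions.

```shell
#!/bin/sh
# hypothetical helper: succeeds if the named command is found on PATH
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# warn early if GNU parallel is missing (the package name is
# an assumption, check your distribution)
if ! have_cmd parallel; then
    echo "parallel not found; try installing the 'parallel' package" >&2
fi
```

Note that this only checks that some command named parallel exists; it does not verify that it is the GNU implementation.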
With that in place, the wall clock time needed to (re-)generate all supported locales is drastically reduced, and all CPU cores, instead of just one, can be seen working at close to 100% while this runs.