This adds an additional queue configuration that requests multithreaded jobs, which doubles the virtual memory limit to 16 GB (a rough sketch of the idea is given below).
With this fix I can run heavy TensorFlow baselines without OOM crashes.
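For reference, a minimal sketch of what the extra queue entry could look like; the names here (`QUEUES`, `"multithreaded"`, `threads`, `vmem_gb`, `select_queue`) are illustrative assumptions, not the actual config schema:

```python
# Hypothetical sketch of the added queue configuration; the key and
# queue names are assumptions for illustration only.
QUEUES = {
    # Existing single-threaded queue: 8 GB virtual memory.
    "default": {"threads": 1, "vmem_gb": 8},
    # New multithreaded queue: requesting 2 threads doubles the
    # per-job virtual memory limit to 2 * 8 GB = 16 GB.
    "multithreaded": {"threads": 2, "vmem_gb": 2 * 8},
}

def select_queue(needs_multithreading: bool) -> str:
    """Pick the queue name for a job (illustrative helper)."""
    return "multithreaded" if needs_multithreading else "default"
```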
Long term, it would be better to:
- Expose the thread multiplier explicitly (currently hard-coded to 2)
- Use the transformers' resource tags to select the queue dynamically based on the transformer being run (see the sketch after this list)
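As a rough illustration of both points, the multiplier could become a named constant (or parameter) and queue selection could key off a resource tag on each transformer; the `resources` attribute, the `"multithreaded"` tag, and `queue_for_transformer` are assumptions here, not the framework's actual API:

```python
# Hypothetical sketch of tag-based queue selection; the `resources`
# attribute and the "multithreaded" tag are illustrative assumptions.
DEFAULT_THREAD_MULTIPLIER = 2  # the multiplier this PR currently hard-codes

def queue_for_transformer(transformer, base_vmem_gb: int = 8) -> dict:
    """Derive a queue request from a transformer's resource tags."""
    tags = getattr(transformer, "resources", set())
    if "multithreaded" in tags:
        # Multithreaded transformers get the scaled memory limit.
        return {"threads": DEFAULT_THREAD_MULTIPLIER,
                "vmem_gb": base_vmem_gb * DEFAULT_THREAD_MULTIPLIER}
    return {"threads": 1, "vmem_gb": base_vmem_gb}
```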