Abaqus - Explicit ALE meshing - Please help

Hello Abaqus community,

I am running forming simulations in Abaqus/Explicit with ALE adaptive meshing. The simulated time is 100 seconds. I know that ALE affects domain decomposition. I am running the simulations on an HPC cluster with 16 cores, but this does not help because of ALE: the domain is not decomposed properly for parallel processing. Is there a workaround or a solution? The simulation takes a very long time even with mass scaling and time scaling. Any help would be highly appreciated.

Thanks