Most of the codes I develop run in parallel using MPI (Message Passing Interface) using the python wrapper, mpi4py. There is a reason why highly scalable programs use this approach, and that is because each processor handles its own chunk of memory and communicates with other processors only when it’s needed.