A matrix of size n × n is striped row-wise among p processes, so that each process stores n/p rows of the matrix. A vector of size n is sent to every process. Each process multiplies its block of rows by the vector, producing n/p elements of the result. The root process then gathers these partial results into the full result vector. This is more efficient than the previous solution because the whole matrix is distributed to the nodes in one step at the start, instead of looping and sending one row at a time.
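
The data flow described above can be sketched in a few lines of Python. This is only a single-process simulation, not the original program: the function name `striped_matvec` and the sample data are hypothetical, and the sketch assumes that p divides n evenly. In a real MPI program the three commented steps would typically map to collective calls such as `MPI_Scatter`, `MPI_Bcast` and `MPI_Gather`, although the text does not say which calls are actually used.

```python
def striped_matvec(matrix, vector, p):
    """Simulate row-wise striped matrix-vector multiplication on p processes."""
    n = len(matrix)
    if n % p != 0:
        # Assumption of this sketch: every process gets exactly n/p rows.
        raise ValueError("this sketch assumes p divides n evenly")
    rows_per_proc = n // p

    # "Scatter": process `rank` receives a contiguous block of n/p rows.
    chunks = [matrix[rank * rows_per_proc:(rank + 1) * rows_per_proc]
              for rank in range(p)]

    # "Broadcast": every process sees the whole vector and computes
    # one dot product per row that it owns.
    partials = [
        [sum(a * x for a, x in zip(row, vector)) for row in chunk]
        for chunk in chunks
    ]

    # "Gather": the root concatenates the partial results in rank order.
    return [y for part in partials for y in part]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
x = [1, 0, 2, 1]

result = striped_matvec(A, x, p=2)
print(result)  # [11, 27, 43, 59]
```

With p = 2 each simulated process owns two rows of the 4 × 4 matrix. The result is the same for any p that divides n, because the partitioning only changes who computes each element, not its value.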