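Since the objective function is convex in $W$ alone or in $H$ alone, but not jointly in $(W, H)$, standard methods use Block Coordinate Descent (BCD): they alternately update one factor while holding the other fixed, so each subproblem is convex.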
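As a rough illustration of the alternating scheme, here is a minimal sketch. It assumes the objective is the Frobenius-norm factorization error $\tfrac{1}{2}\|X - WH\|_F^2$ with nonnegativity constraints on $W$ and $H$, which is not stated in this passage. The function name `nmf_bcd`, its parameters, and the random initialization are hypothetical choices made for the example. Each block is updated with a single projected gradient step (an inexact variant of BCD) rather than an exact subproblem solve.

```python
import numpy as np

def nmf_bcd(X, rank, n_iter=200, seed=0):
    """Two-block coordinate descent sketch for 0.5 * ||X - W H||_F^2, W, H >= 0.

    Assumption: one projected gradient step per block, with step size 1/L,
    where L is the Lipschitz constant of that block's gradient.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    losses = []
    for _ in range(n_iter):
        # W-block: with H fixed, the objective is a convex quadratic in W.
        HHt = H @ H.T
        L_W = np.linalg.norm(HHt, 2) + 1e-12  # spectral norm of H H^T
        grad_W = W @ HHt - X @ H.T
        W = np.maximum(W - grad_W / L_W, 0.0)  # project onto W >= 0

        # H-block: with W fixed, the objective is a convex quadratic in H.
        WtW = W.T @ W
        L_H = np.linalg.norm(WtW, 2) + 1e-12  # spectral norm of W^T W
        grad_H = WtW @ H - W.T @ X
        H = np.maximum(H - grad_H / L_H, 0.0)  # project onto H >= 0

        losses.append(0.5 * np.linalg.norm(X - W @ H, "fro") ** 2)
    return W, H, losses

if __name__ == "__main__":
    X = np.random.default_rng(1).random((20, 15))
    W, H, losses = nmf_bcd(X, rank=4)
    print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because each step is a projected gradient step with step size at most $1/L$ on a convex quadratic subproblem, the objective does not increase from one block update to the next. Since the joint problem is nonconvex, the iterates can only be expected to approach a stationary point, not necessarily a global minimum.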
