I used bruteforce but gave each state a score, and always process the 'best' first, and this ran in under 10 seconds for part 1, without jo's optimization above. (Part 1 also worked with it, but didn't require it.)
The score was simply how top-heavy it was, i.e. (4 x count on floor 4) + (3 x count on floor 3) + (2x count on floor 2) + (count on floor 1) etc.
I started off processing all of each depth before moving onto the next depth, and left this going while I went to squash (several hours) and it still hadn't found an answer by the time I got back. Once I switched it to allow it to process e.g. some of depth n+2 before it had processed all of depth n if they had a better score, it found a result in seconds.
I also pruned any branches that had some items on the 1st floor and some on 4th but none on 2nd or 3rd, as I thought this was no better strategically than with those items on 2nd and 4th, and while this was triggered a few times it didn't seem to make much difference to computation time.
jo's optimization was required (to run in under a minute anyhow) for part 2 however, whether I would have been too thick to figure that out myself I don't know, and whether it would have found a result eventually without it I'm not sure. The only difference between part 1 and 2 seems to be that that optimization is required, so maybe realizing that was what he was getting at with part 2.
https://github.com/bjtaylor1/AoC_2016/blob/master/Day11/UnitTest1.cs (method Solve())