I didn’t get the shallow neural network example: why does it take 2^n-1 hidden units to compute XOR? In the deep neural network example I can see how it computes XOR, but I don’t understand the shallow case. Please help.

I wouldn’t spend too much mental energy on this point. Prof Ng himself even says at the end of that section that he doesn’t really find this analogy all that useful. You’ll never see him refer to anything like this again in the later courses. It’s not making some deep or subtle point. You’ve got n bits as the input, each of which can be either 0 or 1. With a neural network, each node gets all the inputs. So the “flat” or “shallow” NN with one hidden layer would have one node for each possible input combination, and that node “knows” the answer for its combination. So how many possible combinations are there for n binary bits? 2^n, right?
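If it helps to see the counting argument concretely, here is a minimal sketch of that “one node per combination” idea. It is only an illustration, not the construction from the lecture: the function names are made up for this post, and it uses a hard threshold instead of a trained sigmoid unit so the behaviour is easy to check by hand.

```python
from itertools import product

def build_shallow_xor(n):
    """Hypothetical helper: one hidden unit per n-bit input combination."""
    patterns = list(product([0, 1], repeat=n))  # all 2^n combinations
    # Each unit "knows" the answer for its own pattern: the XOR (parity) of its bits.
    output_weights = [sum(p) % 2 for p in patterns]
    return patterns, output_weights

def hidden_unit(pattern, x):
    # Weight +1 where the pattern has a 1, -1 where it has a 0.
    # The bias makes the pre-activation +0.5 on an exact match
    # and at most -0.5 on any mismatch.
    z = sum((1 if p else -1) * xi for p, xi in zip(pattern, x))
    z -= sum(pattern) - 0.5
    return 1 if z > 0 else 0  # hard threshold (assumption, stands in for sigmoid)

def shallow_xor(x):
    patterns, output_weights = build_shallow_xor(len(x))
    hidden = [hidden_unit(p, x) for p in patterns]
    # Exactly one hidden unit fires, so the sum is that unit's stored answer.
    return sum(w * h for w, h in zip(output_weights, hidden))

if __name__ == "__main__":
    for x in product([0, 1], repeat=3):
        print(x, shallow_xor(x))
```

The hidden layer here has 2^n units, which is the exponential growth being discussed. Half of those units have output weight 0 in this sketch, so they could be dropped without changing the result, but the count stays exponential in n either way.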

What if I just used one node followed by a sigmoid function to decide whether the result is 1 or 0? Maybe I wouldn’t get 100% accuracy, but I think it would get a great result, so maybe a shallow network is better than a deep one for some problems? I am a novice, so if this is nonsense just ignore it. Thanks a lot for your time.

But the point here is that you need one node per possible combination in the “shallow” architecture, right? So, yes, you could use sigmoid if you wanted, but that doesn’t really add anything. You still end up with 2^n such nodes.
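On the single-node idea, a quick experiment for the 2-bit case may be useful. The sketch below brute-forces a small grid of weights and biases for one sigmoid node and reports the best accuracy it can reach on XOR. The grid range and the helper names are arbitrary choices for illustration, and it only covers n = 2.

```python
import math
from itertools import product

# Truth table for 2-bit XOR: ((x1, x2), label)
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(w1, w2, b):
    """Fraction of XOR rows a single sigmoid node classifies correctly."""
    correct = 0
    for (x1, x2), y in XOR_DATA:
        pred = 1 if sigmoid(w1 * x1 + w2 * x2 + b) > 0.5 else 0
        correct += int(pred == y)
    return correct / len(XOR_DATA)

def best_single_node_accuracy():
    # Assumed search grid: weights and bias from -4.0 to 4.0 in steps of 0.5.
    grid = [i * 0.5 for i in range(-8, 9)]
    return max(accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))

if __name__ == "__main__":
    print(best_single_node_accuracy())
```

The best this search finds is 3 out of 4 rows correct. That is not a limitation of the grid: XOR is not linearly separable, so no choice of weights for a single node gets all four rows right.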