At least as it pertains to Prof. Andrew Ng's courses (not limited to this specialization alone), he maintains a consistent set of standards and conventions, and deviating from them could lead to a lot of confusion.
To give you an idea of the conventions followed:
- x_n or a_n (in the case of neural networks) stands for the n^{th} feature, unit, or node.
- x^{(i)} or a^{(i)} stands for the i^{th} sample.
- a^{[l]} stands for the l^{th} layer (layers are introduced in the context of neural networks in Course 2).
All put together we have a_n^{[l](i)}, with each subscript and superscript having a fixed meaning associated with it. So I leave it to your imagination what can happen if we don't maintain consistency with these conventions.
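To make the mapping concrete, here is a minimal NumPy sketch of how that notation typically translates to array indices. The variable names, layer sizes, and activation are my own illustration, not taken from the course materials:

```python
import numpy as np

np.random.seed(0)
m = 3                    # number of samples, i = 1..m
layer_sizes = [4, 5, 2]  # units per layer, l = 0..2 (layer 0 = input features)

# X: input matrix; column i is sample x^{(i)}, row n is feature x_n
X = np.random.randn(layer_sizes[0], m)

# A[l] holds the activations a^{[l]}; by convention a^{[0]} = x
A = [X]
for l in range(1, len(layer_sizes)):
    W = np.random.randn(layer_sizes[l], layer_sizes[l - 1]) * 0.01
    b = np.zeros((layer_sizes[l], 1))
    # a^{[l]} = g(W^{[l]} a^{[l-1]} + b^{[l]}), with g = tanh here
    A.append(np.tanh(W @ A[l - 1] + b))

# a_n^{[l](i)} -> unit n of layer l for sample i, i.e. A[l][n, i]
# (0-indexed in code, whereas the math notation counts from 1)
l, n, i = 2, 1, 0
print(A[l][n, i])
```

Each piece of the notation lands in exactly one place: the layer superscript [l] selects which matrix, the subscript n selects the row, and the sample superscript (i) selects the column. Mixing these up silently produces wrong shapes or wrong values, which is the kind of confusion the conventions are meant to prevent.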