I think this is a question about technical English usage rather than physics, and indeed there is no accepted universal definition of these words: IUPAP doesn't decree definitions, and it doesn't meet to debate things like whether such-and-such a law can be upgraded to a principle.
However, here's how I hear these words. A law is simply an equation describing a relationship between physical quantities, whereas "principle" bespeaks more an intuition, a particular standpoint or way of seeing something physical that sometimes organizes and sometimes motivates "laws". I agree with you that principles are most comparable to the axioms in mathematics. A theory is a whole framework of thought grounded on principles - analogous to the set of theorems following from mathematical axioms. So the ideas are roughly hierarchical: laws make up and motivate principles, which are themselves organized into theories. "Theory" for me also betokens a reasonably "complete" underlying explanation for a physical phenomenon, and indeed it can go beyond this and unify the description of a group of related phenomena - as Maxwell's theory of electromagnetism did. Naturally, all theories are either backed up experimentally, meaning that what they foretell is experimentally observed, or they make "falsifiable foretellings" which can in principle be tested, either now or when our technological capability lets us.
Going with your example: the constancy of the speed of light is a law that motivated the more general "principle of relativity" for Einstein: the idea that the laws of physics should be the same for all "inertial" observers. This latter idea of "inertial" observers began as (1) those moving uniformly in special relativity and was later generalized to (2) those moving along geodesics and whose frames are Lie-dragged by the Levi-Civita connexion. Sorry to sound too technical in that last bit, but I threw it in because it is a "jargon" phrase which you can do successful Google searches on. It's a way of characterizing the conditions in GR under which the following holds: if an observer carries a system of relatively stationary, orthogonal $x,y,z$ rods with little vectorial accelerometers (those able to measure the direction of acceleration) mounted along the rods' lengths, all of those accelerometers will read nought. The "theory" was the set of deductions Einstein made from this principle: the Lorentz transformation, the equivalence of mass and energy, the Einstein field equations with their attendant calculations of the precession of Mercury's perihelion and the bending of light around the Sun (both confirmed by observation), and many others.
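To make the "law, principle, theory" chain a little more concrete, here is a small worked sketch of how one such deduction runs. The notation is my own choice of the standard textbook form, not something taken from the question: a boost with speed $v$ along $x$, with $\gamma$ the usual Lorentz factor and $u$, $u'$ the velocities of one object as measured in the two frames. The Lorentz transformation and the velocity composition rule that follows from it are:

$$
\begin{aligned}
x' &= \gamma\,(x - v t), \qquad t' = \gamma\left(t - \frac{v x}{c^2}\right), \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}},\\
u' &= \frac{dx'}{dt'} = \frac{u - v}{1 - u v / c^2}.
\end{aligned}
$$

Setting $u = c$ gives $u' = (c - v)/(1 - v/c) = c$, so the "law" that every inertial observer measures the same speed of light reappears as a consequence inside the theory.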
As for your reference request: I don't know of any, and you might look instead at accounts of the history and philosophy of science for help. May I suggest the Philosophy Stack Exchange site.