What is the minimum number of amplifying stages from mic to A to D converter needed for a ready, produced sound?
Clearly the requirement posed by this question is not how much gain is needed but what is needed to have correct reproduction of top quality at the listener's brain. [1]
And this post does not give an answer for lack of knowledge. You would have to find the answer yourself. You may know already.
It may seem ironic that keeping the same waveform and multiplying it by a fixed number (i.e. no distortion) gives a reproduced waveform that sounds unacceptable when played on an ordinary player.
An ordinary 100W amplifier should not be driven above 1 Watt average, to avoid transients being driven into power supply rail clipping. [2]
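The arithmetic behind that rule of thumb can be sketched in a few lines. This is only an illustration: the function names are made up here, and the load is assumed to be a fixed resistance so that power scales with the square of voltage.

```python
import math

def headroom_db(rated_watts, average_watts):
    """Headroom between rated and average power, in decibels."""
    return 10 * math.log10(rated_watts / average_watts)

def voltage_ratio(rated_watts, average_watts):
    """The same headroom as a voltage ratio into a fixed load (assumption)."""
    return math.sqrt(rated_watts / average_watts)

# 100 W amplifier driven at 1 W average, the figures quoted in the text
print(headroom_db(100, 1))     # headroom in dB left for transients
print(voltage_ratio(100, 1))   # how much larger a peak may be than the average level
```

So running at 1 Watt average leaves 20 dB, a factor of 10 in voltage, for the transients before the rails are hit.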
But how did our grandfathers record and reproduce Sinatra with amazing quality to our brain, reproduced from few-watt amplifiers or even the tiny speaker of a smartphone?
How did they fit hundreds of instruments into a tiny speaker?
Why do we find it difficult to reproduce even one single instrument with stunning quality to a listener's brain in 2017?
Could it be that they used "distortion" in a creative way? One example is the well-known psychoacoustic observation that when the harmonic overtones are present but the fundamental is missing, the brain completes the picture by generating the fundamental and adding it to our perception! And it keeps itself interested too.
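The missing fundamental can be illustrated numerically. The sketch below is not a model of the brain. It builds a tone from the 2nd, 3rd and 4th harmonics only, then uses a plain autocorrelation as a crude stand-in for pitch detection. The sample rate, the 100 Hz fundamental and all function names are assumptions chosen for the example.

```python
import math

FS = 8000               # sample rate in Hz (assumed)
F0 = 100                # the "missing" fundamental in Hz (assumed)
HARMONICS = (2, 3, 4)   # only overtones are generated, never F0 itself

def tone(n_samples):
    """Sum of the overtones of F0, with no component at F0."""
    return [sum(math.sin(2 * math.pi * k * F0 * n / FS) for k in HARMONICS)
            for n in range(n_samples)]

def component_amplitude(x, freq):
    """Amplitude of one frequency in x, from a single-bin Fourier sum."""
    re = sum(v * math.cos(2 * math.pi * freq * n / FS) for n, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * n / FS) for n, v in enumerate(x))
    return 2 * math.hypot(re, im) / len(x)

def strongest_period_hz(x, window, min_lag, max_lag):
    """Repetition rate whose lag gives the largest autocorrelation."""
    def r(lag):
        return sum(x[n] * x[n + lag] for n in range(window))
    best_lag = max(range(min_lag, max_lag + 1), key=r)
    return FS / best_lag

WINDOW, MAX_LAG = 800, 120
signal = tone(WINDOW + MAX_LAG)

print(round(component_amplitude(signal[:WINDOW], F0), 6))      # energy at 100 Hz
print(round(component_amplitude(signal[:WINDOW], 2 * F0), 6))  # energy at 200 Hz
print(strongest_period_hz(signal, WINDOW, 20, MAX_LAG))        # repetition rate
```

The spectrum has nothing at 100 Hz, yet the waveform still repeats 100 times per second. An analyzer reports the first fact, a listener tends to hear the second.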
This post is about how sound can be produced as early as possible, with as few amplifying stages as possible.
How do we make that first Watt the best possible as this is where most of the music is?
Do not let spectrum analyzers (although they are useful tools) change your opinion. They analyze the signal, not brain perception. Radio stations get closer to our brain by asking a selected audience whether they like a song before adding it to their playlist. (Liking a song involves the sound too.) They do not show them spectrum analyzer pictures.
A Beethoven symphony is great on the analyzer. Just joking! If you would like to read which note combinations can sound great to our brain, Bach's long-hidden manual is a great tool [3]. Perhaps it is even better to find out for yourself using your own brain and a keyboard, piano or other instrument. There are just 12 notes out there.
How about that 1 Watt where most music is?
Our grandfathers knew what they were doing. They produced a signal by passing it through electron tubes and magnetic tape in the analog domain. All these processes can gracefully and artfully saturate and trim peaks. The result was a signal that sounds great even on a 1 Watt amplifier.
An example is This Girl's in Love - Dionne Warwick, music by Burt Bacharach, lyrics by David, produced by Phil Ramone. How great it sounds on a smartphone from YouTube while you are on the bus and two people listen happily through the tiny loudspeaker next to their ear! Credit should also go to the designers of smartphones and of those tiny loudspeakers. They sound good.
But nowadays, if we try to get the juice out of a badly captured signal using a plug-in, we may well have garbage in, garbage out. Hopefully new generations are finding their way to new ways without disrespecting the "old" ways.
By produced sound we may mean a sound that sounds like a finished product. Natural, loud, full.
Not thin, weak, soft or anemic, as it often is when connecting a mic straight to a digital recorder.
At the least, instantaneous peak limiting and harmonic enhancement may be needed, which analog circuits, and in fact the simplest ones, produce at the speed of light. This usually happens when a signal that cannot exceed the voltage rail of the power supply is progressively trimmed or clipped.
With class A operation (electrons flowing all the time) [4] in a single-stage electron tube, transistor or JFET, this is done along a gentle, progressively overloading curve. And the juice of the music, the lower-level parts of the signal, is left intact. The smaller the signal, the lower the distortion of class A amplifiers (and the more the devices heat up); distortion tends to 0. Just like our ears. The average parts of the signal around 0 are left undistorted, with an unbelievably smooth sound and great midrange. [5]
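A gentle overload curve can be compared with rail clipping in a small sketch. The tanh curve below is only an assumed stand-in for a smooth saturation characteristic, not the measured transfer curve of any tube, transistor or JFET, and the rail is normalized to 1.

```python
import math

def soft_clip(x, rail=1.0):
    """Smooth saturation: nearly linear near zero, flattening towards the rail."""
    return rail * math.tanh(x / rail)

def hard_clip(x, rail=1.0):
    """Rail clipping: perfectly linear until the rail, then flat."""
    return max(-rail, min(rail, x))

for level in (0.05, 0.2, 0.5, 1.0, 2.0):
    s = soft_clip(level)
    h = hard_clip(level)
    print(f"in={level:4.2f}  soft={s:.3f}  hard={h:.3f}  soft gain={s / level:.3f}")
```

At small inputs the soft curve has a gain very close to 1, so the quiet parts pass almost untouched. The gain then falls gradually as the level rises, instead of changing abruptly at the rail.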
Here it can be mentioned, before it is forgotten, that Phil Ramone for example did the best he could to avoid generation loss etc. When he produced the Girl from Ipanema he took the master 3-track tape to the disc cutting facility [8]. Similar attention to excellence is always paid, but by the few, like Bruce Swedien.
Do not let anyone disappoint you by saying: that was Ramone, that was Bach etc. Ramone even experimented with what varnish, if any, was on the floor, as explained in his book. This is what makes a Bach or a Ramone. God is in the details, as Bach was saying. We can all do it if we do not listen to those that try to disappoint us. We are human. And it does not matter at all whether we succeed or not. Success can take away freedom [7]. What is important is being free and doing what we like. And if we like to pay attention to detail, this is something we do for ourselves.
Examples of single-ended class A stages are the Neumann U47, Pleiades V series prepreamps, Neve output preamp stages, single-ended tube power amps (very popular in the East) etc.
Then the whole signal can be increased as the nasty peaks are trimmed.
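Here is a minimal sketch of why trimming the peaks lets the whole signal be raised. The test signal (a quiet tone with one short transient), the 0.3 saturation level and the tanh curve are all assumptions picked for illustration.

```python
import math

def soft_clip(x, rail):
    """Assumed smooth saturation curve, not a model of a specific device."""
    return rail * math.tanh(x / rail)

def peak(x):
    return max(abs(v) for v in x)

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical signal: a quiet tone (the "juice") plus a brief full-scale transient
signal = [0.1 * math.sin(2 * math.pi * n / 50) for n in range(1000)]
for n in range(500, 505):
    signal[n] = 1.0

trimmed = [soft_clip(v, rail=0.3) for v in signal]

# Make-up gain: bring the trimmed peak back up to the original ceiling
makeup = peak(signal) / peak(trimmed)
louder = [v * makeup for v in trimmed]

print(f"make-up gain : {makeup:.2f}x")
print(f"peak before  : {peak(signal):.3f}   after: {peak(louder):.3f}")
print(f"rms  before  : {rms(signal):.3f}   after: {rms(louder):.3f}")
```

The peak ceiling is the same before and after, but the average level is considerably higher, which is one way to read "louder and fuller" in numbers.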
What one wants is there and what one does not want is out. This is happening in real time or at the speed of light.
Then the signal is ready for a nice digitization in an A to D converter.
We know this when we listen to masterpieces recorded many decades ago, where the signal has passed many times through electron tube preamps and analog recorders.
A more modern example is watching Amy Winehouse recording Back to Black on YouTube.
It should be possible too using as little equipment as possible. A simple experiment was performed using a Pleiades one-stage electron tube prepreamp, the V0. A Beyer M55 mic was connected to it, with the V0 directly feeding a digital M-Audio Microtrack recorder at high gain.
Then another recording was made using the high signal to noise ratio and boosted output of the mic from the V0 prepreamp to drive a world class preamplifier, the EMI RS61. The RS61 was generously driven and its output was attenuated by a variable constant impedance pad to drive the digital recorder. The difference was amazing. The RS61 made the sound from the M55 and V0 much fuller, louder, with produced quality. An amazing difference. See a previous post for a full description of the experiment [9].
Signal path: M55 - Pleiades V0 - EMI RS61 - EMI RS106A attenuating - M-Audio Microtrack - Sennheiser HD580 
Can this analog domain processing be done using even fewer than those 4 class A stages of amplification (excluding what happens inside the A to D, should there be any analog amplifier in there)?
The Pleiades V5 has 2 stages of battery powered EF183 electron tubes. It seems one more is needed and a variable resistor after the first stage. With electric bass of course 2 stages were enough to drive it as fully as wanted until it compresses in a musical way.
But is there a way of doing it so that it sounds full, a ready product with just one stage of amplification from mic to output?
A Pleiades K117 was also supplied with a ridiculously small voltage from two batteries connected back to back. It was used as a bass preamp and the sound was interesting. Also 2 or more stages of the K117 JFET were tried with 2.4 or so volts to amplify a mic with an input transformer. At 2 stages the sound is richer. At 3, distorted. An attenuator after the 1st stage is needed. But experiment showed multiple stages of electron tubes giving a better, more enchanting sound with less obvious distortion.
Is this also because tubes do it better by creating a side chain compressor signal by the negative rectification bias from the diode effect? [10]
This can be seen on a U47 prepreamp with a signal generator and XY connected oscilloscope. Input to VF14 prepreamp connected as X axis and output as Y axis. The slope clearly changes as the signal is increased above a threshold. But it is still a line not a curve. It is compression, the gain changes. The 60MΩ resistor then discharges the grid capacitance and a release time constant is created. This happens at loud signals. But peaks should have their effect too.
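The release time constant is simply R times C. The 60 MΩ comes from the text, but the capacitance being discharged is not given, so the values below are purely hypothetical and only show how the release time scales.

```python
import math

R_GRID = 60e6  # ohms, the grid resistor value quoted in the text

def release_time_constant(capacitance_farads, resistance_ohms=R_GRID):
    """RC time constant in seconds."""
    return resistance_ohms * capacitance_farads

def remaining_fraction(t_seconds, tau_seconds):
    """Fraction of the rectified bias still present t seconds after the loud passage."""
    return math.exp(-t_seconds / tau_seconds)

# Hypothetical capacitances in picofarads, NOT measured values
for pf in (10, 50, 100):
    tau = release_time_constant(pf * 1e-12)
    print(f"{pf:4d} pF -> tau = {tau * 1e3:.1f} ms, "
          f"bias left after 5 ms: {remaining_fraction(5e-3, tau):.2f}")
```

With capacitances in this range the release would take a few milliseconds at most. The real figure depends on the actual capacitance in the circuit, which would have to be measured.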
Is it possible to replicate this with a JFET by connecting a diode across its input?
Another experiment done was constructing the Pleiades V1. Just a one stage prepreamp with an electrometer tube. Both heaters and anode operating from a 1.2 volt rechargeable AAA battery. The sound was very nice. There was some hiss and microphonics.
The point is, instead of using 200V and needing many stages until the signal becomes big enough for saturation effects to take place, to use a small Vb and get these effects as soon as possible (playing with bias too), with a smaller signal out, ready to feed an ordinary mic preamp or an A to D converter. The Neumann U47 uses just 34V at the anode of the VF14. On the Pleiades prepreamps the supply is 10 times less, in an attempt to have the nice effects happening even with soft singing voices. So far it has been found that just one Pleiades stage is not enough. It sounds great but it may be too clean; it may be suitable for recording loud instruments like trumpets or drums? More experiments can be made by changing the bias. The positive thing is that a Pleiades single electron tube front end such as the V0 can make a world class preamp really sing, as it would with the high output and high quality of a Neumann U47.
The point of the exercise is creating the best possible sound with a circuit very similar to the U47 while saving a lot of space, expense, equipment power and weight. Would artists be able at some point to have the best, no-compromise quality produced sound by just connecting a small battery powered box to a portable recorder or iPhone? Why should it not be as simple as top quality canvas, brush and paint?
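As a rough back-of-the-envelope sketch, assume that the signal level at which saturation begins scales in proportion to the supply voltage. That is a simplification; bias and the device curve matter too. Under that assumption, lowering the supply moves the onset down by 20·log10 of the voltage ratio.

```python
import math

def onset_shift_db(v_high, v_low):
    """How many dB earlier saturation starts, assuming onset scales with supply."""
    return 20 * math.log10(v_high / v_low)

U47_ANODE_V = 34.0              # from the text
PLEIADES_V = U47_ANODE_V / 10   # "10 times less", from the text
CONVENTIONAL_V = 200.0          # the 200V figure from the text

print(round(onset_shift_db(U47_ANODE_V, PLEIADES_V), 1))     # relative to the U47
print(round(onset_shift_db(CONVENTIONAL_V, PLEIADES_V), 1))  # relative to a 200V stage
```

Under this assumption the effects would start about 20 dB lower than in a U47 and about 35 dB lower than in a 200V stage, which fits the aim of reaching them with soft voices and fewer stages.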
References:
[1] Flat frequency response from vocal cords to brain, Motion pictures sound recording and reproducing curves - Loye, Morgan, Journal of motion pictures sound engineers
[2] http://education.lenardaudio.com/en/12_amps_5.html
[3] http://normanschmidt.net/scores/bachjs-general_bass_rules.pdf 
[4] Applied Electronics - T.S. Gray - MIT 
[5] Tubes vs Transistors, is there an audible difference - Russell O. Hamm - JAES
[6] http://www.jazz.fm/index.php/music-a-video-mainmenu/jazzwax-mainmenu-153/3937-interview-phil-ramone-part-3
[7] Vangelis Qatar interview on YouTube
[8] Making Records - Phil Ramone
[9] Pleiades V0 with EMI RS61 - euroelectron post
[10] operating features of the Audion - Edwin H. Armstrong
Electronics a systems approach - Neil Storey