It might help you too if you can get a PDF of the IRF520 datasheet. It's a bit dry to read, but there are several key pointers that stand out.
One, the Bi-polar Paradox. Back then the Smith Chart was used to plot out the input admittance of a particular transistor. That admittance was the means to match the part to the signal path, so it could amplify or mix that signal and pass it along. Now, if you can't find the part, or it's made by a company that would rather sell you an automotive MOSFET for high-voltage gating of a spark plug, you're gonna be hard pressed to use that supplier - even though the part MIGHT fit the profile it's labeled as. The Base region, through its biasing, let you tailor the output and, depending on the part you selected, gave you a knee you could use to obtain nearly uniform, linear amplification. You don't get that readily with FET-based designs.
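For anyone who wants to see what the Smith Chart was doing with that admittance, here is a rough Python sketch. It turns an input admittance into a reflection coefficient (the point you would plot on the chart) and a VSWR. The 50 ohm system impedance and the sample admittance are my own assumptions for illustration, not figures from any datasheet.

```python
def reflection_coefficient(y_in, z0=50.0):
    """Reflection coefficient for an input admittance y_in (siemens).

    z0 is the system impedance; 50 ohms is an assumed default.
    Uses gamma = (Y0 - Y) / (Y0 + Y), with Y0 = 1 / z0.
    """
    y0 = 1.0 / z0
    return (y0 - y_in) / (y0 + y_in)


def vswr(gamma):
    """Standing wave ratio for a reflection coefficient gamma."""
    mag = abs(gamma)
    if mag >= 1.0:
        return float("inf")  # total reflection, nothing delivered
    return (1.0 + mag) / (1.0 - mag)


# Hypothetical input admittance: 10 mS, purely resistive (100 ohms).
y_example = complex(0.01, 0.0)
gamma = reflection_coefficient(y_example)
print("gamma =", gamma)
print("VSWR  =", vswr(gamma))
```

A perfectly matched part lands at the center of the chart (gamma of zero). The further out the point sits, the more matching network you need in front of the device.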
Second, MOSFET - the Spongy Trampoline - at least that's what it will act like if you get the wrong replacement part. Bipolars were analog, direct-current munchers. MOSFETs are more like tubes: they use a Gate like a Grid, only the Cathode and Anode were renamed Source and Drain (don't ask). They handle current quite well and act more like a switch than an analog device, and the insulated Gate again makes the part act more like a tube than the Bi-polar silicon we were used to. The Drain-to-Source connection, though, is direct and doped - formed and shaved to a thickness to perform like a switch - a VERY DARN FAST switch.
Thirdly - our biggest benefit was the advent of Class C and Class D (and above) types of amplification. We can take an analog signal, mix it in with the bias, apply it across the Drain-to-Source connection, and only make the Gate switch it on and off 27 million times a second. One full on-to-off cycle at that rate lasts about 37 NANOseconds, so each pulse is a very fast spike. Another advantage is the "built-in capacitance" these MOSFETs have. Granted, capacitance is a bane and can be an unwanted side effect, and the way the Gate's capacitance works against the Drain-to-Source junction makes PROPER SELECTION paramount. But knowing the on and off times, let alone the rise and fall rates (skew and slew), can give you a knee similar to the Bi-polar's. It takes a trick to do it and no Smith Chart can help - in most cases it's still a best-guess game, getting the best results by trial and error.
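To put rough numbers on the timing and capacitance points, here is a small Python sketch. It works out how long one cycle lasts at 27 MHz, how much of that cycle the rise and fall edges eat up, and what reactance a given input capacitance presents at that frequency. The capacitance and edge times are placeholders I picked for illustration; pull the real figures from the datasheet for whatever part you are looking at.

```python
import math


def period_ns(freq_hz):
    """Length of one full cycle, in nanoseconds."""
    return 1e9 / freq_hz


def capacitive_reactance_ohms(freq_hz, cap_farads):
    """Magnitude of a capacitor's reactance: 1 / (2*pi*f*C)."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_farads)


def transition_fraction(freq_hz, t_rise_s, t_fall_s):
    """Share of one cycle spent in the rise and fall edges."""
    return (t_rise_s + t_fall_s) * freq_hz


FREQ_HZ = 27e6     # 27 million cycles a second
C_IN_F = 360e-12   # ASSUMED input capacitance, placeholder only
T_RISE_S = 10e-9   # ASSUMED rise time, placeholder only
T_FALL_S = 8e-9    # ASSUMED fall time, placeholder only

print("period (ns):        ", period_ns(FREQ_HZ))
print("gate reactance (ohm):", capacitive_reactance_ohms(FREQ_HZ, C_IN_F))
print("cycle spent in edges:", transition_fraction(FREQ_HZ, T_RISE_S, T_FALL_S))
```

If the last number creeps toward 1, the part spends most of every cycle partway on rather than cleanly on or off, which is where the heat comes from. A low gate reactance means the driver has to shove real current into the Gate every cycle.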
Lastly - bias itself. The early Bi-polar brothers use current rushing thru the line, with the voltage set quite low - just enough to keep the base region off, but enough there for the RF wave to spill over into the base region, and away she goes. The current rushing thru helps keep the Bi-polar forward conducting until the cycle swings into the positive side, where the base doesn't need the bias reserve pool - just the power of the incoming wave. This current also provides a means to wash away noise and cutoff events that can ruin the waveform and even damage the Bi-polar. In a MOSFET the "bias" has to be set as a VOLTAGE level (with some current to cleanse off noisy switching spikes and other artifacts), and that voltage level is pretty tricky. Too low, and you are chopping the RF wave and cutting off your signal. Too high, and she's latched on and in a meltdown - not unlike a Slo-Blo fuse, she can take some hits but will pop, and then you're done. On top of that, you can't apply too much power as voltage or current, for it will puncture the Gate - and no, sorry, that's not the way you make a transistor. The insulation between the Gate and the Substrate (the area of the Source and Drain) is an oxide, which means it is temperature sensitive, and if it is run too hard or too hot, breakdown occurs.
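The "too low and you chop the wave, too high and she's latched on" tradeoff can be sketched in Python as a conduction angle. This is an idealized model of my own, not anything from a datasheet: it assumes a clean sine wave riding on the bias and a device that conducts the instant the Gate voltage is above a fixed threshold. The voltages in the example are made up.

```python
import math


def conduction_angle_deg(v_bias, v_peak, v_threshold):
    """Degrees per cycle that the gate drive sits above threshold.

    Idealized assumption: gate voltage = v_bias + v_peak * sin(theta),
    and the device conducts whenever that exceeds v_threshold.
    """
    if v_peak <= 0:
        # No RF drive at all: either always on or always off.
        return 360.0 if v_bias > v_threshold else 0.0
    x = (v_threshold - v_bias) / v_peak
    if x <= -1.0:
        return 360.0  # bias so high the wave never dips below threshold
    if x >= 1.0:
        return 0.0    # bias so low the wave never reaches threshold
    return math.degrees(2.0 * math.acos(x))


# Hypothetical numbers: 4 V threshold, 2 V peak RF swing at the Gate.
for v_bias in (1.0, 3.0, 4.0, 5.0, 7.0):
    angle = conduction_angle_deg(v_bias, 2.0, 4.0)
    print(f"bias {v_bias:.1f} V -> conducts {angle:.0f} degrees per cycle")
```

Zero degrees is the chopped-off case, 360 degrees is the always-on case that cooks the part, and the usable settings live in between. Real devices turn on gradually rather than at a hard threshold, so treat this as a starting point for trial and error, not a design rule.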
So, twist a fresh LED "easy on the eyes" light into your reading lamp and crack open some PDFs about MOSFETs. As the Bi-polars that once were the norm are drained off, lots of the dies, the artwork, and the art of making them work will be gone forever, and the MOSFET is the replacement - a high-speed switch taking the place that Bi-polar transistors once held.
Good luck
:+> Andy <+: