One way to pose the question you are trying to answer with your specimens is "how much variability can we accept?" There is an actual strength value that you think these specimens can achieve, and there is a lower value that you can accept for design purposes. If the two values are really close, then you will have to test many specimens to prove that 95% of the time your insert will have the required strength. OTOH, if you need a design value that is just 33% of the possible strength of the insert, then you may only need to test a few specimens to have that same confidence. Testing more specimens doesn't make them stronger - they just look stronger in the end, because the higher confidence in the strength value increases what you're allowed to use for design. Statistics - it can be a game of words.
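To put rough numbers on that, here is a minimal Python sketch of how the knockdown shrinks as the specimen count grows. It assumes normally distributed strengths and a B-basis style criterion (90% of the population, 95% confidence); those settings, the function names and the 1000 lb / 100 lb figures are my own illustration, not anything from your program. The formula is a common closed-form approximation to the one-sided tolerance factor, and it is rough for small n, so use published tables or the method your authority accepts for real allowables.

```python
from math import sqrt
from statistics import NormalDist


def tolerance_factor(n, coverage=0.90, confidence=0.95):
    """Approximate one-sided tolerance factor k for a normal population.

    Closed-form approximation only; the exact factor comes from the
    noncentral t distribution. Defaults are assumed B-basis style values.
    """
    z_p = NormalDist().inv_cdf(coverage)    # quantile for population coverage
    z_g = NormalDist().inv_cdf(confidence)  # quantile for confidence level
    a = 1.0 - z_g**2 / (2.0 * (n - 1)) if n > 1 else 0.0
    if a <= 0.0:
        raise ValueError("too few specimens for this approximation")
    b = z_p**2 - z_g**2 / n
    return (z_p + sqrt(z_p**2 - a * b)) / a


def design_allowable(mean, std_dev, n, **kwargs):
    """Design value = sample mean minus k sample standard deviations."""
    return mean - tolerance_factor(n, **kwargs) * std_dev


# Hypothetical pull-out results: mean 1000 lb, standard deviation 100 lb.
# Mean and scatter are held fixed here to isolate the effect of n alone.
MEAN_LB, STD_LB = 1000.0, 100.0
for n in (5, 10, 30, 100):
    k = tolerance_factor(n)
    print(f"n={n:3d}  k={k:.2f}  allowable={design_allowable(MEAN_LB, STD_LB, n):.0f} lb")
```

The trend is the point: same specimens, same scatter, but the allowable creeps up toward the mean as n grows. If your required design value sits far below the mean, even the large k you get from a handful of specimens may leave you enough margin.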
You will also have to test in different load directions: tension, shear, combined tension+shear and so on.
If you have combinations of numerous sizes of insert and/or panel to test, you may be able to ask for an alleviation in the following way: Say you have 3 inserts of sizes A, B, and C. You also have 2 panel types X and Z, which you use in 2 thicknesses, and you expect to need 8 specimens for each test configuration. Do you have to test all 96 specimens (3 x 2 x 2 = 12 configurations, 8 specimens each)? Maybe not. You may be able to interpolate the properties of B if it's in between sizes A and C. You may be able to select only panel type X for tests if its core is much lighter than panel type Z, maybe doing just a few tests in panel Z to show that it makes stronger insert joints. Same thinking with panel thicknesses, and so on. You may also consider testing a larger number of insert A and then showing that inserts B and C will have the same confidence even if you don't test as many of them...
All fodder for discussion with your FAA engineering authority BEFORE you plan these tests.
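If it helps to lay the test matrix out before that discussion, here is a small Python sketch that counts the full plan and one possible reduced plan. The thickness labels, the choice of 3 confirmation specimens in panel Z, and the function name are all hypothetical placeholders; what actually gets accepted is up to your authority.

```python
from itertools import product

INSERTS = ["A", "B", "C"]
PANELS = ["X", "Z"]
THICKNESSES = ["thin", "thick"]  # hypothetical labels for the 2 thicknesses
SPECIMENS_PER_CONFIG = 8
CONFIRMATION_SPECIMENS = 3       # assumed meaning of "just a few tests"

# Full matrix: every insert in every panel type and thickness.
full_matrix = list(product(INSERTS, PANELS, THICKNESSES))
full_specimens = len(full_matrix) * SPECIMENS_PER_CONFIG


def reduced_plan():
    """One possible alleviation: skip B (interpolated between A and C),
    test panel X fully, and run only confirmation tests in panel Z."""
    plan = {}
    for insert, panel, thickness in full_matrix:
        if insert == "B":
            continue  # argued by interpolation, not tested
        count = SPECIMENS_PER_CONFIG if panel == "X" else CONFIRMATION_SPECIMENS
        plan[(insert, panel, thickness)] = count
    return plan


plan = reduced_plan()
print(f"full: {len(full_matrix)} configurations, {full_specimens} specimens")
print(f"reduced: {len(plan)} configurations, {sum(plan.values())} specimens")
```

Remember that these counts are per load direction, so tension, shear and combined cases each multiply the totals.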
Have you looked at Chapter 9 of AR-MMPDS?
STF