Designing Research and Demonstration Tests for Farmers' Fields (B 1177)

University of Georgia Extension

Richard F. Davis, Extension Nematologist
Glen H. Harris, Extension Agronomist
Phillip M. Roberts, Extension Entomologist
Greg E. MacDonald, Extension Agronomist

As farmers and farm advisers, you make many management choices during a growing season. For example, you may have to decide which cultivar to plant, which herbicide to use, how frequently to apply a fungicide, and what rate of nematicide to use. Often the information needed to make the best decision is available to you, but when it is not available you can frequently compare the options by conducting your own small experiments. Your experiments can be just as valid as any university study if you follow a few important principles of experimental design.

All experiments have certain things in common, so designing an experiment usually includes the following steps. You must decide what question you want to have answered. This is the goal, or objective, of the experiment. The goal of the experiment will dictate what to include in the experiment to help you answer your question. The individual things that you wish to test in your experiment are called "treatments" and the physical areas to which the treatments are applied are called "plots." Then you need to decide how the treatments should be physically arranged in the field. Technically, this is what is called the "experimental design."

Experiments answer your original question by allowing you to make unbiased comparisons among the treatments you selected. You will need some way to evaluate how well each treatment worked to make comparisons among treatments. The information you collect to help you make those comparisons (such as yield, insect counts, or disease severity) is called "data." Finally, you need an objective way to evaluate the data. This is usually done through statistical analysis.

Selecting Treatments

The objective, or purpose, of the study will determine the treatments included in an experiment. Writing down the test objectives is helpful because this forces you to define them precisely. A test may have more than one objective, although multiple objectives should be closely related.

The selection of treatments is usually logical if you can define the purpose of the study; all treatments necessary to address the test's objective should be included. For example, if the purpose were to determine which of five fungicides works the best, then the treatments would include all five of those fungicides. If the purpose were to determine if any of the five fungicides works better than your current choice, then the treatments would include the five fungicides plus the fungicide you currently use. Accurately stating the purpose of the test before the treatments are applied in the field is critical. After the treatments have begun, it will be too late to add other treatments to answer the question you really wanted to address.

The selection of treatments and the experimental design get more complicated as the question you are trying to answer gets more complex. It is common to want to test in the same experiment two (or more) things that influence crop production. For example, you may want to test chicken litter as a fertilizer and test five corn hybrids to maximize yield. The specific questions addressed in this case are:

  1. What effect does chicken litter have on corn production?
  2. What effect do the hybrids have on corn production?
  3. Does chicken litter have the same effect on each hybrid?

The third question may not be as obvious as the first two, but it will always be asked or implied if you are testing two or more factors in the same experiment. In this example, you have to determine what the effect of chicken litter is on each hybrid and then compare those effects to each other. To do that, the treatment list must include each hybrid without chicken litter and each hybrid with chicken litter (a total of 10 treatments). With this list of treatments, you can make the comparisons necessary to answer all three questions. This example employs a "factorial arrangement of treatments" that will be discussed in more detail in a later section.
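The 10-treatment list in the chicken litter example can be written out mechanically by pairing every hybrid with every litter level. The short Python sketch below is only an illustration; the hybrid names are placeholders, not real hybrids.

```python
from itertools import product

# Placeholder names for the five corn hybrids (hypothetical labels).
hybrids = ["Hybrid 1", "Hybrid 2", "Hybrid 3", "Hybrid 4", "Hybrid 5"]
# The two levels of the second factor.
litter_levels = ["no chicken litter", "chicken litter"]

# Every hybrid is paired with every litter level: 5 x 2 = 10 treatments.
treatments = [f"{hybrid}, {litter}" for hybrid, litter in product(hybrids, litter_levels)]

for number, treatment in enumerate(treatments, start=1):
    print(number, treatment)
```

If any one of these combinations were left out, the effect of chicken litter on that hybrid could not be compared with its effect on the others.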

One final note about treatment selection involves including additional treatments to provide a relative measure of effect. Comparing the yield of five new corn hybrids does little good if you cannot tell how those yields compare with the hybrids you already grow. You should include at least one hybrid with which you are already familiar (often called a "standard" treatment) to provide a relative measure of how well the new hybrids produce. If you wish to test a new nematicide, you should include a treatment with the currently used nematicide and a treatment with no nematicide as a basis for comparison. Such treatments are known as "checks" or "controls." Without the proper controls, you will not be able to say that the new nematicide worked better than the currently used nematicide or even that the new nematicide worked better than no nematicide! The questions you wish the experiment to answer should indicate what treatments should be included as controls.

It is frequently desirable to have both a positive and a negative control in an experiment. The negative control helps you determine whether the treatments being tested work better than some minimal treatment (or nothing), and the positive control helps you determine whether they work better than the current standard practice. You may have several control treatments in an experiment if you currently have several viable options from which to choose. For example, if you currently can choose either of two fungicides to control a leafspot problem, you may wish to include them both as controls in your experiment when you test new products. You do not have to include all currently available options as controls for the experiment to be useful, but you can.


Replication

In an experiment, replication means that individual treatments (such as each of the five pesticides being tested in an experiment) have been applied to more than one plot. Replication is necessary because no two test plots are identical, and that leads to variation in the data you collect; you will not get exactly the same results from two plots that received the same treatment. You can take steps to minimize the effect of variation if it has an identifiable cause, but there will always be some variation among plots that cannot be controlled. In statistical terms, uncontrolled variation is called experimental error. The purpose of replication is to allow you to make a more accurate estimate of how each treatment performed even though there is uncontrolled variation in the experiment. This can best be shown in an example.

Suppose you have 10 rose bushes and you want to test if a new fungicide will protect the bushes from black spot, a fungal leaf disease. You could pick five plants to leave untreated as a control and spray the other five with the fungicide. Later, when black spot is evident on the leaves, you count the number of diseased spots on each plant and compare the two treatments. The five untreated plants have 26, 21, 19, 25, and 23 infected spots (a treatment mean, or average, of 22.8 spots per plant), and the fungicide treated plants have 20, 15, 18, 21, and 20 spots (a mean of 18.8). Statistical analysis indicates that the fungicide did in fact reduce the number of infected sites, but you would not be able to determine that if you only had one treated and one untreated plot. Think about the following examples based on the data above. If you only had two plants and the untreated plant had 26 spots and the treated plant had 15 spots, it may seem easy to determine that the fungicide reduced disease. But what if the untreated plant had 19 spots and the treated plant had 21 spots? You might conclude that the fungicide did not work or even that it increased disease! Adequate replication can minimize this problem.
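The bulletin does not say which statistical analysis was applied to the rose data. As one possible illustration, the Python sketch below computes the two treatment means and a pooled two-sample t statistic, a common choice for comparing two treatments. The choice of test and the 5% significance level are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance

# Spot counts per plant, taken from the rose example above.
untreated = [26, 21, 19, 25, 23]
treated = [20, 15, 18, 21, 20]

mean_untreated = mean(untreated)
mean_treated = mean(treated)

# Pooled estimate of the plant-to-plant variance within a treatment.
n1, n2 = len(untreated), len(treated)
pooled_var = ((n1 - 1) * variance(untreated) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)

# Standard error of the difference between the two treatment means.
se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean_untreated - mean_treated) / se_diff

# Two-sided 5% critical value of Student's t with 8 degrees of freedom,
# copied from a standard t table (assumed significance level).
T_CRIT_DF8 = 2.306

print(f"untreated mean = {mean_untreated:.1f}, treated mean = {mean_treated:.1f}")
print(f"t = {t_stat:.2f}, critical value = {T_CRIT_DF8}")
print("difference detected" if abs(t_stat) > T_CRIT_DF8 else "no difference detected")
```

With only one plant per treatment, there would be no way to estimate the variance within a treatment, so a calculation like this could not be done at all.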

It is common when you have several replications of each treatment to have data like that in the rose example above: The treatment means are different but individual measurements may overlap. In this example, the lowest measurement from the untreated plants was 19, and the highest measurement from the fungicide treated plants was 21, but the treatment means were 22.8 for the untreated plants and 18.8 for the treated plants. Replication of treatments increases your ability to detect differences in treatment means. Having more replications allows you to identify (statistically) smaller differences in treatment means than you could identify with fewer replications.

The number of replications that you need is influenced by the biology of what you are testing, how close together the treatment means are, and how much variation exists within a treatment. For field tests in plant pathology, nematology, weed science, soil fertility studies, and entomology, a minimum of four replications is suggested, but five or six replications are much better. If treatment means are close together or variation is relatively large among the plots that received the same treatment, then you may need more replications to detect differences among treatments.
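To see why more replications help, consider the standard error of the difference between two treatment means, which for r replications of each treatment is sqrt(2 × variance / r). The Python sketch below applies that formula to the variance estimated from the rose example; it assumes the plant-to-plant variance stays the same as replications are added, which is a simplification.

```python
from math import sqrt
from statistics import variance

# Spot counts from the rose example (five plants per treatment).
untreated = [26, 21, 19, 25, 23]
treated = [20, 15, 18, 21, 20]

# Pooled within-treatment variance (equal sample sizes, 4 df each).
pooled_var = (4 * variance(untreated) + 4 * variance(treated)) / 8


def se_of_difference(plot_variance, reps):
    """Standard error of the difference between two treatment means."""
    return sqrt(2 * plot_variance / reps)


# A smaller standard error means smaller differences can be detected.
for reps in (2, 4, 6, 8):
    print(f"{reps} replications: standard error = {se_of_difference(pooled_var, reps):.2f}")
```

The standard error shrinks with the square root of the number of replications, so each added replication helps a little less than the one before it.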

Just as the data may vary within a replicated treatment, the results may vary among experiments if the whole experiment is repeated. This can happen because of different weather conditions, different disease or insect pressure, or many other factors beyond your control. This does not mean that the results of a single experiment are not valid, but it does make it dangerous to draw conclusions from a single experiment. The one set of results you have may indicate treatment differences, but if you repeated the test several times you might not see those treatment differences again. If the test is repeated (and that means you cannot change any of the treatments) and you get similar results, then you can be much more confident that your conclusions are correct.


Randomization

Figure 1. Test plots on the left are not randomized. Plots on the right are randomized. The numbers (1-5) represent the five treatments in this test.

Randomization in an experiment means that the treatments are assigned to plots with no discernible pattern to the assignments. The reason randomization is important is that the positioning of treatments within the block may affect their performance. One example of this is an experiment testing five corn hybrids (labeled 1 through 5) in which you plant the hybrids in the same order in each block: 1, 2, 3, 4, then 5 (see figure 1). If hybrid 2 is naturally much taller than the others, it can slightly shade the hybrids planted next to it (hybrids 1 and 3) and unfairly make them look a little bit worse than they would look if they were not planted next to hybrid 2. Another example is a field in which soil fertility gets progressively lower as you cross the field from east to west so that productivity is reduced as you go from one side of the field to the other. If two corn hybrids are planted side by side and, within every block, hybrid 1 is always planted on the east side of hybrid 2, then hybrid 1 is always planted in slightly more fertile soil and therefore has an unfair advantage.

In both of the preceding examples, randomization could have prevented the unintentional bias because the arrangement of the treatments would have been different within each block. Because you cannot anticipate all the influences that may introduce bias into a test, ALL experiments should be randomized. There are many ways to randomize treatments within a block, but the simplest is literally to pull the numbers out of a hat. Assign each treatment a number, write the numbers on individual pieces of paper, mix the slips of paper up, and then select the slips one at a time without looking at them first. The order in which the numbers are drawn is the order in which they will be arranged in a block. Repeat these steps for each block in the experiment.
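If you prefer a computer to a hat, the same procedure can be sketched in a few lines of Python. The function name, the choice of four blocks, and the seed value are all assumptions for this illustration; the seed only makes the printed layout repeatable.

```python
import random


def randomize_blocks(treatments, n_blocks, seed=None):
    """Return one independently shuffled treatment order per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)   # a fresh copy for each block
        rng.shuffle(order)         # the equivalent of drawing slips from a hat
        layout.append(order)
    return layout


# Five treatments, as in figure 1; four blocks is an assumed number.
layout = randomize_blocks([1, 2, 3, 4, 5], n_blocks=4, seed=2024)
for block_number, order in enumerate(layout, start=1):
    print(f"Block {block_number}: {order}")
```

Each block contains every treatment exactly once, and the order is drawn separately for each block.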

If you wish to use the experiment as a demonstration plot (such as for a field day), it is common that one block not be truly randomized. This is done so that particular treatments can be seen side-by-side to facilitate comparisons and highlight differences for casual observers. Though it is better to randomize all blocks and not intentionally arrange treatments, arranging the treatments in one block is unlikely to affect the test's results as long as the other blocks are truly randomized.

Plot Size

A plot, the area to which an individual treatment is applied, can be any size, including a single plant growing in a pot or 5 acres or more of a field. Before you can apply treatments to your test area, you must decide how large your plots should be. Although there is a lot of subjectivity in selecting plot size, there are some important considerations including the equipment to be used in planting, harvesting, and treatment application; how much space is available for use in the experiment and how many other treatments there are; and the biology of what you are studying. Accommodating equipment and space concerns makes it easier to conduct the test. Accommodating biological concerns reduces the chances of overlooking differences among treatments. Equipment and space considerations are usually easy to identify, but biological considerations are not always obvious.

If you have equipment to plant, harvest, and apply treatments to four rows at a time, then the logical plot width would be some multiple of four rows (four rows, eight rows, 12 rows, etc.). Any other width (such as six rows) would make it more difficult to conduct the experiment. The length that plots should be is more flexible than plot width. For example, if you plan to weigh the harvest from each plot, the scales you have may influence the length plots should be. If you have scales that are designed for weighing hundreds of pounds, your plots should be large enough to provide a harvest weight that can be accurately determined by your equipment, and increasing the length of plots is an easy way to do that. Also, the length of your plots may be adjusted so that all of your plots (all replications of all treatments) will fit into the area available for your test. If you have a large area for your test, space may not be an important consideration.

To accommodate biological considerations, you should answer two questions:

  1. How large a plot is needed to observe the biological effect (disease severity, insect damage, weed frequency, nematode population levels, etc.) that you are studying?
  2. How large a plot is needed to minimize the influence of a treatment (chemical application, etc.) on the plots next to it?

By answering these questions, you can determine the minimum plot size necessary to get useful data from the experiment.

To get an accurate measurement of the effect of pest management treatments, the plot must be large enough to account for uneven initial distribution of the pest (pathogen, insect, weed, etc.). Some areas may start with the pest present, but the pest may occur in other areas only after it has spread from its initial location. This is very important for pests that spread very slowly (such as most soilborne organisms).

Some diseases and pests are highly mobile and spread very rapidly (such as many insects). In an insect management trial, measuring the effect of a treatment can be very difficult if your plots are too small because the insects that you see in the plot may have simply spread from the plot next to it. To minimize this problem, you can increase your plot size and then collect data from the middle section of the plot. For example, you might have an eight-row plot but only collect data from the middle four rows. The rows from which you do not collect data are often referred to as "buffer rows" because they buffer the effect of the neighboring plots. If you do not use buffer rows when they are needed, you may fail to detect differences among treatments and incorrectly conclude that many treatments were ineffective. Buffer rows are frequently used when there is uncertainty whether treatments can influence nearby rows.
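The buffer-row arithmetic is simple, but it is easy to miscount rows in the field. A small Python sketch, using a hypothetical helper named `data_rows`, lists which rows of a plot would be harvested for data when an equal number of buffer rows is left on each side.

```python
def data_rows(plot_rows, buffer_rows_per_side):
    """Return the row numbers (counting from 1) that are used for data."""
    if buffer_rows_per_side < 0 or 2 * buffer_rows_per_side >= plot_rows:
        raise ValueError("buffer rows must leave at least one data row")
    first = buffer_rows_per_side + 1
    last = plot_rows - buffer_rows_per_side
    return list(range(first, last + 1))


# The eight-row plot from the text, with two buffer rows on each side.
print(data_rows(8, 2))
```

For the eight-row plot described above, this returns rows 3 through 6, the middle four rows.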

A similar concept involves the use of border rows along the edges of your test area. A significant "border effect" commonly exists at the edge of a field where the plants may grow differently than plants not at the edge. Although you may be able to minimize this problem with blocking, it is often better to eliminate the problem by not using the rows at the edge of a field in your experiment.

Once the plots are large enough to be representative of a much larger area, further increasing plot size will not significantly improve the accuracy of the results. For example, in an experiment testing fungicides for control of white mold, or stem rot, in peanut, a four-row-wide by 100-foot-long plot should be just as good as an eight-row-wide by 400-foot-long plot. Plots that are larger than necessary take more field space and may increase the amount of work required for an experiment, but they usually will not adversely affect the test results unless the plots get so large that the plots within a block are no longer uniform. Plots that are too small may prevent the accurate assessment of treatment effects. If the space available for an experiment is limiting, having more replications is usually more beneficial than having larger plots as long as your plot size allows accurate assessment of treatment effects.

Experimental Designs

Completely Randomized Design

The completely randomized design is the simplest experimental design. In this design, treatments are replicated but not blocked, which means that the treatments are assigned to plots in a completely random manner (as in the left side of figure 2). This design is appropriate if the entire test area is homogeneous (uniform in every way that can influence the results). Unfortunately, it is rare that you can ever be confident of a test site's uniformity, so a completely randomized design is rarely used in field tests. The completely randomized design is used more commonly in greenhouse tests, though blocking is often useful even in the more controlled environment of a greenhouse.
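A completely randomized layout can be sketched in Python by listing every plot's treatment and shuffling the whole list at once, with no blocks involved. The treatment labels follow figure 2; the four replications and the seed are assumptions for this illustration.

```python
import random


def completely_randomized(treatments, reps, seed=None):
    """Assign each treatment to `reps` plots, in random order over the whole test area."""
    rng = random.Random(seed)
    plots = [t for t in treatments for _ in range(reps)]
    rng.shuffle(plots)  # one shuffle across all plots; no blocking
    return plots


layout = completely_randomized(["A", "B", "C"], reps=4, seed=1)
print(layout)
```

Because nothing forces the replications of a treatment apart, several plots of the same treatment can land next to each other, which is one reason this design depends on a uniform test area.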

Figure 2. The shaded area represents an area of the field that is different from the unshaded area. Treatments (A, B, and C) are replicated but not blocked in the field on the left. On the right, treatments are replicated and blocked; each block contains one plot of each treatment.

Randomized Complete Block Design

The randomized complete block design is the most commonly used design in agricultural field research. In this design, treatments are both replicated and blocked, which means that plots are arranged into blocks and then treatments are assigned to plots within a block in a random manner (as in the right side of figure 2). This design is most effective if you can identify the patterns of non-uniformity in a field such as changing soil types, drainage patterns, fertility gradients, direction of insect migration into a field, etc. If you cannot identify the potential sources of variation, you should still use this design for field research but make your blocks as square as possible. This usually will keep plots within a block as uniform as possible even if you cannot predict the variation among plots.

Blocking refers to physically grouping treatments together in an experiment to minimize unexplained variation in the data you collect (referred to as experimental error). This allows the statistical analysis to identify treatment differences that would otherwise be obscured by too much unexplained variation in the experiment. Variation in an experiment can be divided into two types: variation for which you can account in the statistical analysis and variation that is unexplained. The goal in blocking is to allow you to measure the variation among blocks and then remove that variation from the statistical comparison of treatment means. If you can anticipate causes of variation, you can block the treatments to minimize variation within each block and remove some variation from the statistical analysis. The mathematics of how blocking allows you to reduce unexplained variation is beyond the scope of this bulletin.
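Without going into the full mathematics, a small numerical sketch can show the idea. The Python code below uses invented yields (three treatments in four blocks, with the blocks differing a great deal from one another) and the standard sums-of-squares partition for a randomized complete block layout. All numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical yields: one list per treatment, one value per block (blocks 1-4).
yields = {
    "A": [50, 54, 60, 64],
    "B": [53, 58, 63, 66],
    "C": [56, 59, 66, 71],
}
n_treatments = len(yields)
n_blocks = 4

all_values = [y for row in yields.values() for y in row]
grand_mean = mean(all_values)

# Total variation in the data.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)
# Variation explained by the treatments.
ss_treatment = n_blocks * sum((mean(row) - grand_mean) ** 2 for row in yields.values())
# Variation explained by differences among blocks.
block_means = [mean(column) for column in zip(*yields.values())]
ss_block = n_treatments * sum((b - grand_mean) ** 2 for b in block_means)

# Unexplained variation if blocks are ignored versus accounted for.
error_without_blocking = ss_total - ss_treatment
error_with_blocking = ss_total - ss_treatment - ss_block

print("unexplained variation, blocks ignored:  ", error_without_blocking)
print("unexplained variation, blocks accounted:", error_with_blocking)
```

In this made-up data set, nearly all of the variation that would otherwise be unexplained is due to differences among blocks, so accounting for blocks leaves very little experimental error to obscure the treatment differences.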

In the most common experimental designs, a block will contain one plot of each treatment in the experiment. If an experiment has five treatments, then each block will contain five plots, with each plot receiving a different treatment. When a block contains one plot of each treatment, then each block represents one replication of each treatment. For this reason, blocks are frequently referred to as "replications" or "reps," but the concept of blocking should not be confused with the concept of replication; replication and blocking serve different purposes. In agricultural research, field plots are almost always blocked even when no obvious differences are present in the field. It is much better to block when you did not really need to than not to block when you should have blocked.

Figure 3. An easy way to arrange blocks is to put them side by side across the field. Letters represent different treatments.

Blocking is a very powerful tool that is most effective if you can anticipate sources of variation before you begin an experiment. For example, in a herbicide trial, one side of a field may have a history of more severe weed problems. If you just scattered your treatments randomly through the field, a lot of the variation in the data you collected could be due to the increased weed pressure on one side of the field. Such variation would make it difficult to determine how well each treatment worked. Because you know one side of the field will have more weeds, you can remove that source of variation from the statistical analysis by blocking and improve your chances of identifying differences among treatments.

The process of blocking follows a logical sequence. First, you determine that there is something (weeds, drainage, sun/shadow, water, soil type, etc.) that is not uniform throughout the experimental area (field, greenhouse, etc.) that may influence whatever you are measuring (yield, plant height, etc.). Then you can arrange your treatments into blocks so that the area within each block is as uniform as possible (see figure 2). Though the area within a block should be relatively uniform, there may be large differences among the blocks, but that is what makes blocking effective. Your goal is to maximize the differences among blocks while minimizing the differences within a block.

The shape of the blocks is not important as long as the plots within a block are as uniform as possible. Ideally, the only differences among plots within a block should be due to the treatments. Blocks in field experiments are usually square or rectangular, but they may be any shape. Blocks in the same experiment do not have to be the same shape; the shape of individual blocks will be determined by variation in the field that you are trying to minimize. If you are not sure what shape your blocks should be, square or nearly square blocks are usually a safe choice.

Blocks may be arranged through the field in many ways. If the field is wide enough, an easy way to arrange blocks is to place them side-by-side all the way down the field (see figure 3). But blocks do not have to be contiguous and may be scattered through the field in any way that is convenient for you.

Factorial Arrangement of Treatments

A factorial arrangement of treatments is not an experimental design, though you will often hear it referred to as a factorial design or a factorial experiment. A factorial arrangement of treatments means that the experiment is testing two or more factors at the same time, and that the experiment includes all combinations of all factors. The term "factor" is used to describe a group of treatments that have something in common. Fungicides, sources of nitrogen, or corn hybrids could be considered factors in an experiment. Factors may be defined broadly or narrowly in different experiments. All herbicides may be grouped as a factor in one experiment, but pre-plant and post-plant herbicides may be treated as separate factors in another experiment. A single-factor experiment tests one factor at a time; a two-factor experiment tests two factors at once.

Figure 4. A 2x5 factorial arrangement of treatments in a randomized complete block design (above) and in a split-plot design (below). A and B represent two levels of one factor, and the numbers (1-5) represent five levels of a second factor. The combinations (e.g., 4A, 5B, etc.) denote individual treatment combinations. Either experimental design could be used, but the randomized complete block design is preferred unless the split-plot design is required by some limitation on randomization.

Most simple on-farm experiments are single-factor experiments (in a Completely Randomized or Randomized Complete Block design) and compare things such as crop varieties or herbicides, but it is sometimes useful to test two or more factors at once. For example, a two-factor experiment would allow you to compare the yields of five corn hybrids at three planting dates. This accomplishes three things at once:

  1. It allows you to compare the corn hybrids with each other.
  2. It allows you to evaluate the effect of planting date.
  3. It allows you to determine if varying the planting date changes the relative performance of the hybrids (e.g. one hybrid may only perform well if planted early).

The first two could be done in separate single-factor experiments, but the third can only be achieved by having both factors in a single experiment. This becomes especially important if one factor can have a significant influence on the effect of the other factor. For example, you might test soybean varieties as one factor and nematicides as another factor. If a few varieties have good nematode resistance but others do not, all of the varieties may appear equally good when effective nematicides are used, but the resistant varieties would appear much better when nematicides are not used. In cases like this, the effect of one factor (variety) is strongly influenced by the other factor (nematicide). When one factor influences the effect of the other factor, there is said to be a significant interaction between the two factors. It can be very important to know if there is an interaction between factors, because if there is an interaction, you can make predictions or recommendations based on the results of single-factor experiments ONLY when all other factors are at the same levels they were at in the experiment. If you change some factor not included in the experiment, the results from your single-factor experiment may no longer be valid.
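An interaction can be pictured as a "difference of differences." The Python sketch below uses invented mean yields for one resistant and one susceptible soybean variety, each grown with and without a nematicide. The numbers are hypothetical, and deciding whether an interaction is statistically significant would still require an analysis of replicated data.

```python
# Hypothetical mean yields (bushels per acre) for each variety and nematicide level.
mean_yield = {
    ("resistant", "nematicide"): 42,
    ("resistant", "no nematicide"): 40,
    ("susceptible", "nematicide"): 41,
    ("susceptible", "no nematicide"): 28,
}

# Effect of the nematicide within each variety.
nematicide_effect = {
    variety: mean_yield[(variety, "nematicide")] - mean_yield[(variety, "no nematicide")]
    for variety in ("resistant", "susceptible")
}

# If the nematicide helped both varieties equally, this contrast would be zero.
interaction_contrast = nematicide_effect["susceptible"] - nematicide_effect["resistant"]

print("nematicide effect by variety:", nematicide_effect)
print("interaction contrast:", interaction_contrast)
```

In this made-up example, the two varieties look nearly identical when the nematicide is used (42 versus 41), yet the nematicide matters far more for the susceptible variety, which is exactly the pattern described above.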

With a factorial arrangement of treatments, all values (or levels) of each factor must be paired with all levels of the other factors. If you have two nematicides and five soybean varieties, then your treatment list must include each variet

Status and Revision History
Published on May 1, 1999
Re-published on Feb 9, 2009
Re-published on Apr 28, 2009
Reviewed on Apr 25, 2012
Reviewed on Jan 30, 2017