Maxent model for Yellow-breasted Chat (YBCH)


This page contains some analysis of the Maxent model for YBCH, created Fri Jun 15 08:50:26 PDT 2007 using Maxent version 2.3.33.


Analysis of omission/commission

The following picture shows the omission rate and predicted area as a function of the cumulative threshold. The omission rate is calculated both on the training presence records and (if test data are used) on the test records. The omission rate should be close to the predicted omission, because of the definition of the cumulative threshold.
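As a rough illustration of how an omission rate at a given cumulative threshold could be computed, here is a minimal sketch. The function name and the scores are hypothetical; the only thing taken from this page is the idea that a presence record is "omitted" when its cumulative score falls below the threshold.

```python
def omission_rate(presence_scores, threshold):
    """Fraction of presence records whose cumulative score is below the threshold."""
    if not presence_scores:
        raise ValueError("need at least one presence record")
    omitted = sum(1 for score in presence_scores if score < threshold)
    return omitted / len(presence_scores)


# Hypothetical cumulative scores (0-100) at five presence sites.
scores = [3.0, 12.5, 40.0, 66.0, 91.0]
rates = {t: omission_rate(scores, t) for t in (1, 5, 10, 50)}
```

A strict comparison is used so that a record sitting exactly on the threshold counts as predicted, which matches a training omission rate of zero at the minimum training presence threshold.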


The next picture is the receiver operating characteristic (ROC) curve for the same data. Note that the specificity is defined using predicted area, rather than true commission (see the paper by Phillips, Anderson and Schapire cited on the help page for discussion of what this means). If the true distribution is equal to the Maxent distribution, then the maximum possible test AUC is 0.897 rather than 1, because the AUC is being calculated using background rather than true absence data.
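The sketch below shows one common way to compute an AUC from presence and background scores, as the probability that a randomly chosen presence site scores higher than a randomly chosen background site (ties count half). It is an illustration with made-up scores, not Maxent's own code. Because some background points fall where the species actually occurs, this quantity cannot reach 1 even for a perfect model, which is why the page reports a maximum possible test AUC below 1.

```python
def auc_presence_vs_background(presence, background):
    """Probability that a presence score exceeds a background score (ties count 0.5)."""
    if not presence or not background:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence:
        for b in background:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence) * len(background))


# Hypothetical model scores.
presence_scores = [0.9, 0.8, 0.4]
background_scores = [0.1, 0.4, 0.5, 0.2]
auc = auc_presence_vs_background(presence_scores, background_scores)
```

The double loop is quadratic in the number of points; with tens of thousands of background points a rank-based formulation would be preferable.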



Some common thresholds and corresponding omission rates are as follows. If test data are available, binomial probabilities are calculated exactly when the number of test samples is at most 25; otherwise a normal approximation to the binomial is used. These are 1-sided p-values for the null hypothesis that test points are predicted no better than by a random prediction with the same fractional predicted area. The "Balance" threshold minimizes 6 * training omission rate + 0.04 * cumulative threshold + 1.6 * fractional predicted area.
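The p-value described above can be sketched as follows. Under the null hypothesis, each test point falls inside the predicted area with probability equal to the fractional predicted area, so the p-value is the chance of predicting at least as many test points as were actually predicted. The function name, the `exact_limit` parameter, and the absence of a continuity correction in the normal branch are assumptions made for this illustration.

```python
import math


def one_sided_p_value(n_test, n_omitted, predicted_area, exact_limit=25):
    """P(at least n_test - n_omitted of n_test random points land in the predicted area)."""
    a = predicted_area
    if not 0.0 < a < 1.0:
        raise ValueError("predicted_area must be strictly between 0 and 1")
    hits = n_test - n_omitted
    if n_test <= exact_limit:
        # Exact upper tail of the binomial distribution.
        return sum(
            math.comb(n_test, k) * a**k * (1 - a) ** (n_test - k)
            for k in range(hits, n_test + 1)
        )
    # Normal approximation to the binomial (no continuity correction assumed).
    mean = n_test * a
    sd = math.sqrt(n_test * a * (1 - a))
    z = (hits - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


p_exact = one_sided_p_value(10, 0, 0.5)      # small hypothetical sample, exact branch
p_normal = one_sided_p_value(84, 2, 0.578)   # 84 test records, 2 omitted, area 0.578
```

The second call uses the figures for the cumulative threshold of 1 in this run (84 test records, test omission rate 0.024, fractional predicted area 0.578). Because the inputs are rounded, the result should only be expected to land in the same order of magnitude as the reported value.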

Cumulative threshold | Description | Fractional predicted area | Training omission rate | Test omission rate | P-value
1.000 | Fixed cumulative value | 0.578 | 0.000 | 0.024 | 7.746E-14
5.000 | Fixed cumulative value | 0.370 | 0.008 | 0.095 | 1.752E-24
10.000 | Fixed cumulative value | 0.262 | 0.027 | 0.167 | 5.865E-33
2.318 | Minimum training presence | 0.477 | 0.000 | 0.048 | 1.29E-18
20.521 | 10 percentile training presence | 0.167 | 0.098 | 0.262 | 5.334E-45
25.608 | Equal training sensitivity and specificity | 0.141 | 0.141 | 0.345 | 5.801E-42
18.199 | Maximum training sensitivity plus specificity | 0.182 | 0.067 | 0.226 | 2.634E-45
15.047 | Equal test sensitivity and specificity | 0.206 | 0.051 | 0.202 | 2.47E-41
13.160 | Maximum test sensitivity plus specificity | 0.223 | 0.039 | 0.167 | 2.332E-41
3.622 | Balance training omission, predicted area and threshold value | 0.417 | 0.004 | 0.071 | 1.018E-21
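As a quick check of the "Balance" criterion, the sketch below evaluates the stated objective on the rows of the table above (cumulative threshold, fractional predicted area, training omission rate). Maxent minimizes over all thresholds rather than just these rows, and the table values are rounded, so this only confirms that the Balance row scores lowest among the listed thresholds.

```python
# (cumulative threshold, fractional predicted area, training omission rate)
rows = [
    (1.000, 0.578, 0.000),
    (5.000, 0.370, 0.008),
    (10.000, 0.262, 0.027),
    (2.318, 0.477, 0.000),
    (20.521, 0.167, 0.098),
    (25.608, 0.141, 0.141),
    (18.199, 0.182, 0.067),
    (15.047, 0.206, 0.051),
    (13.160, 0.223, 0.039),
    (3.622, 0.417, 0.004),
]


def balance_objective(threshold, area, train_omission):
    """6 * training omission + 0.04 * cumulative threshold + 1.6 * predicted area."""
    return 6 * train_omission + 0.04 * threshold + 1.6 * area


best = min(rows, key=lambda row: balance_objective(*row))
best_value = balance_objective(*best)
```

Among the listed rows the minimum is at the threshold 3.622, narrowly ahead of the fixed threshold 5.000.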

This is the projection of the Maxent model for YBCH onto the environmental variables. Warmer colors show areas with better predicted conditions. White dots show the presence locations used for training, while violet dots show test locations.


The following picture shows where clamping occurred while projecting the Maxent model onto the environmental variables in V:\PROJECT\terrestrial\lip\distrib_model\Data\maxent_ascii_files. Clamping means that environmental values are restricted to the range of values encountered during training, and similarly for the raw predictions. The values shown in the picture give the total absolute change to the exponent in the Maxent model due to clamping. Warmer colors show areas where more clamping has occurred because the environmental variables (or the prediction) are further from the range of values seen during training.
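To make the idea of clamping concrete, here is a simplified sketch. It restricts each variable to its training range and sums the absolute change this causes in a purely linear exponent. The training ranges and weights are hypothetical, and a real Maxent model also has quadratic, product, threshold and hinge features, so this is only an illustration of the principle.

```python
def clamp(value, train_min, train_max):
    """Restrict a value to the range seen during training."""
    return max(train_min, min(value, train_max))


def clamping_effect(pixel, train_ranges, weights):
    """Total absolute change to a linear exponent caused by clamping (simplified)."""
    total = 0.0
    for name, raw in pixel.items():
        lo, hi = train_ranges[name]
        total += abs(weights[name] * (clamp(raw, lo, hi) - raw))
    return total


# Hypothetical training ranges and linear weights for two of the layers.
train_ranges = {"tmax_06": (18.0, 35.0), "ppt_06": (0.0, 40.0)}
weights = {"tmax_06": 0.5, "ppt_06": -0.1}

inside = clamping_effect({"tmax_06": 25.0, "ppt_06": 10.0}, train_ranges, weights)
outside = clamping_effect({"tmax_06": 39.0, "ppt_06": 50.0}, train_ranges, weights)
```

A pixel whose values all lie inside the training ranges gets an effect of zero; the further a value lies outside its range, the larger the effect.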




Response curves


These curves show how each environmental variable affects the Maxent prediction. The (raw) Maxent model has the form exp(...)/constant, and the curves show how the exponent changes as each environmental variable is varied, keeping all other environmental variables at their average sample value.
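The construction of a response curve can be sketched as follows: hold every variable at its average sample value, sweep one variable over a grid, and record the exponent at each step. The linear exponent, the weights, and the sample means below are hypothetical stand-ins for the fitted model.

```python
def exponent(env, weights):
    """Hypothetical linear exponent; the real model uses more feature types."""
    return sum(weights[name] * env[name] for name in weights)


def response_curve(variable, grid, sample_means, weights):
    """Exponent as one variable varies while the others stay at their sample means."""
    curve = []
    for x in grid:
        env = dict(sample_means)  # copy so the means are not modified
        env[variable] = x
        curve.append((x, exponent(env, weights)))
    return curve


# Hypothetical sample means and weights for two of the layers.
sample_means = {"tmax_06": 1.0, "ppt_06": 2.0}
weights = {"tmax_06": 2.0, "ppt_06": -1.0}
curve = response_curve("tmax_06", [0.0, 1.0, 2.0], sample_means, weights)
```

Because the other variables are fixed at their means, such a curve does not show interactions between variables.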




Analysis of variable importance


The following table gives a heuristic estimate of relative contributions of the environmental variables to the Maxent model. To determine the estimate, in each iteration of the training algorithm, the increase in regularized gain is added to the contribution of the corresponding variable, or subtracted from it if the change to the absolute value of lambda is negative.
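The bookkeeping behind this heuristic might look like the sketch below. Each training step is represented as a hypothetical tuple of the variable updated, the increase in regularized gain, and the change in the absolute value of its lambda. Normalizing the totals to percentages is an assumption based on the table's "Percent contribution" column.

```python
def percent_contributions(steps):
    """steps: iterable of (variable, gain_increase, abs_lambda_change) tuples."""
    totals = {}
    for variable, gain_increase, abs_lambda_change in steps:
        # Subtract the gain when |lambda| shrank, otherwise add it.
        signed = -gain_increase if abs_lambda_change < 0 else gain_increase
        totals[variable] = totals.get(variable, 0.0) + signed
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("total contribution is zero; cannot normalize")
    return {name: 100.0 * value / grand_total for name, value in totals.items()}


# Hypothetical training trace.
steps = [
    ("tmax_06", 0.30, 0.20),
    ("tmin_06", 0.20, 0.10),
    ("tmax_06", 0.10, -0.05),  # |lambda| decreased, so this gain is subtracted
    ("ppt_06", 0.10, 0.30),
]
contributions = percent_contributions(steps)
```

Since the totals depend on the path the training algorithm happens to take, correlated variables can end up with quite different contributions.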

Variable | Percent contribution
tmax_06 | 22.2
tmin_06 | 17.9
whrnum | 9.3
tmax_01 | 6.1
ppt_06 | 6
elevation_ | 5.1
num48 | 4.9
tmax_10 | 3.9
ten0_shrub | 2.6
p_urban | 2.5
p_fieldcrop | 2
dens_inter | 1.9
p_pasture | 1.7
p_emergwet | 1.6
slope_dem | 1.5
dist_stream | 1.4
dens_peren | 1.3
num3 | 1.3
whr10 | 1.1
p_orchard | 1
ten6_conifer | 1
num34 | 0.8
p_vineyard | 0.7
ten4_hardwood | 0.6
tmax_03 | 0.5
ppt_10 | 0.4
tmin_01 | 0.4
tmin_10 | 0.1
hu_eco_union1 | 0
ppt_01 | 0
ppt_03 | 0
tmin_03 | 0


The following picture shows the results of the jackknife test of variable importance. The environmental variable with highest gain when used in isolation is tmin_06, which therefore appears to have the most useful information by itself. The environmental variable that decreases the gain the most when it is omitted is p_urban, which therefore appears to have the most information that isn't present in the other variables.
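The structure of a jackknife test can be sketched as below: the model is refit once with each variable alone and once with each variable left out, and the resulting gains are compared. The `gain_for` callable is a hypothetical stand-in for training a model and measuring its gain; the toy version here simply counts distinct pieces of "information" so that the example can run without a real model.

```python
def jackknife(variables, gain_for):
    """Return gains with each variable alone and with each variable omitted."""
    only = {v: gain_for([v]) for v in variables}
    without = {v: gain_for([u for u in variables if u != v]) for v in variables}
    best_alone = max(only, key=only.get)           # most useful by itself
    most_missed = min(without, key=without.get)    # omission hurts the most
    return only, without, best_alone, most_missed


# Toy stand-in: each variable carries some information items, some of them shared.
toy_info = {"tmin_06": {1, 2, 3}, "tmax_06": {1, 2}, "p_urban": {4, 5}}


def toy_gain(selected):
    """Hypothetical gain: number of distinct information items covered."""
    covered = set()
    for name in selected:
        covered |= toy_info[name]
    return len(covered)


only, without, best_alone, most_missed = jackknife(list(toy_info), toy_gain)
```

In the toy data the variable with the highest gain in isolation differs from the one whose omission costs the most, because the latter carries information that no other variable shares.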



The next picture shows the same jackknife test, using test gain instead of training gain. Note that conclusions about which variables are most important can change, now that we're looking at test data.


Lastly, we have the same jackknife test, using AUC on test data.



Raw data outputs and control parameters


The data used in the above analysis are contained in the following links. Please see the Help button for more information on these.
The model applied to the training environmental layers
The coefficients of the model
The omission and predicted area for varying cumulative and raw thresholds
The prediction strength at the training and (optionally) test presence sites
Results for all species modeled in the same Maxent run, with summary statistics and (optionally) jackknife results


Regularized training gain is 1.411, training AUC is 0.930, unregularized training gain is 1.669.
Unregularized test gain is 0.914.
Test AUC is 0.855, standard deviation is 0.016 (calculated as in DeLong, DeLong & Clarke-Pearson 1988, equation 2).
Algorithm terminated after 500 iterations (227 seconds).

The following parameters and settings were used during the run:
255 presence records used for training, 84 for testing.
53536 background points used during training.
Environmental layers used: dens_inter dens_peren dist_stream elevation_ hu_eco_union1 num3 num34 num48 p_emergwet p_fieldcrop p_orchard p_pasture p_urban p_vineyard ppt_01 ppt_03 ppt_06 ppt_10 slope_dem ten0_shrub ten4_hardwood ten6_conifer tmax_01 tmax_03 tmax_06 tmax_10 tmin_01 tmin_03 tmin_06 tmin_10 whr10(categorical) whrnum(categorical)
Command line:
Feature types used: Linear Quadratic Product Threshold Hinge
Regularization multiplier is 1.0
Regularization values: linear/quadratic/product: 0.050 categorical: 0.050 threshold: 1.000 hinge: 0.500
Output format is Cumulative
Output file type is .asc
Maximum iterations is 500
Convergence threshold is 1.0E-5
Random test percentage is 25
Jackknife selected
Make pictures selected
Create response curves selected
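The split between training and test records follows from the random test percentage. A minimal sketch of such a split is given below; the function name, the fixed seed, and rounding the test count down are assumptions, chosen because 25 percent of the 339 presence records rounded down gives the 84 test and 255 training records reported above.

```python
import random


def split_presence(records, test_percentage, seed=0):
    """Randomly set aside a percentage of presence records for testing."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = len(shuffled) * test_percentage // 100  # assumed to round down
    return shuffled[n_test:], shuffled[:n_test]


# 255 + 84 = 339 presence records in this run; integers stand in for the records.
train, test = split_presence(range(339), 25)
```

Each record ends up in exactly one of the two sets, so the test records play no part in fitting the model.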