Maxent model for Song Sparrow (SOSP)


This page contains some analysis of the Maxent model for SOSP, created Fri Jun 15 02:39:35 PDT 2007 using Maxent version 2.3.33.


Analysis of omission/commission

The following picture shows the omission rate and predicted area as a function of the cumulative threshold. The omission rate is calculated both on the training presence records and (if test data are used) on the test records. The omission rate should be close to the predicted omission, because of the definition of the cumulative threshold.
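
As a rough illustration of the two quantities plotted, the sketch below computes an omission rate and a fractional predicted area for a given cumulative threshold. The function names and the sample values are hypothetical, and treating a value exactly equal to the threshold as predicted-present is an assumption, not something this report states.

```python
def omission_rate(presence_values, threshold):
    """Fraction of presence records whose cumulative prediction is below the threshold."""
    missed = sum(1 for v in presence_values if v < threshold)
    return missed / len(presence_values)


def fractional_predicted_area(pixel_values, threshold):
    """Fraction of pixels whose cumulative prediction is at or above the threshold."""
    kept = sum(1 for v in pixel_values if v >= threshold)
    return kept / len(pixel_values)


# Hypothetical cumulative predictions (0-100 scale), for illustration only.
presence_values = [2.0, 8.5, 15.0, 40.0, 72.0]
pixel_values = [0.5, 1.0, 3.0, 9.0, 12.0, 30.0, 55.0, 90.0]

for t in (1.0, 5.0, 10.0):
    print(t, omission_rate(presence_values, t), fractional_predicted_area(pixel_values, t))
```

Raising the threshold can only increase the omission rate and shrink the predicted area, which is the trade-off the picture displays.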


The next picture is the receiver operating characteristic (ROC) curve for the same data. Note that the specificity is defined using predicted area, rather than true commission (see the paper by Phillips, Anderson and Schapire cited on the help page for discussion of what this means). If the true distribution is equal to the Maxent distribution, then the maximum possible test AUC is 0.792 rather than 1, because the AUC is being calculated using background rather than true absence data.
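
One way to read the AUC here is as the probability that a randomly chosen presence point receives a higher prediction than a randomly chosen background point. The sketch below computes that rank-based statistic directly; the function name and the scores are hypothetical, and this is not necessarily how Maxent computes the value internally. Because background points can fall where the species actually occurs, some of them will legitimately score as high as presence points, which is why the attainable maximum is below 1.

```python
def auc(presence_scores, background_scores):
    """Probability that a presence point outscores a background point (ties count half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical prediction values, for illustration only.
presence_scores = [0.9, 0.8, 0.4]
background_scores = [0.7, 0.4, 0.3, 0.1]
print(auc(presence_scores, background_scores))  # 10.5 wins out of 12 pairs -> 0.875
```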



Some common thresholds and corresponding omission rates are as follows. If test data are available, binomial probabilities are calculated exactly if the number of test samples is at most 25, otherwise using a normal approximation to the binomial. These are 1-sided p-values for the null hypothesis that test points are predicted no better than by a random prediction with the same fractional predicted area. The "Balance" threshold minimizes 6 * training omission rate + .04 * cumulative threshold + 1.6 * fractional predicted area.

Cumulative threshold  Description                                                    Fractional predicted area  Training omission rate  Test omission rate  P-value
1.000                 Fixed cumulative value                                         0.734                      0.001                   0.003               3.689E-47
5.000                 Fixed cumulative value                                         0.569                      0.017                   0.034               0E0
10.000                Fixed cumulative value                                         0.482                      0.060                   0.074               0E0
0.122                 Minimum training presence                                      0.883                      0.000                   0.002               2.191E-18
13.737                10 percentile training presence                                0.434                      0.100                   0.128               0E0
32.080                Equal training sensitivity and specificity                     0.272                      0.272                   0.371               0E0
16.839                Maximum training sensitivity plus specificity                  0.401                      0.131                   0.174               0E0
27.532                Equal test sensitivity and specificity                         0.306                      0.231                   0.306               0E0
11.015                Maximum test sensitivity plus specificity                      0.468                      0.074                   0.084               0E0
3.124                 Balance training omission, predicted area and threshold value  0.620                      0.007                   0.024               0E0
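
The two calculations described before the table can be sketched as follows. The "Balance" score uses the weights quoted in the text. The p-value function is a plain normal approximation to the one-sided binomial tail; whether Maxent applies a continuity correction, and how it rounds the number of omitted test points, is not stated here, so those details are assumptions and the result should only be expected to land near the reported figures. The function names are hypothetical; the 585 test records and the table values come from this report.

```python
import math


def balance_score(training_omission, cumulative_threshold, predicted_area):
    """Objective quoted in the text for the "Balance" threshold."""
    return 6 * training_omission + 0.04 * cumulative_threshold + 1.6 * predicted_area


def one_sided_p_value(n_test, test_omission, predicted_area):
    """Normal approximation to P(X >= k) for X ~ Binomial(n_test, predicted_area)."""
    k = round(n_test * (1.0 - test_omission))  # assumed count of test points predicted present
    mean = n_test * predicted_area
    sd = math.sqrt(n_test * predicted_area * (1.0 - predicted_area))
    z = (k - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# (cumulative threshold, fractional predicted area, training omission rate) from the table.
rows = [
    (1.000, 0.734, 0.001), (5.000, 0.569, 0.017), (10.000, 0.482, 0.060),
    (0.122, 0.883, 0.000), (13.737, 0.434, 0.100), (32.080, 0.272, 0.272),
    (16.839, 0.401, 0.131), (27.532, 0.306, 0.231), (11.015, 0.468, 0.074),
    (3.124, 0.620, 0.007),
]
best = min(rows, key=lambda r: balance_score(r[2], r[0], r[1]))
print(best[0])  # among the tabulated thresholds, 3.124 has the lowest score

# First table row: 585 test records, test omission 0.003, predicted area 0.734.
print(one_sided_p_value(585, 0.003, 0.734))
```

Maxent searches over all thresholds rather than only the ten tabulated ones, so the comparison above is just a consistency check on the table.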

This is the projection of the Maxent model for SOSP onto the environmental variables. Warmer colors show areas with better predicted conditions. White dots show the presence locations used for training, while violet dots show test locations. Click on the image for a full-size version.


The following picture shows where clamping occurred while projecting the Maxent model onto the environmental variables in V:\PROJECT\terrestrial\lip\distrib_model\Data\maxent_ascii_files. Clamping means that environmental values are restricted to the range of values encountered during training, and similarly for the raw predictions. The values shown in the picture give the total absolute change to the exponent in the Maxent model due to clamping. Warmer colors show areas where more clamping has occurred because the environmental variables (or the prediction) are further from the range of values seen during training. Click on the image for a full-size version.
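
A minimal sketch of the idea, assuming a model with linear features only (the actual model also uses quadratic, product, threshold and hinge features, and also clamps the raw prediction). The coefficients, training ranges and pixel values below are hypothetical.

```python
def clamp(value, lo, hi):
    """Restrict value to the closed interval [lo, hi]."""
    return max(lo, min(hi, value))


def clamping_change(values, train_ranges, lambdas):
    """Total absolute change to a linear exponent caused by clamping each variable."""
    total = 0.0
    for name, x in values.items():
        lo, hi = train_ranges[name]
        total += abs(lambdas[name] * (x - clamp(x, lo, hi)))
    return total


# Hypothetical training ranges, coefficients and one projected pixel.
train_ranges = {"elevation_": (0.0, 3000.0), "tmax_06": (15.0, 40.0)}
lambdas = {"elevation_": -0.001, "tmax_06": 0.05}
pixel = {"elevation_": 3500.0, "tmax_06": 42.0}

print(clamping_change(pixel, train_ranges, lambdas))  # about 0.5 + 0.1
```

A pixel whose values all lie inside the training ranges gives a change of zero, which corresponds to the coolest colors in the picture.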




Response curves


These curves show how each environmental variable affects the Maxent prediction. The (raw) Maxent model has the form exp(...)/constant, and the curves show how the exponent changes as each environmental variable is varied, keeping all other environmental variables at their average sample value. Click on a response curve to see a larger version.
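
The construction of a response curve can be sketched as below: one variable is swept over a grid while every other variable is held at its average sample value, and the exponent is recorded at each step. The sketch assumes a purely linear exponent, and the coefficients, means and grid are hypothetical.

```python
def exponent(values, lambdas):
    """Linear exponent of a toy model: sum of coefficient * variable value."""
    return sum(lambdas[name] * values[name] for name in lambdas)


def response_curve(variable, grid, sample_means, lambdas):
    """Exponent along `grid` for one variable, all others fixed at their sample means."""
    curve = []
    for x in grid:
        values = dict(sample_means)  # copy so the means are not modified
        values[variable] = x
        curve.append((x, exponent(values, lambdas)))
    return curve


# Hypothetical coefficients and average sample values.
lambdas = {"tmin_06": 2.0, "ppt_06": -1.0}
sample_means = {"tmin_06": 0.5, "ppt_06": 0.25}

for x, e in response_curve("tmin_06", [0.0, 0.5, 1.0], sample_means, lambdas):
    print(x, e)
```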




Analysis of variable importance


The following table gives a heuristic estimate of relative contributions of the environmental variables to the Maxent model. To determine the estimate, in each iteration of the training algorithm, the increase in regularized gain is added to the contribution of the corresponding variable, or subtracted from it if the change to the absolute value of lambda is negative.

Variable       Percent contribution
ten4_hardwood  16.5
elevation_     16
tmin_06        9.4
tmax_01        8.5
ppt_06         8.3
tmax_10        6.9
whrnum         6.2
tmax_06        6.1
p_orchard      2.9
p_emergwet     2
ppt_10         1.9
ppt_01         1.8
slope_dem      1.8
tmin_10        1.3
dens_peren     1.1
tmax_03        1
p_urban        0.9
p_fieldcrop    0.8
ten6_conifer   0.8
dist_stream    0.8
ten0_shrub     0.8
dens_inter     0.7
ppt_03         0.6
num3           0.6
p_pasture      0.6
num34          0.5
num48          0.4
whr10          0.3
p_vineyard     0.3
tmin_01        0.2
tmin_03        0.1
hu_eco_union1  0
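
The heuristic described above the table can be sketched as a running tally over training iterations. The iteration log below is hypothetical, and normalizing the tallies so that they sum to 100 is an assumption suggested by the table rather than something the text states.

```python
def percent_contributions(iterations):
    """iterations: list of (variable, gain_increase, change_in_abs_lambda) tuples."""
    totals = {}
    for variable, gain_increase, abs_lambda_change in iterations:
        # Subtract the gain when the absolute value of lambda decreased, add it otherwise.
        sign = -1.0 if abs_lambda_change < 0 else 1.0
        totals[variable] = totals.get(variable, 0.0) + sign * gain_increase
    grand_total = sum(totals.values())
    return {v: 100.0 * t / grand_total for v, t in totals.items()}


# Hypothetical training log, for illustration only.
log = [
    ("elevation_", 0.30, 0.5),
    ("whrnum", 0.10, 0.2),
    ("elevation_", 0.05, -0.1),
    ("whrnum", 0.05, 0.1),
]
print(percent_contributions(log))  # roughly 62.5 for elevation_ and 37.5 for whrnum
```

Because the tally depends on the path the training algorithm happens to take, the percentages are best read as a heuristic, as the text says.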


The following picture shows the results of the jackknife test of variable importance. The environmental variable with highest gain when used in isolation is whrnum, which therefore appears to have the most useful information by itself. The environmental variable that decreases the gain the most when it is omitted is p_orchard, which therefore appears to have the most information that isn't present in the other variables.
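
The two jackknife summaries can be read off a pair of tables: the gain obtained with each variable alone, and the gain obtained with each variable left out. The sketch below uses hypothetical gain values chosen only to mirror the conclusion stated above; the one figure taken from this report is the regularized training gain of the full model, 0.612.

```python
def jackknife_summary(gain_only, gain_without, full_gain):
    """Return (variable with highest gain alone, variable whose omission lowers gain most)."""
    best_alone = max(gain_only, key=gain_only.get)
    most_unique = max(gain_without, key=lambda v: full_gain - gain_without[v])
    return best_alone, most_unique


# Hypothetical per-variable gains, for illustration only.
gain_only = {"whrnum": 0.20, "elevation_": 0.15, "p_orchard": 0.05}
gain_without = {"whrnum": 0.600, "elevation_": 0.605, "p_orchard": 0.580}

print(jackknife_summary(gain_only, gain_without, full_gain=0.612))
```

The two answers can differ, as they do here: a variable may be informative on its own yet largely redundant with the others.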



The next picture shows the same jackknife test, using test gain instead of training gain. Note that conclusions about which variables are most important can change, now that we're looking at test data.


Lastly, we have the same jackknife test, using AUC on test data.



Raw data outputs and control parameters


The data used in the above analysis are available through the following links. Please see the Help button for more information on these.
The model applied to the training environmental layers
The coefficients of the model
The omission and predicted area for varying cumulative and raw thresholds
The prediction strength at the training and (optionally) test presence sites
Results for all species modeled in the same Maxent run, with summary statistics and (optionally) jackknife results


Regularized training gain is 0.612, training AUC is 0.817, unregularized training gain is 0.673.
Unregularized test gain is 0.582.
Test AUC is 0.783, standard deviation is 0.007 (calculated as in DeLong, DeLong & Clarke-Pearson 1988, equation 2).
Algorithm terminated after 500 iterations (258 seconds).

The following parameters and settings were used during the run:
1758 presence records used for training, 585 for testing.
55039 background points used during training.
Environmental layers used: dens_inter dens_peren dist_stream elevation_ hu_eco_union1 num3 num34 num48 p_emergwet p_fieldcrop p_orchard p_pasture p_urban p_vineyard ppt_01 ppt_03 ppt_06 ppt_10 slope_dem ten0_shrub ten4_hardwood ten6_conifer tmax_01 tmax_03 tmax_06 tmax_10 tmin_01 tmin_03 tmin_06 tmin_10 whr10(categorical) whrnum(categorical)
Command line:
Feature types used: Linear Quadratic Product Threshold Hinge
Regularization multiplier is 1.0
Regularization values: linear/quadratic/product: 0.050 categorical: 0.050 threshold: 1.000 hinge: 0.500
Output format is Cumulative
Output file type is .asc
Maximum iterations is 500
Convergence threshold is 1.0E-5
Random test percentage is 25
Jackknife selected
Make pictures selected
Create response curves selected
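
The record counts above are consistent with a 25 percent random hold-out of 2343 presence records (1758 + 585). The sketch below shows one way such a split could be made; rounding the test count down and the `seed` parameter are assumptions, and this is not Maxent's own partitioning code.

```python
import random


def split_records(records, test_percentage, seed=0):
    """Randomly set aside test_percentage percent of records (rounded down) for testing."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility of the sketch
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_percentage / 100)
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_records(range(2343), 25)
print(len(train), len(test))  # 1758 585
```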