Title: | Probabilistic Sex Estimate using Logistic Regression, Based on VISual Traits of the Human Os Coxae |
---|---|
Description: | An R-Shiny application implementing a method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits <doi:10.1002/ajpa.23855>. |
Authors: | Frédéric Santos [aut, cre] , Élodie Bernardeau [ctb] |
Maintainer: | Frédéric Santos <[email protected]> |
License: | CeCILL-2 | file LICENSE |
Version: | 2.0.4 |
Built: | 2024-11-26 04:07:15 UTC |
Source: | https://gitlab.com/f-santos/pelvis |
An R-Shiny application implementing a method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits.
Package: | PELVIS |
Type: | Package |
License: | CeCILL-2.1 |
Frédéric Santos, <[email protected]>
Bruzek, J. (2002) A method for visual determination of sex, using the human hip bone. American Journal of Physical Anthropology 117, 157–168. doi: 10.1002/ajpa.10012
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology 169(3), 435-447. doi: 10.1002/ajpa.23855
if(interactive()){ StartPELVIS() }
From a dataset including the 11 visual traits described by Bruzek (2002), this function adds the three corresponding main characters (PrSu, GrSN and InfP), derived with the majority rule described in the original article.
add_metavars(dat)
dat |
A dataframe including the 11 visual traits described by Bruzek. |
A dataframe including also the main characters derived from those visual traits.
This is mainly an internal function for the R-Shiny application implemented in PELVIS.
Frédéric Santos, <[email protected]>
Bruzek, J. (2002) A method for visual determination of sex, using the human hip bone. American Journal of Physical Anthropology 117, 157–168. doi: 10.1002/ajpa.10012
# Load a dataset:
data(CTscanDataBruzek)
# Visualize the traits:
head(CTscanDataBruzek)
# Add all the main Bruzek's characters:
complete <- add_metavars(CTscanDataBruzek)
head(complete)
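The majority rule mentioned above can be illustrated with a minimal base R sketch. The helper `main_character()` is hypothetical and is not part of PELVIS; it assumes that each main character is derived from three non-missing sub-traits scored "f", "i" or "m", and that at least two concordant scores are needed to obtain "F" or "M". How `add_metavars()` itself handles missing values is not covered here.

```r
# Hypothetical helper (not part of PELVIS): collapse three sub-traits scored
# "f", "i" or "m" into one main character "F", "0" or "M".
# Assumption: a sex is assigned only when at least two sub-traits agree.
main_character <- function(sub_traits) {
  stopifnot(length(sub_traits) == 3, !anyNA(sub_traits))
  n_f <- sum(sub_traits == "f")
  n_m <- sum(sub_traits == "m")
  if (n_f >= 2) {
    "F"
  } else if (n_m >= 2) {
    "M"
  } else {
    "0"
  }
}

# Two female scores out of three give a female main character
main_character(c("f", "f", "m"))
# No score reaches a majority: the main character stays intermediate
main_character(c("f", "i", "m"))
```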
Produces a single (non-probabilistic) sex estimate from five characters observed on the human os coxae, following Bruzek (2002).
bruzek02(x)
x |
A character vector of length 5, having three possible values: ‘F’, ‘0’ or ‘M’. |
A single character value, ‘F’, ‘I’ or ‘M’, according to the majority rule described by Bruzek (2002).
This is mainly an internal function for the R-Shiny application implemented in PELVIS.
Frédéric Santos, <[email protected]>
Bruzek, J. (2002) A method for visual determination of sex, using the human hip bone. American Journal of Physical Anthropology 117, 157–168. doi: 10.1002/ajpa.10012
# Here we create manually an individual:
individual <- c(PrSu = "M", GrSN = "F", CArc = "F", InfP = "0", IsPu = "F")
individual
# Determination produced by Bruzek (2002): female individual.
bruzek02(individual)
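For readers without PELVIS installed, the following base R sketch mimics the kind of majority vote that the example above relies on. The function `majority_sex()` is hypothetical; it assumes that the sex with the larger number of concordant characters wins and that a tie yields an indeterminate result, which may differ in detail from what `bruzek02()` does.

```r
# Hypothetical re-implementation of a five-character majority vote
# (not the PELVIS code). Characters are scored "F", "0" or "M".
majority_sex <- function(x) {
  stopifnot(length(x) == 5, all(x %in% c("F", "0", "M")))
  n_f <- sum(x == "F")
  n_m <- sum(x == "M")
  if (n_f > n_m) {
    "F"
  } else if (n_m > n_f) {
    "M"
  } else {
    "I"  # assumption: ties are left indeterminate
  }
}

individual <- c(PrSu = "M", GrSN = "F", CArc = "F", InfP = "0", IsPu = "F")
# Three "F" against one "M": the sketch returns "F"
majority_sex(individual)
```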
This dataset includes 198 ossa coxae segmented from CT-scans. The eleven trichotomic traits are given for each os coxae (possibly with missing values for incomplete bones), along with the geographical origin and known sex of the individual. When possible, the age and stature of the individual are also given. This dataset is used as a training sample for the logistic regression models implemented in PELVIS.
data(CTscanDataBruzek)
A data frame with 198 observations on the following 16 variables:
Id
a factor with 198 levels (unique ID of each os coxae)
Indiv
a factor with 99 levels (ID of each individual to whom the bone belongs)
Sex
a factor with levels F, M (known sex)
Age
a numeric vector (age of the associated individual in years)
Side
a factor with levels L, R (left or right side)
PrSu1
an ordered factor with levels f, i, m
PrSu2
an ordered factor with levels f, i, m
PrSu3
an ordered factor with levels f, i, m
GrSN1
an ordered factor with levels f, i, m
GrSN2
an ordered factor with levels f, i, m
GrSN3
an ordered factor with levels f, i, m
CArc
an ordered factor with levels F, 0, M
IsPu
an ordered factor with levels F, 0, M
InfP1
an ordered factor with levels f, i, m
InfP2
an ordered factor with levels f, i, m
InfP3
an ordered factor with levels f, i, m
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology 169(3), 435-447. doi: 10.1002/ajpa.23855
Bruzek, J., Rmoutilova, R., Guyomarc'h, P., & Santos, F. (2019) Supporting data for: A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2589917
Produces sex estimates from each of the ossa coxae submitted by the user through the graphical user interface of the R-Shiny application.
dataframe_sexing(data, ref, updateProgressBar = NULL, conf_level = 0.95, strategy = c("BIC", "AIC", "None"), trace = 1)
data |
A test dataset submitted by the user through the graphical user interface. The predictive factors (i.e. the eleven trichotomic traits) should have the same headers and levels as in the reference dataset ‘refDataBruzek02’ included in PELVIS. An example of a valid data file can be found on Zenodo: doi:10.5281/zenodo.2586897 (its field separator is the semicolon ";"). |
ref |
A learning dataset for logistic regression models, basically the dataset ‘refDataBruzek02’ included in PELVIS (or any other dataset with the same variables). |
updateProgressBar |
Internal option for the R-Shiny application. |
conf_level |
0.95 by default, confidence level needed to produce a sex estimate. |
strategy |
A choice of information criterion ( |
trace |
See |
A complete dataframe of results displayed through the R-Shiny application.
This is an internal function for the R-Shiny application implemented in PELVIS.
Frédéric Santos, <[email protected]>
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology, 169(3), 435–447. doi: 10.1002/ajpa.23855
Produces a statistical sex estimate from up to eleven characters observed on the human os coxae, following Santos et al. (2019), and using logistic regression models.
indiv_sexing(ref, new_ind, strategy = c("BIC", "AIC", "None"), trace = 1, conf_level = 0.95)
ref |
A learning dataset for logistic regression models, basically the dataset ‘refDataBruzek02’ included in PELVIS (or any other dataset with the same variables). |
new_ind |
A new os coxae to be determined, with eleven observed traits (possibly with missing values). |
strategy |
A choice of information criterion ( |
trace |
Passed to |
conf_level |
Required posterior probability threshold to produce a sex estimate. |
A list with the following components:
PredictedSex |
One unique character value, ‘F’, ‘I’ or ‘M’: final sex estimate for the studied os coxae. |
PostProb |
Posterior probability for the individual to be a male. |
BestModel |
Best logistic regression model for the studied os coxae according to the BIC criterion. |
VariablesUsed |
Names of the variables (including part or all of the nonmissing traits for the studied os coxae) used in this best model. |
cvRate |
Success rate in cross-validation. Cf. Santos et al. (2019) for more details about cross-validation here. |
cvIndet |
Rate of individuals remaining indeterminate using the best logistic regression model. |
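The two cross-validation summaries above can be illustrated with a minimal base R sketch on simulated data. Everything here is an assumption made for illustration: the data are simulated, a single numeric predictor `score` stands in for the trichotomic traits, and the accuracy is computed among determined individuals only, which is one plausible reading of `cvRate`. Santos et al. (2019) remains the reference for the procedure actually used.

```r
set.seed(42)

# Simulated learning sample (hypothetical, not the PELVIS reference data)
n <- 300
sex <- factor(sample(c("F", "M"), n, replace = TRUE))
ref <- data.frame(Sex = sex,
                  score = rnorm(n, mean = ifelse(sex == "M", 1, -1)))

conf_level <- 0.95
folds <- sample(rep(1:10, length.out = n))  # random assignment to 10 folds
pred <- character(n)

for (k in 1:10) {
  test <- folds == k
  # Fit on nine folds, predict Prob(M) on the held-out fold
  fit <- glm(Sex ~ score, data = ref[!test, ], family = binomial)
  p <- predict(fit, newdata = ref[test, ], type = "response")
  pred[test] <- ifelse(p > conf_level, "M",
                       ifelse(p < 1 - conf_level, "F", "I"))
}

determined <- pred != "I"
# Share of individuals left indeterminate in cross-validation
cv_indet <- mean(!determined)
# Assumption: accuracy is computed among determined individuals only
cv_rate <- mean(pred[determined] == as.character(ref$Sex[determined]))
c(cv_rate = cv_rate, cv_indet = cv_indet)
```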
This is mostly an internal function for the R-Shiny application implemented in PELVIS.
Frédéric Santos, <[email protected]>
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology 169(3), 435-447. doi: 10.1002/ajpa.23855
data(refDataBruzek02)
# Pick the first individual of the reference dataset with its 11 traits, as an example:
individual <- refDataBruzek02[1, -c(1:6)]
individual
# Compute a sex estimate for this individual:
indiv_sexing(ref = refDataBruzek02, new_ind = individual)
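The role of `conf_level` can be sketched in base R as a simple thresholding of the posterior probability of being male. The function `classify_from_prob()` is hypothetical and is not exported by PELVIS; it only assumes a symmetric decision rule around the chosen threshold, with strict inequalities.

```r
# Hypothetical helper (not part of PELVIS): turn a posterior probability
# of being male into a sex estimate, given a posterior probability threshold.
classify_from_prob <- function(prob_m, conf_level = 0.95) {
  stopifnot(prob_m >= 0, prob_m <= 1, conf_level > 0.5, conf_level < 1)
  if (prob_m > conf_level) {
    "M"
  } else if (prob_m < 1 - conf_level) {
    "F"
  } else {
    "I"  # the threshold is not reached: indeterminate
  }
}

classify_from_prob(0.97)                     # above 0.95
classify_from_prob(0.60)                     # between 0.05 and 0.95
classify_from_prob(0.93, conf_level = 0.90)  # a laxer threshold
```

Lowering the threshold trades a smaller share of indeterminate bones against a higher risk of misclassification.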
This dataset includes 592 ossa coxae from five population samples. The eleven trichotomic traits are given for each os coxae (possibly with missing values for incomplete bones), along with the geographical origin and known sex of the individual. When possible, the age and stature of the individual are also given. This dataset is used as a training sample for the logistic regression models implemented in PELVIS.
data(refDataBruzek02)
A data frame with 592 observations on the following 17 variables:
Id
a factor with 592 levels (unique ID of the individual to whom the bone belongs)
Orig
a factor with 5 levels (geographical origin)
Sex
a factor with levels F, M (known sex)
Age
a numeric vector (age of the associated individual in years)
Side
a factor with levels L, R (left or right side)
Stature
a numeric vector (in cm)
PrSu1
an ordered factor with levels f, i, m
PrSu2
an ordered factor with levels f, i, m
PrSu3
an ordered factor with levels f, i, m
GrSN1
an ordered factor with levels f, i, m
GrSN2
an ordered factor with levels f, i, m
GrSN3
an ordered factor with levels f, i, m
CArc
an ordered factor with levels F, 0, M
IsPu
an ordered factor with levels F, 0, M
InfP1
an ordered factor with levels f, i, m
InfP2
an ordered factor with levels f, i, m
InfP3
an ordered factor with levels f, i, m
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology 169(3), 435-447. doi: 10.1002/ajpa.23855
Bruzek, J., Rmoutilova, R., Guyomarc'h, P., & Santos, F. (2019) Supporting data for: A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2589917
This dataset includes 518 right ossa coxae from five population samples. These bones are used as a validation sample in the original study (Santos et al., 2019). The eleven trichotomic traits are given for each os coxae (possibly with missing values for incomplete bones), along with the geographical origin and known sex of the individual. When possible, the age and stature of the individual are also given.
data(rightBonesDataBruzek)
A data frame with 518 observations on the following 17 variables:
Id
a factor with 518 levels (unique ID of each os coxae)
Orig
a factor with 5 levels (geographical origin or collection)
Sex
a factor with levels F, M (known sex)
Age
a numeric vector (age of the associated individual in years)
Side
a factor with one single level R (right side)
Stature
a numeric vector (in cm)
PrSu1
an ordered factor with levels f, i, m
PrSu2
an ordered factor with levels f, i, m
PrSu3
an ordered factor with levels f, i, m
GrSN1
an ordered factor with levels f, i, m
GrSN2
an ordered factor with levels f, i, m
GrSN3
an ordered factor with levels f, i, m
CArc
an ordered factor with levels F, 0, M
IsPu
an ordered factor with levels F, 0, M
InfP1
an ordered factor with levels f, i, m
InfP2
an ordered factor with levels f, i, m
InfP3
an ordered factor with levels f, i, m
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology 169(3), 435-447. doi: 10.1002/ajpa.23855
Bruzek, J., Rmoutilova, R., Guyomarc'h, P., & Santos, F. (2019) Supporting data for: A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2589917
Launches a graphical user interface (GUI) for applying Bruzek's methods (2002, 2019) to sex the human os coxae, based on eleven visual traits.
start_pelvis()
StartPELVIS()
The R-Shiny application proposes two tabs:
‘Data input: manual editing’ can be used for both data entry and sex classification. The eleven trichotomic traits are manually edited for each os coxae through the GUI, and the corresponding sex estimates are then produced.
‘Data input: from text file’ is the usual way to get the sex estimates for a whole sample of ossa coxae correctly described in a file. PELVIS accepts .CSV or .TXT data files, but does not support .ODS or .XLS(X) files. The predictive factors (i.e. the eleven trichotomic traits) should have the same headers and levels as in the reference dataset ‘refDataBruzek02’ included in PELVIS. An example of a valid data file can be found on Zenodo: doi:10.5281/zenodo.2586897 (its field separator is the semicolon ";").
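Before submitting a file through the second tab, its headers and levels can be checked in plain R. The sketch below is only an illustration: it writes a made-up one-row file to a temporary location, reads it back with a semicolon separator, and compares it with the eleven trait names and levels listed in the dataset documentation. The exact layout of the Zenodo example file is not reproduced here, and the checks performed by the application itself may differ.

```r
# Trait names as documented for the reference datasets
traits <- c("PrSu1", "PrSu2", "PrSu3", "GrSN1", "GrSN2", "GrSN3",
            "CArc", "IsPu", "InfP1", "InfP2", "InfP3")

# Made-up single bone, written to a temporary semicolon-separated file
example <- as.data.frame(
  matrix(c("f", "f", "i", "f", "f", "f", "F", "F", "f", "i", "f"),
         nrow = 1, dimnames = list(NULL, traits)),
  stringsAsFactors = FALSE
)
tmp <- tempfile(fileext = ".csv")
write.table(example, tmp, sep = ";", row.names = FALSE, quote = FALSE)

# colClasses = "character" prevents R from reading a column of "F"
# as the logical value FALSE
dat <- read.csv(tmp, sep = ";", colClasses = "character",
                na.strings = c("NA", ""))

# Check that all eleven headers are present
missing_cols <- setdiff(traits, names(dat))

# Check the levels: upper case for CArc and IsPu, lower case elsewhere
upper <- c("CArc", "IsPu")
ok_levels <-
  all(unlist(dat[setdiff(traits, upper)]) %in% c("f", "i", "m", NA)) &&
  all(unlist(dat[upper]) %in% c("F", "0", "M", NA))

missing_cols
ok_levels
```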
In both tabs, two sex estimates are given: the visual sex estimate from Bruzek (2002), and the probabilistic sex estimate from Santos, Guyomarc'h, Rmoutilova and Bruzek (2019). Depending on the traits possibly missing on the ossa coxae submitted to the program, the logistic regression models can use various subsets of best predictors (selected by AIC or BIC), or all predictors. The final subset of predictors used for each os coxae is given in the table of results. The user may also want to define a posterior probability threshold for sex estimation (0.90 or 0.95): any os coxae that does not reach this threshold will remain indeterminate.
The function returns no value by itself, but the results can be downloaded through the graphical interface. The table of results includes the following columns:
‘Sex estimate (Bruzek 2002)’: the visual sex estimate based on Bruzek's method (2002).
‘Statistical sex estimate (2019)’: a sex estimation based on a logistic regression model, following the method described in Santos, Guyomarc'h, Rmoutilova and Bruzek (2019).
‘Prob(M)’ is the probability (obtained with the logistic regression model) that the individual is a man. According to tradition in biological anthropology, we have the following decision rule: if Prob(M)>0.95 then the sex estimate is ‘M’; if Prob(M)<0.05 then the sex estimate is ‘F’; else the individual remains indeterminate (‘I’).
‘Prob(F)’, defined as 1-Prob(M), is the probability that the individual is a woman.
‘Selected predictors in LR model’: for a given individual, the sex estimation proceeds as follows. First, a complete model is built using all available (i.e., nonmissing) traits for this individual. Then, a classical stepwise model selection by BIC is performed, and the subset of the most useful traits is used to produce the final sex estimate. This column gives the traits used for each individual.
‘10-fold CV accuracy (%)’: the rate of correct classification for the corresponding logistic regression model is estimated using a ten-fold cross-validation on the learning sample.
‘Indet. rate in CV (%)’: the rate of individuals remaining indeterminate in cross-validation for the corresponding logistic regression model.
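The model selection step described for the ‘Selected predictors in LR model’ column can be sketched with base R functions on simulated data. This is an illustration under stated assumptions, not the PELVIS implementation: the traits `trait_a`, `trait_b` and `trait_c` and the helper `sim_trait()` are hypothetical, the data are simulated, and `step()` with `k = log(n)` is used as a standard way to obtain a BIC-based backward selection.

```r
set.seed(1)
lv <- c("f", "i", "m")

# Hypothetical simulator: a trait matches the true sex with probability p_own
sim_trait <- function(sex, p_own) {
  probs_f <- c(p_own, 0.2, 0.8 - p_own)
  is_m <- sex == "M"
  out <- character(length(sex))
  out[is_m]  <- sample(lv, sum(is_m),  replace = TRUE, prob = rev(probs_f))
  out[!is_m] <- sample(lv, sum(!is_m), replace = TRUE, prob = probs_f)
  factor(out, levels = lv)
}

# Simulated learning sample (not the PELVIS reference data)
n <- 200
sex <- factor(sample(c("F", "M"), n, replace = TRUE))
ref <- data.frame(Sex = sex,
                  trait_a = sim_trait(sex, 0.60),
                  trait_b = sim_trait(sex, 0.55),
                  trait_c = sim_trait(sex, 0.40))  # uninformative trait

# Complete model with all available traits, then stepwise selection by BIC
full <- glm(Sex ~ trait_a + trait_b + trait_c, data = ref, family = binomial)
best <- step(full, k = log(nrow(ref)), trace = 0)

# Traits retained in the selected model
selected <- attr(terms(best), "term.labels")

# Posterior probability of being male for a new, made-up bone
new_ind <- data.frame(trait_a = factor("m", levels = lv),
                      trait_b = factor("m", levels = lv),
                      trait_c = factor("i", levels = lv))
prob_m <- predict(best, newdata = new_ind, type = "response")

selected
prob_m
```

Because "F" is the first level of `Sex`, the fitted probabilities refer to "M", which matches the ‘Prob(M)’ column.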
The R console is not available while the GUI is active. To exit the GUI, press Esc (on MS Windows systems) or Ctrl+C (on Linux systems) in the R console.
Regardless of the size and resolution of your screen, for convenience, it is advisable to decrease the zoom level of your web browser and/or to turn on fullscreen mode.
Frédéric Santos, <[email protected]>
Bruzek, J. (2002) A method for visual determination of sex, using the human hip bone. American Journal of Physical Anthropology 117, 157–168. doi: 10.1002/ajpa.10012
Santos, F., Guyomarc'h, P., Rmoutilova, R. and Bruzek, J. (2019) A method of sexing the human os coxae based on logistic regressions and Bruzek's nonmetric traits. American Journal of Physical Anthropology, 169(3), 435–447. doi: 10.1002/ajpa.23855
if(interactive()){start_pelvis()}