Random forest and glmnet models for predicting clinical and molecular features of patients diagnosed with prostate cancer. The predictors are gene-level methylation values estimated by gene.methylation.
Details
Models are available for predicting the following features:
age.continuous
: patient age in years
ISUP.grade
: International Society of Urological Pathology (ISUP) grade risk group (1-5). See here for further details.
t.category
: TNM tumour category (1-4), which measures the size and extent of the primary tumour
psa.categorical
: Prostate-specific antigen (PSA) category: <= 10, [10, 20), and >= 20 ng/mL.
pga
: percentage of the genome altered (PGA), defined as PGA = (base-pair length of all genome regions with gain or loss) / 3.2 billion bases * 100
<gene>.cna.<loss/gain>
: features with .cna. in their name give the gene name and then identify whether there is a copy number loss or gain event. See the Examples section for the full list of cna features.
log2p1.snvs.per.mbps
: single nucleotide variants (SNVs) per mega-base pair (Mbps) with a log2(x + 1) transformation.
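The pga and log2p1.snvs.per.mbps definitions above can be illustrated with a small worked calculation. This is a minimal sketch, not the package's own code: the `segments` data frame, its column names, and all values in it are hypothetical, and segment length is assumed to be `end - start`.

```r
# Hypothetical copy number segments for one sample (illustrative values only).
segments <- data.frame(
    chr = c("chr1", "chr8", "chr10"),
    start = c(1e6, 5e6, 2e6),
    end = c(3e6, 9e6, 2.5e6),
    status = c("gain", "loss", "neutral")
    );

# Sum the base-pair length of all segments with a gain or loss.
# Assumption: segment length is end - start.
altered <- segments$status %in% c("gain", "loss");
altered.bp <- sum(segments$end[altered] - segments$start[altered]);

# PGA = altered base pairs / 3.2 billion bases * 100.
pga <- altered.bp / 3.2e9 * 100;

# log2(x + 1) transformation of a hypothetical SNVs-per-Mbps value.
snvs.per.mbps <- 3;
log2p1.snvs.per.mbps <- log2(snvs.per.mbps + 1);

print(pga);
print(log2p1.snvs.per.mbps);
```

Here 6 Mbp of altered sequence gives a PGA of 0.1875, and 3 SNVs per Mbps maps to log2(4) = 2.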
Examples
data(all.models);
# Models for predicting the following features:
names(all.models);
#> [1] "age.continuous" "ISUP.grade" "t.category"
#> [4] "psa.categorical" "pga" "CHD1.cna.loss"
#> [7] "NKX3-1.cna.loss" "MYC.cna.gain" "PTEN.cna.loss"
#> [10] "CDKN1B.cna.loss" "RB1.cna.loss" "CDH1.cna.loss"
#> [13] "TP53.cna.loss" "log2p1.snvs.per.mbps"
# Model class per feature, e.g. randomForest or glmnet:
lapply(all.models, class);
#> $age.continuous
#> [1] "randomForest"
#>
#> $ISUP.grade
#> [1] "multnet" "glmnet"
#>
#> $t.category
#> [1] "randomForest"
#>
#> $psa.categorical
#> [1] "randomForest"
#>
#> $pga
#> [1] "randomForest"
#>
#> $CHD1.cna.loss
#> [1] "lognet" "glmnet"
#>
#> $`NKX3-1.cna.loss`
#> [1] "randomForest"
#>
#> $MYC.cna.gain
#> [1] "randomForest"
#>
#> $PTEN.cna.loss
#> [1] "lognet" "glmnet"
#>
#> $CDKN1B.cna.loss
#> [1] "lognet" "glmnet"
#>
#> $RB1.cna.loss
#> [1] "randomForest"
#>
#> $CDH1.cna.loss
#> [1] "lognet" "glmnet"
#>
#> $TP53.cna.loss
#> [1] "lognet" "glmnet"
#>
#> $log2p1.snvs.per.mbps
#> [1] "randomForest"
#>
# Required genes for predicting each feature:
# lapply(all.models, function(x) x$xNames)
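The `<gene>.cna.<loss/gain>` naming convention can be split into its gene and event parts with a regular expression. The sketch below copies the feature names from the `names(all.models)` output above, so it runs without loading the package. The `cna.pattern` and `cna.table` names are illustrative only and are not part of the package.

```r
# Feature names copied from the names(all.models) output above.
features <- c(
    "age.continuous", "ISUP.grade", "t.category", "psa.categorical", "pga",
    "CHD1.cna.loss", "NKX3-1.cna.loss", "MYC.cna.gain", "PTEN.cna.loss",
    "CDKN1B.cna.loss", "RB1.cna.loss", "CDH1.cna.loss", "TP53.cna.loss",
    "log2p1.snvs.per.mbps"
    );

# Keep only names that follow the <gene>.cna.<loss/gain> pattern.
cna.pattern <- "^(.+)\\.cna\\.(loss|gain)$";
cna.features <- grep(cna.pattern, features, value = TRUE);

# Split each name into its gene and event parts.
cna.table <- data.frame(
    feature = cna.features,
    gene = sub(cna.pattern, "\\1", cna.features),
    event = sub(cna.pattern, "\\2", cna.features),
    stringsAsFactors = FALSE
    );

print(cna.table);
```

With the package loaded, `names(all.models)` could replace the hard-coded `features` vector.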