Validate input data for gene.methylation()
Source: R/validate.gene.methy.data.R
Check whether gene.methy.data contains all genes required by models and that there is an acceptable level of missingness for each required gene.
Note that genes with acceptable levels of missing values are later imputed using KNN imputation when calling estimate.features.
If you'd rather use a different imputation method, then make sure to impute missing values before calling estimate.features.
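As an illustration of imputing before calling estimate.features, the following is a minimal sketch of per-gene median imputation in base R. The toy data frame, its gene names, and the helper impute.with.median are hypothetical and not part of the package; median imputation is just one simple alternative to KNN, chosen here because it needs no extra dependencies.

```r
# Toy gene-level methylation data: patients are rows, genes are columns.
# Gene names and values are made up for illustration.
toy.gene.methy <- data.frame(
    GENE1 = c(0.10, NA, 0.30, 0.20),
    GENE2 = c(0.80, 0.70, NA, NA),
    GENE3 = c(0.50, 0.40, 0.60, 0.55),
    row.names = paste0('patient', 1:4)
    );

# Hypothetical helper: replace each gene's missing values with that gene's median.
impute.with.median <- function(gene.methy.data) {
    for (gene in colnames(gene.methy.data)) {
        values <- gene.methy.data[[gene]];
        values[is.na(values)] <- median(values, na.rm = TRUE);
        gene.methy.data[[gene]] <- values;
        }
    return(gene.methy.data);
    }

imputed <- impute.with.median(toy.gene.methy);
```

A gene with no observed values at all would stay missing under this approach, since its median is undefined.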
Arguments
- gene.methy.data
A data frame with gene-level methylation data, created by gene.methylation. Patients are rows and columns are genes.
- models
A list of models used to predict features from gene-level methylation data. The models should come from data('all.models').
- prop.missing.cutoff
The maximum proportion of missing values allowed for each required gene.
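To make the role of prop.missing.cutoff concrete, here is a rough sketch of the kind of check described above, written in base R. It is not the package's implementation: the toy data, the required.genes vector, and the cutoff value of 0.3 are all made up, and it assumes that a "completely missing" gene is one with no column in the data.

```r
# Toy gene-level methylation data (patients are rows, genes are columns).
toy.gene.methy <- data.frame(
    GENE1 = c(0.10, NA, 0.30, 0.20),
    GENE2 = c(0.80, NA, NA, NA),
    GENE3 = c(0.50, 0.40, 0.60, 0.55)
    );

# Hypothetical set of genes required by a model, and an illustrative cutoff.
required.genes <- c('GENE1', 'GENE2', 'GENE4');
prop.missing.cutoff <- 0.3;

# Required genes with no column in the data at all (assumed meaning of "missing").
missing.genes <- setdiff(required.genes, colnames(toy.gene.methy));

# Proportion of missing values for each required gene that is present.
present.genes <- intersect(required.genes, colnames(toy.gene.methy));
prop.missing <- colMeans(is.na(toy.gene.methy[, present.genes, drop = FALSE]));

# Required genes whose missingness exceeds the cutoff.
high.missing.genes <- names(prop.missing)[prop.missing > prop.missing.cutoff];
```

In this toy case GENE4 is absent from the data, GENE2 exceeds the cutoff with three of four values missing, and GENE1 stays under it with one of four missing.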
Value
- val.passed
a logical indicating whether the data passed validation
- features.you.can.predict
a logical vector indicating which features you can predict (i.e. you have the required genes with missing data rates < prop.missing.cutoff)
- required.genes
a list of genes required by each model
- missing.genes
a list of genes that are required but completely missing in the data
- required.genes.with.high.missing
a list of genes that are required and have a proportion of missing values greater than prop.missing.cutoff
Examples
data('all.models');
### example gene-level methylation data
data('example.data.gene.methy');
# note this dataset is derived from the following commands:
# data('example.data');
# example.data.gene.methy <- gene.methylation(example.data);
check <- validate.gene.methy.data(example.data.gene.methy, all.models);
stopifnot(check$val.passed);
# genes required to fit each model:
#check$required.genes;
# genes that are required but completely missing in your data:
#check$missing.genes;
# genes that are required and have a proportion of missing values greater than `prop.missing.cutoff`:
#check$required.genes.with.high.missing;