Commit d0cf5b60 authored by Edi Prifti

- changing log

- updating DESCRIPTION: VignetteBuilder, licence, encoding
- debugging
- cleaning unnecessary files
- vignette V0
parent 277a2141
Today we experienced a git crash and had to reinstall it from the backup. More than 30 hours of activity log were lost.
* 17/05/2016: package creation and compilation
* 18/05/2016: git project creation and merging of Lucas and Edi's version
* 1/06/2016: New population (denseVect) for terGA and different rewritten operators.
* 3/06/2016: Digesting and comparative plot capability added.
@@ -26,11 +26,14 @@ Imports:
viridis,
kernlab,
randomForest
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
Description: The predomics package offers access to a novel framework implementing several heuristics that allow finding
    sparse and interpretable models in large datasets. These models are efficient and adapted for
    classification and regression in metagenomics and other commensurable datasets. We introduce the BTR (BIN, TER, RATIO)
    languages, which describe different types of links between variables. The framework also implements different
    state-of-the-art (SOTA) methods such as RF, ENET and SVM.
License: Attribution-NonCommercial-NoDerivatives 4.0 International
License: file LICENCE
LazyData: TRUE
RoxygenNote: 6.1.0
Encoding: UTF-8
#-----------------------------------------------------
# TEST DATASETS
#-----------------------------------------------------
#' @name cir_train
#' @title Cirrhosis stage 1 (frequencies)
#' @docType data
#' @author Qin, Nan, Fengling Yang, Ang Li, Edi Prifti, Yanfei Chen, Li Shao, Jing Guo, et al. “Alterations of the human gut microbiome in liver cirrhosis.” Nature 513, no. 7516 (July 23, 2014): 59–64.
#' @keywords liver cirrhosis, microbiome, species
#' @description This dataset consists of frequency abundance files as downloaded from http://waldronlab.io/curatedMetagenomicData/.
#' This is a list containing two elements: (i) the X data matrix with 1045 species and 181 observations and (ii) the class vector, with patients (class = -1, n=98) and healthy controls (n=83).
NULL
#' @name cir_test
#' @title Cirrhosis stage 2 (frequencies)
#' @docType data
#' @author Qin, Nan, Fengling Yang, Ang Li, Edi Prifti, Yanfei Chen, Li Shao, Jing Guo, et al. “Alterations of the human gut microbiome in liver cirrhosis.” Nature 513, no. 7516 (July 23, 2014): 59–64.
#' @keywords liver cirrhosis, microbiome, species
#' @description This dataset consists of frequency abundance files as downloaded from http://waldronlab.io/curatedMetagenomicData/.
#' This is a list containing two elements: (i) the X data matrix with 1045 species and 56 observations and (ii) the class vector, with patients (class = -1, n=25) and healthy controls (n=31).
NULL
#' @name ibd
#' @title Inflammatory Bowel Disease (frequencies) from the MetaHIT study
#' @docType data
#' @author Nielsen, H Bjørn, Mathieu Almeida, Agnieszka Sierakowska Juncker, Simon Rasmussen, Junhua Li, Shinichi Sunagawa, Damian R Plichta, et al. “Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.” Nature Biotechnology (July 6, 2014): 1–11.
#' @keywords inflammatory bowel disease, microbiome, species
#' @description This dataset consists of frequency abundance files as downloaded from http://waldronlab.io/curatedMetagenomicData/.
#' This is a list containing two elements: (i) the X data matrix with 1045 species and 396 observations and (ii) the class vector, with patients (class = -1, n=148) and healthy controls (n=248).
NULL
#' @name obesity
#' @title Obesity (frequencies) from the MetaHIT study
#' @docType data
#' @author Le Chatelier, Emmanuelle, Trine Nielsen, Junjie Qin, Edi Prifti, Falk Hildebrand, Gwen Falony, Mathieu Almeida, et al. “Richness of human gut microbiome correlates with metabolic markers.” Nature 500, no. 7464 (April 9, 2014): 541–546.
#' @keywords obesity, microbiome, species
#' @description This dataset consists of frequency abundance files as downloaded from http://waldronlab.io/curatedMetagenomicData/.
#' This is a list containing two elements: (i) the X data matrix with 1045 species and 292 observations and (ii) the class vector, with patients (class = -1, n=167) and healthy controls (n=96).
#' Caution: this dataset also has a class 0 (overweight patients), which needs to be omitted from both X and y.
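#' @examples
#' \dontrun{
#' # Hedged usage sketch: the element names X and y are assumptions based on the
#' # description above; check names(obesity) before relying on them.
#' data("obesity")
#' keep <- obesity$y != 0    # drop the class 0 (overweight) samples
#' X <- obesity$X[, keep]    # samples are columns (species x observations)
#' y <- obesity$y[keep]
#' }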
NULL
#' @name t2d
#' @title Type 2 diabetes (frequencies) BGI
#' @docType data
#' @author Qin, Junjie, Yingrui Li, Zhiming Cai, Shenghui Li, Jianfeng Zhu, Fan Zhang, Suisha Liang, et al. “A metagenome-wide association study of gut microbiota in type 2 diabetes.” Nature (September 26, 2012).
#' @keywords type 2 diabetes, microbiome, species
#' @description This dataset consists of frequency abundance files as downloaded from http://waldronlab.io/curatedMetagenomicData/.
#' This is a list containing two elements: (i) the X data matrix with 1045 species and 344 observations and (ii) the class vector, with patients (class = -1, n=170) and healthy controls (n=174).
NULL
#' @name t2dw
#' @title Type 2 diabetes (frequencies) Women Sweden
#' @docType data
#' @author Karlsson, Fredrik H, Valentina Tremaroli, Intawat Nookaew, Göran Bergström, Carl Johan Behre, Björn Fagerberg, Jens Nielsen, and Fredrik Bäckhed. “Gut metagenome in European women with normal, impaired and diabetic glucose control.” Nature (May 29, 2013): 1–7.
#' @keywords type 2 diabetes, microbiome, species
#' @description This dataset consists of frequency abundance files as downloaded from http://waldronlab.io/curatedMetagenomicData/.
#' This is a list containing two elements: (i) the X data matrix with 1045 species and 145 observations and (ii) the class vector, with patients (class = -1, n=53) and healthy controls (n=43).
#' Caution: this dataset also has a class 0 (IG patients), which needs to be omitted from both X and y.
NULL
@@ -85,12 +85,12 @@ test_evaluateModel <- function()
load("../data/testing_data/test_population.rda")
# loading mod, X, y
predomics:::myAssert(condition = isModel(obj = mod), message = "isModel")
predomics:::myAssert(evaluateModel(mod, X, y, clf)$accuracy_, 0.7237569)
predomics:::myAssert(evaluateModel(mod, X, y, clf, force.re.evaluation = TRUE)$accuracy_, 0.7237569)
predomics:::myAssert(evaluateModel(mod, X, y, clf, force.re.evaluation = TRUE)$auc_, 0.7520285)
predomics:::myAssert(evaluateModel(mod, X[,-c(1:10)], y[-c(1:10)], clf, force.re.evaluation = TRUE, mode = "train")$accuracy_, 0.7237569)
predomics:::myAssert(evaluateModel(mod, X[,-c(1:10)], y[-c(1:10)], clf, force.re.evaluation = TRUE, mode = "train")$accuracy_, 0.7237569)
myAssert(condition = isModel(obj = mod), message = "isModel")
myAssert(evaluateModel(mod, X, y, clf)$accuracy_, 0.7237569)
myAssert(evaluateModel(mod, X, y, clf, force.re.evaluation = TRUE)$accuracy_, 0.7237569)
myAssert(evaluateModel(mod, X, y, clf, force.re.evaluation = TRUE)$auc_, 0.7520285)
myAssert(evaluateModel(mod, X[,-c(1:10)], y[-c(1:10)], clf, force.re.evaluation = TRUE, mode = "train")$accuracy_, 0.7237569)
myAssert(evaluateModel(mod, X[,-c(1:10)], y[-c(1:10)], clf, force.re.evaluation = TRUE, mode = "train")$accuracy_, 0.7237569)
# > evaluateModel(mod, X, y, clf)$accuracy_
# [1] 0.6187845
......
@@ -735,7 +735,7 @@ pop_better <- function(mod.col,eval='fit_',k_penalty=0.01){
}else{
pop <- modelCollectionToPopulation(mod.col)
}
bests <- predomics:::getBestIndividual(mod.col,evalToOrder = eval)
bests <- getBestIndividual(mod.col,evalToOrder = eval)
ev <- populationGet_X(element2get = eval, toVec = TRUE, na.rm = TRUE)(bests)
names <- names(ev)
ev_2 <- shift(ev, 1) # ev lagged by one position, so consecutive sparsity levels can be compared
......
---
title: "L'avancement de predomics"
author: "Lucas Robin"
date: "13 juillet 2016"
output:
rmarkdown::md_document:
variant: markdown_github
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Progress on predomics}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
A quick update on the progress of predomics.
## TerGA
Two versions of TerGA currently coexist, the "old" one and the "new" one. The old version works sparsity by sparsity: for each sparsity a population is created, based on the population of the previous sparsity if it exists; it is then evolved before moving on to the next sparsity.
The new version starts from a larger population containing individuals whose sparsity is drawn at random. This population is then evolved over several generations (evolution can change the sparsity of each individual).
The new method now provides multiple functions for crossover, mutation and selection, as well as the option to disable crossover or mutation.
The mutation scheme currently in use favors features that appear in models with a high AUC.
Crossover, for its part, takes the sparsity of one of the parents at random and selects that number of features from the list of features present in the parent models, as sketched below.
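A minimal sketch of this crossover idea, for illustration only (the helper name and the representation of individuals as integer vectors of feature indices are assumptions, not the package's actual implementation):

```{r eval=FALSE}
# Hypothetical sketch of the crossover described above (not the package's code).
crossover_sketch <- function(parent1, parent2) {
  k    <- sample(c(length(parent1), length(parent2)), 1)  # sparsity of a random parent
  pool <- union(parent1, parent2)                         # features present in the parents
  sort(sample(pool, min(k, length(pool))))                # draw k features from that pool
}
crossover_sketch(c(3, 25, 104), c(3, 77, 104, 512))
```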
## TerDA
For TerDA, one method is mainly used and is functional: the one based on glmnet. For now there are some problems when computing the accuracy of the models.
## TerBeam
As for TerGA, TerBeam is available in 2 versions. Here, however, they work the same way; the old method was kept in order to compare the performance of the 2 versions.
## Test functions
I added a function that computes a synthetic class to run tests against; in quick tests, the TerDA and TerBeam methods were able to recover the model used to create the class. However, when the features used to create the class are only very rarely present in the dataset, bugs appear during cross-validation.
There is also a function for comparing the two TerGA methods: the old one is run first and its execution time is recorded, then the second one is run with the old method's execution time as its budget, as sketched below.
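A minimal sketch of that timing protocol (`fit` and the clf objects are as in the "Using predomics" vignette; how the budget is actually passed to the new method is not shown here):

```{r eval=FALSE}
# Record the old method's execution time ...
t.old <- system.time(res.old <- fit(X, y, clf.terga.old))["elapsed"]
# ... then run the new method, giving it t.old (in seconds) as its time budget.
```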
---
title: "Population"
author: "Edi Prifti, Yann Chevaleyre, Lucas Robin and Jean-Daniel Zucker"
date: "16 juin 2016"
output:
rmarkdown::md_document:
variant: markdown_github
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Population}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
Population data structure
=========================
The population data structure is a list of individuals (or models). Each individual is also a list, containing:
* `indices_`: the indices of the non-zero coefficients in the model;
* `names_`: the names of the features with non-zero coefficients;
* `coeffs_`: the values of the non-zero coefficients;
* `fit_`: the AUC score of the model;
* `accuracy_`: the accuracy of the model;
* `eval.sparsity`: the number of non-zero coefficients in the model.
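As a concrete illustration, here is a hand-built individual with the fields listed above (all values are invented; real individuals may carry additional fields):

```{r eval=FALSE}
# A hand-built individual, for illustration only (all values invented)
mod <- list(
  indices_      = c(12, 57, 203),                 # indices of the non-zero coefficients
  names_        = c("sp_12", "sp_57", "sp_203"),  # names of those features
  coeffs_       = c(1, -1, 1),                    # values of the coefficients
  fit_          = 0.84,                           # AUC score of the model
  accuracy_     = 0.79,                           # accuracy of the model
  eval.sparsity = 3                               # number of non-zero coefficients
)
pop <- list(mod)   # a population is simply a list of such individuals
pop[[1]]$names_
```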
---
title: "modular_testing"
author: "Edi Prifti"
date: "5/1/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
In this document we will be testing different functions to make sure they work properly.
```{r load package}
library(predomics)
print(paste("The current version of the package is",packageVersion("predomics")))
```
## Load a dataset and experiment
For these tests we will be using the cirrhosis dataset at the bug_species level.
```{r load data, echo=FALSE}
load("/data/projects/predomics_testing/analyses/2.db_segata/2.db_cirrhose_stage1/bug_species/db.rda")
load("/data/projects/predomics_testing/analyses/2.db_segata/2.db_cirrhose_stage1/bug_species/results/results.metal.all._spar_1_to_30.rda")
```
## getNBestModels
This function in the current version is generic and will replace several other functions that performed parts of this task.
```{r getNbestModels}
#?getNBestModels
print("GET THE BEST 5")
# enter an experiment
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = FALSE,
single.best.cv = TRUE,
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return an MC
printy(res)
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = FALSE,
single.best.cv = TRUE,
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = TRUE # population
)
# return a population
printy(res)
print("GET THE BEST FOR K")
# Control for significance
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 1,
single.best = FALSE,
single.best.cv = TRUE,
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return best per k
printy(res)
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 1,
single.best = FALSE,
single.best.cv = TRUE,
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = TRUE # population
)
# return a population
printy(res)
print("GET THE POP MINMAXPREV")
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = FALSE,
single.best.cv = FALSE,
single.best.k = NULL,
max.min.prevalence = TRUE, # use the max.min.prevalence selection
X = X, # this is needed when this parameter is active
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return a model collection
printy(res)
# Get the best CV
print("GET THE BEST CV")
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = TRUE,
single.best.cv = TRUE,
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return best per k
printy(res)
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = FALSE,
single.best.cv = FALSE,
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return a model collection
printy(res)
print("GET THE BEST CV")
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = TRUE, # give best
single.best.cv = TRUE, # based on CV
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return the single best model
printy(res)
print("GET THE BEST CV penalty")
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0.75/100,
n.best = 5,
single.best = TRUE, # give best
single.best.cv = TRUE, # based on CV
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return the single best model
printy(res)
print("GET THE BEST no CV")
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = TRUE, # give best
single.best.cv = FALSE, # not CV
single.best.k = NULL,
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return the single best model
printy(res)
print("GET THE BEST K")
res <- getNBestModels(obj = res.metal.all.,
significance = TRUE,
by.k.sparsity = TRUE,
k.penalty = 0,
n.best = 5,
single.best = TRUE,
single.best.cv = FALSE,
single.best.k = 3, # decide the best k to return
max.min.prevalence = FALSE,
X = NULL,
verbose = FALSE,
evalToOrder = "fit_",
return.population = FALSE # MC
)
# return the single best model
printy(res)
```
## digest
This function allows extracting, from an experiment object, important information that is needed by many other functions.
```{r digest}
dig <- digest(obj = res.metal.all.,
penalty = 0.5/100,
best.cv = TRUE,
best.k = NULL,
plot = FALSE)
printy(dig$best$model)
```
## digestModelCollection
Testing digestModelCollection
```{r digestModelCollection}
dig <- digestModelCollection(obj = res.metal.all.$classifier$models, X = NULL,clf = clf.metal.all., k.penalty = 0.75/100, mmprev = TRUE)
printy(dig$best$models)
printy(dig$best$model)
dig <- digestModelCollection(obj = res.metal.all.$classifier$models, X = NULL,clf = clf.metal.all., k.penalty = 0.75/100, mmprev = FALSE)
printy(dig$best$models)
printy(dig$best$model)
```
---
title: "Predomics"
author: "Edi Prifti, Yann Chevaleyre, Lucas Robin and Jean-Daniel Zucker"
date: "16 juin 2016"
output:
rmarkdown::md_document:
variant: markdown_github
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Predomics}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
Predomics objectives
====================
The objective of this work is to propose new approaches that are adapted and efficient for prediction and regression in very large data sets such as metagenomics and other omics data. We use a ternary approach, which makes sense where features can meaningfully be summed, for instance the abundances of species in an ecosystem (a sketch of such a ternary score is given after the list below). We propose here three different algorithms that try to solve the above-mentioned problems, described in detail hereafter.
The three methods are:
* [TerGA](TerGA)
* [TerDA](TerDA)
* [TerBeam](TerBeam)
In these three methods we use the same data structure [Population](Population).
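To make the ternary idea concrete, here is a minimal sketch of how a model with coefficients in {-1, 0, 1} scores a sample; this illustrates the principle only and is not the package's internal code:

```{r eval=FALSE}
# Minimal sketch: with ternary coefficients the score of a sample is simply
# (sum of the abundances of the +1 features) - (sum of the -1 features).
ternary_score <- function(x, coeffs) sum(x * coeffs)
x      <- c(0.12, 0.03, 0.40, 0.05)  # abundances of 4 species in one sample
coeffs <- c(1, 0, -1, 1)             # a ternary model over those species
ternary_score(x, coeffs)             # 0.12 - 0.40 + 0.05 = -0.23
```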
## UPDATES
* 17/05/2016: package creation and compilation
* 18/05/2016: git project creation and merging of Lucas and Edi's version
* 1/06/2016: New population (denseVect) for terGA and different rewritten operators.
* 3/06/2016: Digesting and comparative plot capability added.
* 21/08/2016: ver 0.3.1; the testing folder has been moved to another project to lighten the package
---
title: "Using predomics"
author: "Edi Prifti, Yann Chevaleyre, Lucas Robin and Jean-Daniel Zucker"
date: "20 mai 2016"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Using predomics}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r include=F}
knitr::knit_hooks$set(margin = function(before,options,envir) {
if(before) par(mgp=c(1.5,0.5,0),bty="n",plt=c(.105,.97,.13,.97))
else NULL
})
knitr::opts_chunk$set(margin=T,prompt=T,comment="",collapse=T,cache=T,
dev.args=list(pointsize=11),fig.height=3.5,
fig.width=4.24725,fig.retina=2,fig.align="center")
dir.create("tmp")
knitr::opts_knit$set(root.dir=paste0(getwd(),"/tmp"))
```
This vignette shows how to install and use the `predomics` package.
## Installing `predomics`
The package is currently on the GitHub repository of ICAN. To install it from this
repository you can use the `install_github` function from the `devtools`
package. If that package is not installed on your system, install it first:
```{r eval=F}
install.packages("devtools")
```
Then you can install the `predomics` package with a user name and password by typing:
```{r eval=F}
devtools::install_github("http://integromics.fr:8080")
```
Note that the `predomics` package depends on the `glmnet` package. Installing
`predomics` automatically installs `glmnet` too, if it is not already installed
on your system. Then, you can load and attach the `predomics` library:
## Using `predomics`
```{r}
library(predomics)
```
We're going to run some tests, so first we need to load a dataset; we'll use DATAMETA3:
```{r}
data("DATAMETA3")
y <- DATAMETA3$class
X <- t(DATAMETA3[,-1])
```
Then, to use any method, you first need to build its clf object with one of:
```{r eval=F}
terga() # To build a clf object for the genetic algorithm method
OLD_terga() # To build a clf object for the old genetic algorithm method
terda() # To build a clf object for the glmnet based method
terBeam() # To build a clf object for the beam search method
OLD_terBeam() # To build a clf object for the old beam search method
```
All these classifier constructors take many parameters; the few common ones are the sparsity and the number of cores used for parallel computing. Let's set them for the next tests:
```{r eval=F}
sparsity <- 1:30
nCores <- parallel::detectCores()
```
Each classifier also has many parameters of its own, but the ones we use the most are:
* `size_pop`, `size_world`, `nb_gen` and `print_ind_method` for terGA
* `size_pop`, `size_world` and `convergence_steps` for the OLD_terGA method
* `nIterations`, `sparsity` and `nRR` for terDA
* `maxNbOfModels` for terBeam
* `size_pop` for OLD_terBeam
So let's set them for our tests.
```{r eval=F}
size_pop_TerGA <- 500
nb_gen_TerGA <- 10
print_ind_method <- "short"
size_pop_OLD_TerGA <- 100
convergence_steps <- 10
size_world <- nrow(X)
nbIterTerDA <- 5
nbRRTerDA <- 500
maxNbOfModels <- 10000
size_pop_OLD_TerBeam <- 100
```
To build the clf objects, we call the constructors:
```{r eval=F}
clf.terga <- terga(sparsity = sparsity, size_pop = size_pop_TerGA, size_world = size_world,
nCores = nCores, print_ind_method = print_ind_method, nb_gen = nb_gen_TerGA)
clf.terga.old <- OLD_terga(sparsity = sparsity, size_pop = size_pop_OLD_TerGA, nCores = nCores,
convergence_steps = convergence_steps, size_world = size_world)
clf.terda <- terda(sparsity = sparsity, nIterations = nbIterTerDA, nCores = nCores, nRR = nbRRTerDA)
clf.terbeam <- terBeam(sparsity = sparsity, nCores = nCores, maxNbOfModels = maxNbOfModels)
clf.terbeam.old <- OLD_terBeam(sparsity = sparsity, nCores = nCores, size_pop = size_pop_OLD_TerBeam)
```
Now we can run the `fit` function to build models:
```{r eval=F}
fit(X, y, clf.terga)
fit(X, y, clf.terga.old)
fit(X, y, clf.terda)
fit(X, y, clf.terbeam)
fit(X, y, clf.terbeam.old)
```
Then, to perform the cross-validation, we first create the folds:
```{r eval=F}
lfolds <- create.folds(y, k = 10, list = TRUE, returnTrain = FALSE)
```
Next, we can run `fit` with the `crossValidate` option set to `TRUE`:
```{r eval=F}
res.terga.cv <- fit(X, y, clf.terga, crossValidate = TRUE, lfolds = lfolds)
res.terga.old.cv <- fit(X, y, clf.terga.old, crossValidate = TRUE, lfolds = lfolds)
res.terda.cv <- fit(X, y, clf.terda, crossValidate = TRUE, lfolds = lfolds)
res.terbeam.cv <- fit(X, y, clf.terbeam, crossValidate = TRUE, lfolds = lfolds)
res.terbeam.old.cv <- fit(X, y, clf.terbeam.old, crossValidate = TRUE, lfolds = lfolds)
```
Finally, to visualize the results, the first step is to merge them:
```{r eval=F}
list.results <- list(terga=res.terga.cv,
terga_old=res.terga.old.cv,
terda=res.terda.cv,
terbeam=res.terbeam.cv,
terbeam_old=res.terbeam.old.cv)
digested.results <- mergeResults(list.results = list.results, sparsity = sparsity)
```
Then we can build the figures:
```{r eval=F}
plotComparativeResults(digested.results = digested.results)
```
## Advanced options
If you want to test the performance of the methods with a custom-built class, you can build one with the function `createClassForTest(X, y, f)`: you give it the dataset X, the old class y, and the function with which you want to build the class.
Here is an example of this kind of function:
```{r eval=F}
f <- function(x)
{
return(x[[696]] + x[[733]] - x[[1006]])
}
```
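You can then create the synthetic class and check whether the methods recover the underlying model; a sketch, assuming `createClassForTest` returns the new class vector:

```{r eval=F}
# Hedged sketch: the return value of createClassForTest is assumed to be the new class
y.new <- createClassForTest(X, y, f)
res.terbeam.new <- fit(X, y.new, clf.terbeam)
```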
```{r include=F, cache=F}
setwd("..")
knitr::opts_knit$set(root.dir=".")
```
```{r include=F, cache=F}
unlink("tmp",T,T)
```