Title: | Multi-Layer Group-Lasso |
---|---|
Description: | Implements a new variable selection procedure designed for settings with redundancy between explanatory variables, which remains applicable to high-dimensional data (Grimonprez et al. (2023) <doi:10.18637/jss.v106.i03>). |
Authors: | Quentin Grimonprez [aut, cre], Samuel Blanck [ctb], Alain Celisse [ths], Guillemette Marot [ths], Yi Yang [ctb], Hui Zou [ctb] |
Maintainer: | Quentin Grimonprez <[email protected]> |
License: | GPL (>=2) |
Version: | 1.0.0 |
Built: | 2024-11-04 03:33:38 UTC |
Source: | https://github.com/modal-inria/mlgl |
This package provides a method combining hierarchical clustering and the group-lasso. Usually, a single partition of the covariates is used in the group-lasso; here, several partitions taken from the hierarchical tree are used simultaneously.
A post-treatment method based on statistical tests (with FWER or FDR control) is provided for selecting the regularization parameter and the optimal groups for that value. This post-treatment can be applied to the classical group-lasso as well as to our method.
The MLGL function performs the hierarchical clustering and the group-lasso. The post-treatment can be performed with the hierarchicalFWER and selFWER functions. The whole pipeline can be run with the fullProcess function; a minimal sketch is given below.
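A minimal end-to-end sketch of that workflow on simulated data (the design, effect sizes and the lambda index k are illustrative; the function calls follow the examples in this manual):

library(MLGL)

set.seed(1)
# Simulated design: 10 blocks of 5 correlated covariates
X <- simuBlockGaussian(n = 60, nBlock = 10, sizeBlock = 5, rho = 0.7)
y <- X[, c(3, 8)] %*% c(2, -2) + rnorm(60, 0, 0.5)

# Group-lasso on all the partitions of the hierarchical tree
res <- MLGL(X, y)

# Hierarchical testing with FWER control for one value of lambda,
# then group selection at level alpha = 0.05
k <- 20
test <- hierarchicalFWER(X, y, res$group[[k]], res$var[[k]])
sel <- selFWER(test, alpha = 0.05)

# Or run the whole pipeline (clustering + MLGL + hierarchical testing) at once
out <- fullProcess(X, y, control = "FWER", alpha = 0.05)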
Quentin Grimonprez
Grimonprez Q, Blanck S, Celisse A, Marot G (2023). "MLGL: An R Package Implementing Correlated Variable Selection by Hierarchical Clustering and Group-Lasso." Journal of Statistical Software, 106(3), 1-33. doi:10.18637/jss.v106.i03.
MLGL, cv.MLGL, fullProcess, hierarchicalFWER
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
Hierarchical clustering with a distance matrix computed using bootstrap replicates
bootstrapHclust(X, frac = 1, B = 50, method = "ward.D2", nCore = NULL)
- X: data
- frac: fraction of the sample used at each replicate
- B: number of replicates
- method: desired method: "single", "complete", "average", "mcquitty", "ward.D", "ward.D2", "centroid" or "median"
- nCore: number of cores
An object of class hclust
hc <- bootstrapHclust(USArrests, nCore = 1)
Get coefficients from a cv.MLGL object
## S3 method for class 'cv.MLGL'
coef(object, s = c("lambda.1se", "lambda.min"), ...)
- object: cv.MLGL object
- s: either "lambda.1se" or "lambda.min"
- ...: not used. Other arguments to predict
A matrix with estimated coefficients for given values of s.
Quentin Grimonprez
Get coefficients from a MLGL object
## S3 method for class 'MLGL'
coef(object, s = NULL, ...)
- object: MLGL object
- s: values of lambda. If NULL, the values from the object are used
- ...: not used. Other arguments to predict
A matrix with estimated coefficients for given values of s.
Quentin Grimonprez
Compute the group size weight vector with an authorized maximal size
computeGroupSizeWeight(hc, sizeMax = NULL)
- hc: output of hclust
- sizeMax: maximum size of clusters to consider
the weight vector
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Use 20 as the maximal group size
hc <- hclust(dist(t(X)))
w <- computeGroupSizeWeight(hc, sizeMax = 20)
# Apply MLGL method
res <- MLGL(X, y, hc = hc, weightSizeGroup = w)
V-fold cross validation for the MLGL function
cv.MLGL(X, y, nfolds = 5, lambda = NULL, hc = NULL, weightLevel = NULL, weightSizeGroup = NULL, loss = c("ls", "logit"), intercept = TRUE, sizeMaxGroup = NULL, verbose = FALSE, ...)
- X: matrix of size n*p
- y: vector of size n. If loss = "logit", elements of y must be in {-1, 1}
- nfolds: number of folds
- lambda: lambda values for the group-lasso. If not provided, the function generates its own sequence of lambda values
- hc: output of hclust
- weightLevel: a vector of size p with a weight for each level of the hierarchy. A zero indicates that the level will be ignored. If not provided, 1/(height between 2 successive levels) is used
- weightSizeGroup: a vector of size 2*p-1 containing the weight of each group. Default is the square root of the size of each group
- loss: a character string specifying the loss function to use; valid options are "ls" (least squares loss, regression) and "logit" (logistic loss, classification)
- intercept: should an intercept be included in the model?
- sizeMaxGroup: maximum size of selected groups. If NULL, no restriction
- verbose: print some information
- ...: other parameters for MLGL
Hierarchical clustering is performed on all the variables. Then, the partitions from the different levels of the hierarchy are used in the different runs of MLGL for cross-validation.
a cv.MLGL object containing:
- lambda: values of lambda
- cvm: the mean cross-validated error
- cvsd: estimate of the standard error of cvm
- the upper curve: cvm + cvsd
- the lower curve: cvm - cvsd
- lambda.min: the value of lambda that gives the minimum cross-validation error cvm
- lambda.1se: the largest value of lambda such that the error is within 1 standard error of the minimum
- the computation time
Quentin Grimonprez
MLGL, stability.MLGL, predict.cv.gglasso, coef.cv.MLGL, plot.cv.MLGL
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply cv.MLGL method
res <- cv.MLGL(X, y)
Perform an F-test
Ftest(X, y, varToTest)
- X: design matrix of size n*p
- y: response vector of length n
- varToTest: vector containing the indices of the columns of X to test
The underlying model is y = X * beta + epsilon.
Null hypothesis: beta[varToTest] = 0. Alternative hypothesis: there exists an index k in varToTest such that beta[k] != 0.
The test statistic is based on a full and a reduced model. Full model: y = X[, varToTest] * beta[varToTest] + epsilon. Reduced model: the null model.
a vector of the same length as varToTest containing the p-values of the test.
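A minimal usage sketch (the simulated data and the tested columns are illustrative):

set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# p-values for the null hypothesis that the coefficients of columns 1 to 5 are all zero
pval <- Ftest(X, y, varToTest = 1:5)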
Run a hierarchical clustering followed by a group-lasso on all the different partitions, and then a hierarchical testing procedure. Only for linear regression problems.
fullProcess(X, ...)

## Default S3 method:
fullProcess(X, y, control = c("FWER", "FDR"), alpha = 0.05, test = partialFtest, hc = NULL, fractionSampleMLGL = 1/2, BHclust = 50, nCore = NULL, addRoot = FALSE, Shaffer = FALSE, ...)

## S3 method for class 'formula'
fullProcess(formula, data, control = c("FWER", "FDR"), alpha = 0.05, test = partialFtest, hc = NULL, fractionSampleMLGL = 1/2, BHclust = 50, nCore = NULL, addRoot = FALSE, Shaffer = FALSE, ...)
- X: matrix of size n*p
- ...: other parameters for MLGL
- y: vector of size n
- control: either "FDR" or "FWER"
- alpha: control level for the testing procedure
- test: test used in the testing procedure. Default is partialFtest
- hc: output of hclust
- fractionSampleMLGL: a real number between 0 and 1: the fraction of individuals used in the sample for MLGL (see Details)
- BHclust: number of bootstrap replicates for computing the distance matrix for the hierarchical clustering tree
- nCore: number of cores used for the distance computation. Uses all cores by default
- addRoot: if TRUE, add a common root containing all the groups
- Shaffer: if TRUE, a Shaffer correction is performed (only if control = "FWER")
- formula: an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted
- data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula)
The n individuals are divided into two samples. Then the following three steps are performed: 1) bootstrap hierarchical clustering of the variables of X; 2) MLGL on the second sample of individuals; 3) hierarchical testing procedure on the first sample of individuals.
a list containing:
- the output of the MLGL function
- lambdaOpt: the lambda values maximizing the number of rejections
- a vector containing the indices of the selected variables for the first lambdaOpt value
- a vector containing the indices of the selected groups for the first lambdaOpt value
- the selected groups for the first lambdaOpt value
- the selected groups for all lambda values
- the control level
- the test used in the testing procedure
- "FDR" or "FWER"
- the elapsed time
Quentin Grimonprez
MLGL, hierarchicalFDR, hierarchicalFWER, selFDR, selFWER
# least square loss
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- fullProcess(X, y)
Apply hierarchical test for each hierarchy, and test external variables for FDR control at level alpha
hierarchicalFDR(X, y, group, var, test = partialFtest, addRoot = FALSE)
- X: original data
- y: associated response
- group: vector with the group indices. group[i] contains the index of the group of the variable var[i]
- var: vector with the variables contained in each group. group[i] contains the index of the group of the variable var[i]
- test: function for testing the nullity of a group of coefficients in linear regression. The function has 3 arguments: X, y and varToTest (see partialFtest)
- addRoot: if TRUE, add a common root containing all the groups
Version of the hierarchical testing procedure of Yekutieli for MLGL output. You can use the selFDR function to select groups at a desired level alpha.
a list containing:
- the p-values of the different tests (without correction)
- the adjusted p-values
- the indices of the groups
- a matrix describing the hierarchical tree
Yekutieli, Daniel. "Hierarchical False Discovery Rate-Controlling Methodology." Journal of the American Statistical Association 103.481 (2008): 309-16.
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- MLGL(X, y)
test <- hierarchicalFDR(X, y, res$group[[20]], res$var[[20]])
Apply hierarchical test for each hierarchy, and test external variables for FWER control at level alpha
hierarchicalFWER(X, y, group, var, test = partialFtest, Shaffer = FALSE, addRoot = FALSE)
- X: original data
- y: associated response
- group: vector with the group indices. group[i] contains the index of the group of the variable var[i]
- var: vector with the variables contained in each group. group[i] contains the index of the group of the variable var[i]
- test: function for testing the nullity of a group of coefficients in linear regression. The function has 3 arguments: X, y and varToTest (see partialFtest)
- Shaffer: boolean; if TRUE, a Shaffer correction is performed
- addRoot: if TRUE, add a common root containing all the groups
Version of the hierarchical testing procedure of Meinshausen for MLGL output. You can use the selFWER function to select groups at a desired level alpha.
a list containing:
- the p-values of the different tests (without correction)
- the adjusted p-values
- the indices of the groups
- a matrix describing the hierarchical tree
Meinshausen, Nicolai. "Hierarchical Testing of Variable Importance." Biometrika 95.2 (2008): 265-78.
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- MLGL(X, y)
test <- hierarchicalFWER(X, y, res$group[[20]], res$var[[20]])
Apply the Hierarchical Multiple Testing procedure on a MLGL object
HMT(res, X, y, control = c("FWER", "FDR"), alpha = 0.05, test = partialFtest, addRoot = FALSE, Shaffer = FALSE, ...)
- res: MLGL object
- X: matrix of size n*p
- y: vector of size n
- control: either "FDR" or "FWER"
- alpha: control level for the testing procedure
- test: test used in the testing procedure. Default is partialFtest
- addRoot: if TRUE, add a common root containing all the groups
- Shaffer: if TRUE, a Shaffer correction is performed (only if control = "FWER")
- ...: extra parameters for selFDR
a list containing:
- lambdaOpt: the lambda values maximizing the number of rejections
- a vector containing the indices of the selected variables for the first lambdaOpt value
- a vector containing the indices of the selected groups for the first lambdaOpt value
- the selected groups for the first lambdaOpt value
- the indices associated with the optimal lambdas
- the selected groups for all lambda values
- the control level
- the test used in the testing procedure
- "FDR" or "FWER"
- the elapsed time
- hierTest: a list containing the output of the testing function for each lambda. Each element can be used with the selFWER or selFDR functions
- the lambda path
- the number of groups before testing
- the number of groups after testing
hierarchicalFWER, hierarchicalFDR, selFWER, selFDR
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- MLGL(X, y)
# perform hierarchical testing with FWER control
out <- HMT(res, X, y, alpha = 0.05)
# test a new value of alpha for a specific lambda
selFWER(out$hierTest[[60]], alpha = 0.1)
Obtain a sparse matrix of the coefficients of the path
listToMatrix(x, row = c("covariates", "lambda"))
- x: MLGL object
- row: "lambda" or "covariates". If row = "covariates", each row of the output matrix represents a covariate; if row = "lambda", each row represents a value of lambda
This function can be used with a MLGL object to obtain a matrix with all estimated coefficients for the p original variables. In case of overlapping groups, coefficients from repeated variables are summed.
a sparse matrix containing the estimated coefficients for different lambdas
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
# Convert output to sparse matrix format
beta <- listToMatrix(res)
Run a hierarchical clustering followed by a group-lasso on all the different partitions.
MLGL(X, ...)

## Default S3 method:
MLGL(X, y, hc = NULL, lambda = NULL, weightLevel = NULL, weightSizeGroup = NULL, intercept = TRUE, loss = c("ls", "logit"), sizeMaxGroup = NULL, verbose = FALSE, ...)

## S3 method for class 'formula'
MLGL(formula, data, hc = NULL, lambda = NULL, weightLevel = NULL, weightSizeGroup = NULL, intercept = TRUE, loss = c("ls", "logit"), verbose = FALSE, ...)
- X: matrix of size n*p
- ...: other parameters for the gglasso function
- y: vector of size n. If loss = "logit", elements of y must be in {-1, 1}
- hc: output of hclust
- lambda: lambda values for the group-lasso. If not provided, the function generates its own sequence of lambda values
- weightLevel: a vector of size p with a weight for each level of the hierarchy. A zero indicates that the level will be ignored. If not provided, 1/(height between 2 successive levels) is used. Only used if hc is provided
- weightSizeGroup: a vector of size 2*p-1 containing the weight of each group. Default is the square root of the size of each group. Only used if hc is provided
- intercept: should an intercept be included in the model?
- loss: a character string specifying the loss function to use; valid options are "ls" (least squares loss, regression) and "logit" (logistic loss, classification)
- sizeMaxGroup: maximum size of selected groups. If NULL, no restriction
- verbose: print some information
- formula: an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted
- data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula)
a MLGL object containing:
- the lambda values
- the intercept values for each lambda
- a list containing the values of the estimated coefficients for each value of lambda
- a list containing the indices of the selected variables for each value of lambda
- a list containing the indices of the selected groups for each value of lambda
- a vector containing the number of non-zero coefficients for each value of lambda
- a vector containing the number of non-zero groups for each value of lambda
- a list containing 4 vectors. var: all variables used. group: associated groups. weight: weight associated with the different groups. level: for each group, the corresponding level of the hierarchy where it appears and disappears (3 indicates the level with a partition of 3 groups)
- the computation time
- the dimension of X
- the output of the hierarchical clustering
- the call executed by the user
Quentin Grimonprez
cv.MLGL, stability.MLGL, listToMatrix, predict.MLGL, coef.MLGL, plot.cv.MLGL
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
Group-lasso with overlapping groups
overlapgglasso(X, y, var, group, lambda = NULL, weight = NULL, loss = c("ls", "logit"), intercept = TRUE, ...)
- X: matrix of size n*p
- y: vector of size n. If loss = "logit", elements of y must be in {-1, 1}
- var: vector containing the variables to use
- group: vector containing the associated groups
- lambda: lambda values for the group-lasso. If not provided, the function generates its own sequence of lambda values
- weight: a vector with the weight of each group. Default is the square root of the size of each group
- loss: a character string specifying the loss function to use; valid options are "ls" (least squares loss, regression) and "logit" (logistic loss, classification)
- intercept: should an intercept be included in the model?
- ...: other parameters for the gglasso function
A group-lasso algorithm (see gglasso) is used to solve a group-lasso with overlapping groups. Each variable j of the original matrix X is duplicated k(j) times in a new dataset, where k(j) is the number of different groups containing variable j. The group-lasso is then run on this expanded dataset, which yields a solution of the group-lasso with overlapping groups.
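A minimal sketch of this duplication idea (not the package's internal code; expandDesign and the toy var/group vectors are illustrative):

# var gives, for each column of the expanded design, the index of the original
# variable; group gives the group each expanded column belongs to.
expandDesign <- function(X, var) {
  Xnew <- X[, var, drop = FALSE]      # duplicate the columns of X according to var
  colnames(Xnew) <- paste0("V", var)  # keep track of the original variable indices
  Xnew
}

# Example: variables 1:10 in two groups of 5, plus an overlapping group {4, ..., 8}
var <- c(1:10, 4:8)
group <- c(rep(1:2, each = 5), rep(3, 5))
# A standard group-lasso on (expandDesign(X, var), group) then solves the
# overlapping problem on the original X; this is what overlapgglasso automates.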
a MLGL object containing:
- the lambda values
- the intercept values for each lambda
- a list containing the values of the estimated coefficients for each value of lambda
- a list containing the indices of the selected variables for each value of lambda
- a list containing the indices of the selected groups for each value of lambda
- a vector containing the number of non-zero coefficients for each value of lambda
- a vector containing the number of non-zero groups for each value of lambda
- a list containing 3 vectors. var: all variables used. group: associated groups. weight: weight associated with the different groups
- the computation time
- the dimension of X
Laurent Jacob, Guillaume Obozinski, and Jean-Philippe Vert. 2009. Group lasso with overlap and graph lasso. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09).
# Least square loss
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
var <- c(1:60, 1:8, 7:15)
group <- c(rep(1:12, each = 5), rep(13, 8), rep(14, 9))
res <- overlapgglasso(X, y, var, group)

# Logistic loss
y <- 2 * (rowSums(X[, 1:4]) > 0) - 1
var <- c(1:60, 1:8, 7:15)
group <- c(rep(1:12, each = 5), rep(13, 8), rep(14, 9))
res <- overlapgglasso(X, y, var, group, loss = "logit")
Perform a partial F-test
partialFtest(X, y, varToTest)
- X: design matrix of size n*p
- y: response vector of length n
- varToTest: vector containing the indices of the columns of X to test
The underlying model is y = X * beta + epsilon.
Null hypothesis: beta[varToTest] = 0. Alternative hypothesis: there exists an index k in varToTest such that beta[k] != 0.
The test statistic is based on a full and a reduced model. Full model: y = X * beta + epsilon. Reduced model: y = X[, -varToTest] * beta[-varToTest] + epsilon.
a vector of the same length as varToTest containing the p-values of the test.
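A minimal usage sketch (illustrative data; partialFtest is the default test of the hierarchical testing procedures above):

set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# p-values for the null hypothesis beta[6:10] = 0 against the full model
pval <- partialFtest(X, y, varToTest = 6:10)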
Plot the cross-validation error obtained from the cv.MLGL function
## S3 method for class 'cv.MLGL'
plot(x, log.lambda = FALSE, ...)
- x: cv.MLGL object
- log.lambda: if TRUE, use log(lambda) instead of lambda on the x-axis
- ...: other parameters for the plot function
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply cv.MLGL method
res <- cv.MLGL(X, y)
# Plot the cv error curve
plot(res)
Plot the path obtained from the fullProcess function
## S3 method for class 'fullProcess'
plot(x, log.lambda = FALSE, lambda.lines = FALSE, lambda.opt = c("min", "max", "both"), ...)
- x: fullProcess object
- log.lambda: if TRUE, use log(lambda) instead of lambda on the x-axis
- lambda.lines: if TRUE, add vertical lines at the lambda values
- lambda.opt: if there are several optimal lambdas, which one to print: "min", "max" or "both"
- ...: other parameters for the plot function
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply fullProcess method
res <- fullProcess(X, y)
# Plot the solution path
plot(res)
Plot the path obtained from the HMT function
## S3 method for class 'HMT'
plot(x, log.lambda = FALSE, lambda.lines = FALSE, lambda.opt = c("min", "max", "both"), ...)
- x: HMT object
- log.lambda: if TRUE, use log(lambda) instead of lambda on the x-axis
- lambda.lines: if TRUE, add vertical lines at the lambda values
- lambda.opt: if there are several optimal lambdas, which one to print: "min", "max" or "both"
- ...: other parameters for the plot function
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
out <- HMT(res, X, y)
plot(out)
Plot the path obtained from the MLGL function
## S3 method for class 'MLGL'
plot(x, log.lambda = FALSE, lambda.lines = FALSE, ...)
- x: MLGL object
- log.lambda: if TRUE, use log(lambda) instead of lambda on the x-axis
- lambda.lines: if TRUE, add vertical lines at the lambda values
- ...: other parameters for the plot function
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
# Plot the solution path
plot(res)
Plot the stability path obtained from the stability.MLGL function
## S3 method for class 'stability.MLGL'
plot(x, log.lambda = FALSE, threshold = 0.75, ...)
- x: stability.MLGL object
- log.lambda: if TRUE, use log(lambda) instead of lambda on the x-axis
- threshold: threshold on the selection frequency
- ...: other parameters for the plot function
A list containing:
- the indices of the selected variables for the given threshold
- the indices of the associated groups
- the value of the threshold
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply stability.MLGL method
res <- stability.MLGL(X, y)
selected <- plot(res)
print(selected)
Predict fitted values from a cv.MLGL object
## S3 method for class 'cv.MLGL'
predict(object, newx = NULL, s = c("lambda.1se", "lambda.min"), type = c("fit", "coefficients"), ...)
- object: cv.MLGL object
- newx: matrix with new individuals for prediction. If type = "coefficients", this parameter has to be NULL
- s: either "lambda.1se" or "lambda.min"
- type: if "fit", return the fitted values for each value of s; if "coefficients", return the estimated coefficients for each s
- ...: not used. Other arguments to predict
A matrix with fitted values or estimated coefficients for given values of s.
Quentin Grimonprez
Predict fitted values from a MLGL object
## S3 method for class 'MLGL'
predict(object, newx = NULL, s = NULL, type = c("fit", "coefficients"), ...)
- object: MLGL object
- newx: matrix with new individuals for prediction. If type = "coefficients", this parameter has to be NULL
- s: values of lambda. If NULL, the values from the object are used
- type: if "fit", return the fitted values for each value of s; if "coefficients", return the estimated coefficients for each s
- ...: not used. Other arguments to predict
A matrix with fitted values or estimated coefficients for given values of s.
Original code from the gglasso package by Yi Yang <[email protected]> and Hui Zou <[email protected]>.
This function is inspired by the predict function from the gglasso package.
X <- simuBlockGaussian(n = 50, nBlock = 12, sizeBlock = 5, rho = 0.7)
y <- drop(X[, c(2, 7, 12)] %*% c(2, 2, -1)) + rnorm(50, 0, 0.5)
m1 <- MLGL(X, y, loss = "ls")
predict(m1, newx = X)
predict(m1, s = 3, newx = X)
predict(m1, s = 1:3, newx = X)
Print a fullProcess object
## S3 method for class 'fullProcess'
print(x, ...)
- x: fullProcess object
- ...: not used
fullProcess, summary.fullProcess
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply fullProcess method
res <- fullProcess(X, y)
print(res)
Print a HMT object
## S3 method for class 'HMT'
print(x, ...)
- x: HMT object
- ...: not used
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
out <- HMT(res, X, y)
print(out)
Print a MLGL object
## S3 method for class 'MLGL'
print(x, ...)
- x: MLGL object
- ...: not used
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
print(res)
Select groups from hierarchical testing procedure with FDR control (hierarchicalFDR)
selFDR(out, alpha = 0.05, global = TRUE, outer = TRUE)
- out: output of the hierarchicalFDR function
- alpha: control level for the test
- global: if FALSE, the provided alpha is the desired control level for each family
- outer: if TRUE, the FDR is controlled only on outer nodes (rejected groups without rejected children). If FALSE, it is controlled on the full tree
See the reference for more details about the method.
If each family is controlled at level alpha, the following controls hold (with delta = 1.44): FDR control on the full tree at level alpha * delta * 2; FDR control on outer nodes at level alpha * L * delta * 2.
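As a quick numerical illustration of these bounds (a sketch; delta = 1.44 as stated above):

alpha <- 0.05
delta <- 1.44
alpha * delta * 2  # full-tree FDR control level: 0.144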
a list containing:
- a vector of booleans: TRUE if the group is selected
- the names of the groups
- the control level for each family of hypotheses
- the control level for the tree (full tree or outer nodes)
Yekutieli, Daniel. "Hierarchical False Discovery Rate-Controlling Methodology." Journal of the American Statistical Association 103.481 (2008): 309-16.
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- MLGL(X, y)
test <- hierarchicalFDR(X, y, res$group[[20]], res$var[[20]])
sel <- selFDR(test, alpha = 0.05)
Select groups from hierarchical testing procedure with FWER control (hierarchicalFWER)
selFWER(out, alpha = 0.05)
- out: output of the hierarchicalFWER function
- alpha: control level for the test
Only outer nodes (rejected groups without rejected children) are returned as TRUE.
a list containing:
- a vector of booleans: TRUE if the group is selected
- the names of the groups
Meinshausen, Nicolai. "Hierarchical Testing of Variable Importance." Biometrika 95.2 (2008): 265-78.
set.seed(42)
X <- simuBlockGaussian(50, 12, 5, 0.7)
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
res <- MLGL(X, y)
test <- hierarchicalFWER(X, y, res$group[[20]], res$var[[20]])
sel <- selFWER(test, alpha = 0.05)
Simulate n samples from a multivariate gaussian distribution with zero mean vector and a block-diagonal covariance matrix with 1 on the diagonal and rho within each block.
simuBlockGaussian(n, nBlock, sizeBlock, rho)
- n: number of samples to simulate
- nBlock: number of blocks
- sizeBlock: size of the blocks
- rho: correlation within each block
a matrix of size n * (nBlock * sizeBlock) containing the samples
Quentin Grimonprez
X <- simuBlockGaussian(50, 12, 5, 0.7)
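A quick empirical check of the simulated correlation structure (a sketch; the expected within-block correlation here is rho = 0.7):

X <- simuBlockGaussian(50, 12, 5, 0.7)
# mean empirical correlation within the first block (close to 0.7)
mean(cor(X[, 1:5])[upper.tri(diag(5))])
# correlation between variables from different blocks (close to 0)
cor(X[, 1], X[, 6])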
Stability selection for MLGL
stability.MLGL(X, y, B = 50, fraction = 0.5, hc = NULL, lambda = NULL, weightLevel = NULL, weightSizeGroup = NULL, loss = c("ls", "logit"), intercept = TRUE, verbose = FALSE, ...)
- X: matrix of size n*p
- y: vector of size n. If loss = "logit", elements of y must be in {-1, 1}
- B: number of bootstrap samples
- fraction: fraction of the data used at each of the B subsamples
- hc: output of hclust
- lambda: lambda values for the group-lasso. If not provided, the function generates its own sequence of lambda values
- weightLevel: a vector of size p with a weight for each level of the hierarchy. A zero indicates that the level will be ignored. If not provided, 1/(height between 2 successive levels) is used
- weightSizeGroup: a vector of size 2*p-1 containing the weight of each group. Default is the square root of the size of each group
- loss: a character string specifying the loss function to use; valid options are "ls" (least squares loss, regression) and "logit" (logistic loss, classification)
- intercept: should an intercept be included in the model?
- verbose: print some information
- ...: other parameters for MLGL
Hierarchical clustering is performed on all the variables. Then, the partitions from the different levels of the hierarchy are used in the different runs of MLGL for estimating the probability of selection of each group.
a stability.MLGL object containing:
- the sequence of lambda values
- the number of bootstrap samples
- a matrix of size length(lambda) x number of groups containing the selection probability of each group
- a vector containing the indices of the covariates
- a vector containing the indices of the associated groups of covariates
- the computation time
Quentin Grimonprez
Meinshausen and Buhlmann (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72.4, p. 417-473.
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply stability.MLGL method
res <- stability.MLGL(X, y)
Summary of a fullProcess object
## S3 method for class 'fullProcess'
summary(object, ...)
- object: fullProcess object
- ...: not used
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply fullProcess method
res <- fullProcess(X, y)
summary(res)
Summary of a HMT object
## S3 method for class 'HMT'
summary(object, ...)
- object: HMT object
- ...: not used
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
out <- HMT(res, X, y)
summary(out)
Summary of a MLGL object
## S3 method for class 'MLGL'
summary(object, ...)
- object: MLGL object
- ...: not used
set.seed(42)
# Simulate gaussian data with block-diagonal variance matrix containing 12 blocks of size 5
X <- simuBlockGaussian(50, 12, 5, 0.7)
# Generate a response variable
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)
# Apply MLGL method
res <- MLGL(X, y)
summary(res)
Find all unique groups in hclust results
uniqueGroupHclust(hc)
- hc: output of hclust
A list containing:
- a vector containing the indices of the variables
- a vector containing the index of the group of each variable
Quentin Grimonprez
hc <- hclust(dist(USArrests), "average")
res <- uniqueGroupHclust(hc)