Title: | Targeted Gold Standard Testing |
---|---|
Description: | Functions for implementing the targeted gold standard (GS) testing. You provide the true disease or treatment failure status and the risk score, tell 'TGST' the availability of GS tests and which method to use, and it returns the optimal tripartite rules. Please refer to Liu et al. (2013) <doi:10.1080/01621459.2013.810149> for more details. |
Authors: | Yizhen Xu [aut, cre], Tao Liu [aut] |
Maintainer: | Yizhen Xu <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2024-11-10 04:45:10 UTC |
Source: | https://github.com/yizhenxu/tgst |
This function gives you the AUC associated with a set of tripartite rules.
cal.AUC(Z, S, l, u)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
l |
Lower cutoff of all possible tripartite rules. |
u |
Upper cutoff of all possible tripartite rules. |
AUC.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
rules = nonpar.rules(Z, S, phi)
cal.AUC(Z, S, rules[,1], rules[,2])
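As a rough illustration of how an AUC can be obtained from a set of (FNR, FPR) pairs, the sketch below applies the trapezoidal rule to the points (FPR, 1 - FNR). The helper `trapezoid_auc` is hypothetical and not part of TGST; anchoring the curve at (0, 0) and (1, 1) is an assumption, and the package's own computation may differ.

```r
# Hypothetical helper (not part of TGST): trapezoidal AUC from ROC points.
# Assumption: the curve is anchored at (0, 0) and (1, 1).
trapezoid_auc <- function(fpr, tpr) {
  o <- order(fpr, tpr)               # sort points from left to right
  x <- c(0, fpr[o], 1)
  y <- c(0, tpr[o], 1)
  # sum of trapezoid areas between consecutive points
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Made-up error rates for three rules
fnr <- c(0.50, 0.20, 0.05)
fpr <- c(0.05, 0.20, 0.50)
auc <- trapezoid_auc(fpr, 1 - fnr)
print(auc)
```

A single point on the diagonal, such as (0.5, 0.5), gives an area of 0.5 under this convention, which is the usual reference value for an uninformative score.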
This function provides a graphical assessment of the suitability of the exponential tilt model for the risk score when finding optimal tripartite rules by the semiparametric approach.
Check.exp.tilt(Z, S)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
Plot of empirical density for risk score S, joint empirical density for (S,Z=1) and (S,Z=0), and the density under the exponential tilt model assumption for (S,Z=1) and (S,Z=0).
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
Check.exp.tilt(Z, S)
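For intuition about what is being checked: if the tilt is assumed linear in the score, so that the density of S among failures equals exp(b0 + b1 * s) times the density among non-failures, then P(Z = 1 | S = s) is logistic in s and the tilt slope can be estimated by logistic regression. The sketch below uses made-up normal data for which this assumption holds exactly. It is not the package's implementation, and the linear form of the tilt is an assumption here.

```r
# Sketch under assumptions: simulated data, linear exponential tilt.
set.seed(42)
n0 <- 1000
n1 <- 1000
# N(0, 1) scores for Z = 0 and N(1, 1) scores for Z = 1.
# The density ratio is exp(s - 0.5), so the true tilt slope is 1.
S <- c(rnorm(n0, mean = 0), rnorm(n1, mean = 1))
Z <- rep(c(0, 1), times = c(n0, n1))

# Logistic regression of Z on S estimates the tilt slope
fit <- glm(Z ~ S, family = binomial)
beta <- coef(fit)
print(beta)
```

If the fitted logistic curve tracks the data poorly, the tilt assumption is questionable and the nonparametric approach may be the safer choice.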
This function allows you to compute the average misdiagnosis rates for viral failure and the optimal risk under min-λ rules from K-fold cross-validation.
CV.TGST(Obj, lambda, K = 10)
Obj |
An object of class TGST. |
lambda |
A user-specified weight that reflects relative loss for the two types of misdiagnoses, taking value in [0, 1]. |
K |
Number of folds in cross validation. The default is 10. |
Cross-validation results.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
Obj = TGST(Z, S, phi, method="nonpar")
lambda = 0.8
CV.TGST(Obj, lambda, K=10)
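The general shape of K-fold cross-validation for a rule search can be sketched in base R as follows. Everything here is illustrative: the data are simulated, the candidate grid is made up and ignores the testing budget phi, and the risk is assumed to be lambda * FNR + (1 - lambda) * FPR. It is not the code behind `CV.TGST`.

```r
# Illustrative K-fold cross-validation; not the CV.TGST implementation.
set.seed(1)
n <- 150
K <- 5
lambda <- 0.8
Z <- rep(c(0, 0, 1), length.out = n)      # made-up failure status
S <- rnorm(n, mean = 2 * Z)               # made-up risk score
fold <- rep(seq_len(K), length.out = n)   # deterministic fold labels

# Made-up grid of candidate (lower, upper) cutoffs
grid <- expand.grid(l = c(0, 0.5), u = c(1.5, 2))

# Empirical error rates of a tripartite rule (assumed convention)
rates <- function(Z, S, l, u) {
  c(FNR = mean(S[Z == 1] < l), FPR = mean(S[Z == 0] > u))
}

cv <- t(sapply(seq_len(K), function(k) {
  train <- fold != k
  # choose the rule with the smallest weighted risk on the training folds
  risk <- apply(grid, 1, function(r) {
    e <- rates(Z[train], S[train], r[["l"]], r[["u"]])
    lambda * e[["FNR"]] + (1 - lambda) * e[["FPR"]]
  })
  best <- grid[which.min(risk), ]
  # evaluate the chosen rule on the held-out fold
  rates(Z[!train], S[!train], best$l, best$u)
}))
print(colMeans(cv))   # average held-out FNR and FPR
```

Choosing the rule on the training folds and scoring it only on the held-out fold is what keeps the averaged error rates from being optimistically biased.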
This function gives you the nonparametric FNR and FPR associated with a given tripartite rule.
nonpar.fnr.fpr(Z, S, l, u)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
l |
Lower cutoff of tripartite rule. |
u |
Upper cutoff of tripartite rule. |
Matrix with 2 columns. Each row is a set of nonparametric (FNR, FPR) on an associated tripartite rule.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
rules = nonpar.rules(Z, S, phi)
nonpar.fnr.fpr(Z, S, rules[1,1], rules[1,2])
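To make the two error rates concrete, here is a minimal base R sketch of empirical FNR and FPR for one tripartite rule. It assumes that patients with S < l are classified as negative, those with S > u as positive, and those in between are referred for gold standard testing. The helper name and the handling of scores equal to a cutoff are assumptions, not the package's definitions.

```r
# Hypothetical helper, not part of the TGST package.
# Assumed convention: S < l is classified negative, S > u is classified
# positive, and l <= S <= u is referred for gold standard testing.
empirical_fnr_fpr <- function(Z, S, l, u) {
  fnr <- mean(S[Z == 1] < l)  # failures wrongly classified as negative
  fpr <- mean(S[Z == 0] > u)  # non-failures wrongly classified as positive
  c(FNR = fnr, FPR = fpr)
}

# Toy data (made up for illustration)
Z <- c(0, 0, 0, 0, 1, 1, 1, 1)
S <- c(1, 2, 3, 6, 2, 5, 7, 8)
res <- empirical_fnr_fpr(Z, S, l = 3, u = 5)
print(res)
```

In this toy data one of the four failures falls below the lower cutoff and one of the four non-failures falls above the upper cutoff, so both rates are 0.25.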
This function gives you all possible cutoffs for tripartite rules, by applying nonparametric search to the given data.
nonpar.rules(Z, S, phi)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
phi |
Percentage of patients taking viral load test. |
Matrix with 2 columns. Each row is a possible tripartite rule, with output on lower and upper cutoff.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
nonpar.rules(Z, S, phi)
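One plausible way to enumerate candidate cutoffs is sketched below: for each observed score used as a lower cutoff, take the largest upper cutoff such that the share of patients falling in [l, u] does not exceed phi. This is an assumption about the search, written only to illustrate the idea. `candidate_rules` is a hypothetical helper and the actual algorithm in `nonpar.rules` may differ.

```r
# Hypothetical enumeration of tripartite rules; not the package's algorithm.
candidate_rules <- function(S, phi) {
  cuts <- sort(unique(S))
  rules <- NULL
  for (l in cuts) {
    uppers <- cuts[cuts >= l]
    # share of patients referred for testing for each possible upper cutoff
    frac <- vapply(uppers, function(u) mean(S >= l & S <= u), numeric(1))
    feasible <- uppers[frac <= phi]
    if (length(feasible) > 0) {
      rules <- rbind(rules, c(lower = l, upper = max(feasible)))
    }
  }
  rules
}

S <- 1:20                       # made-up scores
rules <- candidate_rules(S, phi = 0.25)
print(head(rules))
```

With 20 distinct scores and phi = 0.25, each rule may refer at most five patients, so the first rule spans scores 1 through 5.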
This function gives you the optimal nonparametric tripartite rule that minimizes the min-λ risk.
Opt.nonpar.rule(Z, S, phi, lambda)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
phi |
Percentage of patients taking viral load test. |
lambda |
A user-specified weight that reflects relative loss for the two types of misdiagnoses, taking value in [0, 1].
Optimal nonparametric rule and its associated misclassification rates (FNR, FPR), optimal lambda risk, and total misclassification rate (TMR).
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Opt.nonpar.rule(Z, S, phi, lambda)
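The role of lambda can be illustrated with a small sketch that picks, among candidate rules, the one with the smallest weighted risk. The form lambda * FNR + (1 - lambda) * FPR is an assumption about how the weight enters; `lambda_risk` is a hypothetical helper and the error rates are made up.

```r
# Assumed risk: lambda weights the FNR and (1 - lambda) weights the FPR.
lambda_risk <- function(fnr, fpr, lambda) {
  lambda * fnr + (1 - lambda) * fpr
}

# Made-up error rates for three candidate rules
fnr <- c(0.10, 0.20, 0.40)
fpr <- c(0.50, 0.20, 0.05)

best_equal <- which.min(lambda_risk(fnr, fpr, lambda = 0.5))
best_fnr_heavy <- which.min(lambda_risk(fnr, fpr, lambda = 0.8))
print(c(best_equal, best_fnr_heavy))
```

Raising lambda from 0.5 to 0.8 shifts the choice from the second rule to the first, which has the lowest FNR, showing how the weight trades one type of misdiagnosis against the other.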
This function gives you the optimal semiparametric tripartite rule that minimizes the min-λ risk.
Opt.semipar.rule(Z, S, phi, lambda)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
phi |
Percentage of patients taking viral load test. |
lambda |
A user-specified weight that reflects relative loss for the two types of misdiagnoses, taking value in [0, 1].
Optimal semiparametric rule and its associated misclassification rates (FNR, FPR), optimal lambda risk, and total misclassification rate (TMR).
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Opt.semipar.rule(Z, S, phi, lambda)
OptimalRule is the main function of TGST. It gives you the optimal tripartite rule that minimizes the min-λ risk, based on the approach selected by the user.
The function takes the risk score and true disease status from a training data set and returns the optimal tripartite rule under the specified proportion of patients able to take gold standard test.
OptimalRule(Obj, lambda)
Obj |
An object of class TGST. |
lambda |
A user-specified weight that reflects relative loss for the two types of misdiagnoses, taking value in [0, 1].
Optimal tripartite rule.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
OptimalRule(Obj, lambda)
This class contains the invisible results of the OptimalRule function.
phi |
Percentage of patients taking viral load test. |
Z |
A vector of true disease status (Failure Status coded as Z=1). |
S |
A vector of risk scores. |
Rules |
A matrix of all possible tripartite rules (two cutoffs) derived from the training data set. |
Nonparametric |
A boolean indicating if the nonparametric approach should be used in calculating the misclassification rates. If FALSE, the semiparametric approach is used. |
FNR.FPR |
A matrix with two columns of misclassification rates, FNR and FPR. |
A numeric vector with two elements, the lower and upper cutoffs of the optimal tripartite rule.
The Output class adds optimal rule to the TGST class.
This function visualizes an object of class TGST or Output.
## S4 method for signature 'TGST'
plot(x)
## S4 method for signature 'Output'
plot(x)
x |
Distribution plot.
This function performs ROC analysis for tripartite rules by the nonparametric approach. If plot = TRUE, the ROC curve is returned.
ROC.nonpar(Z, S, phi, plot = TRUE)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
phi |
Percentage of patients taking viral load test. |
plot |
Logical parameter indicating if ROC curve should be plotted. Default is TRUE. |
AUC |
The area under the ROC curve. |
FNR |
Misdiagnosis rate among patients with viral failure (false negative rate). |
FPR |
Misdiagnosis rate among patients without viral failure (false positive rate). |
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
a = ROC.nonpar(Z, S, phi, plot=TRUE)
a$AUC
a$FNR
a$FPR
This function performs ROC analysis on the rules from the nonparametric approach. If plot = TRUE, the ROC curve is returned.
ROC.semipar(Z, S, phi, plot = TRUE)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
phi |
Percentage of patients taking viral load test. |
plot |
Logical parameter indicating if ROC curve should be plotted. Default is TRUE. |
AUC |
The area under the ROC curve. |
FNR |
Misdiagnosis rate among patients with viral failure (false negative rate). |
FPR |
Misdiagnosis rate among patients without viral failure (false positive rate). |
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
a = ROC.semipar(Z, S, phi, plot=TRUE)
a$AUC
a$FNR
a$FPR
This function performs ROC analysis for tripartite rules. If plot = TRUE, the ROC curve is returned.
ROCAnalysis(Obj, plot = TRUE)
Obj |
An object of class TGST. |
plot |
Logical parameter indicating if ROC curve should be plotted. Default is TRUE. |
AUC (the area under ROC curve) and ROC curve.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
lambda = 0.5
Obj = TGST(Z, S, phi, method="nonpar")
ROCAnalysis(Obj, plot=TRUE)
This function gives you the optimal semiparametric tripartite rule that minimizes TMR (total misclassification risk).
Semi.par.rule(Z, S, phi)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
phi |
Percentage of patients taking viral load test. |
Semiparametric rule and its associated TMR.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
Semi.par.rule(Z, S, phi)
This function gives you the semiparametric FNR and FPR associated with a given tripartite rule.
semipar.fnr.fpr(Z, S, l, u)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
l |
Lower cutoff of tripartite rule. |
u |
Upper cutoff of tripartite rule. |
Matrix with 2 columns. Each row is a set of semiparametric (FNR, FPR) on an associated tripartite rule.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
rules = nonpar.rules(Z, S, phi)
semipar.fnr.fpr(Z, S, rules[1,1], rules[1,2])
A simulated dataset containing true disease status and risk score. See details for simulation setting.
data(Simdata)
A data frame with 8000 simulated observations on the following 2 variables.
Z
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1).
S
Risk score. A higher risk score indicates a greater tendency toward disease / treatment failure.
We first simulate the viral status Z; then, conditional on Z, we simulate the risk score S from a distribution governed by shape and scale parameters.
data(Simdata)
## maybe str(Simdata) ; plot(Simdata) ...
This function gives a summary of the data in a TGST object.
## S4 method for signature 'TGST'
summary(object)
object |
Output object from TGST. |
Percentage of treatment failure; Summary statistics (mean, standard deviation, minimum, median, maximum and IQR) of risk score by true disease status; Distribution plot.
Create a TGST object, usually used as an input for optimal rule search and ROC analysis.
TGST(Z, S, phi, method = "nonpar")
Z |
A vector of true disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
A vector of risk scores. |
phi |
Percentage of patients taking gold standard test. |
method |
Method for searching for the optimal tripartite rule, options are "nonpar" (default) and "semipar". |
An object of class TGST. The class contains 6 slots: phi (percentage of gold standard tests), Z (true failure status), S (risk score), Rules (all possible tripartite rules), Nonparametric (logical indicator of the approach), and FNR.FPR (misclassification rates).
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
phi = 0.1 # 10% of patients taking viral load test
TGST(Z, S, phi, method="nonpar")
This class contains results of a run to search for all possible rules.
phi |
Percentage of patients taking viral load test. |
Z |
A vector of true disease status (Failure Status coded as Z=1). |
S |
A vector of risk scores. |
Rules |
A matrix of all possible tripartite rules (two cutoffs) derived from the training data set. |
Nonparametric |
A boolean indicating if the nonparametric approach should be used in calculating the misclassification rates. If FALSE, the semiparametric approach is used. |
FNR.FPR |
A matrix with two columns of misclassification rates, FNR and FPR. |
If Obj is an object of class TGST, each slot can be reached by Obj@slotname, where slotname is the name of the slot we want to reach (for example, Obj@Rules).
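Slots of any S4 object are read with the `@` operator or with `slot()`. The toy class below is made up for illustration and only mimics two of the slot names listed above. It is not the TGST class itself.

```r
# Toy S4 class (hypothetical) to illustrate slot access.
setClass("ToyRule", representation(phi = "numeric", Rules = "matrix"))

obj <- new("ToyRule",
           phi = 0.1,
           Rules = matrix(c(1, 2, 3, 4), ncol = 2))

obj@phi                # read a slot with @
slot(obj, "Rules")     # equivalent access by slot name
slotNames(obj)         # list the slots defined for the class
```

The same pattern applies to a fitted object from the package, using the slot names documented for its class.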
This function allows you to compute the percentage of diseased patients (equivalent to treatment failure, Z=1) and shows a distribution summary of the risk score (S) by true disease status (Z).
TGST.summ(Z, S)
Z |
True disease status (No disease / treatment success coded as Z=0, diseased / treatment failure coded as Z=1). |
S |
Risk score. |
Percentage of treatment failure; Summary statistics (mean, standard deviation, minimum, median, maximum and IQR) of risk score by true disease status; Distribution plot.
d = Simdata
Z = d$Z # True Disease Status
S = d$S # Risk Score
TGST.summ(Z, S)
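The quantities described above can be approximated in base R as a rough sketch. The data are made up, and the layout of the result is an assumption rather than the format `TGST.summ` actually prints.

```r
# Base R sketch of a by-status summary; not the TGST.summ implementation.
Z <- c(0, 0, 0, 1, 1)          # made-up failure status
S <- c(10, 20, 30, 40, 60)     # made-up risk scores

pct_failure <- 100 * mean(Z)   # percentage with Z = 1

# Summary statistics of S within each level of Z
stats <- sapply(split(S, Z), function(s) {
  c(mean = mean(s), sd = sd(s), min = min(s),
    median = median(s), max = max(s), IQR = IQR(s))
})
print(pct_failure)
print(stats)
```

Each column of `stats` corresponds to one disease status, so the two groups can be compared row by row.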