Title: A Collection of R Functions by the Petersen Lab
Description: A collection of R functions that are widely used by the Petersen Lab. Included are functions for various purposes, including evaluating the accuracy of judgments and predictions, performing scoring of assessments, generating correlation matrices, conversion of data between various types, data management, psychometric evaluation, extensions related to latent variable modeling, various plotting capabilities, and other miscellaneous useful functions. By making the package available, we hope to make our methods reproducible and replicable by others and to help others perform their data processing and analysis methods more easily and efficiently. The codebase is provided in Petersen (2024) <doi:10.5281/zenodo.7602890> and on CRAN: <doi:10.32614/CRAN.package.petersenlab>. The package is described in "Principles of Psychological Assessment: With Applied Examples in R" (Petersen, 2024) <doi:10.1201/9781003357421>, <doi:10.5281/zenodo.6466589>.
Authors: Isaac T. Petersen [aut, cre], Developmental Psychopathology Lab at the University of Iowa [ctb], Angela D. Staples [ctb], Johanna Caskey [ctb], Philipp Doebler [ctb], Loreen Sabel [ctb]
Maintainer: Isaac T. Petersen <[email protected]>
License: MIT + file LICENSE
Version: 1.0.11
Built: 2024-10-27 00:57:39 UTC
Source: https://github.com/devpsylab/petersenlab
NOTIN operator.
x %ni% table
x |
vector of values to be matched. |
table |
vector of values to be matched against. |
Determine whether values in one vector are not in another vector.
Vector of TRUE and FALSE values, indicating whether values in one vector are not in another vector.
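The operator is presumably just the negation of %in%; a minimal sketch (an assumption, not the package's verified source):

"%ni%" <- function(x, table) !(x %in% table)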
https://www.r-bloggers.com/2018/07/the-notin-operator/
https://stackoverflow.com/questions/71309487/r-package-documentation-undocumented-arguments-in-documentation-object-for-a?noredirect=1
# Prepare Data
v1 <- c("Sally","Tom","Barry","Alice")
listToCheckAgainst <- c("Tom","Alice")

v1 %ni% listToCheckAgainst
v1[v1 %ni% listToCheckAgainst]
Find the accuracy at a given cutoff. Actuals should be binary, where 1 = present and 0 = absent.
accuracyAtCutoff(predicted, actual, cutoff, UH = NULL, UM = NULL, UCR = NULL, UFA = NULL)
predicted |
vector of continuous predicted values. |
actual |
vector of binary actual values (1 = present; 0 = absent). |
cutoff |
numeric value at or above which the target condition is considered present. |
UH |
(optional) utility of hits (true positives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UM |
(optional) utility of misses (false negatives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UCR |
(optional) utility of correct rejections (true negatives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UFA |
(optional) utility of false alarms (false positives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
Compute accuracy indices of predicted values in relation to actual values at a given cutoff by specifying the predicted values, actual values, and cutoff value. The target condition is considered present at or above the cutoff value. Optionally, you can also specify the utility of hits, misses, correct rejections, and false alarms to calculate the overall utility of the cutoff. To compute accuracy at each possible cutoff, see accuracyAtEachCutoff.
cutoff = the cutoff specified
TP = true positives
TN = true negatives
FP = false positives
FN = false negatives
SR = selection ratio
BR = base rate
percentAccuracy = percent accuracy
percentAccuracyByChance = percent accuracy by chance
percentAccuracyPredictingFromBaseRate = percent accuracy from predicting from the base rate
RIOC = relative improvement over chance
relativeImprovementOverPredictingFromBaseRate = relative improvement over predicting from the base rate
SN = sensitivity
SP = specificity
TPrate = true positive rate
TNrate = true negative rate
FNrate = false negative rate
FPrate = false positive rate
HR = hit rate
FAR = false alarm rate
PPV = positive predictive value
NPV = negative predictive value
FDR = false discovery rate
FOR = false omission rate
youdenJ = Youden's J statistic
balancedAccuracy = balanced accuracy
f1Score = F1-score
mcc = Matthews correlation coefficient
diagnosticOddsRatio = diagnostic odds ratio
positiveLikelihoodRatio = positive likelihood ratio
negativeLikelhoodRatio = negative likelihood ratio
dPrimeSDT = d-Prime index from signal detection theory
betaSDT = beta index from signal detection theory
cSDT = c index from signal detection theory
aSDT = a index from signal detection theory
bSDT = b index from signal detection theory
differenceBetweenPredictedAndObserved = difference between predicted and observed values
informationGain = information gain
overallUtility = overall utility (if utilities were specified)
Other accuracy: accuracyAtEachCutoff(), accuracyOverall(), nomogrammer(), optimalCutoff(), posttestOdds()
# Prepare Data
data("USArrests")
USArrests$highMurderState <- NA
USArrests$highMurderState[which(USArrests$Murder >= 10)] <- 1
USArrests$highMurderState[which(USArrests$Murder < 10)] <- 0

# Calculate Accuracy
accuracyAtCutoff(predicted = USArrests$Assault, actual = USArrests$highMurderState, cutoff = 200)
accuracyAtCutoff(predicted = USArrests$Assault, actual = USArrests$highMurderState, cutoff = 200, UH = 1, UM = 0, UCR = .9, UFA = 0)
Find the accuracy at each possible cutoff. Actuals should be binary, where 1 = present and 0 = absent.
accuracyAtEachCutoff(predicted, actual, UH = NULL, UM = NULL, UCR = NULL, UFA = NULL)
predicted |
vector of continuous predicted values. |
actual |
vector of binary actual values (1 = present; 0 = absent). |
UH |
(optional) utility of hits (true positives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UM |
(optional) utility of misses (false negatives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UCR |
(optional) utility of correct rejections (true negatives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UFA |
(optional) utility of false alarms (false positives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
Compute accuracy indices of predicted values in relation to actual values at each possible cutoff by specifying the predicted values and actual values. The target condition is considered present at or above each cutoff value. Optionally, you can specify the utility of hits, misses, correct rejections, and false alarms to calculate the overall utility of each possible cutoff.
cutoff = the cutoff specified
TP = true positives
TN = true negatives
FP = false positives
FN = false negatives
SR = selection ratio
BR = base rate
percentAccuracy = percent accuracy
percentAccuracyByChance = percent accuracy by chance
percentAccuracyPredictingFromBaseRate = percent accuracy from predicting from the base rate
RIOC = relative improvement over chance
relativeImprovementOverPredictingFromBaseRate = relative improvement over predicting from the base rate
SN = sensitivity
SP = specificity
TPrate = true positive rate
TNrate = true negative rate
FNrate = false negative rate
FPrate = false positive rate
HR = hit rate
FAR = false alarm rate
PPV = positive predictive value
NPV = negative predictive value
FDR = false discovery rate
FOR = false omission rate
youdenJ = Youden's J statistic
balancedAccuracy = balanced accuracy
f1Score = F1-score
mcc = Matthews correlation coefficient
diagnosticOddsRatio = diagnostic odds ratio
positiveLikelihoodRatio = positive likelihood ratio
negativeLikelhoodRatio = negative likelihood ratio
dPrimeSDT = d-Prime index from signal detection theory
betaSDT = beta index from signal detection theory
cSDT = c index from signal detection theory
aSDT = a index from signal detection theory
bSDT = b index from signal detection theory
differenceBetweenPredictedAndObserved = difference between predicted and observed values
informationGain = information gain
overallUtility = overall utility (if utilities were specified)
Other accuracy: accuracyAtCutoff(), accuracyOverall(), nomogrammer(), optimalCutoff(), posttestOdds()
# Prepare Data
data("USArrests")
USArrests$highMurderState <- NA
USArrests$highMurderState[which(USArrests$Murder >= 10)] <- 1
USArrests$highMurderState[which(USArrests$Murder < 10)] <- 0

# Calculate Accuracy
accuracyAtEachCutoff(predicted = USArrests$Assault, actual = USArrests$highMurderState)
accuracyAtEachCutoff(predicted = USArrests$Assault, actual = USArrests$highMurderState, UH = 1, UM = 0, UCR = .9, UFA = 0)
Find overall accuracy.
accuracyOverall(predicted, actual, dropUndefined = FALSE)
wisdomOfCrowd(predicted, actual, dropUndefined = FALSE)
predicted |
vector of continuous predicted values. |
actual |
vector of actual values. |
dropUndefined |
TRUE or FALSE, indicating whether to drop undefined values in the calculation of accuracy indices (FALSE by default). |
Compute overall accuracy estimates of predicted values in relation to actual values. Estimates of overall accuracy span all cutoffs. Some accuracy estimates can be undefined under various circumstances. Optionally, you can drop undefined values in the calculation of accuracy indices. Note that dropping undefined values changes the meaning of these indices. Use this option at your own risk!
ME = mean error
MAE = mean absolute error
MSE = mean squared error
RMSE = root mean squared error
MPE = mean percentage error
MAPE = mean absolute percentage error
sMAPE = symmetric mean absolute percentage error
MASE = mean absolute scaled error
RMSLE = root mean squared log error
rsquared = R-squared
rsquaredAdj = adjusted R-squared
rsquaredPredictive = predictive R-squared
Mean absolute scaled error (MASE):
https://stats.stackexchange.com/questions/108734/alternative-to-mape-when-the-data-is-not-a-time-series
https://stats.stackexchange.com/questions/322276/is-mase-specified-only-to-time-series-data
https://stackoverflow.com/questions/31197726/calculate-mase-with-cross-sectional-non-time-series-data-in-r
https://stats.stackexchange.com/questions/401759/how-can-mase-mean-absolute-scaled-error-score-value-be-interpreted-for-non-tim
Predictive R-squared:
https://www.r-bloggers.com/2014/05/can-we-do-better-than-r-squared/
Other accuracy: accuracyAtCutoff(), accuracyAtEachCutoff(), nomogrammer(), optimalCutoff(), posttestOdds()
# Prepare Data
data("USArrests")

# Calculate Accuracy
accuracyOverall(predicted = USArrests$Assault, actual = USArrests$Murder)
wisdomOfCrowd(predicted = USArrests$Assault, actual = 200)
Add correlation text to scatterplot.
addText(x, y, xcoord = NULL, ycoord = NULL, size = 1, col = NULL, method = "pearson")
x |
vector of the variable for the x-axis. |
y |
vector of the variable for the y-axis. |
xcoord |
x-coordinate for the location of the text. |
ycoord |
y-coordinate for the location of the text. |
size |
size of the text font. |
col |
color of the text font. |
method |
method for calculating the association. One of: "pearson" (default), "spearman", or "kendall". |
Adds a correlation coefficient and associated p-value to a scatterplot.
Correlation coefficient, degrees of freedom, and p-value printed on scatterplot.
Other plot: plot2WayInteraction(), ppPlot(), semPlotInteraction(), vwReg()
Other correlations: cor.table(), crossTimeCorrelation(), crossTimeCorrelationDF(), partialcor.table(), vwReg()
# Prepare Data
data("USArrests")

# Scatterplot
plot(USArrests$Assault, USArrests$Murder)
addText(x = USArrests$Assault, y = USArrests$Murder)
Format decimals and leading zeroes. Adapted from the MOTE package.
apa(value, decimals = 3, leading = TRUE)
value |
A set of numeric values, either a single number, vector, or set of columns. |
decimals |
The number of decimal points desired in the output. |
leading |
Logical value: TRUE to include a leading zero (e.g., 0.54); FALSE to exclude it (e.g., .54). |
Formats decimals and leading zeroes for creating reports in scientific style, to be consistent with American Psychological Association (APA) format. This function creates "pretty" character vectors from numeric variables for printing as part of a report. The value can take a single number, matrix, vector, or multiple columns from a data frame, as long as they are numeric. The values will be coerced into numeric if they are characters or logical values, but this process may result in an error if values are truly alphabetical.
Value(s) in the format specified, with the number of decimal places indicated and with or without a leading zero, as indicated.
https://github.com/doomlab/MOTE
Other formatting: pValue(), specify_decimal(), suppressLeadingZero()
apa(value = 0.54674, decimals = 3, leading = TRUE)
Estimate the observed association between the predictor and criterion after accounting for the degree to which a true correlation is attenuated due to measurement error.
attenuationCorrelation(trueAssociation, reliabilityOfPredictor, reliabilityOfCriterion)
trueAssociation |
Magnitude of true association (r value). |
reliabilityOfPredictor |
Reliability of predictor (from 0 to 1). |
reliabilityOfCriterion |
Reliability of criterion/outcome (from 0 to 1). |
Estimate the association that would be observed between the predictor and criterion after accounting for the degree to which a true correlation is attenuated due to random measurement error (unreliability).
Observed correlation between predictor and criterion.
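The computation presumably follows Spearman's classical attenuation formula; a minimal sketch (the helper name is hypothetical, not the package's verified source):

attenuatedCorrelation_sketch <- function(trueAssociation, reliabilityOfPredictor, reliabilityOfCriterion) {
  # observed r = true r, attenuated by the square root of the two reliabilities
  trueAssociation * sqrt(reliabilityOfPredictor * reliabilityOfCriterion)
}
attenuatedCorrelation_sketch(.7, .9, .85)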
Other correlation: disattenuationCorrelation()
attenuationCorrelation(trueAssociation = .7, reliabilityOfPredictor = .9, reliabilityOfCriterion = .85)
Cleans up names of players for merging.
cleanUpNames(name)
name |
character vector of player names. |
Cleans up names of NFL Football players, including making them all-caps, removing common suffixes, punctuation, spaces, etc. This is helpful for merging multiple datasets.
Vector of cleaned player names.
oldNames <- c("Peyton Manning","Tom Brady","Marvin Harrison Jr.")
cleanNames <- cleanUpNames(oldNames)
cleanNames
Column bind dataframes and fill with NAs.
columnBindFill(...)
... |
Names of multiple dataframes. |
Binds columns of two or more dataframes together, and fills in missing rows.
Dataframe with columns bound together.
Other dataManipulation: convert.magic(), dropColsWithAllNA(), dropRowsWithAllNA(), varsDifferentTypes()
# Prepare Data
df1 <- data.frame(a = rnorm(5), b = rnorm(5))
df2 <- data.frame(c = rnorm(4), d = rnorm(4))

# Column Bind and Fill
columnBindFill(df1, df2)
Simulate data with a specified correlation in relation to an existing variable.
complement(y, rho, x)
y |
The existing variable against which to simulate a complement variable. |
rho |
The correlation magnitude, ranging from [-1, 1]. |
x |
(optional) Vector with the same length as y. |
Simulates data with a specified correlation in relation to an existing variable.
Vector of a variable that has a specified correlation in relation to a given variable y.
https://stats.stackexchange.com/a/313138/20338
Other simulation: simulateAUC(), simulateIndirectEffect()
v1 <- rnorm(100)

complement(y = v1, rho = .5)
complement(y = v1, rho = -.5)

v2 <- complement(y = v1, rho = .85)
plot(v1, v2)
Converts variable types of multiple columns of a dataframe at once.
convert.magic(obj, type)
obj |
name of dataframe (object). |
type |
type to convert variables to. One of: "character", "numeric", or "factor". |
Converts variable types of multiple columns of a dataframe at once. Convert variable types to character, numeric, or factor.
Dataframe with columns converted to a particular type.
Other dataManipulation: columnBindFill(), dropColsWithAllNA(), dropRowsWithAllNA(), varsDifferentTypes()
Other conversion: convertHoursAMPM(), convertToHours(), convertToMinutes(), convertToSeconds(), percentileToTScore(), pom()
# Prepare Data
data("USArrests")

# Convert variables to character
convert.magic(USArrests, "character")
Convert hours to 24-hour time.
convertHoursAMPM(hours, ampm, am = 0, pm = 1, treatMorningAsLate = FALSE)
hours |
The vector of times in hours. |
ampm |
Vector indicating whether given times are AM or PM. |
am |
Value indicating AM in the ampm variable. |
pm |
Value indicating PM in the ampm variable. |
treatMorningAsLate |
TRUE or FALSE, indicating whether to treat morning hours (e.g., 1 AM) as late (e.g., 25 hours). |
Convert hours to the number of hours in 24-hour time. You can specify whether to treat morning hours (e.g., 1 AM) as late (25 H), e.g., for specifying late bedtimes.
Hours in 24-hour-time.
Other times: convertToHours(), convertToMinutes(), convertToSeconds()
Other conversion: convert.magic(), convertToHours(), convertToMinutes(), convertToSeconds(), percentileToTScore(), pom()
# Prepare Data
df1 <- data.frame(hours = c(1, 1, 12, 12), ampm = c(0, 0, 1, 1))
df2 <- data.frame(hours = c(1, 1, 12, 12), ampm = c(1, 1, 0, 0))

# Convert AM and PM hours
convertHoursAMPM(hours = df1$hours, ampm = df1$ampm)
convertHoursAMPM(hours = df1$hours, ampm = df1$ampm, treatMorningAsLate = TRUE)
convertHoursAMPM(hours = df2$hours, ampm = df2$ampm, am = 1, pm = 0)
convertHoursAMPM(hours = df2$hours, ampm = df2$ampm, am = 1, pm = 0, treatMorningAsLate = TRUE)
Convert times to hours.
convertToHours(hours, minutes, seconds, HHMMSS, HHMM)
hours |
Character vector of the number of hours. |
minutes |
Character vector of the number of minutes. |
seconds |
Character vector of the number of seconds. |
HHMMSS |
Times in HH:MM:SS format. |
HHMM |
Character vector of times in HH:MM format. |
Converts times to hours. To convert times to minutes or seconds, see convertToMinutes or convertToSeconds.
Vector of times in hours.
Other times: convertHoursAMPM(), convertToMinutes(), convertToSeconds()
Other conversion: convert.magic(), convertHoursAMPM(), convertToMinutes(), convertToSeconds(), percentileToTScore(), pom()
# Prepare Data
df <- data.frame(hours = c(0,1), minutes = c(15,27), seconds = c(30,13), HHMMSS = c("00:15:30","01:27:13"), HHMM = c("00:15","01:27"))

# Convert to Hours
convertToHours(hours = df$hours, minutes = df$minutes, seconds = df$seconds)
convertToHours(HHMMSS = df$HHMMSS)
convertToHours(HHMM = df$HHMM)
Convert times to minutes.
convertToMinutes(hours, minutes, seconds, HHMMSS, HHMM, MMSS)
hours |
Character vector of the number of hours. |
minutes |
Character vector of the number of minutes. |
seconds |
Character vector of the number of seconds. |
HHMMSS |
Times in HH:MM:SS format. |
HHMM |
Character vector of times in HH:MM format. |
MMSS |
Character vector of times in MM:SS format. |
Converts times to minutes. To convert times to hours or seconds, see convertToHours or convertToSeconds.
Vector of times in minutes.
Other times: convertHoursAMPM(), convertToHours(), convertToSeconds()
Other conversion: convert.magic(), convertHoursAMPM(), convertToHours(), convertToSeconds(), percentileToTScore(), pom()
# Prepare Data
df <- data.frame(hours = c(0,1), minutes = c(15,27), seconds = c(30,13), HHMMSS = c("00:15:30","01:27:13"), HHMM = c("00:15","01:27"))

# Convert to Minutes
convertToMinutes(hours = df$hours, minutes = df$minutes, seconds = df$seconds)
convertToMinutes(HHMMSS = df$HHMMSS)
convertToMinutes(HHMM = df$HHMM)
Convert times to seconds.
convertToSeconds(hours, minutes, seconds, HHMMSS, HHMM, MMSS)
hours |
Character vector of the number of hours. |
minutes |
Character vector of the number of minutes. |
seconds |
Character vector of the number of seconds. |
HHMMSS |
Times in HH:MM:SS format. |
HHMM |
Character vector of times in HH:MM format. |
MMSS |
Character vector of times in MM:SS format. |
Converts times to seconds. To convert times to hours or minutes, see convertToHours or convertToMinutes.
Vector of times in seconds.
Other times: convertHoursAMPM(), convertToHours(), convertToMinutes()
Other conversion: convert.magic(), convertHoursAMPM(), convertToHours(), convertToMinutes(), percentileToTScore(), pom()
# Prepare Data
df <- data.frame(hours = c(0,1), minutes = c(15,27), seconds = c(30,13), HHMMSS = c("00:15:30","01:27:13"), HHMM = c("00:15","01:27"), MMSS = c("15:30","87:13"))

# Convert to Seconds
convertToSeconds(hours = df$hours, minutes = df$minutes, seconds = df$seconds)
convertToSeconds(HHMMSS = df$HHMMSS)
convertToSeconds(HHMM = df$HHMM)
convertToSeconds(MMSS = df$MMSS)
Function that creates a correlation matrix similar to SPSS output.
cor.table(x, y, type = "none", dig = 2, correlation = "pearson")
x |
Variable or set of variables in the form of a vector or dataframe to correlate with y. |
y |
(optional) Variable or set of variables in the form of a vector or dataframe to correlate with x. |
type |
Type of correlation matrix to print. One of: "none" (default), "manuscript", "manuscriptBig", "latex", "latexSPSS", "manuscriptLatex", or "manuscriptBigLatex". |
dig |
Number of decimals to print. |
correlation |
Method for calculating the association. One of: "pearson" (default), "spearman", or "kendall". |
Co-created by Angela Staples ([email protected]) and Isaac Petersen ([email protected]). For a partial correlation matrix, see partialcor.table.
A correlation matrix.
Other correlations: addText(), crossTimeCorrelation(), crossTimeCorrelationDF(), partialcor.table(), vwReg()
# Prepare Data
data("mtcars")

# Correlation Matrix
cor.table(mtcars[,c("mpg","cyl","disp")])
cor.table(mtcars[,c("mpg","cyl","disp")], dig = 3)
cor.table(mtcars[,c("mpg","cyl","disp")], dig = 3, correlation = "spearman")
cor.table(mtcars[,c("mpg","cyl","disp")], type = "manuscript", dig = 3)
cor.table(mtcars[,c("mpg","cyl","disp")], type = "manuscriptBig")

table1 <- cor.table(mtcars[,c("mpg","cyl","disp")], type = "latex")
table2 <- cor.table(mtcars[,c("mpg","cyl","disp")], type = "latexSPSS")
table3 <- cor.table(mtcars[,c("mpg","cyl","disp")], type = "manuscriptLatex")
table4 <- cor.table(mtcars[,c("mpg","cyl","disp")], type = "manuscriptBigLatex")

cor.table(mtcars[,c("mpg","cyl","disp")], mtcars[,c("drat","qsec")])
cor.table(mtcars[,c("mpg","cyl","disp")], mtcars[,c("drat","qsec")], type = "manuscript", dig = 3)
Calculate the association of a variable across multiple time points.
crossTimeCorrelation(id = "tcid", time = "age", variable, data)
id |
Name of variable indicating the participant ID. |
time |
Name of variable indicating the timepoint. |
variable |
Name of variable to estimate the cross-time correlation. |
data |
Dataframe. |
Calculate the association of a variable across multiple time points. It is especially useful when there are three or more time points.
output of cor.test()
Other correlations: addText(), cor.table(), crossTimeCorrelationDF(), partialcor.table(), vwReg()
# Prepare Data
df <- expand.grid(ID = 1:100, time = c(1, 2, 3))
df <- df[order(df$ID),]
row.names(df) <- NULL
df$score <- rnorm(nrow(df))

# Cross-Time Correlation
crossTimeCorrelation(id = "ID", time = "time", variable = "score", data = df)
Dataframe used to compute cross-time correlations.
crossTimeCorrelationDF(id = "tcid", time = "age", variable, data)
id |
Name of variable indicating the participant ID. |
time |
Name of variable indicating the timepoint. |
variable |
Name of variable to estimate the cross-time correlation. |
data |
Dataframe. |
Dataframe used to calculate the association of a variable across multiple time points. It is especially useful when there are three or more time points.
dataframe with three columns in the form of: ID, time1, time2
Other correlations: addText(), cor.table(), crossTimeCorrelation(), partialcor.table(), vwReg()
# Prepare Data
df <- expand.grid(ID = 1:100, time = c(1, 2, 3))
df <- df[order(df$ID),]
row.names(df) <- NULL
df$score <- rnorm(nrow(df))

# Cross-Time Correlation
crossTimeCorrelationDF(id = "ID", time = "time", variable = "score", data = df)
Estimate item information from a Bayesian zero-inflated negative binomial model that was fit using the brms package.
deriv_d_negBinom(n, alpha, beta, theta, phi)
d_negBinom(n, alpha, beta, theta, phi)
log_gen_binom(n, phi)
deriv_logd_negBinom(n, alpha, beta, theta, phi)
info_neg_binom_analytical(theta = seq(-2.5, 2.5, length.out = 101), alpha, beta, phi, varpi)
item_info_NB_zero_analytical(theta, alpha, beta, phi, varpi)
n |
Integer. The observed count, representing the event frequency. |
alpha |
Numeric. The slope/discrimination parameter of the item, indicating how steeply the item response changes with the person's level on the latent factor/construct (theta). |
beta |
Numeric. The intercept/easiness parameter of the item, indicating the expected count at a given level on the construct (theta). |
theta |
Numeric. The respondent's level on the latent factor/construct. |
phi |
Numeric. The shape/overdispersion parameter of the negative binomial distribution, indicating the variance beyond what is expected from a Poisson distribution. |
varpi |
Numeric. The probability of observing a zero count due to a separate zero-inflation process. |
Created by Philipp Doebler ([email protected]) and Loreen Sabel ([email protected]).
The amount of information for a given item at each of the values of theta specified.
Other bayesian: pA()
Other IRT: discriminationToFactorLoading(), fourPL(), itemInformation(), reliabilityIRT(), standardErrorIRT()
## Not run:
library(brms)
library(rstan)

coef_bayesianMixedEffectsGRM_gam <- coef(bayesianMixedEffectsGRM_gam)
str(coef_bayesianMixedEffectsGRM_gam)
itempars <- coef_bayesianMixedEffectsGRM_gam$item[,1,1:4]

# Define a grid of thetas for the computations:
theta_seq <- seq(-4, 4, length.out = 201)

# Item information for all items.
# The resulting matrix has length(theta_seq) columns and a row per item.
# We use a loop for the calculations.
item_info <- matrix(NA, nrow = nrow(itempars), ncol = length(theta_seq))
for(i in 1:nrow(itempars)){
  item_info[i, ] <- item_info_NB_zero_analytical(
    theta_seq,
    itempars[i, "alpha_Intercept"],
    itempars[i, "beta_Intercept"],
    exp(itempars[i, "shape_Intercept"]),
    plogis(itempars[i, "zi_Intercept"]))
}
## End(Not run)
Estimate the true association between the predictor and criterion after accounting for the degree to which a true correlation is attenuated due to measurement error.
disattenuationCorrelation(observedAssociation, reliabilityOfPredictor, reliabilityOfCriterion)
observedAssociation |
Magnitude of observed association (r value). |
reliabilityOfPredictor |
Reliability of predictor (from 0 to 1). |
reliabilityOfCriterion |
Reliability of criterion/outcome (from 0 to 1). |
Estimate the true association between the predictor and criterion after accounting for the degree to which a true correlation is attenuated due to random measurement error (unreliability).
True association between predictor and criterion.
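The computation presumably follows Spearman's classical correction for attenuation; a minimal sketch (the helper name is hypothetical, not the package's verified source):

disattenuatedCorrelation_sketch <- function(observedAssociation, reliabilityOfPredictor, reliabilityOfCriterion) {
  # true r = observed r, divided by the square root of the two reliabilities
  observedAssociation / sqrt(reliabilityOfPredictor * reliabilityOfCriterion)
}
disattenuatedCorrelation_sketch(.7, .9, .85)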
Other correlation: attenuationCorrelation()
disattenuationCorrelation(observedAssociation = .7, reliabilityOfPredictor = .9, reliabilityOfCriterion = .85)
Convert a discrimination parameter in item response theory to a standardized factor loading.
discriminationToFactorLoading(a, model = "probit")
a |
Discrimination parameter in item response theory. |
model |
Model type. One of: "probit" (default) or "logit". |
Convert a discrimination parameter in item response theory to a standardized factor loading.
Standardized factor loading.
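The conversion presumably follows the standard normal-ogive relation between discrimination and loading, with logit-metric discriminations first rescaled by the constant 1.702; a minimal sketch (the helper name and rescaling constant are assumptions, not the package's verified internals):

toLoading_sketch <- function(a, model = "probit") {
  if (model == "logit") a <- a / 1.702 # rescale logistic metric to probit metric
  a / sqrt(1 + a^2) # loading implied by a = loading / sqrt(1 - loading^2)
}
toLoading_sketch(1.3)
toLoading_sketch(1.3, model = "logit")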
https://aidenloe.github.io/introToIRT.html
https://stats.stackexchange.com/questions/228629/conversion-of-irt-logit-discrimination-parameter-to-factor-loading-metric
Other IRT: deriv_d_negBinom(), fourPL(), itemInformation(), reliabilityIRT(), standardErrorIRT()
discriminationToFactorLoading(0.5)
discriminationToFactorLoading(1.3)
discriminationToFactorLoading(1.3, model = "logit")
Drop columns with all missing (NA) values.
dropColsWithAllNA(data, ignore = NULL)
data |
Dataframe to drop columns from. |
ignore |
Names of columns to ignore, i.e., columns to retain even if all of their values are missing. |
Drop columns that have no observed values, i.e., all values in the column are missing (NA), excluding the ignored columns.
A dataframe with columns removed that had all missing values in non-ignored columns.
Other dataManipulation: columnBindFill(), convert.magic(), dropRowsWithAllNA(), varsDifferentTypes()
Other dataEvaluations: dropRowsWithAllNA(), is.nan.data.frame(), not_all_na(), not_any_na()
# Prepare Data
df <- expand.grid(ID = 1:100, time = c(1, 2, 3), rater = c(1, 2), naCol1 = NA, naCol2 = NA)
df <- df[order(df$ID),]
row.names(df) <- NULL
df$score1 <- rnorm(nrow(df))
df$score2 <- rnorm(nrow(df))
df$score3 <- rnorm(nrow(df))
df[sample(1:nrow(df), size = 100), c("score1","score2","score3")] <- NA

# Drop Columns with All NA
dropColsWithAllNA(df)
dropColsWithAllNA(df, ignore = c("naCol2"))
Drop rows with all missing (NA) values.
dropRowsWithAllNA(data, ignore = NULL)
data |
Dataframe to drop rows from. |
ignore |
Names of columns to ignore for determining whether each row had all missing values. |
Drop rows that have no observed values, i.e., all values in the row are missing (NA), excluding the ignored columns.
A dataframe with rows removed that had all missing values in non-ignored columns.
Other dataManipulation: columnBindFill(), convert.magic(), dropColsWithAllNA(), varsDifferentTypes()
Other dataEvaluations: dropColsWithAllNA(), is.nan.data.frame(), not_all_na(), not_any_na()
# Prepare Data
df <- expand.grid(ID = 1:100, time = c(1, 2, 3))
df <- df[order(df$ID),]
row.names(df) <- NULL
df$score1 <- rnorm(nrow(df))
df$score2 <- rnorm(nrow(df))
df$score3 <- rnorm(nrow(df))
df[sample(1:nrow(df), size = 100), c("score1","score2","score3")] <- NA

# Drop Rows with All NA in Non-Ignored Columns
dropRowsWithAllNA(df, ignore = c("ID","time"))
Function that performs a chi-square equivalence test for structural equation models.
equiv_chi(alpha = 0.05, chi, df, m, N_sample, popRMSEA = 0.08)
alpha |
Value of the significance level, which is set to .05 by default. |
chi |
Value of the observed chi-square test statistic. |
df |
Number of model degrees of freedom (or the difference in model degrees of freedom when comparing models). |
m |
Number of groups. |
N_sample |
Sample size. |
popRMSEA |
The value of the root mean square error of approximation (RMSEA) to set for the equivalence bounds, which is set to .08 by default. |
Created by Counsell et al. (2020): Counsell, A., Cribbie, R. A., & Flora, D. B. (2020). Evaluating equivalence testing methods for measurement invariance. Multivariate Behavioral Research, 55(2), 312-328. https://doi.org/10.1080/00273171.2019.1633617
p-value indicating whether to reject the null hypothesis that the model is a poor fit to the data.
Other structural equation modeling: make_esem_model(), puc(), satorraBentlerScaledChiSquareDifferenceTestStatistic(), semPlotInteraction()
# Prepare Data
data("mtcars")

# Fit structural equation model
# Extract statistics
N1 <- 1222
m <- 1
Tml1 <- 408.793
df1 <- 80

# Equivalence test
equiv_chi(alpha = .05, chi = Tml1, df = df1, m = 1, N_sample = N1, popRMSEA = .08)
4-parameter logistic curve for item response theory.
fourPL(a = 1, b, c = 0, d = 1, theta)
a |
Discrimination parameter (slope). |
b |
Difficulty (severity) parameter (inflection point). |
c |
Guessing parameter (lower asymptote). |
d |
Careless errors parameter (upper asymptote). |
theta |
Person's level on the construct. |
Estimates the probability of item endorsement as function of the four-parameter logistic (4PL) curve and the person's level on the construct (theta).
Probability of item endorsement (or expected value on the item).
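This is presumably the standard 4PL item response function; a minimal sketch (the helper name is hypothetical, not the package's verified source):

fourPL_sketch <- function(a = 1, b, c = 0, d = 1, theta) {
  # probability rises from the lower asymptote c to the upper asymptote d,
  # with inflection point b and slope a
  c + (d - c) / (1 + exp(-a * (theta - b)))
}
fourPL_sketch(b = 2, a = 1.5, c = 0.10, d = 0.95, theta = -4:4) # 4PL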
Other IRT: deriv_d_negBinom(), discriminationToFactorLoading(), itemInformation(), reliabilityIRT(), standardErrorIRT()
fourPL(b = 2, theta = -4:4) # 1PL
fourPL(b = 2, a = 1.5, theta = -4:4) # 2PL
fourPL(b = 2, a = 1.5, c = 0.10, theta = -4:4) # 3PL
fourPL(b = 2, a = 1.5, c = 0.10, d = 0.95, theta = -4:4) # 4PL
Determine package dependencies.
getDependencies(packs)
packs |
Character vector of names of target packages. |
Determine which packages depend on a target package (or packages).
Vector of packages that depend on the target package(s).
Other packages: load_or_install()
old <- options("repos")
options(repos = "https://cran.r-project.org")

getDependencies("tidyverse")

options(old)
Function that combines lme results across multiple imputation runs.
imputationCombine(model, dig = 3)
model |
name of lme() model object. |
dig |
number of decimals to print in output. |
Combines lme() model results across multiple imputation runs.
Summary of model fit and information for mixed effect imputation models.
Other multipleImputation: imputationModelCompare(), imputationPRV(), lmCombine()
#INSERT
Function that compares two nested lme() models from multiple imputation using a likelihood ratio test.
imputationModelCompare(model1, model2)
model1 |
name of first lme() model object. |
model2 |
name of second lme() model object. |
Compares two nested lme() models from multiple imputation runs using a likelihood ratio test.
Likelihood ratio test for model comparison of two mixed effect imputation models.
Other multipleImputation: imputationCombine(), imputationPRV(), lmCombine()
#INSERT
Calculate the proportional reduction of variance in imputation models.
imputationPRV(baseline, full, baselineTime = 1, fullTime = 1)
baseline |
The baseline model object fit with the imputed data. |
full |
The full model object fit with the imputed data. |
baselineTime |
The position of the random effect of time (random slopes) among the random slopes in the baseline model. |
fullTime |
The position of the random effect of time (random slopes) among the random slopes in the full model. |
Calculates the proportional reduction of variance from a baseline mixed-effects model to a full mixed-effects model.
The proportional reduction of variance from a baseline mixed-effects model to a full mixed-effects model.
Other multipleImputation: imputationCombine(), imputationModelCompare(), lmCombine()
#INSERT
Check whether a value is "Not A Number" (NaN) in a dataframe.
## S3 method for class 'data.frame'
is.nan(x)
x |
Dataframe. |
Checks whether each value in a dataframe is "Not A Number" (NaN).
TRUE or FALSE for each element, indicating whether values in a dataframe are Not A Number (NaN).
Other dataEvaluations: dropColsWithAllNA(), dropRowsWithAllNA(), not_all_na(), not_any_na()
# Prepare Data
df <- data.frame(item1 = rnorm(1000), item2 = rnorm(1000), item3 = rnorm(1000))
df[sample(1:nrow(df), size = 100), c("item1","item2","item3")] <- NaN

# Identify NaN Values
is.nan(df)
Item information in item response theory.
itemInformation(a = 1, b, c = 0, d = 1, theta)
a |
Discrimination parameter (slope). |
b |
Difficulty (severity) parameter (inflection point). |
c |
Guessing parameter (lower asymptote). |
d |
Careless errors parameter (upper asymptote). |
theta |
Person's level on the construct. |
Estimates the amount of information provided by a given item as function of the item parameters and the person's level on the construct (theta).
Amount of item information.
Other IRT: deriv_d_negBinom(), discriminationToFactorLoading(), fourPL(), reliabilityIRT(), standardErrorIRT()
itemInformation(b = 2, theta = -4:4) # 1PL
itemInformation(b = 2, a = 1.5, theta = -4:4) # 2PL
itemInformation(b = 2, a = 1.5, c = 0.10, theta = -4:4) # 3PL
itemInformation(b = 2, a = 1.5, c = 0.10, d = 0.95, theta = -4:4) # 4PL
Computes weighted quantiles. whdquantile() uses a weighted Harrell-Davis quantile estimator. wthdquantile() uses a weighted trimmed Harrell-Davis quantile estimator. wquantile() uses a weighted traditional quantile estimator.
kish_ess(w)
wquantile_generic(x, w, probs, cdf)
whdquantile(x, w, probs)
wthdquantile(x, w, probs, width = 1/sqrt(kish_ess(w)))
wquantile(x, w, probs, type = 7)
w |
Numeric vector of weights to give each value. Should be the same length as the vector of values. |
x |
Numeric vector of values of which to determine the quantiles. |
probs |
Numeric vector of the quantiles to retrieve. |
cdf |
Cumulative distribution function. |
width |
Numeric value for the width of the interval in the trimmed Harrell-Davis quantile estimator. |
type |
Numeric value for type of weighted quantile. |
Computes weighted quantiles according to Akinshin (2023).
Numeric vector of specified quantiles.
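kish_ess() presumably computes Kish's effective sample size, whose standard formula is a one-liner; a minimal sketch (an assumption, not the package's verified source):

kish_ess_sketch <- function(w) sum(w)^2 / sum(w^2)
kish_ess_sketch(rep(1, 10)) # equal weights: effective sample size equals n (10)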
Other computations: Mode(), meanSum(), mySum()
mydata <- c(1:100, 1000)
mydataWithNAs <- mydata
mydataWithNAs[c(1,5,7)] <- NA
weights <- rep(1, length(mydata))
quantiles <- c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)

whdquantile(x = mydata, w = weights, probs = quantiles)
wthdquantile(x = mydata, w = weights, probs = quantiles)
wquantile(x = mydata, w = weights, probs = quantiles)

whdquantile(x = mydataWithNAs, w = weights, probs = quantiles)
wthdquantile(x = mydataWithNAs, w = weights, probs = quantiles)
wquantile(x = mydataWithNAs, w = weights, probs = quantiles)
Function that combines lm() results across multiple imputation runs.
lmCombine(model, dig = 3)
model |
name of lm() model object. |
dig |
number of decimals to print in output. |
Combines lm() model results across multiple imputation runs.
Summary of multiple regression imputation models.
Other multipleImputation: imputationCombine(), imputationModelCompare(), imputationPRV()
Other multipleRegression: plot2WayInteraction(), ppPlot(), semPlotInteraction(), update_nested()
#INSERT
Summarizes the results of a model fit by the lme() function of the nlme package.
lmeSummary(model, dig = 3)
model |
name of lme() model object. |
dig |
number of decimals to print in output. |
Summarizes the results of a model fit by the lme() function of the nlme package. Includes a summary of parameters, pseudo-r-squared, and whether the model is positive definite.
Output summary of lme() model object.
# Fit Model
library("nlme")
model <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1 + age)

# Model Summary
summary(model)
lmeSummary(model)
Loads packages or, if not already installed, installs and loads packages.
load_or_install(package_names, ...)
package_names |
Character vector of one or more package names. |
... |
Additional arguments. |
Loads packages that are already installed, and if the packages are not already installed, it installs and then loads them.
Loaded packages.
https://www.r-bloggers.com/2012/05/loading-andor-installing-packages-programmatically/
https://stackoverflow.com/questions/4090169/elegant-way-to-check-for-missing-packages-and-install-them
Other packages: getDependencies()
## Not run:
old <- options("repos")
options(repos = "https://cran.r-project.org")

# Warning: the command below installs packages that are not already installed
load_or_install(c("tidyverse","nlme"))

options(old)
## End(Not run)
Make lavaan syntax for an exploratory structural equation model (ESEM).
make_esem_model(loadings, anchors)
loadings |
Dataframe with three columns from exploratory factor analysis (EFA): latent (the latent factor name), item (the item name), and loading (the factor loading). |
anchors |
Dataframe whose names are the latent factors and whose values are the names of the anchor item for each latent factor. |
Makes syntax for an exploratory structural equation model (ESEM) to be fit in lavaan.
lavaan model syntax.
https://msilvestrin.me/post/esem/
Other structural equation modeling: equiv_chi(), puc(), satorraBentlerScaledChiSquareDifferenceTestStatistic(), semPlotInteraction()
# Prepare Data
data("HolzingerSwineford1939", package = "lavaan")

# Specify EFA Syntax
efa_syntax <- '
# EFA Factor Loadings
efa("efa1")*f1 +
efa("efa1")*f2 +
efa("efa1")*f3 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
'

# Fit EFA Model
mplusRotationArgs <- list(
  rstarts = 30,
  row.weights = "none",
  algorithm = "gpa",
  orthogonal = FALSE,
  jac.init.rot = TRUE,
  std.ov = TRUE, # row standard = correlation
  geomin.epsilon = 0.0001)

efa_fit <- lavaan::sem(
  efa_syntax,
  data = HolzingerSwineford1939,
  information = "observed",
  missing = "ML",
  estimator = "MLR",
  rotation = "geomin", # mimic Mplus
  meanstructure = TRUE,
  rotation.args = mplusRotationArgs)

# Extract Factor Loadings
esem_loadings <- lavaan::parameterEstimates(
  efa_fit,
  standardized = TRUE) |>
  dplyr::filter(efa == "efa1") |>
  dplyr::select(lhs, rhs, est) |>
  dplyr::rename(item = rhs, latent = lhs, loading = est)

# Specify Anchor Item for Each Latent Factor
anchors <- c(f1 = "x3", f2 = "x5", f3 = "x7")

# Generate ESEM Syntax
esemModel_syntax <- make_esem_model(esem_loadings, anchors)

# Fit ESEM Model
lavaan::sem(
  esemModel_syntax,
  data = HolzingerSwineford1939,
  missing = "ML",
  estimator = "MLR")
Compute a missingness-adjusted row sum.
meanSum(x)
x |
Matrix or dataframe with participants in the rows and items in the columns. |
Take row mean across columns (items) and then multiply by number of items to account for missing (NA) values.
Missingness-adjusted row sum.
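A minimal sketch of the computation as described (an assumption, not the package's verified source):

meanSum_sketch <- function(x) {
  # row mean across items, scaled back up by the number of items
  rowMeans(x, na.rm = TRUE) * ncol(x)
}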
Other computations: Mode(), kish_ess(), mySum()
# Prepare Data
df <- data.frame(item1 = rnorm(1000), item2 = rnorm(1000), item3 = rnorm(1000))

# Calculate Missingness-Adjusted Row Sum
df$missingnessAdjustedSum <- meanSum(df)
Calculate statistical mode.
Mode(x, multipleModes = "all")
x |
Numerical vector. |
multipleModes |
How to handle multiple modes. One of: "all" (default), "mean", or "first". |
Calculates statistical mode(s).
Statistical mode(s).
https://stackoverflow.com/questions/2547402/how-to-find-the-statistical-mode/8189441#8189441
Other computations: kish_ess(), meanSum(), mySum()
# Prepare Data
v1 <- c(1, 1, 2, 2, 3)

# Calculate Statistical Mode
Mode(v1)
Mode(v1, multipleModes = "mean")
Mode(v1, multipleModes = "first")
Amount of principal and interest payments on a mortgage.
mortgage(balance, interest, term = 30, n = 12)
balance |
Initial mortgage balance. |
interest |
Interest rate. |
term |
Payoff period (in years). |
n |
Number of payments per year. |
Calculates the amount of principal and interest payments on a mortgage.
Amount of principal and interest payments.
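The payment presumably follows the standard fixed-rate amortization formula; a minimal sketch of the per-period payment (the helper name is hypothetical, not the package's verified source):

payment_sketch <- function(balance, interest, term = 30, n = 12) {
  r <- interest / n # per-period interest rate
  N <- term * n # total number of payments
  balance * r * (1 + r)^N / ((1 + r)^N - 1)
}
payment_sketch(balance = 300000, interest = .05) # monthly payment over 30 years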
mortgage(balance = 300000, interest = .05)
mortgage(balance = 300000, interest = .04)
mortgage(balance = 300000, interest = .06)
mortgage(balance = 300000, interest = .05, term = 15)
Sorts items based on their factor loadings from an exploratory factor analysis fit with the psych::fa() function.
my_loadings_sorter(fit, sort_type = "largest_loading", nchar = 40, return_blocks = FALSE, showlatentcor = TRUE, itemLabels = NULL)
fit |
the fitted object from the psych::fa() function. |
sort_type |
how to sort the loadings (e.g., "largest_loading", the default). |
nchar |
the limit for the number of characters to display for the item label. |
return_blocks |
whether to return the block number that corresponds to each item. |
showlatentcor |
whether or not to print the intercorrelation among the latent factors (only possible for models with an oblique rotation). |
itemLabels |
a vector of the item labels. |
Adapted from code by Philipp Doebler ([email protected]).
Sorted loadings from exploratory factor analysis model.
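No example accompanies this entry; a minimal hypothetical usage, assuming the psych and GPArotation packages are installed (Harman74.cor ships with base R's datasets package):

fit <- psych::fa(Harman74.cor$cov, nfactors = 4, rotate = "oblimin")
my_loadings_sorter(fit)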
Compute a row sum and retain NAs when all values in the row are NA.
mySum(data)
data |
dataframe |
Compute a row sum and set the row sum to be missing (not zero) when all values in the row are missing (NA).
Modified row sum that is set to missing when all values in the row are missing (NA).
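A minimal sketch of the behavior as described (an assumption, not the package's verified source):

mySum_sketch <- function(data) {
  s <- rowSums(data, na.rm = TRUE)
  s[rowSums(!is.na(data)) == 0] <- NA # all-missing rows stay NA rather than 0
  s
}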
Other computations: Mode(), kish_ess(), meanSum()
# Prepare Data
df <- data.frame(item1 = rnorm(1000), item2 = rnorm(1000), item3 = rnorm(1000))
df[sample(1:nrow(df), size = 100), c("item1","item2","item3")] <- NA

# Calculate Missingness-Adjusted Row Sum
df$sum <- mySum(df)
Create nomogram plot.
nomogrammer(TP = NULL, TN = NULL, FP = NULL, FN = NULL, pretestProb = NULL, selectionRate = NULL, SN = NULL, SP = NULL, FPR = NULL, PLR = NULL, NLR = NULL, Detail = FALSE, NullLine = FALSE, LabelSize = (14/5), Verbose = FALSE)
TP |
Number of true positive cases. |
TN |
Number of true negative cases. |
FP |
Number of false positive cases. |
FN |
Number of false negative cases. |
pretestProb |
Pretest probability (prevalence/base rate/prior probability) of characteristic, as a number between 0 and 1. |
selectionRate |
Selection rate (marginal probability of positive test), as a number between 0 and 1. |
SN |
Sensitivity of the test at a given cut point, as a number between 0 and 1. |
SP |
Specificity of the test at a given cut point, as a number between 0 and 1. |
FPR |
False positive rate of the test at a given cut point, as a number between 0 and 1. |
PLR |
Positive likelihood ratio of the test at a given cut point. |
NLR |
Negative likelihood ratio of the test at a given cut point. |
Detail |
If TRUE, overlay relevant statistics onto the plot. |
NullLine |
If TRUE, add a line through a likelihood ratio of 1 (i.e., a test that adds no diagnostic information). |
LabelSize |
Label size. |
Verbose |
Print out relevant metrics in the console. |
Create nomogram plot from the following at a given cut point:
1) true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), OR
2) pretest probability (pretestProb), sensitivity (SN), and specificity (SP), OR
3) pretest probability (pretestProb), sensitivity (SN), and false positive rate (FPR), OR
4) pretest probability (pretestProb), sensitivity (SN), and selection rate (selectionRate), OR
5) pretest probability (pretestProb), positive likelihood ratio (PLR), and negative likelihood ratio (NLR)
ggplot object of nomogram plot.
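For reference, the likelihood ratios implied by a given sensitivity and specificity can be computed directly; a brief sketch using the values from the examples below:

SN <- 0.421; SP <- 0.965
PLR <- SN / (1 - SP) # positive likelihood ratio; approx 12
NLR <- (1 - SN) / SP # negative likelihood ratio; approx 0.6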
https://github.com/achekroud/nomogrammer
Other accuracy:
accuracyAtCutoff()
,
accuracyAtEachCutoff()
,
accuracyOverall()
,
optimalCutoff()
,
posttestOdds()
nomogrammer(TP = 253, TN = 386, FP = 14, FN = 347)
nomogrammer(pretestProb = .60, SN = 0.421, SP = 0.965)
nomogrammer(pretestProb = .60, SN = 0.421, FPR = 0.035)
nomogrammer(pretestProb = .60, SN = 0.421, selectionRate = 0.267)
nomogrammer(pretestProb = .60, PLR = 12, NLR = 0.6)
Check if any rows for a column are not NA
.
not_all_na(x)
x |
vector or column |
Determine whether any rows for a column (or vector) are not missing
(NA
).
TRUE
or FALSE
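A minimal base-R equivalent (an assumption about the implementation):

not_all_na_sketch <- function(x) any(!is.na(x))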
Other dataEvaluations:
dropColsWithAllNA()
,
dropRowsWithAllNA()
,
is.nan.data.frame()
,
not_any_na()
# Prepare Data
data("USArrests")

# Check if any rows are not NA
not_all_na(USArrests$Murder)
Check if no rows for a column are NA.
not_any_na(x)
x |
column vector |
Determine whether no rows for a column (or vector) are missing (NA).
TRUE
or FALSE
, indicating whether the whole column does
not have any missing values (NA
).
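A minimal base-R equivalent (an assumption about the implementation):

not_any_na_sketch <- function(x) !any(is.na(x))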
Other dataEvaluations:
dropColsWithAllNA()
,
dropRowsWithAllNA()
,
is.nan.data.frame()
,
not_all_na()
# Prepare Data
df <- data.frame(item1 = rnorm(1000), item2 = rnorm(1000), item3 = rnorm(1000))
df[sample(1:nrow(df), size = 100), "item2"] <- NA
df[,"item3"] <- NA

# Check if Not Any NA
not_any_na(df$item1)
not_any_na(df$item2)
not_any_na(df$item3)
Find the optimal cutoff for different aspects of accuracy. Actuals should be
binary, where 1
= present and 0
= absent.
optimalCutoff(predicted, actual, UH = NULL, UM = NULL, UCR = NULL, UFA = NULL)
predicted |
vector of continuous predicted values. |
actual |
vector of binary actual values ( |
UH |
(optional) utility of hits (true positives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UM |
(optional) utility of misses (false negatives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UCR |
(optional) utility of correct rejections (true negatives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
UFA |
(optional) utility of false positives (false positives), specified as a value from 0-1, where 1 is the most highly valued and 0 is the least valued. |
Identify the optimal cutoff for different aspects of accuracy of predicted values in relation to actual values by specifying the predicted values and actual values. Optionally, you can specify the utility of hits, misses, correct rejections, and false alarms to calculate the overall utility of each possible cutoff.
The optimal cutoff and optimal accuracy index at that cutoff based on:
percentAccuracy
= percent accuracy
percentAccuracyByChance
= percent accuracy by chance
RIOC
= relative improvement over chance
relativeImprovementOverPredictingFromBaseRate
= relative
improvement over predicting from the base rate
PPV
= positive predictive value
NPV
= negative predictive value
youdenJ
= Youden's J statistic
balancedAccuracy
= balanced accuracy
f1Score
= F1-score
mcc
= Matthews correlation coefficient
diagnosticOddsRatio
= diagnostic odds ratio
positiveLikelihoodRatio
= positive likelihood ratio
negativeLikelhoodRatio
= negative likelihood ratio
dPrimeSDT
= d-Prime index from signal detection theory
betaSDT
= beta index from signal detection theory
cSDT
= c index from signal detection theory
aSDT
= a index from signal detection theory
bSDT
= b index from signal detection theory
differenceBetweenPredictedAndObserved
= difference between
predicted and observed values
informationGain
= information gain
overallUtility
= overall utility (if utilities were specified)
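As an illustration of one such index, Youden's J at each candidate cutoff could be computed like this (a hedged base-R sketch, not the package's code):

youdenJAtCutoffs <- function(predicted, actual) {
  cutoffs <- sort(unique(predicted))
  data.frame(
    cutoff = cutoffs,
    youdenJ = sapply(cutoffs, function(cut) {
      SN <- mean(predicted[actual == 1] >= cut) # sensitivity
      SP <- mean(predicted[actual == 0] < cut)  # specificity
      SN + SP - 1
    }))
}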
Other accuracy:
accuracyAtCutoff()
,
accuracyAtEachCutoff()
,
accuracyOverall()
,
nomogrammer()
,
posttestOdds()
# Prepare Data
data("USArrests")
USArrests$highMurderState <- NA
USArrests$highMurderState[which(USArrests$Murder >= 10)] <- 1
USArrests$highMurderState[which(USArrests$Murder < 10)] <- 0

# Determine Optimal Cutoff
optimalCutoff(predicted = USArrests$Assault, actual = USArrests$highMurderState)
optimalCutoff(predicted = USArrests$Assault, actual = USArrests$highMurderState,
  UH = 1, UM = 0, UCR = .9, UFA = 0)
Estimate marginal and conditional probabilities using Bayes theorem.
pA(pAgivenB, pB, pAgivenNotB)

pB(pBgivenA, pA, pBgivenNotA)

pAgivenB(pBgivenA, pA, pB = NULL, pBgivenNotA = NULL)

pBgivenA(pAgivenB, pB, pA = NULL, pAgivenNotB = NULL)

pAgivenNotB(pAgivenB, pA, pB)

pBgivenNotA(pBgivenA, pA, pB)
pAgivenB |
The conditional probability of |
pB |
The marginal probability of event |
pAgivenNotB |
The conditional probability of |
pBgivenA |
The conditional probability of |
pA |
The marginal probability of event |
pBgivenNotA |
The conditional probability of |
Estimates marginal or conditional probabilities using Bayes theorem.
The requested marginal or conditional probability. One of:
the marginal probability of A
the marginal probability of B
the conditional probability of A
given B
the conditional probability of B
given A
the conditional probability of A
given NOT B
the conditional probability of B
given NOT A
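The underlying arithmetic is ordinary Bayes' theorem plus the law of total probability; a brief base-R sketch using the values from the examples below:

pA_val <- .285; pBgivenA_val <- .95; pBgivenNotA_val <- .007171515

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
pB_val <- pBgivenA_val * pA_val + pBgivenNotA_val * (1 - pA_val) # approx .2758776

# Bayes' theorem: P(A|B) = P(B|A)P(A) / P(B)
pBgivenA_val * pA_val / pB_val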
Other bayesian:
deriv_d_negBinom()
pA(pAgivenB = .95, pB = .285, pAgivenNotB = .007171515)
pB(pBgivenA = .95, pA = .285, pBgivenNotA = .007171515)
pAgivenB(pBgivenA = .95, pA = .285, pB = .2758776)
pAgivenB(pBgivenA = .95, pA = .285, pBgivenNotA = .007171515)
pAgivenB(pBgivenA = .95, pA = .003, pBgivenNotA = .007171515)
pBgivenA(pAgivenB = .95, pB = .285, pA = .2758776)
pBgivenA(pAgivenB = .95, pB = .285, pAgivenNotB = .007171515)
pBgivenA(pAgivenB = .95, pB = .003, pAgivenNotB = .007171515)
pAgivenNotB(pAgivenB = .95, pB = .003, pA = .01)
pBgivenNotA(pBgivenA = .95, pA = .003, pB = .01)
Function that creates a partial correlation matrix similar to SPSS output.
partialcor.table( x, y, z = NULL, type = "none", dig = 2, correlation = "pearson" )
x |
Variable or set of variables in the form of a vector or dataframe
to correlate with |
y |
(optional) Variable or set of variables in the form of a vector or
dataframe to correlate with |
z |
Covariate(s) to partial out from association. |
type |
Type of correlation matrix to print. One of:
|
dig |
Number of decimals to print. |
correlation |
Method for calculating the association. One of:
|
Co-created by Angela Staples ([email protected]) and Isaac Petersen ([email protected]). Creates a partial correlation matrix, controlling for one or more covariates. For a standard correlation matrix, see cor.table.
A partial correlation matrix.
Other correlations:
addText()
,
cor.table()
,
crossTimeCorrelation()
,
crossTimeCorrelationDF()
,
vwReg()
# Prepare Data
data("mtcars")

# Correlation Matrix
partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars$hp)
partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")])
partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")], dig = 3)
partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")], dig = 3,
  correlation = "spearman")
partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")],
  type = "manuscript", dig = 3)
partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")],
  type = "manuscriptBig")

table1 <- partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")],
  type = "latex")
table2 <- partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")],
  type = "latexSPSS")
table3 <- partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")],
  type = "manuscriptLatex")
table4 <- partialcor.table(mtcars[,c("mpg","cyl","disp")], z = mtcars[,c("hp","wt")],
  type = "manuscriptBigLatex")

partialcor.table(mtcars[,c("mpg","cyl","disp")], mtcars[,c("drat","qsec")],
  mtcars[,c("hp","wt")])
partialcor.table(mtcars[,c("mpg","cyl","disp")], mtcars[,c("drat","qsec")],
  mtcars[,c("hp","wt")], type = "manuscript", dig = 3)
Calculate person months for personnel effort in grants.
percentEffort(academicMonths = NULL, calendarMonths = NULL, summerMonths = NULL, appointment = 9)

personMonths(academicMonths = NULL, calendarMonths = NULL, summerMonths = NULL, effortAcademic = NULL, effortCalendar = NULL, effortSummer = NULL, appointment = 9)
academicMonths |
The number of academic months. |
calendarMonths |
The number of calendar months. |
summerMonths |
The number of summer months. |
appointment |
The duration (in months) of one's annual appointment; used as the denominator for determining the timeframe out of which the academic months occur. Default is a 9-month appointment. |
effortAcademic |
Percent effort (in proportion) during academic months. |
effortCalendar |
Percent effort (in proportion) during calendar months. |
effortSummer |
Percent effort (in proportion) during summer months. |
Calculate person months for personnel effort in grant proposals from academic months, calendar months, and summer months.
The person months of effort.
https://nexus.od.nih.gov/all/2015/05/27/how-do-you-convert-percent-effort-into-person-months/
# Specify Values
appointmentDuration <- 9 # (in months)

# Specify either Set 1 (months) or Set 2 (percent effort) below:

# Set 1: Months
academicMonths <- 1.3 # AY (academic year) months (should be between 0 to appointmentDuration)
calendarMonths <- 0 # CY (calendar year) months (should be between 0-12)
summerMonths <- 0.5 # SM (summer) months (should be between 0 to [12-appointmentDuration])

# Set 2: Percent Effort
percentEffortAcademic <- 0.1444444 # (a proportion; should be between 0-1)
percentEffortCalendar <- 0 # (a proportion; should be between 0-1)
percentEffortSummer <- 0.1666667 # (a proportion; should be between 0-1)

# Calculations
summerDuration <- 12 - appointmentDuration

# Percent effort (in proportion)
percentEffort(academicMonths = academicMonths)
percentEffort(calendarMonths = calendarMonths)
percentEffort(summerMonths = summerMonths)

# Person-Months From NIH Website
(percentEffort(academicMonths = academicMonths) * appointmentDuration) +
  (percentEffort(calendarMonths = calendarMonths) * 12) +
  (percentEffort(summerMonths = summerMonths) * summerDuration)

# Person-Months from Academic/Calendar/Summer Months
personMonths(academicMonths = academicMonths,
  calendarMonths = calendarMonths,
  summerMonths = summerMonths)

# Person-Months from Percent Effort
personMonths(effortAcademic = percentEffortAcademic,
  effortCalendar = percentEffortCalendar,
  effortSummer = percentEffortSummer)
Conversion of percentile ranks to T-scores.
percentileToTScore(percentileRank)
percentileRank |
Vector of percentile ranks. |
Converts percentile ranks to the equivalent T-scores.
Vector of T-scores.
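T-scores have a mean of 50 and a standard deviation of 10, so the conversion is a transformation through the standard normal quantile function (a sketch, assuming this is the formula used):

percentileToTScoreSketch <- function(percentileRank) {
  50 + 10 * qnorm(percentileRank / 100) # percentile -> z-score -> T-score
}
percentileToTScoreSketch(50) # 50
percentileToTScoreSketch(84) # approx 60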
Other conversion:
convert.magic()
,
convertHoursAMPM()
,
convertToHours()
,
convertToMinutes()
,
convertToSeconds()
,
pom()
percentileRanks <- c(1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 99)

percentileToTScore(percentileRanks)
Generates a plot of a 2-way interaction.
plot2WayInteraction( predictor, outcome, moderator, predictorLabel = "predictor", outcomeLabel = "outcome", moderatorLabel = "moderator", varList, varTypes, values = NA, interaction = "normal", legendLabels = NA, legendLocation = "topright", ylim = NA, pvalues = TRUE, data )
predictor |
character name of predictor variable (variable on x-axis). |
outcome |
character name of outcome variable (variable on y-axis). |
moderator |
character name of moderator variable (variable on z-axis). |
predictorLabel |
label on x-axis of plot |
outcomeLabel |
label on y-axis of plot |
moderatorLabel |
label on z-axis of plot |
varList |
names of predictor variables in model |
varTypes |
types of predictor variables in model; one of:
|
values |
specifies values at which to plot moderator (must specify
varType = |
interaction |
one of:
|
legendLabels |
vector of 2 labels for the two levels of the moderator;
leave as |
legendLocation |
one of: |
ylim |
vector of min and max values on y-axis (e,g., |
pvalues |
whether to include p-values of each slope in plot ( |
data |
name of data object |
Generates a plot of a 2-way interaction: the association between a predictor and an outcome at two levels of the moderator.
Plot of two-way interaction.
Other plot:
addText()
,
ppPlot()
,
semPlotInteraction()
,
vwReg()
Other multipleRegression:
lmCombine()
,
ppPlot()
,
semPlotInteraction()
,
update_nested()
# Prepare Data
predictor <- rnorm(1000, 10, 3)
moderator <- rnorm(1000, 50, 10)
outcome <- (1.7 * predictor) + (1.3 * moderator) +
  (1.5 * predictor * moderator) + rnorm(1000, sd = 3)
covariate <- rnorm(1000)
df <- data.frame(predictor, moderator, outcome, covariate)

# Linear Regression
lmModel <- lm(outcome ~ predictor + moderator + predictor:moderator,
  data = df, na.action = "na.exclude")
summary(lmModel)

# 1. Plot 2-Way Interaction
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), data = df)

# 2. Specify y-axis Range
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"),
  ylim = c(10,50), #new
  data = df)

# 3. Add Variable Labels
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), ylim = c(10,50),
  predictorLabel = "Stress", #new
  outcomeLabel = "Aggression", #new
  moderatorLabel = "Gender", #new
  data = df)

# 4. Change Legend Labels
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), ylim = c(10,50),
  predictorLabel = "Stress", outcomeLabel = "Aggression",
  moderatorLabel = "Gender",
  legendLabels = c("Boys","Girls"), #new
  data = df)

# 5. Move Legend Location
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), ylim = c(10,50),
  predictorLabel = "Stress", outcomeLabel = "Aggression",
  moderatorLabel = "Gender", legendLabels = c("Boys","Girls"),
  legendLocation = "topleft", #new
  data = df)

# 6. Turn Off p-Values
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), ylim = c(10,50),
  predictorLabel = "Stress", outcomeLabel = "Aggression",
  moderatorLabel = "Gender", legendLabels = c("Boys","Girls"),
  legendLocation = "topleft",
  pvalues = FALSE, #new
  data = df)

# 7. Get Regression Output from Mean-Centered Predictor and Moderator
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), ylim = c(10,50),
  predictorLabel = "Stress", outcomeLabel = "Aggression",
  moderatorLabel = "Gender", legendLabels = c("Boys","Girls"),
  legendLocation = "topleft",
  interaction = "meancenter", #new
  data = df)

# 8. Get Regression Output from Orthogonalized Interaction Term
plot2WayInteraction(predictor = "predictor", outcome = "outcome",
  moderator = "moderator", varList = c("predictor","moderator","covariate"),
  varTypes = c("sd","binary","mean"), ylim = c(10,50),
  predictorLabel = "Stress", outcomeLabel = "Aggression",
  moderatorLabel = "Gender", legendLabels = c("Boys","Girls"),
  legendLocation = "topleft",
  interaction = "orthogonalize", #new
  data = df)
Calculate the proportion of maximum (POM) score given a minimum and maximum score.
pom(data, min = NULL, max = NULL)
data |
The vector of data. |
min |
The minimum possible or observed value. |
max |
The maximum possible or observed value. |
The minimum and maximum score for calculating the proportion of maximum could be the possible or observed minimum and maximum, respectively. Using the possible minimum and maximum would yield the proportion of maximum possible score. Using the observed minimum and maximum would yield the proportion of maximum observed score. If the minimum and maximum possible scores are not specified, the observed minimum and maximum are used.
Proportion of maximum possible or observed values.
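The computation is a simple rescaling (a sketch; argument handling in the package may differ):

pomSketch <- function(data, min = NULL, max = NULL) {
  if (is.null(min)) min <- base::min(data, na.rm = TRUE) # observed minimum
  if (is.null(max)) max <- base::max(data, na.rm = TRUE) # observed maximum
  (data - min) / (max - min) # proportion of maximum
}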
Other conversion:
convert.magic()
,
convertHoursAMPM()
,
convertToHours()
,
convertToMinutes()
,
convertToSeconds()
,
percentileToTScore()
# Prepare Data
v1 <- sample(1:9, size = 1000, replace = TRUE)

# Calculate Proportion of Maximum Possible (by specifying the minimum and maximum possible)
pom(v1, min = 0, max = 10)

# Calculate Proportion of Maximum Observed
pom(v1)
Estimate posttest odds and posttest probability.
posttestOdds(TP, TN, FP, FN, pretestProb = NULL, SN = NULL, SP = NULL, likelihoodRatio = NULL)

posttestProbability(TP, TN, FP, FN, pretestProb = NULL, SN = NULL, SP = NULL, likelihoodRatio = NULL)
TP |
Number of true positive cases. |
TN |
Number of true negative cases. |
FP |
Number of false positive cases. |
FN |
Number of false negative cases. |
pretestProb |
Pretest probability (prevalence/base rate/prior probability) of characteristic, as a number between 0 and 1. |
SN |
Sensitivity of the test at a given cut point, as a number between 0 and 1. |
SP |
Specificity of the test at a given cut point, as a number between 0 and 1. |
likelihoodRatio |
Likelihood ratio of the test at a given cut point. |
Estimates posttest odds or posttest probability.
The requested posttest odds or posttest probability.
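The arithmetic follows directly from Bayes' theorem in odds form; a brief sketch using the values from the examples below:

pretestProb <- 0.3636364
SN <- 0.65; SP <- 0.80
PLR <- SN / (1 - SP) # positive likelihood ratio = 3.25

pretestOdds <- pretestProb / (1 - pretestProb)
posttestOddsVal <- pretestOdds * PLR # posttest odds = pretest odds x likelihood ratio
posttestOddsVal / (1 + posttestOddsVal) # posttest probability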
Other accuracy:
accuracyAtCutoff()
,
accuracyAtEachCutoff()
,
accuracyOverall()
,
nomogrammer()
,
optimalCutoff()
posttestOdds(TP = 26, TN = 56, FP = 14, FN = 14)
posttestOdds(pretestProb = 0.3636364, SN = 0.65, SP = 0.80)
posttestOdds(pretestProb = 0.3636364, likelihoodRatio = 3.25)

posttestProbability(TP = 26, TN = 56, FP = 14, FN = 14)
posttestProbability(pretestProb = 0.3636364, SN = 0.65, SP = 0.80)
posttestProbability(pretestProb = 0.3636364, likelihoodRatio = 3.25)
Normal Probability (P-P) Plot.
ppPlot(model)
model |
The model object of a linear regression model fit using the
|
A normal probability (P-P) plot compares the empirical cumulative distribution to the theoretical cumulative distribution.
Normal probability (P-P) plot.
https://www.r-bloggers.com/2009/12/r-tutorial-series-graphic-analysis-of-regression-assumptions/
Other plot:
addText()
,
plot2WayInteraction()
,
semPlotInteraction()
,
vwReg()
Other multipleRegression:
lmCombine()
,
plot2WayInteraction()
,
semPlotInteraction()
,
update_nested()
# Prepare Data
predictor1 <- rnorm(100)
predictor2 <- rnorm(100)
outcome <- rnorm(100)

# Fit Model
lmModel <- lm(outcome ~ predictor1 + predictor2)

# P-P Plot
ppPlot(lmModel)
Percent of uncontaminated correlations (PUC) from bifactor model.
puc(numItems, numSpecificFactors)
numItems |
Number of items (or indicators). |
numSpecificFactors |
Number of specific factors. |
Estimates the percent of uncontaminated correlations (PUC) from a bifactor model. The PUC represents the percentage of correlations (i.e., covariance terms) that reflect variance from only the general factor (i.e., not variance from a specific factor). Correlations that are explained by the specific factors are considered "contaminated" by multidimensionality.
Percent of Uncontaminated Correlations (PUC).
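Assuming items are divided evenly across specific factors, the computation can be sketched as follows (an illustration, not necessarily the package's exact code; see the cited references for the formula):

pucSketch <- function(numItems, numSpecificFactors) {
  totalCors <- numItems * (numItems - 1) / 2 # unique item correlations
  itemsPerFactor <- numItems / numSpecificFactors
  contaminated <- numSpecificFactors * itemsPerFactor * (itemsPerFactor - 1) / 2
  (totalCors - contaminated) / totalCors
}
pucSketch(numItems = 9, numSpecificFactors = 3) # .75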
doi:10.31234/osf.io/6tf7j doi:10.1177/0013164412449831 doi:10.1037/met0000045
Other structural equation modeling:
equiv_chi()
,
make_esem_model()
,
satorraBentlerScaledChiSquareDifferenceTestStatistic()
,
semPlotInteraction()
puc(numItems = 9, numSpecificFactors = 3)

mydata <- data.frame(
  numItems = c(9,18,18,36,36,36),
  numSpecificFactors = c(3,3,6,3,6,12)
)

puc(numItems = mydata$numItems, numSpecificFactors = mydata$numSpecificFactors)
Suppress the leading zero when printing p-values.
pValue(value, digits = 3)
value |
The p-value. |
digits |
Number of decimal digits for printing the p-value. |
Suppresses the leading zero when printing p-values (e.g., .02 rather than 0.02).
p-value.
Other formatting:
apa()
,
specify_decimal()
,
suppressLeadingZero()
pValue(0.70)
pValue(0.04)
pValue(0.00002)
Read data from encrypted file.
read.aes(filename, key)
filename |
Location of encrypted data. |
key |
Encryption key. |
Reads data from an encrypted file. To write data to an encrypted file, see write.aes.
Unencrypted data.
Other encrypted:
write.aes()
# Location of Encryption Key on Local Computer (where only you should have access to it)
#encryptionKeyLocation <- file.path(getwd(), "/encryptionKey.RData",
#  fsep = "") # Can change to a different path, e.g.: "C:/Users/[USERNAME]/"

# Generate a Temporary File Path for Encryption Key
encryptionKeyLocation <- tempfile(fileext = ".RData")

# Generate Encryption Key
key <- as.raw(sample(1:16, 16))

# Save Encryption Key
save(key, file = encryptionKeyLocation)

# Specify Credentials
credentials <- "Insert My Credentials Here"

# Generate a Temporary File Path for Encrypted Credentials
encryptedCredentialsLocation <- tempfile(fileext = ".txt")

# Save Encrypted Credentials
#write.aes(
#  df = credentials,
#  filename = file.path(getwd(), "/encryptedCredentials.txt", fsep = ""),
#  key = key) # Change the file location to save this on the lab drive
write.aes(
  df = credentials,
  filename = encryptedCredentialsLocation,
  key = key)
rm(credentials)
rm(key)

# Read and Unencrypt the Credentials Using the Encryption Key
load(encryptionKeyLocation)
#credentials <- read.aes(
#  filename = file.path(getwd(), "/encryptedCredentials.txt", fsep = ""),
#  key = key)
credentials <- read.aes(
  filename = encryptedCredentialsLocation,
  key = key)
Recode intensity of behavior based on frequency of behavior.
recode_intensity(intensity, did_not_occur = NULL, frequency = NULL)

mark_intensity_as_zero(item_names, data, did_not_occur_vars = NULL, frequency_vars = NULL)
intensity |
The intensity of the behavior. |
did_not_occur |
Whether or not the behavior did NOT occur. If |
frequency |
The frequency of the behavior. |
item_names |
The names of the questionnaire items. |
data |
The data object. |
did_not_occur_vars |
The name(s) of the variables corresponding to
whether the behavior did not occur in the past year ( |
frequency_vars |
The name(s) of the variables corresponding to the
number of occurrences ( |
Recodes the intensity of behavior to zero if the frequency of the behavior is zero (i.e., if the behavior has not occurred).
The intensity of the behavior.
Function that identifies the values for a progress bar in REDCap.
redcapProgressBar(numSurveys, beginning = 2, end = 99)
numSurveys |
the number of surveys to establish progress. |
beginning |
the first value to use in the sequence. |
end |
the last value to use in the sequence. |
A progress bar in REDCap can be created using the following code:
Progress: <div style="width:100%;border:0;margin:0;padding:0;background-color: #A9BAD1;text-align:center;"><div style="width:2%;border: 0;margin:0; padding:0;background-color:#8491A2"><span style="color:#8491A2">. </span></div></div>
where width:2%
specifies the progress (out of 100%).
sequence of numbers for the progress bar in REDCap.
redcapProgressBar(numSurveys = 6)
redcapProgressBar(6)
redcapProgressBar(4)
redcapProgressBar(numSurveys = 7, beginning = 1, end = 99)
Estimate the reliability in item response theory.
reliabilityIRT(information, varTheta = 1)
information |
Test information. |
varTheta |
Variance of theta. |
Estimate the reliability in item response theory using the test information (i.e., the sum of all items' information).
Reliability for that amount of test information.
https://groups.google.com/g/mirt-package/c/ZAgpt6nq5V8/m/R3OEeEqdAQAJ
Other IRT:
deriv_d_negBinom()
,
discriminationToFactorLoading()
,
fourPL()
,
itemInformation()
,
standardErrorIRT()
# Calculate information for 4 items
item1 <- itemInformation(b = -2, a = 0.6, theta = -4:4)
item2 <- itemInformation(b = -1, a = 1.2, theta = -4:4)
item3 <- itemInformation(b = 1, a = 1.5, theta = -4:4)
item4 <- itemInformation(b = 2, a = 2, theta = -4:4)

items <- data.frame(item1, item2, item3, item4)

# Calculate test information
items$testInformation <- rowSums(items)

# Estimate reliability
reliabilityIRT(items$testInformation)
Estimate the reliability of a difference score.
reliabilityOfDifferenceScore(x, y, reliabilityX, reliabilityY)
x |
Vector of one variable that is used in the computation of difference score. |
y |
Vector of second variable that is used in the computation of the difference score. |
reliabilityX |
The reliability of the |
reliabilityY |
The reliability of the |
Estimates the reliability of a difference score.
Reliability of the difference score that is computed from the difference of
x
and y
.
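For reference, the classical formula for the reliability of a difference score combines the two reliabilities with the variances and intercorrelation of the components; a hedged sketch (an assumption about the formula implemented):

relDiffSketch <- function(x, y, reliabilityX, reliabilityY) {
  sx <- sd(x, na.rm = TRUE); sy <- sd(y, na.rm = TRUE)
  rxy <- cor(x, y, use = "pairwise.complete.obs")
  # classical reliability-of-difference formula
  (sx^2 * reliabilityX + sy^2 * reliabilityY - 2 * sx * sy * rxy) /
    (sx^2 + sy^2 - 2 * sx * sy * rxy)
}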
Other reliability:
repeatability()
v1 <- rnorm(1000, mean = 100, sd = 15)
v2 <- v1 + rnorm(1000, mean = 1, sd = 15)

reliabilityOfDifferenceScore(x = v1, y = v2, reliabilityX = .7, reliabilityY = .8)
Estimate the repeatability of a measure's scores across two time points.
repeatability(measure1, measure2)
measure1 |
Vector of scores from the measure at time 1. |
measure2 |
Vector of scores from the measure at time 2. |
Estimates the coefficient of repeatability (CR), bias, and the lower and upper limits of agreement (LOA).
Dataframe with the coefficient of repeatability (CR
), bias, the lower
limit of agreement (lowerLOA
), and the upper limit of agreement
(upperLOA
). Also generates a Bland-Altman plot with a solid black
reference line (indicating a difference of zero), a dashed red line
indicating the bias, and dashed blue lines indicating the limits of
agreement.
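The Bland-Altman quantities have standard definitions; a brief sketch (assuming the usual 1.96-SD limits; the package's implementation may differ in details):

measure1 <- rnorm(1000, mean = 100, sd = 15)
measure2 <- measure1 + rnorm(1000, mean = 1, sd = 3)
diffs <- measure2 - measure1

bias <- mean(diffs)
lowerLOA <- bias - 1.96 * sd(diffs) # lower limit of agreement
upperLOA <- bias + 1.96 * sd(diffs) # upper limit of agreement
CR <- 1.96 * sd(diffs) # coefficient of repeatability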
Other reliability:
reliabilityOfDifferenceScore()
v1 <- rnorm(1000, mean = 100, sd = 15)
v2 <- v1 + rnorm(1000, mean = 1, sd = 3)

repeatability(v1, v2)
Reverse score variables using either the theoretical min and max, or the observed max.
reverse_score( data, variables, theoretical_max = NULL, theoretical_min = NULL, append_string = NULL )
data |
Data object. |
variables |
Names of variables to reverse score. |
theoretical_max |
(Optional): the theoretical maximum score. |
theoretical_min |
(Optional): the theoretical minimum score. |
append_string |
(Optional): a string to append to each variable name. |
Reverse scores variables using either the theoretical min and max (by subtracting each score from the theoretical maximum and adding the theoretical minimum) or by subtracting each score from the observed maximum score for that variable.
Dataframe with reverse-scored variables.
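The arithmetic can be sketched as follows (a hedged illustration of the two approaches):

score <- c(1, 2, NA, 4, 5)

# Using the theoretical min and max (assumed: reversed = max + min - score)
theoretical_min <- 1; theoretical_max <- 7
theoretical_max + theoretical_min - score # 7 6 NA 4 3

# Using the observed maximum
max(score, na.rm = TRUE) - score # 4 3 NA 1 0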
mydata <- data.frame(
  var1 = c(1, 2, NA, 4, 5),
  var2 = c(NA, 4, 3, 2, 1)
)

variables_to_reverse_score <- c("var1", "var2")

reverse_score(mydata, variables = variables_to_reverse_score)
reverse_score(mydata, variables = variables_to_reverse_score, append_string = ".R")
reverse_score(mydata, variables = variables_to_reverse_score, theoretical_max = 7)
reverse_score(mydata, variables = variables_to_reverse_score, theoretical_max = 7,
  theoretical_min = 1)
Function that computes the Satorra-Bentler Scaled Chi-Square Difference Test statistic.
satorraBentlerScaledChiSquareDifferenceTestStatistic(T0, c0, d0, T1, c1, d1)
T0 |
Value of the chi-square statistic for the nested model. |
c0 |
Value of the scaling correction factor for the nested model. |
d0 |
Number of model degrees of freedom for the nested model. |
T1 |
Value of the chi-square statistic for the comparison model. |
c1 |
Value of the scaling correction factor for the comparison model. |
d1 |
Number of model degrees of freedom for the comparison model. |
Computes the Satorra-Bentler Scaled Chi-Square Difference Test statistic between two structural equation models.
Satorra-Bentler Scaled Chi-Square Difference Test statistic.
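A hedged sketch of the formula (following Satorra & Bentler, 2001; assuming the scaled chi-square statistics and scaling correction factors are passed as in the example below):

sbScaledChiSquareDiffSketch <- function(T0, c0, d0, T1, c1, d1) {
  cd <- (d0 * c0 - d1 * c1) / (d0 - d1) # scaling correction for the difference test
  (T0 * c0 - T1 * c1) / cd
}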
Other structural equation modeling:
equiv_chi()
,
make_esem_model()
,
puc()
,
semPlotInteraction()
# Fit structural equation model
HS.model <- '
 visual =~ x1 + x2 + x3
 textual =~ x4 + x5 + x6
 speed =~ x7 + x8 + x9
'

fit1 <- lavaan::cfa(HS.model, data = lavaan::HolzingerSwineford1939,
  estimator = "MLR")
fit0 <- lavaan::cfa(HS.model, data = lavaan::HolzingerSwineford1939,
  orthogonal = TRUE, estimator = "MLR")

# Chi-square difference test
# lavaan::anova(fit1, fit0)

satorraBentlerScaledChiSquareDifferenceTestStatistic(
  T0 = lavaan::fitMeasures(fit0)["chisq.scaled"],
  c0 = lavaan::fitMeasures(fit0)["chisq.scaling.factor"],
  d0 = lavaan::fitMeasures(fit0)["df.scaled"],
  T1 = lavaan::fitMeasures(fit1)["chisq.scaled"],
  c1 = lavaan::fitMeasures(fit1)["chisq.scaling.factor"],
  d1 = lavaan::fitMeasures(fit1)["df.scaled"])
Generates a plot of a 2-way interaction from a structural equation model (SEM) that was estimated using the lavaan package.
semPlotInteraction( data, fit, predictor, centered_predictor, moderator, centered_moderator, interaction, outcome, covariates = NULL, predStr = NULL, modStr = NULL, outStr = NULL )
data |
the dataframe object from which the model was derived |
fit |
the fitted model lavaan object |
predictor |
the variable name of the predictor variable that is in its raw metric (in quotes) |
centered_predictor |
the variable name of the mean-centered predictor variable as it appears in the model object syntax in lavaan (in quotes) |
moderator |
the variable name of the moderator variable that is in its raw metric (in quotes) |
centered_moderator |
the variable name of the mean-centered moderator variable that as it appears in the model object syntax in lavaan (in quotes) |
interaction |
the variable name of the interaction term as it appears in the model object syntax in lavaan (in quotes) |
outcome |
the variable name of the outcome variable as it appears in the model object syntax in lavaan (in quotes) |
covariates |
default NULL; a vector of the names of the covariate variables as they appear in the model object syntax in lavaan (each in quotes) |
predStr |
default NULL; optional addition of an x-axis title for the name of the predictor variable (in quotes); if left unset, plot label will default to "Predictor" |
modStr |
default NULL; optional addition of an z-axis title for the name of the moderator variable (in quotes); if left unset, plot label will default to "Moderator" |
outStr |
default NULL; optional addition of a y-axis title for the name of the outcome variable (in quotes); if left unset, plot label will default to "Outcome" |
Created by Johanna Caskey ([email protected]).
Plot of two-way interaction from structural equation model.
Other plot:
addText()
,
plot2WayInteraction()
,
ppPlot()
,
vwReg()
Other multipleRegression:
lmCombine()
,
plot2WayInteraction()
,
ppPlot()
,
update_nested()
Other structural equation modeling:
equiv_chi()
,
make_esem_model()
,
puc()
,
satorraBentlerScaledChiSquareDifferenceTestStatistic()
states <- as.data.frame(state.x77)
names(states)[which(names(states) == "HS Grad")] <- "HS.Grad"
states$Income_rescaled <- states$Income/100

# Mean Center Predictors
states$Illiteracy_centered <- scale(states$Illiteracy, scale = FALSE)
states$Murder_centered <- scale(states$Murder, scale = FALSE)

# Compute Interaction Term
states$interaction <- states$Illiteracy_centered * states$Murder_centered

# Specify model syntax
moderationModel <- '
 Income_rescaled ~ Illiteracy_centered + Murder_centered + interaction + HS.Grad
'

# Fit the model
moderationFit <- lavaan::sem(
  moderationModel,
  data = states,
  missing = "ML",
  estimator = "MLR",
  fixed.x = FALSE)

# Pass model to function (unlabeled plot)
semPlotInteraction(
  data = states,
  fit = moderationFit,
  predictor = "Illiteracy",
  centered_predictor = "Illiteracy_centered",
  moderator = "Murder",
  centered_moderator = "Murder_centered",
  interaction = "interaction",
  outcome = "Income_rescaled",
  covariates = "HS.Grad")

# Pass model to function (labeled plot)
semPlotInteraction(
  data = states,
  fit = moderationFit,
  predictor = "Illiteracy",
  centered_predictor = "Illiteracy_centered",
  moderator = "Murder",
  centered_moderator = "Murder_centered",
  interaction = "interaction",
  outcome = "Income_rescaled",
  covariates = "HS.Grad",
  predStr = "Illiteracy Level",
  modStr = "Murder Rate",
  outStr = "Income")
Sets the path directory to the lab drive.
setLabPath()
Sets the path directory to the lab drive, and saves it in the object
petersenLab
.
The object petersenLab
containing the path directory to the lab
drive.
petersenLabPath <- setLabPath()
Simulate data with a specified area under the receiver operating characteristic curve—i.e., the AUC of an ROC curve.
simulateAUC(auc, n)
auc |
The area under the receiver operating characteristic (ROC) curve. |
n |
The number of observations to simulate. |
Simulates data with a specified area under the receiver operating characteristic curve—i.e., the AUC of an ROC curve.
Dataframe with two columns:
x
is the predictor variable.
y
is the dichotomous criterion variable.
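The linked Stack Exchange thread uses a binormal construction; a hedged sketch of that idea (not necessarily the package's exact implementation):

simulateAUCsketch <- function(auc, n) {
  mu <- sqrt(2) * qnorm(auc) # group separation implying the target AUC
  y <- rbinom(n, size = 1, prob = 0.5) # dichotomous criterion
  x <- rnorm(n, mean = mu * y) # predictor shifted upward for cases
  data.frame(x = x, y = y)
}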
https://stats.stackexchange.com/questions/422926/generate-synthetic-data-given-auc/424213
Other simulation:
complement()
,
simulateIndirectEffect()
simulateAUC(.60, 50000)
simulateAUC(.70, 50000)
simulateAUC(.80, 50000)
simulateAUC(.90, 50000)
simulateAUC(.95, 50000)
simulateAUC(.99, 50000)
Simulate indirect effect from mediation analyses.
simulateIndirectEffect( N = NA, x = NA, m = NA, XcorM = NA, McorY = NA, corTotal = NA, proportionMediated = NA, seed = NA )
N |
Sample size. |
x |
Vector for the predictor variable. |
m |
Vector for the mediating variable. |
XcorM |
Coefficient of the correlation between the predictor variable and mediating variable. |
McorY |
Coefficient of the correlation between the mediating variable and outcome variable. |
corTotal |
Size of total effect. |
proportionMediated |
The proportion of the total effect that is mediated. |
seed |
Seed for replicability. |
Co-created by Robert G. Moulder Jr. and Isaac T. Petersen
the correlation between the predictor variable (x
) and the
mediating variable (m
).
the correlation between the mediating variable (m
) and the
outcome variable (Y
).
the correlation between the predictor variable (x
) and the
outcome variable (Y
).
the direct correlation between the predictor variable (x
) and
the outcome variable (Y
), while controlling for the mediating
variable (m
).
the indirect correlation between the predictor variable (x
)
and the outcome variable (Y
) through the mediating variable
(m
).
the total correlation between the predictor variable (x
) and
the outcome variable (Y
): i.e., the sum of the direct correlation
and the indirect correlation.
the proportion of the correlation between the predictor variable
(x
) and the outcome variable (Y
) that is mediated through the
mediating variable (m
).
Other simulation:
complement()
,
simulateAUC()
#INSERT
Specify the number of decimals to print.
specify_decimal(x, k)
x |
Numeric vector. |
k |
Number of decimals to print. |
Specifies the number of decimals to print.
Character vector of numbers with the specified number of decimal places.
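A common base-R idiom for this (a sketch of one possible implementation, not necessarily the package's):

specify_decimal_sketch <- function(x, k) trimws(format(round(x, k), nsmall = k))
specify_decimal_sketch(3.14159, 2) # "3.14"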
Other formatting:
apa()
,
pValue()
,
suppressLeadingZero()
# Prepare Data
v1 <- rnorm(1000)

# Specify Decimals
specify_decimal(v1, 2)
Estimate the standard error of measurement in item response theory.
standardErrorIRT(information)
information |
Test information. |
Estimate the standard error of measurement in item response theory using the test information (i.e., the sum of all items' information).
Standard error of measurement for that amount of test information.
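The standard relation between test information and the standard error of measurement is SE(theta) = 1 / sqrt(information); a brief sketch:

standardErrorIRTSketch <- function(information) 1 / sqrt(information)
standardErrorIRTSketch(c(1, 4, 16)) # 1.00 0.50 0.25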
Other IRT:
deriv_d_negBinom()
,
discriminationToFactorLoading()
,
fourPL()
,
itemInformation()
,
reliabilityIRT()
# Calculate information for 4 items
item1 <- itemInformation(b = -2, a = 0.6, theta = -4:4)
item2 <- itemInformation(b = -1, a = 1.2, theta = -4:4)
item3 <- itemInformation(b = 1, a = 1.5, theta = -4:4)
item4 <- itemInformation(b = 2, a = 2, theta = -4:4)

items <- data.frame(item1, item2, item3, item4)

# Calculate test information
items$testInformation <- rowSums(items)

# Calculate standard error of measurement
standardErrorIRT(items$testInformation)
Suppress leading zero of numbers.
suppressLeadingZero(value)
value |
Numeric vector. |
Suppresses the leading zero of numbers (e.g., .75 rather than 0.75).
Character vector of numbers without leading zeros.
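A minimal sketch of the idea using a regular expression (an assumption about the implementation):

suppressLeadingZeroSketch <- function(value) {
  sub("^(-?)0\\.", "\\1.", format(value)) # "0.75" -> ".75"; "-0.5" -> "-.5"
}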
Other formatting:
apa()
,
pValue()
,
specify_decimal()
# Prepare Data
v1 <- rnorm(1000)

# Suppress Leading Zero
suppressLeadingZero(v1)
Estimate frequency of a behavior for a particular duration.
timesPerInterval(num_occurrences = NULL, interval = NULL, duration = "month", not_occurred_past_year = NULL)

timesPerLifetime(num_occurrences = NULL, never_occurred = NULL)

computeItemFrequencies(item_names, data, duration = "month", frequency_vars, interval_vars, not_in_past_year_vars)

computeLifetimeFrequencies(item_names, data, frequency_vars, never_occurred_vars)
num_occurrences |
The number of times the behavior occurred during the
specified interval, |
interval |
The specified interval corresponding to the number of times
the behavior occurred,
|
duration |
The desired duration during which to estimate how many times the behavior occurred:
|
not_occurred_past_year |
Whether or not the behavior did NOT occur in
the past year. If |
never_occurred |
Whether or not the behavior has NEVER occurred in
the person's lifetime. If |
item_names |
The names of the questionnaire items. |
data |
The data object. |
frequency_vars |
The name(s) of the variables corresponding to the
number of occurrences ( |
interval_vars |
The name(s) of the variables corresponding to the
intervals ( |
not_in_past_year_vars |
The name(s) of the variables corresponding to
whether the behavior did not occur in the past year
( |
never_occurred_vars |
The name(s) of the variables corresponding to
whether the behavior has never occurred during the person's lifetime
( |
Estimates the frequency of a given behavior for a particular duration, given a specified number of times it occurred during a specified interval.
The frequency of the behavior for the specified duration.
timesPerInterval(num_occurrences = 2, interval = 3, duration = "month",
  not_occurred_past_year = 0)
timesPerInterval(duration = "month", not_occurred_past_year = 1)

timesPerLifetime(num_occurrences = 2, never_occurred = 0)
timesPerLifetime(never_occurred = 1)
Wrapper function to ensure the same observations are used for each updated model as were used in the first model.
update_nested(object, formula., ..., evaluate = TRUE)
object |
model object to update |
formula. |
updated model formula |
... |
further parameters passed to the fitting function |
evaluate |
whether to evaluate the model. One of: |
Convenience wrapper function to ensure the same observations are used for each updated model as were used in the first model, to ensure comparability of models.
lm
model
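Based on the linked Stack Overflow answers, the core idea can be sketched as refitting on the first model's model frame (a simplified illustration; the package's wrapper also forwards further arguments):

update_nested_sketch <- function(object, formula.) {
  # refit on exactly the rows the original model used
  stats::update(object, formula., data = stats::model.frame(object))
}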
https://stackoverflow.com/a/37341927
https://stackoverflow.com/a/37416336
https://stackoverflow.com/a/47195348
Other multipleRegression:
lmCombine()
,
plot2WayInteraction()
,
ppPlot()
,
semPlotInteraction()
# Prepare Data
data("mtcars")
dat <- mtcars

# Create some missing values in mtcars
dat[1, "wt"] <- NA
dat[5, "cyl"] <- NA
dat[7, "hp"] <- NA

m1 <- lm(mpg ~ wt + cyl + hp, data = dat)
m2 <- update_nested(m1, . ~ . - wt) # Remove wt
m3 <- update_nested(m1, . ~ . - cyl) # Remove cyl
m4 <- update_nested(m1, . ~ . - wt - cyl) # Remove wt and cyl
m5 <- update_nested(m1, . ~ . - wt - cyl - hp) # Remove all three variables
# (i.e., model with intercept only)

anova(m1, m2, m3, m4, m5)
Identifies the variables in common across two dataframes that have different types.
varsDifferentTypes(df1, df2)
df1 |
dataframe 1 (object) |
df2 |
dataframe 2 (object) |
Identifies the variables that have the same name but different types across two dataframes, which can pose challenges when merging the dataframes.
Dataframe with columns for the variable name, the variable type in df1, and the variable type in df2.
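A minimal sketch of the underlying check (not necessarily the package's implementation): compare the class of each column the two dataframes share.

# Compare column classes for the variables the two dataframes share;
# the type_df1/type_df2 column names here are illustrative, not the
# package's.
df1 <- data.frame(A = 1:3, B = 2:4, C = 3:5)
df2 <- data.frame(A = as.character(1:3), B = 2:4, C = as.factor(3:5))

common <- intersect(names(df1), names(df2))
types1 <- vapply(df1[common], function(x) class(x)[1], character(1))
types2 <- vapply(df2[common], function(x) class(x)[1], character(1))

# Keep only the variables whose types differ
data.frame(variable = common, type_df1 = types1, type_df2 = types2,
  row.names = NULL)[types1 != types2, ]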
Other dataManipulation: columnBindFill(), convert.magic(), dropColsWithAllNA(), dropRowsWithAllNA()
# Prepare Data df1 <- data.frame( A = 1:3, B = 2:4, C = 3:5 ) df2 <- data.frame( A = as.character(1:3), B = 2:4, C = as.factor(3:5) ) # Check if any rows are not NA varsDifferentTypes(df1, df2)
# Prepare Data df1 <- data.frame( A = 1:3, B = 2:4, C = 3:5 ) df2 <- data.frame( A = as.character(1:3), B = 2:4, C = as.factor(3:5) ) # Check if any rows are not NA varsDifferentTypes(df1, df2)
Create watercolor plot to visualize weighted regression.
vwReg(
  formula,
  data,
  title = "",
  B = 1000,
  shade = TRUE,
  shade.alpha = 0.1,
  spag = FALSE,
  spag.color = "darkblue",
  mweight = TRUE,
  show.lm = FALSE,
  show.median = TRUE,
  median.col = "white",
  shape = 21,
  show.CI = FALSE,
  method = loess,
  bw = FALSE,
  slices = 200,
  palette = colorRampPalette(c("#FFEDA0", "#DD0000"), bias = 2)(20),
  ylim = NULL,
  quantize = "continuous",
  add = FALSE,
  ...
)
formula |
regression model. |
data |
dataset. |
title |
plot title. |
B |
number of bootstrapped smoothers. |
shade |
whether to plot the shaded confidence region. |
shade.alpha |
whether to fade out the confidence interval shading at the edges (by reducing alpha; 0 = no alpha decrease, 0.1 = medium alpha decrease, 0.5 = strong alpha decrease). |
spag |
whether to plot spaghetti lines. |
spag.color |
color of the spaghetti lines; default: "darkblue". |
mweight |
logical indicating whether to make the median smoother visually weighted. |
show.lm |
logical indicating whether to plot the linear regression line. |
show.median |
logical indicating whether to plot the median smoother. |
median.col |
color of the median smoother. |
shape |
shape of points. |
show.CI |
logical indicating whether to plot the 95% confidence interval limits. |
method |
the fitting function for the smoothers and spaghetti lines; default: loess. |
bw |
logical indicating whether to use a b&w palette; default:
|
slices |
number of slices in the x and y direction for the shaded region.
Higher numbers make a smoother plot but take longer to draw; default: 200. |
palette |
provide a custom color palette for the watercolors. |
ylim |
restrict range of the watercoloring. |
quantize |
either "continuous" or "SD" (quantize the shading into 1-, 2-, and 3-SD regions). |
add |
if TRUE, the watercolor plot is added to an existing plot. |
... |
further parameters passed to the fitting function; in the case of
loess, for example, span or family. |
Creates a watercolor plot to visualize weighted regression.
A plot object.
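The visual weighting comes from bootstrapped smoothers: where the B bootstrap curves agree, the plot appears dense and saturated; where they disagree, it fades out. A minimal base-R sketch of that core idea (not vwReg()'s actual implementation):

# Overlay semi-transparent loess curves fit to bootstrap resamples;
# regions where the curves pile up indicate higher certainty.
set.seed(1)
df <- data.frame(x = mtcars$hp, y = mtcars$mpg)
xs <- seq(min(df$x), max(df$x), length.out = 100)

plot(df$x, df$y, pch = 16, xlab = "x", ylab = "y")
for (b in 1:200) {
  boot <- df[sample(nrow(df), replace = TRUE), ] # resample rows
  fit <- loess(y ~ x, data = boot)
  lines(xs, predict(fit, newdata = data.frame(x = xs)),
    col = grDevices::adjustcolor("darkblue", alpha.f = 0.03))
}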
https://www.nicebread.de/visually-weighted-regression-in-r-a-la-solomon-hsiang/
https://www.nicebread.de/visually-weighted-watercolor-plots-new-variants-please-vote/
http://www.fight-entropy.com/2012/07/visually-weighted-regression.html
http://www.fight-entropy.com/2012/08/visually-weighted-confidence-intervals.html
http://www.fight-entropy.com/2012/08/watercolor-regression.html
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2265501
Other plot: addText(), plot2WayInteraction(), ppPlot(), semPlotInteraction()
Other correlations: addText(), cor.table(), crossTimeCorrelation(), crossTimeCorrelationDF(), partialcor.table()
# Prepare Data data("mtcars") df <- data.frame(x = mtcars$hp, y = mtcars$mpg) ## Visually Weighted Regression # Default vwReg(y ~ x, df) # Shade vwReg(y ~ x, df, shade = TRUE, show.lm = TRUE, show.CI = TRUE, quantize = "continuous") vwReg(y ~ x, df, shade = TRUE, show.lm = TRUE, show.CI = TRUE, quantize = "SD") # Spaghetti vwReg(y ~ x, df, shade = FALSE, spag = TRUE, show.lm = TRUE, show.CI = TRUE) vwReg(y ~ x, df, shade = FALSE, spag = TRUE) # Black/white vwReg(y ~ x, df, shade = TRUE, spag = FALSE, show.lm = TRUE, show.CI = TRUE, bw = TRUE, quantize = "continuous") vwReg(y ~ x, df, shade = TRUE, spag = FALSE, show.lm = TRUE, show.CI = TRUE, bw = TRUE, quantize = "SD") vwReg(y ~ x, df, shade = FALSE, spag = TRUE, show.lm = TRUE, show.CI = TRUE, bw = TRUE, quantize = "SD") # Change the bootstrap smoothing vwReg(y ~ x, df, family = "symmetric") # use an M-estimator for # bootstrap smoothers. Usually yields wider confidence intervals vwReg(y ~ x, df, span = 1.7) # increase the span of the smoothers vwReg(y ~ x, df, span = 0.5) # decrease the span of the smoothers # Change the color scheme vwReg(y ~ x, df, palette = viridisLite::viridis(4)) # viridis vwReg(y ~ x, df, palette = viridisLite::magma(4)) # magma vwReg(y ~ x, df, palette = RColorBrewer::brewer.pal(9, "YlGnBu")) # change the # color scheme, using a predefined ColorBrewer palette. You can see all # available palettes by using this command: # `library(RColorBrewer); display.brewer.all()` vwReg(y ~ x, df, palette = grDevices::colorRampPalette(c("white","yellow", "green","red"))(20)) # use a custom-made palette vwReg(y ~ x, df, palette = grDevices::colorRampPalette(c("white","yellow", "green","red"), bias = 3)(20)) # use a custom-made palette, with the # parameter bias you can shift the color ramp to the “higher” colors vwReg(y ~ x, df, bw = TRUE) # black and white version vwReg(y ~ x, df, shade.alpha = 0, palette = grDevices::colorRampPalette( c("black","grey30","white"), bias = 4)(20)) # Milky-Way Plot vwReg(y ~ x, df, shade.alpha = 0, slices = 400, palette = grDevices::colorRampPalette(c("black","green","yellow","red"), bias = 5)(20), family = "symmetric") # Northern Light Plot/ fMRI plot vwReg(y ~ x, df, quantize = "SD") # 1-2-3-SD plot
# Prepare Data data("mtcars") df <- data.frame(x = mtcars$hp, y = mtcars$mpg) ## Visually Weighted Regression # Default vwReg(y ~ x, df) # Shade vwReg(y ~ x, df, shade = TRUE, show.lm = TRUE, show.CI = TRUE, quantize = "continuous") vwReg(y ~ x, df, shade = TRUE, show.lm = TRUE, show.CI = TRUE, quantize = "SD") # Spaghetti vwReg(y ~ x, df, shade = FALSE, spag = TRUE, show.lm = TRUE, show.CI = TRUE) vwReg(y ~ x, df, shade = FALSE, spag = TRUE) # Black/white vwReg(y ~ x, df, shade = TRUE, spag = FALSE, show.lm = TRUE, show.CI = TRUE, bw = TRUE, quantize = "continuous") vwReg(y ~ x, df, shade = TRUE, spag = FALSE, show.lm = TRUE, show.CI = TRUE, bw = TRUE, quantize = "SD") vwReg(y ~ x, df, shade = FALSE, spag = TRUE, show.lm = TRUE, show.CI = TRUE, bw = TRUE, quantize = "SD") # Change the bootstrap smoothing vwReg(y ~ x, df, family = "symmetric") # use an M-estimator for # bootstrap smoothers. Usually yields wider confidence intervals vwReg(y ~ x, df, span = 1.7) # increase the span of the smoothers vwReg(y ~ x, df, span = 0.5) # decrease the span of the smoothers # Change the color scheme vwReg(y ~ x, df, palette = viridisLite::viridis(4)) # viridis vwReg(y ~ x, df, palette = viridisLite::magma(4)) # magma vwReg(y ~ x, df, palette = RColorBrewer::brewer.pal(9, "YlGnBu")) # change the # color scheme, using a predefined ColorBrewer palette. You can see all # available palettes by using this command: # `library(RColorBrewer); display.brewer.all()` vwReg(y ~ x, df, palette = grDevices::colorRampPalette(c("white","yellow", "green","red"))(20)) # use a custom-made palette vwReg(y ~ x, df, palette = grDevices::colorRampPalette(c("white","yellow", "green","red"), bias = 3)(20)) # use a custom-made palette, with the # parameter bias you can shift the color ramp to the “higher” colors vwReg(y ~ x, df, bw = TRUE) # black and white version vwReg(y ~ x, df, shade.alpha = 0, palette = grDevices::colorRampPalette( c("black","grey30","white"), bias = 4)(20)) # Milky-Way Plot vwReg(y ~ x, df, shade.alpha = 0, slices = 400, palette = grDevices::colorRampPalette(c("black","green","yellow","red"), bias = 5)(20), family = "symmetric") # Northern Light Plot/ fMRI plot vwReg(y ~ x, df, quantize = "SD") # 1-2-3-SD plot
Write data to encrypted file.
write.aes(df, filename, key)
df |
Data to encrypt. |
filename |
Location where to save encrypted data. |
key |
Encryption key. |
Writes data to an encrypted file. To read data from an encrypted file, see read.aes.
A file with encrypted data.
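AES is a block cipher, so the data must be serialized to raw bytes and padded to a multiple of the 16-byte block size before encryption. A minimal sketch of that process using digest::AES (shown for illustration; not necessarily write.aes()'s exact implementation):

# Encrypt a string with a 16-byte (128-bit) AES key, padding the
# plaintext with zero bytes up to the block size.
library(digest)

key <- as.raw(sample(0:255, 16, replace = TRUE))
aes <- AES(key, mode = "ECB")

plaintext <- charToRaw("Insert My Credentials Here")
pad <- 16 - (length(plaintext) %% 16)
plaintext <- c(plaintext, as.raw(rep(0, pad)))

ciphertext <- aes$encrypt(plaintext)
writeBin(ciphertext, tempfile(fileext = ".bin")) # write encrypted bytes

# Decryption reverses the process with the same key; strip the padding
decrypted <- aes$decrypt(ciphertext, raw = TRUE)
rawToChar(decrypted[decrypted != as.raw(0)])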
Other encrypted:
read.aes()
# Location Where to Save Encryption Key on Local Computer
# (where only you should have access to it)
#encryptionKeyLocation <- file.path(getwd(), "/encryptionKey.RData",
#  fsep = "") #Can change to a different path, e.g.: "C:/Users/[USERNAME]/"

# Generate a Temporary File Path for Encryption Key
encryptionKeyLocation <- tempfile(fileext = ".RData")

# Generate Encryption Key
key <- as.raw(sample(1:16, 16))

# Save Encryption Key
save(key, file = encryptionKeyLocation)

# Specify Credentials
credentials <- "Insert My Credentials Here"

# Generate a Temporary File Path for Encrypted Credentials
encryptedCredentialsLocation <- tempfile(fileext = ".txt")

# Save Encrypted Credentials
#write.aes(
#  df = credentials,
#  filename = file.path(getwd(), "/encryptedCredentials.txt", fsep = ""),
#  key = key) #Change the file location to save this on the lab drive

write.aes(
  df = credentials,
  filename = encryptedCredentialsLocation,
  key = key)

rm(credentials)
rm(key)