When I run a principal components analysis (PCA) in R and then plot two of the PCs with ggplot, I would like the axis labels to automatically include which PC is on each axis and the percentage of variation it explains. Right now I have to change the labels manually whenever I switch to different PCs.
I have example code here (I've left out quite a bit of code that I believe has nothing to do with the question but please let me know if I'm mistaken):
# Example dataset SR
SR = structure(list(Site_ID = 1:6, A = c(0.102, 1.34, 0.875, 0.564,
0.075, 0.141), B = c(0.01, 0.05, 0.021, 0.018, 0.006, 0.144),
C = c(1.329, 2.029, 2.466, 6.648, 0.735, 2.49), D = c(0.025,
0.045, 0.039, 0.024, 0.045, 0.112), E = c(0.007, 0.001618893,
0.022, 0.018, 0.006, 0.035), F = c(17.52188, 27.412, 18.69,
118.8684, 9.7188, 2.9904)), class = "data.frame", row.names = c(NA,
-6L))
##### PCA calculation #############################################
SR.pca <- prcomp(SR, scale=TRUE, retx=TRUE)
PCAvalues <- summary(SR.pca)
#Example output
#Importance of components:
#                          PC1    PC2    PC3     PC4     PC5       PC6
#Standard deviation     2.3467 2.1712 1.8408 1.12707 1.05835 8.756e-16
#Proportion of Variance 0.3442 0.2946 0.2118 0.07939 0.07001 0.000e+00
#Cumulative Proportion  0.3442 0.6388 0.8506 0.92999 1.00000 1.000e+00
summ <- summary(SR.pca)$importance[2,]
#gives proportion of variance for each PC
summ
#Example output
#    PC1     PC2     PC3     PC4     PC5     PC6
#0.34420 0.29463 0.21178 0.07939 0.07001 0.00000
###### Graph the PCA ########
library(ggplot2)
ggplot(PCAvalues, aes(x = PC1, y = PC2)) +
  geom_text(data=PCAvalues, aes(x = PC1, y = PC2, label=Site_ID), size=2)+
  scale_color_gradient(low = "red", high = "blue") +
  coord_equal() +
  labs(color="Dist. Grad.")+
  theme_bw()
  # + labs(y = "PC2 (29.46%)", x = "PC1 (34.42%)")
Right now I have to manually change this last line of code every time I change which PCs I'm plotting. If it could somehow take the PC # (e.g. PC2) from ggplot(PCAvalues, aes(x = PC1, y = PC2)) and the % from summ for the label, that would be awesome.
Asked Nov 20, 2024 at 18:43 by Bridget Wheelock; edited Nov 20, 2024 at 19:16 by Friede.
2 Answers
I can't run your code because PCAvalues is a list, not a data frame, and ggplot() cannot use it. Here is an example of a function that takes a data frame, the names of two columns, and a named vector, and makes a plot from the data frame whose axes are labeled with the column names and the corresponding values in the named vector. I think this corresponds to your data.
library(ggplot2)
DF <- data.frame(PC1 = rnorm(10), PC2 = rnorm(10), PC3 = rnorm(10))
summ <- c(PC1 = 42.3, PC2 = 23.0, PC3 = 5.2)
plotFunc <- function(DATA, col1, col2, vec) {
  label1 = paste(col1, round(vec[col1],2), "%")
  label2 = paste(col2, round(vec[col2],2), "%")
  ggplot(DATA, aes(.data[[col1]], .data[[col2]])) + geom_point() +
    labs(x = label1, y = label2)
}
plotFunc(DF, "PC1","PC3", summ)
Created on 2024-11-20 with reprex v2.1.1
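To connect the function above to real prcomp() output, here is a minimal sketch. It assumes the scores are taken from the x element of the prcomp object and the proportions from the "Proportion of Variance" row of summary()$importance. The data set dat and the helper names pc_label() and plot_pcs() are made up for illustration and are not part of the question or of any package.

```r
library(ggplot2)

# Stand-in data (hypothetical values, not the asker's data set)
set.seed(1)
dat <- data.frame(A = rnorm(20), B = rnorm(20), C = rnorm(20), D = rnorm(20))

pca <- prcomp(dat, scale. = TRUE)

# Scores as a data frame: one row per observation, columns PC1, PC2, ...
scores <- as.data.frame(pca$x)

# Proportion of variance per component, a numeric vector named PC1, PC2, ...
prop_var <- summary(pca)$importance["Proportion of Variance", ]

# Hypothetical helper: turns a PC name into a label such as "PC2 (29.46%)"
pc_label <- function(pc, prop) sprintf("%s (%.2f%%)", pc, 100 * prop[[pc]])

# Hypothetical helper: the PC names drive both the aesthetics and the labels
plot_pcs <- function(scores, prop, xpc, ypc) {
  ggplot(scores, aes(.data[[xpc]], .data[[ypc]])) +
    geom_point() +
    labs(x = pc_label(xpc, prop), y = pc_label(ypc, prop))
}

p <- plot_pcs(scores, prop_var, "PC1", "PC3")
```

Switching components then only means changing the two strings in the call, e.g. plot_pcs(scores, prop_var, "PC2", "PC3").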
Here is a tweak using sprintf(), which I personally like.
SR = structure(
list(
Site_ID = 1:6,
A = c(0.102, 1.34, 0.875, 0.564, 0.075, 0.141),
B = c(0.01, 0.05, 0.021, 0.018, 0.006, 0.144),
C = c(1.329, 2.029, 2.466, 6.648, 0.735, 2.49),
D = c(0.025, 0.045, 0.039, 0.024, 0.045, 0.112),
E = c(0.007, 0.001618893, 0.022, 0.018, 0.006, 0.035),
F = c(17.52188, 27.412, 18.69, 118.8684, 9.7188, 2.9904)
),
class = "data.frame",
row.names = c(NA, -6L))
SR.pca = prcomp(SR, scale=TRUE, retx=TRUE)
summ = summary(SR.pca)$importance[2, ]
library(ggfortify)
#> Loading required package: ggplot2
library(ggplot2)
ggplot(SR.pca, aes(x=PC1, y=PC2)) +
  geom_text(aes(label=Site_ID), size=5) +
  coord_equal() +
  theme_bw() +
  labs(x=sprintf("PCA1 (%.2f%%)", 100*summ[1]),
       y=sprintf("PCA2 (%.2f%%)", 100*summ[2]))
It remains unclear to me what the lines concerning colour should do. Add them back in please. Currently, the plot is ... More natural to {ggplot2} is {scales}, have a look.
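As a rough illustration of the {scales} route, the sketch below formats the proportions with label_percent() instead of sprintf(). The prop_var values are copied from the example output in the question; the names fmt and axis_labels are only illustrative.

```r
library(scales)

# Proportions of variance copied from the question's example output
prop_var <- c(PC1 = 0.34420, PC2 = 0.29463, PC3 = 0.21178)

# label_percent() returns a formatting function; accuracy = 0.01 keeps two decimals
fmt <- label_percent(accuracy = 0.01)

# One ready-made axis label per component, looked up by PC name
axis_labels <- setNames(
  paste0(names(prop_var), " (", fmt(prop_var), ")"),
  names(prop_var)
)

axis_labels[["PC2"]]
```

The resulting strings can be passed to labs(), e.g. labs(x = axis_labels[["PC1"]], y = axis_labels[["PC2"]]).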
Alternatively, you could overwrite autoplot(), i.e.
library(ggfortify)
library(ggplot2)
autoplot(SR.pca) +
theme_bw()
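If the goal is only to switch components, a sketch along these lines may be enough. It assumes that the autoplot() method ggfortify provides for prcomp objects accepts x and y arguments for choosing the components and writes the variance percentages into the axis titles; please check this against the documentation of your installed version. The data set dat is a made-up stand-in.

```r
library(ggplot2)
library(ggfortify)

# Stand-in data (hypothetical values, not the asker's data set)
set.seed(1)
dat <- data.frame(A = rnorm(20), B = rnorm(20), C = rnorm(20), D = rnorm(20))
pca <- prcomp(dat, scale. = TRUE)

# Assumption: x and y pick the components shown on each axis, and the
# axis titles, including the percentages, are generated by autoplot()
p <- autoplot(pca, x = 1, y = 3) +
  theme_bw()
```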
paste. However, please correct your code so others can copy-paste and run without error(s). – Friede, commented Nov 20, 2024 at 19:01