
Data Visualization in R


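Knowledge a data scientist needs

The complete data analysis workflow

Observations, variables, and the data matrix

Variable types

Numerical + categorical

Relationships between variables

Different variable types call for different analysis methods

Characteristics and visualization of numerical variables

Measures of central tendency, measures of dispersion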
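As a minimal sketch of these two families of summaries, the snippet below computes a few of each for the Wind column of the built-in airquality data set (the same data used in the examples later in this article). The variable names are only illustrative.

```r
# Wind speed (mph) from the built-in airquality data set; this column has no missing values.
wind <- airquality$Wind

# Measures of central tendency
wind_mean   <- mean(wind)
wind_median <- median(wind)

# Measures of dispersion
wind_sd    <- sd(wind)      # standard deviation
wind_range <- range(wind)   # minimum and maximum
wind_iqr   <- IQR(wind)     # interquartile range (Q3 - Q1)

print(c(mean = wind_mean, median = wind_median, sd = wind_sd, IQR = wind_iqr))
```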

Robust statistics: how strongly a statistic is affected by extreme values
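A small illustration of robustness, using made-up numbers rather than data from this article: adding one extreme value moves the mean a long way, while the median barely changes.

```r
x     <- c(2, 3, 4, 5, 6)
x_out <- c(x, 100)          # the same values plus one extreme observation

mean(x)        # 4
median(x)      # 4

mean(x_out)    # 120 / 6 = 20, pulled strongly toward the extreme value
median(x_out)  # (4 + 5) / 2 = 4.5, almost unchanged
```

In this sense the median is a robust measure of central tendency and the mean is not.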

Visualizing a single variable: box plots, quartiles, and the definition of extreme values (outliers)
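The sketch below computes the quartiles of airquality$Wind and flags extreme values. It assumes the common 1.5 × IQR rule, which the article does not spell out but which boxplot() uses by default. Note that boxplot() works from hinges, so its whisker limits can differ slightly from limits based on quantile().

```r
wind <- airquality$Wind

# Quartiles: Q1, median, Q3
q   <- quantile(wind, c(0.25, 0.5, 0.75))
iqr <- IQR(wind)

# Assumed rule: values more than 1.5 * IQR beyond the quartiles count as extreme
lower_fence <- q[[1]] - 1.5 * iqr
upper_fence <- q[[3]] + 1.5 * iqr
outliers    <- wind[wind < lower_fence | wind > upper_fence]

print(q)
print(outliers)

# The box plot draws the same information: box = Q1..Q3, line = median
boxplot(wind, ylab = "Wind speed (mph)")
```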

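Characteristics and visualization of categorical variables

Relationship between two categorical variables

Contingency tables + relative frequency tables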
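A contingency table and its relative-frequency versions can be sketched with table() and prop.table(). Since airquality has no pair of categorical variables, this example assumes the built-in mtcars data set, which is not used elsewhere in the article, and treats cyl and am as categorical.

```r
# Counts for every combination of cylinder count and transmission type
tab <- table(cyl = mtcars$cyl, am = mtcars$am)
print(tab)

# Relative frequencies
rel_all <- prop.table(tab)               # each cell as a share of all observations
rel_row <- prop.table(tab, margin = 1)   # shares within each row (each cyl value)
rel_col <- prop.table(tab, margin = 2)   # shares within each column (each am value)

print(round(rel_row, 2))
```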

Relationship between two categorical variables (visualization): segmented bar plots, relative-frequency segmented bar plots
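Both kinds of segmented bar plot can be drawn with barplot() from a two-way table. This is a sketch that again assumes the built-in mtcars data set (not part of the original article), with am coded as 0 = automatic and 1 = manual.

```r
# Rows become the stacked segments, columns become the bars
tab <- table(am = mtcars$am, cyl = mtcars$cyl)

# Segmented bar plot of counts
barplot(tab, legend.text = c("automatic", "manual"),
        xlab = "Cylinders", ylab = "Count")

# Relative-frequency segmented bar plot: every bar is rescaled to sum to 1
rel_tab <- prop.table(tab, margin = 2)
barplot(rel_tab, legend.text = c("automatic", "manual"),
        xlab = "Cylinders", ylab = "Proportion")
```

The relative-frequency version makes it easier to compare the composition of bars that have very different totals.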

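Relationship between one categorical and one numerical variable: side-by-side box plots

Summary

The three plotting systems

Base plotting system, Lattice plotting system, ggplot2 plotting system

Base plotting system

Plotting functions -> the graphics package

hist / boxplot / points / lines / text / title / axis

histogram / box plot / points / lines / text / title / axes

plot

xlab / ylab / lwd / lty / pch / col: x-axis label / y-axis label / line width / line type / point symbol / color

Global parameters: bg / mar / las / mfrow / mfcol: background color / margins / axis label orientation / multi-panel layout filled by row / filled by column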
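Because these parameters are global, a changed setting stays in effect for every later plot on the same device. One common pattern, sketched here with arbitrary example values, is to save the current settings, change what is needed, and restore them afterwards.

```r
# Save all settable graphical parameters
old_par <- par(no.readonly = TRUE)

# Light background, tighter margins (bottom, left, top, right),
# horizontal axis labels, and a 1 x 2 grid filled by row
par(bg = "grey95", mar = c(4, 4, 2, 1), las = 1, mfrow = c(1, 2))

hist(airquality$Wind, main = "Wind", xlab = "Wind (mph)")
hist(airquality$Temp, main = "Temp", xlab = "Temperature (F)")

# Restore the previous settings so later plots are unaffected
par(old_par)
```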

Examples

    hist(airquality$Wind, xlab = "Wind")
    boxplot(airquality$Wind, xlab = "Wind", ylab = "speed(mph)")
    boxplot(Wind ~ Month, airquality, xlab = "Month", ylab = "speed(mph)")
    plot(airquality$Wind, airquality$Temp)
    with(airquality, plot(Wind, Temp))
    title(main = "Wind and Temp in NYC")
    # type = "n" sets up the axes without drawing any points
    with(airquality, plot(Wind, Temp, main = "Wind and Temp in NYC", type = "n"))
    with(subset(airquality, Month == 9), points(Wind, Temp, col = "red"))
    with(subset(airquality, Month == 5), points(Wind, Temp, col = "blue"))
    with(subset(airquality, Month == 8), points(Wind, Temp, col = "black"))
    with(subset(airquality, Month %in% c(6, 7, 8)), points(Wind, Temp, col = "black"))
    fit <- lm(Temp ~ Wind, airquality)
    abline(fit, lwd = 2)
    # Add a legend
    legend("topright", pch = 1, cex = 1, col = c("red", "blue", "black"), legend = c("Sep", "May", "Other"))

Global parameters

    par("bg")     # background color
    par("col")    # default plotting color
    par("mar")    # margins: (bottom, left, top, right)
    par("mfrow")
    par("mfcol")
    par(mfrow = c(1, 2))
    hist(airquality$Temp)
    hist(airquality$Wind)
    par(mfrow = c(1, 1))
    boxplot(airquality$Temp)

Lattice plotting system

Examples

    library(lattice)
    xyplot(Temp ~ Ozone, data = airquality)
    airquality$Month <- factor(airquality$Month)
    xyplot(Temp ~ Ozone | Month, data = airquality, layout = c(5, 1))
    q <- xyplot(Temp ~ Wind, data = airquality)
    print(q)
    # When using random numbers, always set a seed so the results can be reproduced and checked later
    set.seed(1)
    x <- rnorm(100)
    f <- rep(0:1, each = 50)
    y <- x + f - f * x + rnorm(100, sd = 0.5)
    f <- factor(f, labels = c("Group1", "Group2"))
    xyplot(y ~ x | f, layout = c(2, 1))
    xyplot(y ~ x | f, panel = function(x, y) {
      panel.xyplot(x, y)
      panel.abline(v = mean(x), h = mean(y), lty = 2)
      panel.lmline(x, y, col = "red")
    })

ggplot2 plotting system

Examples

    library(ggplot2)
    airquality$Month <- factor(airquality$Month)
    # Note: qplot() is deprecated as of ggplot2 3.4.0; the ggplot() calls further down are the current interface
    qplot(Wind, Temp, data = airquality, col = Month, shape = Month, size = Month,
          xlab = "Wind(mph)", ylab = "Temperature", main = "Wind vs.Temperature")
    qplot(Wind, Temp, data = airquality, geom = c("point", "smooth"))
    qplot(Wind, Temp, data = airquality, facets = Month ~ .)
    qplot(Wind, Temp, data = airquality, facets = . ~ Month)
    qplot(Wind, data = airquality, facets = . ~ Month)
    qplot(Wind, data = airquality, fill = Month)
    qplot(Wind, data = airquality, geom = "dotplot")

Using the ggplot function

    library(ggplot2)
    # alpha and size are fixed values, so they are set outside aes()
    # aes(group = 1) in geom_smooth() requests a single fit over all points instead of one per group
    ggplot(airquality, aes(Wind, Temp)) +
      geom_point(aes(color = factor(Month), group = 1), alpha = 0.4, size = 5) +
      geom_smooth(method = "lm", se = FALSE, aes(group = 1))
    ggplot(airquality, aes(Wind, Temp)) +
      geom_point() +
      geom_smooth(method = "lm", se = FALSE, aes(group = 1))
    library(RColorBrewer)
    myColors <- c(brewer.pal(5, "Dark2"), "black")
    ggplot(airquality, aes(Wind, Temp, col = factor(Month))) +
      geom_point() +
      geom_smooth(method = "lm", se = FALSE, aes(group = 1)) +
      scale_color_manual("Month", values = myColors) +
      facet_grid(. ~ Month) +
      theme_classic()

Colors in R plots

    library(RColorBrewer)
    # colorRamp() returns a function that maps values in [0, 1] to RGB values
    pal <- colorRamp(c("red", "blue"))
    pal(0)
    pal(1)
    pal(0.5)
    pal(seq(0, 1, length.out = 10))
    # colorRampPalette() returns a function whose argument is the number of colors to generate
    pal <- colorRampPalette(c("red", "blue"))
    pal(0)
    pal(1)
    pal(0.5)
    pal(10)
    brewer.pal.info
    cols <- brewer.pal(3, "Greens")
    cols
    pal <- colorRampPalette(cols)
    pal
    image(volcano, col = pal(20))
    display.brewer.pal(3, "Greens")

Graphics devices
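The article gives no example for graphics devices, so the following is only a minimal sketch of the usual pattern for a file device: open the device, draw, then close it. The file name and the width and height are hypothetical choices made for illustration.

```r
# Hypothetical output path inside the session's temporary directory
out_file <- file.path(tempdir(), "wind_hist.pdf")

# Open a PDF device; plotting commands now draw into the file instead of the screen
pdf(out_file, width = 6, height = 4)
hist(airquality$Wind, xlab = "Wind (mph)", main = "Wind speed")

# Close the device so the file is completely written
dev.off()

file.exists(out_file)
```

Other file devices from the grDevices package, such as png() and svg(), follow the same open, draw, close pattern.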

This article comes from the internet and does not represent the views of Xnewv. Please credit the source when reposting: https://xnewv.com/2157.html
