121_Method_Power Query: R.Execute with read.xlsx & ODBC

焦棚子


2020-03-03 11:45:14


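I. Problem

When Power Query (pq) uses Excel.Workbook to read some files saved in the legacy Excel format (.xls extension), it fails with the error: DataFormat.Error: External table is not in the expected format.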


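II. Solutions

Option 1

If there are only a few files, save each one as .xlsx and read it with Excel.Workbook. Batch xls-to-xlsx conversion tools also exist (search for one yourself).

Option 2

If the file format must stay unchanged, R.Execute can call an R script to read the files, which is also very simple.

Case 1: a single file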

let
    源 = R.Execute(
        "library(xlsx)
        data <- read.xlsx(file = 'C:\\Users\\pyj\\Desktop\\test\\demo1.xls',1, startRow=5,colIndex=c(1),header = FALSE,encoding = 'UTF-8')")
in
    源
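
R.Execute returns a table with one row per data frame defined in the script, so the query above still needs one navigation step to reach the actual rows. A minimal sketch, assuming the script defines a data frame named data as in the example (the step name Data is only illustrative):

```powerquery
let
    源 = R.Execute(
        "library(xlsx)
        data <- read.xlsx(file = 'C:\\Users\\pyj\\Desktop\\test\\demo1.xls', 1, startRow = 5, colIndex = c(1), header = FALSE, encoding = 'UTF-8')"),
    // R.Execute yields a table with Name and Value columns;
    // pick the row whose Name is "data" and take its nested table
    Data = 源{[Name = "data"]}[Value]
in
    Data
```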

Case 2: multiple files

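let
    源 = R.Execute(
        "library(xlsx)
        # Set the folder path
        setwd('C:\\Users\\pyj\\Desktop\\test')
        # dir() lists every file in the folder, so it should hold only the workbooks to read
        filenames <- dir()
        # Empty data frame that collects the results
        data <- data.frame()
        # Loop over the files
        for (i in filenames) {
            path <- paste0(getwd(), '\\', i)
            # Read each file and append its rows
            data <- rbind(data, read.xlsx(file = path, 1, startRow = 5, colIndex = c(1), header = FALSE, encoding = 'UTF-8'))
        }"
    )
in
    源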

III. Summary

1. Install the R xlsx package;

2. Pay attention to how the read.xlsx parameters are used;

# Official help
read.xlsx(file, sheetIndex, sheetName=NULL, rowIndex=NULL,
  startRow=NULL, endRow=NULL, colIndex=NULL,
  as.data.frame=TRUE, header=TRUE, colClasses=NA,
  keepFormulas=FALSE, encoding="unknown", password=NULL, ...)

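3. Note: rowIndex and colIndex accept vectors that select specific rows and columns, e.g. c(1,3,4) or c(5:10); the vector text can also be assembled from a list in pq.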

4. One more approach, reading the file via ODBC, is added as a supplement.
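
Item 3 mentions assembling rowIndex or colIndex from a pq list. One way this could be done is to turn the list into the text of an R vector and concatenate it into the script. This is only a sketch under assumptions: the names Cols, ColIndex and Script are hypothetical, and the file path is the one from Case 1.

```powerquery
let
    // Hypothetical list of column numbers to read
    Cols = {1, 3, 4},
    // Turn the M list into R vector text: "c(1,3,4)"
    ColIndex = "c(" & Text.Combine(List.Transform(Cols, Text.From), ",") & ")",
    // Splice the vector text into the R script
    Script = "library(xlsx)
        data <- read.xlsx(file = 'C:\\Users\\pyj\\Desktop\\test\\demo1.xls', 1, startRow = 5, colIndex = " & ColIndex & ", header = FALSE, encoding = 'UTF-8')",
    源 = R.Execute(Script)
in
    源
```

The same pattern should apply to rowIndex: build the list in M, convert it to text, and insert it where the R vector is expected.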

by 焦棚子
