Rä¸è¿æä¸ä¸ªæ趣çsqldfå
ï¼å®å¯ä»¥è®©ä½ ç¨SQLæ¥æä½dataframeï¼è¿ç§åè½è½è®©ä¼Rç人è½ç»ä¹ SQLï¼ä¼SQLç人è½ç»ä¹ Rï¼ä¸å¾ä¸æå¹Rè¯è¨ç强大æè½åç¥å¥é
åã
å½ç¶ä¹å¯ä»¥å°Rä¸å¤é¨æ°æ®åºè¿æ¥ï¼ç´æ¥å¨Rä¸æä½æ°æ®åºï¼å¹¶çææç»ç»æï¼è¿ä¹æ¯ä¸ç§å¯è¡çæ¹æ³ãå¨Rä¸è¿æ¥æ°æ®åºéè¦å®è£
å
¶å®çæ©å±å
ï¼æ ¹æ®è¿æ¥æ¹å¼ä¸åæ们æ两ç§éæ©ï¼ä¸ç§æ¯ODBCæ¹å¼ï¼éè¦å®è£
RODBCå
并å®è£
ODBC驱å¨ãå¦ä¸ç§æ¯DBIæ¹å¼ï¼å¯ä»¥æ ¹æ®å·²ç»å®è£
çæ°æ®åºç±»åæ¥å®è£
ç¸åºç驱å¨ãå 为åè
ä¿çäºåæ°æ®åºåæ¬çç¹æ§ï¼æ以个人æ¯è¾å好ç¨DBIè¿æ¥æ¹å¼ãæä¸é¢è¿å ç§ä¸»è¦çå
æä¾äºDBIè¿æ¥ï¼RMySQLï¼RSQLiteï¼ROracleï¼RPostgreSQLãç±ååçå¾åºå®ä»¬åå«å¯¹åºäºå ç§ä¸»æµçæ°æ®åºã
注ï¼ä»¥ä¸æåæ¥èªR-bloggerââãå¨Rè¯è¨ä¸ä½¿ç¨SQLã
å¼å¾ä¸æçæ¯å¨è¿ç¨sqldfå
æ¶å¯è½è¦å¯¹R软件å级å°3.1.0ç以ä¸ï¼å¦åå®è£
ä¸ä¼æåï¼è¿éç®åä»ç»ä¸ç§å¨Windowsç¯å¢ä¸å¯¹R软件èªå¨å级çæ¹æ³ã
Windowsä¸å级Rè¯è¨ï¼
install.packages("installr");
library(installr) #å è½½å
updateR()
ä¹åéæ©âç¡®å®âå°±å¯ä»¥èªå¨æ´æ°äº
è½ç¶å¨Rè¯è¨ä¸æå¾å¤ä¼ç§çå½æ°å¦aggregate()ådaply()å¯ä»¥å¯¹æ°æ®æ¡ç»è®¡ï¼ä½sqlåè½å¼ºå¤§ï¼ä¸ä»
è½å®ç°æ°æ®çæ¸
æ´ãç»è®¡ãè¿ç®,è¿å¯ä»¥å®ç°æ°æ®åå¨ãæ§å¶ãå®ä¹åè°ç¨ãè¶æ¥è¶å¤çå
¬å¸å¨æèçæ¶åé½è¦æ±æ°æ®åæå¸é¤äºè¦ææ¡ç»è®¡å»ºæ¨¡åæ°æ®ææçç论æ¹æ³åç¼ç¨è½åå¤ï¼è¿è¦æ±å
¶å
·å¤ä½¿ç¨sqlçè½åï¼æ¥¼ä¸»ä¹å¨ç§¯æå¦ä¹ sqlå½ä¸ã
ä¸é¢æ¯æå©ç¨sqldf()å®æçä¸äºç¨åº,è½ç¶ç®åï¼ä½è¿æ¯å¦ä¹ ç第ä¸æ¥
> name<-c(rep('å¼ ä¸',1,3),rep('æå',3))
> subject<-c('æ°å¦','è¯æ','è±è¯','æ°å¦','è¯æ','è±è¯')
> score<-c(89,80,70,90,70,80)
> stuid<-c(1,1,1,2,2,2)
> stuscore<-data.frame(name,subject,score,stuid)
> library(sqldf)
1.计ç®æ¯ä¸ªäººçæ»æ绩并æå(è¦æ±æ¾ç¤ºå段ï¼å§åï¼æ»æ绩)
> sqldf('select name,sum(score) as allscore from stuscore group by name order by allscore')
name allscore
1 å¼ ä¸ 239
2 æå 240
2.计ç®æ¯ä¸ªäººçæ»æ绩并æå(è¦æ±æ¾ç¤ºå段: å¦å·ï¼å§åï¼æ»æ绩)
> sqldf('select name,stuid,sum(score) as allscore from stuscore group by name order by allscore')
name stuid allscore
1 å¼ ä¸ 1 239
2 æå 2 240
3.计ç®æ¯ä¸ªäººåç§çæé«æ绩(è¦æ±æ¾ç¤ºå段: å¦å·ï¼å§åï¼è¯¾ç¨ï¼æé«æ绩)
> sqldf('select stuid,name,subject, max(score) as maxscore from stuscore group by stuid')
stuid name subject maxscore
1 1 å¼ ä¸ æ°å¦ 89
2 2 æå æ°å¦ 90
4.计ç®æ¯ä¸ªäººçå¹³åæ绩ï¼è¦æ±æ¾ç¤ºå段: å¦å·ï¼å§åï¼å¹³åæ绩ï¼
> sqldf('select stuid,name,subject, avg(score) as avgscore from stuscore group by stuid')
stuid name subject avgscore
1 1 å¼ ä¸ è±è¯ 79.66667
2 2 æå è±è¯ 80.00000
5.ååºåé¨è¯¾ç¨æ绩æ好çå¦ç(è¦æ±æ¾ç¤ºå段: å¦å·ï¼å§å,ç§ç®ï¼æ绩)
> sqldf('select stuid,name,subject,max(score) as maxscore from stuscore group by subject order by stuid')
stuid name subject maxscore
1 1 å¼ ä¸ è¯æ 80
2 2 æå æ°å¦ 90
3 2 æå è±è¯ 80
8ï¼ååºåé¨è¯¾ç¨çå¹³åæ绩ï¼è¦æ±æ¾ç¤ºå段ï¼è¯¾ç¨ï¼å¹³åæ绩ï¼
> sqldf('select subject,avg(score) as avgscore from stuscore group by subject')
subject avgscore
1 æ°å¦ 89.5
2 è±è¯ 75.0
3 è¯æ 75.0
温馨提示:答案为网友推荐,仅供参考