RHive example code
Seonghak (Aiden) Hong edited this page Jun 20, 2013 · 1 revision
- Example 1
how to connect and execute HQL (HiveQL)
rhive.connect("10.10.10.1") # 10.10.10.1 is the IP address of hiveserver
rhive.query("select * from emp")
- Example 2
how to use an RUDF as a scalar function and export R objects
coefficient <- 1.1
scoring <- function(sal) {
  coefficient * sal
}
rhive.assign('coefficient',coefficient)
rhive.assign('scoring',scoring)
rhive.exportAll('scoring')
rhive.query("select R('scoring',col_sal,0.0) from emp")
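The scoring function reads `coefficient` as a free variable, which is presumably why the example assigns both objects before exporting. As a minimal local sketch (no Hive connection; the `salaries` vector is hypothetical sample data, not taken from the `emp` table), the per-row computation can be checked in plain R:

```r
# Local check only: applies the scalar function to a few sample values.
coefficient <- 1.1

scoring <- function(sal) {
  # Relies on `coefficient` being visible in the enclosing environment.
  coefficient * sal
}

# Hypothetical sample salaries standing in for a table column.
salaries <- c(1000, 2500, 4000)

# One output value per input value, as a scalar function would produce per row.
scores <- vapply(salaries, scoring, numeric(1))
print(scores)
```

Running the function locally first is a cheap way to catch a missing dependency such as `coefficient` before the function is exported.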
- Example 3
how to use an RUDAF as an aggregation function
hsum <- function(prev,sal) {
  if(is.null(prev)) {
    sal <- c(0)
    return(sal)
  }
  return(prev + sal)
}
hsum.partial <- function(agg_sal) {
  agg_sal
}
hsum.merge <- function(prev, agg_sal) {
  if(is.null(prev)) agg_sal else prev + agg_sal
}
hsum.terminate <- function(agg_sal) {
  return(agg_sal)
}
rhive.assign('hsum',hsum)
rhive.assign('hsum.partial',hsum.partial)
rhive.assign('hsum.merge',hsum.merge)
rhive.assign('hsum.terminate',hsum.terminate)
rhive.exportAll('hsum')
rhive.query("select RA('hsum',sal) from emp group by empno")
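The four functions in Example 3 follow an iterate / partial / merge / terminate pattern. The sketch below imitates that lifecycle in plain R, without Hive, to show how the pieces might fit together. It rests on assumptions: `simulate_udaf` and the `sum_*` functions are hypothetical names invented for this illustration, the calling order is modeled on a generic aggregation lifecycle and is not a statement about how RHive invokes `hsum`, and the iterate step here starts from the first value it sees when `prev` is NULL.

```r
# Hypothetical local driver: mimics an iterate/partial/merge/terminate
# lifecycle in plain R. It is not part of RHive.
simulate_udaf <- function(iterate, partial, merge, terminate, partitions) {
  # Map side: fold each partition's rows into one partial state.
  partials <- lapply(partitions, function(rows) {
    state <- NULL
    for (value in rows) {
      state <- iterate(state, value)
    }
    partial(state)
  })
  # Reduce side: merge the partial states, then produce the final value.
  merged <- NULL
  for (p in partials) {
    merged <- merge(merged, p)
  }
  terminate(merged)
}

# Illustrative sum aggregation (assumed semantics, see the note above).
sum_iterate   <- function(prev, sal) if (is.null(prev)) sal else prev + sal
sum_partial   <- function(agg_sal) agg_sal
sum_merge     <- function(prev, agg_sal) if (is.null(prev)) agg_sal else prev + agg_sal
sum_terminate <- function(agg_sal) agg_sal

# Hypothetical data: three non-empty partitions of salary values.
partitions <- list(c(100, 200), c(300), c(400, 500))

total <- simulate_udaf(sum_iterate, sum_partial, sum_merge, sum_terminate, partitions)
print(total)
```

Splitting the work into partial states that are merged later is what lets an aggregation run on several nodes before the results are combined.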
- Example 4
how to manipulate a Hive table using the RHive API
emp <- rhive.desc.table('emp')
colnames(emp)
- Example 5
how to use a custom Map/Reduce script in RHive
map <- function(key,value) {
  if(is.null(value)) {
    put(NA,1)
  }
  lapply(value, function(column) {
    lapply(strsplit(x=column, split=" ")[[1]], function(word) put(word,1))
  })
}
reduce <- function(key, values) {
  put(key, sum(as.numeric(values)))
}
rhive.mrapply("emp", map, reduce, c("ename", "position"), c("word", "one"), by="word", c("word", "one"), c("word", "count"))
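The map and reduce functions in Example 5 amount to a word count. A rough local sketch of that logic, runnable without Hive, is given below. Several things here are assumptions: the `put` defined in the sketch is a hypothetical stand-in that only records pairs in memory, the `rows` vector is made-up sample data, each value is assumed to arrive as a space-separated string, and the grouping step is a plain-R substitute for whatever shuffling RHive performs.

```r
# Collects emitted key/value pairs; a stand-in for put(), not RHive's own.
emitted <- new.env()
assign("keys", character(0), envir = emitted)
assign("values", numeric(0), envir = emitted)

put <- function(key, value) {
  assign("keys", c(get("keys", envir = emitted), as.character(key)), envir = emitted)
  assign("values", c(get("values", envir = emitted), as.numeric(value)), envir = emitted)
  invisible(NULL)
}

# Same map logic as in Example 5: emit (word, 1) for every word.
map <- function(key, value) {
  if (is.null(value)) {
    put(NA, 1)
  }
  lapply(value, function(column) {
    lapply(strsplit(x = column, split = " ")[[1]], function(word) put(word, 1))
  })
  invisible(NULL)
}

# Hypothetical input rows, each a space-separated string.
rows <- c("SMITH CLERK", "ALLEN SALESMAN", "WARD SALESMAN")
map(NULL, rows)

# Local substitute for the shuffle: group the emitted values by key.
grouped <- split(get("values", envir = emitted), get("keys", envir = emitted))

# Same arithmetic as the reduce body: sum the values for each key.
counts <- vapply(grouped, function(values) sum(as.numeric(values)), numeric(1))
print(counts)
```

Testing the word-splitting and summing locally like this can help separate logic errors from cluster or configuration problems.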
- Example 6
how to export an R data object to Hive
tablename <- rhive.write.table(USArrests)
rhive.desc.table(tablename)
rhive.load.table(tablename)