Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
upload files
0 parents
commit 22adfe8
Showing 95 changed files with 71,779 additions and 0 deletions.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
base | ||
BBmisc | ||
devtools | ||
stringr | ||
stringi | ||
reshape | ||
reshape2 | ||
data.table | ||
plyr | ||
dplyr | ||
magrittr | ||
foreach | ||
iterators | ||
doParallel | ||
knitr | ||
rmarkdown | ||
tidyr | ||
gtable | ||
gridExtra | ||
pander | ||
stringdist | ||
slidify | ||
RColorBrewer | ||
leaflet | ||
installr | ||
plot3D | ||
markdown | ||
broman | ||
foreign |
Binary file not shown.
Empty file.
18,205 changes: 18,205 additions & 0 deletions
.cache/read-datasetB_6da8ccf3a6b94806526e2dce4d84e550.rdb
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# History files | ||
.Rhistory | ||
.Rapp.history | ||
|
||
# Example code in package build process | ||
*-Ex.R | ||
|
||
# RStudio files | ||
.Rproj.user/ | ||
|
||
# produced vignettes | ||
vignettes/*.html | ||
vignettes/*.pdf | ||
.Rproj.user |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
Version: 1.0 | ||
|
||
RestoreWorkspace: Default | ||
SaveWorkspace: Default | ||
AlwaysSaveHistory: Default | ||
|
||
EnableCodeIndexing: Yes | ||
UseSpacesForTab: Yes | ||
NumSpacesForTab: 2 | ||
Encoding: UTF-8 | ||
|
||
RnwWeave: Sweave | ||
LaTeX: pdfLaTeX |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
base | ||
methods | ||
datasets | ||
utils | ||
grDevices | ||
graphics | ||
stats | ||
scimapClient | ||
BBmisc | ||
devtools | ||
stringr | ||
stringi | ||
reshape | ||
reshape2 | ||
data.table | ||
DT | ||
plyr | ||
dplyr | ||
magrittr | ||
foreach | ||
iterators | ||
parallel | ||
doParallel | ||
rmarkdown | ||
tidyr | ||
grid | ||
gtable | ||
gridExtra | ||
pander | ||
stringdist | ||
knitr |
Binary file added
BIN
+2.82 KB
Natural_Language_Analysis_cache/html/matching-01_fbddb581e18e5d961a0e875bc7293d24.RData
Binary file added
BIN
+395 KB
Natural_Language_Analysis_cache/html/matching-01_fbddb581e18e5d961a0e875bc7293d24.rdb
Binary file added
BIN
+184 Bytes
Natural_Language_Analysis_cache/html/matching-01_fbddb581e18e5d961a0e875bc7293d24.rdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
## Loading the packages | ||
if(!'BBmisc' %in% installed.packages()){ | ||
install.packages('BBmisc')} | ||
if(!'BiocParallel' %in% installed.packages()){ | ||
source("http://bioconductor.org/biocLite.R") | ||
biocLite("BiocParallel")} | ||
if(!'seleniumJars' %in% installed.packages()){ | ||
devtools::install_github('LluisRamon/seleniumJars')} | ||
|
||
suppressPackageStartupMessages(library('BBmisc')) | ||
suppressPackageStartupMessages(lib(c('zoo','stringi','stringr','reshape','reshape2','plyr','dplyr','magrittr', | ||
'ggplot2','ggthemes','plotly','foreach','memoise','doMC','doParallel','BiocParallel', | ||
'markdown','parallel','rmarkdown','manipulate','knitr','turner','scales', | ||
'lubridate','whisker'))) #'RStudioAMI','editR' | ||
|
||
## --------------------------------------------------------------------------------------------- | ||
## http://stackoverflow.com/questions/22954623/view-markdown-generated-html-in-rstudio-viewer | ||
render(paste0(getwd(),'/Betting Strategy and Model Validation.Rmd'),'all') | ||
#'@ View(paste0(getwd(),'/Betting_Strategy_and_Model_Validation.html')) | ||
browseURL(paste0(getwd(),'/Betting Strategy and Model Validation.html')) | ||
|
||
## https://github.com/swarm-lab/editR | ||
editR(paste0(getwd(),'/Betting Strategy and Model Validation.Rmd')) | ||
|
||
|
||
## Besides, need to scrape the final scores / half-time scores / results of soccer matches | ||
teamID <- sort(unique(c(as.character(mbase$Home), as.character(mbase$Away)))) | ||
dateID <- sort(unique(mbase$Date)); spboDate <- gsub('-','',dateID) | ||
lnk <- paste0('http://www8.spbo.com/history.plex?day=',spboDate,'&l=en') | ||
|
||
## http://stackoverflow.com/questions/2158780/r-catching-an-error-and-then-branching-logic | ||
## http://www.win-vector.com/blog/2012/10/error-handling-in-r/ | ||
## Because the scrapSPBO function scraped mismatched data (for example lnk[827]), | ||
## I rewrote the function as scrapSPBO2 | ||
source(paste0(getwd(),'/function/scrapSPBO2.R')) | ||
scrapSPBO2(lnk=lnk, dateID=dateID, path='livescore', parallel=FALSE) | ||
|
||
## Read scraped spbo datasets | ||
source(paste0(getwd(),'/function/readSPBO.R')) | ||
spboData <- readSPBO(dateID=dateID, path='livescore', parallel=FALSE) | ||
|
||
|
||
|
||
## https://github.com/pablobarbera/instaR | ||
## https://github.com/pablobarbera/Rfacebook | ||
## install_github: can try during free time | ||
## --------------------------------------------------------------------------------------------- | ||
## Load the scraped spbo livescore datasets. | ||
##... will take some time since dim spboData [156841 x 17] | ||
source(paste0(getwd(),'/function/readSPBO2.R')) | ||
spboData <- readSPBO2(dateID=dateID, parallel=TRUE) | ||
|
||
## filter spboTeamID | ||
spboTeamID <- sort(c(unique(as.vector(spboData$Home)),unique(as.vector(spboData$Away)))) | ||
tmID <- teamID[!teamID %in% mbase$others] | ||
|
||
spboData[(is.na(spboData$Date))&(nchar(as.vector(spboData$Time))==5),] | ||
spboData[subset(spboData, (is.na(data.frame(spboData)$Date))&(nchar(as.vector(spboData$Time))==5))$X,] | ||
|
||
## > dim(mbase$datasets) | ||
## [1] 48744 17 | ||
## > dim(spboData) | ||
## [1] 319744 20 | ||
|
||
mbase$datasets[mbase$datasets$DateUK %in% spboData$DateUK,] | ||
#Source: local data frame [17,934 x 17] | ||
|
||
na.omit(mbase$datasets[mbase$datasets$DateUK %in% spboData$DateUK,][order(mbase$datasets$No,decreasing=FALSE),]) | ||
#Source: local data frame [25,489 x 17] | ||
|
||
library('tau') | ||
library('textcat') | ||
library('stringdist') | ||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
## http://wizardofvegas.com/forum/gambling/sports/10555-halt-time-betting/3/ | ||
## http://quant.stackexchange.com/questions/2500/how-to-apply-the-kelly-criterion-when-expected-return-may-be-negative | ||
## https://en.wikipedia.org/wiki/Gambling_and_information_theory | ||
## http://www.eecs.harvard.edu/cs286r/courses/fall12/papers/Thorpe_KellyCriterion2007.pdf | ||
## http://www.sportsbookreview.com/betting-tools/kelly-calculator/ | ||
## http://thestakingmachine.com/laykelly.php | ||
### http://www.sportsbettingcalculator.co.uk/kelly-staking-calculator/ | ||
## http://tipstertables.com/blog/betting-system-using-tipster-statistics-and-kelly-criterion | ||
######################################################################################## | ||
|
||
## Scrape the League in order to assign the vigorish/spread margins/overrounds | ||
library(RSelenium) | ||
teamID <- sort(unique(c(unlist(mbase$Home), unlist(mbase$Away)))) | ||
lnk <- 'http://www8.spbo.com/history.plex?day=20110107&l=en' | ||
|
||
#'@ system('java -jar selenium-server-standalone.jar') | ||
checkForServer() ## if you need the stand-alone Java binary | ||
startServer() | ||
webDr <- remoteDriver$new() | ||
webDr$open() | ||
webDr$navigate(lnk) | ||
webDr$navigate("http://www.bbc.co.uk") | ||
webDr$goBack() | ||
webDr$goForward() | ||
webDr$quit() | ||
|
||
## https://github.com/greenore/RSeleniumUtilities | ||
library(RSeleniumUtilities) | ||
RSeleniumUtilities::checkSelenium() | ||
webDr <- ieDriver() | ||
webDr <- firefoxDriver(use_profile=TRUE, profile_name="selenium") | ||
webDr <- chromeDriver(use_profile=TRUE, profile_name="selenium", internal_testing=TRUE) | ||
|
||
|
||
## Linear regression | ||
llply(split(mbase,mbase$Sess),function(x)lm(PL~Selection+HCap+Price,x)) | ||
|
||
|
||
#'@ stopCluster(cl) | ||
|
||
x <- seq(as.Date('2011-01-01'), as.Date('2015-07-31'), by='months') | ||
y <- seq(min(mbase$PL),max(mbase$Stake), by=10000) | ||
labels <- date_format('%b')(x) | ||
breaks <- as.Date(sort(c(as.POSIXct(x), as.POSIXct(seq(min(mbase$Date), | ||
max(mbase$Date), by='months')), ymd('2015-08-01')))) | ||
labels <- c('', as.vector(rbind(labels, rep('', length(labels))))) | ||
|
||
ggplot(data=mbase, aes(x=x, y=y, shape=AHOU)) + | ||
geom_line(aes(y = Stake, colour = 'Stake'), size=1.5) + | ||
geom_line(aes(y = PL, colour = 'PL'), size=1.5) + | ||
geom_point(size=2, fill='blue') + expand_limits(y=0) + ## Set y range to include 0 | ||
scale_colour_hue(name='PL', l=30) + ## Set legend title use darker colors (lightness=30) | ||
scale_shape_manual(name='PL', values=c(22,21)) + ## Use points with a fill color | ||
xlab('Time of Day') + ylab('HK Dollars (HKD)') + | ||
scale_x_date(labels = labels, breaks = breaks, limits=range(breaks)) + ## scale_x_date(labels = date_format("%b"),breaks = date_breaks("months")) + | ||
ggtitle('Stakes and Profit & Loss') + ## Set title | ||
theme_bw() + theme(legend.position=c(.7, .4)) ## Position legend inside; this must go after theme_bw | ||
|
||
qplot(Stake, data=mbase, geom='density', fill=AHOU, alpha=I(.5), | ||
main='Turnover and P&L', xlab='Year in Month', | ||
ylab='HKD Amount') + scale_x_date(breaks=date_breaks('months'), labels = date_format("%b")) | ||
|
||
### http://statisticalrecipes.blogspot.com/2012/02/simulating-genetic-drift.html | ||
dtm <- factor(sapply(strsplit(as.character(mbase$Date),'-'),function(x) x[2])) | ||
dtm <- data.frame(month=mapvalues(dtm, sort(levels(dtm)),c('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec')), | ||
mbase$Stake/10000, mbase$PL/10000); names(dtm) <- c('Month','Stake','PL') | ||
sdata <- data.frame(Date=factor(paste0(dtm$Month,'-',mbase$Sess)),dtm[-1]); rm(dtm) | ||
sdata <- ddply(sdata, .(Date), summarise, Stake=sum(Stake), PL=sum(PL)) | ||
sdata[order(sdata$Date, decreasing=FALSE),] | ||
|
||
## plot on same grid, each series colored differently -- | ||
## good if the series have same scale | ||
ggplot(sim_data, aes(Month,'HKD 0000')) + geom_line(aes(colour = Series)) + | ||
scale_x_discrete(labels=c('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'))+ | ||
theme(axis.text.x=element_text(face="bold",colour="red",size=14)) | ||
|
||
## ================================================================================================================================== | ||
## http://wenku.baidu.com/view/3574f639580216fc700afdfc.html | ||
## https://stat.ethz.ch/R-manual/R-devel/library/mgcv/html/gam.models.html | ||
## http://doc.qkzz.net/article/e6f33685-e220-4803-8c89-3228501b9412.htm | ||
|
||
|
||
|
||
|
||
|
||
|
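The script above notes that the original `scrapSPBO` returned mismatched data for some days (for example `lnk[827]`) and links to two articles on error handling in R. The following is only a minimal sketch of that pattern, not the author's `scrapSPBO2`, which is not shown in this commit. `fetch_day()` and `safe_fetch()` are hypothetical names, the failing date is invented for illustration, and no network call is made.

```r
## Hypothetical stand-in for a per-day scraper (the real scrapSPBO2 is not shown).
## It fails on purpose for one date so that the error branch is exercised.
fetch_day <- function(day) {
  if (day == "20110103") stop("unexpected table layout for ", day)
  data.frame(Date = day, Status = "ok", stringsAsFactors = FALSE)
}

## Wrap each call in tryCatch() so that one bad day does not abort the whole run.
safe_fetch <- function(day) {
  tryCatch(
    list(day = day, data = fetch_day(day), error = NA_character_),
    error = function(e) list(day = day, data = NULL, error = conditionMessage(e))
  )
}

days    <- c("20110101", "20110102", "20110103", "20110104")
results <- lapply(days, safe_fetch)

## Separate the days that worked from the ones that need a second look.
ok      <- Filter(function(r) is.na(r$error), results)
failed  <- Filter(function(r) !is.na(r$error), results)
scraped <- do.call(rbind, lapply(ok, function(r) r$data))

failed_days <- vapply(failed, function(r) r$day, character(1))
print(failed_days)
print(scraped)
```

Collecting the failed days instead of stopping at the first error makes it possible to rerun only those days later, which is one plausible way to deal with problem pages such as `lnk[827]`.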
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Betting-Strategy-and-Model-Validation | ||
Betting Strategy and Model Validation: I analyse the staking model of a sportsbook agency which follows the bets of consultancy firm A |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
--- | ||
title: "Testing efficiency of coding" | ||
author: "Ryo®, Eng Lian Hu" | ||
date: "8/28/2015" | ||
output: | ||
html_document: | ||
fig_height: 3 | ||
fig_width: 5 | ||
highlight: haddock | ||
theme: cerulean | ||
toc: yes | ||
--- | ||
|
||
This page tests the efficiency of the code used for the research on `Betting Strategy and Model Validation`. | ||
|
||
```{r load-packages} | ||
## Loading the packages | ||
if(!'devtools' %in% installed.packages()){ | ||
install.packages('devtools')} | ||
if(!'BBmisc' %in% installed.packages()){ | ||
install.packages('BBmisc')} | ||
suppressPackageStartupMessages(library('BBmisc')) | ||
pkgs <- c('devtools','RStudioAMI','zoo','chron','stringr','stringi','reshape','reshape2','data.table','sparkline','DT','plyr','dplyr','magrittr','parallel','foreach','memoise','manipulate','ggplot2','ggthemes','proto','extrafont','directlabels','PerformanceAnalytics','plotly','doMC','doParallel','BiocParallel','rvest','RSelenium','highlightHTML','knitr','rmarkdown','editR','scales','lubridate','tidyr','whisker','gtable','grid','gridExtra') | ||
suppressAll(lib(pkgs)); rm(pkgs) | ||
``` | ||
|
||
```{r get-data-summary-table-2.1} | ||
nrow(do.call(rbind, llply(as.list(seq(2011,2015)), function(x) data.frame(Sess=x,read.csv(paste0(getwd(),'/datasets/',x,'.csv'))),.parallel=TRUE))) | ||
nrow(bind_rows(llply(as.list(seq(2011,2015)), function(x) data.frame(Sess=x,read.csv(paste0(getwd(),'/datasets/',x,'.csv'))),.parallel=TRUE))) | ||
system.time(do.call(rbind, llply(as.list(seq(2011,2015)), function(x) data.frame(Sess=x,read.csv(paste0(getwd(),'/datasets/',x,'.csv'))),.parallel=TRUE))) | ||
system.time(bind_rows(llply(as.list(seq(2011,2015)), function(x) data.frame(Sess=x,read.csv(paste0(getwd(),'/datasets/',x,'.csv'))),.parallel=TRUE))) | ||
``` | ||
|
||
The next chunk compares two ways of merging several data frames (both calls are currently commented out): | ||
|
||
```{r merge_all-dataframes-2.2} | ||
#'@ system.time(Reduce(function(x,y) {merge(x,y,all=TRUE)}, llply(list(df1,df1.sps,df1.pst),function(x) x[[1]]))) | ||
#'@ system.time(merge_all(list(df1[[1]],df1.sps[[1]],df1.pst[[1]]))) | ||
``` | ||
|
||
The same comparison for two data frames, `socData` and `othData` (also commented out): | ||
|
||
```{r} | ||
#'@ system.time(merge(socData, othData, all=TRUE)) | ||
#'@ system.time(merge_all(list(socData, othData))) | ||
``` | ||
|
||
|
||
|
||
During the research I found several articles comparing the efficiency of data-handling approaches, which I will apply in future data analysis and data mining: | ||
|
||
- [Comparing performance of by, ddply and data.table](http://www.r-bloggers.com/transforming-subsets-of-data-in-r-with-by-ddply-and-data-table/) | ||
|
||
- [R高性能包介绍与并行运算](https://mp.weixin.qq.com/s?__biz=MzA3NDUxMjYzMA%3D%3D&mid=216065319&idx=1&sn=31af52816c7e8b937f15480c4d5f6e41&key=0acd51d81cb052bcbc420864d8003491eba2f4bbc722bf3a7bc7da0d59fefc64ea6fc32bdb33673eebd62f201cbc2190&ascene=7&uin=MjAwMTM4MjU0OA%3D%3D&devicetype=android-19&version=26020236&nettype=WIFI&pass_ticket=GdViEIR%2F5PLzVFnzLxc71K39ze4fb6VAwvFp1bhH3inbu5xBjyQ7BLEpDOrQhWZ1) | ||
|
||
- [A biased comparsion of JSON packages in R](https://rstudio-pubs-static.s3.amazonaws.com/31702_9c22e3d1a0c44968a4a1f9656f1800ab.html) | ||
|
||
- [Video how-to: Speed up R with C++ and Rcpp](http://www.computerworld.com/article/2961056/data-analytics/video-how-to-speed-up-r-with-c-plus-plus-and-rcpp-package.html) | ||
|
||
- [benchmarking logistic regression using glm.fit , bigglm, speedglm, glmnet, LiblineaR](http://stackoverflow.com/questions/19532651/benchmarking-logistic-regression-using-glm-fit-bigglm-speedglm-glmnet-libli) | ||
|
||
- [Dates and Times Made Easy with lubridate](http://www.jstatsoft.org/article/view/v040i03/v40i03.pdf) | ||
|
||
|
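The timing chunk in this document compares two ways of stacking the per-season data frames. A self-contained sketch of the same comparison follows, assuming small synthetic data frames in place of the `datasets/2011.csv` to `datasets/2015.csv` files, which are not part of this page. The column names `Stake` and `PL` are borrowed from the project script, the values are random, and the dplyr branch runs only if dplyr is installed.

```r
## Build five synthetic per-season data frames (stand-ins for the CSV files).
set.seed(1)
seasons <- 2011:2015
dfs <- lapply(seasons, function(s) {
  data.frame(Sess = s, Stake = runif(1000), PL = rnorm(1000))
})

## Base R: stack all seasons with do.call(rbind, ...).
t_base <- system.time(res_base <- do.call(rbind, dfs))

## dplyr: bind_rows() does the same job; skipped when dplyr is not available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  t_dplyr <- system.time(res_dplyr <- dplyr::bind_rows(dfs))
  stopifnot(nrow(res_base) == nrow(res_dplyr))
  print(rbind(base = t_base[1:3], dplyr = t_dplyr[1:3]))
} else {
  print(t_base[1:3])
}
```

With only five small data frames the timings will be close to zero; the difference between the two approaches usually becomes visible only with many or much larger inputs, so a single `system.time()` call should be read as a rough indication rather than a benchmark.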
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
/* http://www.w3schools.com/css/css_examples.asp */ | ||
|
||
table { | ||
max-width: 95%; | ||
border: 1px solid #ccc; | ||
} | ||
th { | ||
background-color: #0000FF; | ||
color: #0000A0; | ||
} | ||
td { | ||
background-color: #00FFFF; | ||
} |